Get Your Hands Dirty on Clean Architecture

By Tom Hombergs

About this book

We would all like to build software architecture that yields adaptable and flexible software with low development costs. But unreasonable deadlines and shortcuts make it very hard to create such an architecture.

Get Your Hands Dirty on Clean Architecture starts with a discussion about the conventional layered architecture style and its disadvantages. It also talks about the advantages of the domain-centric architecture styles of Robert C. Martin's Clean Architecture and Alistair Cockburn's Hexagonal Architecture. Then, the book dives into hands-on chapters that show you how to manifest a hexagonal architecture in actual code. You'll learn in detail about different mapping strategies between the layers of a hexagonal architecture and see how to assemble the architecture elements into an application. The later chapters demonstrate how to enforce architecture boundaries. You'll also learn which shortcuts produce which types of technical debt and how, sometimes, it is a good idea to willingly take on those debts.

After reading this book, you'll have all the knowledge you need to create applications using the hexagonal architecture style of web development.

Publication date: September 2019
Publisher: Packt
Pages: 156
ISBN: 9781839211966

 

What's Wrong with Layers?

Chances are that you have developed a layered (web) application in the past. You might even be doing it in your current project right now (actually, I am).

Thinking in layers has been drilled into us in computer science classes, tutorials, and best practices. It has even been taught in books (Software Architecture Patterns by Mark Richards, O'Reilly, 2015):

Figure 1.1: A conventional web application architecture consists of a web layer, a domain layer, and a persistence layer

The preceding figure shows a high-level view of the very common three-layer architecture. We have a Web layer, which receives requests and routes them to a service in the Domain or business layer. The service does some business magic and calls components from the Persistence layer to query for or modify the current state of our domain entities.

You know what? Layers are a solid architecture pattern. If we get them right, we can build domain logic that is independent of the web and persistence layers. We can switch the Web or Persistence technologies without affecting our Domain logic if we feel like it. We can add new features without affecting existing features.

With a good layered architecture, we're keeping our options open and are able to quickly adapt to changing requirements and external factors. And if we believe Uncle Bob, this is exactly what architecture is all about (Clean Architecture by Robert C. Martin, Prentice Hall, 2017, Chapter 15).

So, what's wrong with layers?

In my experience, a layered architecture has too many open flanks that allow bad habits to creep in and make the software increasingly hard to change over time. In the following sections, I'll tell you why.

It Promotes Database-Driven Design

By its very definition, the foundation of a conventional layered architecture is the database.

The web layer depends on the domain layer, which in turn depends on the persistence layer and thus the database.

Everything builds on top of the persistence layer. This is problematic for several reasons.

Let's take a step back and think about what we're trying to achieve with almost any application we're building. We're typically trying to create a model of the rules or "policies" that govern the business in order to make it easier for the users to interact with them.

We're primarily trying to model behavior, and not state. Yes, state is an important part of any application, but the behavior is what changes the state and thus drives the business.

So, why are we making the database the foundation of our architecture and not the domain logic?

Think back to the last use cases you have implemented in any application. Did you start by implementing the domain logic or the persistence layer? Most likely, you thought about what the database structure would look like and only then moved on to implementing the domain logic on top of it.

This makes sense in a conventional layered architecture since we're going with the natural flow of dependencies. But it makes absolutely no sense from a business point of view. We should build the domain logic before doing anything else. Only then can we find out whether we have understood it correctly. And only once we know we're building the right domain logic should we move on to build a persistence and web layer around it.

A driving force in such a database-centric architecture is the use of object-relational mapping (ORM) frameworks. Don't get me wrong, I love those frameworks and I'm working with JPA and Hibernate on a daily basis.

But if we combine an ORM framework with a layered architecture, we're easily tempted to mix business rules with persistence aspects:

Figure 1.2: Using the database entities in the domain layer leads to strong coupling with the persistence layer

Usually, we have ORM-managed entities as part of the persistence layer, as shown in the preceding figure. Since layers may access the layers below them, the Domain layer is allowed to access those entities. And if it's allowed to use them, they will be used.

This creates a strong coupling between the Persistence layer and the Domain layer. Our services use the persistence model as their business model and not only have to deal with the domain logic but also with eager versus lazy loading, database transactions, flushing caches, and similar housekeeping tasks.

The persistence code is virtually fused into the domain code, and thus it's hard to change one without the other. That's the opposite of being flexible and keeping options open, which should be the goal of our architecture.
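
As a minimal sketch of what keeping the two models apart could look like, consider the following. All names here (Account, AccountRow, AccountMapper) are hypothetical and not taken from the text, and the ORM annotations that a real persistence class would carry are left out so that the example runs on the JDK alone. The domain class models behavior and knows nothing about storage; a mapper on the persistence side translates between the two shapes.

```java
import java.math.BigDecimal;

// Domain model (hypothetical): behavior first, no persistence concerns.
final class Account {
    private BigDecimal balance;

    Account(BigDecimal balance) {
        this.balance = balance;
    }

    // Business rule: a withdrawal must be positive and must not overdraw the account.
    boolean withdraw(BigDecimal amount) {
        if (amount.signum() <= 0 || balance.compareTo(amount) < 0) {
            return false;
        }
        balance = balance.subtract(amount);
        return true;
    }

    BigDecimal balance() {
        return balance;
    }
}

// Persistence model (hypothetical): storage shape only.
// In a real project, this is where ORM annotations would live (assumption).
final class AccountRow {
    final long id;
    final String balance; // stored as text purely for illustration

    AccountRow(long id, String balance) {
        this.id = id;
        this.balance = balance;
    }
}

// Mapper owned by the persistence layer; the domain never sees AccountRow.
final class AccountMapper {
    private AccountMapper() {
    }

    static Account toDomain(AccountRow row) {
        return new Account(new BigDecimal(row.balance));
    }

    static AccountRow toRow(long id, Account account) {
        return new AccountRow(id, account.balance().toPlainString());
    }
}
```

With this split, loading strategies, transactions, and cache flushing stay behind the mapper, and the withdrawal rule can change without touching the storage shape. The price is the extra mapping code.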

It's Prone to Shortcuts

In a conventional layered architecture, the only global rule is that from a certain layer, we can only access components in the same layer or in a layer below.

There may be other rules that a development team has agreed upon, and some of them might even be enforced by tooling, but the layered architecture style itself does not impose those rules on us.

So, if we need access to a certain component in a layer above ours, we can just push the component down a layer, and we're allowed to access it. Problem solved.

Doing this once may be OK. But doing it once opens the door for doing it a second time. And if someone else was allowed to do it, so am I, right?

I'm not saying that as developers, we take such shortcuts lightly. But if there is an option to do something, someone will do it, especially in combination with a looming deadline. And if something has been done before, the threshold for someone to do it again will lower drastically. This is a psychological effect called the "Broken Windows Theory" – more on this in Chapter 11, Taking Shortcuts Consciously:

Figure 1.3: Since we may access everything in the persistence layer, it tends to grow fat over time

Over years of development and maintenance of a software project, the persistence layer may very well end up like the one in the preceding figure.

The persistence layer (or, in more generic terms, the bottom-most layer) will grow fat as we push components down through the layers. Perfect candidates for this are helper or utility components since they don't seem to belong to any specific layer.

So, if we want to disable the "shortcut mode" for our architecture, layers are not the best option, at least not without enforcing some kind of additional architecture rules. And by "enforce," I don't mean a senior developer doing code reviews but rules that make the build fail when they're broken.
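
To make the idea of a rule that fails the build more concrete, here is a minimal sketch. It assumes a hypothetical LayerRules helper and a hand-written list of dependencies between layers; a real project would more likely derive the dependencies from the compiled classes with a dedicated tool. The rule shown is one possible additional rule, stricter than the layered style itself: a layer may depend only on itself or on the layer directly below.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Hypothetical rule checker; layer names and ranking are assumptions for illustration.
final class LayerRules {

    // Higher number means higher layer.
    private static final Map<String, Integer> RANK =
            Map.of("web", 3, "domain", 2, "persistence", 1);

    // A dependency from one layer to another, e.g. web -> domain.
    static final class Dependency {
        final String from;
        final String to;

        Dependency(String from, String to) {
            this.from = from;
            this.to = to;
        }

        @Override
        public String toString() {
            return from + " -> " + to;
        }
    }

    private LayerRules() {
    }

    // Allowed: same layer (distance 0) or the layer directly below (distance 1).
    // Everything else, upward access or skipping a layer, is a violation.
    static List<Dependency> violations(List<Dependency> dependencies) {
        return dependencies.stream()
                .filter(d -> {
                    int distance = RANK.get(d.from) - RANK.get(d.to);
                    return distance < 0 || distance > 1;
                })
                .collect(Collectors.toList());
    }

    // Intended to be called from a test so that a violation breaks the build.
    static void check(List<Dependency> dependencies) {
        List<Dependency> found = violations(dependencies);
        if (!found.isEmpty()) {
            throw new IllegalStateException("Architecture rule broken: " + found);
        }
    }
}
```

Calling LayerRules.check(...) from an ordinary test is enough to turn a broken rule into a failed build, which takes the enforcement out of the hands of code reviewers.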

It Grows Hard to Test

A common evolution within a layered architecture is that layers are being skipped. We access the Persistence layer directly from the Web layer since we're only manipulating a single field of an Entity, and for that we need not bother the Domain layer, right?

Figure 1.4: Skipping the domain layer tends to scatter domain logic across the code base

Again, this feels OK the first couple of times, but it has two drawbacks if it happens often (and it will, once someone has taken the first step).

First, we're implementing domain logic in the Web layer, even if it's only manipulating a single field. What if the use case expands in the future? We're most likely going to add more domain logic to the Web layer, mixing responsibilities and spreading essential domain logic all over the application.

Second, in the tests of our Web layer, we not only have to mock away the domain layer, but also the persistence layer. This adds complexity to the unit test. And a complex test setup is the first step toward no tests at all because we don't have time for them.

As the web component grows over time, it may accumulate a lot of dependencies on different persistence components, adding to the test's complexity. At some point, it takes more time for us to understand and mock away the dependencies than to actually write test code.
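
For contrast, here is a minimal sketch of a web component that does not skip the domain layer. The names RenameUserUseCase and RenameUserController are hypothetical, and web framework annotations are omitted so that the example runs on the JDK alone. Because the controller depends on a single interface, a test needs exactly one stand-in and no knowledge of persistence.

```java
// Hypothetical entry point into the domain layer: the only thing the web layer knows about.
interface RenameUserUseCase {
    // Returns false if no user with the given id exists.
    boolean renameUser(long userId, String newName);
}

// Web-layer component with a single dependency.
final class RenameUserController {
    private final RenameUserUseCase useCase;

    RenameUserController(RenameUserUseCase useCase) {
        this.useCase = useCase;
    }

    // Returns an HTTP-like status code: 400 for bad input, 404 for unknown user, 200 otherwise.
    int handle(long userId, String newName) {
        if (newName == null || newName.trim().isEmpty()) {
            return 400;
        }
        return useCase.renameUser(userId, newName) ? 200 : 404;
    }
}
```

In a test, the use case can be replaced by a one-line lambda, for example new RenameUserController((id, name) -> id == 42L), with no mocking framework and no persistence components in sight.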

It Hides the Use Cases

As developers, we like to create new code that implements shiny new use cases. But we usually spend much more time changing existing code than we do creating new code. This is not only true for those dreaded legacy projects in which we're working on a decades-old code base but also for a hot new greenfield project after the initial use cases have been implemented.

Since we're so often searching for the right place to add or change functionality, our architecture should help us to quickly navigate the code base. How is a layered architecture holding up in this regard?

As already discussed, in a layered architecture, domain logic can easily be scattered throughout the layers. It may exist in the web layer if we're skipping the domain layer for an "easy" use case. And it may exist in the persistence layer if we have pushed a certain component down so it can be accessed from both the domain and the persistence layer. This already makes finding the right place to add new functionality hard.

But there's more. A layered architecture does not impose rules on the "width" of domain services. Over time, this often leads to very broad services that serve multiple use cases, as shown in the following figure:

Figure 1.5: Broad services make it hard to find a certain use case within the code base

A broad service has many dependencies on the persistence layer, and many components in the web layer depend on it. This not only makes the service hard to test but also makes it hard for us to find the service that's responsible for the use case we want to work on.

How much easier would it be if we had highly specialized narrow domain services that each serve a single use case? Instead of searching for the user registration use case in the UserService, we would just open up the RegisterUserService and start working.
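
A narrow service along these lines might look like the following sketch. RegisterUserService is the name used above; everything else (the UserRegistry interface, its methods, the in-memory implementation, and the simplistic e-mail check) is a hypothetical assumption made for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical interface covering only what this one use case needs from persistence.
interface UserRegistry {
    boolean emailTaken(String email);

    void save(String email);
}

// One service, one use case: the class name already says what it does.
final class RegisterUserService {
    private final UserRegistry registry;

    RegisterUserService(UserRegistry registry) {
        this.registry = registry;
    }

    // Returns true if the user was registered, false if the input was
    // rejected or the e-mail address is already in use.
    boolean registerUser(String email) {
        if (email == null || !email.contains("@") || registry.emailTaken(email)) {
            return false;
        }
        registry.save(email);
        return true;
    }
}

// In-memory stand-in for a real persistence component, for illustration only.
final class InMemoryUserRegistry implements UserRegistry {
    private final Set<String> emails = new HashSet<>();

    @Override
    public boolean emailTaken(String email) {
        return emails.contains(email);
    }

    @Override
    public void save(String email) {
        emails.add(email);
    }
}
```

The service has one dependency and one public method, so both its tests and its place in the code base are easy to find. The trade-off is a larger number of small classes.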

It Makes Parallel Work Difficult

Management usually expects us to be done with building the software they sponsor by a certain date. Actually, they even expect us to be done within a certain budget as well, but let's not complicate things here.

Aside from the fact that I have never seen "done" software in my career as a software developer, to be done by a certain date usually implies that we have to work in parallel.

You probably know this famous conclusion from "The Mythical Man-Month," even if you haven't read the book:

"Adding manpower to a late software project makes it later" – The Mythical Man-Month: Essays on Software Engineering by Frederick P. Brooks, Jr., Addison-Wesley, 1995.

This also holds true, to a degree, for software projects that are not (yet) late. You cannot expect a large group of 50 developers to be 5 times as fast as a smaller team of 10 developers in every context. If they're working on a very large application where they can split up into sub-teams and work on separate parts of the software, it may work, but in most contexts, they would step on each other's toes.

But on a healthy scale, we can certainly expect to be faster with more people on the project. And management is right to expect that of us.

To meet this expectation, our architecture must support parallel work. This is not easy. And a layered architecture doesn't really help us here.

Imagine we're adding a new use case to our application. We have three developers available. One can add the needed features to the web layer, one to the domain layer, and the third to the persistence layer, right?

Well, it usually doesn't work that way in a layered architecture. Since everything builds on top of the persistence layer, the persistence layer must be developed first. Then comes the domain layer, and finally the web layer. So, only one developer can work on the feature at a time.

Ah, but the developers can define interfaces first, you say, and then each developer can work against these interfaces without having to wait for the actual implementation. Sure, this is possible, but only if we're not doing database-driven design, as discussed earlier, where our persistence logic is so mixed up with our domain logic that we just cannot work on each aspect separately.

If we have broad services in our codebase, it may even be hard to work on different features in parallel. Working on different use cases will cause the same service to be edited in parallel, which leads to merge conflicts and, potentially, regressions.

How Does This Help Me Build Maintainable Software?

If you have built layered architectures in the past, you can probably relate to some of the disadvantages discussed in this chapter, and you could maybe even add some more.

If done correctly, and if some additional rules are imposed on it, a layered architecture can be very maintainable and make changing or adding to the codebase a breeze.

However, the discussion shows that a layered architecture allows many things to go wrong. Without very strict self-discipline, it's prone to degrade and become less maintainable over time. And this self-discipline usually becomes a little less strict each time a manager draws a new deadline around the development team.

Keeping the traps of a layered architecture in mind will help us the next time we argue against taking a shortcut and for building a more maintainable solution instead – whether in a layered architecture or a different architecture style.

 


About the Author

  • Tom Hombergs

    Tom Hombergs is a software engineer by profession and by passion with more than a decade of experience working on many different software projects for many different clients across various industries. In software projects, he takes on the roles of software developer, architect, and coach, with a focus on the Java ecosystem. He has found that writing is the best way to learn, so he likes to dive deep into topics he encounters in his software projects to create texts that give structure to the chaotic world of software development. He regularly writes about software development on his blog and is an occasional speaker at conferences.
