We don't scale our software systems just because we can. While it's common to tout scalability, these claims need to be put into practice. In order to do so, there has to be a reason for scalable software. If there's no need to scale, then it's much easier, not to mention cost-effective, to simply build a system that doesn't scale. Putting something that was built to handle a wide variety of scaling issues into a context where scale isn't warranted just feels clunky. Especially to the end user.
Scaling software is a reactive event. Thinking about scaling influencers helps us proactively prepare for these scaling events. In other systems, such as web application backends, these scaling events may be brief spikes, and are generally handled automatically. For example, there's an increased load due to more users issuing more requests. The load balancer kicks in and distributes the load evenly across backend servers. In the extreme case, the system may automatically provision new backend resources when needed, and destroy them when they're no longer of use.
The preceding figure shows us a top-down flow chart of scaling influencers, starting with users, who require that our software implements features. Depending on various aspects of the features, such as their size and how they relate to other features, this influences the team of developers working on features. As we move down through the scaling influencers, this grows.
We're not building an application for just one user. If we were, there would be no need to scale our efforts. While what we build might be based on the requirements of one user representative, our software serves the needs of many users. We need to anticipate a growing user base as our application evolves. There's no exact target user count, although, depending on the nature of our application, we may set goals for the number of active users, possibly by benchmarking similar applications using a tool such as http://www.alexa.com/. For example, if our application is exposed on the public internet, we want lots of registered users. On the other hand, we might target private installations, and there, the number of users joining the system is a little slower. But even in the latter case, we still want the number of deployments to go up, increasing the total number of people using our software.
Perhaps the most obvious side-effect of successful software with a strong user base is the features necessary to keep those users happy. The feature set grows along with the users of the system. This is often overlooked by projects, despite the obviousness of new features. We know they're coming, yet, little thought goes into how the endless stream of features going into our code impedes our ability to scale up our efforts.
This is especially tricky when the software is in its infancy. The organization developing the software will bend over backwards to reel in new users. And there's little consequence of doing so in the beginning because the side-effects are limited. There's not a lot of mature features, there's not a huge development team, and there's less chance of annoying existing users by breaking something that they've come to rely on. When these factors aren't there, it's easier for us to nimbly crank out the features and dazzle existing/prospective users. But how do we force ourselves to be mindful of these early design decisions? How do we make sure that we don't unnecessarily limit our ability to scale the software up, in terms of supporting more features?
It's not just the individual users who make a given feature complex. Instead, it's a group of users that all need the same feature in order to use our software effectively. And from there, we have to start thinking about personas, or roles, and which features are available for which roles. The need for this type of organizational structure isn't made apparent till much later on in the game; after we've made decisions that make it difficult to introduce role-based feature delivery. And depending on how our software is deployed, we may have to support a variety of unique use cases. For example, if we have several large organizations as our customers, each with their own deployments, they'll likely have their own unique constraints on how users are structured. This is challenging, and our architecture needs to support the disparate needs of many organizations, if we're going to scale.
It's actually easier to deal with not enough developers for the features we're trying to develop, than having too many developers. When there's a large burden of feature development, it's a good opportunity to step back and think—"what would we do differently if we had more developers?" This question usually gets skipped. We go hire more developers, and when they arrive, it's to everyone's surprise that there's no immediate improvement in feature throughput. This is why it's best to have an open development culture where there are no stupid questions, and where responsibilities are defined.
There's no one correct team structure or development methodology. The development team needs to apply itself to the issues faced by the software we're trying to deliver. The biggest hurdle is for sure the number, size, and complexity of features. So that's something we need to consider when forming our team initially, as well as when growing the team. This latter point is especially true because the team structure we used way back when the software was new isn't going to fit what we face when the features scale up.
Scaling influences the perspectives of our architecture. Our architecture, in turn, determines responses to scaling influencers. The process is iterative and never-ending throughout the lifetime of our software.
Scaling up in the traditional sense doesn't really work in a browser environment. When backend services are overwhelmed by demand, it's common to "throw more hardware" at the problem. Easier said than done of course, but it's a lot easier to scale up our data services these days, compared to 20 years ago. Today's software systems are designed with scalability in mind. It's helpful to our frontend application if the backend services are always available and always responsive, but that's just a small portion of the issues we face.
The expectation that browser-based web applications be lean and fast is an emergent phenomenon. Perhaps, that's due in part to the competition we face. There are a lot of bloated applications out there, and whether they're used in the browser or natively on the desktop, users know what bloat feels like, and generally run the other way:
At an architectural level, components are the main building blocks we work with. These may be very high-level components with several levels of abstraction. Or, they could be something exposed by a framework we're using, as many of these tools provide their own idea of "components". For our purposes in this book, components sit somewhere in the middle—not too abstract, and not too implementation-specific. The idea being that we need to be thoughtful of our application composition, without worrying too much about the specifics.
As we'll see, the design of our various components is closely-tied to the trade-offs we make in other perspectives. And that's a good thing, because it means that if we're paying attention to the scalable qualities we're after, we can go back and adjust the design of our components in order to meet those qualities.
Components don't sit in the browser on their own. Components communicate with one another all the time. There's a wide variety of communication techniques at our disposal here. Component communication could be as simple as method invocation, or as complex as an asynchronous publish-subscribe event system. The approach we take with our architecture depends on our more specific goals. The challenge with components is that we often don't know what the ideal communication mechanism will be, till after we've started implementing our application. We have to make sure that we can adjust the chosen communication path:
Seldom will we implement our own communication mechanism for our components. Not when so many tools exist, that solve at least part of the problem for us. Most likely, we'll end up with a concoction of an existing tool for communication and our own implementation specifics. What's important is that the component communication mechanism is its own perspective, which can be designed independently of the components themselves.
There's lots we can do here to offset the negative user experience of waiting for things to load. This includes utilizing web specifications that allow us to treat applications and the services they use as installable components in the web browser platform. Of course, these are all nascent ideas, but worth considering as they mature alongside our application.
The second part of the performance perspective of our architecture is concerned with responsiveness. That is, after everything has loaded, how long does it take for us to respond to user input? Although this is a separate problem from that of loading resources from the backend, they're still closely-related. Often, user actions trigger API requests, and the techniques we employ to handle these workflows impact user-perceived responsiveness.
Typically, these URIs will map to an API resource. When the user hits one of these URIs in our application, we'll translate the URI into another URI that's used to request backend data. The component we use to manage these application URIs is called a router, and there's lots of frameworks and libraries with a base implementation of a router. We'll likely use one of these.
The addressability perspective plays a major role in our architecture, because ensuring that the various aspects of our application have an addressable URI complicates our design. However, it can also make things easier if we're clever about it. We can have our components utilize the URIs in the same way a user utilizes links.
Rarely does software do what you need it to straight out of the box. Highly-configurable software systems are touted as being good software systems. Configuration in the frontend is a challenge because there's several dimensions of configuration, not to mention the issue of where we store these configuration options. Default values for configurable components are problematic too—where do they come from? For example, is there a default language setting that's set until the user changes it? As is often the case, different deployments of our frontend will require different default values for these settings:
Every configurable aspect of our software complicates its design. Not to mention the performance overhead and potential bugs. So, configurability is a large issue, and it's worth the time spent up-front discussing with various stakeholders what they value in terms of configurability. Depending on the nature of our deployment, users may value portability with their configuration. This means that their values need to be stored in the backend, under their account settings. Obviously decisions like these have backend design implications, and sometimes it's better to get away with approaches that don't require a modified backend service.
There's a lot to consider from the various perspectives of our architecture, if we're going to build something that scales. We'll never get everything that we need out of every perspective simultaneously. This is why we make architectural trade-offs—we trade one aspect of our design for another more desirable aspect.
Before we start making trade-offs, it's important to state explicitly what cannot be traded. What aspects of our design are so crucial to achieving scale that they must remain constant? For instance, a constant might be the number of entities rendered on a given page, or a maximum level of function call indirection. There shouldn't be a ton of these architectural constants, but they do exist. It's best if we keep them narrow in scope and limited in number. If we have too many strict design principles that cannot be violated or otherwise changed to fit our needs, we won't be able to easily adapt to changing influencers of scale.
Does it make sense to have constant design principles that never change, given the unpredictability of scaling influencers? It does, but only once they emerge and are obvious. So this may not be an up-front principle, though we'll often have at least one or two up-front principles to follow. The discovery of these principles may result from the early refactoring of code or the later success of our software. In any case, the constants we use going forward must be made explicit and be agreed upon by all those involved.
Performance bottlenecks need to be fixed, or avoided in the first place where possible. Some performance bottlenecks are obvious and have an observable impact on the user experience. These need to be fixed immediately, because it means our code isn't scaling for some reason, and might even point to a larger design issue.
Other performance issues are relatively small. These are generally noticed by developers running benchmarks against code, trying by all means necessary to improve the performance. This doesn't scale well, because these smaller performance bottlenecks that aren't observable by the end user are time-consuming to fix. If our application is of a reasonable size, with more than a few developers working on it, we're not going to be able to keep up with feature development if everyone's fixing minor performance problems.
These micro-optimizations introduce specialized solutions into our code, and they're not exactly easy reading for other developers. On the other hand, if we let these minor inefficiencies go, we will manage to keep our code cleaner and thus easier to work with. Where possible, trade off optimized performance for better code quality. This improves our ability to scale from a number of perspectives.
It's nice to have generic components where nearly every aspect is configurable. However, this approach to component design comes at a performance cost. It's not noticeable at first, when there are few components, but as our software scales in feature count, the number of components grows, and so does the number of configuration options. Depending on the size of each component (its complexity, number of configuration options, and so forth) the potential for performance degradation increases exponentially. Take a look at the following diagram:
We can keep our configuration options around as long as there're no performance issues affecting our users. Just keep in mind that we may have to remove certain options in an effort to remove performance bottlenecks. It's unlikely that configurability is going to be our main source of performance issues. It's also easy to get carried away as we scale and add features. We'll find, retrospectively, that we created configuration options at design time that we thought would be helpful, but turned out to be nothing but overhead. Trade off configurability for performance when there's no tangible benefit to having the configuration option.
A related problem to that of configurability is substitutability. Our user interface performs well, but as our user base grows and more features are added, we discover that certain components cannot be easily substituted with another. This can be a developmental problem, where we want to design a new component to replace something pre-existing. Or perhaps we need to substitute components at runtime.
Our ability to substitute components lies mostly with the component communication model. If the new component is able to send/receive messages/events the same as the existing component, then it's a fairly straightforward substitution. However, not all aspects of our software are substitutable. In the interest of performance, there may not even be a component to replace.
As we scale, we may need to re-factor larger components into smaller components that are replaceable. By doing so, we're introducing a new level of indirection, and a performance hit. Trade off minor performance penalties to gain substitutability that aids in other aspects of scaling our architecture.
Assigning addressable URIs to resources in our application certainly makes implementing features more difficult. Do we actually need URIs for every resource exposed by our application? Probably not. For the sake of consistency though, it would make sense to have URIs for almost every resource. If we don't have a router and URI generation scheme that's consistent and easy to follow, we're more likely to skip implementing URIs for certain resources.
It's almost always better to have the added burden of assigning URIs to every resource in our application than to skip out on URIs. Or worse still, not supporting addressable resources at all. URIs make our application behave like the rest of the Web; the training ground for all our users. For example, perhaps URI generation and routes are a constant for anything in our application—a trade-off that cannot happen. Trade off ease of development for addressability in almost every case. The ease of development problem with regard to URIs can be tackled in more depth as the software matures.
The ease with which features are developed in our software boils down to the development team and it's scaling influencers. For example, we could face pressure to hire entry-level developers for budgetary reasons. How well this approach scales depends on our code. When we're concerned with performance, we're likely to introduce all kinds of intimidating code that relatively inexperienced developers will have trouble swallowing. Obviously, this impedes the ease of developing new features, and if it's difficult, it takes longer. This obviously does not scale with respect to customer demand.
Developers don't always have to struggle with understanding the unorthodox approaches we've taken to tackle performance bottlenecks in specific areas of the code. We can certainly help the situation by writing quality code that's understandable. Maybe even documentation. But we won't get all of this for free; if we're to support the team as a whole as it scales, we need to pay the productivity penalty in the short term for having to coach and mentor.
When all else fails, we need to take a step back and look holistically at the featureset of our application. Can our architecture support them all? Is there a better alternative? Scrapping an architecture that we've sunk many hours into almost never makes sense—but it does happen. The majority of the time, however, we'll be asked to introduce a challenging set of features that violate one or more of our architectural constants.
When that happens, we're disrupting stable features that already exist, or we're introducing something of poor quality into the application. Neither case is good, and it's worth the time, the headache, and the cursing to work with the stakeholders to figure out what has to go.
If we've taken the time to figure out our architecture by making trade-offs, we should have a sound argument for why our software can't support hundreds of features.
That being said, we can certainly use a given framework of our liking as input to the design process. If we really like the tool, and our team has experience using it, we can let it influence our design decisions. Just as long as we understand that the framework does not automatically respond to scaling influencers—that part is up to us.
It's worth the time investigating the framework to use for our project because choosing the wrong framework is a costly mistake. The realization that we should have gone with something else usually comes after we've implemented lots of functionality. The end result is lots of re-writing, re-planning, re-training, and re-documenting. Not to mention the time lost on the first implementation. Choose your frameworks wisely, and be cautious about being framework-coupling.
Why use a mash-up of smaller libraries when there's a monolithic framework out there with everything that we need? Libraries are our tools, and if they fulfill a need in our architecture, by all means use them. Some developers shy away from low-level tools because of the dependency-chaos that ensues. In practice, this happens anyway, even if we're leveraging an all-encompassing framework.
At the end of the day, the distinction between frameworks and libraries doesn't really matter to us. Creating a third-party dependency nightmare doesn't scale well. Neither does sticking with one tool exclusively and maintaining a lot of code ourselves. It's about finding the right fit between depending heavily on other projects and reinventing the wheel ourselves.
Saying one framework scales better than another isn't justified. Writing a TODO application as a benchmark for how well the framework scales is hardly useful. We write TODO applications to get a feel for the framework, and how it compares to others. If we're unsure about which framework fits our style, a TODO application is a good start.
Our goal is to implement something that scales well in response to influencers. These are unique and unknown upfront. The best we can do is make predictions about what scaling influencers we'll likely be hit with in the future. Based on these likely influencers, and the nature of the application we're building, some frameworks are better candidates than others. Frameworks help us scale, but they don't scale for us.