(For more resources on Microsoft, see here.)
Understanding BizTalk operational architecture
We have already explored the core conceptual architecture of BizTalk Server 2010, but now we will delve more deeply into how this architecture fits into the real world of Windows Servers and applications.
At its core, BizTalk is a .NET application built on top of SQL Server. This already tells us that we have two definite dependencies: SQL Server and Windows Server. We also have a core dependency on Active Directory to provide service accounts and to control user access. In smaller environments BizTalk can use local groups instead, but this does not scale well. The core set of servers involved in a BizTalk environment is shown in the following diagram:
These three servers are the core moving parts in any BizTalk environment. SQL Server hosts the message box and all the other databases. BizTalk provides most of the processing and Active Directory provides authentication. This book does not cover Active Directory as that is already expected to be running in your enterprise, but the other two will be explored in detail.
Administering BizTalk Server
Most administration and operation tasks for BizTalk Server are performed in the BizTalk Administration console, an MMC Snap-In designed to provide access to all the settings in a BizTalk group through a single interface. MMC provides a common user interface approach for Windows administration tasks, and the use of an MMC Snap-In makes BizTalk very familiar to most administrators. Like most MMC Snap-Ins (IIS, Active Directory, and so on), the BizTalk Administration console has three panes from left to right: navigation, information, and actions.
As you click on different nodes in the left navigation pane, different context comes up in the center and right panes. The right Actions pane changes further when different objects are selected in the center information pane. This context allows us to change specific settings more easily depending on where we have set our focus in the console.
The root node in the navigation pane displays a Console Root folder with the BizTalk Server Administration and Event Viewer (Local) nodes beneath it. By default, the first node will have the BizTalk group of the local machine listed within, but by right-clicking the BizTalk Server Administration node, we can connect to other BizTalk groups. This allows us to remotely administer multiple BizTalk groups from a single workstation. It also allows us to perform most administration tasks without logging into the BizTalk Servers directly. When connecting to a BizTalk group, we actually provide the connection information for the management database, which is the brain of BizTalk.
Within the BizTalk group, there are three primary areas of the console that we use to manage our solutions; each is represented by a node. They are introduced as follows and can be seen in the previous screenshot:
- Applications: This node houses all the applications deployed to a BizTalk Server group. It is from here that we can configure and control specific applications in BizTalk. An application in BizTalk is the logical grouping for a set of related artifacts that normally form a solution. Within each application, the artifacts are categorized in the nodes, as shown in the following screenshot:
These nodes largely correspond to concepts that we covered in the previous chapters. When our BizTalk application deploys locally from Visual Studio, all the assemblies should deploy to the same Application, and thus be part of the same "solution". This view will list all artifacts from any assembly deployed to this application. Policies are BRE rule sets; this is their formal name. Send Port Groups are simple grouping mechanisms for the Send Ports. Role Links tie parties to ports and orchestrations.
- Parties: These are mechanisms for working with trade partners and are particularly suited to solutions that require the same general processing for messages, but may need to send the results or intermediary requests to different end points. Parties are heavily utilized in B2B scenarios to create easily extensible solutions. A party can represent a trade partner or another system or division within the enterprise, and parties are a key factor in EDI.
- Platform Settings: This is the place where the settings for a BizTalk group and all its subordinate objects reside. Hosts, host instances, servers, message boxes, and adapters are all configured here. As the name implies, this is where we work with settings that affect the core BizTalk platform.
Scalability in BizTalk Server
According to Wikipedia, "scalability is the ability of a system, network, or process to handle growing amounts of work in a graceful manner, or its ability to be enlarged to accommodate that growth". There are generally two types of scalability in the computer world: scaling up, which means moving to a larger and more powerful server, and scaling out, which means adding more servers. For most software, scaling up is the simpler or even the only path. BizTalk is fundamentally designed to scale out, with no changes needed to the software running a solution. As we move further into the era of multicore processors, we are actually blurring this distinction and shifting more to the scale out model even when we choose to scale up. Only software and platforms that are made to be parallel can take full advantage of the multicore architectures now prominent in the industry.
Scaling SQL Server
There are two specific areas where BizTalk can scale out. The first and generally most important is the SQL Server area. SQL Server is frequently a bottleneck for BizTalk solutions. Often, this confuses administrators, even ones who know SQL Server, because they see low utilization on BizTalk Servers and don't see high utilization on SQL Servers. Most often, this is because SQL Server tends to be a disk bound application, meaning that the real bottlenecks tend to be the disk queues of SQL operations waiting to take place.
BizTalk Server has been carefully designed to use SQL Server, and the BizTalk databases in particular, in a highly optimized manner. This is especially true of the message box, which should not be on a shared SQL instance used by other applications. In fact, the message box should have its own instance, separated even from the other BizTalk databases.
While we're on the subject, now is a good time to raise a very important caveat about the message box. During configuration of BizTalk, the Max DOP setting on your SQL Server will be changed to one (1). This is because the message box is a highly tuned database that works very differently from most databases. The job of most databases is to hold data that will be returned as record sets. The Maximum Degree of Parallelism (Max DOP) setting controls how many processors SQL Server may use to execute a single query in parallel. It is an instance-wide setting in SQL Server and defaults to zero (0), which allows SQL Server to use all available processors. For nearly all databases, the default Max DOP setting allows SQL Server to perform the query and return the data faster by breaking the query up amongst the processors on the server; a divide and conquer approach, if you will. This optimization actually harms the performance of the message box database. The operations performed on the message box are generally single record (and small record at that) operations, so the overhead of parallelizing them turns out to be greater than the benefit of executing them in parallel. As a result, BizTalk will slow down if the SQL instance hosting the message box has Max DOP set to any value other than one.
Having said all this, the other databases that make up BizTalk actually benefit from not setting the Max DOP to one. This is a great demonstration of why you may want to consider multiple SQL Servers or at least multiple instances for your BizTalk installation. Generally, SQL Server should be configured into two instances: one for the message box and one for all other databases. These can all be on the same physical server, but they should be separated from other databases and from each other. Newer versions of SQL Server do allow tighter resource control over databases, but this is still good advice for SQL 2008 R2. Within these databases, it is also a good idea to separate indexes from data storage to improve performance.
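The Max DOP trade-off described above can be illustrated with a toy cost model. This is only a sketch: the function, the timings, and the fixed per-worker coordination cost are all assumed numbers chosen for illustration, not measurements from SQL Server.

```python
def estimated_time_ms(work_ms: float, dop: int, setup_ms_per_worker: float) -> float:
    """Toy model: work splits evenly, but each extra worker adds a fixed coordination cost."""
    if dop < 1:
        raise ValueError("dop must be at least 1")
    coordination = setup_ms_per_worker * (dop - 1)
    return work_ms / dop + coordination


# Hypothetical numbers, chosen only to show the shape of the trade-off.
SMALL_ROW_OPERATION_MS = 0.2   # a single-record, message box style operation
LARGE_QUERY_MS = 800.0         # a record-set query typical of other databases
SETUP_MS = 0.5                 # assumed cost of coordinating one extra worker

for dop in (1, 4, 8):
    small = estimated_time_ms(SMALL_ROW_OPERATION_MS, dop, SETUP_MS)
    large = estimated_time_ms(LARGE_QUERY_MS, dop, SETUP_MS)
    print(f"dop={dop}: small operation {small:.3f} ms, large query {large:.3f} ms")
```

Under these assumptions the small operation gets slower as the degree of parallelism rises, while the large query gets faster, which is the reason the message box and the other databases want different settings.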
Adding more SQL Servers, or even just instances, to your BizTalk installation is a way to scale out the SQL tier of your environment, but BizTalk also provides another way to scale out SQL Server: creating multiple message boxes. The idea is that one message box functions as the master message box, managing subscriptions, and the others function as runtime message boxes for delivering matched subscriptions (that is, starting orchestrations and send ports). This allows the master message box to focus only on subscription matching, and thus it performs even better. Because of the extra overhead involved in using multiple message boxes, it is suggested that if you create additional message boxes you have at least three in total: one master and two publishing message boxes. The intention is that these message boxes can each exist on different servers or at least on different SQL instances.
This is a very sophisticated technique and approach to addressing scalability, but like many tasks in BizTalk, this turns out to be surprisingly easy to accomplish. Simply right-click the Message Boxes node in the BizTalk Administration console (under Platform Settings), select New, and then select Message Box….
This will bring up a configuration dialog allowing you to specify the server and database name for this new message box. After you have created your new message boxes, you can go back to the master message box and disable new message publication, which will instruct the master message box to only perform routing and subscription matching. This entire operation can be performed while the platform is running.
If you ever need to remove a message box, you simply disable new message publication and let it continue running, then delete that message box.
Please note that you cannot delete the master message box without designating a new one.
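The division of labor between the master message box and the additional message boxes can be pictured with a small model. This is a conceptual sketch only: the class names, the round-robin choice of target, and the message box names are assumptions made for illustration, and none of this reflects BizTalk's internal implementation.

```python
from dataclasses import dataclass, field


@dataclass
class MessageBox:
    name: str
    publishing_enabled: bool = True
    delivered: list = field(default_factory=list)


class MessageBoxGroup:
    """Toy model of one master message box plus additional message boxes."""

    def __init__(self, master, others):
        self.master = master
        self.boxes = [master, *others]
        self.subscriptions = {}   # message type -> subscriber name
        self._next = 0

    def subscribe(self, message_type, subscriber):
        self.subscriptions[message_type] = subscriber

    def publish(self, message_type, body):
        # Subscription matching is always the master's job.
        subscriber = self.subscriptions.get(message_type)
        if subscriber is None:
            raise LookupError(f"no subscription for {message_type!r}")
        # Delivery goes only to boxes that still accept new publications.
        targets = [box for box in self.boxes if box.publishing_enabled]
        if not targets:
            raise RuntimeError("no message box accepts new publications")
        box = targets[self._next % len(targets)]   # round-robin is an assumption
        self._next += 1
        box.delivered.append((subscriber, body))
        return box.name

    def remove(self, box):
        if box is self.master:
            raise ValueError("designate a new master before deleting this one")
        if box.publishing_enabled:
            raise ValueError("disable new message publication first")
        self.boxes.remove(box)


# Hypothetical names: one master and two publishing message boxes.
master = MessageBox("MasterMsgBox")
group = MessageBoxGroup(master, [MessageBox("MsgBox2"), MessageBox("MsgBox3")])
master.publishing_enabled = False          # master now only matches subscriptions
group.subscribe("Order", "OrderOrchestration")
targets = [group.publish("Order", n) for n in range(4)]
print(targets, master.delivered)
```

The `remove` method mirrors the two rules in the text: publication is disabled before a message box is deleted, and the master cannot be deleted while it is still the master.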
Scaling BizTalk Server
After sorting out any SQL Server issues and scaling challenges, the next place to consider is the BizTalk tier. There are two ways to scale out the BizTalk tier: one is to add more hosts and host instances to the group, the other is to add more servers. Both turn out to be quite easy in BizTalk.
Adding more hosts and host instances
To add more hosts, you simply right-click the Hosts node in Platform Settings and select New | Host… from the context menu. The dialog that will appear is shown in the following screenshot:
In this dialog, you can specify the name and the Windows group that the host users will need to be a part of. You can also choose to mark a host as 32 bit in case you're working with components that do not support the 64 bit runtime. Allowing different Windows groups for hosts enables us to strictly control security permissions. If we have a location that receives (or sends) messages to a non-secure endpoint, we can isolate the execution of that port or location by using a less privileged account; this will help enforce security within our application. Once you create the host, you perform a similar operation to create a new host instance for that host. If you now left-click the Host instances node, you will notice that the Actions pane, on the far right, provides an alternative to right-clicking for a context menu.
Clicking New here is the same as right-clicking and selecting New | Host Instance from the context menu. Again, this is common in all MMC Snap-Ins. From this dialog we configure the settings for a specific host instance shown as follows:
These settings consist of Host name:, which would be the host we created before, and the server within the group on which to configure this host instance. We must also provide Logon: credentials for the Windows service that will be automatically created on this server for us. We can optionally disable the host instance so that it cannot start. This can be useful if we're setting up new hosts and instances on many servers, but are not yet ready to start them. We will see this presented more thoroughly later in this article, in the section, Presenting the best practices for BizTalk configuration.
Adding more servers to the group
This step is a little more complicated, but only a little. All you have to do is install BizTalk on the new server and then run the BizTalk Server Configuration tool on the new server. When the wizard opens, click on Advanced Configuration and under Enterprise SSO, select Join an Existing SSO System, and under Group, select Join an existing BizTalk Group. Once this is done, the last task is to create host instances on the new server for all existing hosts; the same process we just followed. As soon as we start the new host instances through the BizTalk Administration console, the new server will immediately begin processing transactions. Adding new servers to the BizTalk group turns out to be very simple and enables us to quickly stand up more capacity as needed.
As more servers are added to the group, they simply continue to pull work items off the queues independently. The more servers in the group, the more work it is able to perform. There is no practical limit to the number of servers that can be added to a BizTalk enterprise installation.
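The way servers in a group pull work independently resembles the competing consumers pattern. The following is a minimal sketch of that pattern using an in-memory queue and threads; the server names are hypothetical, and in BizTalk the shared queue role is played by the message box in SQL Server rather than by anything like this.

```python
import queue
import threading


def run_group(work_items, server_names):
    """Each 'server' pulls items off one shared queue until it is empty."""
    work = queue.Queue()
    for item in work_items:
        work.put(item)

    # Each server appends only to its own list, so no extra locking is needed.
    processed = {name: [] for name in server_names}

    def host_instance(name):
        while True:
            try:
                item = work.get_nowait()
            except queue.Empty:
                return
            processed[name].append(item)

    threads = [threading.Thread(target=host_instance, args=(name,))
               for name in server_names]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return processed


result = run_group(range(100), ["BTS01", "BTS02", "BTS03"])
total = sum(len(items) for items in result.values())
print(total, "items processed across", len(result), "servers")
```

How the items are split between the servers varies from run to run, but every item is processed exactly once, and adding a name to the list adds capacity without any change to the work itself.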
Exploring High Availability in BizTalk
High Availability (HA) is the ability of an environment to deal with failures or outages without causing service or processing interruptions. For example, failures would be the loss of servers or services within the environment. BizTalk gives us a good degree of high availability inherent in the group concept, but the same is not true for SQL Server or some of the other services involved in our BizTalk installation. The following are some examples of how to make your BizTalk installation highly available. There is an excellent poster available from Microsoft that details scaling out BizTalk Server that is located at http://www.microsoft.com/download/en/details.aspx?id=15223.
High Availability in SQL Server
SQL Server can be made highly available with Windows clustering, now known as Failover Clustering. Failover clustering has been around in the Windows platform since NT 4 and has been improved with each release. This technology is meant to provide failover for services in Windows environments. The basic idea is that the cluster is a logical construct consisting of two or more nodes (nodes being Windows Servers); clients connect to this logical resource rather than a specific server. Only one node is active at any given time with the others being passive, but they are all capable of being active in the event of an active node failure. This switch over happens automatically and is the core feature of the Failover Cluster. Failover Cluster allows us to run services in the cluster and ensures that an instance of the service will be running on one of the nodes in the cluster. It is really a way to treat multiple servers as a single computing resource that has much higher uptime due to the ability to transfer the service to a passive node. This is exactly how SQL Server is clustered. Because SQL Server ultimately stores data on disk, this storage must be a shareable resource, normally on a SAN. The active node logs transactions and works the same way as SQL Server normally does. When the active node fails, one of the passive nodes becomes active and takes control of the storage resource and starts the SQL service. This failover process is demonstrated in the following screenshot:
After the failover, clients continue to use the same resource to connect to the new server because the resource is a logical resource that exists in the cluster. In BizTalk Server, this failover is automatic. Running host instances will lose connectivity to the message box database at the time of SQL failure and will automatically attempt to reconnect repeatedly. As the passive node becomes active, these connections will succeed, and BizTalk will continue processing as if nothing happened. More configurations are covered in the section, Examining Sample Installation Topologies, later in this article.
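The reconnect behavior during a failover can be sketched as a simple retry loop against a logical name. This is an illustration of the idea only: the exception type, the helper names, and the `SQLCLUSTER` name are hypothetical, and BizTalk's actual retry logic is internal to the product.

```python
import time


class ClusterUnavailable(Exception):
    """Raised while no node in the cluster is serving the logical resource."""


def connect_with_retry(connect, attempts, delay_seconds=0.0, sleep=time.sleep):
    """Call connect() until it succeeds; return (connection, attempts_used)."""
    for attempt in range(1, attempts + 1):
        try:
            return connect(), attempt
        except ClusterUnavailable:
            if attempt == attempts:
                raise
            sleep(delay_seconds)


def make_failing_cluster(failures_before_recovery):
    """Simulate a cluster whose passive node needs a few attempts to come online."""
    state = {"calls": 0}

    def connect():
        state["calls"] += 1
        if state["calls"] <= failures_before_recovery:
            raise ClusterUnavailable("passive node is not active yet")
        # Clients always address the logical cluster name, never a physical node.
        return "connected to SQLCLUSTER"

    return connect


connection, attempts_used = connect_with_retry(make_failing_cluster(3), attempts=10)
print(connection, "after", attempts_used, "attempts")
```

Because the caller only ever knows the logical name, nothing in the client has to change when a different node takes over.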
Clustering centers on turning storage, networks, and applications into cluster resources. All this is done through an MMC Snap-In that is part of the Windows Server Manager and is a feature that can be installed in Windows Server. MSDN contains a great amount of information about configuring failover clustering, and specifically clustering SQL Server, located at http://msdn.microsoft.com/en-us/library/ms189134.aspx. To configure a cluster, this documentation will be the most up-to-date and best resource for you to use. Although it provides extremely advanced features, Failover Cluster is a fairly simple technology to work with and I strongly encourage you to experiment with it. It is critical for getting high availability out of Windows Server solutions.
High Availability with clustered BizTalk hosts
The BizTalk runtime can also be clustered much like SQL Server. There is limited applicability for when to do this because the group concept in BizTalk provides a large degree of high availability automatically. This is precisely why persistence and durability are so critical to BizTalk's scalability. Adapters use a transaction to deliver a message to the message box. Once it is there, it is marked for processing in a transaction and eventually marked as "processed" in another transaction. Every part of BizTalk functions in this transactional manner, including orchestrations. If an orchestration were running on a server that simply went dark (power failure perhaps), the BizTalk runtime would detect that this orchestration was no longer running and would load it on another server from its last persistence point. This is a pretty sophisticated capability and, like most of BizTalk, is something we just get for free as part of the infrastructure.
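Resuming from the last persistence point can be illustrated with a small checkpointing model. This is a simplified sketch: the store, the step functions, and the choice to persist after every step are assumptions for illustration, whereas a real orchestration persists at specific points determined by the runtime.

```python
class ServerCrashed(Exception):
    """Simulates a server going dark in the middle of an orchestration."""


class PersistenceStore:
    """Stands in for durable storage shared by every server in the group."""

    def __init__(self):
        self._checkpoints = {}

    def save(self, instance_id, next_step, state):
        self._checkpoints[instance_id] = (next_step, dict(state))

    def load(self, instance_id):
        next_step, state = self._checkpoints.get(instance_id, (0, {}))
        return next_step, dict(state)


def run_orchestration(instance_id, steps, store, crash_before_step=None):
    """Run the steps from the last checkpoint, persisting after each one."""
    next_step, state = store.load(instance_id)
    for index in range(next_step, len(steps)):
        if index == crash_before_step:
            raise ServerCrashed(f"server failed before step {index}")
        steps[index](state)
        store.save(instance_id, index + 1, state)
    return state


executed = []


def step(name):
    def run(state):
        executed.append(name)
        state[name] = "done"
    return run


steps = [step("receive"), step("transform"), step("send")]
store = PersistenceStore()

try:
    run_orchestration("order-1", steps, store, crash_before_step=2)  # first server
except ServerCrashed:
    pass
final_state = run_orchestration("order-1", steps, store)  # picked up by another server
print(executed, final_state)
```

The second call does not repeat the steps that were already persisted; it continues from the checkpoint, which is the behavior the group provides without any clustering.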
There certainly are instances where clustering BizTalk hosts is a good idea. A good example is for adapters that are not safe for parallel operations, such as the FTP and MSMQ adapters when used for the receive operations. Because the FTP protocol does not provide a way to lock a file, there is no way to know that the file is being read by another host instance. If our environment had two BizTalk Servers, each running a host instance with an FTP receive, we could very well have the same file read into BizTalk twice. This would not be a desirable situation.
The solution to this is to allow only one host instance to run these adapters. This is easy to specify in the BizTalk Administration console. To do this, we simply create a new BizTalk host and then assign the adapter handlers that we want to run in this host. The handler is the binding between an adapter and a host. Adapters can have many handlers for different hosts and have different ones for send and receive operations. This allows us to have a fine-grained level of control over the partitioning of our solutions in our environment. Creating a host was covered previously and assigning the handlers is fairly easy. In the BizTalk Administration console, expand the Platform Settings node and click on Adapters. The list of configured adapters is displayed. If you click on a specific adapter, in our case the FTP adapter, you are shown which handlers are set up for both send and receive operations. If we double-click Receive, or click the Properties option in the Actions pane, we can change the assigned handler; that is, which host this operation is bound to. In the following screenshot, we can see that the Receive handler has been assigned to a host named SingleInstanceHost:
We can also use the New operation to create new handlers, so that certain applications use one host for an adapter and others can use another host. In the previous scenario, we could easily create a poor man's cluster by marking the host instance as disabled on all servers except one. This would ensure that the host instance only ran on one server. Unfortunately, it would not provide automatic failover; failover would require an administrator to enable and then start one of the other host instances.
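The duplicate-read risk with FTP receives, and the effect of running the receive handler on a single enabled host instance, can be shown with a short simulation. It models the worst case in which every receiver lists the remote folder before any of them removes a file; the server names and the enabled flags are hypothetical.

```python
def active_receivers(host_instances):
    """Return the servers whose host instance is enabled (hypothetical flags)."""
    return [server for server, enabled in host_instances.items() if enabled]


def poll_once(remote_files, receivers):
    """Worst case: every receiver lists the folder before anyone deletes a file."""
    listings = {server: list(remote_files) for server in receivers}
    received = [(server, name)
                for server, names in listings.items()
                for name in names]
    remote_files.clear()   # files disappear only after they have been downloaded
    return received


# Two enabled host instances: each file can be read into BizTalk twice.
both = poll_once(["order1.xml", "order2.xml"],
                 active_receivers({"BTS01": True, "BTS02": True}))

# Poor man's cluster: the host instance is disabled everywhere except one server.
single = poll_once(["order1.xml", "order2.xml"],
                   active_receivers({"BTS01": True, "BTS02": False}))

print(len(both), "messages with two receivers;", len(single), "with one")
```

The single-receiver case avoids duplicates, but if `BTS01` fails nothing enables `BTS02` automatically, which is the gap that a clustered host instance closes.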
To accomplish real clustering of the host instance, we need to create a cluster resource like before. Clustering a host instance is similar to clustering SQL Server, though it does not require shared storage for data files. This is shown as follows:
Understanding disaster recovery
Disaster recovery is generally one of the least understood parts of BizTalk Server. This could be because the product does so much for us that we don't give serious thought into how it works, until there is a disaster. The vast majority of BizTalk installations are not set up properly for disaster recovery. This section will outline how disaster recovery works in BizTalk and how to make sure it is working correctly in your environment.
Unlike high availability or scalability, disaster recovery is what we turn to after a true disaster, like fire, flood, or earthquake. These are the sorts of disasters that wipe out entire data centers. These types of disasters can be extinction-level events for many enterprises. IT has a long history of planning to cope with these sorts of disasters.
BizTalk has several specific aspects that require more planning and discipline for disaster recovery than most applications. Unlike many applications, you cannot use the raw data and transaction log files in BizTalk to perform disaster recovery. In fact, you can't even use the normal SQL disaster recovery plans that most enterprises have already established. Generally, most enterprises create a backup job that simply backs up all the databases on a server. This will not work with BizTalk. This is because BizTalk involves many databases that interact with each other often through DTC. The use of multiple databases is why BizTalk can be so well distributed and can scale so well, but it requires the databases to stay in sync with each other. This is also why mirroring cannot be used in SQL Server, because mirroring cannot ensure transactional integrity between multiple databases. BizTalk is backed up through the concept of log shipping.
In SQL Server, a database is represented by two (or more) physical files: a primary data file (MDF) and a transaction log file (LDF). The MDF stores all the data for a database and the LDF is where transactions are actually written before being committed. This allows SQL Server to defer some processing and to hold intermediate results (or on-going transactions) without changing the current state of the database until a transaction is complete. This is critical for providing rollback capability and also for making SQL Server able to handle read operations while simultaneously processing write or update operations. Unless the database is brought offline gracefully, which normally means processing the transactions into the MDF, the state of the database will be unreliable without the LDF. Log shipping involves moving differential snapshots of the transaction log to a new file that can be used to restore the database without taking it offline.
The BizTalk backup job
Backup for the databases in BizTalk is accomplished with transaction log marking and is performed by the only supported method for backing up BizTalk databases: an included SQL job designed specifically for this purpose. This job, Backup BizTalk Server, marks all the databases in a BizTalk group with the same log mark at the same time. It then copies the differential transaction logs as files to a new location. This multi-database synchronized marking and shipping is shown in the following figure:
By default, the job runs every fifteen minutes. Once per day it creates a full backup of all the databases; on every other run it backs up only the transaction logs. Unfortunately, in many BizTalk environments this job is not enabled, because out of the box it has no destination configured for the backup files. This location should be a network share or on a SAN. As soon as you set a location in the backup job configuration, you can and should enable this job.
Ultimately, this is also only half the process, the other half is configuring a destination for these transaction logs to be imported into. The destination SQL Server for a restore will use the backup files and the differential logs to recreate the state of the original databases on the new server. This does mean that we could lose some transactional history if we perform a true disaster recovery. The maximum amount of data lost is the time window that the BizTalk backup is scheduled to run in, which defaults to the previously mentioned 15 minutes. This value can be lowered or increased depending on your requirements.
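The relationship between the backup interval and the worst-case data loss is simple arithmetic, sketched below. The function name and the dates are illustrative assumptions, and the model treats each backup as completing instantly at its scheduled time.

```python
from datetime import datetime, timedelta


def restore_point(first_backup, interval, disaster_time):
    """Return the time of the last scheduled backup at or before the disaster."""
    if disaster_time < first_backup:
        raise ValueError("no backup exists before the disaster")
    completed_runs = (disaster_time - first_backup) // interval
    return first_backup + completed_runs * interval


# Hypothetical schedule: the default 15-minute interval starting at midnight.
first_backup = datetime(2011, 1, 1, 0, 0)
interval = timedelta(minutes=15)
disaster = datetime(2011, 1, 1, 10, 44)

recovered_to = restore_point(first_backup, interval, disaster)
lost = disaster - recovered_to
print("restored to", recovered_to, "losing", lost)
```

In this example the disaster strikes just before the next run, so nearly a full interval of activity is lost. Shortening the interval narrows that window at the cost of running the job more often.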
Standing the new BizTalk environment up
The destination SQL Server has several jobs that run on it to help it track and process the backed up files. When time comes to stand this server up as the replacement, you disable the jobs that import log files and run a restore job (these jobs are created for you with scripts that ship with BizTalk). You then run other scripts on the individual BizTalk Servers in your new environment to configure them to point to this new database. If these are completely new servers, you can simply configure them to join an existing group as if you were adding new servers to a group. The scripts are really designed for transferring existing servers to point to the new databases. This is common if you're using backups to restore the BizTalk Servers.
All of this is well documented in the BizTalk documentation and on MSDN, but presented here for clarity. There is, however, one final option to provide disaster recovery and that is the stretch-cluster or geo-cluster. In this scenario, a node in the SQL Failover Cluster runs in a geographically remote location. This allows auto failover without log shipping, but it requires immense bandwidth and is currently not feasible for most organizations. Also, due to the laws of physics, which dictate how fast information can travel over a network, it is not appropriate for very long distances. It also requires SAN technology that supports mirroring, which at this point tends to be proprietary and expensive. If you have this bandwidth available, you could also have nodes of your BizTalk group configured in the redundant location and disaster failover would be almost completely automated. Finally, it is important to keep in mind that there is no job that cleans up the files after the log shipping; they need to be deleted periodically to keep the shared storage location from becoming full.
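Since the backup files have to be cleaned up separately, a scheduled script is one way to do it. The following is a minimal sketch under stated assumptions: the folder path is a placeholder, the `.bak` and `.trn` extensions and the retention period are assumptions to verify against your own backup location, and any real cleanup must keep every file back to the most recent full backup you may need to restore from.

```python
import time
from pathlib import Path

BACKUP_EXTENSIONS = {".bak", ".trn"}   # assumed extensions; verify before use


def delete_old_backups(folder, max_age_days, now=None, dry_run=True):
    """Delete backup files older than max_age_days; return the affected names."""
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400
    affected = []
    for path in sorted(Path(folder).iterdir()):
        is_backup = path.is_file() and path.suffix.lower() in BACKUP_EXTENSIONS
        if is_backup and path.stat().st_mtime < cutoff:
            if not dry_run:
                path.unlink()
            affected.append(path.name)
    return affected


if __name__ == "__main__":
    # Hypothetical share path; dry_run=True only reports what would be deleted.
    share = Path(r"\\backupserver\biztalk")
    if share.exists():
        print(delete_old_backups(share, max_age_days=14, dry_run=True))
```

Running with `dry_run=True` first makes it possible to review the list before anything is removed.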
In this article we have learned how BizTalk runs on servers in our environment and the different roles and servers required for building scalable and reliable BizTalk installations. We learned about scalability and high availability, as well as disaster recovery. We also examined sample topologies and their benefits and requirements.