Creativity is the power to connect the seemingly unconnected.
Let's begin our journey by investigating what BizTalk Server actually is, why to use it, and how to craft a running application. This chapter will be a refresher on BizTalk Server for those of you who have some familiarity with the product.
In this chapter, you will learn:
How to articulate BizTalk Server, when to use it, and how it works
To outline the role of BizTalk schemas, maps, and orchestrations
BizTalk messaging configurations
So what exactly is BizTalk Server, and why should you care about it? In a nutshell, Microsoft BizTalk Server 2009 uses adapter technology to connect disparate entities and enable the integration of data, events, processes, and services. An entity may be an application, department, or even an altogether different organization that you need to be able to share information with. A software adapter is typically used when we need to establish communication between two components that do not natively collaborate. BizTalk Server adapters are built with a common framework which results in system integration done through configuration, not coding.
Traditionally, BizTalk Server has solved problems in three areas. First, BizTalk Server acts as an Enterprise Application Integration (EAI) server that connects applications that are natively incapable of talking to each other. The applications may have incompatible platforms, data structure formats, or security models. For example, when a new employee is hired, the employee data in the human resources application needs to be sent to the payroll application so that the new employee receives his/her paycheck on time. Nothing prevents you from writing the code necessary to connect these disparate applications with a point-to-point solution. However, using such a strategy often leads to an application landscape that looks like this:
Many organizations choose to insert a communication broker between these applications, as shown in the following figure.
Some of the benefits that you would realize from such an architectural choice include:
Loose coupling of applications where one does not have a physical dependency on the other
Durable infrastructure that can guarantee delivery, and queue messages during destination system downtime
Centralized management of system integration endpoints
Message flow control such as in-order delivery
Insight into cross-functional business processes through business activity monitoring
BizTalk Server solves a second problem by filling the role of business-to-business (B2B) broker that facilitates communication across different organizations. BizTalk supports B2B scenarios by offering Internet-friendly adapters, industry-standard EDI message schemas, and robust support for both channel- and message-based security.
The third broad area that BizTalk Server excels in is Business Process Automation (BPA). BPA is all about taking historically manual workflow procedures and turning them into executable processes. For example, consider an organization that typically receives a new order via email, after which a sales agent manually checks inventory levels prior to inserting the order into the Fulfillment System. If inventory is too low, then the sales agent has to initiate an order with their supplier and watch out for the response so that the Inventory System can be updated. What problems are inevitable in this scenario?
Poor scalability when the number of orders increases
Lack of visibility into the status of orders and supplier requests
Multiple instances of redundant data entry, ripe for mistakes
By deciding to automate this scenario, the company can reduce human error while streamlining communications between applications and organizations.
What's one thing all of these BizTalk Server cases have in common? They all depend on the real-time interchange and processing of discrete messages in an event-driven fashion. This partially explains why BizTalk Server is such a strong tool within a service-oriented architecture. We'll investigate many of BizTalk's service-oriented capabilities in later chapters, but it's important to note that the functionality that exists to support the three top-level scenarios above (EAI, B2B, and BPA) fits nicely into a service-oriented mindset. Concepts such as schema-first design, loose coupling, and reusability are soaked into the fabric of BizTalk Server.
BizTalk Server should be targeted for solutions that exchange real-time messages as opposed to Extract Transform Load (ETL) products that excel at bulky, batch-oriented exchanges between data stores.
BizTalk Server 2009 is the 6th release of the product, the first release being BizTalk Server 2000. Back in those days, developers had access to four native adapters (file system, MSMQ, HTTP, and SMTP); development was done in a series of different tools, and the underlying engine had some fairly tight coupling between components. Since then, the entire product was rebuilt and reengineered for .NET and a myriad of new services and features have become part of the BizTalk Server suite. The application continues to evolve and take greater advantage of the features of the Microsoft product stack, while still being the most interoperable and platform-neutral offering that Microsoft has ever produced.
So how does BizTalk Server actually work? BizTalk Server at its core is an event-processing engine, based on a conventional publish-subscribe pattern. Wikipedia defines the publish-subscribe pattern as:
An asynchronous messaging paradigm where senders (publishers) of messages are not programmed to send their messages to specific receivers (subscribers). Rather, published messages are characterized into classes, without knowledge of what (if any) subscribers there may be. Subscribers express interest in one or more classes, and only receive messages that are of interest, without knowledge of what (if any) publishers there are.
This pattern enforces a natural loose coupling and provides more scalability than an engine that requires a tight connection between receivers and senders. In the first release of BizTalk Server, the product DID have tightly coupled messaging components, but thankfully the engine was completely redesigned for BizTalk Server 2004.
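To make the publish-subscribe idea concrete, here is a minimal Python sketch. It is a toy, in-memory stand-in and not how the BizTalk engine is implemented; the `MessageBus` class and the message class names are assumptions made purely for illustration.

```python
from collections import defaultdict


class MessageBus:
    """Toy publish-subscribe broker (illustrative only, not BizTalk's engine)."""

    def __init__(self):
        # Maps a message class name to the handlers interested in it.
        self._subscribers = defaultdict(list)

    def subscribe(self, message_class, handler):
        self._subscribers[message_class].append(handler)

    def publish(self, message_class, body):
        # The publisher never names a receiver; the bus looks up interest.
        handlers = self._subscribers.get(message_class, [])
        for handler in handlers:
            handler(body)
        return len(handlers)


bus = MessageBus()
payroll_inbox, badge_inbox = [], []
bus.subscribe("NewEmployee", payroll_inbox.append)
bus.subscribe("NewEmployee", badge_inbox.append)

delivered = bus.publish("NewEmployee", {"name": "Pat"})  # two subscribers
ignored = bus.publish("Invoice", {"total": 10})          # nobody subscribed
```

Notice that the publisher of the `NewEmployee` message has no idea that two systems received it, and adding a third subscriber would not require touching the publisher at all.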
Once a message is received by a BizTalk adapter, it runs through any necessary pre-processing (such as decoding) in BizTalk pipelines, before being subjected to data transformation via BizTalk maps, and finally being published to a central database called the MessageBox. Then, parties which have a corresponding subscription for that message can consume it as they see fit. While introducing a bit of unavoidable latency, the MessageBox database makes up for that by providing us with durability, reliability, and scalability. For instance, if one of our subscriber systems is offline for maintenance, outbound messages are not lost, but rather the MessageBox makes sure to queue messages until the subscriber is ready to receive them. Worried about a large flood of inbound messages that steal processing threads away from other BizTalk activities? No problem! The MessageBox makes sure that each and every message finds its way to its targeted subscriber, even if it must wait until the flood of inbound messages subsides.
There are really two ways to look at the way BizTalk is structured. The first is the traditional EAI view, which sees BizTalk receiving messages, and routing them to the next system for consumption. The flow is very linear and BizTalk is seen as a broker between two applications.
However, the other way to consider BizTalk, and the focus of this book, is as a service bus, with numerous input/output channels that process messages in a very dynamic way. That is, instead of visualizing the data flow as a straight path through BizTalk to a destination system, consider BizTalk exposing services as on-ramps to a variety of destinations. Messages published to BizTalk Server may fan out to dozens of subscribers, who have no interest in what the publishing application actually was. Instead of thinking about BizTalk as a simple connector of systems, think of BizTalk as a message bus which coordinates a symphony of events between endpoints.
This concept, first introduced to me by the incomparable Charles Young (http://geekswithblogs.net/cyoung/), is an exciting way to exploit BizTalk's engine in this modern world of service-orientation. In the diagram below, I've shown how the central BizTalk bus has receiver services hanging off of it, and has a multitude of distinct subscriber services that are activated by relevant messages reaching the bus.
If the on-ramp concept is a bit abstract to understand, consider a simple analogy. In designing the transportation for a city, it would be foolish of me to create distinct roads between each and every destination. The design and maintenance of such a project would be lunacy. I would be smart to design a shared highway with on and off ramps, which enable people to use a common route to get between the numerous locations around town. As new destinations in the city emerge, the entire highway (or road system) doesn't need to undergo changes, but rather, only a new entrance/exit point needs to be appended to the existing shared infrastructure.
What exactly is a message anyway? A message is data processed through BizTalk Server's messaging engine, whether that data is transported as an XML document, a delimited flat file, or a Microsoft Word document. The message content may contain a command (for example
InsertCustomer), a document (for example
Invoice), or an event (for example
VendorAdded). A message has a set of properties associated with it. First and foremost, a message may have a type associated with it which uniquely defines it within the messaging bus. The type is typically comprised of the XML namespace and the root node name (for example
http://CompanyA.Purchasing#PurchaseOrder). The message type is much like the class object in an object-oriented programming language; it uniquely identifies entities by their properties. The other critical attribute of a message in BizTalk Server is the property bag called the message context. The message context is a set of name/value properties that stays attached to the message as long as it remains within BizTalk Server. These context values include metadata about the transport used to publish the message, and attributes of the message itself. Properties in the message context that are visible to the BizTalk engine, and therefore available for routing decisions, are called promoted properties.
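As a rough illustration of how a message type relates to the document itself, the following Python sketch derives the namespace#root form from an XML instance and stores it in a context dictionary. The function names and the `InboundTransportType` entry are used here only as illustrative assumptions, and the handling of documents without a namespace is likewise an assumption of this sketch.

```python
import xml.etree.ElementTree as ET


def message_type(xml_text):
    """Return 'namespace#rootnode' for an XML document."""
    root = ET.fromstring(xml_text)
    # ElementTree renders a qualified tag as "{namespace}localname".
    if root.tag.startswith("{"):
        namespace, local_name = root.tag[1:].split("}", 1)
        return f"{namespace}#{local_name}"
    # Assumption for this sketch: no namespace means the root name alone.
    return root.tag


def build_context(xml_text, transport_properties):
    """Combine transport metadata with the derived message type."""
    context = dict(transport_properties)
    context["MessageType"] = message_type(xml_text)
    return context


doc = (
    '<PurchaseOrder xmlns="http://CompanyA.Purchasing">'
    "<Id>42</Id></PurchaseOrder>"
)
context = build_context(doc, {"InboundTransportType": "FILE"})
```

The resulting dictionary plays the role of the property bag: the message body is untouched, while routing decisions can be made purely on the name/value pairs.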
How does a message actually get into BizTalk Server? A receive location is configured for the actual endpoint that receives messages. The receive location uses a particular adapter, which knows how to absorb the inbound message. For instance, a receive location may be configured to use the FILE adapter which polls a particular directory for XML messages. The receive location stores the file path to monitor, while the adapter provides transport connectivity. Upon receipt of a message, the adapter stamps a set of values into the message context. For the FILE adapter, values such as
ReceivedFileName are added to that message's context property bag. Note that BizTalk has both application adapters such as SQL Server, Oracle, and SAP as well as transport-level adapters such as HTTP, MSMQ, and FILE. The key point is that the adapter configuration user experience is virtually identical regardless of the type of adapter chosen.
Receive locations have a particular receive pipeline associated with them. A pipeline is a sequential set of operations that are performed on the inbound message in preparation for being parsed and processed by BizTalk. For instance, I would need a pipeline in order to decrypt, unzip, or validate the XML structure of my inbound message. One of the most critical roles of the pipeline is to identify the type of the inbound message and put the type into the message context as a promoted property. As discussed earlier, a message type is the unique characterization of a message. Think of a receive pipeline as doing all the pre-processing steps necessary for putting the message in its most usable format.
A receive port contains one or more receive locations. Receive ports have XSLT maps associated with them that are applied to messages prior to publishing them to the MessageBox database. What value does a receive port offer me? It acts as a grouping of receive locations where capabilities such as mapping and data tracking can be applied to any of the receive locations associated with it. It may also act as a container that allows me to publish a single entity to BizTalk Server regardless of how it came in, or what it looked like upon receipt. Let's say that my receive port contains three receive locations, which all receive slightly different "invoice" messages from three different external vendors. At the receive port level, I have three maps that each take one of these differing messages and map it to a single, common format, before publishing it to BizTalk.
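The three-vendor invoice scenario can be sketched in a few lines of Python. This is a conceptual model only: real receive ports apply XSLT maps chosen by message type, whereas here the maps are plain functions, and every vendor field name and message type string is a made-up assumption.

```python
# Hypothetical per-vendor maps that all emit one common invoice shape.
def map_vendor_a(invoice):
    return {"InvoiceId": invoice["id"], "Total": invoice["amount"]}


def map_vendor_b(invoice):
    return {"InvoiceId": invoice["InvoiceNumber"], "Total": invoice["Sum"]}


def map_vendor_c(invoice):
    return {"InvoiceId": invoice["ref"], "Total": invoice["net"] + invoice["tax"]}


class ReceivePort:
    """Toy receive port: picks a map by message type, then publishes."""

    def __init__(self, maps):
        self.maps = maps        # message type -> map function
        self.message_box = []   # stand-in for the MessageBox database

    def receive(self, message_type, body):
        mapper = self.maps.get(message_type)
        # Messages with no matching map are published unchanged.
        canonical = mapper(body) if mapper else body
        self.message_box.append(canonical)
        return canonical


port = ReceivePort({
    "VendorA.Invoice": map_vendor_a,
    "VendorB.Invoice": map_vendor_b,
    "VendorC.Invoice": map_vendor_c,
})
port.receive("VendorA.Invoice", {"id": "A-1", "amount": 100})
port.receive("VendorB.Invoice", {"InvoiceNumber": "B-7", "Sum": 250})
port.receive("VendorC.Invoice", {"ref": "C-3", "net": 90, "tax": 10})
```

Whatever arrives, subscribers only ever see the common `InvoiceId`/`Total` shape, which is the whole point of mapping at the receive port.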
By default, all messages pass through BizTalk Server as a stream of bytes, not as an XML message loaded into the server's memory. Therefore, when the message is published to the MessageBox, BizTalk Server has yet to look inside the message unless:
The receive port had an XSLT map corresponding to the inbound message type
An XML validation/disassemble/decoding pipeline component was applied to the message.
Note that custom pipeline components may also peek into the message content. If the message has promoted properties associated with it, then the disassembler pipeline component will extract the relevant data nodes from the message and insert them into the message context.
Now that we have a message cleaned up (by the pipeline) and in a final structure (via an XSLT map), it's published to the BizTalk Server MessageBox where message routing can begin. For our purposes, there are two subscribers that we care about. The first type of subscriber is a send port. A send port is conceptually the inverse of the receive location and is responsible for transporting messages out of the BizTalk bus.
It has not only an adapter reference, adapter configuration settings, and a pipeline (much like the receive location), but it also has the ability to apply XSLT maps to outbound messages. If a send port subscribes to a message, it first applies any XSLT maps to the message, then processes it through a send pipeline, and finally uses the adapter to transmit the message out of BizTalk.
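The ordering of the send port steps (map first, then pipeline, then adapter) is easy to lose track of, so here is a small Python sketch of it. All of the function names are hypothetical stand-ins, and the "adapter" simply reports how many bytes it would have written.

```python
def send_through_port(message, outbound_map, send_pipeline, adapter):
    """Apply the send port steps in the order described above."""
    mapped = outbound_map(message)      # 1. map to the partner format
    prepared = send_pipeline(mapped)    # 2. pipeline work such as encoding
    return adapter(prepared)            # 3. hand off to the transport


steps = []


def to_partner_format(message):
    steps.append("map")
    return {"OrderNumber": message["OrderId"]}


def serialize(message):
    steps.append("pipeline")
    return repr(message).encode("utf-8")


def file_adapter(payload):
    steps.append("adapter")
    return len(payload)  # pretend the payload was written somewhere


bytes_sent = send_through_port(
    {"OrderId": "42"}, to_partner_format, serialize, file_adapter
)
```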
The other subscriber for a published message is a BizTalk orchestration. An orchestration is an executable business process, which uses messages to complete operations in a workflow. We'll spend plenty of time working with orchestration subscribers throughout this book.
What do you need to set up a brand new BizTalk project? First, you will want to have a development environment with Windows Server 2008, IIS 7.0, SQL Server 2008, Visual Studio 2008, and BizTalk Server 2009, installed in that order.
Consider using a standard structure for all of your BizTalk Server solutions. This makes it easier to package and share source code, while also defining a consistent place to store solution artifacts in each project. To build the structure below, I put together a VBScript file, which is available on my blog at: http://seroter.wordpress.com/2007/03/29/script-for-automatically-creating-biztalk-solution-structure/.
Note that BizTalk Server 2009 solutions can (and should) be centrally persisted in standard source control applications such as Subversion or Microsoft Team Foundation Server.
You can tell that BizTalk Server is successfully installed in your development environment if you are able to see BizTalk Projects in the Visual Studio.NET New Projects menu option.
When a new BizTalk Project is added to a Visual Studio.NET solution, you should immediately right-click the project and select the Properties option. In BizTalk Server 2009, we can now set properties in the familiar C# project properties pane, instead of the BizTalk-only properties window. The BizTalk project type has been redesigned so that BizTalk projects are now simply specialized C# project types.
The first value that you need to set is under the Signing section. You can either point to an existing strong name key, or now in BizTalk Server 2009, generate a new key on the fly. BizTalk Server projects are deployed to the Global Assembly Cache (GAC) and must be strong named prior to doing so. After setting the necessary key value, navigate to the BizTalk-specific Deployment section, and set the Application Name to something meaningful such as BizTalkSOA.
Once you have a project created, the strong name key set, and application name defined, you're ready to start adding development artifacts to your project.
Arguably the building block of any BizTalk Server solution (and general SOA solution) is the data contract, which describes the type of messages that flow through the BizTalk bus. A contract for a message in BizTalk Server is represented using an industry-standard XML Schema Definition (XSD). For a given contract, the XSD spells out the elements, their organizational structure, and their data types. An XSD also defines the expected ordering of nodes, whether or not the node is required, and how many times the node can appear at the particular location in the node tree. Following is an example XSD file:
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="Person">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="FirstName" type="xs:string"/>
        <xs:element name="LastName" type="xs:string"/>
        <xs:element name="Age" type="xs:int"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
Having a strict contract can reduce flexibility but it greatly increases predictability as the message consumer can confidently build an application, which depends on the message being formatted a specific way.
While producing completely valid XSD syntax, the BizTalk Schema Editor takes a higher-level approach to defining the schema itself. Specifically, instead of working purely with familiar XML concepts of
attributes, the BizTalk Schema Editor advances a simpler model based on
fields, which is meant to better represent the hierarchical nature of a schema. Do not let this fact mislead you to believe that the BizTalk Schema Editor is just some elementary tool designed to accommodate the drooling masses. In fact, the Editor enables us to graphically construct relatively complex message shapes through a fairly robust set of visual properties and XSD annotations.
There are multiple ways to create schemas in the BizTalk Schema Editor. These include:
You can generate a schema from an existing XML file. The BizTalk Editor infers the node names and structure from the provided XML instance. In many integration projects, you start off knowing exactly what the transmission payload looks like. If you are fortunate enough to start your project with a sample XML file already in place, this schema generation mechanism is a big time-saver. However, there are caveats to this strategy. The BizTalk Editor can only build a schema structure based on the nodes that are present in the XML file. If optional nodes were omitted from the instance file, then they will be missing from the schema. Also, the schema will not mark "repeating" structures unless the XML file represents a particular node multiple times. Finally, the generated schema will not try to guess the data type of the node, and will default all nodes to a type of
string. Despite these considerations, this method is a fantastic way to establish a head start on schema construction.
XSD schemas may also be manufactured through the BizTalk adapters. For example, the BizTalk adapters for SQL Server and Oracle will generate XSD schemas based on the database table you are targeting. As we will see shortly, BizTalk Server also generates schemas for services that you wish to consume. Using adapters to harvest metadata and automatically generate schemas is a powerful way to make certain that your messages match the expected system format.
New schemas can actually be created by importing and including previously created schemas. If XSD complex types are defined in a schema (for example
Address), then new schemas can be built by mixing and matching existing types. Because these inherited types are merely referenced, not copied, changes to the original content types cascade down to the schemas that reuse them. If you are inclined to design a base set of standard types, then building schemas as compositions of existing types is a very useful way to go.
Finally, you have the option to roll up your sleeves, and build a new XSD schema from scratch. Now while you can switch to a text editor and literally type out a schema, the BizTalk Editor allows you to graphically build a schema tree from the beginning. Note that because of BizTalk Server's rigorous support for the XSD standard, you can even fashion your XML Schemas in alternate tools like Altova's XML Spy. We will use the BizTalk Editor to handcraft many of the schemas that we build together in this chapter and throughout the book.
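The caveats of the first option in this list, generating a schema from an XML instance, become obvious once you try to write the inference yourself. The Python sketch below is my own simplified approximation, not the BizTalk Editor's algorithm: it only sees nodes present in the sample, flags a node as repeating only when the sample repeats it, and types every leaf as a string.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def infer_schema(element):
    """Infer a crude schema tree from a single XML instance."""
    node = {"name": element.tag}
    counts = Counter(child.tag for child in element)
    children = {}
    for child in element:
        # Simplification: only the first occurrence of a repeated
        # node contributes its structure.
        if child.tag not in children:
            child_node = infer_schema(child)
            child_node["repeating"] = counts[child.tag] > 1
            children[child.tag] = child_node
    if children:
        node["children"] = list(children.values())
    else:
        node["type"] = "string"  # no attempt to guess int, date, and so on
    return node


sample = (
    "<Order><Id>1</Id>"
    "<Item><Sku>A</Sku></Item><Item><Sku>B</Sku></Item>"
    "</Order>"
)
schema = infer_schema(ET.fromstring(sample))
by_name = {child["name"]: child for child in schema["children"]}
```

Here `Id` is clearly numeric in the sample but still comes out as a string, and an optional node that happened to be absent from the sample would simply not exist in the result.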
If you're like me, you often sketch the schema layout first, and only later worry about concepts such as data types, repeating nodes, and entry restrictions. By default, each new node is assigned a
string data type and is assumed to only exist once in a single XML document. Using the BizTalk Server Schema Editor, you can associate a given node with a wide variety of alternate data types such as
base64Binary. One thing to remember is that while you may use a more forgiving schema for inbound data, you should be strict in what you send out to other systems. We want to make sure to only produce messages that have clean data and stand little chance of being outright rejected by the target system.
Changing the number of times a particular node can appear in an XML document is as simple as highlighting the target node and setting the Max Occurs property. It's also fairly straightforward to set limits on the data allowed within certain nodes. What if we want a ZipCode field to only accept a maximum of 10 characters? Or what if the data stored in an AddressType node should be constrained to only 3 allowable choices? By default, these options are not visible for a given node. To change that, you can select a node and set the Derived By equal to Restriction. A flurry of new properties becomes available such as Maximum Length or Enumeration.
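To show what the Maximum Length and Enumeration restrictions mean for the data itself, here is a small Python sketch of the equivalent checks. The `violations` helper and the three address type values are hypothetical; in practice these rules live in the XSD and are enforced by schema validation.

```python
def violations(value, max_length=None, enumeration=None):
    """Return a list of restriction violations for a single field value."""
    problems = []
    if max_length is not None and len(value) > max_length:
        problems.append(f"longer than {max_length} characters")
    if enumeration is not None and value not in enumeration:
        problems.append(f"not one of {sorted(enumeration)}")
    return problems


# ZipCode limited to a maximum of 10 characters.
zip_ok = violations("90210-1234", max_length=10)
zip_bad = violations("90210-12345", max_length=10)

# AddressType constrained to 3 allowable (made-up) choices.
address_types = {"Billing", "Home", "Shipping"}
type_ok = violations("Home", enumeration=address_types)
type_bad = violations("Office", enumeration=address_types)
```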
A critical BizTalk schema concept to examine is the property schema. Earlier in this chapter, I mentioned the notion of promoted properties which expose a message's data content to the BizTalk messaging layer. This in turn allows for a message to be routed to subscribers who are specifically interested in a particular data condition (for example
12345). Promoted properties are defined in a property schema, which is a special schema type within BizTalk Server. The property schema contains a flat list of elements (no records allowed) that represent the type of data we want the BizTalk messaging engine to know about. Once the property schema is created, we can associate specific fields in our message schema with the elements defined in the property schema. As we will see in practice later in this book, one key benefit of property schemas is that they can be used by more than one XSD schema. For instance, we could create a
ModifiedEmployee message that both map to a single
EmployeeID property field. In this manner, we can associate messages of different types which have common data attributes.
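The idea of several message types sharing one promoted property can be sketched as follows. This is a conceptual Python model rather than anything BizTalk executes: the `PROMOTIONS` table, the `NewEmployee` message name, and both document layouts are assumptions I introduced for the example.

```python
import xml.etree.ElementTree as ET

# Hypothetical table: root node name -> path of the field that is
# associated with the shared EmployeeID property.
PROMOTIONS = {
    "NewEmployee": "Id",
    "ModifiedEmployee": "Employee/Id",
}


def promote(xml_text):
    """Copy the mapped field into the message context, if present."""
    root = ET.fromstring(xml_text)
    context = {"MessageType": root.tag}
    path = PROMOTIONS.get(root.tag)
    if path is not None:
        node = root.find(path)
        if node is not None:
            context["EmployeeID"] = node.text
    return context


new_ctx = promote("<NewEmployee><Id>12345</Id></NewEmployee>")
mod_ctx = promote(
    "<ModifiedEmployee><Employee><Id>12345</Id></Employee></ModifiedEmployee>"
)

# One subscription condition now matches both message types.
matches = [ctx for ctx in (new_ctx, mod_ctx) if ctx.get("EmployeeID") == "12345"]
```

Even though the two documents store the identifier in different places, a subscriber only needs to know about the single `EmployeeID` property.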
The BizTalk Schema Editor is a robust tool for building industry-standard XSD schemas. In a service-oriented architecture, the data contract is key, and understanding how to construct an XSD contract within BizTalk Server is an important skill.
Rarely does data emitted from one system match the structure and content expected by another system. Hence, some sort of capability is needed to translate data so that it can be digested by a variety of consumers. Extensible Stylesheet Language Transformations (XSLT) is the industry standard for reshaping XML documents and the BizTalk Mapper is the tool used by BizTalk developers to graphically build XSLTs.
We are often lucky enough to be able to make direct connections between nodes. For instance, even though the node names are different, it is very easy to drag a link between a source node named FName and a destination node named FirstName. However, frequently you are required to generate new data in a destination schema that requires reformatting or reshaping the source data. This is where BizTalk Mapper functoids come to the rescue. What in the world is a functoid? Well, it is a small component which executes data manipulation functions and calculations on source nodes in order to meet the needs of the destination schema. There are over 75 functoids available in the BizTalk Mapper, which span a variety of categories such as string manipulation, mathematical calculations, logical conditions, and cumulative computation.
If you don't see exactly what you're looking for, you can use the Scripting functoid which enables you to write your own XSL script or .NET code to be executed within the map.
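If it helps to think of a map in code terms, a functoid is just a small function sitting on the link between source and destination nodes. The Python sketch below mimics a direct link plus string concatenation and uppercase functoids; the `LName`, `FullName`, and `Country` fields are invented for the example, and a real map compiles to XSLT rather than Python.

```python
def concatenate(*parts):
    """Stand-in for a string concatenation functoid."""
    return "".join(parts)


def uppercase(value):
    """Stand-in for an uppercase functoid."""
    return value.upper()


def employee_map(source):
    """Shape a source record into the destination structure."""
    return {
        # Direct link: different node names, same value.
        "FirstName": source["FName"],
        # Functoid: build a new destination value from two source nodes.
        "FullName": concatenate(source["FName"], " ", source["LName"]),
        # Functoid: reformat a value without changing its meaning.
        "Country": uppercase(source["Country"]),
    }


result = employee_map({"FName": "Ada", "LName": "Lovelace", "Country": "uk"})
```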
It's important to understand that the BizTalk Mapper is for data normalization logic only, NOT business logic. If you need to make business decisions, a map is not the right place to store that logic. For example, you would not want to embed complex discount generation logic within a BizTalk map. That sort of business logic belongs in a more easily maintained repository than in a map file. As a simple rule, the map should only be responsible for shaping the output message, not for altering the meaning of the data in its fields. Maps are great for transformation instructions, but a lousy place to store mission-critical business algorithms.
Earlier in this crash-course on BizTalk Server, we discussed the BizTalk messaging architecture and its foundation in a publish and subscribe routing model. One of the most important parts of a messaging configuration is enabling the receipt of new messages. Without the ability to absorb messages, there's not much else to talk about. In BizTalk Server, messages are brought onboard through the combination of receive ports and receive locations.
Receive ports can be configured from within the BizTalk Server Administration Console. New receive ports support either "one-way" or "two-way" message exchange patterns. On the left-hand side of a receive port configuration, there is a series of vertically arranged tabs that display different sets of properties. Choosing the Receive Locations tab enables us to create the actual receive location which defines the URI that BizTalk will monitor for inbound messages. In the Transport section of a receive location's primary configuration pane, we can choose from the list of available BizTalk adapters. Once an adapter is chosen from the list, the Configure button next to the selected transport type becomes active. For a receive location exploiting the FILE adapter, "configuration" requires entering a valid file path into the Receive folder property.
The next step in configuring BizTalk messaging is to create a subscriber for the data that is published by this receiving interface. BizTalk send ports are an example of a subscriber in a messaging solution. Much like receive locations, send ports allow you to choose a BizTalk adapter and configure the transmission URI for the message. However, simply configuring a URI does not complete a send port configuration, as we must pinpoint what type of message this subscriber is interested in. On the left side of a send port configuration window, there is a vertical set of tabs. The Filters tab is where we can set up specific interest criteria for this send port. For example, we could define a subscription that listens for all messages of a particular type that reach the MessageBox.
A send port can be in three distinct states. By default, a send port is
unenlisted. This means that the port has not registered its particular subscription with BizTalk, and would not pull any messages from the MessageBox. A send port may also be
enlisted, which is associated with ports that have registered subscriptions but are not processing messages. In this case, the messages targeted for this port stay in a queue until the port is placed in the final state,
started. A started port has its subscriptions active in the MessageBox and is heartily processing all the messages it cares about.
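The three states, together with a message type filter, can be modeled as a tiny state machine. This Python sketch is a simplification under stated assumptions: in BizTalk the waiting messages are held by the MessageBox rather than by the port object, and the `SendPort` class and its method names are hypothetical.

```python
class SendPort:
    """Toy send port with a message type filter and three states."""

    def __init__(self, message_type):
        self.message_type = message_type  # the subscription filter
        self.state = "unenlisted"
        self.queue = []                   # messages waiting for a start
        self.sent = []                    # messages actually transmitted

    def enlist(self):
        self.state = "enlisted"

    def start(self):
        self.state = "started"
        # Drain anything that queued up while the port was only enlisted.
        while self.queue:
            self.sent.append(self.queue.pop(0))

    def offer(self, message_type, body):
        """Return True if this port's subscription picked up the message."""
        if self.state == "unenlisted" or message_type != self.message_type:
            return False
        if self.state == "enlisted":
            self.queue.append(body)
        else:
            self.sent.append(body)
        return True


port = SendPort("Invoice")
before_enlist = port.offer("Invoice", "a")  # no subscription registered yet
port.enlist()
queued = port.offer("Invoice", "b")         # subscribed, but held in a queue
wrong_type = port.offer("Order", "x")       # filter does not match
port.start()
port.offer("Invoice", "c")                  # processed immediately
```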
BizTalk Server includes a workflow platform, which allows us to graphically create executable, long-running, stateful processes. These workflows, called orchestrations, are designed in Visual Studio.NET and executed on the BizTalk Server. The Orchestration Designer in Visual Studio.NET includes a rich palette of shapes we can use to build robust workflows consisting of control flow, message manipulation, service consumption, and much more. The Orchestration Runtime is responsible for executing the orchestrations and managing their state data.
Orchestration is a purely optional part of a BizTalk solution. You can design a complete application that consists solely of message routing ports. In fact, many of the service-oriented patterns that we visit throughout this book will not require an orchestration. That said, there are a number of scenarios where injecting orchestrations into the solution makes sense. For instance, instead of subscribing directly to the "new employee" message, perhaps a payroll system will need additional data (such as bank information for a direct deposit) not currently available in the original employee message. We could decide to create a workflow, which first inserts the available information into the payroll system, and then sends a message to the new employee asking for additional data points. The workflow would then wait for and process the employee's response and conclude by updating the record in the payroll system with the new information. BizTalk orchestrations are a good fit for automating manual processes, or choreographing a series of disconnected services or processes to form a single workflow.
Orchestration "shapes" such as Decide, Transform, Send, Receive, and Loop are used to build our orchestration diagrams like the one below. This particular diagram below shows a message leaving the orchestration, and then another message returning later on in the flow. How does that message know which running orchestration instance to come back to? What if we have a thousand of these individual processes in flight at a single point in time? BizTalk Server has the concept of correlation which means that you can identify a unique set of attributes for a given message which will help it find its way to the appropriate running orchestration instance. A correlation attribute might be as simple as a unique invoice identifier, or a composite key made up of a person's name, order date, and zip code.
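Correlation is easier to picture with a small model. In the Python sketch below, each waiting orchestration instance is registered under its correlation value and an inbound response is matched against that table. The `OrchestrationHost` class and the `InvoiceId` field are assumptions made for illustration, not BizTalk APIs.

```python
class OrchestrationHost:
    """Toy correlation table: correlation key -> waiting instance."""

    def __init__(self):
        self.waiting = {}

    def start_instance(self, invoice_id):
        # The instance sends its request, then waits on this key.
        self.waiting[invoice_id] = {
            "invoice_id": invoice_id,
            "status": "awaiting response",
        }

    def deliver(self, message):
        """Route a response to the instance that is waiting for it."""
        instance = self.waiting.pop(message["InvoiceId"], None)
        if instance is None:
            return None  # nothing is waiting on this key
        instance["status"] = "completed"
        instance["response"] = message
        return instance


host = OrchestrationHost()
for invoice_id in ("INV-1", "INV-2", "INV-3"):
    host.start_instance(invoice_id)

resumed = host.deliver({"InvoiceId": "INV-2", "Approved": True})
unmatched = host.deliver({"InvoiceId": "INV-99", "Approved": False})
```

With a thousand instances in flight, the lookup works the same way: the response carries the correlation value, and only the instance registered under that value is resumed.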
Orchestration is a powerful tool in your development arsenal and we will make frequent use of it throughout this book.
In this chapter, we looked at what BizTalk is, its core use cases, and how it works. In my experience, one of the biggest competitors to BizTalk Server is not another product, but custom-built solutions. Many organizations engage in a "build versus buy" debate prior to committing to a commercial product. In this chapter, I highlighted just a few aspects of BizTalk that make it a compelling choice. With BizTalk Server you get a well-designed scalable messaging engine with a durable persistence tier which guarantees that your mission-critical messages are not lost in transit. The engine also provides native support for message tracking, recoverability, and straightforward scalability. BizTalk provides you 20+ native application adapters that save weeks of custom development time and testing. We also got a glimpse of BizTalk's integrated workflow toolset that enables us to quickly build executable business processes that run in a load-balanced environment. These features alone often tip the scales in BizTalk Server's favor, not to mention the multitude of features that we have yet to discuss such as Enterprise Single Sign On, the Business Rules Engine, Business Activity Monitoring, and more.
I hope that this chapter also planted some seeds in your mind with regards to thinking about BizTalk solutions in a service-oriented fashion. There are best practices for designing reusable, maintainable solutions that we will investigate throughout the rest of this book. In the next chapter, we'll explore one of the most critical technologies for building robust service interfaces in BizTalk Server: Windows Communication Foundation.