Business Service Management (BSM) is a key area in today's IT management arena. In the context of IT infrastructure management, there has been a major shift in the decision-making process. The questions driving these decisions have moved from "why do we need this?" to "how can we achieve this?" Answering the latter question requires IT management to be viewed as a business enabler rather than a support function.
This chapter highlights the importance of BSM in today's IT space. We will illustrate the challenges in managing today's data centers, with an emphasis on the industry-standard guidelines for managing these complexities. We will also cover the concept of modeling IT infrastructure as systems and services, and touch upon the details of sharing IT resources across different verticals and the related management issues. The chapter then shows how BSM can be one of the solutions to the various complexities that plague today's IT infrastructure landscape, and covers the Information Technology Infrastructure Library (ITIL v3) guidelines on BSM. The topics covered in this chapter are relevant to the BSM area in general and are not specific to Oracle Enterprise Manager (OEM).
IT infrastructure has transformed from a necessary evil into a key business enabler, helping companies develop solutions that differentiate them from their competitors. In modern enterprises, IT infrastructure is the backbone that helps them stay ahead of the competition. Accordingly, the data center landscape that hosts this infrastructure has evolved from a few servers in an obscure corner room of a building to thousands of servers in different buildings spread across various geographies. The technologies deployed in these data centers have also changed, from Mainframe and Unix systems running e-mail and legacy applications to heterogeneous, distributed solutions involving databases, middleware servers, and Commercial Off-The-Shelf (COTS), packaged, and custom applications. Further, these products and solutions interact among themselves to provide external-facing business services and enable day-to-day internal business operations. The advent of Web 2.0 and cloud computing, with delivery models such as Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS), has further complicated the landscape.
The infrastructure consists of both external and internal applications serving various classes of users. These users access various applications through different access points and devices. Even though actual IT infrastructures are far more intricate depending on the business domain of the enterprise, the above minimalist view clearly demonstrates the complexities involved. To this view, if we add the collaborations among the various entities, the topology becomes almost unmanageable. The following is a very simplistic illustration of the physical topology of the infrastructure that supports the earlier functional view:
It can be seen that IT impacts every aspect of business operations, ranging from customer care to end-user interactions to accounting to employee self-service. Needless to say, the performance of the IT infrastructure is a key driver of the success of the enterprise business.
This complexity in the IT landscape necessitates deployment of a highly sophisticated management solution across the enterprise. Such a solution must be able to manage all aspects of the IT infrastructure, starting from physical hosts and devices to packaged applications. While the solution should definitely cater to managing disparate components individually, it must also provide visibility into the complex business processes and usage of the underlying infrastructure. The former view is required as a tool for day-to-day IT operations by system administrators and support personnel who know the physical topology very well. The latter view provides the CXO-level senior management with invaluable insight into the effectiveness of the underlying infrastructure in driving business operations.
Many of the applications and business processes interact with each other and come together to provide meaningful services to both external and internal users. Such interactions are achieved using diverse technologies and architectures such as SOA, web services, cloud computing, Web 2.0, and so on. These services, commonly referred to as business services, must also cater to the availability and performance expectations of customers and internal users. These expectations are formally referred to as service levels. The service provider's commitment on the availability and performance of these services is defined formally using Service-Level Agreements or SLAs. Enterprise-wide management of these business services, including their service levels, requires technology-independent perspectives that provide the CXOs with the big picture. The above management concepts fall under the broad category of BSM.
Prior to discussing the various modeling options, it is important to understand the necessity of modeling the IT infrastructure. As discussed in the previous section, a typical data center consists of numerous heterogeneous hardware and software components. The hardware components present in a data center are as varied as network routers, switches, machines ranging from servers to desktops, Mainframes, storage devices, load balancers, and so on. The software components deployed on such hardware are even more diverse, including operating systems, databases, application servers, middleware, and so on. In an enterprise data center, both hardware and software will be sourced from multiple vendors. To add another layer of complexity, it is very likely that multiple versions of the same software product, from the same vendor, are deployed across the enterprise.
As an example, the data center of a large commercial bank could contain network switches and routers from Cisco, Mainframes from IBM, and industry-standard servers from HP. This hardware will be utilized to run mission-critical CRM applications from Oracle running on Oracle middleware and Oracle Real Application Cluster (RAC) databases running on a Solaris operating system. These applications would interact with Enterprise Resource Planning (ERP) systems from SAP. There will also be custom applications built in-house, running on Oracle WebLogic Application Server. In this topology, although the databases used by both the CRM and ERP systems could be supplied by Oracle, their versions could be different, for example, Oracle Database 10g and Oracle Database 11g.
In a large enterprise, the CTO staff will comprise various teams of administrators having focused responsibilities on managing different components within the data center. For instance, network engineers will be assigned network router operations whereas DBAs will be responsible for database maintenance. In addition, there will be a set of administrators who maintain the enterprise applications such as CRM, Siebel, and so on. Such administrators are responsible for regular operational tasks of different components in the data center. The DBAs will need to perform regular tasks such as re-indexing, performing backup and recovery, managing table spaces, and so on. The application administrators will be handling configuration of middleware, deployment of applications, provisioning users, and so on. In addition to the regular tasks, these administrators will also be responsible for the stability and health of their respective areas.
These operational teams will be complemented by a strategy team that will be responsible for IT budgeting and planning. This team will be responsible for driving the efficiency of IT infrastructure and operations. As an example, the CTO strategy team might have a goal of increasing the IT hardware utilization by 10 percent for a fiscal year. Another goal may be to project the additional hardware requirements to support an upcoming business strategy. In order to achieve such goals, the team will require data such as usage, operational efficiency, capacity, and so on. The data requirements will be both current and historical.
The strategy and the operations team need to work together to meet the compliance requirements. These requirements touch areas such as security, configuration, and storage. While the strategy team is responsible for setting compliance standards and goals, the administrators are entrusted with the responsibility of ensuring that these compliance levels are adhered to. To illustrate this, let us consider the security requirements on a CRM On-Demand application. In order to meet a specific customer security requirement around passwords, the administrator will have to configure the applications accordingly.
It is clear from the above explanation that the different responsibilities require focused perspectives of the IT infrastructure. The focused perspectives must enable the administrators to view their components of interest. They must also include the components that depend on these, as well as the components on which they depend. Since the different components in a data center do not operate in isolation and interact with one another, it is imperative that the IT staff get a holistic view of the enterprise IT topology.
To simplify the previous explanation with an example, let's consider the perspective required by the DBA. The DBA will require a database-centric view, which shows all the databases in the enterprise. This perspective must also allow the DBA to figure out the host on which a specific database instance runs. It is equally important to understand the applications that use a specific database instance. These perspectives allow the administrators to view the dependencies between components. Let's consider a DBA of an Oracle database running on a Solaris operating system and servicing a travel portal. Due to security requirements, the Oracle database needs to be patched. As a prerequisite to this, the DBA needs to figure out the underlying operating system details so as to ensure that all the mandatory operating system patches have been applied to the host. Moreover, the DBA needs to work with the administrators of all travel applications using this database instance to schedule a maintenance window when this patch can be applied. In the absence of the above holistic view, the DBA will not be able to project the business impact of this IT maintenance.
The following image provides a perspective of a component-centric view of the database used in the travel portal and primarily caters to the database administrators.
The previous image is an illustration of the database-centric view of the travel portal. This view is centered on the database and shows both the physical infrastructure used by the database and the travel portal application that depends on the database.
A different perspective is required by the strategy team. The strategy team will require a view that maps a specific business function to the IT infrastructure. This perspective will detail the various components in the data center that collaborate with each other to provide a certain business function. This view will also highlight the relationships among the different components.
Continuing with the same travel portal example in the previous section, the strategy team responsible for the portal will need a view of all the components such as hosts, databases, middleware, and applications required by the travel portal. This view will enable them to identify the IT usage in providing the business functions to project the capacity requirements so as to meet the business goals. In the above scenario, this translates to the strategy team being able to project the additional hardware requirement correctly in order to meet a 20 percent surge in user traffic forecasted by the business teams.
Such a view gives the strategy team the necessary visibility into the infrastructure utilized to provide the business service. This mapping between the business functions and the underlying IT infrastructure comes in handy not only in identifying the components providing a specific business function, but also in projecting the impact of a component on the business functions.
In addition to these two perspectives, business strategy demands yet another paradigm to view the IT infrastructure. The data center provides numerous business services through its IT infrastructure. While the two views discussed in the previous sections provide insight into the components that are part of a business service, they clearly lack the ability to depict the business service itself. However, the above views are the first key steps towards representing the actual business service. It is important to visualize each of these business services as an entity by themselves. Such a business service-centric perspective will provide vital information at a service level.
Such a business-centric view is a key enabler in representing the services for both the service provider as well as the service consumer. The service-level assurance will vary depending on the category of consumers. These business services might be provided for external users such as partners, sales channels, or end customers. For example, the travel portal will be used by end users to book their regular travel. It will also be utilized by airline and hotel partners. The consumers of the above business services can also be internal. For example, the sales teams in the travel portal business would like to use the portal for booking tickets for their own travel. The service consumers may also be categorized based on geographical location. For instance, the travel portal will have dedicated data centers for specific user locations such as U.S., Europe, and Asia Pacific. During U.S. holidays, the U.S. data center for the travel portal must be geared to meet additional customer traffic.
Needless to say, the service provider must monitor the services as well as their respective service levels for each category of users. In the absence of a business-centric view, it will be cumbersome for the IT staff to translate the business priorities to the required IT configurations.
This outlook allows the service provider to gather key data, such as the general health of the business service that is provided, as well as quantitative and qualitative descriptions of the service levels. The general health of the service is measured as the availability of the business service. The quantitative measure of a service is described using usage metrics, while performance metrics indicate the quality of the service. This perspective also enables the IT staff to determine whether their service-level assurances with each category of consumers are met.
Each of these different perspectives helps in visualizing different aspects of the same IT infrastructure. Such perspectives are therefore termed as models. The individual components within the data center are modeled as targets or manageable entities. The holistic view of the infrastructure that combines the functional interactions between various targets is defined as a system model. The perspective that facilitates the service provider in getting the business view of the infrastructure is termed as a service model.
Each of the components within a data center exhibits certain attributes and requires certain management tasks. A target is a manageable entity within an enterprise data center. Examples of targets in the travel portal are:
Hosts on which the database and middleware are installed, such as db1.us.travel.com, db.travel.co.sg, db.travel.co.uk, and so on
Database instances such as orcl1, orcl2, orcl3, orcl4, and so on
Middleware server instances such as the fmw1_us_wls WebLogic managed server, fmw2_us_wls, and so on
Application instances of the travel portal such as trvl-portal-us, trvl-portal-eu, and so on
The following image provides a pictorial representation of all the targets described within the travel portal:
These targets belong to various types such as databases, hosts, WebLogic Servers, portal applications, and so on. Moreover, these targets are deployed across different geographies in different data centers. The attributes exhibited by each target instance can be classified into various categories, which help the IT administrators have insights into different aspects of the component. Some of these categories that indicate the key aspects are:
Availability: It describes the general status of the target and its ability to respond to requests. This aspect is usually represented as a status indicator. For example, the availability of fmw1_us_wls, a WebLogic managed server, indicates whether the server is currently running.
Metrics: These are the indicators that provide quantitative measurements of different traits of the target. For example, the performance metrics of the host instance db1.us.travel.com target include CPU utilization, free disk space, and so on.
Configuration metrics: These describe the various configurable parameters for the target under consideration. For example, the configuration metrics for the database instance orcl1 target include log buffer size, pool size, cache, and so on.
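The three attribute categories above can be pictured as fields on a single record per target. The following is a minimal Python sketch for illustration only: the class name, field names, and all metric and parameter values are assumptions, not the data model of any particular management product.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Target:
    """A manageable entity; field names are illustrative assumptions."""
    name: str
    target_type: str
    is_up: bool = True                                         # availability
    metrics: Dict[str, float] = field(default_factory=dict)    # performance metrics
    config: Dict[str, str] = field(default_factory=dict)       # configuration metrics

    def status(self) -> str:
        """Render availability as a simple status indicator."""
        return "UP" if self.is_up else "DOWN"


# Targets from the travel portal example; all values are made up.
host = Target("db1.us.travel.com", "host",
              metrics={"cpu_utilization_pct": 42.0, "free_disk_gb": 120.0})
wls = Target("fmw1_us_wls", "weblogic_server", is_up=False)
db = Target("orcl1", "database", config={"log_buffer_size": "8M"})

for t in (host, wls, db):
    print(t.name, t.target_type, t.status())
```

Keeping availability, metrics, and configuration as separate fields mirrors the way the text treats them as distinct aspects of the same target.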
The state and behavior of the targets can be modified by performing different target operations. These operations include tasks that directly affect the availability, performance, or configuration of the target instance. These operations also include routine maintenance tasks to be performed on the target instance. Examples of some of these tasks include:
Process control: Such as start, stop, restart, and so on. As an example, in case of a middleware domain, this corresponds to restarting Managed Server target instance—
Configuration management: These include modifying the instance-specific properties that affect its behavior. As an example, for the database instance orcl1 target, this corresponds to increasing the Sort Area Size parameter.
Scheduling maintenance: This is one of the routine tasks before embarking on any changes to the target configuration. As an example, if a security patch is to be applied on the database instance orcl2 target, a maintenance window is scheduled during the upcoming weekend, when the traffic is expected to be relatively low.
Backup and recovery: This is a specific maintenance task that is periodically done to preserve the current data and configuration. As an example, for a host target db1.us.oracle.com, this corresponds to a regular backup of the user's home directory.
Compliance management: This is yet another task undertaken periodically to ensure that the target under consideration does not violate any of the policies set at the enterprise level. As an example, for the host target db.travel.co.sg, this corresponds to a daily check of the username and password to ensure that they meet the standards set by the enterprise security team.
As seen in the previous section, the travel portal has different kinds of targets, that is, hosts, databases, application servers, and so on. Each of these kinds is known as a target type. The targets belonging to the same type exhibit similar management attributes and behaviors. Hence, modeling the targets automatically classifies them into buckets of various target types. Each target type is different from the others and requires specific management tasks and skill sets. Even standard operations such as process control, backup and recovery, and so on need to be performed in a manner specific to the type. In the absence of a classification based on type, it would be an overwhelming challenge to manage the disparate targets. For instance, backup operations of a database target are drastically different from those of an application server. With the classification of targets based on types, it is far easier to perform backup operations across all targets of the database type. This also enables administrator tasks that are very specific to a particular type. For example, the database instances require periodic re-indexing, which is not required for application server targets.
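To make the idea of type-based classification concrete, the sketch below buckets targets by type and then applies a type-specific backup procedure to every member of one bucket. It is a simplified illustration: the type names and the backup procedures are hypothetical placeholders that only return descriptive strings, not real commands.

```python
from collections import defaultdict

# (name, target type) pairs taken from the travel portal example.
targets = [
    ("db1.us.travel.com", "host"),
    ("orcl1", "database"),
    ("orcl2", "database"),
    ("fmw1_us_wls", "weblogic_server"),
    ("fmw2_us_wls", "weblogic_server"),
]


def classify(pairs):
    """Bucket target names by their target type."""
    buckets = defaultdict(list)
    for name, target_type in pairs:
        buckets[target_type].append(name)
    return dict(buckets)


# Hypothetical type-specific backup procedures (placeholders only).
BACKUP_PROCEDURE = {
    "database": lambda name: f"database backup of {name}",
    "weblogic_server": lambda name: f"domain configuration backup of {name}",
    "host": lambda name: f"file system backup of {name}",
}


def backup_all(pairs, target_type):
    """Apply the procedure for one target type to every target of that type."""
    procedure = BACKUP_PROCEDURE[target_type]
    return [procedure(name) for name in classify(pairs).get(target_type, [])]


by_type = classify(targets)
print(by_type["database"])
print(backup_all(targets, "database"))
```

The same operation name (backup) maps to a different procedure per type, which is the point the paragraph above makes about type-specific handling of standard operations.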
The following image illustrates the classification of different targets within the travel portal by target type. It can be seen that the various WebLogic Servers are classified under the same target type. A similar categorization is shown for the database targets as well.
With the introduction of target models, the components within a data center can be visualized as manageable entities. The target type model further enhances this by enabling the IT administrator to collate the behavior and operations of similar target instances.
Systems and groups are paradigms that help in visualizing the holistic perspective of the enterprise IT infrastructure using composition of multiple targets. Groups model homogeneous targets together, that is, belonging to the same target type, whereas systems model heterogeneous targets. These are two similar perspectives that help the IT staff in mapping business functions to IT infrastructure. These two models supplement each other in combining the management tasks.
The targets in an enterprise can be aligned together based on the target type. For instance, the data storage for a specific business function could be provided by multiple database instances, for reasons of failover or load balancing. As a result, it makes sense to model and subsequently manage these database instances together. Such a model of logically related homogeneous targets is known as a target group. For example, in the travel portal example, the database targets orcl2 cater to the U.S. customer base.
The following image depicts that the database instances orcl2 can be combined into a single target group, US Portal Oracle Database Group.
The primary advantage of combining the various target instances into a target group is the ability to manage multiple targets as one. Even though there are two database instances in the above travel portal in the U.S. region, they can be logically managed as a single target group. This facilitates applying common management tasks on all the members comprising the target group. For example, in the travel portal, the backup of all member targets such as orcl2 within the US-DB-Group can be performed together. The same backup database job can be run for all instances within the same group.
Moreover, the group helps in monitoring the member targets as one entity. As there are two database instances clustered in the U.S. data center to facilitate load balancing and failover, the availability of the database as a whole to the middleware and travel applications can be viewed if they are grouped together. This allows the administrator to ensure that both the database instances do not become unavailable simultaneously.
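One way to picture this group-level availability is a simple roll-up of member statuses. The sketch below is an assumption about how such a roll-up could work, not a description of any product's behavior; the three status labels and the member names used in the example are illustrative.

```python
from typing import Dict


def group_status(member_up: Dict[str, bool]) -> str:
    """Roll member availability up into one group-level status.

    UP:       every member is up
    DEGRADED: at least one member, but not all, is up
    DOWN:     no member is up (the case the administrator must avoid)
    """
    up_count = sum(1 for up in member_up.values() if up)
    if up_count == 0:
        return "DOWN"
    if up_count == len(member_up):
        return "UP"
    return "DEGRADED"


# Hypothetical two-member U.S. database group.
print(group_status({"orcl1": True, "orcl2": True}))
print(group_status({"orcl1": True, "orcl2": False}))
print(group_status({"orcl1": False, "orcl2": False}))
```

With clustered members providing failover, the group remains usable in the DEGRADED state, so only the DOWN state represents a loss of the database to the middleware and applications.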
Grouping related targets together also helps in comparing target configurations. Policy enforcement is likewise simplified by modeling the related target instances together as a group. As an illustration, at the enterprise level, it might be mandated that all the Oracle database instances deployed for the travel portal must be of a certain version and patch-set level. The database administrator in the U.S. data center can easily compare the current version and patch-set deployments of the databases within the target group and can therefore enforce the above policy.
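A version and patch-set policy of this kind amounts to comparing each member's configuration against a required baseline. The following sketch shows one possible shape of that comparison; the parameter names, version labels, and patch-set labels are hypothetical values chosen for illustration.

```python
def find_violations(group_config, required):
    """Return {target: {key: (actual, expected)}} for members that deviate from policy."""
    violations = {}
    for name, config in group_config.items():
        diffs = {
            key: (config.get(key), expected)
            for key, expected in required.items()
            if config.get(key) != expected
        }
        if diffs:
            violations[name] = diffs
    return violations


# Hypothetical enterprise policy and group configuration.
required = {"version": "11g", "patch_set": "PS2"}
us_db_group = {
    "orcl1": {"version": "11g", "patch_set": "PS2"},
    "orcl2": {"version": "11g", "patch_set": "PS1"},
}

violations = find_violations(us_db_group, required)
print(violations)
```

Because the members are homogeneous, a single baseline applies to the whole group, which is what makes group-level enforcement simpler than checking each target separately.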
The targets within an enterprise can also be related by various other categories. These categories can be based on parameters such as lines of business, functions, geographies, and so on. When multiple targets interact with each other to provide a business solution, it is natural from a management point of view to combine the various components into a logical entity. This paradigm of modeling disparate but related targets as one entity is known as system modeling.
The targets within a particular geographical location can be related together into a single entity for ease of management. For instance, combining all the targets within the Singapore data center helps in visualizing all the components that provide business functions to the APAC users. From a worldwide view, such a geographical perspective aids in getting a snapshot of the components. This snapshot can be used to drive operational efficiency, such as utilization, across all business functions provided within the geography. Such an aggregation of all the components within a specific location also facilitates monitoring the various business functions that are provided within the geography.
The following image illustrates the aggregation of different but logically related targets within the same geographical location in the travel portal. Different targets such as the Portal app, WebLogic Server, Oracle Database server, and the related hosts in Singapore are combined into a single system target, APAC TRAVEL SYSTEM.
Another criterion for relating different targets within an enterprise into a system can be functional support. For example, within a travel portal, it makes sense to combine the logically related targets such as the middleware targets, related database targets, and associated hosts, which provide credit card validation and payment functions into a single system.
One of the significant benefits of modeling a system is the ability to view the difference in configuration between two time intervals. As described in the target modeling section, each target has configuration parameters. These parameters may get changed as part of the configuration management operation or as part of maintenance operations. By comparing the changes in the configuration of the system as a whole between two different intervals of time, any misbehavior in performance of the topology can be easily nailed down. For instance, after a recent patch-set deployment in one of the related application servers, if the credit validation and payment functions show poor performance, the diagnosis is aided by a consolidated view of all the configuration changes in the recent past within the system.
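The comparison described above can be sketched as a diff between two system-wide configuration snapshots. This is a minimal illustration under stated assumptions: a snapshot is modeled as a plain mapping from target name to its parameters, and the parameter names and values are invented for the example.

```python
def diff_snapshots(before, after):
    """Compare two system-wide configuration snapshots.

    Each snapshot maps target name -> {parameter: value}.
    Returns a list of (target, parameter, old_value, new_value) tuples.
    """
    changes = []
    for target in sorted(set(before) | set(after)):
        old, new = before.get(target, {}), after.get(target, {})
        for param in sorted(set(old) | set(new)):
            if old.get(param) != new.get(param):
                changes.append((target, param, old.get(param), new.get(param)))
    return changes


# Hypothetical snapshots of the payment system before and after a patch-set deployment.
before = {
    "fmw1_us_wls": {"patch_set": "PS1", "heap_mb": "2048"},
    "orcl1": {"sort_area_size": "64K"},
}
after = {
    "fmw1_us_wls": {"patch_set": "PS2", "heap_mb": "2048"},
    "orcl1": {"sort_area_size": "64K"},
}

changes = diff_snapshots(before, after)
print(changes)
```

Here the only change across the whole system is the patch-set level on one application server, which narrows the diagnosis of a performance regression to that deployment.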
Modeling components that interact to provide a related business function into a system has added benefits, such as scheduling the same maintenance windows for all the related targets. For example, all the targets that interact to provide the credit card validation function within the travel portal can be restarted together after a critical patch-set deployment.
Once a system has been modeled, it becomes easier for individual target type administrators to determine the potential impact of a specific operation on a target instance. For example, a database administrator who manages the database instances of the credit card validation function might contemplate a restart. By viewing the associated system, it becomes fairly easy to identify other targets such as application servers, application deployments, and so on, which could be impacted by this operation. Hence, a view based on business function allows individual stakeholders to determine the business impact of IT operations.
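Impact analysis of this kind can be viewed as a traversal of the dependency relationships recorded in the system model. The sketch below assumes a hypothetical topology for the payment function (the target names and edges are invented) and walks the dependencies in reverse to find everything affected by an operation on one target.

```python
from collections import deque

# depends_on[x] lists the targets that x relies on (hypothetical topology).
depends_on = {
    "payment-app": ["fmw1_us_wls"],
    "fmw1_us_wls": ["orcl1", "host-mw1"],
    "orcl1": ["host-db1"],
}


def impacted_by(target, depends_on):
    """Return every target that directly or transitively depends on `target`."""
    # Invert the edges: for each target, record who depends on it.
    dependents = {}
    for src, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(src)

    # Breadth-first walk up the inverted edges.
    seen, queue = set(), deque([target])
    while queue:
        current = queue.popleft()
        for parent in dependents.get(current, []):
            if parent not in seen:
                seen.add(parent)
                queue.append(parent)
    return sorted(seen)


print(impacted_by("orcl1", depends_on))
```

In this topology, restarting orcl1 affects the application server and, through it, the payment application, which is the information the database administrator needs before scheduling the restart.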
Both the system and group models aim at visualizing the related components of the infrastructure stack as one entity. Doing so brings some inherent advantages:
Associating related targets into a single entity allows the administrators to view the availability of all the targets together
Combining related targets helps in rolling up the key policy violations across the topology so that they can be reviewed in one place
Aggregation of related targets also helps in visualizing deviations from expected thresholds of metrics collectively
However, the system model is significantly different from the group model described above. While a group target comprises homogeneous targets, a system target comprises logically related heterogeneous target instances. A group target enables similar operations across targets of the same type and is primarily intended to model clusters providing failover and load balancing functions. A system target is intended to be a single point of reference for a particular line of business or geography, even while managing multiple targets belonging to different types.
IT infrastructure comprises multiple targets that interact with each other to provide numerous business functions. These business functions are used by both internal and external consumers. It is apparent that no IT management solution is complete without a functional view of the business services provided by the infrastructure. Such a functional paradigm of the IT topology with a business-centric focus is known as the service model.
Continuing with the travel portal example, the portal provides a wide gamut of business services, such as flight reservation, hotel reservation, and so on, to different consumers. In addition, the portal also consumes services from other service providers to enhance its business functions. For instance, the travel portal may rely on a third-party payment gateway to facilitate all payment-related operations. With service modeling, each of these business services is represented as a different entity having its own business value.
The following image indicates the various business services provided by the Travel Portal Application. The travel portal provides a suite of business functions such as flight search, hotel search, car rentals, reservations, and so on. In addition, the travel portal also consumes the payment service from the service partner illustrated as follows:
Service modeling allows the administrators to manage the infrastructure viewed through a business service dimension. This is different from the traditional management philosophy that relies on a bottom-up approach in managing and maintaining individual components. Defining a service model and applying it in operations management helps the administrators map the business priorities to their day-to-day management tasks. This is an improvement over administrators working with individual components in silos, unaware of their impact on the overall business strategy. It is a significant leap in bridging the gap between IT and business management.
Service modeling is a top-down approach in managing the IT infrastructure. At the end of the day, the business service offered by a data center is the very reason for its existence. This essence is extremely significant to IT infrastructure in moving up the value chain in the larger organizational goals. This is a paradigm shift in focus from managing individual components to managing the business service itself as an entity. While this brings in its own set of challenges in modeling and monitoring the underlying infrastructure, it drives all decision-making processes from a service consumer perspective. The service model provides that vital missing link between IT admin and the end user.
For example, in the absence of a business service-centric mindset, a database administrator of the travel portal would apply a database patch-set and perform a restart operation without knowing the business impact. There might be assurances such as SLAs in place with partners based on the up time. With the business service model in place, the impact of any operation on the service level is assessed prior to the execution of any IT operation. The administrator can now determine the current service level of the business function and then check for possible violations of assurances before embarking on significant operations. This requires working with the business teams to determine the right maintenance window before any critical configuration change.
These days, when distributed technologies such as cloud computing, grid computing, managed and hosted services, and on-demand services are prevalent, there needs to be an assurance on the quality of the service that is offered. This assurance is represented as Service-Level Agreements or SLAs. The SLAs between the service consumer and the provider may be arrived at based on various parameters. The key parameters determining the service levels include:
Availability: Assurance on the up time of the business service. As an example, the travel portal may be required to be available 99 percent of all business hours.
Performance: Assurance on the quality of performance of the business service. For example, payment transactions would be completed within three seconds.
Usage: Assurance on the scalability and robustness of the service. For instance, the travel portal may support up to 100,000 requests per second.
Support: Assurance on maintenance and support whenever there is a service disruption. For example, any P1 support ticket will be closed in eight hours.
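The four parameters above can be captured as a small data structure and compared with measured values. The following is a minimal sketch rather than a feature of any product: the class names, field names, and the reading of "usage" as demonstrated capacity are assumptions made for illustration, and the target values are the travel portal examples from the list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ServiceLevelAgreement:
    """SLA targets (hypothetical structure; values come from the examples above)."""
    min_availability_pct: float        # availability during business hours
    max_response_seconds: float        # performance of payment transactions
    min_requests_per_second: int       # usage the service must be able to sustain
    max_p1_resolution_hours: float     # support turnaround for P1 tickets


@dataclass(frozen=True)
class MeasuredServiceLevel:
    """Hypothetical measurements for one reporting period."""
    availability_pct: float
    response_seconds: float
    sustained_requests_per_second: int
    worst_p1_resolution_hours: float


def find_violations(sla, measured):
    """Return the names of the SLA parameters that were not met."""
    violations = []
    if measured.availability_pct < sla.min_availability_pct:
        violations.append("availability")
    if measured.response_seconds > sla.max_response_seconds:
        violations.append("performance")
    if measured.sustained_requests_per_second < sla.min_requests_per_second:
        violations.append("usage")
    if measured.worst_p1_resolution_hours > sla.max_p1_resolution_hours:
        violations.append("support")
    return violations


# Travel portal targets: 99 percent, three seconds, 100,000 requests/s, eight hours.
portal_sla = ServiceLevelAgreement(99.0, 3.0, 100_000, 8.0)
```

Keeping all four parameters in one record lets a single check report which assurance was missed, instead of scattering the thresholds across individual component monitors.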
Service modeling also helps in tracking the service levels of dependent service providers. As discussed above, the travel portal is dependent on the third-party payment gateway for the payment-related functional flows. If there is any drop in service quality levels from the agreed levels, this might impact the business process flows of the travel portal. This drop ultimately impacts the end users of the travel portal, which could potentially result in loss of business revenue. With the service model, both service providers and consumers can track potential service-level disruptions.
The various business service offerings within a business enterprise comprise multiple technologies and hardware. The service model enables the management of the business functions and their service levels without getting overwhelmed by the variety and scale of the underlying technology. For example, the travel portal provides the ability to search both domestic and international flights by interacting with various partner airline web services and exposes this flight search as a web service. This business function is made possible through complex interactions between hundreds of components spread across multiple data centers and locations. These components leverage different hardware and software technologies. The travel portal has an agreement with its business partners to provide the flight search function within a three-second period. In the absence of the service model, it becomes extremely cumbersome to define and monitor the performance of this business function. The sheer magnitude of the technologies involved, as well as the scale and range of the IT components, would require specialized skill sets and domain knowledge to manage such a distributed infrastructure. By visualizing the flight search using the service model, the IT management of the flight search is simplified and tracking of the service level becomes straightforward. The service model hence provides an abstraction to get past the complexity of the IT implementation and focus on the key business goals.
The system and group models, as discussed before, provide the mapping between business goals and the IT infrastructure within the enterprise. Both the system and group models aid in getting a unified view of the various technologies within the topology into a single entity. In a modern day enterprise, it is common to share the same IT infrastructure supporting multiple business functions for effective utilization of resources. For example, in the travel portal, the same application server and database may host both hotel search and flight search applications. The system or group targets associate the application servers and databases into a single entity; the multiple business functions offered by such an entity will still be required to be distinguished. For instance, the service-level assurance for a hotel search may be different from the service levels offered by the flight search. So the hotel search and flight search business services would need to be modeled as two different entities having different service-levels. The service models help in visualizing and monitoring different business services offered within the same IT infrastructure.
The business services provided by an enterprise IT infrastructure can be visualized in two different paradigms. Each of these services can be envisaged as an end result of the associated IT components in the data center that interact with each other. The business services can also be represented based on an external view of the business services. These service models are complementary in nature and provide different perspectives of the same business service.
The different business service offerings from an enterprise IT infrastructure are an end result of interactions between various component targets. Hence, the business service itself can be visualized as a direct outcome of the system components. This system-based approach to modeling the business service is an effective way of viewing the various service attributes such as availability, performance, and service levels as a combination of the system components. Such a modeling approach is very helpful to a service provider in identifying the infrastructure components that can potentially impact the service levels.
System-based service modeling involves first relating all the associated component targets into a system, and then rolling up the performance traits from the component level within the system to a service attribute. As described above, the service levels of a business function can be expressed in terms of different traits such as availability, performance, and so on. The availability of a business service can then be monitored using the health or status of the various components within the system. For example, in the travel portal, the business offerings can be modeled as services based on a system comprising the portal application, the application server on which it is deployed, and the associated database servers and hosts. The availability of the business services can then be ensured by making sure that all the relevant components within the system have high availability.
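The roll-up of component status into service availability can be sketched as follows. This is a minimal illustration that assumes the simplest possible rule, where the service counts as up only when every component in the system is up. The function names and the sample data are hypothetical, and a system with redundant components would need a more lenient rule.

```python
def service_is_available(component_status):
    """Roll component status up to the service: up only if every component is up."""
    return all(component_status.values())


def failed_components(component_status):
    """Name the components that are currently taking the service down."""
    return sorted(name for name, is_up in component_status.items() if not is_up)


def availability_pct(samples):
    """Percentage of polling samples in which the whole system was up."""
    if not samples:
        raise ValueError("at least one sample is required")
    up_count = sum(1 for status in samples if service_is_available(status))
    return 100.0 * up_count / len(samples)


# Hypothetical polling results for the system behind the travel portal.
samples = [
    {"portal_app": True, "app_server": True, "database": True, "host": True},
    {"portal_app": True, "app_server": True, "database": False, "host": True},
    {"portal_app": True, "app_server": True, "database": True, "host": True},
    {"portal_app": True, "app_server": True, "database": True, "host": True},
]
print(availability_pct(samples))        # share of samples with all components up
print(failed_components(samples[1]))    # components down in the second sample
```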
The following image illustrates the modeling concept of a service, based on a system target. In the travel portal example, the business function of flight search is modeled as a business service provided by a system of different targets such as Weblogic Server, Portal app, Oracle Database, and so on.
The performance of the service is a direct outcome of the performance of the underlying IT infrastructure. Hence, the quality of the business service can also be modeled by defining the attributes of the underlying service targets. For instance, the average response time of the flight search service (which is one of the offerings of the travel portal) can be modeled as an aggregation of the average response time within the application server and the average time spent in the execution of database queries. Therefore, a service-level assurance of an eight-second turnaround time for any invocation of a flight search can be interpreted as a threshold of three seconds for database query execution and five seconds within the application server. Such system-based modeling of the flight search business service enables the IT staff to define the performance goals of each system component based on the overall business priorities of the organization.
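The split of the eight-second assurance into per-component thresholds can be expressed as a simple budget check. This is a sketch under the assumption that the component times add up to the service response time. The dictionary keys, function names, and measured values are hypothetical, while the thresholds are the ones quoted above.

```python
SLA_SECONDS = 8.0

# Per-component thresholds from the flight search example.
COMPONENT_THRESHOLDS = {"application_server": 5.0, "database": 3.0}


def budget_fits_sla(thresholds, sla_seconds):
    """The component thresholds must not add up to more than the service SLA."""
    return sum(thresholds.values()) <= sla_seconds


def components_over_budget(measured, thresholds):
    """Return the components whose measured time exceeds their own threshold."""
    return sorted(
        name for name, seconds in measured.items() if seconds > thresholds[name]
    )


# Hypothetical measurements for one flight search invocation.
measured = {"application_server": 4.2, "database": 3.6}
print(budget_fits_sla(COMPONENT_THRESHOLDS, SLA_SECONDS))
print(components_over_budget(measured, COMPONENT_THRESHOLDS))
```

In this sample the total stays within eight seconds even though the database exceeds its own three-second threshold, which is the kind of early warning that per-component goals are meant to provide.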
System-based service modeling is also useful in determining the impact of a specific component on meeting the SLAs. For instance, if a critical patch set needs to be applied to a database that helps in the execution of the flight search business service, the maintenance window can be scheduled based on the current service level of the flight search service and the business hours specified in the SLAs. In the absence of a system-based service model, this process requires manual intervention and multiple layers of communication between the business and IT teams. Such a model automates the whole process and ensures smooth delivery of the service.
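The scheduling decision described above comes down to simple arithmetic on the downtime that the SLA still allows. The sketch below assumes an availability target of 99 percent of business hours, as in the earlier travel portal example. The number of business hours, the downtime already used, and the patch durations are hypothetical figures, and the function names are illustrative only.

```python
def allowed_downtime_minutes(business_hours, availability_target_pct):
    """Total downtime the SLA tolerates during business hours in the period."""
    return business_hours * 60 * (100.0 - availability_target_pct) / 100.0


def fits_in_business_hours(business_hours, availability_target_pct,
                           downtime_used_minutes, patch_minutes):
    """True if the patch outage still fits in the remaining downtime budget."""
    remaining = (
        allowed_downtime_minutes(business_hours, availability_target_pct)
        - downtime_used_minutes
    )
    return patch_minutes <= remaining


# Assumed period: 200 business hours, 45 minutes of downtime already incurred.
print(allowed_downtime_minutes(200, 99.0))
print(fits_in_business_hours(200, 99.0, 45, 60))
print(fits_in_business_hours(200, 99.0, 45, 90))
```

When the outage does not fit in the remaining budget, the patch has to move to a window outside the business hours named in the SLA.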
Service models based on system components can also be highly useful in determining the capacity additions required to meet the service-level assurances. By mapping the performance of the service to that of the underlying infrastructure components, the service model equips IT with the data points needed to meet a business requirement. For instance, if a surge in user traffic to the travel portal is expected in the U.S. area, IT management can easily determine which components currently interact to provide the flight search function and review the performance levels of each of those components. IT management can use this information to determine whether more application servers are required for load balancing, thereby ensuring that the infrastructure can be scaled up to meet the business demands.
Another dimension of modeling the business services is to actually invoke the services provided by the enterprise at regular intervals. By measuring the performance of these regular interactions, the perceived service levels can be determined. Such a periodic, artificial invocation of business services, with the specific intention of checking parameters such as availability, performance, and so on, is known as a synthetic transaction or service test. Synthetic transactions provide a direct means of measuring the real end user experience for the various business services. Modeling the business services through the execution of synthetic transactions provides a consumer perspective of the service availability and performance.
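A synthetic transaction can be sketched as a function that invokes the service once and records whether the call succeeded and how long it took. The example below is illustrative only. The function name is hypothetical, and the transaction is passed in as a callable so that the sketch does not assume any particular portal URL or client library.

```python
import time


def run_synthetic_transaction(transaction, clock=time.perf_counter):
    """Invoke `transaction` once and record success and elapsed seconds.

    `transaction` is any callable that performs the business interaction,
    for example a flight search, and raises an exception on failure.
    """
    started = clock()
    try:
        transaction()
        succeeded = True
    except Exception:
        # Any failure of the invocation counts as the service being unavailable.
        succeeded = False
    return {"succeeded": succeeded, "seconds": clock() - started}


def simulated_flight_search():
    """Stand-in for a real flight search call (hypothetical)."""
    time.sleep(0.01)


result = run_synthetic_transaction(simulated_flight_search)
print(result["succeeded"], result["seconds"])
```

Running such a function on a schedule and storing the results yields the availability and response-time history from which the perceived service level can be derived.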
The different business services provided by the enterprise have different categories of consumers. These consumers may be located in different parts of the globe. By executing synthetic transactions from locations having key customers, the service levels of critical customers can be monitored separately. For instance, for the travel portal in the APAC region, by defining and executing these synthetic transactions from Tokyo and Beijing, the perceived service-levels by customers in these two key locations can be ascertained. Such a location-wise perspective can be a unique differentiator in the service-level offerings from that of the competition.
By having the transactions executed from various locations, the network latency of the services can also be determined. This can be a key input in determining the capacity additions during the planning phase as well as tuning for high value customers. For instance, if a slow response is experienced by the Beijing customers, there can be more servers added in the APAC region that can be dedicated to the user requests, originating from Beijing.
The synthetic transactions can be executed from within or outside the enterprise IT infrastructure. Executing such service tests from different locations within the enterprise infrastructure provides insights into the availability and performance of different aspects of the business services. This information can be extremely handy in identifying and avoiding single points of failure within a business service. By running the synthetic transactions from multiple locations external to the IT infrastructure, the perceived service behavior in different locations can be compared. Any deviations in service-level performance in specific locations can also be ascertained. This can also be useful in proactively managing and monitoring the IT infrastructure. In the travel portal example, by executing synthetic transactions in Beijing to simulate the end customer behavior, the IT staff can proactively diagnose and fix the poor performance of the business service, even before the end customers report it as a service ticket. This not only reduces the turnaround time for fixing service disruptions, but also ensures that the SLAs with a specific customer are strictly adhered to.
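Comparing the results of synthetic transactions across locations can be sketched as follows. The response times are hypothetical sample data, the function names are illustrative, and the three-second limit is the flight search agreement mentioned earlier in the chapter. The median is used here as one reasonable summary, not as a prescribed measure.

```python
from statistics import median


def median_by_location(samples):
    """Summarize the response times (in seconds) collected at each location."""
    return {location: median(times) for location, times in samples.items()}


def locations_breaching(samples, sla_seconds):
    """Return the locations whose median response time exceeds the SLA."""
    medians = median_by_location(samples)
    return sorted(
        location for location, value in medians.items() if value > sla_seconds
    )


# Hypothetical flight search timings from two APAC locations.
samples = {
    "Tokyo": [1.8, 2.1, 2.4],
    "Beijing": [3.2, 3.9, 4.4],
}
print(locations_breaching(samples, sla_seconds=3.0))
```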
The following image illustrates the concept of synthetic transactions being used to model a business service. The Flight Search business function in the travel portal is monitored externally using synthetic transactions that are executed from multiple geographical locations such as Tokyo and Beijing.
The same business functions may be exposed through different interfaces by using different technologies. For example, in the same travel portal, the Flight Search business function is exposed both as a web page in the portal for end users and as a web service for use by partner applications. By having two synthetic transactions (a web transaction simulating the end user search and a web service transaction that mimics the invocation from a partner application), the behavior of different flavors of the same business service can be managed more effectively.
The synthetic transactions are "dummy" transactions that simulate the actual interactions between the end users and the enterprise IT infrastructure that provides the business services. As these synthetic transactions interact with the actual infrastructure in the data centers as part of the production systems, the synthetic transactions will have to be distinguished from normal user transactions. This can be achieved by specifying special parameters as part of these service tests. For instance, the travel portal can have specific user accounts configured to be used by the synthetic transactions so that these transactions do not result in any real checkout or ticketing.
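One way to keep synthetic transactions apart from real ones is to check for dedicated test accounts before any ticket is issued. The sketch below is a hypothetical illustration of that idea. The account names, the request layout, and the `checkout` function are assumptions and do not describe an actual portal implementation.

```python
# Accounts reserved for service tests (hypothetical names).
SYNTHETIC_ACCOUNTS = frozenset({"svc-test-tokyo", "svc-test-beijing"})


def is_synthetic(request):
    """A request is synthetic if it comes from a reserved test account."""
    return request.get("account") in SYNTHETIC_ACCOUNTS


def checkout(request, issue_ticket):
    """Complete the booking, but never issue a real ticket for a service test."""
    if is_synthetic(request):
        return "simulated"
    issue_ticket(request)
    return "ticketed"


issued = []
print(checkout({"account": "svc-test-tokyo", "flight": "TEST-1"}, issued.append))
print(checkout({"account": "customer-42", "flight": "TEST-1"}, issued.append))
print(len(issued))
```

The service test still exercises the full search and checkout path up to the point of ticketing, so it measures the same infrastructure that real customers use.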
The modeling techniques covered in the previous sections are derived from industry standard guidelines. Information Technology Infrastructure Library (ITIL) is one of the prominent standards that provides a set of guidelines around enterprise IT management. It looks at all aspects of the IT management ecosystem including business services management, IT operations management, and its development. ITIL has evolved from v1 to v3, which is the current form. Before the ITIL guidelines evolved, enterprises around the world didn't have an accepted set of frameworks, tools, and policies for IT management. This forced each enterprise to develop its own set of frameworks for managing the complexities in the data center. However, this made it very difficult for private external management vendors to develop standard products that could take advantage of common guidelines for IT management. Essentially, this drove the cost of managing IT upwards. As a result, enterprises and governmental agencies came together to define a common set of guidelines for suggested frameworks around IT management. The results of this initial attempt at consolidating IT management principles and guidelines were released as ITIL v1.
As can be imagined, the initial versions of the guidelines were an attempt to consolidate the best practices of IT management from different governmental and private enterprises. This attempt at consolidation was not very successful, and the guidelines grew very large and ran into many volumes. ITIL v2 was the next attempt at consolidation of the guideline sets from v1. However, this time the focus was more towards managing the IT infrastructure through a services-based model. This focus on business services modelling and management ensured a wider acceptance and understanding of the guidelines. This acceptance and understanding led to a further consolidation of the guidelines with a renewed focus on business services lifecycle management, and the guidelines were released in 2007 as v3 which is the most recent form.
As seen from the evolution of the ITIL guidelines above, after the initial consolidation of IT management principles, the focus shifted from process-based management of the infrastructure to a services-based model and its management. This focus has remained largely intact, and has also gained acceptance in almost all large enterprises as the guideline for IT infrastructure modeling, management, and overall governance.
The latest specifications, that is, ITIL v3 (http://www.itil-officialsite.com/), focus on service lifecycle management. The lifecycle of services management is described in the following five steps:
1. Service strategy: This step is the most important step and acts as the basis of the entire lifecycle. This deals with the requirements for a business service and the policies that govern its implementation and delivery.
2. Service design: This step follows the strategy and deals with the design of the business service. This defines the guidelines for selection of technology, design architectures, and all other design aspects that enable the final service to deliver on the agreed set of requirements.
3. Service transition: This step follows the design and implementation of the services and focuses on the actual delivery of the service to the end customers. This step primarily deals with service evaluation against the defined requirements and with management aspects such as configuration management, release management, and change management.
4. Service operation: This step follows the transition and deals with the continuous operations of the service. This focuses primarily on event, incident, and problem management, and also deals with access management.
5. Continual service improvement: This is the final step in the lifecycle of the business service and deals with processes that can help improve the current form of the service. This step deals with managing and reporting the service levels. These reports are then analyzed to identify potential areas of improvement. The resulting recommendations are fed back to the operations teams to optimize infrastructure usage and ultimately achieve high service levels. As suggested by the name, this is a continuous process that attempts to maximize the efficiency of service delivery to the end customers.
The lifecycle described above is a stepwise process in which each step is a logical continuation of the preceding steps. The final step of the lifecycle then feeds back into the first step. This feedback is critical in gathering the requirements for the next version of the service and also for its subsequent design and implementation.
This section has attempted to provide an overview of the evolution of the ITIL guidelines with a focus on its current form (v3). As one can imagine these guidelines are very comprehensive in nature, with each step requiring a very detailed description and discussion. However, as seen above, the focus is clearly on managing the IT infrastructure as a tool that delivers the business service. This focus is now widely accepted and adopted as the strategy of choice for IT management by almost all CTOs.
In this chapter, we covered the challenges at different levels of IT operation in managing the intricacies involved in today's data centers. The evolution of the data centers from monolithic servers to large-scale distributed deployments has brought about significant complexity in the IT backbone. Modeling this complex topology and the inherent interactions with a business service-based focus is one of the precursors to effectively managing the IT infrastructure. The complexity of management is reduced by a hierarchy of models, namely target, system, and service.
Modeling components into a target enables the administrator to perform regular monitoring and maintenance tasks easily. Creation of group targets, based on related and homogeneous targets, as well as modeling related components into a system, brings in the required business outlook in viewing the IT infrastructure. Modeling the business functions as services brings in the end user dimension while managing the data centers. Service modeling also helps in determining the service-levels based on availability and performance as perceived by the service consumers. This end user-based focus towards IT management is a must for continuous improvement in IT operations. Such an improvement is a key to transforming IT, to scale up to meet the business challenges in any organization. To wrap up, this chapter also gave an introduction to the industry standards such as ITIL in business service modeling and governance. To put it in a nutshell, we saw that BSM requires the right mix of modeling and mindset.
The next chapter will introduce Oracle Enterprise Manager (OEM) 11g. OEM is a product offering from Oracle that provides solutions to the typical infrastructure management problems described in this chapter. The next chapter will also introduce the key terminologies and concepts used within OEM. Understanding them will be essential for effective deployment of OEM for managing business services.