Common performance issues

Threading performance issues

Threading performance issues are those related to concurrency, as follows:

  • Lack of threading or excessive threading
  • Threads blocked to the point of starvation (usually from competing over shared resources)
  • Deadlocks that can hang the complete application (threads waiting for each other)

Memory performance issues

Memory performance issues are those related to application memory management, as follows:

  • Memory leakage: This issue is an explicit leakage or an implicit one, as seen in improper hashing
  • Improper caching: This issue is due to over caching, inadequate object sizes, or missing essential caching
  • Insufficient memory allocation: This issue is due to missing JVM memory tuning

Algorithmic performance issues

Implementing the application logic involves two important, related parameters: correctness and optimization. If the logic is not optimized, we have algorithmic issues, as follows:

  • Costly algorithmic logic
  • Unnecessary logic

Work as designed performance issues

Work as designed performance issues are a group of issues related to the application design. The application behaves exactly as designed, but the design itself has issues that lead to performance problems. Some examples are as follows:

  • Using synchronous processing where asynchronous should be used
  • Neglecting remoteness, that is, using remote calls as if they were local calls
  • Improper loading technique, that is, eager versus lazy loading techniques
  • Improper selection of the object size
  • Excessive serialization layers
  • Web services granularity
  • Too much synchronization
  • Non-scalable architecture, especially in the integration layer or middleware
  • Saturated hardware on a shared infrastructure

Interfacing performance issues

Whenever the application is dealing with resources, we may face the following interfacing issues that could impact our application performance:

  • Using an old driver/library
  • Missing frequent database housekeeping
  • Database issues, such as missing database indexes
  • Low performing JMS or integration service bus
  • Logging issues (excessive logging or not following the best practices while logging)
  • Network component issues, that is, load balancer, proxy, firewall, and so on

Miscellaneous performance issues

Miscellaneous performance issues include different performance issues, as follows:

  • Inconsistent performance of application components, for example, having slow components can cause the whole application to slow down
  • Introduced performance issues to delay the processing speed
  • Improper configuration tuning of different components, for example, JVM, application server, and so on
  • Application-specific performance issues, such as excessive validations, applying many business rules, and so on

Fake performance issues

Fake performance issues could be temporary issues or not issues at all. Famous examples are as follows:

  • Networking temporary issues
  • Scheduled running jobs (detected from the associated pattern)
  • Software automatic updates (it must be disabled in production)
  • Non-reproducible issues

In the following sections, we will go through some of the listed issues.

Threading performance issues

Multithreading has the advantage of maximizing the hardware utilization. In particular, it maximizes the processing power by executing multiple tasks concurrently. But it has different side effects, especially if not used wisely inside the application.

For example, in order to distribute tasks among different concurrent threads, there should be no or minimal data dependency, so each thread can complete its task without waiting for other threads to finish. Also, they shouldn't compete over different shared resources or they will be blocked, waiting for each other. We will discuss some of the common threading issues in the next section.

Blocking threads

A common issue is threads blocked while waiting to obtain the monitor(s) of certain shared resources (objects) that are held by other threads. If most of the application server threads are consumed in such a blocked status, the application gradually becomes unresponsive to user requests.

In the WebLogic application server, if a thread keeps executing for more than a configurable period of time (that is, it is not idle), it gets promoted to a stuck thread. The more threads are in the stuck status, the more critical the server status becomes. Configuring the stuck thread parameters is part of WebLogic performance tuning.

Performance symptoms

The following performance symptoms usually appear in cases of thread blocking:

  • Slow application response (increased single request latency and pending user requests)
  • Application server logs might show some stuck threads
  • The server's health status becomes critical on monitoring tools (application server console or different monitoring tools)
  • Frequent application server restarts, either manual or automatic
  • A thread dump shows a lot of threads in the blocked status, waiting for different resources
  • Application profiling shows a lot of thread blocking

An example of thread blocking

To understand the effect of thread blocking on application execution, open the HighCPU project and measure the time it takes for execution by adding the following additional lines:

long start = new Date().getTime();
..
..
long duration = new Date().getTime() - start;
System.err.println("total time = " + duration);

Now, try to execute the code with different thread pool sizes. We can try pool sizes of 50 and 5, and compare the results. In our results, the execution of the application with 5 threads is much faster than with 50 threads!

Let's now compare the NetBeans profiling results of both the executions to understand the reason behind this unexpected difference.

The following screenshot shows the profiling of 50 threads; we can see a lot of blocking on the monitors, with the Monitor percentage in the left column at around 75 percent:

To get the preceding profiling screen, click on the Profile menu inside NetBeans, and then click on Profile Project (HighCPU). From the pop-up options, select Monitor and check all the available options, and then click on Run.

The following screenshot shows the profiling of 5 threads, where there is almost no blocking, that is, fewer threads compete over these resources:

Try to remove the System.out statement from inside the run() method, re-execute the tests, and compare the results.

Another factor that affects the selection of the pool size, especially when the thread execution takes a long time, is the context switching overhead. This overhead requires the selection of an optimal pool size, usually related to the number of processors available to our application.

Context switching is the CPU switching from one process (or thread) to another, which requires restoration of the execution data (different CPU registers and program counters). The context switching includes suspension of the current executing process, storing its current data, picking up the next process for execution according to its priority, and restoring its data.

Although context switching is supported at the hardware level and hardware switching is faster, most operating systems perform software context switching to improve overall performance. The main reason is that software context switching can selectively choose which registers to save.
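The relation between pool size and processor count mentioned above can be sketched with a common sizing heuristic (an illustrative sketch, not the book's example code; the wait/compute ratio formula is a widely used rule of thumb, not a guarantee):

```java
// Illustrative pool-sizing heuristic: CPU-bound tasks need roughly one
// thread per core, while I/O-bound tasks can use more threads because
// they spend part of their time blocked waiting (wait/compute ratio).
public class PoolSizing {

    // For CPU-bound work: one thread per available processor.
    public static int cpuBoundPoolSize() {
        return Runtime.getRuntime().availableProcessors();
    }

    // For I/O-bound work: scale by the ratio of wait time to compute time.
    public static int ioBoundPoolSize(double waitTimeMs, double computeTimeMs) {
        int cores = Runtime.getRuntime().availableProcessors();
        return (int) (cores * (1 + waitTimeMs / computeTimeMs));
    }

    public static void main(String[] args) {
        System.out.println("CPU-bound pool size: " + cpuBoundPoolSize());
        // A task that waits 90 ms for every 10 ms of computation:
        System.out.println("I/O-bound pool size: " + ioBoundPoolSize(90, 10));
    }
}
```

With such a heuristic, a 50-thread pool on a 4-core machine running CPU-bound work is clearly oversized, which matches the profiling results above.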

Thread deadlock

A deadlock occurs when threads hold the monitors of objects that other threads need while waiting for monitors those other threads hold; unless the implementation uses the explicit Lock interface, the threads will wait forever. In the example, we had a deadlock caused by two different threads, each waiting to obtain the monitor that the other thread held.

The thread profiling will show these threads in a continuous blocking status, waiting for the monitors. All threads that go into the deadlock status become out of service for the user's requests, as shown in the following screenshot:

Usually, this happens when the order of obtaining the locks is not planned. For example, as a quick and easy fix for a bidirectional transfer deadlock, we can always lock the smallest or the largest bank account first, regardless of the transfer direction. This will prevent any deadlock from happening in our simple two-threaded model. But if we have more threads, we need a much more mature way to handle this, by using the Lock interface or some other technique.
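The ordered-locking fix described above can be sketched as follows (a minimal illustration with a simplified, hypothetical Account type, not the book's actual example code):

```java
// Deadlock avoidance by lock ordering: every transfer always locks the
// account with the smaller id first, regardless of transfer direction,
// so a circular wait between two threads cannot occur.
public class OrderedTransfer {

    public static class Account {
        final int id;
        long balance;
        Account(int id, long balance) { this.id = id; this.balance = balance; }
    }

    public static void transfer(Account from, Account to, long amount) {
        // Pick a global order: smallest id first, regardless of direction.
        Account first = from.id < to.id ? from : to;
        Account second = (first == from) ? to : from;
        synchronized (first) {
            synchronized (second) {
                from.balance -= amount;
                to.balance += amount;
            }
        }
    }
}
```

Because both transfer directions acquire the same lock first, two opposing transfers can no longer each hold one monitor while waiting for the other.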

Memory performance issues

In spite of all the JVM's effort to allocate and free memory in an optimized way, we still see memory issues in Java enterprise applications, mainly because of how these applications deal with memory.

We will discuss mainly three types of memory issues: memory leakage, memory allocation, and application data caching.

Memory leakage

Memory leakage is a common performance issue in which the garbage collector is not at fault; it is mainly a design/coding issue where an object is no longer required but remains referenced in the heap, so the garbage collector can't reclaim its space. If this is repeated with different objects over a long period (depending on object sizes and the scenarios involved), it may lead to an out of memory error.

The most common example of memory leakage is adding objects to a static collection (or an instance collection of a long-lived object, such as a servlet) and forgetting to clean the collection totally or partially.
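The static-collection pattern just described can be sketched as follows (the class and method names are hypothetical, chosen only to illustrate the leak and its missing cleanup):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the leak pattern: a static collection keeps growing because
// entries are added per request but never removed, so the garbage
// collector can never reclaim the referenced objects.
public class LeakyRegistry {

    private static final Map<String, byte[]> CACHE = new HashMap<>();

    // Called on every request; without ever removing entries, this leaks.
    public static void remember(String sessionId, byte[] data) {
        CACHE.put(sessionId, data);
    }

    // The missing cleanup: remove the entry when the session ends.
    public static void forget(String sessionId) {
        CACHE.remove(sessionId);
    }

    public static int size() {
        return CACHE.size();
    }
}
```

If `forget()` is never called (for example, from a session listener), the map retains every entry for the lifetime of the application.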

Performance symptoms

The following are some of the expected performance symptoms of a memory leakage in our application:

  • The application's heap memory usage increases over time
  • The response slows down gradually due to memory congestion
  • OutOfMemoryError appears frequently in the logs, sometimes requiring an application server restart
  • Aggressive execution of garbage collection activities
  • A heap dump shows a lot of retained objects (of the leaked types)
  • A sudden increase in memory paging, as reported by operating system monitoring tools

An example of memory leakage

We have a sample application ExampleTwo; this is a product catalog where users can select products and add them to the basket. The application is written in spaghetti code, so it has a lot of issues, including bad design, improper object scopes, bad caching, and memory leakage. The following screenshot shows the product catalog browser page:

One of the bad practices is the usage of servlet instance (or static) members, as they cause a lot of issues across multiple threads and are a common location for unnoticed memory leakages.

We have added the following instance variable as a leakage location:

private final HashMap<String, HashMap> cachingAllUsersCollection = new HashMap();

We will add some collections to the preceding map to cause memory leakage. We also used caching in the session scope, which causes implicit leakage. Session scope leakage is difficult to diagnose, as it follows the session life cycle: once the session is destroyed, the leakage stops. We can say it is less severe but more difficult to catch.

Adding global elements, such as a catalog or stock levels, to the session scope has no meaning. The session scope should only be restricted to the user-specific data. Also, forgetting to remove data that is not required from a session makes the memory utilization worse. Refer to the following code:

@Stateful
public class CacheSessionBean

Instead of using a singleton class or a stateless bean with a static member here, we used a stateful bean, so it is instantiated per user session. We used JPA beans across the application layers instead of view objects. We also looped over collections instead of querying or retrieving the required objects directly, and so on.

It would be good to troubleshoot this application with different profiling aspects to fix all these issues. All these factors are enough to describe such a project as spaghetti.

We can use our knowledge in Apache JMeter to develop simple testing scenarios. As shown in the following screenshot, the scenario consists of catalog navigations and details of adding some products to the basket:

Executing the test plan with many concurrent users over many iterations shows our application's bad behavior: the used memory increases over time. There is no justification for this, as the catalog is the same for all users and there is no user-specific data except the IDs of the selected products. These IDs do need to be saved in the user session, but they won't take any remarkable memory space.

In our example, we intentionally save a lot of objects in the session, implement a wrong session-level cache, and implement meaningless servlet-level caching. All of this contributes to memory leakage. This gradual increase in memory consumption is what we need to spot in our environment as early as possible (as we can see in the following screenshot, the memory consumption in our application is approaching 200 MB!):

Improper data caching

Caching is one of the critical components in the enterprise application architecture. It increases the application performance by decreasing the time required to query the object again from its data store, but it also complicates the application design and causes a lot of other secondary issues.

The main concerns in the cache implementation are caching refresh rate, caching invalidation policy, data inconsistency in a distributed environment, locking issues while waiting to obtain the cached object's lock, and so on.

Improper caching issue types

The improper caching issue can take a lot of different variants. We will pick some of them and discuss them in the following sections.

No caching (disabled caching)

Disabled caching will definitely cause a big load on the interfacing resources (for example, the database) by hitting them with almost every interaction. This should be avoided while designing an enterprise application; otherwise, the application won't be usable.

Fortunately, this has less impact than using a wrong caching implementation!

Most application components, such as the database, JPA, and application servers, already have out-of-the-box caching support.

Too small caching size

Too small a caching size is a common performance issue, where the cache size is initially determined but is never reviewed as the application data grows. Cache sizing is affected by many factors, such as the available memory (whether it allows more caching) and the type of data: lookup data should be cached entirely when possible, while transactional data shouldn't be cached unless required, and then only under a very strict locking mechanism.

Also, the cache replacement policy and invalidation play an important role and should be tailored according to the application's needs, for example, least frequently used, least recently used, most frequently used, and so on.

As a general rule, the bigger the cache size, the higher the cache hit rate and the lower the cache miss ratio. The proper replacement policy also contributes here; if we are working, as in our example, on an online product catalog, we may use the least recently used policy so that all the old products are removed, which makes sense as users usually look for new products.

Monitoring of the caching utilization periodically is an essential proactive measure to catch any deviations early and adjust the cache size according to the monitoring results. For example, if the cache saturation is more than 90 percent and the missed cache ratio is high, a cache resizing is required.

Missed cache hits are very costly, as they hit the cache first, then the resource itself (for example, the database) to get the required object, and then add this loaded object into the cache again, possibly evicting another object (if the cache is 100 percent full) according to the cache replacement policy in use.

Too big caching size

Too big a caching size might cause memory issues. If there is no control over the cache size and it keeps growing, and if it is a Java heap cache, the garbage collector will consume a lot of time trying to reclaim that huge memory. This will increase the garbage collection pause time and decrease the cache throughput.

If the cache throughput decreases, the latency of getting objects from the cache increases, causing the retrieval cost to become so high that it might be slower than hitting the actual resource (for example, the database).

Using the wrong caching policy

Each application's cache implementation should be tailored according to the application's needs and data types (transactional versus lookup data). If the selection of the caching policy is wrong, the cache will affect the application performance rather than improving it.

Performance symptoms

According to the cache issue type and different cache configurations, we will see the following symptoms:

  • Decreased cache hit rate (and increased cache miss ratio)
  • Increased cache loading time because of the improper size
  • Increased cache latency with a huge cache size
  • A spiky pattern in performance testing response times; an incorrect cache size causes continuous invalidation and reloading of the cached objects

An example of improper caching techniques

In our example, ExampleTwo, we have demonstrated many caching issues: no policy is defined, the global cache is wrong, the local cache is improper, and no cache invalidation is implemented, so we can have stale objects inside the cache.

Cache invalidation is the process of refreshing or updating the existing object inside the cache or simply removing it from the cache. So in the next load, it reflects its recent values. This is to keep the cached objects always updated.

Cache hit rate is the rate or ratio in which cache hits match (finds) the required cached object. It is the main measure for cache effectiveness together with the retrieval cost.

Cache miss rate is the rate or ratio at which the required object is not found in the cache.

Last access time is the timestamp of the last access (successful hit) to the cached objects.

Caching replacement policies (or algorithms) are the algorithms a cache implements to replace existing cached objects with new ones when there is no room available for additional objects; replacement follows a cache miss for the new object. Some examples of these policies are as follows:

  • First-in-first-out (FIFO): In this policy, the cached objects are aged and the oldest object is removed in favor of the newly added ones.
  • Least frequently used (LFU): In this policy, the cache picks the least frequently used object to free the memory, which means the cache records usage statistics for each cached object.
  • Least recently used (LRU): In this policy, the cache replaces the least recently accessed or used items; this means the cache keeps information such as the last access time of all cached objects.
  • Most recently used (MRU): This policy is the opposite of the previous one; it removes the most recently used items. This policy fits applications where items are no longer needed after access, such as used exam vouchers.
  • Aging policy: In the simple form, every object in the cache has an age limit, and once it exceeds this limit, it is removed from the cache. In the advanced form, invalidation also follows predefined configuration rules, for example, every three hours, and so on.
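An LRU policy, for instance, can be sketched in plain Java on top of LinkedHashMap (an illustrative implementation, not tied to any particular cache library or to the book's example project):

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Minimal LRU cache: a LinkedHashMap in access order evicts the least
// recently used entry once the configured capacity is exceeded.
public class LruCache<K, V> extends LinkedHashMap<K, V> {

    private final int capacity;

    public LruCache(int capacity) {
        super(16, 0.75f, true); // true = iterate in access order, not insertion order
        this.capacity = capacity;
    }

    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > capacity; // evict the least recently used entry
    }
}
```

Touching an entry with `get()` marks it as recently used, so old, unvisited products would be the first ones evicted, matching the product catalog scenario above.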

It is important for us to understand that caching is not our magic bullet and it has a lot of related issues and drawbacks. Sometimes, it causes overhead if not correctly tailored according to real application needs.

Work as designed performance issues

In work as designed performance issues, the application behaves exactly as designed but the design itself has issues that lead to bad performance. Let's go through some examples.

Synchronous where asynchronous is required

The design assumes that some parts of the application can be executed in sequence, without considering the time expected to be spent in some elements of this flow (or the retry logic in certain services). This leads to bad performance in these transactions, while in fact the application is working as designed.

For example, if the transaction needs to send an e-mail to the customer and the e-mail server is not responding, the request will end with a timeout after a configurable period; if retry logic is applied, the user will wait even longer until a response is sent back from the application.

Performance symptoms

The identification of these issues usually results from application analysis (for example, code inspection, profiling analysis, and so on), but we can expect the following general symptoms when we use synchronous code where we should use asynchronous code:

  • Slow response times, intermittent or consistent, in certain application areas
  • The customer browser intermittently shows timeout messages, such as the 408 Request Timeout error

An example of improper synchronous code

A common example of improper synchronous code is how online orders are submitted. As a part of submission of the order, a lot of internal systems communication is usually required. We shouldn't let the customer wait all this time, but instead, we need to show the confirmation page after executing only the essential steps. All other steps that can be executed in the background should be communicated to the customer by the asynchronous communication, that is, Ajax calls or e-mails.
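The background-step idea above can be sketched with a plain ExecutorService (an illustrative sketch; OrderService, submitOrder, and sendEmail are hypothetical names, and a real implementation would use the container's managed executor or JMS rather than a bare thread pool):

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Moving a slow step (e-mail sending) off the request thread: the order
// confirmation returns immediately, and the notification is handed to a
// background executor.
public class OrderService {

    private final ExecutorService mailExecutor = Executors.newFixedThreadPool(2);
    final AtomicInteger sentMails = new AtomicInteger();

    public String submitOrder(String orderId) {
        // Essential synchronous steps only (persist order, reserve stock, ...).
        mailExecutor.submit(() -> sendEmail(orderId)); // fire-and-forget
        return "CONFIRMED:" + orderId;                 // the user sees this at once
    }

    private void sendEmail(String orderId) {
        // A real implementation would talk to the mail server here.
        sentMails.incrementAndGet();
    }

    public void shutdown() {
        mailExecutor.shutdown();
        try {
            mailExecutor.awaitTermination(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Even if the mail server is slow or retries are needed, the user's request latency is unaffected because the wait happens on the executor's threads.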

Neglecting remoteness

In distributed development with different system components, developers are sometimes not aware that they are actually making remote calls when an abstract layer or an integration library is used. So whenever they need to get some information from this interface, they use it without any concerns.

This is a typical development issue. What we mean here by neglecting remoteness is that the designer does not respect the remoteness of the systems being called during the design of the application. These remote calls consume time and could also involve serialization operations; both impact the transaction performance.

Performance symptoms

The following symptoms usually appear when we neglect the cost of the remote calls:

  • Consistent delay of the application transactions
  • Performance tuning cycles do not cause any actual improvements
  • Code analysis suggests the response time is almost spent in certain remote calls
  • Mismatch between the number of transactions and the number of remote service calls (for example, remote calls double the number of transactions)

An example of using remote calls as local calls

The application displays a lot of the vendor's products. We need to check the product availability and the latest prices prior to displaying it, adding it to the basket, or executing the final checkout.

The operation seems simple, but it is a remote call. The response time depends on the remote system, and a sequence of these operations will impact the application response time.

Excessive serialization performance impact

The system design follows the best practices of isolating different application layers and organizing them into loosely coupled layers. One of these layers consists of fine-tuned web services that are orchestrated into larger wrapper web services. This will cause each request to these orchestrated services to go through many serialization layers, from the wrapper web services to the subsequent fine service calls.

Performance symptoms

Excessive serialization can lead to the following performance symptoms:

  • Consistent slow performance of the application, in particular, under load
  • Low throughput of the application
  • Taking a thread dump under load will show a lot of threads in serialization logic

An example of excessive serialization

The following two issues represent good examples of the excessive serialization issue.

Object size impact

Determining the object size is essential, as it can impact performance if not optimally designed. For example, small objects can affect performance in the interfacing layer, where a lot of calls would be required to assemble all the needed objects, even if only a few attributes are actually needed from each one.

Large objects can also produce useless overhead over network transmission. The same effect takes place from using complex nested object structures. The following diagram represents the impact of selecting the object size during designing, in particular, the interfacing specifications:

If these calls involve serialization of any form of data, for example, XML or JSON, it produces additional overhead.

Another aspect of the object size impact is memory consumption: if we save the User object in the session and this object holds a lot of unnecessary data, it is a waste of memory. Instead, we should add only the minimal required information to the session to efficiently utilize the application memory.

Web services granularity impact

Similar to the object size performance concern, web services should be designed to fulfill the requirement in the least number of service calls rather than having multiple calls that produce overhead and decrease the performance of the application.

Let's assume that we have a web service that returns the weather forecast for a given city. We can select one of the following design options:

  • Weather forecast per day: if the application wants a week's data, it must call this option seven times with a different day each time
  • Weather forecast per week: if the application wants a month's data, it must call this option four to five times to get the complete month, and if it needs just a single day's detail, it must discard the extra data
  • Weather forecast per month: if the application wants a week's or a day's data, it must filter out the extra data
  • Weather forecast for a required period: the caller sends the start date and the number of days the forecast is required for, with a maximum of 30 days of data returned

These are design options, and the decision should be taken according to application needs, but we can see that the fourth option is equivalent to implementing the other three call types. So, it would be better to have a single call of the fourth type rather than all of these call types. At the same time, the three fine-grained call types wouldn't serve all possible scenarios (for example, retrieving the weather forecast for 10 days) without making multiple calls or returning extra data, which produces performance overhead to retrieve such data from the database and serialize it back to the caller.
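The fourth, coarse-grained option could be sketched as follows (an illustrative sketch; WeatherService, forecast, and the placeholder return values are hypothetical, and a real service would define a proper data contract):

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.List;

// One coarse-grained operation that takes a start date and a day count
// (capped at 30), instead of separate per-day, per-week, and per-month
// calls. The caller gets exactly the range it needs in a single call.
public class WeatherService {

    public static final int MAX_DAYS = 30;

    public List<String> forecast(String city, LocalDate start, int days) {
        int capped = Math.min(days, MAX_DAYS); // never return more than 30 days
        List<String> result = new ArrayList<>();
        for (int i = 0; i < capped; i++) {
            // A real service would look the forecast up; we return placeholders.
            result.add(city + ":" + start.plusDays(i));
        }
        return result;
    }
}
```

A 10-day forecast then becomes one call rather than ten per-day calls or one per-month call with discarded data.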

In some design decisions aimed at reusing code, a new web service is created that provides a wider user experience by orchestrating many calls to the old web services. This produces performance degradation, especially if these calls are remote.

Selected performance issues

In this section, we will pick some performance issues and discuss them in more detail.

Unnecessary application logic

Here, application developers usually lack a good understanding of the capabilities of the framework in use, so they add extra, unnecessary logic that produces extra database hits, memory consumption, or even processing power consumption. Such unwanted code can only be detected if it causes extra hits to resources or external calls, and the best way to detect it is to perform manual code inspection or profiling of the application.

If we open our project, ExampleTwo, we will find a lot of good examples of extra unnecessary logic, such as loading the whole collection to search for an instance inside it, where we can retrieve it directly. Refer to the following code:

ProductStock[] stocks = catalogSessionBean.loadAllStocks();
for (ProductStock stock : stocks) {
    if (stock.getProductId().getId() == id) {
        if (catalogSessionBean.updateStock(stock, -1)) {
            basketBean.addToBasket(id);
            . . .
        }
        . . .
    }
}
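The direct-retrieval alternative can be sketched in plain Java (ProductStock here is a simplified, hypothetical stand-in; in the real application, a JPA query by id would achieve the same effect without loading the whole collection):

```java
import java.util.HashMap;
import java.util.Map;

// Replacing the linear scan with a keyed lookup: build a Map indexed by
// product id once, then retrieve each stock in constant time instead of
// looping over the whole array per request.
public class StockIndex {

    public static class ProductStock {
        final int productId;
        int count;
        public ProductStock(int productId, int count) {
            this.productId = productId;
            this.count = count;
        }
    }

    private final Map<Integer, ProductStock> byId = new HashMap<>();

    public StockIndex(ProductStock[] stocks) {
        for (ProductStock stock : stocks) {
            byId.put(stock.productId, stock);
        }
    }

    // O(1) lookup instead of an O(n) scan for every basket operation.
    public ProductStock find(int productId) {
        return byId.get(productId);
    }
}
```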

Also, we can see that the operation of decreasing the stock once a product is added to the basket is not correct. This should operate, if required, at the memory level only, and not actually update the database unless the order is finally submitted. In that case, we wouldn't need the session listener and all this unwanted spaghetti code! Refer to the following code:

for (BasketElement basketElement : basketBean.getBasketElements()) {
    ProductStock currentStock = null;
    for (ProductStock stock : stocks) {
        if (stock.getProductId().getId() == basketElement.getProductId()) {
            currentStock = stock;
            break;
        }
    }
    catalogSessionBean.updateStock(currentStock, basketElement.getCount());
}

A third example in this project is the bad product catalog filtering logic. It should construct the query according to the parameters rather than enumerating all the possible combinations in logic so bad that it misses certain scenarios. If the code is not documented, no one will be able to catch these issues easily. Refer to the following code:

if (criteria.getProductCategory() == 0 && criteria.getPrice() > 0
        && criteria.getSearchKeyword() == null) {
    query = em.createNamedQuery("Product.findByPrice");
    query.setParameter("price", criteria.getPrice());
} else if (criteria.getProductCategory() > 0 && criteria.getPrice() == 0
        && criteria.getSearchKeyword() == null) {
    query = em.createNamedQuery("Product.findByCategoryId");
    query.setParameter("categoryId", criteria.getProductCategory());
} else if ( ...
    ... // rest of the bad logic code

Similar code shouldn't pass either an automatic or a manual code review, and we shouldn't allow such code in our enterprise application. In the production environment, changing such bad logic is not recommended unless the impact assessment is clear; otherwise, it could lead to application malfunction. It is better to address such coding issues early in the development phase, by both automated and manual code reviews.

A lot of better-quality alternatives are available to solve the previous coding issue. These include using the standard JPA Criteria API or dynamically constructing the required query.
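The dynamic-construction alternative can be sketched as plain JPQL string building (illustrative names; in a real JPA application, the Criteria API achieves the same more safely, and the parameters would then be bound on the resulting query):

```java
import java.util.ArrayList;
import java.util.List;

// Building the query dynamically from whichever criteria are set,
// instead of hard-coding every if/else combination of the parameters.
public class ProductQueryBuilder {

    public static String build(int categoryId, double price, String keyword) {
        List<String> conditions = new ArrayList<>();
        if (categoryId > 0) {
            conditions.add("p.categoryId = :categoryId");
        }
        if (price > 0) {
            conditions.add("p.price <= :price");
        }
        if (keyword != null) {
            conditions.add("p.name LIKE :keyword");
        }
        StringBuilder jpql = new StringBuilder("SELECT p FROM Product p");
        for (int i = 0; i < conditions.size(); i++) {
            jpql.append(i == 0 ? " WHERE " : " AND ").append(conditions.get(i));
        }
        return jpql.toString();
    }
}
```

Every combination of criteria is now handled by the same few lines, so no scenario can be silently missed.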

Excessive application logging

Logging is very useful for troubleshooting applications, especially in the production environment, where understanding past user actions or transactions is usually impossible without meaningful logging.

Logging must strictly follow best practice guidelines or it will impact our application performance. If we look back at our HighCPU example, we can see that all the threads were blocked, waiting to obtain the lock on the System.out logging object.

Special attention should be paid when logging XML structures, as this degrades the application performance severely; such logging shouldn't be added without an if(debugEnabled) condition unless it is not on a common execution path.

It is important to ensure that the logging configurations are correctly deployed in the production environment, as sometimes the performance issue is simply that the application debug level was incorrectly enabled.
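The debug guard described above can be sketched with java.util.logging (an illustrative sketch; expensiveXmlDump is a hypothetical stand-in for any costly message construction, such as serializing an XML payload):

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.logging.Level;
import java.util.logging.Logger;

// Guarded debug logging: the expensive message (for example, a large XML
// payload) is only built when the debug level is actually enabled, so
// production with a higher log level pays no serialization cost.
public class GuardedLogging {

    private static final Logger LOGGER =
            Logger.getLogger(GuardedLogging.class.getName());
    static final AtomicInteger expensiveCalls = new AtomicInteger();

    static String expensiveXmlDump() {
        expensiveCalls.incrementAndGet(); // count how often we pay the cost
        return "<order>...</order>";
    }

    public static void process() {
        if (LOGGER.isLoggable(Level.FINE)) { // guard: skip the work entirely
            LOGGER.fine("payload: " + expensiveXmlDump());
        }
    }
}
```

Without the `isLoggable` guard, the string concatenation and XML dump would run on every request even when FINE logging is disabled.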

Database performance issues

A database is one of the biggest concerns in enterprise application performance: the data sometimes grows to a huge size and gradually affects the application performance. Different database issues can affect the application performance, as follows:

  • Using an old database version
  • Using an old JDBC driver library
  • Missing table indexes in frequently used tables (SQL tuning principles)
  • ORM layer tuning (such as JPA caching and index preallocation)
  • Missing batch or bulk database operations in massive database manipulations
  • Missing regular database housekeeping

Our main concern here is regular database housekeeping, which is essentially required and should be planned while designing the application, not when the performance issues show up! This housekeeping includes the following examples:

  • Having frequent backups
  • Partitioning the large tables
  • Cleaning the different table spaces, for example, temporary and undo table space
  • Archiving the old records

When the database size exceeds the handling limit, it will impact the application performance, and it will be very difficult to apply all tuning techniques without a service outage. The database size issue is unique to the production environment and can't be replicated elsewhere because of the big difference between the production environment and the other environments.

When we have database performance issues, we need to ensure that the following steps are performed:

  • Checking the database performance report periodically such as the Oracle AWR report
  • Tuning the database to suit the application
  • Identifying slow queries and analyzing them through database-specific analysis tools, for example, the SQL execution plan
  • Monitoring database server performance and fixing any issues
  • Using the latest database drivers
  • Performing any necessary housekeeping activities
  • Adding a caching layer if it does not already exist
  • Changing the application persistence strategy, such as using batch or bulk loading techniques for large data insertions instead of separate updates

Bulk database manipulations should be used where possible. Using database-supported bulk operations when there are a lot of database operations converts many fine-grained operations into fewer coarse-grained operations and can improve the application performance.
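The standard JDBC batching API is one way to apply this. The sketch below (the PRODUCT table, its NAME column, and the batch size are hypothetical) queues rows locally with addBatch() and sends them to the server in groups, instead of one round trip per row:

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.SQLException;
import java.util.List;

public class BulkInsert {
    static final int BATCH_SIZE = 100;

    // True when the queued batch should be flushed after processing row i (0-based):
    // either the batch is full, or this is the last row
    static boolean flushPoint(int i, int total, int batchSize) {
        return (i + 1) % batchSize == 0 || i == total - 1;
    }

    // Inserts all rows using JDBC batching instead of one statement per row
    static void insertProducts(Connection con, List<String> names) throws SQLException {
        String sql = "INSERT INTO PRODUCT (NAME) VALUES (?)"; // hypothetical table
        try (PreparedStatement ps = con.prepareStatement(sql)) {
            for (int i = 0; i < names.size(); i++) {
                ps.setString(1, names.get(i));
                ps.addBatch();                         // queue the row locally
                if (flushPoint(i, names.size(), BATCH_SIZE)) {
                    ps.executeBatch();                 // one round trip per batch
                }
            }
        }
    }
}
```

Flushing in fixed-size chunks rather than once at the end also keeps the driver's batch buffer bounded for very large insertions.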

Going back to our example application, we profiled it using JProfiler while executing the performance test (using Apache JMeter). We can see the following database Hot Spots:

The interesting finding in the preceding screenshot is that it points to a coding issue. We concluded this from the number of calls of the different queries, as shown in the Events column. If we are hitting the database with thousands of calls for just six product catalog items, we clearly have a bad coding issue that needs to be fixed before we continue with any performance improvement or database tuning.

It is worth mentioning that one of the database-related issues is the data loading policy, that is, the eager versus lazy loading techniques. In each application, we need to select the proper technique for each transactional scenario according to the size of the data, the type of data, the frequency of data changes, and whether data caching is used.
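In JPA, this choice is expressed through the fetch attribute on the relationship mappings. The entities below are a hypothetical sketch, not the example application's actual model:

```java
import javax.persistence.Entity;
import javax.persistence.FetchType;
import javax.persistence.Id;
import javax.persistence.ManyToOne;
import javax.persistence.OneToMany;
import java.util.List;

@Entity
public class Order {

    @Id
    private Long id;

    // Loaded immediately with the order: suitable for small data
    // that virtually every transaction needs
    @ManyToOne(fetch = FetchType.EAGER)
    private Customer customer;

    // Loaded only on first access: suitable for large collections
    // that many transactions never touch
    @OneToMany(mappedBy = "order", fetch = FetchType.LAZY)
    private List<OrderLine> lines;
}
```

Note that lazy loading only pays off if the collection is genuinely accessed rarely; accessing it in a loop reintroduces the many-small-queries problem (the classic N+1 selects issue).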

Missing proactive tuning

In any enterprise system architecture, different application components usually have the performance tuning suggestions that fit each application type. Such a tuning should be done in advance (that is, proactively) and not after facing performance issues.

We can definitely further tune these components in cases of performance issues according to application performance results. But as a baseline, we should be doing the required tuning in advance. An example of such tuning is the operating system tuning, application server tuning, JVM tuning, database tuning, and so on.

Client-side performance issues

Performance issues on the client side mean either a JavaScript coding issue or the slow loading of different resources; cascading style sheets (CSS) rarely need to be considered.

With the advance of web technologies, a lot of developers now rely on the Ajax technology to do a lot of things in the application interface such as loading and submitting contents. Some frameworks come with components that support the Ajax calls, such as JSF and ADF.

The initial set of questions that we need to answer when we face a client-side performance issue is as follows:

  • Is the issue related to the browser type?
  • Is it related to the page size or resources size?
  • What happens if we disable JavaScript in the browser?
  • Did this issue occur in the performance testing tool?

Performance testing tools such as Apache JMeter won't execute the JavaScript code, so don't rely on these tools to catch JavaScript issues. Instead, we can use this as an advantage: having performance issues in browsers but not in JMeter suggests that we are most probably facing a JavaScript issue.

A good thing is that all browsers these days have integrated tools that can be used to troubleshoot rendering content. Usually, they are named Developer tools. Also, they have additional useful third-party plugins that can be used for the same purpose.

We can also use the external tools (they also have plugins for different browsers), such as DynaTrace or Fiddler.

Chrome developer tools

To follow along, execute the ExampleTwo application in the Chrome browser, press F12 or select Developer tools from the Tools menu, and then reload the page.

This example uses Chrome Version 30.0.x m.

Download the latest Chrome browser from

Network analysis

We can see that a lot of useful information is available and it is organized in different tabs. If we select the Network tab, we can see the loading time of different resources, as shown in the following screenshot. This facilitates identifying if a certain resource takes time to load or if it has a large size:

If we move the cursor over any of these Timeline figures, we will see more details about the DNS lookup and latency, or we can click on any row to open it for more details. In the following screenshot, we can see one request detail organized in different tabs, including the Headers, Preview, Response, Cookies, and Timing details:

JavaScript profiling

Open the Chrome browser and navigate to any website, for example, the Packt Publishing website,

Now, switch to the Profiles tab, click on the Start button, and refresh the page by pressing F5. Once the page is completely loaded, click on the Stop button.

Now, we have just profiled the JavaScript code. The profiling data will appear under the name Profile 1, and we will see the performance of the JavaScript code with the CPU time of every method. A link to the corresponding script source code is available if the method belongs to our application's JavaScript, as shown in the following screenshot:

A lot of display features are available, for example, if we click on the toggle button, %, it will show the actual time spent rather than the relative percentage time.

These useful tools will help identify the heaviest JavaScript methods that need some tuning.

Speed Tracer

Speed Tracer is a Chrome plugin tool that helps to identify and fix potential performance issues. It shows different visualized metrics with recommendation hints.

As it instruments the low-level points in the browser, it can help in identifying and locating different issues related to different phases, such as the JavaScript parsing and execution, CSS, DOM event handling, resource loading, and XMLHttpRequest callbacks. Refer to the following screenshot:

Here is an example of the tool's recommendation hints. These are all Info-level hints, meaning there are no major issues, as shown in the following screenshot:

Speed Tracer can be downloaded from the following URL:

Internet Explorer developer tools

The developer tools are available in Internet Explorer. If we press F12, it will open similar developer tools. Select the Network tab, click on the Start capturing button, and then open the same page, that is,

We can see nearly the same tabs that exist in the Chrome developer tools (arranged to the left), and we can start profiling and reload the page to obtain a JavaScript profiling snapshot. The following screenshot is taken from Internet Explorer Version 11.0.x:

Firefox developer tools

In Firefox, we can select Toggle Tools from the Web Developer menu option to open the tools. The tools have the same features that exist in IE and Chrome. The following screenshot shows the developer tools in Firefox (the screenshot is taken from Firefox Version 24.0, which is a portable version):

Navigation Timing specification

The W3C is sponsoring a new specification that gives web applications access to different timing information; the specification defines a new interface, PerformanceTiming, which can be used to obtain different timings. A sample usage of this interface, once implemented in different browsers (as it is not supported yet), will look like the following code:

<head>
  <script type="text/javascript">
    function onLoad() {
      var now = new Date().getTime();
      var loadTime = now - performance.timing.navigationStart;
      alert("Page loading time: " + loadTime);
    }
  </script>
</head>
<body onload="onLoad()">

For more information about these specifications, check the specification documentation at the following URL:


In this article, we briefly learned how to locate performance issues and covered some of the common enterprise application performance issues.

We tried to classify these common issues into different groups, and then we discussed some samples from each group in more detail. The important point here is to be able to use these issues as models or templates of typical application performance issues, so we can frame any performance issue in these templates.

Finally, we discussed how to diagnose client-side performance issues using existing browser embedded tools.

You've been reading an excerpt of:

Java EE 7 Performance Tuning and Optimization
