Troubleshooting WebSphere Security-related Issues


IBM WebSphere Application Server v7.0 Security

Troubleshooting general security configuration exceptions

The cases selected for this subsection concern situations in which configuring various aspects of security results in error conditions.

Identifying problems with the Deployment Manager—node agent communication blues

Several of the problems caused by a wrong or incomplete security configuration surface in the communication between the administrative layers of the WebSphere environment, that is, between the deployment manager and the node agent(s).

A couple of the most common situations are shown below, along with recommendations as to how to correct the condition.

Receiving the message HMGR0149E: node agent rejected

The message HMGR0149E is the result of the Deployment Manager rejecting a connection request from the node agent. This error normally occurs when security changes made in the Deployment Manager were not synchronized with the node in question. An example of a log file excerpt containing this message can be seen in the following screenshot:

One way to fix this problem is by using the command. The syntax for this command is:

dmgr_host [dmgr_port] [-conntype <type>]
[-stopservers] [-restart] [-quiet] [-nowait]
[-logfile <filename>] [-replacelog] [-trace]
[-username <username>] [-password <password>]
[-localusername <localusername>]
[-localpassword <localpassword>]
[-profileName <profile>] [-help]

Furthermore, a very simple procedure to correct this problem is given next:

  1. Stop the affected node agent(s).
  2. Execute, on the node agent OS host, the command.
  3. Monitor the SystemOut.log file for both dmgr and nodeagent processes.
  4. Start the node agent.
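
Step 3 of the procedure can be supported with a small shell helper. The following is only a minimal sketch: the function name count_message is made up for illustration, and the log locations mentioned in the comments are the usual defaults under a profile's logs directory, which may differ in your installation.

```shell
#!/bin/bash
# Sketch: count the lines in a log file that contain a given message ID.
# Typical (assumed) log locations:
#   <profile_root>/logs/dmgr/SystemOut.log
#   <profile_root>/logs/nodeagent/SystemOut.log
count_message() {
  local msg_id="$1" log_file="$2"
  # grep -c prints the number of matching lines; it exits non-zero when
  # there are none, so "|| true" keeps the helper usable under "set -e".
  grep -c -- "$msg_id" "$log_file" || true
}

# Example call (paths are placeholders):
#   count_message HMGR0149E <profile_root>/logs/dmgr/SystemOut.log
```

A count that stops growing after the node has been resynchronized is a reasonable hint that the rejection no longer occurs.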

For additional information on messages from the high availability manager, refer to the WAS ND7 Information Center link: v7r0/topic/ ws.hamanager.nls.HAManagerMessages.html

Receiving the message ADMS0005E: node agent unable to synchronize

The message ADMS0005E is the result of the node agent failing in its attempt to synchronize its configuration with the Deployment Manager. It is likely caused by changes in security-related configuration that occurred while the node agent was not available. The following screenshot shows an example of this type of error.

One way to solve the issue is to shut down the node agent and then manually execute the command from the node OS host, using a user ID and password that have administrative privileges on the Deployment Manager. For syntax and usage information about this command, refer to the previous example.

In case this action does not solve the problem, follow the next procedure:

  1. Stop the node agent(s)
  2. Using the ISC, disable global security
  3. Restart the Deployment Manager
  4. Start the node agent(s)
  5. Perform a full synchronization using the ISC
  6. Using the ISC, enable global security
  7. Synchronize changes with all nodes
  8. Stop the node agent(s)
  9. Restart the Deployment Manager to activate global security
  10. Start the node agent(s)

For additional information on messages about the administrative synchronization, refer to the WAS ND7 Information Center link: v7r0/topic/

Troubleshooting runtime security exceptions

To close the section on troubleshooting, this subsection presents several cases of error or exception conditions that occur due to security configuration of various WAS ND7 environment components. Such components can be all within WAS or some components could be external, for example, the IHS/WebSphere Plug-in.

Troubleshooting HTTPS communication between WebSphere Plug-in and Application Server

When setting up HTTPS communication between the WebSphere Plug-in and the WebSphere Application Server, exceptions and errors may occur during the configuration phase. Some of the most common are listed next.

Receiving the message SSL0227E: SSL handshake fails

The message SSL0227E is a common one when the main IHS process is attempting to retrieve the SSL certificate indicated by the property SSLServerCert located in the httpd.conf file. What this message is stating is that the intended SSL certificate cannot be found by its label from the key ring indicated by the directive KeyFile in the same configuration file.

An example of this type of message is shown in the following screenshot. In order to correct this error, there are two possibilities that can be explored.

On the one hand, one needs to ensure that the directive KeyFile is pointing to the correct key ring file, that is, the key ring file that actually stores the target SSL certificate to be used with this IHS server.

On the other hand, there may be a typographic error in the value of the property SSLServerCert. In other words, the label that is mapped to the target SSL certificate was misspelled in the httpd.conf file.

In both cases, the command gsk7capicmd can be used to list the content of the key ring file. The syntax for listing the contents of a key ring file is:

<IHS_ROOT_Directory>/bin/gsk7capicmd -cert -list all -db <Path_To_kdb_File> -pw <kdb_File_Password>
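
Both checks can be combined by reading the KeyFile and SSLServerCert values from httpd.conf and searching the certificate listing for the label. The following is a minimal sketch under stated assumptions: the helper names directive_value and label_in_listing are made up for illustration, only the first occurrence of each directive is considered, and the listing is assumed to print each label on its own line.

```shell
#!/bin/bash
# Print the value of the first occurrence of a directive in httpd.conf.
# Surrounding double quotes, if any, are stripped.
directive_value() {
  local directive="$1" conf_file="$2"
  awk -v d="$directive" '
    $1 == d {
      $1 = ""                # drop the directive name
      sub(/^[ \t]+/, "")     # trim leading blanks
      gsub(/^"|"$/, "")      # strip surrounding quotes
      print
      exit
    }' "$conf_file"
}

# Succeed if the given label appears in a certificate listing on stdin.
label_in_listing() {
  grep -qF -- "$1"
}

# Typical use (placeholders as in the command above, not run here):
#   conf=<IHS_ROOT_Directory>/conf/httpd.conf
#   kdb=$(directive_value KeyFile "$conf")
#   label=$(directive_value SSLServerCert "$conf")
#   <IHS_ROOT_Directory>/bin/gsk7capicmd -cert -list all -db "$kdb" \
#     -pw <kdb_File_Password> | label_in_listing "$label" \
#     || echo "label not found in key ring: $label"
```

If the label is reported as missing, either KeyFile points to the wrong key ring or the label in SSLServerCert is misspelled, which are the two possibilities described above.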

For additional information on messages about handshaking issues, refer to the IHS v7 Information Center link: topic/ troubhandmsg.html

Receiving ws_config_parser errors while loading the plugin configuration file

The configParserParse message of the ws_config_parser component may be observed in the error log file of the IBM HTTP Server. The following screenshot is an example of the output that may be found in the error log. There are a couple of possible reasons for this type of message to appear.

One reason for this type of message is that the IHS process is being brought down. If the WebSphere Plug-in module is in its cycle to reparse the plugin-cfg.xml file while the IHS process is shutting down, the ws_config_parser component does not have enough resources to parse the configuration file and throws this message, possibly multiple times in a row. To confirm that this is the correct interpretation of the message, look for an indicator such as a 'shutting down' type of message like the one shown in the next screenshot:

The other likely reason for this message is that the owner of the IHS process does not have the privileges required to read the plugin-cfg.xml file. In this case, ensure that the user defined by the property User in the httpd.conf file is allowed to read the plug-in configuration file defined by the property WebSpherePluginConfig in the same file.

For additional information on messages about WebSphere Plug-in issues, refer to the article Error message definitions for WebSphere Application Server's webserver plugin component.

Receiving the message GSK_ERROR_BAD_CERT: No suitable certificate found

The message GSK_ERROR_BAD_CERT appears in log files when the WebSphere Plug-in is attempting to establish an SSL connection with the back-end WebSphere Application Server and it does not have a way to validate the SSL certificate sent by the WebSphere Application Server. An example of this type of message is shown in the next screenshot:

One way to solve this problem is by adding to the IHS key ring file the signer certificate from the WebSphere Application Server. When doing this, care must be taken to correctly select the WebSphere trust store. In other words, the correct scope for your target Application Server needs to be identified so that the appropriate trust store can be accessed.

For instance, to obtain the root certificate (also known as the signer certificate) used by the Chap7AppServer Application Server, one needs to identify the scope for that application server. Therefore, one should start with the following breadcrumb in the ISC (Deployment Manager console): Security | SSL certificate and key management | Manage endpoint security configurations. The following screenshot illustrates a portion of the resulting page:

Once the appropriate scope is identified, continue by completing the breadcrumb: Security | SSL certificate and key management | Manage endpoint security configurations | Chap7AppServer | Key stores and certificates | NodeDefaultTrustStore | Signer certificates. The following screenshot shows a portion of a resulting page.

You are now in a position to extract the Application Server signer SSL certificate. Once this certificate is extracted, it needs to be imported into the IHS key ring file as a root certificate.

Receiving the message GSK_KEYFILE_IO_ERROR: No access to key file

The message GSK_KEYFILE_IO_ERROR is normally found in log files when the IHS process cannot access the IHS key ring file indicated by the directive KeyFile. One common reason is file permissions: it is very likely that the IHS process owner does not have the appropriate access permissions to the key ring file or to one of its parent directories. An example of this type of message is shown in the following screenshot:

In order to find out the file permission of the target key ring file, the following shell commands could be used:

CurFileOrDir=$(grep ^KeyFile <IHS_Root_directory>/conf/httpd.conf | awk '{print $2}')
while [[ "$CurFileOrDir" != "" ]]; do
/bin/ls $LS_OPTIONS -ld "$CurFileOrDir";
CurFileOrDir=${CurFileOrDir%/*}
done

The first line extracts the value of the KeyFile directive. The next line is a while loop that makes sure that the variable defined in line one is not empty. If it is empty, the loop ends. The first line inside the loop does a full listing of the current file or directory. This type of listing will show the file permissions. The second line in the loop trims off the last portion of the full path, shrinking the path by one, and assigns the resulting value to the initial variable used in the while loop.
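
To see which paths the loop described above visits, the path trimming can be isolated in a small function. This is a minimal sketch: the function name list_path_chain and the sample path in the comment are made up for illustration, and the extra check for names without a slash is an added safeguard that the description above does not mention.

```shell
#!/bin/bash
# Print a path and each of its parent directories, one per line,
# trimming the last path component on every pass.
list_path_chain() {
  local p="$1"
  while [[ "$p" != "" ]]; do
    printf '%s\n' "$p"
    # A name without any slash cannot be trimmed further: stop here
    # instead of looping forever on a relative file name.
    [[ "$p" == */* ]] || break
    p=${p%/*}
  done
}

# Example with a hypothetical key ring path:
#   list_path_chain /opt/IBM/keys/ihs.kdb
# prints /opt/IBM/keys/ihs.kdb, /opt/IBM/keys, /opt/IBM and /opt,
# each on its own line.
```

Each printed path is one that the IHS process owner must be able to traverse or read, so a restrictive mode on any of them can explain the error.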

For additional information on messages about GSKit issues, refer to the following appendix link: 0845-00/en_US/HTML/am39_error_ref08.htm

Receiving the message WSVR0009E / ORBX0390E: JVM does not start due to org.omg.CORBA.INTERNAL error

The message WSVR0009E is a generic error message that indicates that a start-up error has occurred and the cause is unknown. However, one needs to look for additional information on the same line that can shed light on the possible cause of the problem. An example of this type of error is shown in the following screenshot. The log file from which this snippet was extracted was generated by a node agent.

In the first place, it is observed that following the WSVR0009E message, an indicator is provided, that is, the class org.omg.CORBA.INTERNAL returned an error of the type CREATE_LISTENER_FAILED_4. In addition, before the WSVR0009E message an ORBX0390E message is included in the log file. The additional information this first message provides is that the node agent was unable to bind to one of its ports since that port was already in use by another process.

Therefore, we know that another process, unknown to us up to this point, is using one of the ports required by the node agent. We have the option of either assigning a different port to the node agent or finding out which process is currently using the port. Depending on the process, we may wish to disable it. If it is a required process, we may have to revisit our port assignment strategy and correct the problem by selecting a different port or a different port range for the node agent.

So how can we identify the active process that is using the TCP port in question? The Unix command lsof (list open files) can be used to find out which process is using the conflicting port. Without any arguments, the command lists all the open files in the Unix system. (Remember that in Unix any resource, such as a file, a pipe, or a TCP port, is treated as a file.) There are a few variants of this command, so depending on your flavor of Unix, you may need to adjust the syntax given next. In order to find out the process ID of the process currently using the TCP port in question, use the following call to the lsof command:

lsof -P -i:<PORT>

The capital -P tells the command not to convert port numbers into names (that is, names derived from the /etc/services file). The lowercase -i flag tells lsof to use its value as an internet address pattern. In our case, we only wish to specify the port portion, which is indicated by the preceding colon. A sample output is shown in the following screenshot.

The screenshot shows, in this example, that a java process with the PID of 7581 is using the TCP port 9402. With this information, using the commands ps and grep, one can find out what exact java process we are dealing with (or any other process for that matter).
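
The lookup with lsof, ps, and grep can be chained. The following is a minimal sketch, assuming the common lsof column layout (one header line, with the PID in the second column); the helper name pids_from_lsof is made up for illustration, and the layout may differ on your flavor of Unix.

```shell
#!/bin/bash
# Extract the unique PIDs from lsof-style output read on stdin.
# Assumes a header line followed by rows whose second column is the PID.
pids_from_lsof() {
  awk 'NR > 1 { print $2 }' | sort -u
}

# Typical use (not run here; <PORT> is the conflicting port):
#   for pid in $(lsof -P -i:<PORT> | pids_from_lsof); do
#     ps -ef | grep "$pid" | grep -v grep
#   done
```

The full command line printed by ps usually makes it clear which server or agent the java process belongs to.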

In versions of WAS ND earlier than the node agent restart action would launch two node agents and this type of error would appear in the logs. If you currently are experiencing this error, make sure that your WAS ND version is at least

For additional information on messages in the series WSVR00, refer to the following information center link: v7r0/topic/ ws.runtime.runtime.html

Concluding WebSphere security-related tips

We conclude this article by listing several tips that may come in handy as you continue your work with WAS ND7, specifically in the area of security. The tips included next are in no particular order.

Using wildcards in virtual hosts: never do it!

Many times, when creating virtual hosts, some administrators do not bother to fine-tune the default configuration for the hostname, focusing only on the port. When one clicks on the New button to create a new alias, one sees a page that contains the portion shown in the following screenshot:

Never use the alias pair: host name "*"; port "80".
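
An existing configuration can be checked for this alias pair from the command line. The following is a rough sketch under explicit assumptions: it assumes that host aliases are stored in a virtualhosts.xml file as elements carrying hostname and port attributes in that order, and the function name has_wildcard_alias is made up for illustration. Verify the file location and format in your own cell configuration before relying on it.

```shell
#!/bin/bash
# Succeed if the file contains a host alias with hostname "*" on the
# given port. The attribute order (hostname, then port) is an assumption.
has_wildcard_alias() {
  local port="$1" xml_file="$2"
  grep -qF -- "hostname=\"*\" port=\"$port\"" "$xml_file"
}

# Example call (the path is a placeholder):
#   has_wildcard_alias 80 <path_to>/virtualhosts.xml && echo "wildcard alias on port 80"
```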

Ensuring best practice: set tracing from wide to specific search pattern

When turning on tracing in order to troubleshoot any type of problem in general and security in particular, one needs to keep in mind that WebSphere logging and tracing parses the trace string from left to right. This means that the trace string element at the left can be overridden by another element at its right. It is, therefore, best to place the more generic trace strings at the beginning and the most specific trace strings at the end.
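
As an illustration of this ordering rule, a trace string can be assembled from its elements, generic first and specific last. This is a minimal sketch: the helper name build_trace_spec is made up for illustration, and the two elements shown are only an example of a generic default followed by a more specific security package.

```shell
#!/bin/bash
# Join trace elements with ":" in the order given, so that the caller
# decides which element comes first (generic) and which last (specific).
build_trace_spec() {
  local IFS=':'
  printf '%s\n' "$*"
}

# Generic element first, specific element last.
build_trace_spec '*=info' 'com.ibm.ws.security.*=all'
# prints: *=info:com.ibm.ws.security.*=all
```

Reversing the two arguments would place the generic element on the right, where, according to the rule above, it could override the specific one.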

For instance, if your WAS ND7 environment was using a custom J2C principal mapping module (cf. ect?version=compass&product=was-nd-dist&topic=rsec_pluginj2c) and you wanted to trace its activity to solve an issue, a possible helpful trace string would be similar to the one shown in the following screenshot.

Using a TAI such as SiteMinder: remove existing interceptors

WAS ND7 includes two trust association interceptors by default. If you are planning to use a different type, such as the CA/Netegrity SiteMinder TAI, it is highly recommended that you remove those configurations from the interceptors list. The following screenshot shows the default interceptors included.

One of the many reasons for removing the interceptors shown in the screenshot above is that when TAI is enabled and a different interceptor is configured and enabled, the SystemOut.log file would include exception messages thrown by these interceptor classes.

Once you have removed these class name definitions, you can proceed to create a new definition for SiteMinder. An example of a class name that could be used (for version 6.x) is com.netegrity.siteminder.websphere.auth.SmTrustAssociationInterceptor.


In this article we took a look at the following:

  • Troubleshooting general security configuration exceptions and security runtime exceptions
  • Selecting simple-to-apply but powerful security-related tips
