Troubleshooting

OpenStack Cloud Computing Cookbook - Second Edition

by Cody Bunch Kevin Jackson | October 2013 | Cookbooks Linux Servers Open Source

In this article, Cody Bunch and Kevin Jackson, authors of the book OpenStack Cloud Computing Cookbook - Second Edition, explain how OpenStack, like all software, can have bugs that we are not able to solve ourselves.


OpenStack is a complex suite of software that can make tracking down issues and faults quite daunting to beginners and experienced system administrators alike. While there is no single approach to troubleshooting systems, understanding where OpenStack logs vital information and what tools are available to help track down bugs will help resolve issues we may encounter. However, OpenStack, like all software, will have bugs that we are not able to solve ourselves. In that case, gathering the required information so that the OpenStack community can identify bugs and suggest fixes is important in ensuring those bugs or issues are dealt with quickly and efficiently, and we will show you how to do that.

Understanding logging

Logging is important in all computer systems, but the more complex the system, the more you rely on logging to be able to spot problems and cut down on troubleshooting time. Understanding logging in OpenStack is important to ensure your environment is healthy and you are able to submit relevant log entries back to the community to help fix bugs.

Getting ready

Log in as the root user onto the appropriate servers where the OpenStack services are installed. This makes troubleshooting easier as root privileges are required to view all the logs.

How to do it...

OpenStack produces a large number of logs that help troubleshoot our OpenStack installations. The following details outline where these services write their logs:

OpenStack Compute services logs

Logs for the OpenStack Compute services are written to /var/log/nova/, which is owned by the nova user, by default. To read these, log in as the root user (or use sudo privileges when accessing the files). The following is a list of services and their corresponding logs. Note that not all logs exist on all servers. For example, nova-compute.log exists on your compute hosts only:

  • nova-compute: /var/log/nova/nova-compute.log

    Log entries regarding the spinning up and running of the instances

  • nova-network: /var/log/nova/nova-network.log

    Log entries regarding network state, assignment, routing, and security groups

  • nova-manage: /var/log/nova/nova-manage.log

    Log entries produced when running the nova-manage command

  • nova-conductor: /var/log/nova/nova-conductor.log

    Log entries regarding services making requests for database information

  • nova-scheduler: /var/log/nova/nova-scheduler.log

    Log entries pertaining to the scheduler, its assignment of tasks to nodes, and messages from the queue

  • nova-api: /var/log/nova/nova-api.log

    Log entries regarding user interaction with OpenStack as well as messages regarding interaction with other components of OpenStack

  • nova-cert: /var/log/nova/nova-cert.log

    Entries regarding the nova-cert process

  • nova-console: /var/log/nova/nova-console.log

    Details about the nova-console VNC service

  • nova-consoleauth: /var/log/nova/nova-consoleauth.log

    Authentication details related to the nova-console service

  • nova-dhcpbridge: /var/log/nova/nova-dhcpbridge.log

    Network information regarding the dhcpbridge service
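As a quick first pass over the Compute logs listed above, a small shell helper can count how many lines at a given log level each file contains. This is a minimal sketch rather than part of OpenStack: the function name count_log_level is hypothetical, and it assumes the level name appears in each log line surrounded by spaces, which may not hold for every logging configuration.

```shell
# Print "<count> <file>" for every *.log file in a directory that has at
# least one line containing the given log level (default: ERROR).
# Hypothetical helper; assumes the level is surrounded by spaces in the log.
count_log_level() {
    dir=$1
    level=${2:-ERROR}
    for f in "$dir"/*.log; do
        # Skip the unexpanded pattern when the directory has no .log files.
        [ -f "$f" ] || continue
        # grep -c prints 0 and exits non-zero when nothing matches.
        n=$(grep -c " $level " "$f" || true)
        if [ "$n" -gt 0 ]; then
            echo "$n $f"
        fi
    done
    return 0
}

# Example (run as root or with sudo):
#   count_log_level /var/log/nova
#   count_log_level /var/log/nova WARNING
```

Because the directory is a parameter, the same helper can be pointed at the other log directories described in this section.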

OpenStack Dashboard logs

OpenStack Dashboard (Horizon) is a web application that runs through Apache by default, so any errors and access details will be in the Apache logs. These can be found in /var/log/apache2/*.log, which will help you understand who is accessing the service and will report any errors seen with the service.

OpenStack Storage logs

OpenStack Object Storage (Swift) writes logs to syslog by default. On an Ubuntu system, these can be viewed in /var/log/syslog. On other systems, these might be available at /var/log/messages.

The OpenStack Block Storage service, Cinder, will produce logs in /var/log/cinder by default. The following list is a breakdown of the log files:

  • cinder-api: /var/log/cinder/cinder-api.log

    Details about the cinder-api service

  • cinder-scheduler: /var/log/cinder/cinder-scheduler.log

    Details related to the operation of the Cinder scheduling service

  • cinder-volume: /var/log/cinder/cinder-volume.log

    Log entries related to the Cinder volume service

OpenStack Identity logs

The OpenStack Identity service, Keystone, writes its logging information to /var/log/keystone/keystone.log. Depending on how you have Keystone configured, the information in this log file can range from very sparse to extremely verbose, including complete plaintext requests.

OpenStack Image Service logs

The OpenStack Image Service, Glance, stores its logs in /var/log/glance/*.log with a separate log file for each service. The following is a list of the default log files:

  • api: /var/log/glance/api.log

    Entries related to the Glance API

  • registry: /var/log/glance/registry.log

    Log entries related to the Glance registry service. Things like metadata updates and access will be stored here depending on your logging configuration.

OpenStack Network Service logs

OpenStack Networking Service, formerly Quantum, now Neutron, stores its log files in /var/log/quantum/*.log with a separate log file for each service. The following is a list of the corresponding logs:

  • dhcp-agent: /var/log/quantum/dhcp-agent.log

    Log entries pertaining to the dhcp-agent

  • l3-agent: /var/log/quantum/l3-agent.log

    Log entries related to the l3 agent and its functionality

  • metadata-agent: /var/log/quantum/metadata-agent.log

    This file contains log entries related to requests Quantum has proxied to the Nova metadata service.

  • openvswitch-agent: /var/log/quantum/openvswitch-agent.log

    Entries related to the operation of Open vSwitch. When implementing OpenStack Networking, if you use a different plugin, its log file will be named accordingly.

  • server: /var/log/quantum/server.log

    Details and entries related to the quantum API service

  • OpenVSwitch Server: /var/log/openvswitch/ovs-vswitchd.log

    Details and entries related to the OpenVSwitch Switch Daemon

Changing log levels

By default, each OpenStack service has a sane level of logging, with the level set to WARNING. That is, it will log enough information to provide you with the status of the running system as well as some basic troubleshooting information. However, there will be times when you need to adjust the logging verbosity either up or down to help diagnose an issue or reduce logging noise.

As each service can be configured similarly, we will show you how to make these changes on the OpenStack Compute service.

Log-level settings in OpenStack Compute services

To do this, log into the box where the OpenStack Compute service is running and execute the following commands:

sudo vim /etc/nova/logging.conf

Change the following log levels to either DEBUG, INFO or WARNING in any of the services listed:

Log-level settings in other OpenStack services

Other services such as Glance and Keystone currently have their log-level settings within their main configuration files such as /etc/glance/glance-api.conf. Adjust the log levels by altering the following lines to achieve INFO or DEBUG levels:

Restart the relevant service to pick up the log-level change.
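The configuration lines referred to above did not survive in this copy of the article, so the following is only a sketch of how such a change could be scripted. It assumes the service's main configuration file has uncommented boolean options named debug and verbose, as Glance configuration files of this era typically did; check your own file before relying on those names. The helper set_conf_option is hypothetical and simply rewrites an existing "key = value" line.

```shell
# Replace the value of an existing, uncommented "key = value" line in an
# INI-style configuration file. If the key is absent, the file is unchanged.
# Usage: set_conf_option FILE KEY VALUE   (hypothetical helper)
set_conf_option() {
    file=$1
    key=$2
    value=$3
    tmp=$(mktemp) || return 1
    # Write through a temporary file, then copy the content back so the
    # original file keeps its owner and permissions.
    sed -E "s|^[[:space:]]*${key}[[:space:]]*=.*|${key} = ${value}|" "$file" > "$tmp" &&
        cat "$tmp" > "$file"
    rm -f "$tmp"
}

# Example (assumed option name; run with sudo, then restart the service):
#   set_conf_option /etc/glance/glance-api.conf debug True
```

Keeping a backup copy of the configuration file before editing it this way is a sensible precaution.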

How it works...

Logging is an important activity in any software, and OpenStack is no different. It allows an administrator to track down problematic activity that can be used in conjunction with the community to help provide a solution. Understanding where the services log and managing those logs to allow someone to identify problems quickly and easily are important.

Checking OpenStack services

OpenStack provides tools to check on its services. In this section, we'll show you how to check the operational status of these services. We will also use common system commands to check whether our environment is running as expected.

Getting ready

To check our OpenStack Compute host, we must log into that server, so do this now before following the given steps.

How to do it...

To check that OpenStack Compute is running the required services, we invoke the nova-manage tool and ask it various questions about the environment, as follows:

Checking OpenStack Compute Services

To check our OpenStack Compute services, issue the following command:

sudo nova-manage service list

The output contains one line per service. A :-) in the State column indicates that everything is fine.

The fields are defined as follows:

  • Binary: This is the name of the service that we're checking the status of.
  • Host: This is the name of the server or host where this service is running.
  • Zone: This refers to the OpenStack Zone that is running that service. A zone can run different services. The default zone is called nova.
  • Status: This states whether or not an administrator has enabled or disabled that service.
  • State: This refers to whether that running service is working or not.
  • Updated_At: This indicates when that service was last checked.

If OpenStack Compute has a problem, you will see XXX in place of :-). The following output line shows an example:

nova-compute compute.book nova enabled XXX 2013-06-18 16:47:35

If you do see XXX, the answer to the problem will be in the logs at /var/log/nova/.

If you get intermittent XXX and :-) for a service, first check whether the clocks are in sync.
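On a host with many services, it can help to filter the listing down to the problem entries. The following sketch reads the output of nova-manage service list on standard input and prints the binary and host of every service whose State column is XXX. The function name failed_services is hypothetical, and the field positions are assumed from the sample line shown above.

```shell
# Read "nova-manage service list" output on stdin and print the binary and
# host of each service whose State column (the 5th field) is XXX.
# Hypothetical helper; field order assumed from the sample output above.
failed_services() {
    awk '$5 == "XXX" { print $1, $2 }'
}

# Example:
#   sudo nova-manage service list | failed_services
```

An empty result means no service was reported as down at the time of the check.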

OpenStack Image Service (Glance)

The OpenStack Image Service, Glance, while critical to the ability of OpenStack to provision new instances, does not contain its own tool to check the status of the service. Instead, we rely on some built-in Linux tools, as follows:

ps -ef | grep glance
netstat -ant | grep '9292.*LISTEN'

The first command should return process information for Glance, showing that it is running. The second confirms that port 9292, the default Glance API port, is open in the LISTEN state on your server and ready for use. The netstat check returns output like the following:

tcp 0 0 0.0.0.0:9292 0.0.0.0:* LISTEN

Other services that you should check

Should Glance be having issues while the above services are in working order, you will want to check the following services as well:

  • rabbitmq: For rabbitmq, run the following command:

    sudo rabbitmqctl status

    When everything is running OK, rabbitmqctl prints a status report for the node.

    If rabbitmq isn't working as expected, the output will instead indicate that the rabbitmq service or node is down.

  • ntp: For ntp (Network Time Protocol, for keeping nodes in time-sync), run the following command:

    ntpq -p

    ntp is required for multi-host OpenStack environments, but it may not be installed by default. (Install the ntp package with sudo apt-get install -y ntp.)

    This should return output regarding contacting NTP servers, for example:

  • MySQL Database Server: For MySQL Database Server, run the following commands:

    PASSWORD=openstack
    mysqladmin -uroot -p$PASSWORD status

    This will return some statistics about MySQL, if it is running, as shown in the following screenshot:

Checking OpenStack Dashboard (Horizon)

Like Glance, the OpenStack Dashboard service, Horizon, does not come with a built-in tool to check its health. It does, however, rely on the Apache web server to serve pages, so to check the status of the service we check the health of the web server. Log into the server running Horizon and execute the following command:

ps -ef | grep apache

This command produces output like the following screenshot:

To check that Apache is running on the expected port, TCP Port 80, issue the following command:

netstat -ano | grep :80

This command should show the following output:

tcp 0 0 0.0.0.0:80 0.0.0.0:* LISTEN off (0.00/0/0)

To test access to the web server from the command line issue the following command:

telnet localhost 80

This command should show the following output:

Trying 127.0.0.1...
Connected to localhost.
Escape character is '^]'.

Checking OpenStack Identity (Keystone)

Keystone comes with a client-side implementation called python-keystoneclient. We use this tool to check the status of our Keystone services.

To check that Keystone is running the required services, we invoke the keystone command:

# keystone user-list

This produces output like the following screenshot:

Additionally, you can use the following commands to check the status of Keystone. The following command checks the status of the service:

# ps -ef | grep keystone

This should show output similar to the following:

keystone 5441 1 0 Jun20 ? 00:00:04 /usr/bin/python /usr/bin/keystone-all

Next you can check that the service is listening on the network. The following command can be used:

netstat -anlp | grep 5000

This command should show output like the following:

tcp 0 0 0.0.0.0:5000 0.0.0.0:* LISTEN 54421/python

Checking OpenStack Networking (Neutron)

When running the OpenStack Networking service, Neutron, there are a number of services that should be running on various nodes. These are depicted in the following diagram:

On the Controller node, check the Quantum Server API service is running on TCP Port 9696 as follows:

sudo netstat -anlp | grep 9696

The command brings back output like the following:

tcp 0 0 0.0.0.0:9696 0.0.0.0:* LISTEN 22350/python

On the Compute nodes, check the following services are running using the ps command:

  • ovsdb-server
  • ovs-vswitchd
  • quantum-openvswitch-agent

For example, run the following command:

ps -ef | grep ovsdb-server

On the Network node, check the following services are running:

  • ovsdb-server
  • ovs-vswitchd
  • quantum-openvswitch-agent
  • quantum-dhcp-agent
  • quantum-l3-agent
  • quantum-metadata-agent

To check our Neutron agents are running correctly, issue the following command from the Controller host when you have the correct OpenStack credentials sourced into your environment:

quantum agent-list

This will bring back output like the following screenshot when everything is running correctly:

Checking OpenStack Block Storage (Cinder)

To check the status of the OpenStack Block Storage service, Cinder, you can use the following commands:

  • Use the following command to check if Cinder is running:

    ps -ef | grep cinder

    This command produces output like the following screenshot:

  • Use the following command to check if iSCSI target is listening:

    netstat -anp | grep 3260

    This command produces output like the following:

    tcp 0 0 0.0.0.0:3260 0.0.0.0:* LISTEN 10236/tgtd

  • Use the following command to check that the Cinder API is listening on the network:

    netstat -an | grep 8776

    This command produces output like the following:

    tcp 0 0 0.0.0.0:8776 0.0.0.0:* LISTEN

  • To validate the operation of the Cinder service, if all of the above is functional, you can try to list the volumes Cinder knows about using the following:

    cinder list

    This produces output like the following:

Checking OpenStack Object Storage (Swift)

The OpenStack Object Storage service, Swift, has a few built-in utilities that allow us to check its health. To do so, log into your Swift node and run the following commands:

  • Use the following command for checking the Swift Service
    • Using Swift Stat:

      swift stat

      This produces output like the following:

    • Using PS:

      There will be a service for each configured container, account, object-store.

      ps -ef | grep swift

      This should produce output like the following screenshot:

    • Use the following command for checking the Swift API:

      ps -ef | grep swift-proxy

      This should produce the following screenshot:

    • Use the following command for checking if Swift is listening on the network:

      netstat -anlp | grep 8080

      This should produce output like the following:

      tcp 0 0 0.0.0.0:8080 0.0.0.0:* LISTEN 9818/python

How it works...

We have used some basic commands that communicate with OpenStack services to show they're running. This elementary level of checking helps with troubleshooting our OpenStack environment.
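The individual port checks in this section can be combined into one pass. The sketch below reads netstat -ant output on standard input and reports, for each default port mentioned in this article, whether a socket is in the LISTEN state. The function name report_listening, the OPENSTACK_PORTS variable, and the short service labels are hypothetical; the port numbers are the defaults quoted in the text and may differ in your deployment.

```shell
# Default ports quoted in this article, as "port:label" pairs.
# Labels are illustrative; adjust the list if your deployment differs.
OPENSTACK_PORTS="80:horizon 3260:iscsi-target 5000:keystone 8080:swift-proxy 8775:nova-metadata 8776:cinder-api 9292:glance-api 9696:quantum-server"

# Read "netstat -ant" output on stdin and report, for each expected port,
# whether some socket is in the LISTEN state on it.
report_listening() {
    input=$(cat)
    for entry in $OPENSTACK_PORTS; do
        port=${entry%%:*}
        name=${entry#*:}
        # Match the port at the end of the local address column, so that
        # port 80 does not match 8080.
        if printf '%s\n' "$input" | grep -Eq "[:.]${port}[[:space:]].*LISTEN"; then
            echo "$name ($port): listening"
        else
            echo "$name ($port): NOT listening"
        fi
    done
}

# Example:
#   netstat -ant | report_listening
```

Not every service runs on every host, so a "NOT listening" line is only a problem on a host where that service is expected.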

Troubleshooting OpenStack Compute services

OpenStack Compute services are complex, and being able to diagnose faults is an essential part of ensuring the smooth running of the services. Fortunately, OpenStack Compute provides some tools to help with this process, along with tools provided by Ubuntu to help identify issues.

How to do it...

Troubleshooting OpenStack Compute services can be a complex issue, but working through problems methodically and logically will help you reach a satisfactory outcome. Carry out the following suggested steps when encountering the different problems presented.

Steps for when you cannot ping or SSH to an instance

  1. When launching instances, we specify a security group. If none is specified, a security group named default is used. These mandatory security groups ensure security is enabled by default in our cloud environment, and as such, we must explicitly state that we require the ability to ping our instances and SSH to them. For such a basic activity, it is common to add these abilities to the default security group.
  2. Network issues may prevent us from accessing our cloud instances. First, check that the compute hosts are able to forward packets from the public interface to the bridged interface. Use the following command to do so:

    sysctl -A | grep ip_forward

  3. net.ipv4.ip_forward should be set to 1. If it isn't, check that /etc/sysctl.conf has the following line uncommented:

    net.ipv4.ip_forward=1

  4. Then, run the following command to pick up the change:

    sudo sysctl -p

  5. Other network issues could be routing issues. Check that we can communicate with the OpenStack Compute nodes from our client and that any routing to get to these instances has the correct entries.
  6. We may have a conflict with IPv6, if IPv6 isn't required. If this is the case, try adding --use_ipv6=false to your /etc/nova/nova.conf file, and restart the nova-compute and nova-network services. We may also need to disable IPv6 in the operating system, which can be achieved using something like the following line in /etc/modprobe.d/ipv6.conf:

    install ipv6 /bin/true

  7. If using OpenStack Neutron, check the status of the Neutron services on the host and that the correct IP namespace is being used (see Troubleshooting OpenStack Networking).
  8. Reboot your host.
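The packet forwarding check from the steps above can also be scripted, for example to run across several hosts. This is a minimal sketch: it reads the value from the Linux proc file /proc/sys/net/ipv4/ip_forward instead of parsing sysctl output, and the function name ip_forward_enabled and its optional file argument are hypothetical.

```shell
# Succeed (exit status 0) when IPv4 forwarding is reported as enabled.
# Reads the Linux proc file directly; the optional argument lets the
# check be pointed at a different file, which is mainly useful for testing.
ip_forward_enabled() {
    file=${1:-/proc/sys/net/ipv4/ip_forward}
    [ "$(cat "$file" 2>/dev/null)" = "1" ]
}

# Example:
#   if ! ip_forward_enabled; then
#       echo "IPv4 forwarding is disabled on $(hostname)"
#   fi
```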

Methods for viewing the Instance Console log

  1. When using the command line, issue the following commands:

    nova list

    nova console-log INSTANCE_ID

    For example:

    nova console-log ee0cb5ca-281f-43e9-bb40-42ffddcb09cd

  2. When using Horizon, carry out the following steps:
    1. Navigate to the list of instances and select an instance.
    2. You will be taken to an Overview screen. Along the top of the Overview screen is a Log tab. This is the console log for the instance.

  3. When viewing the logs directly on a nova-compute host, look for the console.log file of the instance:

    The console logs are owned by root, so only an administrator can do this. They are placed at /var/lib/nova/instances/<instance_id>/console.log.

Instance fails to download meta information

If an instance cannot reach the metadata service to download the extra information supplied as instance metadata, we can end up in a situation where the instance is up but we are unable to log in, as the SSH key information is injected using this method.

Viewing the console log will show output like in the following screenshot:

If you are not using Neutron, ensure the following:

  1. nova-api is running on the Controller host (in a multi_host environment, ensure there's a nova-api-metadata and a nova-network package installed and running on the Compute host).
  2. Perform the following iptables check on the Compute node:

    sudo iptables -L -n -t nat

    We should see a line in the output like in the following screenshot:

  3. If not, restart your nova-network services and check again.
  4. Sometimes there are multiple copies of dnsmasq running, which can cause this issue. Ensure that there is only one instance of dnsmasq running:

    ps -ef | grep dnsmasq

    This will bring back two process entries, the parent dnsmasq process and a spawned child (verify by the PIDs). If there are any other instances of dnsmasq running, kill the dnsmasq processes. When killed, restart nova-network, which will spawn dnsmasq again without any conflicting processes.

If you are using Neutron:

The first place to look is in the /var/log/quantum/metadata_agent.log on the Network host. Here you may see Python stack traces that could indicate a service isn't running correctly. A connection refused message may appear here suggesting the metadata agent running on the Network host is unable to talk to the Metadata service on the Controller host via the Metadata Proxy service (also running on the Network host).

The metadata service runs on port 8775 on our Controller host, so checking that it is running involves checking that the port is open and that the metadata service is listening on it. To do this on the Controller host, run the following:

sudo netstat -antp | grep 8775

This will bring back the following output if everything is OK:

tcp 0 0 0.0.0.0:8775 0.0.0.0:* LISTEN

If nothing is returned, check that the nova-api service is running and if not, start it.

Instance launches; stuck at Building or Pending

Sometimes, a little patience is needed before assuming the instance has not booted, because the image is copied across the network to a node that has not seen the image before. At other times though, if the instance has been stuck in booting or a similar state for longer than normal, it indicates a problem. The first place to look will be for errors in the logs. A quick way of doing this is to issue the following command from the controller server:

sudo nova-manage logs errors

A common error here relates to AMQP being unreachable. Generally, these errors can be ignored unless the timestamps show that they are still appearing now. You tend to see a number of these messages from when the services first started up, so look at the timestamp before reaching conclusions.

This command brings back any log line with the ERROR as log level, but you will need to view the logs in more detail to get a clearer picture.

A key log file, when troubleshooting instances that are not booting properly, will be available on the controller host at /var/log/nova/nova-scheduler.log. This file tends to produce the reason why an instance is stuck in Building state. Another file to view further information will be on the compute host at /var/log/nova/nova-compute.log. Look here at the time you launch the instance. In a busy environment, you will want to tail the log file and parse for the instance ID.
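Following a live log for one instance can be done with tail and a small filter. The sketch below prints only the lines that mention a given instance ID and flushes after each line so that output keeps flowing from a tail -f stream. The function name lines_for_instance is hypothetical; it does a plain substring match and makes no assumption about the log line format.

```shell
# Print only the input lines that contain the given instance ID.
# fflush() keeps output flowing when the input is a live "tail -f" stream.
# Hypothetical helper; plain substring match, not a regular expression.
lines_for_instance() {
    awk -v id="$1" 'index($0, id) { print; fflush() }'
}

# Example (on the compute host, as root), using the instance ID shown earlier:
#   tail -f /var/log/nova/nova-compute.log | lines_for_instance ee0cb5ca-281f-43e9-bb40-42ffddcb09cd
```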

Check /var/log/nova/nova-network.log (for Nova Network) and /var/log/quantum/*.log (for Neutron) for any reason why instances aren't being assigned IP addresses. It could be issues around DHCP preventing address allocation or quotas being reached.

Error codes such as 401, 403, 500

The majority of the OpenStack services are web services, meaning the responses from the services are well defined.

40X: This refers to a service that is up but responding to an event that is produced by some user error. For example, a 401 is an authentication failure, so check the credentials used when accessing the service.

500: These errors mean that a service the request depends on is unavailable, or that it returned an error the responding service could not handle. Common problems here are services that have not started properly, so check for running services.
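The rule of thumb above can be written down as a small lookup, which may be handy in scripts that probe service endpoints. This is only a sketch: the function name explain_status and the wording of the hints are hypothetical, and the groupings follow general HTTP status code conventions rather than anything specific to OpenStack.

```shell
# Map an HTTP status code to a first troubleshooting hint.
# Hypothetical helper; hints follow the guidance in this section.
explain_status() {
    case $1 in
        401) echo "authentication failure: check the credentials used" ;;
        403) echo "forbidden: authenticated, but not permitted to make this request" ;;
        4??) echo "client-side error: check the request being made" ;;
        5??) echo "server-side error: check that the service and its dependencies are running" ;;
        *)   echo "not an error status covered here" ;;
    esac
}

# Example:
#   explain_status 401
```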

If all avenues have been exhausted when troubleshooting your environment, reach out to the community, using the mailing list or IRC, where there is a raft of people willing to offer their time and assistance. See the Getting help from the community recipe at the end of this article for more information.

Listing all instances across all hosts

From the OpenStack controller node, you can execute the following command to get a list of the running instances in the environment:

sudo nova-manage vm list

To view all instances across all tenants, as a user with an admin role execute the following command:

nova list --all-tenants

These commands are useful in identifying any failed instances and the host on which each is running. You can then investigate further.

How it works...

Troubleshooting OpenStack Compute problems can be quite complex, but looking in the right places can help solve some of the more common problems. Unfortunately, like troubleshooting any computer system, there isn't a single command that can help identify all the problems that you may encounter, but OpenStack provides some tools to help you identify some problems. Having an understanding of managing servers and networks will help troubleshoot a distributed cloud environment such as OpenStack.

There's more than one place where you can go to identify the issues, as they can stem from the environment to the instances themselves. Methodically working your way through the problems though will help lead you to a resolution.

Troubleshooting OpenStack Object Storage services

OpenStack Storage service (Swift) is built for highly available storage, but there will be times when something will go wrong, from authentication issues to failing hardware.

How to do it...

Carry out the following steps when encountering the problems presented.

Authentication issues

Authentication issues in Swift occur when a user or a system has been configured with the wrong credentials. Troubleshooting a Swift system that is backed by the OpenStack Authentication service (Keystone) requires performing the authentication steps against Keystone manually and viewing the logs during the transactions. Check the Keystone logs for evidence of user authentication issues for Swift.

The user will see the following message with authentication issues:

If Swift is working correctly but Keystone isn't, skip to the Troubleshooting OpenStack Authentication recipe.

Swift can add complexity to authentication issues when ACLs have been applied to containers. For example, a user might not have been placed in an appropriate group that is allowed to perform that function on that container. To view a container's ACL, issue the following command on a client that has the Swift tool installed:

swift -V 2.0 -A http://keystone_server:5000/v2.0 -U tenant:user \
  -K password stat container

The Read ACL: and Write ACL: information will show which roles are allowed to perform those actions.

To check a user's roles, run the following set of commands on the Keystone server:

# Administrator Credentials
export OS_USERNAME=admin
export OS_PASSWORD=openstack
export OS_AUTH_URL=http://172.16.0.200:5000/v2.0
export OS_TENANT_NAME=cookbook

# Get User ID
keystone user-list

# Get Tenant ID
keystone tenant-list

# Use the user-id and tenant-id to get the roles for
# that user in that tenant
keystone -I admin -K openstack -N http://172.16.0.200:5000/v2.0/ \
  -T cookbook role-list --user user-id --tenant tenant-id

Now compare with the ACL roles assigned to the container.

Handling drive failure

When a drive fails in an OpenStack Storage environment, you must first ensure the drive is unmounted so Swift isn't attempting to write data to it. Replace the drive and rebalance the rings.

Handling server failure and reboots

The OpenStack Storage service is very resilient. If a server is out of action for a couple of hours, Swift can happily work around this server being missing from the ring. Any longer than a couple of hours though, and the server will need removing from the ring.

How it works...

The OpenStack Storage service, Swift, is a robust object storage environment, and as such, handles a relatively large number of failures within this environment. Troubleshooting Swift involves running client tests, viewing logs, and in the event of failure, identifying what the best course of action is.

Troubleshooting OpenStack Dashboard

The OpenStack dashboard, Horizon, provides the web UI that your end users will use to consume your OpenStack environment, so keeping it running is critical. There are a few instances however, where Horizon may decide to go awry.

How to do it…

When Horizon goes awry, you can check the following.

Unable to log into the OpenStack Dashboard

If you find you are unable to log into Horizon, check that you have a valid user/password. To do this, log into a node that has python-keystoneclient installed and attempt to authenticate with the same user:

export OS_TENANT_NAME=cookbook
export OS_USERNAME=admin
export OS_PASSWORD=openstack
export OS_AUTH_URL=http://172.16.0.200:5000/v2.0/
keystone user-list

Next, if you are able to log in, but are presented with a Something went wrong screen, validate that all services listed in Keystone are accessible to the server running Horizon. To do this, log into the Horizon server, and if you do not have python-keystoneclient, install it:

sudo apt-get install -y python-keystoneclient
export OS_TENANT_NAME=cookbook
export OS_USERNAME=admin
export OS_PASSWORD=openstack
export OS_AUTH_URL=http://172.16.0.200:5000/v2.0/
for i in $(keystone endpoint-list | grep http | awk '{print $6}' | cut -d / -f 3,3 | cut -d : -f 1); do ping -c 1 $i; done
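The loop above pings the host in one URL column of each endpoint. A variation that collects every distinct host from all URL columns is sketched below. The function name endpoint_hosts is hypothetical, and it assumes only that the endpoint URLs in the keystone endpoint-list output start with http:// or https://.

```shell
# Read "keystone endpoint-list" output on stdin and print each distinct
# host name or IP address that appears in an endpoint URL.
# Hypothetical helper; looks at every URL column, not just one.
endpoint_hosts() {
    grep -oE 'https?://[^/:|[:space:]]+' | sed -E 's|^https?://||' | sort -u
}

# Example:
#   for h in $(keystone endpoint-list | endpoint_hosts); do
#       ping -c 1 "$h"
#   done
```

A successful ping only shows that the host is reachable; the service port on that host still needs to be checked separately.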

Additionally, you can edit the settings file for Horizon to enable more detailed logging and further troubleshooting by changing the LOGGING section in /etc/openstack-dashboard/local_settings.py:

LOGGING = {
    'version': 1,
    # When set to True this will disable all logging except
    # for loggers specified in this configuration dictionary. Note
    # that if nothing is specified here and disable_existing_loggers
    # is True, django.db.backends will still log unless it is
    # disabled explicitly.
    'disable_existing_loggers': False,
    'handlers': {
        'null': {
            'level': 'DEBUG',
            'class': 'django.utils.log.NullHandler',
        },
        'console': {
            # Set the level to "DEBUG" for verbose output logging.
            'level': 'INFO',
            'class': 'logging.StreamHandler',
        },
    },
    'loggers': {
        # Logging from django.db.backends is VERY verbose, send to null
        # by default.
        'django.db.backends': {
            'handlers': ['null'],
            'propagate': False,
        },
        'requests': {
            'handlers': ['null'],
            'propagate': False,
        },
        'horizon': {
            'handlers': ['console'],
            'propagate': False,
        },
        'openstack_dashboard': {
            'handlers': ['console'],
            'propagate': False,
        },
        'novaclient': {
            'handlers': ['console'],
            'propagate': False,
        },
        'keystoneclient': {
            'handlers': ['console'],
            'propagate': False,
        },
        'glanceclient': {
            'handlers': ['console'],
            'propagate': False,
        },
        'nose.plugins.manager': {
            'handlers': ['console'],
            'propagate': False,
        }
    }
}

How it works…

Because Horizon depends on the good health of the rest of your OpenStack environment, most Horizon issues will be solved as you troubleshoot other services. That said, the guidance in this section will help you find which service is causing Horizon trouble and let your users back into the system.

Troubleshooting OpenStack Authentication

The OpenStack Authentication service (Keystone) is a complex service, as it has to deal with underpinning the authentication and authorization for the complete cloud environment. Common problems include misconfigured endpoints, incorrect parameters being stored, and general user authentication issues, which involve resetting passwords or providing further details to the end user.

Getting ready

Administrator access is required to troubleshoot Keystone, so we first configure our environment to let us execute the relevant Keystone commands directly.

# Administrator Credentials
export OS_USERNAME=admin
export OS_PASSWORD=openstack
export OS_AUTH_URL=http://172.16.0.200:5000/v2.0
export OS_TENANT_NAME=cookbook

How to do it...

Carry out the following steps when encountering the problems presented.

Misconfigured endpoints

Keystone is the central service that directs authenticated users to the correct service, so it's vital that the users be sent to the correct location. Symptoms include HTTP 500 error messages in various logs regarding the services that are being accessed and clients timing out trying to connect to network services that don't exist. To verify your endpoints in each region, perform the following command:

keystone endpoint-list

We can drill down into specific service types with the following command. For example, to show adminURL for the compute service type in all regions:

keystone endpoint-get --service compute --endpoint_type adminURL

An alternative to listing the endpoints in this format is to list the catalog, which outputs the details in a more human-readable way:

keystone catalog

This provides a convenient way of seeing the endpoints configured.
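Before chasing a fault in a service, it can help to confirm that each URL reported for an endpoint is at least well formed. The sketch below is one way to do that with the Python standard library. The function name endpoint_problems is hypothetical, and treating a missing port as suspicious is an assumption that suits the port-per-service layout used in this environment, not a Keystone rule.

```python
from urllib.parse import urlparse


def endpoint_problems(url):
    """Return a list of human-readable problems found in one endpoint URL."""
    problems = []
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https"):
        problems.append("scheme is not http or https")
    if not parsed.hostname:
        problems.append("no host")
    try:
        port = parsed.port
    except ValueError:
        # urlparse raises ValueError when the port is not a valid number.
        problems.append("port is not a valid number")
    else:
        if port is None:
            # Assumption: every service here listens on an explicit port.
            problems.append("no explicit port")
    return problems


for url in ("http://172.16.0.200:8774/v2/$(tenant_id)s", "172.16.0.200:8774/v2"):
    print(url, endpoint_problems(url))
```

An empty list means the URL looks plausible; it does not prove that the service behind it is running.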

Authentication issues

From time to time, users will have trouble authenticating against Keystone due to forgotten or expired details or unexpected failure within the authentication system. Being able to identify such issues will allow you to restore service or allow the user to continue using the environment.

The first place to look will be the relevant logs. This includes the /var/log/nova logs, the /var/log/glance logs (if related to images), as well as the /var/log/keystone logs.

Troubleshooting accounts might include missing accounts, so view the users on the system using the following command:

keystone user-list

Once the user list confirms that an account exists, we can get further information on that user by passing the user ID to the following command, for example:

keystone user-get 68ba544e500c40668435aa6201e557e4

This will display output similar to the following screenshot:

This allows us to verify that the user has a valid account in a particular tenant.

If a user's password needs resetting, we can execute the following command after getting the user ID, to set a user's password to (for example) openstack:

keystone user-password-update \
    --pass openstack \
    68ba544e500c40668435aa6201e557e4

If it turns out a user has been set to disabled, we can simply re-enable the account with the following command:

keystone user-update --enabled true 68ba544e500c40668435aa6201e557e4

There could be times when the account is working but problems exist on the client side. Before looking at Keystone for the issue, ensure your environment is set up correctly for the user account you are working with, in other words, set the following environment variables (example using a user called kevinj):

export OS_USERNAME=kevinj
export OS_PASSWORD=openstack
export OS_AUTH_URL=http://172.16.0.200:5000/v2.0
export OS_TENANT_NAME=cookbook
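A quick client-side check is to confirm that none of these variables is missing before blaming Keystone. The following is a minimal sketch; the helper name missing_credentials and the REQUIRED tuple are hypothetical, and the list of variables is simply the four exported in this recipe.

```python
import os

# The four variables exported in this recipe.
REQUIRED = ("OS_USERNAME", "OS_PASSWORD", "OS_AUTH_URL", "OS_TENANT_NAME")


def missing_credentials(env=None):
    """Return the names of required OS_* variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED if not env.get(name)]


# Illustrative environment with one variable deliberately left out.
example = {
    "OS_USERNAME": "kevinj",
    "OS_PASSWORD": "openstack",
    "OS_AUTH_URL": "http://172.16.0.200:5000/v2.0",
}
print(missing_credentials(example))
```

Called with no argument, the function inspects the real environment of the current shell session.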

How it works...

User authentication issues can be client-side or server-side, and when some basic troubleshooting has occurred on the client, we can use Keystone commands to find out why someone's user journey has been interrupted. With this, we are able to view and update user details, set passwords, set them into the appropriate tenants, and disable or enable them, as required.


Troubleshooting OpenStack Networking

OpenStack Networking is now a complex service with the introduction of Neutron, as it now gives users the ability to define and create their own networking within their cloud environment. Common problems for an OpenStack administrator include misconfigured Neutron installations, routing problems and switch plugin problems. Problems for users include misunderstanding the capabilities of Neutron or limitations imposed by administrators.

Getting ready

We'll be troubleshooting Neutron installations so administrator access is required to troubleshoot this service. Ensure you're logged in as root on our controller, compute, and network hosts and configure our environment to enable us to run various commands.

To log in to our hosts that were created using Vagrant, issue the following commands, each in a separate shell:

vagrant ssh controller
vagrant ssh compute
vagrant ssh network

In our Controller and Network host sessions, as root, issue the following:

# Administrator Credentials
export OS_USERNAME=admin
export OS_PASSWORD=openstack
export OS_AUTH_URL=http://172.16.0.200:5000/v2.0
export OS_TENANT_NAME=cookbook

How to do it...

Carry out the following steps when encountering the problems presented.

Cloud-init reporting Connection Refused when accessing Metadata

In an instance's console log (when you issue nova console-log INSTANCE_ID) you may see lines such as:

There are a number of possibilities for this, but the result will be the same and we will be unable to log into our cloud instance because the instance was unable to have its SSH key injected into it.

Check that you have configured the physical interfaces on your network and compute nodes for use with OVS. As part of the installation and configuration, ensure that you have run the following command:

ovs-vsctl add-port br-eth1 eth1

Where eth1 is our physical interface and br-eth1 is the bridge created on this interface.

Check that your instance can route to the 169.254.169.254 metadata host from the gateway of the instance and, if not, create a route to this network. When subnets are created and a gateway is specified, it is assumed that this gateway address can route to the 169.254.169.254 address. If it can't, you will see the errors described earlier. To create a 169.254.169.254 route on the instance itself, create the subnet with the following options:

quantum subnet-create demoNet1 \
    10.1.0.0/24 \
    --name snet1 \
    --no-gateway \
    --host_routes type=dict list=true \
    destination=0.0.0.0/0,nexthop=10.1.0.1 \
    --allocation-pool start=10.1.0.2,end=10.1.0.254

By specifying --no-gateway, Neutron will inject the 169.254.169.254 route into the instance so that it shows up in the instance routing table. To still provide a default gateway, we specify a host route with a destination of 0.0.0.0/0 and, if appropriate, the next hop that gives the instance access elsewhere.
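The addresses passed to the subnet command have to agree with each other, and a mistyped value is easy to miss. The following sketch uses Python's ipaddress module to sanity-check them before the subnet is created. The function name check_subnet is hypothetical, and the rule that the next hop should sit outside the allocation pool is our assumption for this layout, not something Neutron is stated to enforce.

```python
import ipaddress


def check_subnet(cidr, nexthop, pool_start, pool_end):
    """Return a list of inconsistencies between a subnet and its route/pool."""
    network = ipaddress.ip_network(cidr)
    hop = ipaddress.ip_address(nexthop)
    start = ipaddress.ip_address(pool_start)
    end = ipaddress.ip_address(pool_end)
    problems = []
    if hop not in network:
        problems.append("next hop is outside the subnet")
    if start not in network or end not in network:
        problems.append("allocation pool is outside the subnet")
    if start > end:
        problems.append("allocation pool start is after its end")
    if start <= hop <= end:
        # Assumption: the next hop address should not be handed to instances.
        problems.append("next hop falls inside the allocation pool")
    return problems


# Values taken from the subnet-create command in this recipe.
print(check_subnet("10.1.0.0/24", "10.1.0.1", "10.1.0.2", "10.1.0.254"))
```

An empty list means the values used in this recipe are consistent with one another.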

Submitting Bug reports

OpenStack is a hugely successful open source, public and private cloud framework. It has gained this momentum through the individuals and organizations downloading and contributing to it. By using the software in a vast array of environments and scenarios, and running it on a myriad of hardware, you will invariably encounter bugs. In an open source project, the best thing we can do is tell the developers about them so they can develop or suggest a solution for us.

How to do it...

The OpenStack project is available through LaunchPad. LaunchPad is an open source suite of tools that helps people and teams to work together on software projects and is accessible at http://launchpad.net/, so the first step is to create an account.

Creating an account on LaunchPad

The steps for creating an account on LaunchPad are as follows:

  1. Creating an account on LaunchPad is easy. First, head over to https://login.launchpad.net/+new_account (or navigate from the home page to the Login/Register link).
  2. Fill in your name, e-mail address, and password details, as shown in the following screenshot:

  3. We will then be sent an e-mail with a link to complete the registration. Click on this to be taken to a confirmation page.
  4. We will then be taken to an account page, but no further details need to be entered here.

Submitting bug reports through LaunchPad

Now that we have an account on LaunchPad, we can submit bug reports. The following links take us directly to the bug report sections of those projects:

On submitting a short summary, a search is made to see if a similar bug exists. If it does, click on the bug and then ensure you click on the This bug affects X people. Does this bug affect you? link. If multiple people report that they are affected by a bug, its status changes from reported by a single person to confirmed, helping the Bug Triage team with their work. Please ensure you add any relevant additional information on the bug, in support of the issues you are facing.

If the bug doesn't exist, we will be presented with a form that has a one-liner Summary field and a free-form textbox in which to put in the required information.

On submitting bugs, try to follow these rules:

  • Include the OS platform, architecture, and software package versions
  • Give step-by-step details on how to recreate the bug
  • Enter what you expected to happen
  • Enter what actually happened instead
  • Be precise—developers like precision
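One way to make sure none of these points is forgotten is to assemble the description from named fields before pasting it into the form. The sketch below is only a suggestion; the function name format_bug_report and all sample values are hypothetical placeholders, and LaunchPad itself accepts free-form text.

```python
def format_bug_report(platform, architecture, packages, steps, expected, actual):
    """Build a plain-text bug description covering the points listed above."""
    lines = [
        "Platform: " + platform,
        "Architecture: " + architecture,
        "Packages: " + ", ".join(packages),
        "",
        "Steps to reproduce:",
    ]
    # Number the reproduction steps starting from 1.
    lines += ["  %d. %s" % (n, step) for n, step in enumerate(steps, 1)]
    lines += ["", "Expected: " + expected, "Actual: " + actual]
    return "\n".join(lines)


# Placeholder values for illustration only.
report = format_bug_report(
    platform="<output of lsb_release -r>",
    architecture="<output of uname -i>",
    packages=["<package name and version>"],
    steps=["Boot an instance", "Request its console log"],
    expected="The console log is returned",
    actual="An HTTP 500 error is returned",
)
print(report)
```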

Useful commands to help complete a bug report

The following is a list of useful commands that will help you in the completion of the bug report:

  • OS version: lsb_release -r
  • Architecture: uname -i
  • Package version:

    dpkg -l | grep name_of_package
    dpkg -s name_of_package | grep Version

Pasting logs

Sometimes you will need to submit logging information to support your bug report. This information can be quite lengthy, so rather than including the log text in the bug report itself, use a text paste service, which gives you a unique URL that you can reference from within the report. For this purpose, you can use the service at http://paste.openstack.org/.

Ensure you sanitize any data that you paste in public. This includes removing any sensitive data such as IPs, usernames, and passwords.
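Sanitizing by hand is error prone on long logs, so a small filter can take care of the most obvious cases first. The following is a minimal sketch that masks IPv4 addresses and password-like values. The function name sanitize, the patterns, and the sample line are our own assumptions, usernames are not covered, and the result should still be read through before pasting.

```python
import re

# Four dot-separated groups of digits, e.g. 172.16.0.200.
IPV4 = re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b")
# A key containing password, passwd, or token, followed by = or : and a value.
SECRET = re.compile(r"(?i)(password|passwd|token)(\s*[=:]\s*)(\S+)")


def sanitize(text):
    """Mask IPv4 addresses and password-like values before pasting a log."""
    text = SECRET.sub(r"\1\2<redacted>", text)
    return IPV4.sub("<ip>", text)


# Illustrative log line, not real output from any service.
line = "auth failed for OS_PASSWORD=openstack from 172.16.0.200"
print(sanitize(line))
```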

Once a bug is submitted, an e-mail will be sent to the e-mail address used to register with LaunchPad, and any subsequent updates in relation to the bug will be sent to this e-mail address, allowing us to track its progress all the way through to a fix being released.

How it works...

OpenStack is developed by a relatively small number of people, compared to the number of people in the community who end up downloading and using the software. This means the software gets used in scenarios that developers can't feasibly test or just didn't see as possible at the time. The net result is that bugs often surface during real-world use. Being able to report these bugs is vital, and this is why open source software development is so hugely successful in creating proven and reliable software.

OpenStack's development lives on LaunchPad, so all bug tracking and reporting is done using this service. This provides a central tool for the global community and allows end users to communicate with the relevant projects to submit bugs.

Submitting bugs is a vital element to an open source project. It allows you to shape the future of the project as well as be part of the ecosystem that is built around it.

It is important to give as much information as possible to the developers when submitting bugs. Be precise and ensure that the steps to recreate the bug are easy to follow and provide an explanation of the environment you are working in, to allow the bug to be recreated. If it can't be recreated, it can't be fixed.

See also

You can find out more information about the OpenStack community at: http://www.openstack.org/community/.

Getting help from the community

OpenStack would not be where it is today without the ever-growing community of businesses, sponsors, and individuals. As with many large OSS projects, support is fantastic, meaning round-the-clock attention to requests for help, which can sometimes exceed the best efforts of paid-for support.

How to do it...

There are a number of ways to reach out for support from the excellent OpenStack community. They are explained in the following sections.

IRC support

Internet Relay Chat has been the mainstay of the Internet since the beginning, and collaboration from developers and users can be found on the Freenode IRC network.

OpenStack has a channel (or a room) on the Freenode IRC network called #openstack.

There are two ways of accessing IRC, either through the web interface or by using an IRC client:

  • IRC access using a web browser
    1. Accessing the #openstack channel, using a web browser, can be achieved at http://webchat.freenode.net/.
    2. Enter #openstack as the channel.
    3. Choose a username for yourself.
    4. Complete the CAPTCHA and you will be placed into the #openstack channel.
  • IRC access using an IRC client
    1. Download a suitable IRC client for your operating system (for example, Xchat).
    2. When loading up your client, choose a username (and enter a password if you have registered your username) and connect to the Freenode network (irc.freenode.net).
    3. When connected, type the following command to join #openstack:

      /j #openstack

    4. We will now be in the #openstack channel.

Mailing list

Subscribing to the mailing list allows you to submit and respond to queries where an instant response might not be required and is useful if you need your question to reach more members than the relatively smaller number that is on IRC.

To subscribe to the mailing list, head over to https://launchpad.net/openstack, where you will see an option to subscribe to the mailing list.

You will need to create a LaunchPad ID and be a member of the OpenStack project (see the Submitting Bug reports recipe for how to do this).

Pasting logs

When asking for help, it usually involves copying logs from your environment and sharing them with the community. To help facilitate this, a web service has been created that allows you to paste the log entries that can be referred to in an IRC chat or in an e-mail without having to paste them directly. This can be found at http://paste.openstack.org/. When you create a new paste, you are given a unique URL that you can then refer to for the information instead.

Ensure you sanitize any data that you paste in public. This includes removing any sensitive data such as IPs, usernames, and passwords.

How it works...

The OpenStack community is what makes OpenStack what it is. It is made up of developers, users, testers, companies, and individuals with a vested interest in ensuring OpenStack's success. There are a number of useful places to ask for help when it comes to community support. These include IRC and the mailing list.

You are encouraged to post and respond to requests in IRC and on the mailing list, as there are likely to be many people wanting the same questions answered. There will also be the development and project teams wanting to understand what is causing issues so they can help address them.

See also

You can find out more information about the OpenStack community, at http://www.openstack.org/community/.

Summary

OpenStack, like all software, has bugs that we can't always solve ourselves. Here we saw how to gather the required information so that the OpenStack community can identify bugs and suggest fixes, which is important in ensuring those bugs or issues are dealt with quickly and efficiently.

About the Author :


Kevin Jackson

Kevin Jackson is married with three children. He is an experienced IT professional working with small businesses to online enterprises. He has extensive experience of various flavors of Linux and Unix. He works from home in Southport, UK, specializing in OpenStack for Rackspace covering the International market for the Big Cloud Solutions team. He can be found on twitter @itarchitectkev. He also authored the first edition of OpenStack Cloud Computing Cookbook, Packt Publishing.
