Creating the first Circos diagram

by Tom Schenk Jr. | April 2013 | Open Source

In this article, by Tom Schenk Jr., the author of Circos Data Visualization How-to, we will create a very basic Circos diagram containing links (ribbons) showing the relationship between hair and eye color. Throughout this task, we will become acquainted with Circos' genome-based terminology. As Circos' roots are in biology, the program does not read the typical tables most users are accustomed to.


Getting ready

Let's start with the simple task of graphing a relationship between a student's eye and hair color. We can expect some results: brown eyes are more common for students with brown or black hair, and blue eyes are more common amongst blondes. Circos is able to show these relationships with more clarity than a traditional table. We will be using the hair and eye color data available in the book's supplemental materials (HairEyeColor.csv). The data contains the information about hair and eye color of University of Delaware students.

Create a folder C:\Users\user_name\Circos Book\HairEyeColor, and place the data file in it. Here, user_name denotes the user name that you use to log in to your computer.

The original data is stored in the form typical of a data set, with one record per line: each line represents one student and records that student's hair (black, brown, blonde, or red) and eye (blue, brown, green, or hazel) color. The following table shows the first 10 lines of data:

    Hair     Eye
    Brown    Brown
    Red      Brown
    Blonde   Blue
    Brown    Hazel
    Blonde   Blue
    Brown    Blue
    Black    Brown
    Brown    Brown
    Brown    Hazel

Before we start creating the diagram, let's summarize the data in a table. If you wish, you can use a PivotTable in Microsoft Excel or a DataPilot in OpenOffice to cross-tabulate hair color (rows) against eye color (columns), as follows:

 

              Blue   Brown   Green   Hazel
    Black       20      68       5      15
    Blonde      94       7      15      11
    Brown       84     119      29      54
    Red         17      26      14      14
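
The same cross-tabulation can also be scripted rather than built in a spreadsheet. The following is a minimal Python sketch, assuming the CSV file has a header row with columns named Hair and Eye; the function names are hypothetical, and the sample rows embedded here are illustrative only, not taken from HairEyeColor.csv.

```python
import csv
import io
from collections import Counter

# Illustrative rows only; in practice, open HairEyeColor.csv instead.
SAMPLE_CSV = """Hair,Eye
Brown,Brown
Blonde,Blue
Brown,Hazel
Blonde,Blue
Black,Brown
"""

def cross_tabulate(csv_file):
    """Count students for every (hair, eye) pair in a CSV with Hair and Eye columns."""
    counts = Counter()
    for row in csv.DictReader(csv_file):
        counts[(row["Hair"], row["Eye"])] += 1
    return counts

def as_table(counts):
    """Return (eye_colors, rows), where rows maps each hair color to its counts."""
    hair_colors = sorted({hair for hair, _ in counts})
    eye_colors = sorted({eye for _, eye in counts})
    # A Counter returns 0 for pairs that never occur, so empty cells become 0.
    rows = {hair: [counts[(hair, eye)] for eye in eye_colors] for hair in hair_colors}
    return eye_colors, rows

counts = cross_tabulate(io.StringIO(SAMPLE_CSV))
eye_colors, rows = as_table(counts)
print("Hair", *eye_colors)
for hair, values in rows.items():
    print(hair, *values)
```

To run it on the real file, pass an open file handle (for example, open("HairEyeColor.csv", newline="")) to cross_tabulate instead of the in-memory sample.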

To use the data in Circos, we need a simpler format. Create a text file containing the table, with values separated only by spaces. We will also rename the row and column titles to make them clearer (Brown would otherwise appear as both a hair color and an eye color), as follows:

X Blue_Eyes Brown_Eyes Green_Eyes Hazel_Eyes
Black_Hair 20 68 5 15
Blonde_Hair 94 7 15 11
Brown_Hair 84 119 29 54
Red_Hair 17 26 14 14

The X is simply a placeholder for the empty top-left cell. Save this file as HairEyeColorTable.txt; we are now ready to use Circos.

You can skip the process of making the raw tables. We will be using the HairEyeColorTable.txt file to create the Circos diagram.
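
For larger tables, typing the space-separated file by hand is error-prone. A minimal Python sketch of generating it follows; the counts are those from the hair and eye color table, while the function name and the choice to build the text in memory before saving are assumptions made for illustration.

```python
# Counts from the hair/eye table; keys are row labels, values follow COLUMNS.
COLUMNS = ["Blue_Eyes", "Brown_Eyes", "Green_Eyes", "Hazel_Eyes"]
ROWS = {
    "Black_Hair": [20, 68, 5, 15],
    "Blonde_Hair": [94, 7, 15, 11],
    "Brown_Hair": [84, 119, 29, 54],
    "Red_Hair": [17, 26, 14, 14],
}

def format_table(columns, rows, corner="X"):
    """Build the space-separated table text, with a placeholder in the corner."""
    for label in list(columns) + list(rows):
        if " " in label:
            # A space would be read as a column separator.
            raise ValueError(f"label {label!r} contains a space; use an underscore")
    lines = [" ".join([corner] + list(columns))]
    for label, values in rows.items():
        if len(values) != len(columns):
            raise ValueError(f"row {label!r} has {len(values)} values, expected {len(columns)}")
        lines.append(" ".join([label] + [str(v) for v in values]))
    return "\n".join(lines) + "\n"

table_text = format_table(COLUMNS, ROWS)
print(table_text, end="")
# To save it: open("HairEyeColorTable.txt", "w").write(table_text)
```

Checking for spaces in the labels up front is a design choice here: a label such as "Brown Hair" would silently shift every value in its row by one column.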

How to do it…

  1. Open the Command Prompt and change the directory to the location of the tableviewer tools, Circos\Circos Tools\tools\tableviewer\bin, as follows:

    cd C:\Program Files (x86)\Circos\Circos Tools\tools\tableviewer\bin

  2. Parse the text table (HairEyeColorTable.txt). This will create a new file, HairEyeColorTable-parsed.txt, which will be refined into a Circos diagram as follows:

    perl parse-table -file "C:\Users\user_name\Circos Book\HairEyeColor\HairEyeColorTable.txt" > "C:\Users\user_name\Circos Book\HairEyeColor\HairEyeColorTable-parsed.txt"

  3. The parse command consists of two parts. First, perl parse-table instructs Perl to execute the parse-table program on the HairEyeColorTable.txt file. Second, the > symbol instructs Windows to write the output to another text file, HairEyeColorTable-parsed.txt.

    Linux Users

    Linux users can use a shorter syntax. Parsing the table and creating the configuration files (the command that follows) can be combined into a single pipeline, run from the tableviewer directory:

    cat "$HOME/Documents/Circos Book/HairEyeColor/HairEyeColorTable.txt" | bin/parse-table | bin/make-conf -dir "$HOME/Documents/Circos Book/HairEyeColor/"

    Create the configuration files from the parsed table using the following command:

    type "C:\Users\user_name\Circos Book\HairEyeColor\HairEyeColorTable-parsed.txt" | perl make-conf -dir "C:\Users\user_name\Circos Book\HairEyeColor\"

    This will create 11 new configuration files. These files contain the data and style information which is needed to create the final diagram.

    This command consists of two parts. We are instructing Windows to pass the text in the HairEyeColorTable-parsed.txt file to the make-conf command. The | (pipe) character separates the text we want passed along from the command that receives it. After the pipe, we are instructing Perl to execute the make-conf command and write its output files to the directory given after -dir.

  4. We need to create a final file, which pulls all the information together. This file will also tell Circos how the diagram should appear (size, labels, and image style) and where the diagram will be saved. We will save this file as HairEyeColor.conf.

    • The make-conf command gave us the colors.conf file, which defines the colors used in the final diagram. In addition, the Circos installation provides some other basic colors and fonts. The first several lines of code are:

      <colors>
      <<include colors.conf>>
      <<include C:\Program Files (x86)\Circos\etc\colors.conf>>
      </colors>
      <fonts>
      <<include C:\Program Files (x86)\Circos\etc\fonts.conf>>
      </fonts>

    • The next segment is the ideogram. These are the parameters that set the details of the image. This first set of lines specifies the spacing, color, and size of the chromosomes:

      <ideogram>
      <spacing>
      default = 0.01r
      break = 200u
      </spacing>
      thickness = 100p
      stroke_thickness = 2
      stroke_color = black
      fill = yes
      fill_color = black
      radius = 0.7r
      show_label = yes
      label_font = condensedbold
      label_radius = dim(ideogram,radius) + 0.05r
      label_size = 48
      band_stroke_thickness = 2
      show_bands = yes
      fill_bands = yes
      </ideogram>

    • Next, we will define the image, including where it is stored (the dir parameter in the following code snippet), the file name, whether we want SVG or PNG output, size, background color, and any rotation:

      <image>
      dir = C:\Users\user_name\Circos Book\HairEyeColor\
      file = HairEyeColor
      svg = yes
      png = yes
      24bit = yes
      radius = 800p
      background = white
      angle_offset = +90
      </image>

    • Lastly, we will input the data and define how the links (ribbons) should look:

      chromosomes_units = 1
      karyotype = karyotype.txt
      <links>
      z = 0
      radius = 1r - 150p
      bezier_radius = 0.2r
      <link cell_>
      ribbon = yes
      flat = yes
      show = yes
      color = black
      thickness = 2
      file = cells.txt
      </link>
      </links>
      show_bands = yes
      <<include C:\Program Files (x86)\Circos\etc\housekeeping.conf>>

      Save this file as HairEyeColor.conf in the same folder as the other configuration files. The next diagram summarizes the whole procedure:

      The make-conf command outputs a few very important files. First, karyotype.txt defines each ideogram band's name, width, and color. Meanwhile, cells.txt is the segdup file containing the actual data. It is very different from our original table, but it dictates the width of each ribbon. Circos links the karyotype and segdup files to create the image. The other configuration files are mostly to set the aesthetics, placement, and size of the diagram.

  5. Return to the Command Prompt and execute the following command:

    cd C:\Users\user_name\Circos Book\HairEyeColor
    perl "C:\Program Files (x86)\Circos\bin\circos" -conf HairEyeColor.conf

Several lines of text will scroll across the screen. At the conclusion, HairEyeColor.png and HairEyeColor.svg will appear in the folder as shown in the next diagram:
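
The command-line steps above can also be driven from a script, which avoids retyping long quoted paths. The following Python sketch is one way to do it, assuming perl is on the PATH and Circos is installed in the locations used in this article. The function names are hypothetical, only the -file, -dir, and -conf options shown above are used, and unlike the manual steps the parsed table is kept in memory rather than written to HairEyeColorTable-parsed.txt.

```python
import subprocess
from pathlib import PureWindowsPath

# Install and data locations as used in this article; adjust to your machine.
TOOLS_BIN = PureWindowsPath(r"C:\Program Files (x86)\Circos\Circos Tools\tools\tableviewer\bin")
CIRCOS = PureWindowsPath(r"C:\Program Files (x86)\Circos\bin\circos")
DATA_DIR = PureWindowsPath(r"C:\Users\user_name\Circos Book\HairEyeColor")

def build_commands(tools_bin, circos, data_dir):
    """Return the three argument lists: parse the table, make configs, draw."""
    parse = ["perl", str(tools_bin / "parse-table"),
             "-file", str(data_dir / "HairEyeColorTable.txt")]
    make_conf = ["perl", str(tools_bin / "make-conf"), "-dir", str(data_dir)]
    draw = ["perl", str(circos), "-conf", "HairEyeColor.conf"]
    return parse, make_conf, draw

def run_pipeline(tools_bin=TOOLS_BIN, circos=CIRCOS, data_dir=DATA_DIR):
    """Run all three steps; HairEyeColor.conf must already exist in data_dir."""
    parse, make_conf, draw = build_commands(tools_bin, circos, data_dir)
    # Stands in for the ">" redirection: capture the parsed table in memory.
    parsed = subprocess.run(parse, capture_output=True, text=True, check=True)
    # Stands in for "type ... |": feed the parsed table to make-conf's standard input.
    subprocess.run(make_conf, input=parsed.stdout, text=True, check=True)
    # Stands in for "cd" followed by the circos command.
    subprocess.run(draw, cwd=str(data_dir), check=True)

# Building the commands has no side effects, so they can be inspected first.
parse_cmd, make_conf_cmd, draw_cmd = build_commands(TOOLS_BIN, CIRCOS, DATA_DIR)
print(parse_cmd)
```

Calling run_pipeline() executes the three steps in order and stops at the first failure, because check=True raises an error when a command exits with a non-zero status. Passing the arguments as lists means that paths containing spaces need no extra quoting.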

Circos Data Visualization How-to [Instant] Create dynamic data visualizations in the social, physical, and computer sciences with the Circos data visualization program with this book and ebook.
Published: November 2012

There's more…

Now we can work toward improving the quality of the image. Later, we will increase the complexity. This section will add two tweaks. First, we will change the colors so the hair and eye color will correspond to image colors—a natural way to display such data. Secondly, we will include some transparency so we can see the overlapping ribbons even better.

  1. We can change the color of the ribbons by adjusting the colors.conf file generated by the make-conf command. Open the file and change the colors to:

    colorgreen_eyes = 46,139,87
    colorblack_hair = 0,0,0
    colorblue_eyes = 0,191,255
    colorbrown_hair = 205,133,63
    colorbrown_eyes = 178,34,34
    colorhazel_eyes = 208,195,131
    colorred_hair = 255,0,0
    colorblonde_hair = 242,218,145

  2. Let's also add some transparency. Transparency values range from 0 (opaque) to 1 (transparent). Modify the existing colors to:

    colorgreen_eyes = 46,139,87,.2
    colorblack_hair = 0,0,0,.2
    colorblue_eyes = 0,191,255,.2
    colorbrown_hair = 205,133,63,.2
    colorbrown_eyes = 178,34,34,.2
    colorhazel_eyes = 208,195,131,.2
    colorred_hair = 255,0,0,.2
    colorblonde_hair = 242,218,145,.2

  3. Save the file as colors-new.conf. Meanwhile, return to HairEyeColor.conf and change <<include colors.conf>> to <<include colors-new.conf>>.

  4. Regenerate the image by using the following command:

    perl "C:/Program Files (x86)/Circos/bin/circos" -conf HairEyeColor.conf

This will generate the following diagram:
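
Adding the transparency value to every color by hand is repetitive, so that edit can be scripted as well. The Python sketch below appends a transparency value to each line of the form name = r,g,b; it assumes the colors file consists of such lines, and the function name is hypothetical.

```python
import re

# A line such as "name = 46,139,87": a name, an equals sign, three integers.
RGB_LINE = re.compile(r"^(\s*\S+\s*=\s*)(\d+)\s*,\s*(\d+)\s*,\s*(\d+)\s*$")

def add_transparency(conf_text, alpha=0.2):
    """Append an alpha value (0 = opaque, 1 = transparent) to every RGB line."""
    if not 0 <= alpha <= 1:
        raise ValueError("alpha must be between 0 and 1")
    out = []
    for line in conf_text.splitlines():
        match = RGB_LINE.match(line)
        if match:
            prefix, r, g, b = match.groups()
            line = f"{prefix}{r},{g},{b},{alpha:g}"
        out.append(line)  # lines that are not plain RGB definitions pass through
    return "\n".join(out) + "\n"

# Two of the color definitions from this section, used as sample input.
sample = "colorgreen_eyes = 46,139,87\ncolorblack_hair = 0,0,0\n"
new_text = add_transparency(sample)
print(new_text, end="")
# To apply: read colors.conf, then write the result to colors-new.conf.
```

Lines that already carry a fourth value do not match the pattern and are left unchanged, so running the function twice does not add a second transparency value.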

Links without ribbons

Perhaps we will find it more pertinent to show whether there is a relationship, as opposed to the quantity of a relationship. We can easily change from ribbons—whose width corresponds to the data—to simple links.

In the HairEyeColor.conf file, edit the ribbon = yes line to ribbon = no. Regenerate the image; the result will now be:

Editing the image for a final product

We may want to edit the image for a final product. For instance, Circos does not support spaces in labels, so an underscore stands in for each space in the diagram. This may be unacceptable for the final product. You may want to explore the Scalable Vector Graphics (SVG) output. SVG is a vector format, which allows you to change image sizes with no distortion.

You can open an SVG in programs such as Adobe Illustrator (http://www.adobe.com/products/illustrator.html) or Inkscape (http://inkscape.org/) to modify the design or create a poster. SVG allows you to even select and change specific parts of the diagram.
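
Replacing the underscores does not have to be done by hand in an editor. The following Python sketch rewrites the labels using only the standard library. It assumes that the labels are stored as the direct text of text elements in the SVG namespace, which should be checked against the actual Circos output; the function name is hypothetical and the tiny SVG used here is a made-up stand-in.

```python
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"
ET.register_namespace("", SVG_NS)  # keep plain <svg>/<text> tag names on output

def underscores_to_spaces(svg_text):
    """Replace underscores with spaces in the text of every SVG <text> element."""
    root = ET.fromstring(svg_text)
    for element in root.iter(f"{{{SVG_NS}}}text"):
        if element.text:
            element.text = element.text.replace("_", " ")
    return ET.tostring(root, encoding="unicode")

# Made-up stand-in for a Circos SVG file; real output is much larger.
sample_svg = (
    f'<svg xmlns="{SVG_NS}">'
    '<text x="10" y="20">Brown_Hair</text>'
    '<text x="10" y="40">Blue_Eyes</text>'
    "</svg>"
)
cleaned = underscores_to_spaces(sample_svg)
print(cleaned)
```

Only the text content is changed, so attributes and identifiers that happen to contain underscores are left alone. Labels nested inside child elements of a text element would need extra handling.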

Summary

In this article, we created a very basic Circos diagram containing links (ribbons) showing the relationship between hair and eye color. Throughout this task, we became acquainted with Circos' genome-based terminology.


About the Author

Tom Schenk Jr.

Tom Schenk Jr. is the Director of Analytics for the city of Chicago. He also maintains the Data Nouveau website at www.datanouveau.net. Tom has written numerous scholarly articles on data visualization, education, and economic research. He has emphasized the use of data visualization techniques in governmental reports. Previously, he was an Educational Consultant for the Iowa Department of Education and a Senior Analyst at the Department of Medical Social Sciences at Northwestern University.
