You're reading from  Building a BeagleBone Black Super Cluster

Product type: Book
Published in: Nov 2014
ISBN-13: 9781783989447
Edition: 1st Edition
Author (1)
Andreas J Reichel

Andreas Josef Reichel was born in 1982 in Munich, Bavaria, to Josef and Ursula. He went to an elementary school from 1989 to 1993 and continued with lower secondary education for 4 years and started with middle school in 1996. In 1999, he finished school as the best graduate of the year. From 2000 to 2001, he went to Fachoberschule and got his subject-linked university entrance qualification, with which he began to study Physical Technology at the University of Applied Sciences in Munich. After two semesters, he got his preliminary diploma and began with general studies of Physics at the Ludwig Maximilian University of Munich in 2003. In 2011, he completed Dipl.-Phys. (Univ.) in experimental physics with the THz characterization of thin semiconductor films in photonics and optoelectronics. Now, he is working on his dissertation to Dr. rer. nat. on plasma etching processes for semiconductors at the Walter Schottky Institute of the Technische Universität München in Garching. In his spare time, he has been learning programming languages such as BASIC, Pascal, C/C++, x86 and x64 Assembler, as well as HTML, PHP, JavaScript, and the database system MySQL and has been programming since he was 13 years old. Since 1995, he has been an active hobby musician in different accordion ensembles and orchestras. He also loves to learn about languages and drawing, and he began practicing Chinese martial arts in 2012. He invests most of his free time in hobby electronic projects and family genealogical research. He was the co-author of Charge carrier relaxation and effective masses in silicon probed by terahertz spectroscopy, S. G. Engelbrecht, A. J. Reichel, and R. Kersting, Journal of Applied Physics.

Software programming


The most important part of a good computer is good software. Without software that is specifically optimized for its hardware, the full computational power cannot be utilized. In this book, I will show you how to build a supercomputer cluster that gains its high-speed computational power by distributing certain tasks to other nodes via networking. For this purpose, special software is required and has to be compiled from the source code. How this works and what nodes are will be explained in Chapter 2, Building a Beowulf Cluster, and Chapter 3, Operating System Setup and Configuration.

The open source philosophy

Although many helpful software packages already exist, it is very important to understand that Linux is an open source operating system written for an open source community. Windows users are often frustrated when they first come into contact with open source software, because they are used to software that already works and is easy to install. A huge disadvantage of such prebuilt software packages is that they are compiled for a standard platform and might not be optimized for the specific computer they are installed on. Another problem is that if components of the operating system are updated while the user software still requires older versions, instabilities might arise, or changed interfaces might break the software completely.

Software modularity and dependencies

Linux is a highly modular operating system. The whole system is built on the philosophy of open software, which means that every part of the operating system can be compiled from available open source code. This source code is written in standard programming languages such as C, C++, FORTRAN, Assembler, and others, and is compiled to binary code specifically optimized for certain hardware. The technique by which software is built does not differ much between Windows and Linux operating systems. For the beginner, it might be hard to produce a working compiled program starting from source code because there are usually a lot of software dependencies, such as software libraries or other programs that the code is based upon, which may be missing on the system. It can be hard to find all the required libraries, especially when only newer versions are available whose interfaces, such as function definitions, have changed. In addition, one dependency can quickly lead to several others, so that the search for all the required libraries keeps growing and takes a lot of time.

Also, on certain hardware, some well-established compiler parameters do not work and have to be modified, or bug fixes have to be found. This can turn the simple task of "just compiling software" into a seemingly unsolvable problem for beginners. Thanks to the growing community of hobby programmers and Linux enthusiasts, there are a lot of online forums that can be searched for such problems. Often, solutions are already posted, and if not, there are usually hints that at least point us in the right direction.

The following sections will explain the basics of creating software on Linux operating systems with standard programming environments. They are written for hobby enthusiasts who might or might not have already tried to program their own software. Although existing knowledge is very helpful, it is not required in order to understand the following explanations.

The source code and programming languages

Each computer program consists of binary code, which means a sequence of two states usually described as zero and one. A single such state is called a bit. Four bits make up a so-called nibble and eight make up a byte. Groups of several bytes are described as a word, a double word, or a quad word. These names follow the Intel x86 convention; on other architectures, a word can have a different size. The following table gives you a small summary of the most important data sizes:

Number of bits    Number of bytes    Special name
4                 1/2                nibble
8                 1                  byte
16                2                  word
32                4                  double word
64                8                  quad word

A central processor does nothing but interpret bit sequences as commands. These commands are called instructions and tell the CPU what to do. This is the lowest level of programming, the so-called machine language. Except for a few specialists, machine language is not human-readable. For example, it is not obvious that the binary code 1011 0100 0100 1100 1100 1101 0010 0001 is the end of an MS-DOS program.

Low-level programming

Low-level programming means programming directly in machine language. Of course, this has to happen in a human-readable way. One way to simplify 1011 0100 0100 1100 is to use another number system, such as the hexadecimal system, resulting in 0xB4 0x4C. This is better, but it's still not readable by humans. The final simplification is the invention of so-called mnemonics. For example, on Intel x86 platforms, 0xB4 0x4C means mov ah, 0x4C in this mnemonic language. Now, one can understand that this code sets the CPU register named ah to the value 0x4C. This language, as well as the software that translates it back into bits, is called Assembler.

Assembler has its advantages and disadvantages. One big advantage is that the resulting software does exactly what you programmed. This means that there is no optimization that modifies your code, and you can program very efficiently in terms of size and speed. One big disadvantage, however, is that each CPU family has its own instruction set. This means that our Sitara CPU will not understand the preceding example, because it has no ah register. For any real problems we want to solve using computers, we are primarily interested in the nature of the problem and not the nature of the CPU used in the computer. To make programming independent of the computer platform in use, there are so-called high-level programming languages.

High-level programming

High-level programming languages consist of keywords, syntax, and grammar, as with every spoken language. The keywords define the vocabulary that can be used, whereas syntax and grammar define the exact utilization and order of these keywords. To understand this, we should have a look at how a simple loop will look in a low-level language compared to a high-level language.

The following example shows us that a simple loop already needs four different instructions in Assembler, while it can be realized by a relatively simple for keyword in the high-level language C++:

Low level (Assembler):

    mov cx, 0
@mark1:
    ; other code...
    inc cx
    cmp cx, 345
    jne @mark1

High level (C++):

for (int j = 0; j < 345; j++) {
    // other code...
}

A low-level language compared to a high-level language

While C++ is a general-purpose high-level programming language, there are also languages that are optimized for more specific purposes. One example is FORTRAN, which is mainly used for mathematical problems due to its ability to define matrices and other mathematical structures very easily.

The compiler toolchain

Once the required code has been written and is ready to be translated from its human-readable form into machine language, a certain sequence of tools has to be used. This toolset is called the compiler toolchain:

  1. Firstly, the code is processed by the compiler itself. The compiler translates the high-level language to a low-level language, mostly Assembler.

  2. The Assembler code is then translated to object files. Usually, these two steps are performed internally by a single compiler program.

  3. The object files now contain binary machine code. However, before the OS can execute a program correctly, it must know a few things. It must be told where in the main memory the program has to be loaded, how much memory it uses, which libraries it needs, and so on. To fulfill these requirements, we need to link the program.

  4. The so-called linker combines one or more object files and adds information that's specific to the OS in use.

  5. The final program then consists of a special executable format generated by the linker and incorporating the object files produced by the compiler. In Linux, this executable format is called Executable and Linking Format (ELF).

The whole process is depicted in the following diagram:

[Figure: Software compilation and simple toolchain]

Another important feature of the linker is its capability to embed the required libraries into the executable file. This is called static linking. The program can then be used on other computers that do not have that specific library installed. The opposite of static linking is dynamic linking. In this case, only a stub is linked into the program, which tells the OS which library to provide at run time. Dynamically linked programs are smaller in size but always need their libraries to be installed on the computer they run on.

In all examples, the main focus will be on C++. Some of the modules are only available as FORTRAN code; however, once compiled, their functions can also be accessed from C++ programs.
