Fundamentals of Computer Science
The world of computer science is a broad and complex one. Not only is it constantly changing and evolving, but the components we consider part of computer science are also adapting and adjusting. The computational thinking process allows us to tackle any problem presented with purpose and focus. No matter what the problem is, we can break it down, find patterns that will help us find solutions, generalize our solutions, and design algorithms that can help us provide solutions to that problem.
Throughout this book, we will be looking at the computational thinking process carefully, tackling problems in multiple areas and using the Python programming language and its associated libraries and packages to create algorithms that help us solve these problems. Before we look at various problems, however, we will explore some of the important computer science concepts that will help us navigate the rest of this book.
In this chapter, we will explore the following topics:
- Introduction to computer science
- Theoretical computer science
- System software
- Computing
- Data types and structures
Technical requirements
Here is the source code that will be used in this chapter: https://github.com/PacktPublishing/Applied-Computational-Thinking-with-Python-Second-Edition/tree/main/Chapter01.
Introduction to computer science
When looking for a definition of computer science, you will encounter multiple variations, but they all state that computer science encompasses all aspects of computers and computing concepts, including hardware and software. Hardware design is, for the most part, taught in engineering or computer engineering courses. The software side of computer science includes operating systems (OSs) and applications, among other programming areas. For this book, we will be concentrating on the software side of computer science.
In this chapter, we’ll look at some of the basic definitions, theories, and systems that are important as we delve deeper into the computational thinking world. Once we have identified key areas and defined the necessary concepts, we will be ready to move on to the applications and real-world challenges we face in an ever-changing tech world while also exploring the elements of computational thinking and the Python programming capabilities that can help us tackle these challenges.
The wide range of topics available in computer science can be both daunting and exciting, and the field is ever-evolving. Some of these topics include game design, OSs, applications for mobile or desktop devices, programming robots, and much more. Constant and consistent breakthroughs in computers and computing provide new and exciting opportunities, many of which are still unknown to us. Having a basic understanding of the systems behind computer science can help us interact with technology and tackle problems more efficiently. Let’s start by learning about how computers store information using the binary system.
Learning about computers and the binary system
All computers store information as binary data. The binary system reads all information as a switch, which can be on or off – that is, 1 or 0. The binary system is a base-2 system. You’ll need a basic understanding of binary numbers and binary systems to progress in computer science.
The binary system translates all data so that it can be stored as strings using only two digits: 0 and 1. Data is stored in computers using bits. A bit (which stands for binary digit) is the smallest unit of data you can find in a computer – that is, either a 0 or a 1.
When counting in the binary system, the first two numbers are 0 (or 00) and 1 (or 01), much like in the base-10 number system we use in everyday life. If we were to continue counting in binary, our next number would be 10. Let’s compare the first three numbers in the base-10 system and the binary system before we learn how to convert from one into the other:
Figure 1.1 – Base-10 and binary comparison
The next number in the base-10 system would be 3. In the binary system, the next number would be 11, which is read as one one. The numbers 0 through 10 in the base-10 and binary systems are as follows:
Base-10 | Binary
0 | 00
1 | 01
2 | 10
3 | 11
4 | 100
5 | 101
6 | 110
7 | 111
8 | 1000
9 | 1001
10 | 1010
Figure 1.2 – Base-10 and binary comparison (continued)
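The comparison above can be reproduced with Python's built-in format() function. The following is a minimal sketch; the variable name rows and the column widths are illustrative choices. Note that format() omits leading zeros, so 0 and 1 appear as 0 and 1 rather than 00 and 01:

```python
# Build (base-10, binary) pairs for the numbers 0 through 10.
# format(n, "b") returns the binary digits of n as a string.
rows = [(n, format(n, "b")) for n in range(11)]

for decimal, binary in rows:
    print(f"{decimal:>7} | {binary:>6}")
```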
As mentioned previously, the binary system is a base-2 system. This means that each digit of a binary number is paired with a power of 2, so we use those powers to convert between the two systems. Understanding how to convert from base-2 into base-10 and vice versa can help us have a better understanding of the relationship between numbers in the different systems.
Converting from binary into base-10
We will start with an example of converting from a binary number into a base-10 number. Take the number 101101. To convert the number, each digit must be multiplied by the corresponding base-2 power. The binary number consists of 6 digits, so the powers of 2 we will use will be 5, 4, 3, 2, 1, and 0. This means the number is converted as follows:
$$1 \times 2^5 + 0 \times 2^4 + 1 \times 2^3 + 1 \times 2^2 + 0 \times 2^1 + 1 \times 2^0 = 32 + 0 + 8 + 4 + 0 + 1 = 45$$
The binary number 101101 is equivalent to 45 in the base-10 system. In everyday life, we write the numbers in base-10, so we understand the number 45 as it’s written. However, our computers convert this information into binary to be able to process it, so the number becomes the binary number 101101 so that it can easily be read by the computer.
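The positional method just described can be sketched in Python as follows. The function name binary_to_decimal is hypothetical and chosen for illustration; Python's built-in int() with a base of 2 performs the same conversion and is shown for comparison:

```python
def binary_to_decimal(bits: str) -> int:
    """Convert a string of 0s and 1s to its base-10 value."""
    total = 0
    # The right-most digit is paired with 2**0, the next with 2**1, and so on.
    for power, digit in enumerate(reversed(bits)):
        total += int(digit) * 2 ** power
    return total

print(binary_to_decimal("101101"))  # 45
print(int("101101", 2))             # built-in equivalent, also 45
```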
Converting from base-10 into binary
Again, let’s start with an example to demonstrate the process of converting from a base-10 number into a binary number. Take the number 591. To convert the base-10 number into binary, we divide the number by 2 repeatedly and record the remainder of each division. If the division has no remainder, we write a 0. If the division has a remainder of 1, we write a 1.
The first digit we write is the right-most digit of the binary number. Every digit after that is inserted to the left of the existing digits.
When we divide 591 by 2, the result is 295 with a remainder of 1. That means our right-most number, which is our first number, is 1.
Now, divide 295 by 2. The result is 147 with a remainder of 1. So, we insert a 1 to the left of the 1. Our number is now 11.
Now, divide 147 by 2. The result is 73 with a remainder of 1. Our result is now 111. Now, we’ll carry out further divisions:
- 73 ÷ 2 = 36 with a remainder of 1. Our number is now 1111.
- 36 ÷ 2 = 18 with no remainder. Our number is now 01111.
- 18 ÷ 2 = 9 with no remainder. Our number is now 001111.
- 9 ÷ 2 = 4 with a remainder of 1. Our number is now 1001111.
- 4 ÷ 2 = 2 with no remainder. Our number is now 01001111.
- 2 ÷ 2 = 1 with no remainder. Our number is now 001001111.
- 1 ÷ 2 = 0 with a remainder of 1. Our number is now 1001001111.
The number 591 in base-10 is equivalent to the number 1001001111 in the binary system.
Another way to convert this number is to use a table for the divisions:
Starting base-10 number | Result of dividing by 2 | Remainder
591 | 295 | 1
295 | 147 | 1
147 | 73 | 1
73 | 36 | 1
36 | 18 | 0
18 | 9 | 0
9 | 4 | 1
4 | 2 | 0
2 | 1 | 0
1 | 0 | 1
Table 1.1 – Converting base-10 number 591 into binary
Using this table, take the numbers from the right-most column and write them starting with the last row from bottom to top. The result is 1001001111.
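The repeated-division steps above can be sketched as a short Python function. The name decimal_to_binary is hypothetical, and the sketch assumes a non-negative integer as input:

```python
def decimal_to_binary(number: int) -> str:
    """Convert a non-negative base-10 integer to a binary string."""
    if number == 0:
        return "0"
    digits = ""
    while number > 0:
        # divmod returns the result of the division and its remainder.
        number, remainder = divmod(number, 2)
        # Each new remainder goes to the left of the digits found so far.
        digits = str(remainder) + digits
    return digits

print(decimal_to_binary(591))  # 1001001111
```

Python's built-in bin() function performs the same conversion, returning the digits with a 0b prefix.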
Learning how to convert numbers is only a small part of converting data into binary, but it is an important one. All information, including letters and symbols, must be converted into binary to be read by a computer. The American Standard Code for Information Interchange (ASCII) is a character-encoding standard that has been adopted almost universally to convert text. That said, ASCII covers only 128 characters and some of its control codes are obsolete, so other standards use ASCII as a base to expand their capabilities. Unicode is a superset of ASCII, and UTF-16 is a widely used encoding of Unicode that stores each character as one or two 16-bit units.
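To see how characters map to numbers and bytes, consider the following minimal sketch, which uses only Python built-ins. The sample text "Hi" and the variable names are illustrative assumptions:

```python
# ord() gives the numeric code of a character; chr() goes the other way.
code_point = ord("A")              # 65 in both ASCII and Unicode
bits = format(code_point, "08b")   # the same value written as 8 binary digits

# Encoding a string turns it into bytes according to a chosen encoding.
ascii_bytes = "Hi".encode("ascii")
utf16_bytes = "Hi".encode("utf-16-le")  # 16-bit units, little-endian byte order

print(code_point, bits, chr(code_point))
print(list(ascii_bytes))
print(list(utf16_bytes))
```

In this sketch, each character takes one byte in ASCII and two bytes in UTF-16, which is why the second list is twice as long as the first.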
In this section, we learned that information must be encoded or converted for a computer to read it. Multiple systems and protocols exist, but for now, we will move on to computer science theory. That said, revisiting binary, ASCII, and Unicode as you work through problems can be helpful.
Understanding theoretical computer science
While you don’t need to be a master mathematician to love computer science, these two subjects are intrinsically tied. Computer science, particularly programming, uses algebraic algorithms. We will explore algorithms in depth later on, but again, the important point here is that they are mathematical. The logical processes stem from the philosophical nature and history of mathematics. Now, if mathematical topics are not to your liking, don’t despair. The logical processes needed to become a programmer and developer can be used without you having to learn higher mathematics. Knowing higher mathematics just simplifies some concepts for those who have that background.
Theoretical computer science includes multiple theories and topics. Some of these topics and theories are listed as follows, but keep in mind that other topics are also included in theoretical computer science that may not be discussed in this book. A short description and explanation of each of the theories or terms listed here have been included for you to review:
- Algorithms
- Coding theory
- Computational biology
- Data structures
- Cryptography
- Information theory
- Machine learning
- Automata theory
- Formal language theory
- Symbolic computation
- Computational geometry
- Computational number theory
We will look at the aforementioned theories in the following sections.
Algorithms
An algorithm is a set of instructions that a computer can read. Algorithms provide rules or instructions in a way in which a computer can logically process the information provided as input and create an output. In most books, you are introduced to the algorithm and programming by creating a Hello World! program. I won’t make this book the exception.
In Python, the code would require that we print the message to the screen. Because the Python language is easy to learn and read, much, if not most, of its code strives to be logical. So, to print a message to the screen, we can use the print() function. Here is the code we’d use:
print("Hello world!")
Similarly, we could use the following code:
print('Hello world!')
Python treats both " and ' the same way when it comes to strings, as long as the opening and closing quotation marks match.
The result of the preceding code looks like this when we run the algorithm:
Figure 1.3 – The Hello World! Python program
Note
Don’t worry – we’ll discuss the Python programming language later in Chapter 2, Elements of Computational Thinking, and in more depth in Part 2, Applying Python and Computational Thinking, starting with Chapter 9, Introduction to Python, as well.
Although the discussion is lengthy, algorithms are critically important to this book and your progression with Python. Consequently, we will explore algorithms in depth in Chapter 2, Elements of Computational Thinking, and Chapter 3, Understanding Algorithms and Algorithmic Thinking, since algorithms are a key element of the computational thinking process.
Important note
Chapter 2, Elements of Computational Thinking, will focus on the computational thinking process itself, which has four elements: decomposition, pattern recognition, pattern generalization and abstraction, and algorithm design. As you can see, the last element is algorithm design, so we will need to get more acquainted with what an algorithm is and how we can create one so that you can then implement and design algorithms when solving problems with Python. Chapter 3, Understanding Algorithms and Algorithmic Thinking, will focus on a deeper understanding of algorithms and introduce you to the design process.
We’ll look at coding theory next.
Coding theory
Coding theory is also sometimes known as algebraic coding theory. When working with code and coding theory, three major areas are studied: data compression, error correction, and cryptography. We will cover these in more detail in the following sections.
Data compression
The importance of data compression cannot be overstated. Data compression allows us to store the maximum amount of information possible while taking up the least amount of space. In other words, data compression is the process of using the fewest number of bits to store the data.
Important note
Remember that a bit is the smallest unit of data you can find in a computer – that is, a 0 or a 1. A group of 8 bits is called a byte. We use bytes as a unit of measurement for the size of the memory of a computer or storage device, such as a memory card or external drive, and more.
As our technology and storage capacities have grown and improved, our ability to store additional data has as well. Historically, computers had kilobytes or megabytes of storage when they were first introduced into households, but at the time of writing, they now have gigabytes and terabytes worth of storage. The conversions for each of these storage units are shown here:
Figure 1.4 – Byte conversions
If you look for information online, you may find that some sources state that there are 1,024 gigabytes in a terabyte. That is a binary conversion. In the decimal system or base-10 system, there are 1,000 gigabytes per terabyte. To understand conversion better, it is important to understand the prefixes that apply to the base-10 system and the prefixes that apply to the binary system:
Base-10 Prefixes | Value | Binary Prefixes | Value
kilo | 1,000 | kibi | 1,024
mega | 1,000^2 | mebi | 1,024^2
giga | 1,000^3 | gibi | 1,024^3
tera | 1,000^4 | tebi | 1,024^4
peta | 1,000^5 | pebi | 1,024^5
exa | 1,000^6 | exbi | 1,024^6
zetta | 1,000^7 | zebi | 1,024^7
yotta | 1,000^8 | yobi | 1,024^8
Table 1.2 – Base-10 and binary prefixes with values
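The difference between the two families of prefixes can be checked with a few lines of arithmetic. This is a minimal sketch; the variable names are illustrative:

```python
# Base-10 prefixes are powers of 1,000; binary prefixes are powers of 1,024.
terabyte = 1000 ** 4   # bytes in one terabyte
tebibyte = 1024 ** 4   # bytes in one tebibyte

gigabytes_per_terabyte = terabyte // 1000 ** 3
gibibytes_per_tebibyte = tebibyte // 1024 ** 3

print(gigabytes_per_terabyte)   # 1000
print(gibibytes_per_tebibyte)   # 1024
print(tebibyte - terabyte)      # how many more bytes a tebibyte holds
```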
As mentioned, the goal is always to use the fewest bits for the largest amount of data possible. Therefore, we compress, or reduce, the size of data to use less storage.
So, why is data compression so important? Let’s go back in time to 2000. At that time, a laptop computer on sale for about $1,000 had about 64 MB of Random Access Memory (RAM) and 6 GB of hard drive storage. A photograph taken with a modern smartphone takes anywhere from 2 to 5 megabytes of storage when we use its actual size. That means those computers couldn’t store many (and in some cases, any) of the pictures we take now. Data compression advances allow us to store more data, create better games and applications, and much more, as we can have better graphics and additional information or code without having to worry as much about the amount of storage they use.
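As a small illustration of compression, the following sketch uses zlib, a lossless compression module from Python's standard library. The sample data is an assumption made up for this example, and photographs are usually compressed with different, image-specific methods:

```python
import zlib

# Highly repetitive data compresses well.
original = b"computational thinking " * 200
compressed = zlib.compress(original)
restored = zlib.decompress(compressed)

print(len(original), "bytes before compression")
print(len(compressed), "bytes after compression")
print(restored == original)  # lossless: the original data is recovered exactly
```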
Error correction
In computer science, errors are a fact of life. We make mistakes in our processes, our algorithms, our designs, and everything in between. Error correction, also known as error handling, is the process a computer goes through to automatically correct an error or multiple errors, which happens when digital data is transmitted incorrectly.
An Error Correction Code (ECC) can help us analyze data transmissions. ECC locates and corrects transmission errors. In computers, ECC is also built into some memory and storage hardware so that common internal data corruption problems can be identified. For example, ECC makes it possible to read damaged codes, such as a Quick Response (QR) code with a missing piece. An example of ECC is Hamming codes. A Hamming code is a binary linear code that can detect up to two-bit errors or correct one-bit errors. This means that if up to two bits of data are corrupted during transmission, the receiver will know that an error occurred, and if only one bit is corrupted, the receiver can reconstruct the original data with no errors.
Important note
Hamming codes are named after Richard Wesley Hamming, who discovered them in 1950. Hamming was a mathematician who worked with coding related to telecommunications and computer engineering.
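To make the idea concrete, here is a minimal sketch of one common Hamming code, Hamming(7,4), which protects 4 data bits with 3 parity bits and can correct a single flipped bit. The function names are hypothetical, and the bit layout (parity bits at positions 1, 2, and 4) is one conventional arrangement rather than the only one:

```python
def hamming74_encode(data):
    """Encode 4 data bits as a 7-bit codeword (positions 1 to 7)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4   # checks positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4   # checks positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4   # checks positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]

def hamming74_correct(codeword):
    """Return (corrected codeword, error position); position 0 means no error found."""
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    # The three check results, read as a binary number, give the position of the bad bit.
    position = s1 + 2 * s2 + 4 * s4
    if position:
        c[position - 1] ^= 1  # flip the bad bit back
    return c, position

sent = hamming74_encode([1, 0, 1, 1])
received = list(sent)
received[4] ^= 1  # simulate a single-bit error at position 5
fixed, where = hamming74_correct(received)
print(sent, received, fixed, where)
```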
A simpler, related technique is the parity bit. A parity bit is an extra bit added to a group of bits so that the receiver can check whether a bit has been changed or lost. On its own, a parity bit can detect an error but cannot correct it. Error detection and correction are important for all software that’s developed because any updates, changes, or upgrades can lead to the entire program or parts of the program or software being corrupted.
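A parity check can be sketched in a few lines. The following example uses even parity, where the extra bit is chosen so that the total number of 1s is even. The function names and the sample bits are illustrative assumptions:

```python
def add_even_parity(bits):
    """Append a parity bit so that the total number of 1s is even."""
    return bits + [sum(bits) % 2]

def parity_ok(bits_with_parity):
    """Return True if the number of 1s is still even."""
    return sum(bits_with_parity) % 2 == 0

sent = add_even_parity([1, 0, 1, 1, 0, 0, 1])
received = sent.copy()
received[2] ^= 1  # simulate one bit flipped in transit

print(parity_ok(sent))      # True
print(parity_ok(received))  # False
```

A single parity bit only reveals that something changed. It does not say which bit is wrong, and two flipped bits cancel each other out and go unnoticed.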
Cryptography
Cryptography is used in computer science to hide information. In cryptography, information or data is written so that it can’t be read by anyone other than the intended recipient of the message. In simple terms, cryptography takes readable text or information and converts it into unreadable text or information.
When we think about cryptography now, we tend to think of encryption of data. Coders encrypt data by converting it into code that cannot be read by unauthorized users. However, cryptography has been around for centuries – that is, it pre-dates computers. Historically, the first uses of cryptography were found around 1900 BC in a tomb in Egypt. Atypical or unusual hieroglyphs were mixed with common hieroglyphs at various parts of the tomb.
The reason for these unusual hieroglyphs is unknown, but the messages were hidden from others with their use. Later on, cryptography would be used to communicate in secret by governments and spies, in times of war and peace. Nowadays, cryptography is used to encrypt data since our information exists in digital format, so protecting sensitive information, such as banking, demographic, or personal data, is important.
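The idea of turning readable text into unreadable text can be illustrated with a Caesar cipher, a very old substitution cipher that shifts every letter by a fixed number of places. This is a toy sketch for illustration only; it is trivially broken and is not how modern data is protected. The function name caesar is hypothetical:

```python
import string

def caesar(text: str, shift: int) -> str:
    """Shift each letter by `shift` places, wrapping around the alphabet."""
    lower = string.ascii_lowercase
    upper = string.ascii_uppercase
    shift %= 26
    table = str.maketrans(
        lower + upper,
        lower[shift:] + lower[:shift] + upper[shift:] + upper[:shift],
    )
    # Characters that are not letters (spaces, punctuation) are left unchanged.
    return text.translate(table)

secret = caesar("Hello world!", 3)
print(secret)              # Khoor zruog!
print(caesar(secret, -3))  # Hello world!
```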
We will be exploring the various topics surrounding coding theory through some of the problems presented throughout this book.
Computational biology
Computational biology is the area of theoretical computer science that focuses on the study of biological data and bioinformatics. Bioinformatics is a science that allows us to collect biological data and analyze it. An example of bioinformatics is collecting and analyzing genetic codes. In the study of biology, large quantities of data are explored and recorded.
Studies can be wide-ranging in topics and interdisciplinary. For example, a genetic study may include data from an entire state, an entire race, or an entire country. Some areas within computational biology include molecules, cells, tissues, and organisms. Computational biology allows us to study the composition of these things, from the most basic level to the larger organism. Bioinformatics and computational biology provide a structure for experimental studies in these areas, create predictions and comparisons, and provide us with a way to develop and test theories.
Computational thinking and coding allow us to process that data and analyze it. In this book, the problems presented will allow us to explore ways in which we can use Python in conjunction with computational thinking to find solutions to complex problems, including those in computational biology.
Data structures
In computer science, we use data structures to collect and organize data. The goal is to prepare the data so that we can perform operations efficiently and effectively. Data structures can be primitive or abstract. Programming languages have built-in data structures, which are primitive, and we can also define our own using the language. A primitive data structure is predefined. Some primitive data structures include integers, characters (chars), and Boolean structures. Examples of abstract or user-defined data structures include arrays and two-dimensional arrays, stacks, trees and binary trees, linked lists, queues, and more.
User-defined data structures have different characteristics. For example, they can be linear or non-linear, homogeneous or non-homogeneous, and static or dynamic. If we need to arrange data in a linear sequence, we can use an array, which is a linear data structure. If our data is not linear, we can use non-linear data structures, such as graphs. When we have data that is of a similar type, we use homogeneous data structures.
Keep in mind that an array, for example, is both a linear and homogeneous data structure. Non-homogeneous or heterogeneous data structures have dissimilar data. An example of a non-homogeneous data structure a user can create is a class. The difference between a static and a dynamic data structure is that the size of a static structure is fixed, while a dynamic structure is flexible in size. To build a better understanding of data structures, we will explore them through problem-solving by using various computational thinking elements. We will revisit data structures very briefly at the end of this chapter since they relate to data types, which we will discuss shortly.
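A few of the structures mentioned above can be sketched in Python as follows. All names, such as temperatures and Student, are hypothetical. Note also that Python lists do not enforce a single data type, so treating one as a homogeneous array is a convention in this sketch rather than a language rule:

```python
from collections import deque
from dataclasses import dataclass

# Primitive values: an integer, a character (a one-character string in Python), a Boolean.
count, initial, is_ready = 42, "A", True

# A list used as a linear, dynamic structure: it grows as items are added.
temperatures = [68, 70, 72]
temperatures.append(71)

# A stack: the last item added is the first one removed.
stack = [1, 2, 3]
top = stack.pop()

# A queue: the first item added is the first one removed.
queue = deque(["a", "b", "c"])
first = queue.popleft()

# A class groups dissimilar (heterogeneous) data under one name.
@dataclass
class Student:
    name: str
    age: int
    enrolled: bool

student = Student("Ada", 20, True)
print(temperatures, top, first, student)
```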
Information theory
Information theory is defined as a mathematical study that allows us to code information so that it can be transmitted through computer circuits or telecommunications channels. The information is transmitted through sequences that may contain symbols, impulses, and even radio signals.
In information theory, computer scientists study the quantification of information, data storage, and information communication. Information can be either analog or digital in information theory. Analog data refers to information represented by an analog signal. In turn, an analog signal is a continuous wave that changes over a given time. A digital signal displays data as binary – that is, as a discrete wave. We represent analog waves as sine waves and digital waves as square waves. The following graph shows a sine curve as a function of value over time:
Figure 1.5 – Analog signal
An analog signal is described by the key elements of a sine wave: amplitude, period, frequency, and phase shift:
- The amplitude is the height of the curve from its center. A sine curve repeats infinitely.
- The period refers to the length of one cycle of the sine curve – that is, the length of the curve before it starts to repeat.
- The frequency and the period of the sine curve have an inverse relationship:
$$\text{frequency} = \frac{1}{\text{period}}$$
Concerning the inverse relationship, we can also say the following:
$$\text{period} = \frac{1}{\text{frequency}}$$
- The phase shift of a sine curve is how much the curve shifts from 0. This is shown in the following graph:
Figure 1.6 – Phase shift examples
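The elements of a sine wave listed above can be combined into a single function. The following is a minimal sketch that assumes the phase shift is measured in the same units as time; the function and parameter names are illustrative:

```python
import math

def sine_wave(t, amplitude=1.0, period=1.0, phase_shift=0.0):
    """Value at time t of a sine curve with the given amplitude, period, and phase shift."""
    frequency = 1 / period  # frequency and period are inverses of each other
    return amplitude * math.sin(2 * math.pi * frequency * (t - phase_shift))

# A quarter of the way through a cycle, the curve reaches its amplitude.
print(sine_wave(0.25, amplitude=2.0))

# One full period later, the curve has the same value again.
print(sine_wave(0.3, period=0.5), sine_wave(0.8, period=0.5))
```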
In contrast, digital signal graphs look like bar graphs or histograms. They only have two possible values, 0 or 1, so they look like boxy hills and valleys:
Figure 1.7 – Digital signal
Digital signals have finite sets of discrete data. A dataset is discrete in that it contains individual and distinct data points. For analog signals, the data is continuous and infinite. When working with computer science, both types of signals are important and useful. We will explore digital signals in some of the problems throughout the book, specifically in the problems presented in Chapter 17, Applied Computational Thinking Problems.
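A digital signal of this kind can be sketched as a square wave that takes only the values 0 and 1. The function name square_wave and the sample times are assumptions made for this example:

```python
def square_wave(t, period=1.0):
    """Return 1 during the first half of each cycle and 0 during the second half."""
    return 1 if (t % period) < period / 2 else 0

# Sample the signal at a few discrete moments in time.
times = [0.0, 0.25, 0.5, 0.75, 1.0, 1.25]
samples = [square_wave(t) for t in times]
print(samples)  # [1, 1, 0, 0, 1, 1]
```

The list of samples is a finite set of discrete data points, in contrast to the continuous curve of an analog signal.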
Automata theory
Automata theory is one of the most fascinating topics in theoretical computer science. It refers to the study of machines and how calculations can be completed reliably and efficiently. Automata theory involves the physical aspects of simple machines, as well as logical processing. So, what exactly are automata used for and how do they work?
Automata are devices that use predetermined conditions to respond to outside input. When you look at your thermostat, you’re working with an automaton. You set the temperature you want and the thermostat reacts to an outside source to gather information and adjust the temperature accordingly.
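The thermostat can be modeled as a small state machine that moves between states based on the temperature it reads. This is a simplified sketch; the state names, the target of 70 degrees, and the tolerance of 1 degree are all assumptions made for illustration:

```python
def next_state(state, temperature, target=70, tolerance=1):
    """Return the thermostat's next state given the current temperature reading."""
    if temperature < target - tolerance:
        return "heating"
    if temperature > target + tolerance:
        return "cooling"
    if state == "heating" and temperature < target:
        return "heating"  # keep heating until the target is reached
    if state == "cooling" and temperature > target:
        return "cooling"  # keep cooling until the target is reached
    return "idle"

state = "idle"
history = []
for reading in [70, 67, 69, 70, 73, 71, 70]:
    state = next_state(state, reading)
    history.append(state)

print(history)
```

The same reading can lead to different behavior depending on the current state. For example, a reading of 69 keeps the thermostat heating if it was already heating, but leaves it idle otherwise.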
Another example of automata is surgical robots. These robots can improve the outcomes of surgeries for patients and are being improved upon constantly. Since the goal of automata theory is to make machines that are reliable and efficient, it is a critical piece in developing artificial intelligence and smart robotic machines such as surgical robots.
Formal language theory
Formal language theory is often tied to automata theory in computer science. Formal language theory involves studying the syntax, grammar, vocabulary, and everything else involving a formal language. In computer science, formal language refers to the logical processing and syntax of computer programming languages. Concerning automata, the machines process the formal language to perform the tasks or code provided for them.
Symbolic computation
Symbolic computation is a branch of computational mathematics that deals with computer algebra. The terms symbolic computation and computer algebra are sometimes used interchangeably. Some programming software and languages focus on the symbolic computations of mathematics formulas. Programs that use symbolic computation perform operations such as polynomial factorization, simplifying algebraic functions or expressions, finding the greatest common divisor of polynomials, and more.
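As an illustration of one of these operations, the following sketch finds the greatest common divisor of two polynomials using Euclid's algorithm. It uses only Python's standard library and represents a polynomial as a list of coefficients, highest degree first. This representation and the function names are assumptions made for this example; dedicated computer algebra libraries use more sophisticated methods:

```python
from fractions import Fraction

def strip_zeros(poly):
    """Drop leading zero coefficients (coefficients are listed highest degree first)."""
    while poly and poly[0] == 0:
        poly = poly[1:]
    return poly

def poly_remainder(num, den):
    """Remainder of the polynomial long division of num by den."""
    num = list(num)
    while len(num) >= len(den):
        factor = num[0] / den[0]
        for i, coefficient in enumerate(den):
            num[i] -= factor * coefficient
        num.pop(0)  # the leading term is now zero
    return strip_zeros(num)

def poly_gcd(a, b):
    """Greatest common divisor of two polynomials, scaled so its leading coefficient is 1."""
    a = strip_zeros([Fraction(c) for c in a])
    b = strip_zeros([Fraction(c) for c in b])
    while b:
        a, b = b, poly_remainder(a, b)
    return [c / a[0] for c in a]

# x^2 - 1 and x^2 - 2x + 1 share the factor x - 1, whose coefficients are 1 and -1.
common = poly_gcd([1, 0, -1], [1, -2, 1])
print(common)
```

Exact fractions are used instead of floating-point numbers so that no rounding errors creep into the division steps.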
In this book, we will use computer algebra and symbolic computation when solving some real-world problems. Python allows us to not only perform the mathematical computations that may be required for problems but also explore graphical representations or models that result from those computations. As we explore solutions to real-world problems, we will need to use various libraries or extensions of the Python programming language. More on that will be provided in Part 2, Applying Python and Computational Thinking, of this book, where we will explore the Python programming language in greater detail.
Computational geometry
Like symbolic computation, computational geometry lives in the branch of computer science that deals with computational mathematics. The algorithms we study in computational geometry are those that can be expressed with geometry. The data is analyzed via geometric figures, geometric analysis, data structures that follow geometric patterns, and more. The input and output of problems that require computational geometry are geometric.
When thinking of geometry, we often revert to the figures we mostly associate with that branch of mathematics, such as polygons, triangles, and circles. That said, when we look at computational geometry, some of the algorithms are those that can be expressed by points, lines, other geometric figures, or those that follow a geometric pattern. Triangulation falls under this branch of computer science.
Data triangulation is important for applications such as optical 3D measuring systems. We triangulate GPS signals to locate a phone, for example, which is used in law enforcement.
There are many uses of triangulation in modern times, some of which we’ll explore through real and relevant problems throughout this book.
Computational number theory
Number theory is a branch of mathematics that studies integers and their properties. So, computational number theory involves studying algorithms that are used to solve problems in number theory. Part of the study of number theory is primality testing.
Algorithms that are created to determine whether input or output is prime are used for many purposes. One of the most critically important uses and applications of primality testing and number theory is for encryption purposes. As our lives have moved to saving everything electronically, our most personal information, such as banking information, family information, and even social security numbers, lives in some code or algorithm. It is important to encrypt such information so that others cannot use or access it. Computational number theory and cryptography are intrinsically tied, as you will explore later.
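A basic primality test can be sketched with trial division, which checks whether any smaller number divides the input evenly. The function name is_prime is hypothetical. This approach is fine for small numbers, while encryption in practice relies on much faster tests because the numbers involved are extremely large:

```python
import math

def is_prime(n: int) -> bool:
    """Return True if n is prime, using trial division up to the square root of n."""
    if n < 2:
        return False
    if n < 4:
        return True  # 2 and 3 are prime
    if n % 2 == 0:
        return False
    # Only odd divisors up to the square root need to be checked.
    for divisor in range(3, math.isqrt(n) + 1, 2):
        if n % divisor == 0:
            return False
    return True

primes = [n for n in range(30) if is_prime(n)]
print(primes)  # [2, 3, 5, 7, 11, 13, 17, 19, 23, 29]
```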
Some of the theories presented are meant to help you understand how intertwined computer science theories and their applications are, as well as their relevance to what we do each day.
In this section, we learned about theoretical computer science and several of its theories. Throughout this book, we will be using computational thinking (discussed further in Chapter 2, Elements of Computational Thinking) to help us tackle problems, from the most basic applications to some complex analyses, by defining and designing adequate algorithms that use these theories. Theoretical computer science also informs the study of a system’s software, which we will explore next.
Learning about a system’s software
System software performs multiple functions, including communicating between the OS of a computer, peripherals such as a keyboard and mouse, and firmware, which is permanently saved to a device and is needed for its operation. It is one of the two main types of software: system software and application software.
System software allows a computer to communicate between hardware and applications. Think of a smartphone. In its most basic form, a phone is composed of hardware, which includes a battery, cameras, memory, screen, and all the physical components and peripherals. The OS allows those components to be used by applications.
Take the camera application of a phone. The system software lets the application communicate with the phone to use the camera to take a picture, edit it, save it, and share it. A computer’s OS also allows the hardware to communicate with programs. A design program will use the mouse or other peripherals that can be used to draw, create, use a touch screen if available, and more.
If we do not know our system’s software, we cannot create applications that communicate effectively with our hardware. The resulting errors can range from critical, such as rendering a peripheral useless, to minor, where some functions may work, say taking a picture, while others may not, such as saving or sharing the picture. The system software is created in such a way that it provides us with the easiest, most efficient way to communicate between the hardware and applications. To do this, systems use an OS. Let’s take a look at what OSs are and what they do.
Operating systems
The OS performs multiple tasks. As you may recall, error handling is part of an OS that checks for the most common possible errors to fix them without creating a larger problem or rendering an application worthless. Error handling is one of the OS’s most important tasks. In addition, the OS is responsible for the security of your computer or device. If you have a smartphone, you know that many updates to the OS are done to fix a security problem or prevent a security breach. The OS is responsible for only allowing an authorized user to interact with the content that is stored in the device.
In addition to security and error handling, an OS is responsible for allocating memory for files and organizing them. When we delete a file or program, the memory that it had used is freed. However, other data may be stored immediately before and immediately after that freed space. The OS allocates and reallocates memory to maintain the best performance possible by the device. Memory management not only refers to user-saved files but also to the RAM.
The file management of a device is also run by the OS. The OS allocates the information as a filesystem, breaking the information into directories that can easily be accessed by the user and the device. The filesystem is responsible for keeping track of where files are, both from the OS and the user, the settings for access to the device, which are evolving constantly, and how to access the files and understand the statuses of those files. Access to devices has changed in recent years.
While computers typically use a username and password, many devices can now be accessed through a fingerprint, a numerical or alpha-numerical passcode, facial recognition, images, paths, and more. As any of these topics evolve, the OS evolves as well and needs to be updated or recreated. The OS is also responsible for allowing communication between the applications and the device.
Application software
Application software refers to software applications that perform a particular task. Think of the applications, or apps, that you can access from a mobile device. There are hundreds of types of applications, such as static games that live on a device, games that allow you to play others remotely, news applications, eBook readers, fitness training apps, alarms, clocks, music, and so much more! Applications always perform some form of task, be it for personal use, business use, or educational use.
Application software has multiple functions. You may find suites for productivity, such as Microsoft Office and Google products. When we need to research on the internet, we use applications called browsers, which allow us to locate and retrieve information. These browsers include Google Chrome, Safari, Firefox, Edge, Opera, and others. Browsers are used on both mobile devices and computers. Keep in mind that the purpose of an app is to perform a specific task for the end user.
Important note
As an aside, applications have grown exponentially since computers became household tools and phones started being used for other things rather than just for calling others. Early computers were used for just that: computing, or calculating mathematical analyses and tasks. That’s one of the reasons it is so important to have an understanding of the development and history of computer science. Since we cannot completely predict future uses of computer science and system software, the more we know about them, the more we will be able to create and adapt when technological advances happen.
In this section, we learned about system software. We also learned about OS software and application software. For this book, some applications will be more important as we sort through some of the problems presented, such as databases, productivity software, enterprise resource planning, and educational software.
In the next section, we’ll start to explore more about computing and how computers have an architecture that allows software and hardware to interact.
Understanding computing
In computer science, computing refers to the activities that computers perform to communicate, manage, and process information. Computing is usually divided into four main areas: algorithms, architecture, programming languages, and theory.
Since we discussed theory and algorithms in previous sections, we will now focus on defining architecture and programming languages.
Architecture
Computer architecture refers to the set of instructions that interact with computer systems. In more basic terms, the architecture includes the instructions that allow software and hardware to interact. Computer architecture has three main subcategories:
- Instruction Set Architecture (ISA)
- Microarchitecture
- System Design
Instruction Set Architecture (ISA)
The ISA is the boundary that exists between hardware and software. It is classified in multiple ways, but two common ones are complex instruction set computer (CISC) and reduced instruction set computer (RISC):
- CISC is a computer that has explicit instructions for many tasks, such as simple mathematical operations and loading something from memory. CISC includes everything that is not included in RISC.
- RISC is a computer with an architecture that has reduced cycles per instruction (CPI).
CISC tries to do more with fewer instructions, while RISC only uses simple instructions. CISC is multi-step, while RISC is single-step, performing one task at a time. The CISC process includes instructions, microcode conversion, microinstructions, and execution. In contrast, RISC includes instructions and execution.
In CISC, microcode conversion refers to interpreting language at a lower level. It considers the hardware resources to create microinstructions. Microinstructions are single instructions in microcode. After the microcode creates the microinstructions, the microinstructions can be executed. The following diagram shows the process for both RISC and CISC:
Figure 1.8 – Difference between RISC and CISC
Both RISC and CISC are relevant to computer programmers. There are advantages and disadvantages to having a single-step process (RISC) versus a multi-step process (CISC). RISC reduces the cycles per instruction, doing one thing at a time. CISC reduces the number of instructions in a program but at the cost of more cycles per instruction. Depending on what our needs are, we can choose the best path to take.
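The trade-off can be sketched with a toy model in Python. The instruction names (MULT, LOAD, MUL, STORE), the register names, and the tiny "machine" are all hypothetical and do not correspond to any real instruction set; the sketch only shows how one complex instruction can stand in for several simple ones.

```python
# Toy model only: the instruction names and the tiny "machine" below are
# hypothetical and do not correspond to any real instruction set.

def run_cisc(memory):
    """One complex instruction: multiply two memory cells in a single step."""
    program = [("MULT", "a", "b")]          # operates directly on memory
    for op, dst, src in program:
        if op == "MULT":
            memory[dst] = memory[dst] * memory[src]
    return len(program)                     # number of instructions issued


def run_risc(memory):
    """The same work expressed as simple load/compute/store instructions."""
    program = [
        ("LOAD", "r1", "a"),    # register r1 <- memory[a]
        ("LOAD", "r2", "b"),    # register r2 <- memory[b]
        ("MUL", "r1", "r2"),    # r1 <- r1 * r2 (registers only)
        ("STORE", "a", "r1"),   # memory[a] <- r1
    ]
    registers = {}
    for op, x, y in program:
        if op == "LOAD":
            registers[x] = memory[y]
        elif op == "MUL":
            registers[x] = registers[x] * registers[y]
        elif op == "STORE":
            memory[x] = registers[y]
    return len(program)                     # number of instructions issued


cisc_memory = {"a": 6, "b": 7}
risc_memory = {"a": 6, "b": 7}
cisc_count = run_cisc(cisc_memory)
risc_count = run_risc(risc_memory)
print(cisc_memory["a"], cisc_count)   # 42 1
print(risc_memory["a"], risc_count)   # 42 4
```

Both versions leave the same result in memory. The CISC-style program is shorter, while each RISC-style instruction does less work, which is the reason it can complete in fewer cycles on real hardware.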
Programming languages
Programming languages are the way we write instructions for computers and other devices. Different languages are used depending on what is needed, ease of use, and much more. Examples of programming languages include the following:
- Ruby and Python: Ruby is a programming language mostly used for web applications. Ruby is stable and easy to use; however, many developers choose to use Python over Ruby because Python is faster in many cases and has a larger ecosystem. Although Ruby has not been as popular and had some performance issues, the language is very much alive in 2023 and continues to grow. Python, on the other hand, is widely used for multiple purposes, such as web applications, user interface applications, and websites, among others. It is also one of the languages that is being adopted as schools around the world begin to require programming courses for graduation from secondary schools. We will explore Python in greater depth later in this book.
- C: The C language is a critically important part of computer science. It was not the first programming language, but it is one of the oldest still in widespread use and remains among the most widely used. Dennis Ritchie created C in 1972, and it reached a wider audience in 1978, when the first book describing it was published. While other languages have grown in popularity since, C is still used in 2023. Some of its uses include OSs, hardware drivers, and applications, among others. C is a relatively low-level language, which means it provides little abstraction over the hardware.
- C++: C++ was developed by Bjarne Stroustrup as an extension of C in 1985. The goal of the language was to add object-oriented capabilities. The language is still widely used both in conjunction with the C language in OSs and for other software. C++ is an intermediate-level programming language.
- C#: C sharp (C#) is a high-level programming language. Much like C++, it has object-oriented capabilities and belongs to the C family of languages. One of the main differences between C++ and C# is that C++ is compiled to machine code while C# is compiled to bytecode. Machine code can be executed directly by a computer. Bytecode is a lower-level, intermediate code that a runtime must interpret or compile further before the computer can execute it.
- Swift: The Swift programming language was introduced by Apple Inc. in 2014. As programming languages go, Swift is one of the newest. Apple made it open source in late 2015, and version 2.2 followed in 2016. The language is a general-purpose, compiled programming language, and version 5.7 was released in September 2022. This language is important in the development of apps in the iOS ecosystem for Apple products.
- Scratch: Scratch was developed as a visual programming, block-coding language in 2002 by MIT Media Lab. As a block programming language, it is used extensively in schools to teach students of all ages how to code. Scratch has since been adapted for multiple uses, including robotics applications such as VEXcode, projects that incorporate machine learning and artificial intelligence, and much more. It is compatible with popular classroom peripherals such as Makey Makey, which is a circuit that interacts with a computer and can be fully controlled with a Scratch program. While it is popular for educational purposes, the power of the programming language should not be underestimated, and the language itself and its functionalities continue to grow.
- JavaScript: JavaScript is a scripting language that is used primarily within browsers, although runtimes such as Node.js also allow it to run outside them. It is used to create websites and web applications. Despite the similar name, it is a different language from Java, which is a general-purpose programming language. JavaScript helps us make websites animated or add interactive functionalities to them.
- Java: Java is compiled into bytecode and is widely used to develop applications for Android devices. It is an object-oriented programming language that allows programs to run in any environment that provides a Java Virtual Machine (JVM).
- PHP: PHP is otherwise known as Hypertext Preprocessor. Much like Java, it is a general-purpose programming language. It is widely available and used in website design and applications. PHP is considered to be easy to learn, yet has many advanced features. It can also be used to write desktop applications.
- SQL: Structured query language (SQL) is a programming language that’s used to interact with data. SQL is domain-specific. It has been around for almost as long as C, making its first appearance in 1974. The main importance of SQL is that it is designed specifically for querying and managing relational databases; general-purpose languages usually work with such databases by sending SQL statements to them.
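To make the SQL entry concrete, here is a minimal sketch that sends SQL statements from Python using the standard library's sqlite3 module. The table name, column names, and rows are hypothetical and chosen only for illustration; the years are the ones quoted in the list above.

```python
import sqlite3

# Hypothetical table and rows, used only for illustration.
connection = sqlite3.connect(":memory:")   # temporary in-memory database
connection.execute("CREATE TABLE languages (name TEXT, first_appeared INTEGER)")
connection.executemany(
    "INSERT INTO languages VALUES (?, ?)",
    [("C", 1972), ("SQL", 1974), ("Swift", 2014)],
)

# SQL describes what data we want; the database decides how to fetch it.
rows = connection.execute(
    "SELECT name FROM languages WHERE first_appeared < 1980 ORDER BY first_appeared"
).fetchall()
print(rows)   # [('C',), ('SQL',)]
connection.close()
```

Python handles the surrounding logic here, while the statements passed to execute are plain SQL.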
In computational thinking, we use many different programming languages, depending on what our goals are, what information we have or need, and what our application or software requirements are. Choosing a language is dependent on not just our knowledge of the language, but the possible functionalities of the language.
We will work more extensively with Python in this book because of its open source nature, ease of use, and the large number of applications it can be used for. However, Python is not the only option. Knowing about other languages is important, especially for developers.
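Python also offers a convenient way to see the bytecode idea mentioned in the C# and Java entries, because the standard CPython interpreter compiles source code to bytecode before running it. The following sketch uses the standard library's dis module to inspect that bytecode; the function add_one is a hypothetical example.

```python
import dis


def add_one(x):
    return x + 1


# The compiled bytecode is stored on the function's code object as raw bytes.
bytecode = add_one.__code__.co_code
print(type(bytecode), len(bytecode))

# dis translates those bytes into readable instruction names. The exact names
# differ between Python versions, so no expected output is shown here.
instruction_names = [ins.opname for ins in dis.get_instructions(add_one)]
print(instruction_names)
```

The printed names are instructions for Python's virtual machine, not for the processor, which is what distinguishes bytecode from machine code.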
With that, we’ve learned about computing and a few of its areas, namely, architecture and programming languages. We also learned about the ISA and its types, as well as various programming languages. In the next section, we’ll look at data types and structures.
Learning about data types and structures
In computer science, data types and structures are two distinct things:
- A data type is a basic classification. Some data types include integers, floats, and strings.
- Data structures combine values of one or more data types. They organize the information in memory and determine how we access it.
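As a rough illustration of the distinction in Python, the sketch below checks the type of a few single values and then groups them into two built-in data structures. The variable names are hypothetical.

```python
# Each value has exactly one data type.
print(type(42).__name__)       # int
print(type(3.14).__name__)     # float
print(type("hello").__name__)  # str

# A data structure groups values, possibly of different types, and
# determines how we reach them: by position in a list, by key in a dict.
record = [42, 3.14, "hello"]
lookup = {"answer": 42, "pi": 3.14}
element_types = [type(item).__name__ for item in record]
print(element_types)            # ['int', 'float', 'str']
print(record[0], lookup["pi"])  # 42 3.14
```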
We’ll look at these in more detail in the following sections.
Data types
As mentioned previously, data types are basic classifications. Every variable used in a program holds a value that belongs to exactly one data type at a time. There are different classes of data types. We will focus on primitive and abstract data types for now, but we will revisit this topic as we consider problems and design solutions.
Primitive data types include byte, short, int, long, float, double, Boolean, and char:
- A byte can store numbers from -128 to 127. While these numbers can be stored as integers or ints, a byte uses less storage, so if we know the number is between those values, we can use a byte data type instead.
- A short is a number between -32,768 and 32,767.
- An integer, int, is used to store numbers between -2,147,483,648 and 2,147,483,647.
- Long is used to store numbers from -9,223,372,036,854,775,808 to 9,223,372,036,854,775,807.
- A float allows us to save a decimal number.
- Decimal numbers can also be saved as double, which has more precision than a float.
- Boolean values are data types that are either True or False. So, a variable can be saved so that when its value is printed, the result is displayed as true or false.
- Char is used to save a variable as a single character.
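The ranges listed above follow from the number of bits each type uses. The sketch below assumes the widths that match those ranges (8, 16, 32, and 64 bits, as in languages such as Java); Python's own int has no fixed size, so the helper signed_range is a hypothetical function that only computes the limits. The second part uses the standard struct module to suggest why a double is more precise than a float.

```python
import struct


def signed_range(bits):
    """Smallest and largest value of a signed integer with this many bits."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


# Assumed bit widths; they reproduce the ranges quoted in the list above.
widths = {"byte": 8, "short": 16, "int": 32, "long": 64}
ranges = {name: signed_range(bits) for name, bits in widths.items()}
for name, (low, high) in ranges.items():
    print(f"{name}: {low:,} to {high:,}")

# Round-trip 0.1 through a 4-byte float and an 8-byte double.
as_float = struct.unpack("f", struct.pack("f", 0.1))[0]
as_double = struct.unpack("d", struct.pack("d", 0.1))[0]
print(as_float == 0.1, as_double == 0.1)  # False True
```

The 4-byte float cannot hold 0.1 as accurately as the 8-byte double, so only the double survives the round trip unchanged.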
Data structures
As mentioned in the Coding theory section earlier, data structures are used to collect and organize data efficiently and effectively. Data structures can be primitive, such as the built-in data structures in software, or abstract. Primitive data structures can also be defined using programming languages, but they are predefined. Some primitive data structures include the data types listed in the previous section, such as chars and Boolean structures.
Abstract data types (ADTs) include information that can help structure and design data types. Abstract data structures include arrays and two-dimensional arrays, stacks, trees and binary trees, linked lists, queues, and more, as mentioned in the Coding theory section earlier in this chapter. Lists can contain multiple instances of the same data values. These lists are countable, so we can find how many elements are in the list, reorder them, remove items, add items, and so on. Lists are widely used as linked lists, arrays, or dynamic arrays:
- A linked list means that each data element in the list is connected, or points, to the next one, regardless of where they are stored within the memory.
- An array is ordered. The elements are read in sequence so that they make sense. Think of reading an array as reading this sentence. You don’t read the sentence as “array and think reading as this of sentence.” We read the sentence in order, from left to right, not in a jumbled order.
- Dynamic arrays can be resized, which is important when choosing a data type.
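A minimal sketch of these ideas in Python follows. The Node class and the to_list helper are hypothetical names written for this example, not part of any library; the built-in list is used to stand in for a dynamic array.

```python
class Node:
    """One element of a singly linked list (hypothetical helper class)."""

    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node   # reference to the following node, or None


def to_list(head):
    """Follow the links from head to the end, collecting the values."""
    values = []
    while head is not None:
        values.append(head.value)
        head = head.next
    return values


# Build the chain 1 -> 2 -> 3 by linking each node to the next one.
head = Node(1, Node(2, Node(3)))
print(to_list(head))              # [1, 2, 3]

# Python's built-in list behaves like a dynamic array: it is ordered,
# indexable, and grows when we append.
numbers = [1, 2, 3]
numbers.append(4)
print(numbers[0], len(numbers))   # 1 4
```

In the linked list, reaching the third element means following two links, whereas the array-like list can jump straight to any position by index.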
A stack ADT is a collection of elements and has two operations – push and pop. A push is used to add an element to the collection, while a pop removes the most recent element.
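One simple way to try this out in Python is to treat a built-in list as a stack, with append acting as push and the list's own pop method acting as pop. This is a sketch of the idea rather than the only possible implementation.

```python
stack = []
stack.append("a")   # push
stack.append("b")   # push
stack.append("c")   # push
top = stack.pop()   # pop removes the most recently pushed element
print(top, stack)   # c ['a', 'b']
```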
A queue ADT is a linear data structure. As with a stack, we can add or remove elements. However, in a queue ADT, insertion and deletion happen at opposite ends: elements are added at one end and removed from the other.
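A queue can be sketched in Python with collections.deque from the standard library, adding elements at the back and removing them from the front. Using deque is one reasonable choice here, not the only one.

```python
from collections import deque

queue = deque()
queue.append("a")          # insert at the back
queue.append("b")
queue.append("c")
first = queue.popleft()    # remove from the front
print(first, list(queue))  # a ['b', 'c']
```

The first element added is the first one removed, which is the opposite of the stack's behavior.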
As mentioned previously, data structures are concrete implementations of data types. How we add or remove elements from a collection, for example, is the data structure.
This can all be slightly confusing, but we will learn more about them throughout this book. For now, understanding the definitions and simple examples is enough.
Summary
In this chapter, we learned about some of the fundamentals of computer science. We looked at how to convert from binary into base-10 and vice versa. We also explored topics and theories in theoretical computer science. We learned about computing and data types and structures. These sections allowed us to understand the computational thinking process and how to tackle the different types of problems that will be presented in this book, starting in Chapter 2, Elements of Computational Thinking.
As we delve deeper into the computational thinking world and process, we will need to revisit some of the content of this chapter as we look at problems, search for the best way to solve them, and make decisions about how to write the algorithms.
Problems may have an infinite number of ways to be solved using algorithms. Understanding how processes work and which data structures are most suitable for our problems is imperative in creating the best solutions. Identifying the data types that are needed for these algorithms and how computers read data will only help us with writing the most effective and efficient algorithms.
In the next chapter, we will learn about the computational thinking process and how to break down problems to design our algorithmic solutions.