All applications process data. Data comes in, data is processed, and then data goes out.
Data usually comes into our program from files, databases, or user input, and it can be put temporarily into variables that will be stored in the memory of the running program. When the program ends, the data in memory is lost. Data is usually output to files and databases, or to the screen or a printer. When using variables, you should think about, firstly, how much space the variable takes in the memory, and, secondly, how fast it can be processed.
We control this by picking an appropriate type. You can think of simple common types such as int and double as being different-sized storage boxes, where a smaller box takes less memory but may not be as fast to process; for example, adding 16-bit numbers might not be as quick as adding 64-bit numbers on a 64-bit operating system. Some of these boxes may be stacked close by, and some may be thrown into a big heap further away.
Naming things and assigning values
There are naming conventions for things, and it is good practice to follow them, as shown in the following table:
Naming convention           Examples                               Used for
Camel case                  cost, orderDetail, dateOfBirth         Local variables, private fields.
Title case aka Pascal case  String, Int32, Cost, DateOfBirth, Run  Types, non-private fields, and other members like methods.
Some C# programmers like to prefix the names of private fields with an underscore, for example, _dateOfBirth instead of dateOfBirth. The naming of private members of all kinds is not formally defined because they will not be visible outside the class, so both are valid. My preference is without an underscore.
Good Practice: Following a consistent set of naming conventions will enable your code to be easily understood by other developers (and yourself in the future!).
The following code block shows an example of declaring a named local variable and assigning a value to it with the = symbol. Note that you can output the name of a variable using nameof, a keyword introduced in C# 6.0:
double heightInMetres = 1.88;
Console.WriteLine($"The variable {nameof(heightInMetres)} has the value {heightInMetres}.");
Warning! A statement like this can wrap onto a second line when the width of a printed page is too narrow. When entering it in your code editor, type the message in double quotes on a single line.
Literal values
When you assign to a variable, you often, but not always, assign a literal value. But what is a literal value? A literal is a notation that represents a fixed value. Data types have different notations for their literal values, and over the next few sections, you will see examples of using literal notation to assign values to variables.
Storing text
For text, a single letter, such as an A, is stored as a char type.
Good Practice: Actually, it can be more complicated than that. Egyptian Hieroglyph A002 (U+13001) needs two System.Char values (known as a surrogate pair) to represent it: \uD80C and \uDC01. Do not always assume one char equals one letter or you could introduce hard-to-notice bugs into your code.
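The surrogate pair described above can be observed directly. The following is a minimal sketch, not part of the original walkthrough; the variable names are illustrative, and it relies only on char.ConvertFromUtf32, char.IsSurrogatePair, and System.Globalization.StringInfo:

```csharp
using System;
using System.Globalization;

// U+13001 lies outside the Basic Multilingual Plane, so UTF-16 needs two char values.
string hieroglyph = char.ConvertFromUtf32(0x13001);

// Length counts UTF-16 code units, not letters.
int charCount = hieroglyph.Length;

// StringInfo counts text elements, which is closer to what a reader sees.
int textElementCount = new StringInfo(hieroglyph).LengthInTextElements;

bool isSurrogatePair = char.IsSurrogatePair(hieroglyph[0], hieroglyph[1]);

Console.WriteLine($"char values: {charCount}");           // char values: 2
Console.WriteLine($"text elements: {textElementCount}");  // text elements: 1
Console.WriteLine($"surrogate pair: {isSurrogatePair}");  // surrogate pair: True
Console.WriteLine($"{(int)hieroglyph[0]:X4} {(int)hieroglyph[1]:X4}"); // D80C DC01
```

Code that loops over a string one char at a time would see this single symbol as two separate values, which is the kind of bug the Good Practice note warns about.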
A char is assigned using single quotes around the literal value, or by assigning the return value of a function call, as shown in the following code:
char letter = 'A';
char digit = '1';
char symbol = '$';
char userChoice = GetSomeKeystroke();
For text, multiple letters, such as Bob, are stored as a string type and are assigned using double quotes around the literal value, or by assigning the return value of a function call or constructor, as shown in the following code:
string firstName = "Bob";
string lastName = "Smith";
string phoneNumber = "(215) 555-4256";
string horizontalLine = new('-', count: 74);
string address = GetAddressFromDatabase(id: 563);
string grinningEmoji = char.ConvertFromUtf32(0x1F600);
To output emoji at the command line on Windows, you must use Windows Terminal because Command Prompt does not support emoji, and set the output encoding to use UTF-8, as shown in the following code:
Console.OutputEncoding = System.Text.Encoding.UTF8;
string grinningEmoji = char.ConvertFromUtf32(0x1F600);
Console.WriteLine(grinningEmoji);
Verbatim strings
When storing text in a string variable, you can include escape sequences, which represent special characters like tabs and new lines using a backslash, as shown in the following code:
string fullNameWithTabSeparator = "Bob\tSmith";
But what if you are storing the path to a file on Windows, and one of the folder names starts with a T, as shown in the following code?
string filePath = "C:\televisions\sony\bravia.txt";
The compiler will convert the \t into a tab character, and it will reject \s because that is not a recognized escape sequence, so you will get errors!
You must prefix the literal with the @ symbol to use a verbatim literal string, as shown in the following code:
string filePath = @"C:\televisions\sony\bravia.txt";
Raw string literals
Introduced in C# 11, raw string literals are convenient for entering any arbitrary text without needing to escape the contents. They make it easy to define literals containing other languages like XML, HTML, or JSON.
Raw string literals start and end with three or more double-quote characters, as shown in the following code:
string xml = """
             <person age="50">
               <first_name>Mark</first_name>
             </person>
             """;
Why three or more double-quote characters? That is for scenarios where the content itself needs to have three double-quote characters; you can then use four double-quote characters to indicate the beginning and end of the contents. Where the content needs to have four double-quote characters, you can then use five double-quote characters to indicate the beginning and end of the contents. And so on.
In the previous code, the XML is indented by 13 spaces. The compiler looks at the indentation of the closing three or more double-quote characters, and then automatically removes that level of indentation from all the content inside the raw string literal. The results of the previous code would therefore not be indented as in the defining code, but instead be aligned with the left margin, as shown in the following markup:
<person age="50">
  <first_name>Mark</first_name>
</person>
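The rule about using more double-quote characters in the delimiters than appear in the content can be tried with a small sketch like the following. It assumes C# 11 or later, and the text inside the literal is made up for illustration:

```csharp
using System;

// Four quotes delimit the literal, so a run of three quotes inside it is ordinary content.
string snippet = """"
    The XML example starts with """ and ends with """.
    """";

// The four spaces before the closing delimiter are removed from the content line.
Console.WriteLine(snippet);
```

If the delimiters used only three double-quote characters here, the compiler would treat the first run of three quotes inside the text as the end of the literal.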
Raw interpolated string literals
You can mix interpolated strings that use curly braces { } with raw string literals. You specify the number of braces that indicate a replaced expression by adding that number of dollar signs to the start of the literal. Any fewer braces than that are treated as raw content.
For example, if we want to define some JSON, single braces will be treated as normal braces, but the two dollar symbols tell the compiler that any two curly braces indicate a replaced expression value, as shown in the following code:
var person = new { FirstName = "Alice", Age = 56 };
string json = $$"""
              {
                "first_name": "{{person.FirstName}}",
                "age": {{person.Age}},
                "calculation": "{{{ 1 + 2 }}}"
              }
              """;
Console.WriteLine(json);
The previous code would generate the following JSON document:
{
  "first_name": "Alice",
  "age": 56,
  "calculation": "{3}"
}
The number of dollars tells the compiler how many curly braces are needed for something to become recognized as an interpolated expression.
Summarizing options for storing text
To summarize:
Literal string: Characters enclosed in double-quote characters. They can use escape characters like \t for tab. To represent a backslash, use two: \\.
Raw string literal: Characters enclosed in three or more double-quote characters.
Verbatim string: A literal string prefixed with @ to disable escape characters so that a backslash is a backslash. It also allows the string value to span multiple lines because the whitespace characters are treated as themselves instead of instructions to the compiler.
Interpolated string: A literal string prefixed with $ to enable embedded formatted variables. You will learn more about this later in this chapter.
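To compare the four notations side by side, the following sketch writes the same Windows path in each form. The path and the fileName variable are hypothetical, and the raw string literal needs C# 11 or later:

```csharp
using System;

string fileName = "report.txt"; // hypothetical file name

string literal = "C:\\temp\\report.txt";        // literal string: each backslash is escaped
string verbatim = @"C:\temp\report.txt";        // verbatim string: backslashes taken as written
string raw = """C:\temp\report.txt""";          // raw string literal: no escaping at all
string interpolated = $"C:\\temp\\{fileName}";  // interpolated string: escapes still apply

Console.WriteLine(literal == verbatim);      // True
Console.WriteLine(literal == raw);           // True
Console.WriteLine(literal == interpolated);  // True
```

All four variables hold the same characters at runtime; the notations differ only in how the source code is written.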
Storing numbers
Numbers are data that we want to perform an arithmetic calculation on, for example, multiplying. A telephone number is not a number. To decide whether a variable should be stored as a number or not, ask yourself whether you need to perform arithmetic operations on the number or whether the number includes non-digit characters such as parentheses or hyphens to format the number, such as (414) 555-1234. In this case, the “number” is a sequence of characters, so it should be stored as a string.
Numbers can be natural numbers, such as 42, used for counting (also called whole numbers); they can also include negative numbers, such as -42 (called integers); or they can be real numbers, such as 3.9 (with a fractional part), which are called single- or double-precision floating-point numbers in computing.
Let’s explore numbers:
Use your preferred code editor to add a new Console App/console project named Numbers to the Chapter02 workspace/solution.
If you are using Visual Studio Code, then select Numbers as the active OmniSharp project. When you see the pop-up warning message saying that required assets are missing, click Yes to add them.
If you are using Visual Studio 2022, then set the startup project to the current selection.
In Program.cs, delete the existing code and then type statements to declare some number variables using various data types, as shown in the following code:
uint naturalNumber = 23;
int integerNumber = -23;
float realNumber = 2.3F;
double anotherRealNumber = 2.3;
Storing whole numbers
You might know that computers store everything as bits. The value of a bit is either 0 or 1. This is called a binary number system. Humans use a decimal number system.
The decimal number system, also known as Base 10, has 10 as its base, meaning there are 10 digits, from 0 to 9. Although it is the number base most used by human civilizations, other number base systems are popular in science, engineering, and computing. The binary number system, also known as Base 2, has two as its base, meaning there are two digits, 0 and 1.
The following table shows how computers store the decimal number 10. Take note of the bits with the value 1 in the 8 and 2 columns; 8 + 2 = 10:
128  64  32  16  8  4  2  1
0    0   0   0   1  0  1  0
So, 10 in decimal is 00001010 in binary.
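The same conversion can be checked in code. This is a small sketch using the Convert class; the variable names are illustrative:

```csharp
using System;

int number = 10;

// Convert.ToString with base 2 returns the binary digits; PadLeft fills them out to 8 bits.
string binary = Convert.ToString(number, 2).PadLeft(8, '0');
Console.WriteLine(binary); // 00001010

// Parsing the binary digits as Base 2 gives the original decimal value back.
int roundTrip = Convert.ToInt32(binary, 2);
Console.WriteLine(roundTrip); // 10
```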
Improving legibility by using digit separators
Two of the improvements seen in C# 7.0 and later are the use of the underscore character _ as a digit separator, and support for binary literals.
You can insert underscores anywhere into the digits of a number literal, including decimal, binary, or hexadecimal notation, to improve legibility.
For example, you could write the value for 1 million in decimal notation, that is, Base 10, as 1_000_000.
You can even use the 2/3 grouping common in India: 10_00_000.
Using binary or hexadecimal notation
To use binary notation, that is, Base 2, using only 1s and 0s, start the number literal with 0b. To use hexadecimal notation, that is, Base 16, using 0 to 9 and A to F, start the number literal with 0x.
Exploring whole numbers
Let’s enter some code to see some examples:
In Program.cs, type statements to declare some number variables using underscore separators, as shown in the following code:
int decimalNotation = 2_000_000;
int binaryNotation = 0b_0001_1110_1000_0100_1000_0000;
int hexadecimalNotation = 0x_001E_8480;
Console.WriteLine($"{decimalNotation == binaryNotation}");
Console.WriteLine($"{decimalNotation == hexadecimalNotation}");
Run the code and note the result is that all three numbers are the same, as shown in the following output:
True
True
Computers can always exactly represent integers using the int type or one of its sibling types, such as long and short, provided the value fits within the range of the chosen type.
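Exactness for integers holds only inside each type's range. As a hedged illustration (the variable names are made up for this sketch), the following code adds one to the largest int value, first after widening to long and then in an unchecked int context:

```csharp
using System;

int largest = int.MaxValue; // 2,147,483,647

// Widening to long before adding keeps the result exact.
long exact = (long)largest + 1;

// In an unchecked context, int arithmetic wraps around to the smallest int value instead.
int wrapped = unchecked(largest + 1);

Console.WriteLine(exact == 2_147_483_648L); // True
Console.WriteLine(wrapped == int.MinValue); // True
```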
Storing real numbers
Computers cannot always represent real, aka decimal or non-integer, numbers precisely. The float and double types store real numbers using single- and double-precision floating-point values.
Most programming languages implement the IEEE Standard for Floating-Point Arithmetic. IEEE 754 is a technical standard for floating-point arithmetic established in 1985 by the Institute of Electrical and Electronics Engineers (IEEE).
The following table shows a simplification of how a computer represents the number 12.75 in binary notation. Note the bits with the value 1 in the 8, 4, ½, and ¼ columns.
8 + 4 + ½ + ¼ = 12¾ = 12.75.
128  64  32  16  8  4  2  1  .  ½  ¼  1/8  1/16
0    0   0   0   1  1  0  0  .  1  1  0    0
So, 12.75 in decimal is 00001100.1100 in binary. As you can see, the number 12.75 can be exactly represented using bits. However, some numbers can’t, which is something that we’ll be exploring shortly.
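The claim that 12.75 is stored exactly can be checked with a short sketch. It adds up the column values from the table and then prints the raw 64 bits of the double using BitConverter; the variable names are illustrative:

```csharp
using System;

// Every term is a power of two, so the sum is exactly representable as a double.
double fromColumns = 8 + 4 + 0.5 + 0.25;
Console.WriteLine(fromColumns == 12.75); // True

// The raw IEEE 754 bits end in a long run of zeros because no rounding was needed.
long bits = BitConverter.DoubleToInt64Bits(12.75);
Console.WriteLine($"{bits:X16}"); // 4029800000000000
```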
Writing code to explore number sizes
C# has an operator named sizeof() that returns the number of bytes that a type uses in memory. Some types have members named MinValue and MaxValue, which return the minimum and maximum values that can be stored in a variable of that type. We are now going to use these features to create a console app to explore number types:
In Program.cs, type statements to show the size of three number data types, as shown in the following code:
Console.WriteLine($"int uses {sizeof(int)} bytes and can store numbers in the range {int.MinValue:N0} to {int.MaxValue:N0}.");
Console.WriteLine($"double uses {sizeof(double)} bytes and can store numbers in the range {double.MinValue:N0} to {double.MaxValue:N0}.");
Console.WriteLine($"decimal uses {sizeof(decimal)} bytes and can store numbers in the range {decimal.MinValue:N0} to {decimal.MaxValue:N0}.");
Warning! The width of the printed pages in this book makes the string values (in double quotes) wrap over multiple lines. You must type them on a single line, or you will get compile errors.
Run the code and view the output, as shown in Figure 2.4:
Figure 2.4: Size and range information for common number data types
An int variable uses four bytes of memory and can store positive or negative numbers up to about 2 billion. A double variable uses eight bytes of memory and can store much bigger values! A decimal variable uses 16 bytes of memory and can store big numbers, but not as big as a double type.
But you may be asking yourself, why might a double variable be able to store bigger numbers than a decimal variable, yet it’s only using half the space in memory? Well, let’s now find out!
Comparing double and decimal types
You will now write some code to compare double and decimal values. Although it isn’t hard to follow, don’t worry about understanding the syntax right now:
Type statements to declare two double variables, add them together, and compare them to the expected result. Then, write the result to the console, as shown in the following code:
Console.WriteLine("Using doubles:");
double a = 0.1;
double b = 0.2;
if (a + b == 0.3)
{
  Console.WriteLine($"{a} + {b} equals {0.3}");
}
else
{
  Console.WriteLine($"{a} + {b} does NOT equal {0.3}");
}
Run the code and view the result, as shown in the following output:
Using doubles:
0.1 + 0.2 does NOT equal 0.3
In locales that use a comma for the decimal separator the result will look slightly different, as shown in the following output:
0,1 + 0,2 does NOT equal 0,3
The double type is not guaranteed to be accurate because some numbers like 0.1 cannot be represented exactly as binary floating-point values.
As a rule of thumb, you should only use double when accuracy, especially when comparing the equality of two numbers, is not important. An example of this might be when you’re measuring a person’s height; you will only compare values using greater than or less than, but never equals.
The problem with the preceding code is illustrated by how the computer stores the number 0.1, or multiples of it. To represent 0.1 in binary, the computer stores 1 in the 1/16 column, 1 in the 1/32 column, 1 in the 1/256 column, 1 in the 1/512 column, and so on.
The number 0.1 in decimal is 0.00011001100110011… in binary, repeating forever:
4  2  1  .  ½  ¼  1/8  1/16  1/32  1/64  1/128  1/256  1/512  1/1024  1/2048
0  0  0  .  0  0  0    1     1     0     0      1      1      0       0
Good Practice: Never compare double values using ==. During the First Gulf War, an American Patriot missile battery used an inexact binary approximation of 0.1 in its timing calculations. The accumulated inaccuracy caused it to fail to track and intercept an incoming Iraqi Scud missile, and 28 soldiers were killed. You can read about this at https://www.ima.umn.edu/~arnold/disasters/patriot.html.
Copy and paste the statements that you wrote before (which used the double variables).
Modify the statements to use decimal and rename the variables to c and d, as shown in the following code:
Console.WriteLine("Using decimals:");
decimal c = 0.1M;
decimal d = 0.2M;
if (c + d == 0.3M)
{
  Console.WriteLine($"{c} + {d} equals {0.3M}");
}
else
{
  Console.WriteLine($"{c} + {d} does NOT equal {0.3M}");
}
Run the code and view the result, as shown in the following output:
Using decimals:
0.1 + 0.2 equals 0.3
The decimal type is accurate because it stores the number as a large integer and shifts the decimal point. For example, 0.1 is stored as 1, with a note to shift the decimal point one place to the left. 12.75 is stored as 1275, with a note to shift the decimal point two places to the left.
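The integer-plus-shift storage can be inspected with decimal.GetBits, which returns the four 32-bit parts of a decimal value. In this sketch, the variable names are illustrative; the scale is read from bits 16 to 23 of the last element:

```csharp
using System;

int[] parts = decimal.GetBits(12.75M);

int low32Bits = parts[0];             // low 32 bits of the 96-bit integer
int scale = (parts[3] >> 16) & 0xFF;  // how many places to shift the decimal point left

Console.WriteLine($"integer: {low32Bits}, scale: {scale}"); // integer: 1275, scale: 2
```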
Good Practice: Use int for whole numbers. Use double for real numbers that will not be compared for equality to other values; it is okay to compare double values being less than or greater than, and so on. Use decimal for money, CAD drawings, general engineering, and wherever the accuracy of a real number is important.
The float and double types have some useful special values: NaN represents not-a-number (for example, the result of dividing zero by zero), Epsilon represents the smallest positive number that can be stored in a float or double, and PositiveInfinity and NegativeInfinity represent infinitely large positive and negative values (for example, the result of dividing a non-zero number by zero). They also have methods for checking for these special values like IsInfinity and IsNaN.
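A short sketch shows how these special values arise and how the checking methods behave. The variable names are illustrative, and double is used throughout, although float works the same way:

```csharp
using System;

double zero = 0.0;

double notANumber = zero / zero;  // zero divided by zero has no meaningful result
double positive = 1.0 / zero;     // a positive value divided by zero gives infinity
double negative = -1.0 / zero;    // a negative value divided by zero gives negative infinity

Console.WriteLine(double.IsNaN(notANumber));            // True
Console.WriteLine(double.IsPositiveInfinity(positive)); // True
Console.WriteLine(double.IsNegativeInfinity(negative)); // True
Console.WriteLine(double.IsInfinity(negative));         // True
Console.WriteLine(double.Epsilon > 0);                  // True
```

Note that these floating-point divisions do not throw an exception, unlike dividing an int by zero.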
Storing Booleans
Booleans can only contain one of the two literal values true or false, as shown in the following code:
bool happy = true;
bool sad = false;
They are most often used to branch and loop. You don’t need to fully understand them yet, as they are covered more in Chapter 3, Controlling Flow, Converting Types, and Handling Exceptions.
Storing any type of object
There is a special type named object that can store any type of data, but its flexibility comes at the cost of messier code and possibly poor performance. Because of those two reasons, you should avoid it whenever possible. The following steps show you how to use object types if you need to use them:
Use your preferred code editor to add a new Console App/console project named Variables to the Chapter02 workspace/solution.
If you are using Visual Studio Code, then select Variables as the active OmniSharp project. When you see the pop-up warning message saying that required assets are missing, click Yes to add them.
In Program.cs, delete the existing statements and then type statements to declare and use some variables using the object type, as shown in the following code:
object height = 1.88;
object name = "Amir";
Console.WriteLine($"{name} is {height} metres tall.");
int length1 = name.Length;
int length2 = ((string)name).Length;
Console.WriteLine($"{name} has {length2} characters.");
Run the code and note that the fourth statement cannot compile because the data type of the name variable is not known by the compiler, as shown in Figure 2.5:
Figure 2.5: The object type does not have a Length property
Add double slashes to the beginning of the statement that cannot compile to comment out the statement, making it inactive.
Run the code again and note that the compiler can access the length of a string if the programmer explicitly tells the compiler that the object variable contains a string by prefixing with a cast expression like (string), as shown in the following output:
Amir is 1.88 metres tall.
Amir has 4 characters.
The object type has been available since the first version of C#, but C# 2.0 and later have a better alternative called generics, which we will cover in Chapter 6, Implementing Interfaces and Inheriting Classes. This will provide us with the flexibility we want, but without the performance overhead.
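Until generics are covered, one way to reduce the risk of an explicit cast on an object variable is to test the runtime type first. The following sketch uses the is pattern, which is not part of the steps above and is shown only as an illustration; the text variable is a name introduced for this example:

```csharp
using System;

object name = "Amir";
object height = 1.88;

// "is string text" tests the runtime type and, on success, assigns the typed value to text.
if (name is string text)
{
    Console.WriteLine($"{text} has {text.Length} characters."); // Amir has 4 characters.
}

// A direct (string) cast on height would compile but throw InvalidCastException at runtime.
bool heightIsString = height is string;
Console.WriteLine($"height is a string: {heightIsString}"); // height is a string: False
```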
Storing dynamic types
There is another special type named dynamic that can also store any type of data, but even more than object, its flexibility comes at the cost of performance. The dynamic keyword was introduced in C# 4.0. However, unlike object, the value stored in the variable can have its members invoked without an explicit cast. Let’s make use of a dynamic type:
Add statements to declare a dynamic variable and assign a string literal value, followed by commented-out statements that assign an integer value and then an array of integer values, as shown in the following code:
dynamic something = "Ahmed";
// something = 12;
// something = new[] { 3, 5, 7 };
Add a statement to output the length of the dynamic variable, as shown in the following code:
Console.WriteLine($"Length is {something.Length}");
Run the code and note it works because a string value does have a Length property, as shown in the following output:
Length is 5
Uncomment the statement that assigns an int value of 12 to the something variable.
Run the code and note the runtime error because int does not have a Length property, as shown in the following output:
Unhandled exception. Microsoft.CSharp.RuntimeBinder.RuntimeBinderException: 'int' does not contain a definition for 'Length'
Uncomment the statement that assigns the array of three integers 3, 5, and 7 to the something variable.
Run the code and note the output because an array of three int values does have a Length property, as shown in the following output:
Length is 3
One limitation of dynamic is that code editors cannot show IntelliSense to help you write the code. This is because the compiler cannot check what the type is during build time. Instead, the CLR checks for the member at runtime and throws an exception if it is missing.
Exceptions are a way to indicate that something has gone wrong at runtime. You will learn more about them and how to handle them in Chapter 3, Controlling Flow, Converting Types, and Handling Exceptions.
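As a preview of what that runtime check looks like from code, the following sketch catches the exception named in the earlier output instead of letting the app crash. The try and catch syntax is explained in Chapter 3; the outcome variable is a name introduced for this example:

```csharp
using System;
using Microsoft.CSharp.RuntimeBinder;

dynamic something = 12; // int has no Length property

string outcome;
try
{
    // The compiler accepts this line; the member lookup only happens when it runs.
    int length = something.Length;
    outcome = $"Length is {length}";
}
catch (RuntimeBinderException ex)
{
    // The missing member surfaces here, at runtime, not at build time.
    outcome = $"Binding failed: {ex.Message}";
}

Console.WriteLine(outcome);
```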
Declaring local variables
Local variables are declared inside methods, and they only exist during the execution of that method. Once the method returns, the memory allocated to any local variables is released.
Strictly speaking, value types are released while reference types must wait for a garbage collection. You will learn about the difference between value types and reference types in Chapter 6, Implementing Interfaces and Inheriting Classes.
Specifying the type of a local variable
Let’s explore local variables declared with specific types and using type inference:
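The declarations with specific types could look like the following sketch. It is an assumption based on the var versions of the same statements later in this section, with each inferred type written out explicitly:

```csharp
int population = 67_000_000;
double weight = 1.88;
decimal price = 4.99M;
string fruit = "Apples";
char letter = 'Z';
bool happy = true;
```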
Depending on your code editor and color scheme, it will show green squiggles under each of the variable names and lighten their text color to warn you that the variable is assigned but its value is never used.
Inferring the type of a local variable
You can use the var keyword to declare local variables with C# 3 and later. The compiler will infer the type from the value that you assign after the assignment operator, =.
A literal number without a decimal point is inferred as an int variable (provided the value fits in an int), that is, unless you add a suffix, as described in the following list:
L: Compiler infers long
UL: Compiler infers ulong
M: Compiler infers decimal
D: Compiler infers double
F: Compiler infers float
A literal number with a decimal point is inferred as double unless you add the M suffix, in which case the compiler infers a decimal variable, or the F suffix, in which case it infers a float variable.
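The suffix rules can be verified by asking each variable for its runtime type. This is a minimal sketch with made-up variable names; GetType().Name returns the .NET type name, so int shows as Int32, long as Int64, and float as Single:

```csharp
using System;

var wholeNumber = 42;       // no suffix and no decimal point
var longNumber = 42L;
var unsignedLong = 42UL;
var money = 42M;
var real = 42.0;            // decimal point and no suffix
var singlePrecision = 42F;

Console.WriteLine(wholeNumber.GetType().Name);     // Int32
Console.WriteLine(longNumber.GetType().Name);      // Int64
Console.WriteLine(unsignedLong.GetType().Name);    // UInt64
Console.WriteLine(money.GetType().Name);           // Decimal
Console.WriteLine(real.GetType().Name);            // Double
Console.WriteLine(singlePrecision.GetType().Name); // Single
```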
Double quotes indicate a string variable, single quotes indicate a char variable, and the true and false values infer a bool type:
Modify the previous statements to use var, as shown in the following code:
var population = 67_000_000;
var weight = 1.88;
var price = 4.99M;
var fruit = "Apples";
var letter = 'Z';
var happy = true;
Hover your mouse over each of the var keywords and note that your code editor shows a tooltip with information about the type that has been inferred.
At the top of Program.cs, import the namespace for working with XML to enable us to declare some variables using types in that namespace, as shown in the following code:
using System.Xml;
Good Practice: If you are using Polyglot Notebooks, then add using statements in a separate code cell above the code cell where you write the main code. Then, click Execute Cell to ensure the namespaces are imported. They will then be available in subsequent code cells.
Under the previous statements, add statements to create some new objects, as shown in the following code:
var xml1 = new XmlDocument();
XmlDocument xml2 = new XmlDocument();
var file1 = File.CreateText("something1.txt");
StreamWriter file2 = File.CreateText("something2.txt");
Good Practice: Although using var is convenient, some developers avoid using it to make it easier for a code reader to understand the types in use. Personally, I use it only when the type is obvious. For example, in the preceding code statements, the first statement is just as clear as the second in stating what the types of the xml variables are, but it is shorter. However, the third statement isn’t clear in showing the type of the file variable, so the fourth is better because it shows that the type is StreamWriter. If in doubt, spell it out!
Using target-typed new to instantiate objects
With C# 9, Microsoft introduced another syntax for instantiating objects known as target-typed new. When instantiating an object, you can specify the type first and then use new without repeating the type, as shown in the following code:
XmlDocument xml3 = new();
If you have a type with a field or property that needs to be set, then the type can be inferred, as shown in the following code:
Person kim = new();
kim.BirthDate = new(1967, 12, 26);
class Person
{
  public DateTime BirthDate;
}
This way of instantiating objects is especially useful with arrays and collections because they have multiple objects often of the same type, as shown in the following code, which assumes that Person also has a FirstName property:
List<Person> people = new()
{
  new() { FirstName = "Alice" },
  new() { FirstName = "Bob" },
  new() { FirstName = "Charlie" }
};
You will learn about arrays in Chapter 3, Controlling Flow, Converting Types, and Handling Exceptions, and collections in Chapter 8, Working with Common .NET Types.
Good Practice: Use target-typed new to instantiate objects unless you must use a pre-version 9 C# compiler. I have used target-typed new throughout the remainder of this book. Please let me know if you spot any cases that I missed!
Getting and setting the default values for types
Most of the primitive types except string are value types, which means that they must have a value. You can determine the default value of a type by using the default() operator and passing the type as a parameter. You can assign the default value of a type by using the default keyword.
The string type is a reference type. This means that string variables contain the memory address of a value, not the value itself. A reference type variable can have a null value, which is a literal that indicates that the variable does not reference anything (yet). null is the default for all reference types.
You’ll learn more about value types and reference types in Chapter 6, Implementing Interfaces and Inheriting Classes.
Let’s explore default values:
Add statements to show the default values of an int, bool, DateTime, and string, as shown in the following code:
Console.WriteLine($"default(int) = {default(int)}");
Console.WriteLine($"default(bool) = {default(bool)}");
Console.WriteLine($"default(DateTime) = {default(DateTime)}");
Console.WriteLine($"default(string) = {default(string)}");
Run the code and view the result. Note that your output for the date and time might be formatted differently if you are not running it in the UK and that null values output as an empty string, as shown in the following output:
default(int) = 0
default(bool) = False
default(DateTime) = 01/01/0001 00:00:00
default(string) =
Add statements to declare a number, assign a value, and then reset it to its default value, as shown in the following code:
int number = 13;
Console.WriteLine($"number has been set to: {number}");
number = default;
Console.WriteLine($"number has been reset to its default: {number}");
Run the code and view the result, as shown in the following output:
number has been set to: 13
number has been reset to its default: 0
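Because a null string prints as nothing, the earlier output for default(string) looks the same as an empty string would. The following sketch, with an illustrative variable name, shows that the two are not the same thing:

```csharp
using System;

string? text = default; // the default for a reference type is null

Console.WriteLine(text is null);         // True
Console.WriteLine(text == string.Empty); // False
Console.WriteLine($"[{text}]");          // []
```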