In this book, we introduce the power of the command line using the Bash shell. Bash is the most widely used shell, and is found on everything from toasters to high-performance computers. We start with the basics and move quickly to more advanced skills as the book progresses.
Who this book is for
Hands-On Data Science with the Command Line provides useful tips and tricks on how to use the command line for everyday data problems. This book is aimed at readers who have little to no command-line experience but have worked in the field of computer science and/or have experience with modern data science problems.
You'll learn how to set up the command line on multiple platforms and configure it to your liking, how to find help with commands, and how to create reusable scripts. You will also learn how to obtain an actual dataset, perform some analytics on it, and visualize the data. Toward the end of the book, we touch on some of the advanced features of the command line and where to go from there.
In addition, all of the code examples are available to download from Packt's GitHub account. Any updates to this book will be made available to you through the Packt platform.
What this book covers
Chapter 1, Data Science at the Command line and Setting It up, covers how to install and configure the command line on multiple platforms of your choosing.
Chapter 2, Essential Commands, is a hands-on demo on using the basics of the command line and where to find help if needed.
Chapter 3, Shell Workflows, and Data Acquisition and Massaging, really gets into performing some basic data science exercises with a live dataset and customizing your command-line environment as you see fit.
Chapter 4, Reusable Bash and Developing Reusable Code in Bash, builds on the previous chapters and gets more advanced with creating reusable scripts and visualizations.
Chapter 5, Loops, Functions, and String Processing, is an advanced hands-on exercise on iterating over data using loops and exploring with regular expressions.
Chapter 6, SQL, Math, and Wrapping it up, is an advanced hands-on exercise that applies what you've learned in the previous chapters and introduces databases, streaming, and working with APIs.
To get the most out of this book
For this book, all you require is the Bash shell and an operating system that can run the command line, or the latest version of Docker. You will also need an internet connection (preferably broadband) and strong typing skills.
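One way to check these requirements before starting is sketched below. It is a minimal example, not part of the book's code bundle: the variable name docker_status is hypothetical, and the sketch only tests whether a docker client is on your PATH, not whether it is the latest version.

```shell
# Report the shell version; Bash sets BASH_VERSION, other shells leave it empty.
echo "bash version: ${BASH_VERSION:-not running under bash}"

# Docker is optional: check whether the docker client is on the PATH.
if command -v docker >/dev/null 2>&1; then
  docker_status="found"
else
  docker_status="not found"
fi
echo "docker: $docker_status"
```

If the first line reports that you are not running under Bash, start it by typing bash at the prompt and run the check again.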
Download the example code files
You can download the example code files for this book from your account at www.packt.com. If you purchased this book elsewhere, you can visit www.packt.com/support and register to have the files emailed directly to you.
You can download the code files by following these steps:
- Log in or register at www.packt.com.
- Select the SUPPORT tab.
- Click on Code Downloads & Errata.
- Enter the name of the book in the Search box and follow the onscreen instructions.
Once the file is downloaded, please make sure that you unzip or extract the folder using the latest version of:
- WinRAR/7-Zip for Windows
- Zipeg/iZip/UnRarX for Mac
- 7-Zip/PeaZip for Linux
The code bundle for the book is also hosted on GitHub at https://github.com/PacktPublishing/Hands-On-Data-Science-with-Command-Line. In case there's an update to the code, it will be updated on the existing GitHub repository.
We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!
Conventions used
There are a number of text conventions used throughout this book.
CodeInText: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: "Mount the downloaded WebStorm-10*.dmg disk image file as another disk in your system."
A block of code is set as follows:
<<EOF cat >greetlib.sh
greet_yourself () {
echo Hello, \${1:-\$USER}!
}
EOF
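As a brief aside, the sample block above can be tried out as follows. This is a sketch under a few assumptions: it works in a scratch directory created with mktemp (our choice, so that nothing in your current directory is overwritten), and the argument World is only an illustration.

```shell
# Work in a scratch directory so nothing in the current directory is overwritten.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Same heredoc as above: the backslashes keep ${1:-$USER} literal in the file,
# so it is expanded when the function runs, not when the file is written.
<<EOF cat >greetlib.sh
greet_yourself () {
  echo Hello, \${1:-\$USER}!
}
EOF

# Load the function into the current shell, then call it with a name.
. ./greetlib.sh
greet_yourself World    # prints: Hello, World!
```

Calling greet_yourself with no argument falls back to the value of the USER environment variable, because ${1:-$USER} substitutes $USER whenever the first argument is missing or empty.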
When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:
<key>Ctrl+b</key> “
<key>Ctrl+b</key> <key></key>
<key>Ctrl+b</key> “
Any command-line input or output is written as follows:
sudo apt install -y screen tmux
Bold: Indicates a new term, an important word, or words that you see onscreen. For example, words in menus or dialog boxes appear in the text like this. Here is an example: "Select System info from the Administration panel."
Get in touch
Feedback from our readers is always welcome.
General feedback: If you have questions about any aspect of this book, mention the book title in the subject of your message and email us at customercare@packtpub.com.
Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report it to us. Please visit www.packt.com/submit-errata, select your book, click on the Errata Submission Form link, and enter the details.
Piracy: If you come across any illegal copies of our works in any form on the Internet, we would be grateful if you would provide us with the location address or website name. Please contact us at copyright@packt.com with a link to the material.
If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.
Reviews
Please leave a review. Once you have read and used this book, why not leave a review on the site that you purchased it from? Potential readers can then see and use your unbiased opinion to make purchase decisions, we at Packt can understand what you think about our products, and our authors can see your feedback on their book. Thank you!
For more information about Packt, please visit packt.com.