Streamlit for Data Science - Second Edition

Product type Book
Published in Sep 2023
Publisher Packt
ISBN-13 9781803248226
Pages 300 pages
Edition 2nd Edition
Author: Tyler Richards

Table of Contents (15 chapters)

  • Preface
  • An Introduction to Streamlit
  • Uploading, Downloading, and Manipulating Data
  • Data Visualization
  • Machine Learning and AI with Streamlit
  • Deploying Streamlit with Streamlit Community Cloud
  • Beautifying Streamlit Apps
  • Exploring Streamlit Components
  • Deploying Streamlit Apps with Hugging Face and Heroku
  • Connecting to Databases
  • Improving Job Applications with Streamlit
  • The Data Project – Prototyping Projects in Streamlit
  • Streamlit Power Users
  • Other Books You May Enjoy
  • Index

The Data Project – Prototyping Projects in Streamlit

In the previous chapter, we discussed how to create Streamlit applications that are specific to job applications. Another fun application of Streamlit is to try out new and interesting data science ideas and create interactive apps for others. Some examples of this include applying a new machine learning model to an existing dataset, carrying out an analysis of some data uploaded by users, or creating an interactive analysis on a private dataset. There are numerous reasons for making a project like this, such as personal education or community contribution.

In terms of personal education, the best way to learn about a new topic is often to observe how it actually works by applying it to the world around you or to a dataset that you know well. For instance, if you are trying to learn how Principal Component Analysis works, you can always read about it in a textbook or watch someone else apply it to a dataset. However, I have...

Technical requirements

In this section, we will use Goodreads.com, a popular website owned by Amazon that tracks everything about a user’s reading habits, from when they started and finished books to what they would like to read next. It is recommended that you first head over to https://www.goodreads.com/, sign up for an account, and explore a little (perhaps you can even add your own book lists!).

Data science ideation

Often, coming up with a new idea for a data science project is the most daunting part. You might have numerous doubts. What if I start a project that no one likes? What if my data actually doesn’t work out well? What if I can’t think of anything? The good news is that if you create projects that you actually do care about and would use, then the worst-case scenario is that you have an audience of one! And if you send me (tylerjrichards@gmail.com) your project, I promise to read it. So that makes it an audience of two at the very least.

Some examples I have either created or observed in the wild include the following:

Collecting and cleaning data

There are two ways to get data from Goodreads: through its Application Programming Interface (API), which allows developers to programmatically access data about books, and through its manual exporting function. Sadly, Goodreads plans to deprecate its API in the near future and, as of December 2020, does not give access to new developers.

The original Goodreads app uses the API, but our version will rely on the manual exporting function that the Goodreads website has instead. To get your data, head over to https://www.goodreads.com/review/import and download your own data. If you do not have a Goodreads account, feel free to use my personal data for this, which can be found at https://github.com/tylerjrichards/goodreads_book_demo. I have saved my Goodreads data in a file, called goodreads_history.csv, in a new folder, called streamlit_goodreads_book. To make your own folder with the appropriate setup, run the following in your terminal:

mkdir...
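Once the export is saved as goodreads_history.csv, it can be read into a DataFrame. The following is only a minimal sketch, assuming pandas is installed; the helper name load_history, the sample rows, and the Title and Number of Pages column names are illustrative assumptions rather than the book's own code.

```python
import io

import pandas as pd


def load_history(source):
    """Read a Goodreads export (a file path or file-like object) into a DataFrame."""
    books = pd.read_csv(source)
    # Strip stray whitespace from header names so later column lookups are predictable.
    books.columns = books.columns.str.strip()
    return books


# Tiny in-memory stand-in for goodreads_history.csv (made-up rows, assumed columns).
sample_csv = io.StringIO(
    "Title,Date Read,Number of Pages\n"
    "Example Book A,2020/01/15,320\n"
    "Example Book B,,210\n"
)
books_df = load_history(sample_csv)
print(books_df.shape)
```

In the app itself, something like load_history("goodreads_history.csv") followed by st.write(books_df.head()) would display the first few rows for a quick sanity check.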

Making an MVP

Looking at our data, we can start by asking a basic question: what are the most interesting questions I can answer with this data? After looking at the data and thinking about what information I would want from my Goodreads reading history, here are a few questions that I have thought of:

  • How many books do I read each year?
  • How long does it take for me to finish a book that I have started?
  • How long are the books that I have read?
  • How old are the books that I have read?
  • How do I rate books compared to other Goodreads users?

We can take these questions, figure out how to modify our data to visualize them well, and then make the first attempt at creating our product by printing out all of the graphs.
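As an illustration of what "modifying our data" might involve for two of these questions, here is a hedged sketch. It assumes the export contains Date Added, Date Read, My Rating, and Average Rating columns, that Date Added is a reasonable proxy for when a book was started, and that a My Rating of 0 means the book was never rated. The function names and sample rows are hypothetical.

```python
import pandas as pd


def add_days_to_finish(books):
    """Return a copy with a days_to_finish column (Date Read minus Date Added)."""
    out = books.copy()
    out["Date Added"] = pd.to_datetime(out["Date Added"], format="%Y/%m/%d")
    out["Date Read"] = pd.to_datetime(out["Date Read"], format="%Y/%m/%d")
    # Unfinished books have no Date Read, so their difference stays missing.
    out["days_to_finish"] = (out["Date Read"] - out["Date Added"]).dt.days
    return out


def rating_gap(books):
    """Average of (my rating - community average) over the books I actually rated."""
    rated = books[books["My Rating"] > 0]  # assumption: 0 means "not rated"
    return (rated["My Rating"] - rated["Average Rating"]).mean()


# Made-up rows for illustration only.
sample = pd.DataFrame(
    {
        "Date Added": ["2020/01/01", "2021/03/10", "2022/06/01"],
        "Date Read": ["2020/01/15", "2021/03/12", None],
        "My Rating": [5, 3, 0],
        "Average Rating": [4.0, 3.5, 4.2],
    }
)
with_days = add_days_to_finish(sample)
gap = rating_gap(sample)
print(with_days["days_to_finish"].tolist(), gap)
```

A positive gap would suggest rating books more generously than the average Goodreads user, and a negative one more harshly.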

How many books do I read each year?

For the first question about books read per year, we have the Date Read column, with dates in the format yyyy/mm/dd. The code block below does the following:

    ...
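The original code block is not reproduced here. As a rough stand-in, one way to count books per year from the Date Read column might look like the sketch below; the function name, the output column names, and the sample dates are assumptions for illustration.

```python
import pandas as pd


def books_per_year(books):
    """Count finished books per calendar year, based on the Date Read column."""
    # Books without a Date Read value have not been finished, so drop them.
    finished = books.dropna(subset=["Date Read"]).copy()
    finished["Year Finished"] = pd.to_datetime(
        finished["Date Read"], format="%Y/%m/%d"
    ).dt.year
    # One row per year with the number of books finished in that year.
    return finished.groupby("Year Finished").size().reset_index(name="Books Read")


# Made-up dates for illustration only.
sample = pd.DataFrame(
    {"Date Read": ["2019/05/01", "2020/01/15", "2020/11/30", None]}
)
counts = books_per_year(sample)
print(counts)
```

Inside a Streamlit app, the result could then be passed to a chart call such as st.bar_chart(counts.set_index("Year Finished")), or to any plotting library of your choice.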

Iterative improvement

So far, we have been almost purely in production mode with this app. Iterative improvement is all about editing the work we have already done and organizing it in a way that makes the app more usable and, frankly, nicer to look at. There are a few improvements that we can shoot for here:

  • Beautification via animation
  • Organization using columns and width
  • Narrative building through text and additional statistics

Let’s start by using animations to make our apps a bit prettier!

Beautification via animation

In Chapter 7, Exploring Streamlit Components, we explored the use of various Streamlit Components; one of these was a component called streamlit-lottie, which gives us the ability to add animation to our Streamlit applications. We can improve our current app by adding an animation to the top of it using the following code. If you want to learn more about Streamlit Components, please head back...
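For readers without the original listing at hand, the idea can be sketched roughly as follows. This assumes the streamlit-lottie package is installed and that a Lottie animation has been saved locally as a JSON file; the helper names, the key value, and the default height are hypothetical choices, not the book's code.

```python
import json


def load_lottie_file(path):
    """Return the Lottie animation stored at `path` as a Python dict."""
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def show_animation(animation, height=200):
    """Render the animation at the current position in a Streamlit app."""
    # Imported lazily so the loader above works even where Streamlit is not installed.
    from streamlit_lottie import st_lottie

    st_lottie(animation, height=height, key="header_animation")
```

Calling show_animation(load_lottie_file("animation.json")) near the top of the script, before the title and charts, would place the animation above the rest of the app; animation.json is a placeholder file name.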

Hosting and promotion

Our final step is to host this app on Streamlit Community Cloud. To do this, we need to perform the following steps:

  1. Create a GitHub repository for this work.
  2. Add a requirements.txt file.
  3. Use one-click deployment on Streamlit Community Cloud to deploy the app.

We have already covered this extensively in Chapter 5, Deploying Streamlit with Streamlit Community Cloud, so give it a shot now without instructions.

Summary

What a fun chapter! We have learned so much here – from how to come up with data science projects of our own, and how to create initial MVPs, to the iterative improvement of our apps. We did this all through the lens of our Goodreads dataset, and we took this app from just an idea to a fully functioning app hosted on Streamlit Community Cloud. I look forward to seeing all the different types of Streamlit apps that you create. Please create something fun and send it to me on Twitter at @tylerjrichards. In the next chapter, we will focus on interviews with Streamlit power users and creators to learn tips and tricks, why they use Streamlit so extensively, and also where they think the library will go from here. See you there!

Learn more on Discord

To join the Discord community for this book – where you can share feedback, ask the author questions, and learn about new releases – follow the link below:

https://packt.link/sl
