You're reading from Cracking the Data Science Interview

Product type: Book
Published in: Feb 2024
Publisher: Packt
ISBN-13: 9781805120506
Edition: 1st Edition
Authors (2):
Leondra R. Gonzalez

Leondra R. Gonzalez is a data scientist at Microsoft and Chief Data Officer for tech startup CulTRUE, with 10 years of experience in tech, entertainment, and advertising. During her academic career, she has completed educational opportunities with Google, Amazon, NBC, and AT&T.

Aaren Stubberfield

Aaren Stubberfield is a senior data scientist for Microsoft's digital advertising business and the author of three popular courses on Datacamp. He graduated with an MS in Predictive Analytics and has over 10 years of experience in various data science and analytical roles focused on finding insights for business-related questions.

Learning the basics of data storage

As stated earlier, the data storage step of the model pipeline is usually handled by machine learning engineers or data engineers. However, it is beneficial for a data scientist to have a basic understanding of this step.

Data storage is about housing the data that we gather from different sources. There are a variety of approaches, and the right one depends on the data’s requirements (e.g., structure, schema, size, ingestion type, and privacy).

The following are some examples of data storage options within MLOps:

  • Binary Large Object (BLOB) storage: BLOB storage is a type of data storage that is designed to store and manage large binary data, such as images, videos, documents, and other types of files. BLOBs can be of varying sizes, from small to very large, and they are typically unstructured data, meaning they lack a specific schema or organization. In modern data architectures, the cloud services offered by Azure Blob...
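To make the idea of BLOB storage concrete, here is a minimal sketch in Python. It is not tied to any cloud service and does not reproduce any vendor's API: the `InMemoryBlobStore` class, its method names, and the example blob names are all hypothetical, chosen only for illustration. The point it tries to show is that a blob store maps a name to opaque bytes and imposes no schema on the content.

```python
import hashlib


class InMemoryBlobStore:
    """Toy stand-in for a blob container (hypothetical, illustration only).

    Each blob is just a name mapped to raw bytes; the store never
    inspects or validates the content, so no schema is involved.
    """

    def __init__(self):
        self._blobs = {}

    def upload(self, name, data):
        """Store the bytes under `name` and return their SHA-256 checksum."""
        self._blobs[name] = bytes(data)
        return hashlib.sha256(self._blobs[name]).hexdigest()

    def download(self, name):
        """Return the stored bytes exactly as they were uploaded."""
        return self._blobs[name]

    def size(self, name):
        """Return the blob's size in bytes."""
        return len(self._blobs[name])


store = InMemoryBlobStore()

# Blobs of different kinds and sizes live side by side.
png_signature = bytes([0x89, 0x50, 0x4E, 0x47])  # first four bytes of a PNG file
store.upload("images/logo.png", png_signature)
store.upload("docs/notes.txt", "hello".encode("utf-8"))

print(store.size("images/logo.png"), store.size("docs/notes.txt"))
```

A real blob service adds durability, access control, and network transfer on top of this name-to-bytes model, but the interaction pattern (upload bytes under a name, download them unchanged later) is broadly similar.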