You're reading from  The Machine Learning Solutions Architect Handbook - Second Edition

Product type: Book
Published in: Apr 2024
Publisher: Packt
ISBN-13: 9781805122500
Edition: 2nd Edition
Author: David Ping

David Ping is an accomplished author and industry expert with over 28 years of experience in the field of data science and technology. He currently serves as the leader of a team of highly skilled data scientists and AI/ML solutions architects at AWS. In this role, he assists organizations worldwide in designing and implementing impactful AI/ML solutions to drive business success. David's extensive expertise spans a range of technical domains, including data science, ML solution and platform design, data management, AI risk, and AI governance. Prior to joining AWS, David held positions in renowned organizations such as JPMorgan, Credit Suisse, and Intel Corporation, where he contributed to the advancements of science and technology through engineering and leadership roles. With his wealth of experience and diverse skill set, David brings a unique perspective and invaluable insights to the field of AI/ML.

Data ingestion

The data ingestion component acquires data in structured, semi-structured, and unstructured formats from diverse sources, such as databases, knowledge graphs, social media, file storage, and IoT devices. Its primary responsibility is to store this data persistently in storage solutions such as object storage (e.g., Amazon S3), data warehouses, or other data stores. Effective data ingestion patterns should incorporate both real-time streaming and batch ingestion mechanisms, so that each type of data source can be served in a timely and efficient way.

Different ingestion technologies and tools suit different ingestion patterns. For streaming data ingestion, popular choices include Apache Kafka, Apache Spark Streaming, and Amazon Kinesis/Kinesis Firehose. These tools enable real-time data ingestion and processing. On the other hand, for batch-oriented data ingestion, tools like Secure File Transfer Protocol (SFTP...
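To make the contrast between the two ingestion patterns concrete, here is a minimal sketch using only the Python standard library. It is an illustration under stated assumptions, not the book's implementation: a local directory stands in for a persistent object store such as Amazon S3, a plain Python iterator stands in for a streaming source such as a Kafka topic or Kinesis stream, and the names `ingest_batch`, `ingest_stream`, and `flush_every` are hypothetical.

```python
import json
from pathlib import Path
from typing import Iterable, Iterator, List


def ingest_batch(records: Iterable[dict], store_dir: Path, name: str) -> Path:
    """Batch pattern: persist a complete set of records as one JSON Lines object.

    `store_dir` is a local directory standing in for an object store (assumption).
    """
    target = store_dir / f"{name}.jsonl"
    with target.open("w", encoding="utf-8") as fh:
        for record in records:
            fh.write(json.dumps(record) + "\n")
    return target


def ingest_stream(events: Iterator[dict], store_dir: Path,
                  flush_every: int = 2) -> List[Path]:
    """Streaming pattern: buffer events as they arrive and flush small objects.

    `events` is any iterator standing in for a streaming source (assumption).
    A flush happens each time the buffer reaches `flush_every` events.
    """
    written: List[Path] = []
    buffer: List[dict] = []
    for event in events:
        buffer.append(event)
        if len(buffer) >= flush_every:
            name = f"stream-{len(written):05d}"
            written.append(ingest_batch(buffer, store_dir, name))
            buffer = []
    if buffer:
        # Persist whatever is left when the source is exhausted.
        name = f"stream-{len(written):05d}"
        written.append(ingest_batch(buffer, store_dir, name))
    return written
```

Calling `ingest_batch` with a full list of records writes one object in a single pass, while `ingest_stream` writes a sequence of smaller objects as events arrive. Real streaming tools typically also flush on a time interval and handle retries and ordering, which this sketch leaves out.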
