You're reading from  Data Science for Web3

Product type: Book
Published in: Dec 2023
Publisher: Packt
ISBN-13: 9781837637546
Edition: 1st Edition
Author (1)
Gabriela Castillo Areco

Gabriela Castillo Areco holds an M.Sc. in big data science from the TECNUM School of Engineering, University of Navarra. With extensive experience in both the business and data facets of blockchain technology, Gabriela has undertaken roles as a data scientist, machine learning analyst, and blockchain consultant in both large corporations and small ventures. She served as a professor of new crypto businesses at Torcuato di Tella University and is currently a member of the BizOps data team at IOV Labs.

Preface

The advent of Web3 has ushered in a trove of data, characterized by its distinctive properties, giving rise to novel concepts and breathing new life into established ones. Within the expansive Web3 ecosystem, data assumes diverse forms, stored across multiple platforms and formats, ranging from on-chain transactional data to independent news aggregators, oracles, and social networks. In essence, Web3 is a continuous generator of data directly or indirectly related to its ecosystem.

Web3, with its inherent characteristics, has unlocked value on various fronts. Decentralization has demonstrated that businesses without central authorities are possible. Trustless interactions have driven coordination between entities, facilitating smooth exchanges of goods and services over the blockchain, even among strangers and without intermediaries. This has resulted in value transfers reaching remote corners of the world at minimal costs, fostering direct connections between artists and collectors and facilitating crowdfunding by directly supporting product developers, among a myriad of other applications.

However, one aspect of Web3 that remains relatively unexplored, yet holds immense value to unlock, is transparency. Transparency fosters reliance, a cornerstone for mass adoption. A significant milestone for the industry will be achieved when ordinary individuals seamlessly engage with it, grounded in trust due to accessible and verifiable information. To fully realize the potential of transparency, many Web3 data scientists and analysts, equipped with the requisite skills, conceptual knowledge, tools, and a profound understanding of the data and business landscape, are needed. This book aims to help meet that need by empowering you to become a Web3 data specialist capable of understanding and extracting value from data.

The book is structured into three parts. The first part covers foundational concepts necessary to execute data analysis tasks. You will gain insights into on-chain data, learn to access and extract insights, explore sources of relevant off-chain data, and navigate potential obstacles. Additionally, two domains that generate vast amounts of data, namely NFTs and DeFi, are examined in depth, each presenting its own set of business rules and technical concepts.

The second part of the book shifts focus to machine learning use cases utilizing Web3 data. We have curated practical cases that data scientists, whether freelancers or employed professionals, may encounter in their work.

The third part, the Appendix, addresses the question: what should we do with the knowledge acquired? It provides guidance on navigating the decentralized work landscape, understanding industry expectations for prospective data employees, and identifying the soft and hard skills necessary for success. To offer a glimpse into the future of the industry, we have engaged with Web3 data leaders who share their experiences, perspectives, and visions. The intent of this part is to shorten the time required to find jobs or other ways to contribute to the industry.

The benefits of decentralization, trustless interactions, and transparency in trade cannot be ignored, and that is why the industry continues to grow year by year, unlocking new use cases and creating new jobs. The purpose of this book is to contribute to the understanding of the data that Web3 generates so that you can be prepared to shape the next era of the internet.

Who this book is for

The format of the book and the list of topics covered make it suitable for data professionals interested in the Web3 ecosystem. The explanations have been simplified, catering to professionals with no data science background but eager to leverage data tools for in-depth analysis of blockchain data. You are encouraged to engage with the shared repository and experiment with the provided solutions, fostering a hands-on learning experience. Although not mandatory, a basic understanding of statistics, machine learning, SQL, and Python would be advantageous.

What this book covers

Chapter 1, Where Data and Web3 Meet, introduces the fundamental concepts of Web3 and data science tools.

Chapter 2, Working with On-Chain Data, explores the structure of on-chain data.

Chapter 3, Working with Off-Chain Data, delves into relevant off-chain data for the industry and guidance on where to locate it.

Chapter 4, Exploring the Digital Uniqueness of NFTs – Games, Art, and Identity, examines NFT businesses and how to calculate pertinent metrics.

Chapter 5, Exploring Analytics on DeFi, introduces DeFi businesses and how to calculate essential metrics.

Chapter 6, Preparing and Exploring Our Data, showcases preprocessing steps that are useful when dealing with Web3 data.

Chapter 7, A Primer on Machine Learning and Deep Learning, delves into the core concepts necessary for advancing through the machine learning cases explored in Part 2.

Chapter 8, Sentiment Analysis – NLP and Crypto News, explores the application of natural language processing (NLP) in crypto sentiment analysis.

Chapter 9, Generative Art for NFTs, examines examples of art generation to support NFT initiatives.

Chapter 10, A Primer on Security and Fraud Detection, explores an application for fraud detection.

Chapter 11, Price Prediction with Time Series, delves into an application for predicting prices with time series.

Chapter 12, Marketing Discovery with Graphs, examines an application to identify influencers and communities with on-chain data.

Chapter 13, Building Experience with Crypto Data – BUIDL, covers various options for job searching or continuing studies in the Web3 domain.

Chapter 14, Interviews with Web3 Data Leaders, concludes the book by delving into the perspectives of Web3 data leaders regarding the industry and its future.

To get the most out of this book

A Jupyter or a Google Colab notebook is sufficient to cover all the examples. In some cases, to access data, we will need to sign up for an account and obtain API keys.
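
Since some data sources require an API key, one common way to keep keys out of a shared notebook is to read them from an environment variable. The following is a minimal sketch of that idea, not the book's own code: the function name `load_api_key` and the variable name `WEB3_DATA_API_KEY` are hypothetical and chosen only for illustration.

```python
import os
from typing import Mapping, Optional


def load_api_key(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the API key stored under `name`, or raise a clear error.

    `env` defaults to the process environment; passing a plain dict
    makes the function easy to try out without real credentials.
    """
    source = os.environ if env is None else env
    key = source.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {name} environment variable before running the notebook."
        )
    return key


# Usage with an explicit mapping (hypothetical name and value),
# so the sketch runs without a real account.
demo_key = load_api_key("WEB3_DATA_API_KEY", {"WEB3_DATA_API_KEY": "demo-key"})
```

In a real notebook, the second argument would be omitted so that the key is read from the environment instead of being typed into a cell.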

Software/hardware covered in the book: Python 3.7+; Google Colaboratory or Jupyter notebook

Operating system requirements: Windows, macOS, or Linux
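
To confirm that a notebook runtime meets the Python 3.7+ requirement, a quick check along the following lines can be run in the first cell. This is an illustrative sketch only; the helper `meets_minimum` and the constant `MINIMUM_PYTHON` are assumed names, not part of the book's repository.

```python
import sys
from typing import Tuple

# Minimum interpreter version listed in the requirements above.
MINIMUM_PYTHON = (3, 7)


def meets_minimum(
    version: Tuple[int, ...], minimum: Tuple[int, ...] = MINIMUM_PYTHON
) -> bool:
    """Compare the (major, minor) part of `version` against `minimum`."""
    return tuple(version[:2]) >= minimum


# Warn instead of failing, so the notebook can still be inspected.
if not meets_minimum(sys.version_info):
    print("This runtime is older than Python 3.7; some examples may not run.")
```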

If you are using the digital version of this book, we advise you to type the code yourself or access the code from the book’s GitHub repository (a link is available in the next section). Doing so will help you avoid any potential errors related to the copying and pasting of code.

Disclaimer

All opinions expressed in this book are just opinions and should not be considered an inducement to invest or follow a particular strategy. They are intended for informational purposes only and should not be relied upon for making investment decisions. Please consult with a qualified financial advisor before making any investment decisions.

Download the example code files

You can download the example code files for this book from GitHub at https://github.com/PacktPublishing/Data-Science-for-Web3. Any updates to the code will be published in the GitHub repository.

We also have other code bundles from our rich catalog of books and videos available at https://github.com/PacktPublishing/. Check them out!

Conventions used

There are a number of text conventions used throughout this book.

Code in text: Indicates code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles. Here is an example: “The following information was taken from the CSV file, filtered by the Betweenness centrality column.”

A block of code is set as follows:

{'domain': {'id': '131',
'name': 'Unified Twitter Taxonomy',
'description': 'A taxonomy of user interests. '},
'entity': {'id': '913142676819648512',
'name': 'Cryptocurrencies',
'description': 'Cryptocurrency'}},

When we wish to draw your attention to a particular part of a code block, the relevant lines or items are set in bold:

'annotations': [{'start': 10,
'end': 18,
'probability': 0.8568,
'type': 'Organization',
'normalized_text': 'Blackrock'},

Any command-line input or output is written as follows:

decompose = seasonal_decompose(df, model='additive').plot(observed=True, seasonal=True, trend=True, resid=True, weights=False)

Bold: Indicates a new term, an important word, or words that you see on screen. For instance, words in menus or dialog boxes appear in bold. Here is an example: “Once we have filled in the mandatory requirements on the API page, we can press the blue Execute button, which will return the URL we can use.”

Tips or important notes

Appear like this.

Get in touch

Feedback from our readers is always welcome.

General feedback: If you have questions about any aspect of this book, email us at customercare@packtpub.com and mention the book title in the subject of your message.

Errata: Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you have found a mistake in this book, we would be grateful if you would report this to us. Please visit www.packtpub.com/support/errata and fill in the form.

Piracy: If you come across any illegal copies of our works in any form on the internet, we would be grateful if you would provide us with the location address or website name. Please contact us at copyright@packt.com with a link to the material.

If you are interested in becoming an author: If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, please visit authors.packtpub.com.

Share Your Thoughts

Once you’ve read Data Science for Web3, we’d love to hear your thoughts! Please click here to go straight to the Amazon review page for this book and share your feedback.

Your review is important to us and the tech community and will help us make sure we’re delivering excellent quality content.

Download a free PDF copy of this book

Thanks for purchasing this book!

Do you like to read on the go but are unable to carry your print books everywhere? Is your e-book purchase not compatible with the device of your choice?

Don’t worry, now with every Packt book you get a DRM-free PDF version of that book at no cost.

Read anywhere, on any device. Search, copy, and paste code from your favorite technical books directly into your application.

The perks don’t stop there! You can get exclusive access to discounts, newsletters, and great free content in your inbox daily.

Follow these simple steps to get the benefits:

  1. Scan the following QR code or visit the link: https://packt.link/free-ebook/9781837637546
  2. Submit your proof of purchase.
  3. That’s it! We’ll send your free PDF and other benefits to your email directly.