Python for Finance Cookbook - Second Edition

Python for Finance Cookbook: Over 80 powerful recipes for effective financial data analysis, Second Edition

By Eryk Lewinson
Book Dec 2022 740 pages 2nd Edition
Product Details


Publication date : Dec 30, 2022
Length 740 pages
Edition : 2nd Edition
Language : English
ISBN-13 : 9781803243191

Acquiring Financial Data

The first chapter of this book is dedicated to a very important (some may say the most important) part of any data science/quantitative finance project: gathering data. In line with the famous adage “garbage in, garbage out,” we should strive to obtain data of the highest possible quality and then correctly preprocess it for later use with statistical and machine learning algorithms. The reason for this is simple: the results of our analyses are highly dependent on the input data, and no sophisticated model will be able to compensate for poor inputs. That is also why we should use our (or someone else’s) understanding of the economic/financial domain to justify the choice of data for a given task, for example, modeling stock returns.

One of the most frequently reported issues among the readers of the first edition of this book was getting high-quality data. That is why in this chapter we spend more time exploring different sources of financial data. While quite a few of these vendors offer similar information (prices, fundamentals, and so on), they also offer additional, unique data that can be downloaded via their APIs, such as company-related news articles or pre-computed technical indicators. For this reason, we download different types of data depending on the recipe. However, be sure to inspect the documentation of each library/API, as its vendor most likely also provides standard data such as prices.

Additional examples are also covered in the Jupyter notebooks, which you can find in the accompanying GitHub repository.

The data sources in this chapter were selected intentionally not only to showcase how easy it can be to gather high-quality data using Python libraries but also to show that the gathered data comes in many shapes and sizes.

Sometimes we will get a nicely formatted pandas DataFrame, while other times it might be in JSON format or even bytes that need to be processed and then loaded as a CSV. Hopefully, these recipes will sufficiently prepare you to work with any kind of data you might encounter online.

Something to bear in mind while reading this chapter is that data differs among sources. This means that prices downloaded from two vendors will most likely differ, as those vendors also get their data from different sources and might use different methods to adjust the prices for corporate actions. The best practice is to find the source you trust the most for a particular type of data (based on, for example, opinions on the internet) and then use it to download the data you need. One additional thing to keep in mind is that when building algorithmic trading strategies, the data we use for modeling should align with the live data feed used for executing the trades.

This chapter does not cover one important type of data—alternative data. This could be any type of data that can be used to generate some insights into predicting asset prices. Alternative data can include satellite images (for example, tracking shipping routes, or the development of a certain area), sensor data, web traffic data, customer reviews, etc. While there are many vendors specializing in alternative data (for example, Quandl/Nasdaq Data Link), you can also get some by accessing publicly available information via web scraping. As an example, you could scrape customer reviews from Amazon or Yelp. However, those are often bigger projects and are unfortunately outside of the scope of this book. Also, you need to make sure that web scraping a particular website is not against its terms and conditions!

Using the vendors mentioned in this chapter, you can get quite a lot of information for free. But most of those providers also offer paid tiers. Remember to do thorough research on what the data suppliers actually provide and what your needs are before signing up for any of the services.

In this chapter, we cover the following recipes:

  • Getting data from Yahoo Finance
  • Getting data from Nasdaq Data Link
  • Getting data from Intrinio
  • Getting data from Alpha Vantage
  • Getting data from CoinGecko

Getting data from Yahoo Finance

One of the most popular sources of free financial data is Yahoo Finance. It contains not only historical and current stock prices in different frequencies (daily, weekly, and monthly), but also calculated metrics, such as the beta (a measure of the volatility of an individual asset in comparison to the volatility of the entire market), fundamentals, earnings information/calendars, and many more.

For a long period of time, the go-to tool for downloading data from Yahoo Finance was the pandas-datareader library. The goal of the library was to extract data from a variety of sources and store it in the form of a pandas DataFrame. However, after some changes to the Yahoo Finance API, this functionality was deprecated. It is still good to be familiar with this library, as it facilitates downloading data from sources such as FRED (Federal Reserve Economic Data), the Fama/French Data Library, or the World Bank. Those sources might come in handy for different kinds of analyses, and some of them are used in the following chapters.

As of now, the easiest and fastest way of downloading historical stock prices is to use the yfinance library (formerly known as fix_yahoo_finance).

For the sake of this recipe, we are interested in downloading Apple’s stock prices from the years 2011 to 2021.

How to do it…

Execute the following steps to download data from Yahoo Finance:

  1. Import the libraries:
    import pandas as pd
    import yfinance as yf
    
  2. Download the data:
    df = yf.download("AAPL",
                     start="2011-01-01",
                     end="2021-12-31",
                     progress=False)
    
  3. Inspect the downloaded data:
    print(f"Downloaded {len(df)} rows of data.")
    df
    

    Running the code generates the following preview of the DataFrame:

Figure 1.1: Preview of the DataFrame with downloaded stock prices

The result of the request is a pandas DataFrame (2,769 rows) containing daily Open, High, Low, and Close (OHLC) prices, as well as the adjusted close price and volume.

Yahoo Finance automatically adjusts the close price for stock splits, that is, when a company divides the existing shares of its stock into multiple new shares, most frequently to boost the stock’s liquidity. The adjusted close price takes into account not only splits but also dividends.
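To make the adjustment arithmetic concrete, here is a minimal sketch of backward adjustment on made-up prices. It assumes a common convention (pre-split prices are divided by the split ratio, and pre-dividend prices are scaled by 1 - dividend / previous close); the exact procedure used by Yahoo Finance may differ in detail, and the helper names are hypothetical.

```python
def adjust_for_split(closes, split_index, ratio):
    """Divide all closes before the split date by the split ratio."""
    return [c / ratio if i < split_index else c
            for i, c in enumerate(closes)]


def adjust_for_dividend(closes, ex_index, dividend):
    """Scale all closes before the ex-dividend date by 1 - D / P_prev."""
    factor = 1 - dividend / closes[ex_index - 1]
    return [c * factor if i < ex_index else c
            for i, c in enumerate(closes)]


# Made-up closing prices with a 4-for-1 split taking effect at index 2
raw_closes = [400.0, 404.0, 101.5, 102.0]
split_adjusted = adjust_for_split(raw_closes, split_index=2, ratio=4)

# A dividend of 1.0 with its ex-date at index 3, applied on top of the split
fully_adjusted = adjust_for_dividend(split_adjusted, ex_index=3, dividend=1.0)
```

After the split adjustment, the series no longer shows an artificial 75% drop at the split date, which is why adjusted prices are usually preferred for computing returns.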

How it works…

The download function is very intuitive. In the most basic case, we just need to provide the ticker (symbol), and it will try to download all available data since 1950.

In the preceding example, we downloaded daily data from a specific range (2011 to 2021).

Some additional features of the download function are:

  • We can download information for multiple tickers at once by providing a list of tickers (["AAPL", "MSFT"]) or multiple tickers as a string ("AAPL MSFT").
  • We can set auto_adjust=True to download only the adjusted prices.
  • We can additionally download dividends and stock splits by setting actions='inline'. Those actions can also be used to manually adjust the prices or for other analyses.
  • Specifying progress=False disables the progress bar.
  • The interval argument can be used to download data in different frequencies. We could also download intraday data as long as the requested period is shorter than 60 days.
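The options listed above can be combined in a single call. The following is a minimal sketch, not a definitive implementation: the helper names are hypothetical, the 60-day bound is taken from the text above, and the check only covers the length of the window (the data vendor may impose further restrictions). yfinance is imported lazily so that the date check can be run without the library or a network connection.

```python
from datetime import date, timedelta


def is_valid_intraday_window(start, end, max_days=60):
    """Return True if the window is non-empty and shorter than max_days."""
    span = (end - start).days
    return 0 < span < max_days


def download_intraday(tickers, start, end, interval="1h"):
    """Download adjusted intraday prices for one or more tickers."""
    if not is_valid_intraday_window(start, end):
        raise ValueError("Intraday requests must span fewer than 60 days.")
    import yfinance as yf  # imported lazily so the check above works offline
    return yf.download(tickers,
                       start=start.isoformat(),
                       end=end.isoformat(),
                       interval=interval,
                       auto_adjust=True,
                       progress=False)


# Example window: the 30 days ending today
end_day = date.today()
start_day = end_day - timedelta(days=30)
window_ok = is_valid_intraday_window(start_day, end_day)
# df = download_intraday(["AAPL", "MSFT"], start_day, end_day)  # needs network
```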

There’s more…

yfinance also offers an alternative way of downloading the data—via the Ticker class. First, we need to instantiate the object of the class:

aapl_data = yf.Ticker("AAPL")

To download the historical price data, we can use the history method:

aapl_data.history()

By default, the method downloads the last month of data. We can use the same arguments as in the download function to specify the range and frequency.

The main benefit of using the Ticker class is that we can download much more information than just the prices. Besides the history method, some of the available attributes include:

  • info—outputs a JSON object containing detailed information about the stock and its company, for example, the company’s full name, a short business summary, which exchange it is listed on, as well as a selection of financial metrics such as the beta coefficient
  • actions—outputs corporate actions such as dividends and splits
  • major_holders—presents the names of the major holders
  • institutional_holders—shows the institutional holders
  • calendar—shows the incoming events, such as the quarterly earnings
  • earnings/quarterly_earnings—shows the earnings information from the last few years/quarters
  • financials/quarterly_financials—contains financial information such as income before tax, net income, gross profit, EBIT, and much more

Please see the corresponding Jupyter notebook for more examples and outputs of those methods.

See also

For a complete list of downloadable data, please refer to the GitHub repo of yfinance (https://github.com/ranaroussi/yfinance).

You can check out some alternative libraries for downloading data from Yahoo Finance:

  • yahoofinancials—similarly to yfinance, this library offers the possibility of downloading a wide range of data from Yahoo Finance. The biggest difference is that all the downloaded data is returned as JSON.
  • yahoo_earnings_calendar—a small library dedicated to downloading the earnings calendar.

Getting data from Nasdaq Data Link

Alternative data can be anything that is considered non-market data, for example, weather data for agricultural commodities, satellite images that track oil shipments, or even customer feedback that reflects a company’s service performance. The idea behind using alternative data is to get an “informational edge” that can then be used for generating alpha. In short, alpha is a measure of performance describing an investment strategy’s, trader’s, or portfolio manager’s ability to beat the market.

Quandl was the leading provider of alternative data products for investment professionals (including quant funds and investment banks). Recently, it was acquired by Nasdaq and is now part of the Nasdaq Data Link service. The goal of the new platform is to provide a unified source of trusted data and analytics. It offers an easy way to download data, also via a dedicated Python library.

A good starting place for financial data would be the WIKI Prices database, which contains stock prices, dividends, and splits for 3,000 US publicly traded companies. The drawback of this database is that as of April 2018, it is no longer supported (meaning there is no recent data). However, for purposes of getting historical data or learning how to access the databases, it is more than enough.

We use the same example that we used in the previous recipe—we download Apple’s stock prices for the years 2011 to 2021.

Getting ready

Before downloading the data, we need to create an account at Nasdaq Data Link (https://data.nasdaq.com/) and then authenticate our email address (otherwise, an exception is likely to occur while downloading the data). We can find our personal API key in our profile (https://data.nasdaq.com/account/profile).

How to do it…

Execute the following steps to download data from Nasdaq Data Link:

  1. Import the libraries:
    import pandas as pd
    import nasdaqdatalink
    
  2. Authenticate using your personal API key:
    nasdaqdatalink.ApiConfig.api_key = "YOUR_KEY_HERE"
    

    You need to replace YOUR_KEY_HERE with your own API key.

  3. Download the data:
    df = nasdaqdatalink.get(dataset="WIKI/AAPL",
                            start_date="2011-01-01", 
                            end_date="2021-12-31")
    
  4. Inspect the downloaded data:
    print(f"Downloaded {len(df)} rows of data.")
    df.head()
    

    Running the code generates the following preview of the DataFrame:

Figure 1.2: Preview of the downloaded price information

The result of the request is a DataFrame (1,818 rows) containing the daily OHLC prices, the adjusted prices, dividends, and potential stock splits. As we mentioned in the introduction, the data is limited and is only available until April 2018; the last observation actually comes from March 27, 2018.

How it works…

The first step after importing the required libraries was authentication using the API key. When providing the dataset argument, we used the following structure: DATASET/TICKER.

We should keep the API keys secure and private, that is, not share them in public repositories, or anywhere else. One way to make sure that the key stays private is to create an environment variable (how to do it depends on your operating system) and then load it in Python. To do so, we can use the os module. To load the NASDAQ_KEY variable, we could use the following code: os.environ.get("NASDAQ_KEY").
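A minimal sketch of this approach is shown below. The helper name load_api_key is hypothetical; it reads the NASDAQ_KEY variable mentioned above and fails loudly if the variable is missing, instead of silently passing None to the library.

```python
import os


def load_api_key(var_name="NASDAQ_KEY"):
    """Read an API key from the environment, failing loudly if it is missing."""
    api_key = os.environ.get(var_name)
    if not api_key:
        raise RuntimeError(
            f"Environment variable {var_name} is not set; "
            "export it before running the script."
        )
    return api_key


# Usage (requires the variable to be set beforehand):
# nasdaqdatalink.ApiConfig.api_key = load_api_key()
```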

Some additional details on the get function are:

  • We can specify multiple datasets at once using a list such as ["WIKI/AAPL", "WIKI/MSFT"].
  • The collapse argument can be used to define the frequency (available options are daily, weekly, monthly, quarterly, or annually).
  • The transform argument can be used to carry out some basic calculations on the data prior to downloading. For example, we could calculate row-on-row change (diff), row-on-row percentage change (rdiff), or cumulative sum (cumul) or scale the series to start at 100 (normalize). Naturally, we can easily do the very same operation using pandas.
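As a rough illustration of the last point, the four transformations can be reproduced in pandas as follows. This is a sketch on a made-up series; it assumes that rdiff is expressed as a fraction rather than in percent.

```python
import pandas as pd

# Made-up price series used only for illustration
prices = pd.Series([50.0, 75.0, 25.0], name="price")

transformed = pd.DataFrame({
    "diff": prices.diff(),                       # row-on-row change
    "rdiff": prices.pct_change(),                # row-on-row percentage change
    "cumul": prices.cumsum(),                    # cumulative sum
    "normalize": prices / prices.iloc[0] * 100,  # rescaled to start at 100
})
```

Doing the calculation locally keeps the raw data available, so several transformations can be derived from a single download.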

There’s more...

Nasdaq Data Link distinguishes two types of API calls for downloading data. The get function we used before is classified as a time-series API call. We can also use the tables API call with the get_table function.

  1. Download the data for multiple tickers using the get_table function:
    COLUMNS = ["ticker", "date", "adj_close"]
    df = nasdaqdatalink.get_table("WIKI/PRICES", 
                                  ticker=["AAPL", "MSFT", "INTC"], 
                                  qopts={"columns": COLUMNS}, 
                                  date={"gte": "2011-01-01", 
                                        "lte": "2021-12-31"}, 
                                  paginate=True)
    df.head()
    
    Running the code generates the following preview of the DataFrame:

    Figure 1.3: Preview of the downloaded price data

    This function call is a bit more complex than the one we did with the get function. We first specified the table we want to use. Then, we provided a list of tickers. As the next step, we specified which columns of the table we were interested in. We also provided the range of dates, where gte stands for greater than or equal to, while lte is less than or equal to. Lastly, we also indicated we wanted to use pagination. The tables API is limited to 10,000 rows per call. However, by using paginate=True in the function call we extend the limit to 1,000,000 rows.

  2. Pivot the data from long format to wide:
    df = df.set_index("date")
    df_wide = df.pivot(columns="ticker")
    df_wide.head()
    

    Running the code generates the following preview of the DataFrame:

Figure 1.4: Preview of the pivoted DataFrame

The output of the get_table function is in the long format. However, to make our analyses easier, we might be interested in the wide format. To reshape the data, we first set the date column as an index and then used the pivot method of a pd.DataFrame.

Please bear in mind that this is not the only way to do so, and pandas contains at least a few helpful methods/functions that can be used for reshaping the data from long to wide and vice versa.
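The sketch below shows two equivalent ways of going from long to wide, and one way back, on made-up data that mimics the ticker, date, and adj_close columns. The values are illustrative only.

```python
import pandas as pd

# Made-up long-format data mimicking the ticker/date/adj_close columns
long_df = pd.DataFrame({
    "ticker": ["AAPL", "MSFT", "AAPL", "MSFT"],
    "date": pd.to_datetime(["2021-01-04", "2021-01-04",
                            "2021-01-05", "2021-01-05"]),
    "adj_close": [129.0, 217.0, 131.0, 218.0],
})

# Long -> wide, option 1: pivot with explicit index/columns/values
wide_pivot = long_df.pivot(index="date", columns="ticker", values="adj_close")

# Long -> wide, option 2: build a MultiIndex and unstack the ticker level
wide_unstack = (
    long_df.set_index(["date", "ticker"])["adj_close"].unstack("ticker")
)

# Wide -> long: melt the ticker columns back into rows
long_again = wide_pivot.reset_index().melt(
    id_vars="date", var_name="ticker", value_name="adj_close"
)
```

Passing values explicitly to pivot yields flat column names (one column per ticker) rather than a two-level column index.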

Getting data from Intrinio

Another interesting source of financial data is Intrinio, which offers access to its free (with limits) database. The following list presents just a few of the interesting data points that we can download using Intrinio:

  • Intraday historical data
  • Real-time stock/option prices
  • Financial statement data and fundamentals
  • Company news
  • Earnings-related information
  • IPOs
  • Economic data such as the Gross Domestic Product (GDP), unemployment rate, federal funds rate, etc.
  • 30+ technical indicators

Most of the data is free of charge, with some limits on the frequency of calling the APIs. Only the real-time price data of US stocks and ETFs requires a different kind of subscription.

In this recipe, we again download Apple’s stock prices for the years 2011 to 2021. Reusing the same example is instructive because the data returned by the API is not simply a pandas DataFrame and requires some interesting preprocessing.

Getting ready

Before downloading the data, we need to register at https://intrinio.com to obtain the API key.

Please see the following link (https://docs.intrinio.com/developer-sandbox) to understand what information is included in the sandbox API key (the free one).

How to do it…

Execute the following steps to download data from Intrinio:

  1. Import the libraries:
    import intrinio_sdk as intrinio
    import pandas as pd
    
  2. Authenticate using your personal API key, and select the API:
    intrinio.ApiClient().set_api_key("YOUR_KEY_HERE")
    security_api = intrinio.SecurityApi()
    

    You need to replace YOUR_KEY_HERE with your own API key.

  3. Request the data:
    r = security_api.get_security_stock_prices(
        identifier="AAPL", 
        start_date="2011-01-01",
        end_date="2021-12-31", 
        frequency="daily",
        page_size=10000
    )
    
  4. Convert the results into a DataFrame:
    df = (
        pd.DataFrame(r.stock_prices_dict)
        .sort_values("date")
        .set_index("date")
    )
    
  5. Inspect the data:
    print(f"Downloaded {df.shape[0]} rows of data.")
    df.head()
    

    The output looks as follows:

Figure 1.5: Preview of the downloaded price information

The resulting DataFrame contains the OHLC prices and volume, as well as their adjusted counterparts. However, that is not all, and we had to cut out some additional columns to make the table fit the page. The DataFrame also contains information, such as split ratio, dividend, change in value, percentage change, and the 52-week rolling high and low values.

How it works…

The first step after importing the required libraries was to authenticate using the API key. Then, we selected the API we wanted to use for the recipe—in the case of stock prices, it was the SecurityApi.

To download the data, we used the get_security_stock_prices method of the SecurityApi class. The parameters we can specify are as follows:

  • identifier—stock ticker or another acceptable identifier
  • start_date/end_date—these are self-explanatory
  • frequency—which data frequency is of interest to us (available choices: daily, weekly, monthly, quarterly, or yearly)
  • page_size—defines the number of observations to return on one page; we set it to a high number to collect all the data we need in one request with no need for the next_page token

The API returns a JSON-like object. We accessed the dictionary form of the response, which we then transformed into a DataFrame. We also set the date as an index using the set_index method of a pandas DataFrame.
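When a request does not fit on a single page, the results have to be collected page by page using the next_page token mentioned in the page_size description above. The sketch below shows the general pattern with a stand-in for the API. Both fetch_all_pages and the fetch_page callable are hypothetical; how the token is passed to and read from the real SDK is an assumption that should be checked against Intrinio's documentation.

```python
def fetch_all_pages(fetch_page):
    """Collect records from a paginated endpoint until no token remains.

    fetch_page is a hypothetical callable that takes a next_page token
    (or None for the first page) and returns a (records, next_page) tuple.
    """
    records, token = [], None
    while True:
        page, token = fetch_page(token)
        records.extend(page)
        if token is None:
            return records


# Stand-in for the API: three fake pages keyed by their token
FAKE_PAGES = {
    None: ([{"date": "2021-01-04"}], "p2"),
    "p2": ([{"date": "2021-01-05"}], "p3"),
    "p3": ([{"date": "2021-01-06"}], None),
}
all_records = fetch_all_pages(lambda token: FAKE_PAGES[token])
```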

There’s more...

In this section, we show some more interesting features of Intrinio.

Not all information is included in the free tier. For a more thorough overview of what data we can download for free, please refer to the following documentation page: https://docs.intrinio.com/developer-sandbox.

Get Coca-Cola’s real-time stock price

You can use the previously defined security_api to get the real-time stock prices:

security_api.get_security_realtime_price("KO")

The output of the snippet is the following JSON:

{'ask_price': 57.57,
 'ask_size': 114.0,
 'bid_price': 57.0,
 'bid_size': 1.0,
 'close_price': None,
 'exchange_volume': 349353.0,
 'high_price': 57.55,
 'last_price': 57.09,
 'last_size': None,
 'last_time': datetime.datetime(2021, 7, 30, 21, 45, 38, tzinfo=tzutc()),
 'low_price': 48.13,
 'market_volume': None,
 'open_price': 56.91,
 'security': {'composite_figi': 'BBG000BMX289',
              'exchange_ticker': 'KO:UN',
              'figi': 'BBG000BMX4N8',
              'id': 'sec_X7m9Zy',
              'ticker': 'KO'},
 'source': 'bats_delayed',
 'updated_on': datetime.datetime(2021, 7, 30, 22, 0, 40, 758000, tzinfo=tzutc())}

Download news articles related to Coca-Cola

One of the potential ways to generate trading signals is to aggregate the market’s sentiment on the given company. We could do it, for example, by analyzing news articles or tweets. If the sentiment is positive, we can go long, and vice versa. Below, we show how to download news articles about Coca-Cola:

r = intrinio.CompanyApi().get_company_news(
    identifier="KO", 
    page_size=100
)
 
df = pd.DataFrame(r.news_dict)
df.head()

This code returns the following DataFrame:

Figure 1.6: Preview of the news about the Coca-Cola company

Search for companies connected to the search phrase

Running the following snippet returns a list of companies that Intrinio’s Thea AI recognized based on the provided query string:

r = intrinio.CompanyApi().recognize_company("Intel")
df = pd.DataFrame(r.companies_dict)
df

As we can see, there are quite a few companies that also contain the phrase “intel” in their names, other than the obvious search result.

Figure 1.7: Preview of the companies connected to the phrase “intel”

Get Coca-Cola’s intraday stock prices

We can also retrieve intraday prices using the following snippet:

response = (
    security_api.get_security_intraday_prices(identifier="KO", 
                                              start_date="2021-01-02",
                                              end_date="2021-01-05",
                                              page_size=1000)
)
df = pd.DataFrame(response.intraday_prices_dict)
df

This returns the following DataFrame containing intraday price data:

Figure 1.8: Preview of the downloaded intraday prices

Get Coca-Cola’s latest earnings record

Another interesting use of the security_api is to retrieve the latest earnings record. We can do this using the following snippet:

r = security_api.get_security_latest_earnings_record(identifier="KO")
print(r)

The output of the API call contains quite a lot of useful information. For example, we can see what time of day the earnings call happened. This information could potentially be used for implementing trading strategies that act when the market opens.

Figure 1.9: Coca-Cola’s latest earnings record

Getting data from Alpha Vantage

Alpha Vantage is another popular data vendor providing high-quality financial data. Using their API, we can download the following:

  • Stock prices, including intraday and real-time (paid access)
  • Fundamentals: earnings, income statement, cash flow, earnings calendar, IPO calendar
  • Forex and cryptocurrency exchange rates
  • Economic indicators such as real GDP, Federal Funds Rate, Consumer Price Index, and consumer sentiment
  • 50+ technical indicators

In this recipe, we show how to download a selection of crypto-related data. We start with historical daily Bitcoin prices, and then show how to query the real-time crypto exchange rate.

Getting ready

Before downloading the data, we need to register at https://www.alphavantage.co/support/#api-key to obtain the API key. Access to the API and all the endpoints is free of charge (excluding the real-time stock prices) within some bounds (5 API requests per minute; 500 API requests per day).
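To stay within the per-minute bound when issuing several requests in a row, we can throttle the calls on our side. Below is a minimal sketch of a sliding-window rate limiter. The RateLimiter class is hypothetical (it is not part of any Alpha Vantage library), it only handles the per-minute bound and not the daily one, and the clock and sleep arguments exist purely to make the class easy to test.

```python
import time
from collections import deque


class RateLimiter:
    """Block until a call is allowed under a max_calls-per-period budget."""

    def __init__(self, max_calls=5, period=60.0,
                 clock=time.monotonic, sleep=time.sleep):
        self.max_calls = max_calls
        self.period = period
        self._clock = clock
        self._sleep = sleep
        self._calls = deque()  # timestamps of the recent calls

    def _drop_expired(self, now):
        """Forget calls that have already left the time window."""
        while self._calls and now - self._calls[0] >= self.period:
            self._calls.popleft()

    def wait(self):
        """Sleep if the budget is exhausted, then record the current call."""
        now = self._clock()
        self._drop_expired(now)
        if len(self._calls) >= self.max_calls:
            # Wait until the oldest recorded call leaves the window
            self._sleep(self.period - (now - self._calls[0]))
            now = self._clock()
            self._drop_expired(now)
        self._calls.append(now)


limiter = RateLimiter(max_calls=5, period=60.0)
# Call limiter.wait() right before each API request.
```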

How to do it…

Execute the following steps to download data from Alpha Vantage:

  1. Import the libraries:
    from alpha_vantage.cryptocurrencies import CryptoCurrencies
    
  2. Authenticate using your personal API key and select the API:
    ALPHA_VANTAGE_API_KEY = "YOUR_KEY_HERE"
    crypto_api = CryptoCurrencies(key=ALPHA_VANTAGE_API_KEY,
                                  output_format= "pandas")
    
  3. Download the daily prices of Bitcoin, expressed in EUR:
    data, meta_data = crypto_api.get_digital_currency_daily(
        symbol="BTC", 
        market="EUR"
    )
    

    The meta_data object contains some useful information about the details of the query. You can see it below:

    {'1. Information': 'Daily Prices and Volumes for Digital Currency',
     '2. Digital Currency Code': 'BTC',
     '3. Digital Currency Name': 'Bitcoin',
     '4. Market Code': 'EUR',
     '5. Market Name': 'Euro',
     '6. Last Refreshed': '2022-08-25 00:00:00',
     '7. Time Zone': 'UTC'}
    

    The data DataFrame contains all the requested information. We obtained 1,000 daily OHLC prices, the volume, and the market capitalization. What is also noteworthy is that all the OHLC prices are provided in two currencies: EUR (as we requested) and USD (the default one).

    Figure 1.10: Preview of the downloaded prices, volume, and market cap

  4. Download the real-time exchange rate:
    crypto_api.get_digital_currency_exchange_rate(
        from_currency="BTC", 
        to_currency="USD"
    )[0].transpose()
    

    Running the command returns the following DataFrame with the current exchange rate:

Figure 1.11: BTC-USD exchange rate

How it works…

After importing the alpha_vantage library, we had to authenticate using the personal API key. We did so while instantiating an object of the CryptoCurrencies class. At the same time, we specified that we would like to obtain output in the form of a pandas DataFrame. The other possibilities are JSON and CSV.

In Step 3, we downloaded the daily BTC prices using the get_digital_currency_daily method. Additionally, we specified that we wanted to get the prices in EUR. By default, the method will return the requested EUR prices, as well as their USD equivalents.

Lastly, we downloaded the real-time BTC/USD exchange rate using the get_digital_currency_exchange_rate method.

There’s more...

So far, we have used the alpha_vantage library as a middleman to download information from Alpha Vantage. However, the functionalities of the data vendor evolve faster than the third-party library and it might be interesting to learn an alternative way of accessing their API.

  1. Import the libraries:
    import requests
    import pandas as pd
    from io import BytesIO
    
  2. Download Ethereum’s (ETH) intraday data:
    AV_API_URL = "https://www.alphavantage.co/query"
    parameters = {
        "function": "CRYPTO_INTRADAY",
        "symbol": "ETH",
        "market": "USD",
        "interval": "30min",
        "outputsize": "full",
        "apikey": ALPHA_VANTAGE_API_KEY
    }
    r = requests.get(AV_API_URL, params=parameters)
    data = r.json()
    df = (
        pd.DataFrame(data["Time Series Crypto (30min)"])
        .transpose()
    )
    df
    

    Running the snippet above returns the following preview of the downloaded DataFrame:

    Figure 1.12: Preview of the DataFrame containing Ethereum’s (ETH) intraday prices

    We first defined the base URL used for requesting information via Alpha Vantage’s API. Then, we defined a dictionary containing the additional parameters of the request, including the personal API key. In our function call, we specified that we want to download intraday ETH prices expressed in USD and sampled every 30 minutes. We also indicated we want a full output (by specifying the outputsize parameter). The other option is compact output, which downloads the 100 most recent observations.

    Having prepared the request’s parameters, we used the get function from the requests library. We provide the base URL and the parameters dictionary as arguments. After obtaining the response to the request, we can access it in JSON format using the json method. Lastly, we convert the element of interest into a pandas DataFrame.

    Alpha Vantage’s documentation shows a slightly different approach to downloading this data, that is, by creating a long URL with all the parameters specified there. Naturally, that is also a possibility; however, the option presented above is a bit neater. To see the very same request URL as presented by the documentation, you can run r.request.url.
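To see how the parameters dictionary relates to the long URL, we can also build the URL by hand with the standard library. This is only a sketch with a placeholder API key; for simple values like these, the result should match what r.request.url reports.

```python
from urllib.parse import urlencode

AV_API_URL = "https://www.alphavantage.co/query"
parameters = {
    "function": "CRYPTO_INTRADAY",
    "symbol": "ETH",
    "market": "USD",
    "interval": "30min",
    "outputsize": "full",
    "apikey": "demo",  # placeholder, not a real key
}

# Join the base URL and the encoded query string into one long URL
full_url = f"{AV_API_URL}?{urlencode(parameters)}"
print(full_url)
```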

  3. Download the upcoming earnings announcements within the next three months:
    AV_API_URL = "https://www.alphavantage.co/query"
    parameters = {
        "function": "EARNINGS_CALENDAR",
        "horizon": "3month",
        "apikey": ALPHA_VANTAGE_API_KEY
    }
    r = requests.get(AV_API_URL, params=parameters)
    pd.read_csv(BytesIO(r.content))
    

    Running the snippet returns the following output:

Figure 1.13: Preview of a DataFrame containing the downloaded earnings information

While getting the response to our API request is very similar to the previous example, handling the output is much different.

The output of r.content is a bytes object containing the output of the query as text. To mimic a normal file in-memory, we can use the BytesIO class from the io module. Then, we can normally load that mimicked file using the pd.read_csv function.
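The same mechanism can be tried offline. In the sketch below, raw_bytes is a made-up stand-in for r.content, and the column names are illustrative rather than the ones returned by the API.

```python
from io import BytesIO

import pandas as pd

# Made-up bytes standing in for r.content; the columns are illustrative
raw_bytes = (
    b"symbol,reportDate,estimate\n"
    b"AAA,2022-10-27,1.25\n"
    b"BBB,2022-11-03,0.40\n"
)

# Wrap the bytes in a file-like object and parse them as CSV
earnings_df = pd.read_csv(BytesIO(raw_bytes), parse_dates=["reportDate"])
```

Because pd.read_csv accepts any file-like object, no temporary file has to be written to disk.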

In the accompanying notebook, we present a few more functionalities of Alpha Vantage, such as getting the quarterly earnings data, downloading the calendar of the upcoming IPOs, and using alpha_vantage's TimeSeries module to download stock price data.

Getting data from CoinGecko

The last data source we will cover is dedicated purely to cryptocurrencies. CoinGecko is a popular data vendor and crypto-tracking website, on which you can find real-time exchange rates, historical data, information about exchanges, upcoming events, trading volumes, and much more.

We can list a few of the advantages of CoinGecko:

  • Completely free, and no need to register for an API key
  • Aside from prices, it also provides updates and news about crypto
  • It covers many coins, not only the most popular ones

In this recipe, we download Bitcoin’s OHLC from the last 14 days.

How to do it…

Execute the following steps to download data from CoinGecko:

  1. Import the libraries:
    from pycoingecko import CoinGeckoAPI
    from datetime import datetime
    import pandas as pd
    
  2. Instantiate the CoinGecko API:
    cg = CoinGeckoAPI()
    
  3. Get Bitcoin’s OHLC prices from the last 14 days:
    ohlc = cg.get_coin_ohlc_by_id(
        id="bitcoin", vs_currency="usd", days="14"
    )
    ohlc_df = pd.DataFrame(ohlc)
    ohlc_df.columns = ["date", "open", "high", "low", "close"]
    ohlc_df["date"] = pd.to_datetime(ohlc_df["date"], unit="ms")
    ohlc_df
    

    Running the snippet above returns the following DataFrame:

Figure 1.14: Preview of the DataFrame containing the requested Bitcoin prices

In the preceding table, we can see that we have obtained the requested 14 days of data, sampled every 4 hours.

How it works…

After importing the libraries, we instantiated the CoinGeckoAPI object. Then, using its get_coin_ohlc_by_id method we downloaded the last 14 days’ worth of BTC/USD exchange rates. It is worth mentioning there are some limitations of the API:

  • We can only download data for a predefined number of days. We can select one of the following options: 1/7/14/30/90/180/365/max.
  • The OHLC candles are sampled with a varying frequency depending on the requested horizon. They are sampled every 30 minutes for requests of 1 or 2 days. Between 3 and 30 days they are sampled every 4 hours. Above 30 days, they are sampled every 4 days.
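
The sampling rule above can be sketched as a small helper. The function name expected_candle_interval is hypothetical (it is not part of pycoingecko), it only encodes the thresholds listed above, and it does not handle the "max" option:

```python
import pandas as pd


def expected_candle_interval(days: int) -> pd.Timedelta:
    """Hypothetical helper: candle spacing for a requested horizon in days."""
    if days <= 2:
        return pd.Timedelta(minutes=30)
    if days <= 30:
        return pd.Timedelta(hours=4)
    return pd.Timedelta(days=4)


# Rough upper bound on the number of candles for a 14-day request;
# the number of rows actually returned by the API may differ.
approx_candles = pd.Timedelta(days=14) / expected_candle_interval(14)
print(expected_candle_interval(14), approx_candles)
```

Such a check is handy before requesting data, as it makes it clear that longer horizons come at the cost of coarser candles.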

The output of the get_coin_ohlc_by_id method is a list of lists, which we can convert into a pandas DataFrame. We had to manually create the column names, as they were not provided by the API.
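
To make the list-of-lists structure concrete without calling the API, the sketch below uses a small hand-written sample. The prices are invented for illustration, and the element order (timestamp in milliseconds, then open, high, low, close) follows the column names assigned in the recipe:

```python
import pandas as pd

# Hand-written stand-in for the API output: one inner list per candle.
# The timestamps are in milliseconds; the price values are made up.
sample_ohlc = [
    [1640995200000, 46000.0, 46500.0, 45800.0, 46200.0],
    [1641009600000, 46200.0, 47000.0, 46100.0, 46900.0],
    [1641024000000, 46900.0, 47200.0, 46600.0, 46700.0],
]

ohlc_df = pd.DataFrame(
    sample_ohlc, columns=["date", "open", "high", "low", "close"]
)
ohlc_df["date"] = pd.to_datetime(ohlc_df["date"], unit="ms")

# The spacing between consecutive candles reveals the sampling frequency.
candle_spacing = ohlc_df["date"].diff().dropna().min()
print(ohlc_df)
print(candle_spacing)
```

Checking the spacing this way is a quick sanity test that the downloaded candles have the frequency we expect for the requested horizon.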

There’s more...

We have seen that getting the OHLC prices can be a bit more difficult using the CoinGecko API as compared to the other vendors. However, CoinGecko has additional interesting information we can download using its API. In this section, we show a few possibilities.

Get the top 7 trending coins

We can use CoinGecko to acquire the top 7 trending coins—the ranking is based on the number of searches on CoinGecko within the last 24 hours. While downloading this information, we also get the coins’ symbols, their market capitalization ranking, and the latest price in BTC:

trending_coins = cg.get_search_trending()
(
    pd.DataFrame([coin["item"] for coin in trending_coins["coins"]])
    .drop(columns=["thumb", "small", "large"])
)

Using the snippet above, we obtain the following DataFrame:

Figure 1.15: Preview of the DataFrame containing the 7 trending coins and some information about them
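
The same flattening step can be tried offline. The sketch below uses a trimmed-down, hypothetical response that only mimics the nesting used above (a "coins" list whose elements wrap the payload in "item"); the coin names and ranks are invented. It also passes errors="ignore" to drop, which is an optional safeguard in case one of the image fields is ever missing from the response:

```python
import pandas as pd

# Hypothetical response mimicking the nesting of the trending endpoint.
# Note that the "large" field is deliberately absent here.
mock_trending = {
    "coins": [
        {"item": {"id": "coin-a", "symbol": "AAA", "market_cap_rank": 10,
                  "thumb": "a_thumb.png", "small": "a_small.png"}},
        {"item": {"id": "coin-b", "symbol": "BBB", "market_cap_rank": 25,
                  "thumb": "b_thumb.png", "small": "b_small.png"}},
    ]
}

trending_df = (
    pd.DataFrame([coin["item"] for coin in mock_trending["coins"]])
    # errors="ignore" skips labels that are not present instead of raising
    .drop(columns=["thumb", "small", "large"], errors="ignore")
)
print(trending_df)
```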

Get Bitcoin’s current price in USD

We can also extract current crypto prices in various currencies:

cg.get_price(ids="bitcoin", vs_currencies="usd")

Running the snippet above returns Bitcoin’s real-time price:

{'bitcoin': {'usd': 47312}}
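
The nested dictionary is easy to turn into a table. The sketch below assumes that a response covering several coins and currencies has the same nested shape as the output above; the Ethereum entry and the EUR values are hypothetical additions for illustration:

```python
import pandas as pd

# Dictionary shaped like the output above. Only the bitcoin/usd value
# comes from the text; the other entries are made up.
prices = {
    "bitcoin": {"usd": 47312, "eur": 41800},
    "ethereum": {"usd": 3900, "eur": 3450},
}

# orient="index" turns the outer keys (coins) into rows and the
# inner keys (currencies) into columns.
prices_df = pd.DataFrame.from_dict(prices, orient="index")
prices_df.index.name = "coin"
print(prices_df)
```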

In the accompanying notebook, we present a few more functionalities of pycoingecko, such as getting the crypto price in currencies other than USD, downloading the entire list of coins supported on CoinGecko (over 9,000 coins), getting each coin’s detailed market data (market capitalization, 24h volume, the all-time high, and so on), and loading the list of the most popular exchanges.

See also

You can find the documentation of the pycoingecko library here: https://github.com/man-c/pycoingecko.

Summary

In this chapter, we have covered a few of the most popular sources of financial data. However, this is just the tip of the iceberg. Below, you can find a list of other interesting data sources that might suit your needs even better.

Additional data sources are:

  • IEX Cloud (https://iexcloud.io/): a platform providing a vast trove of different financial data. A notable feature that is unique to the platform is a daily and minute-level sentiment score based on the activity on Stocktwits, an online community for investors and traders. However, that API is only available in the paid plan. You can access the IEX Cloud data using pyex, the official Python library.
  • Tiingo (https://www.tiingo.com/) and the tiingo library.
  • CryptoCompare (https://www.cryptocompare.com/): the platform offers a wide range of crypto-related data via its API. What stands out about this data vendor is that it provides order book data.
  • Twelve Data (https://twelvedata.com/).
  • polygon.io (https://polygon.io/): a vendor of real-time and historical data (stocks, forex, and crypto), trusted by companies such as Google, Robinhood, and Revolut.
  • Shrimpy (https://www.shrimpy.io/) and shrimpy-python, the official Python library for the Shrimpy Developer API.

In the next chapter, we will learn how to preprocess the downloaded data for further analysis.

Join us on Discord!

To join the Discord community for this book – where you can share feedback, ask questions to the author, and learn about new releases – follow the QR code below:

https://packt.link/ips2H

Key benefits

  • Explore unique recipes for financial data processing and analysis with Python
  • Apply classical and machine learning approaches to financial time series analysis
  • Calculate various technical analysis indicators and backtest trading strategies

Description

Python is one of the most popular programming languages in the financial industry, with a huge collection of accompanying libraries. In this new edition of the Python for Finance Cookbook, you will explore classical quantitative finance approaches to data modeling, such as GARCH, CAPM, and factor models, as well as modern machine learning and deep learning solutions. You will use popular Python libraries that, in a few lines of code, provide the means to quickly process, analyze, and draw conclusions from financial data. This new edition puts more emphasis on exploratory data analysis to help you visualize and better understand financial data. While doing so, you will also learn how to use Streamlit to create elegant, interactive web applications to present the results of technical analyses. Using the recipes in this book, you will become proficient in financial data analysis, be it for personal or professional projects. You will also understand which potential issues to expect with such analyses and, more importantly, how to overcome them.

What you will learn

  • Preprocess, analyze, and visualize financial data
  • Explore time series modeling with statistical (exponential smoothing, ARIMA) and machine learning models
  • Uncover advanced time series forecasting algorithms such as Meta’s Prophet
  • Use Monte Carlo simulations for derivatives valuation and risk assessment
  • Explore volatility modeling using univariate and multivariate GARCH models
  • Investigate various approaches to asset allocation
  • Learn how to approach ML projects using an example of default prediction
  • Explore modern deep learning models such as Google’s TabNet, Amazon’s DeepAR, and NeuralProphet

Table of Contents

18 Chapters
Preface
Acquiring Financial Data
Data Preprocessing
Visualizing Financial Time Series
Exploring Financial Time Series Data
Technical Analysis and Building Interactive Dashboards
Time Series Analysis and Forecasting
Machine Learning-Based Approaches to Time Series Forecasting
Multi-Factor Models
Modeling Volatility with GARCH Class Models
Monte Carlo Simulations in Finance
Asset Allocation
Backtesting Trading Strategies
Applied Machine Learning: Identifying Credit Default
Advanced Concepts for Machine Learning Projects
Deep Learning in Finance
Other Books You May Enjoy
Index
