Python for Finance Cookbook - Second Edition


Product type Book
Published in Dec 2022
Publisher Packt
ISBN-13 9781803243191
Pages 740 pages
Edition 2nd Edition
Author Eryk Lewinson

Table of Contents (18) Chapters

  • Preface
  • Acquiring Financial Data
  • Data Preprocessing
  • Visualizing Financial Time Series
  • Exploring Financial Time Series Data
  • Technical Analysis and Building Interactive Dashboards
  • Time Series Analysis and Forecasting
  • Machine Learning-Based Approaches to Time Series Forecasting
  • Multi-Factor Models
  • Modeling Volatility with GARCH Class Models
  • Monte Carlo Simulations in Finance
  • Asset Allocation
  • Backtesting Trading Strategies
  • Applied Machine Learning: Identifying Credit Default
  • Advanced Concepts for Machine Learning Projects
  • Deep Learning in Finance
  • Other Books You May Enjoy
  • Index

Getting data from CoinGecko

The last data source we cover is dedicated purely to cryptocurrencies. CoinGecko is a popular data vendor and crypto tracking website, on which you can find real-time exchange rates, historical data, information about exchanges, upcoming events, trading volumes, and much more.

We can list a few of the advantages of CoinGecko:

  • It is completely free and does not require registering for an API key.
  • Aside from prices, it also provides updates and news about crypto.
  • It covers many coins, not only the most popular ones.

In this recipe, we download Bitcoin's OHLC prices from the last 14 days.

How to do it…

Execute the following steps to download data from CoinGecko.

  1. Import the libraries:
    from pycoingecko import CoinGeckoAPI
    from datetime import datetime

  2. Instantiate the CoinGecko API:
    cg = CoinGeckoAPI()

  3. Get Bitcoin's OHLC prices from the last 14 days:
    ohlc = cg.get_coin_ohlc_by_id(id="bitcoin", vs_currency="usd", days="14")
    ohlc_df...
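
The excerpt is cut off before the response is processed. As a rough sketch only (not the book's code), the snippet below shows one way such a response could be turned into a DataFrame, assuming the OHLC endpoint returns a list of rows shaped as [timestamp in milliseconds, open, high, low, close]. The sample values and the name `ohlc_frame` are hypothetical, used so that the example runs without a network call.

```python
import pandas as pd

# Hypothetical sample shaped like the assumed OHLC response:
# each row is [timestamp in milliseconds, open, high, low, close].
sample_ohlc = [
    [1640995200000, 46200.0, 46900.0, 46100.0, 46700.0],
    [1641009600000, 46700.0, 47300.0, 46500.0, 47100.0],
]

# Name the columns explicitly, since the raw rows carry no labels.
ohlc_frame = pd.DataFrame(
    sample_ohlc, columns=["date", "open", "high", "low", "close"]
)

# Convert the millisecond timestamps into datetimes and use them as the index.
ohlc_frame["date"] = pd.to_datetime(ohlc_frame["date"], unit="ms")
ohlc_frame = ohlc_frame.set_index("date")

print(ohlc_frame)
```

With live data, `sample_ohlc` would be replaced by the `ohlc` object returned in step 3.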

Summary

In this chapter, we have covered a few of the most popular sources of financial data. However, this is just the tip of the iceberg. Below, you can find a list of other interesting data sources that might suit your needs even better.

Additional data sources:

  • IEX Cloud (https://iexcloud.io/) - a platform providing a vast trove of different financial data. A notable feature that is unique to the platform is a daily and minutely sentiment score based on the activity on Stocktwits - an online community for investors and traders. However, that API is only available in the paid plan. You can access the IEX Cloud data using pyex, the official Python library.
  • Tiingo (https://www.tiingo.com/) and the tiingo library.
  • CryptoCompare (https://www.cryptocompare.com/) - the platform offers a wide range of crypto-related data via their API. What stands out about this data vendor is that they provide order book data.
  • twelvedata (https://twelvedata.com/)
  • polygon.io (https://polygon.io/) - a trusted...

Adjusting the returns for inflation

When doing different kinds of analyses, especially long-term ones, we might want to consider inflation. Inflation is the general rise in an economy’s price level over time or, put differently, the reduction in the purchasing power of money. That is why we might want to separate the effect of inflation from increases in stock prices caused by, for example, the companies’ growth or development.

We could adjust the stock prices directly, but in this recipe we focus on adjusting the returns and calculating the real returns. We can do so using the following formula:

$$R_t^r = \frac{1 + R_t}{1 + \pi_t} - 1$$

where $R_t^r$ is the real return, $R_t$ is the time $t$ simple return, and $\pi_t$ stands for the inflation rate.
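
As a small numeric illustration, the sketch below applies the standard real-return relation, (1 + simple return) / (1 + inflation rate) - 1, to a few monthly observations. The return and inflation figures are made up for the example, and the helper `real_returns` is a hypothetical name rather than part of the recipe.

```python
import pandas as pd


def real_returns(nominal: pd.Series, inflation: pd.Series) -> pd.Series:
    """Deflate simple returns by the inflation rate of the same period."""
    return (1 + nominal) / (1 + inflation) - 1


# Hypothetical monthly simple returns and monthly inflation rates.
months = ["2020-01", "2020-02", "2020-03"]
nominal = pd.Series([0.05, -0.02, 0.03], index=months)
inflation = pd.Series([0.002, 0.001, 0.003], index=months)

real = real_returns(nominal, inflation)
print(real)
```

Both series must be expressed at the same frequency (here, monthly) for the division to be meaningful.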

For this example, we use Apple’s stock prices from the years 2010 to 2020 (downloaded as in the previous recipe).

How to do it…

Execute the following steps to adjust the returns for inflation:

  1. Import libraries...

Changing the frequency of time series data

When working with time series, and especially financial ones, we often need to change the frequency (periodicity) of the data. For example, we receive daily OHLC prices, but our algorithm works with weekly data. Or we have daily alternative data, and we want to match it with our live feed of intraday data.

The general rule of thumb for changing frequency can be broken down into the following:

  • Multiply/divide the log returns by the number of time periods.
  • Multiply/divide the volatility by the square root of the number of time periods.
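
The two rules above can be sketched on simulated data. The example assumes 252 trading days per year and independent daily log returns drawn from a normal distribution; the parameters and variable names are illustrative only.

```python
import numpy as np
import pandas as pd

# Simulated daily log returns on business days (hypothetical parameters).
rng = np.random.default_rng(42)
dates = pd.bdate_range("2020-01-01", periods=252)
daily_log_ret = pd.Series(rng.normal(0.0005, 0.01, size=len(dates)), index=dates)

# Log returns are additive over time, so a weekly log return is the sum
# of the daily log returns that fall within that week.
weekly_log_ret = daily_log_ret.resample("W").sum()

# Volatility scales with the square root of the number of periods.
daily_vol = daily_log_ret.std()
annual_vol = daily_vol * np.sqrt(252)

# The average daily log return scales linearly with the number of periods.
annual_log_ret = daily_log_ret.mean() * 252

print(weekly_log_ret.head())
print(f"Annualized volatility: {annual_vol:.4f}")
```

Summing works only for log returns; simple returns would need to be compounded instead.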

For any process with independent increments (for example, the geometric Brownian motion), the variance of the logarithmic returns is proportional to time. For example, the variance of $r_{t_3} - r_{t_1}$ is going to be the sum of the following two variances: $r_{t_2} - r_{t_1}$ and $r_{t_3} - r_{t_2}$, assuming $t_1 \le t_2 \le t_3$. In such a case, when we also assume that the parameters of...

Different ways of imputing missing data

When working with any time series, some data may be missing for many possible reasons (someone forgot to input the data, a random issue with the database, and so on). One available solution is to discard observations with missing values. However, imagine a scenario in which we are analyzing multiple time series at once, and only one of the series is missing a value due to some random mistake. Do we still want to remove all the other potentially valuable pieces of information because of this single missing value? Probably not. There are many other scenarios in which we would rather impute the missing values than discard those observations.

Two of the simplest approaches to imputing missing time series data are:

  • Backward filling—fill the missing value with the next known value
  • Forward filling—fill the missing value with the previous known value...
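
Both approaches map onto pandas' `ffill` and `bfill` methods. The sketch below uses a short series with made-up values and gaps to show how the two differ, including at the edges of the series.

```python
import numpy as np
import pandas as pd

# Hypothetical daily series with gaps in the middle and at the end.
index = pd.date_range("2020-01-01", periods=5, freq="D")
series = pd.Series([1.0, np.nan, np.nan, 4.0, np.nan], index=index)

# Forward filling: carry the previous known value forward.
forward_filled = series.ffill()

# Backward filling: pull the next known value backward.
backward_filled = series.bfill()

print(pd.DataFrame({
    "original": series,
    "ffill": forward_filled,
    "bfill": backward_filled,
}))
```

Note that the last observation stays missing under backward filling because there is no later value to copy from. Backward filling also uses information from the future, which matters when the data feeds a forecasting model.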

Converting currencies

Another common preprocessing step in financial tasks is converting currencies. Imagine you have a portfolio of multiple assets priced in different currencies, and you would like to arrive at the portfolio’s total worth. The simplest example might be a mix of American and European stocks.

In this recipe, we show how to easily convert stock prices from USD to EUR. However, the very same steps can be used to convert any pair of currencies.

How to do it…

Execute the following steps to convert stock prices from USD to EUR:

  1. Import the libraries:
    import pandas as pd
    import yfinance as yf
    from forex_python.converter import CurrencyRates
    
  2. Download Apple’s OHLC prices from January 2020:
    df = yf.download("AAPL",
                     start="2020-01-01",
                     end="2020-01-31",
                     progress=False)
    df = df.drop(columns=["...
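
The recipe is truncated at this point. Independently of how the exchange rates are obtained, the conversion itself comes down to multiplying each day's price by that day's rate. The sketch below is not the book's code: it uses made-up prices and made-up USD/EUR rates, and the names `prices_usd`, `usd_eur`, and `prices_eur` are hypothetical.

```python
import pandas as pd

dates = pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-06"])

# Hypothetical closing prices in USD.
prices_usd = pd.DataFrame({"Close": [100.0, 102.0, 101.0]}, index=dates)

# Hypothetical daily USD/EUR rates (EUR received per 1 USD).
usd_eur = pd.Series([0.90, 0.89, 0.895], index=dates, name="usd_eur")

# Multiply every price column by the rate of the same date; pandas aligns
# the rows on the shared DatetimeIndex.
prices_eur = prices_usd.mul(usd_eur, axis=0).add_suffix("_EUR")

print(prices_eur)
```

Using a rate per date, rather than a single constant, keeps the converted series consistent with how the exchange rate moved over the period.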

Different ways of aggregating trade data

Before diving into building a machine learning model or designing a trading strategy, we not only need reliable data, but we also need to aggregate it into a format that is convenient for further analysis and appropriate for the models we choose. The term bars refers to a data representation that contains basic information about the price movements of any financial asset. We have already seen one form of bars in Chapter 1, Acquiring Financial Data, in which we explored how to download financial data from a variety of sources.

There, we downloaded OHLCV data sampled at some time frequency, be it monthly, daily, or intraday. This is the most common way of aggregating financial time series data, and the resulting bars are known as time bars.
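
To make the idea of time bars concrete, the sketch below aggregates a handful of made-up trades into one-minute OHLCV bars with pandas. The trade data and column names are hypothetical.

```python
import pandas as pd

# Hypothetical trades: one row per trade with its price and volume.
trades = pd.DataFrame(
    {
        "price": [100.0, 101.0, 99.5, 100.5, 102.0, 101.5],
        "volume": [10, 5, 8, 12, 7, 3],
    },
    index=pd.to_datetime(
        [
            "2020-01-02 09:30:05",
            "2020-01-02 09:30:40",
            "2020-01-02 09:30:55",
            "2020-01-02 09:31:10",
            "2020-01-02 09:31:30",
            "2020-01-02 09:32:20",
        ]
    ),
)

# Time bars: group trades into fixed one-minute intervals and summarize
# each interval by its open, high, low, and close prices.
time_bars = trades["price"].resample("1min").ohlc()

# Total traded volume within each interval.
time_bars["volume"] = trades["volume"].resample("1min").sum()

print(time_bars)
```

Each bar covers the same amount of clock time regardless of how many trades occurred in it: three trades fall in the first minute and only one in the last.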

Sampling financial time series by time has some drawbacks:

  • Time bars disguise the actual rate of activity in the market—they tend to oversample low activity periods (for example,...

Summary

In this chapter, we have learned how to preprocess financial time series data. We started by showing how to calculate returns and potentially adjust them for inflation. Then, we covered a few of the popular methods for imputing missing values. Lastly, we explained the different approaches to aggregating trade data and why choosing the correct one matters.

We should always pay close attention to this step, as we want not only to enhance our model’s performance but also to ensure the validity of any analysis. In the next chapter, we will continue working with the preprocessed data and learn how to create time series visualizations.
