Reader small image

You're reading from  Extending Excel with Python and R

Product typeBook
Published inApr 2024
PublisherPackt
ISBN-139781804610695
Edition1st Edition
Right arrow
Authors (2):
Steven Sanderson
Steven Sanderson
author image
Steven Sanderson

Steven Sanderson, MPH, is an applications manager for the patient accounts department at Stony Brook Medicine. He received his bachelor's degree in economics and his master's in public health from Stony Brook University. He has worked in healthcare in some capacity for just shy of 20 years. He is the author and maintainer of the healthyverse set of R packages. He likes to read material related to social and labor economics and has recently turned his efforts back to his guitar with the hope that his kids will follow suit as a hobby they can enjoy together.
Read more about Steven Sanderson

David Kun
David Kun
author image
David Kun

David Kun is a mathematician and actuary who has always worked in the gray zone between quantitative teams and ICT, aiming to build a bridge. He is a co-founder and director of Functional Analytics and the creator of the ownR Infinity platform. As a data scientist, he also uses ownR for his daily work. His projects include time series analysis for demand forecasting, computer vision for design automation, and visualization.
Read more about David Kun

View More author details
Right arrow

Performing linear regression in Python using Excel data

Linear regression in Python can be carried out with the help of libraries such as pandas, scikit-learn, statsmodels, and matplotlib. The following is a step-by-step code example:

  1. First, import the necessary libraries:
    # Import necessary libraries
    import pandas as pd
    import numpy as np
    import matplotlib.pyplot as plt
    from sklearn.model_selection import train_test_split
    import statsmodels.api as sm
    from statsmodels.graphics.regressionplots import plot_regress_exog
    from statsmodels.graphics.gofplots import qqplot
  2. Then, we create an Excel file with test data. Of course, in a real-life scenario, you would not need the mock data – you would skip this step and load the data from Excel (see the next step) after loading the necessary libraries:
    # Step 0: Generate sample data and save as Excel file
    np.random.seed(0)
    n_samples = 100
    X = np.random.rand(n_samples, 2)  # Two features
    y = 2 * X[:, 0] + 3 * X[:, 1] ...
lock icon
The rest of the page is locked
Previous PageNext Page
You have been reading a chapter from
Extending Excel with Python and R
Published in: Apr 2024Publisher: PacktISBN-13: 9781804610695

Authors (2)

author image
Steven Sanderson

Steven Sanderson, MPH, is an applications manager for the patient accounts department at Stony Brook Medicine. He received his bachelor's degree in economics and his master's in public health from Stony Brook University. He has worked in healthcare in some capacity for just shy of 20 years. He is the author and maintainer of the healthyverse set of R packages. He likes to read material related to social and labor economics and has recently turned his efforts back to his guitar with the hope that his kids will follow suit as a hobby they can enjoy together.
Read more about Steven Sanderson

author image
David Kun

David Kun is a mathematician and actuary who has always worked in the gray zone between quantitative teams and ICT, aiming to build a bridge. He is a co-founder and director of Functional Analytics and the creator of the ownR Infinity platform. As a data scientist, he also uses ownR for his daily work. His projects include time series analysis for demand forecasting, computer vision for design automation, and visualization.
Read more about David Kun