Calculating descriptive statistics with SQL
Descriptive statistics summarize the main characteristics of a dataset, and SQL can compute many of them directly in the database where the data lives. Here are some examples of how to calculate various measures of central tendency and variability in SQL.
Mean
The mean, also known as the average, is a measure of central tendency that represents the sum of all values in a dataset divided by the number of observations. In SQL, we can calculate the mean using the AVG() function. For example, to calculate the mean of the salary column in the employees table, we can use the following query:
SELECT AVG(salary) AS mean_salary
FROM employees;
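To try the query above without a database server, here is a minimal sketch that runs it against an in-memory SQLite database from Python. The sample rows, the name column, and the choice of SQLite are assumptions made for illustration only. One row has a NULL salary to show that AVG() skips NULL values, so the mean is taken over the non-NULL salaries rather than over every row.

```python
import sqlite3

# In-memory database with a hypothetical employees table (rows are invented).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employees (name TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employees (name, salary) VALUES (?, ?)",
    [("Ana", 40000.0), ("Ben", 50000.0), ("Cy", 60000.0), ("Dee", None)],
)

# AVG() ignores the NULL salary, so it averages three values, not four.
mean_salary = conn.execute(
    "SELECT AVG(salary) AS mean_salary FROM employees"
).fetchone()[0]

# The same mean written out as the sum divided by the count of non-NULL values.
manual_mean = conn.execute(
    "SELECT SUM(salary) / COUNT(salary) FROM employees"
).fetchone()[0]

print(mean_salary, manual_mean)
conn.close()
```

If missing salaries should count as zero instead of being skipped, one option is to wrap the column, as in AVG(COALESCE(salary, 0)), which changes the divisor to the full row count.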
Case scenario
An interesting real-world use of the mean is analyzing the average time visitors spend on a website. Assume we have a table named pageviews containing records of page views with columns such as visitor_id, page_url...