Solution
Select the dealerships.csv file and click on OK.

Solution
Open your favorite SQL client and connect to the sqlda database. Examine the schema for the customers table from the schema dropdown. Notice the names of the columns, just as we did in Exercise 6, Querying Salespeople, for the salespeople table.
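If your SQL client does not offer a schema dropdown, the same column information can be read from the standard information_schema catalog; this is an alternative way to inspect the table, not a step in the activity:

SELECT column_name, data_type
FROM information_schema.columns
WHERE table_name = 'customers';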
Run the following query to get the email addresses of customers from Florida in alphabetical order:

SELECT email FROM customers WHERE state='FL' ORDER BY email;
The following is the output of the preceding code:
Run the following query to get the first names, last names, and email addresses of customers from New York City:

SELECT first_name, last_name, email FROM customers WHERE city='New York City';
Solution
Open your favorite SQL client and connect to the sqlda database. Use the following query to join the tables and build the dataset:

SELECT c.*, p.*,
       COALESCE(s.dealership_id, -1),
       CASE WHEN p.base_msrp - s.sales_amount > 500 THEN 1 ELSE 0 END AS high_savings
FROM sales s
INNER JOIN customers c ON c.customer_id = s.customer_id
INNER JOIN products p ON p.product_id = s.product_id
LEFT JOIN dealerships d ON s.dealership_id = d.dealership_id;
Thus, from the output generated, we have the data the data science team needs to build a new model that predicts which customers are the best prospects for remarketing.
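If the team wants to reuse this output without re-running the joins each time, the result set can be persisted with CREATE TABLE AS; a minimal sketch (the table name customer_sales_prospects is an assumption, and it presumes customers and products share no column names):

CREATE TABLE customer_sales_prospects AS
SELECT c.*, p.*,
       COALESCE(s.dealership_id, -1) AS dealership_id,
       CASE WHEN p.base_msrp - s.sales_amount > 500 THEN 1 ELSE 0 END AS high_savings
FROM sales s
INNER JOIN customers c ON c.customer_id = s.customer_id
INNER JOIN products p ON p.product_id = s.product_id
LEFT JOIN dealerships d ON s.dealership_id = d.dealership_id;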
Solution
Open your favorite SQL client and connect to the sqlda database. Calculate the total number of unit sales using the COUNT function:

SELECT COUNT(*) FROM sales;
You should get 37,711 sales.
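As a quick sanity check, you can break the same count down by sales channel (a variation on this step, not required by the activity); the per-channel counts should add up to 37,711:

SELECT channel, COUNT(*)
FROM sales
GROUP BY 1;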
Calculate the total sales amount by state; we will use the SUM aggregate function here:

SELECT c.state, SUM(sales_amount) AS total_sales_amount
FROM sales s
INNER JOIN customers c ON c.customer_id = s.customer_id
GROUP BY 1
ORDER BY 1;
You will get the following output:
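If you would rather see the best-performing states first, the same aggregate can be ordered by the total instead of the state name:

SELECT c.state, SUM(sales_amount) AS total_sales_amount
FROM sales s
INNER JOIN customers c ON c.customer_id = s.customer_id
GROUP BY 1
ORDER BY 2 DESC;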
Identify the top five dealerships in terms of the most units sold, using a GROUP BY clause and setting LIMIT to 5:

SELECT s.dealership_id, COUNT(*)
FROM sales s
WHERE channel='dealership'
GROUP BY 1
ORDER BY 2 DESC
LIMIT 5;
You should get the following output:
Solution
Open your favorite SQL client and connect to the sqlda database. Calculate the total sales amount by day for 2018 using the SUM function:

SELECT sales_transaction_date::DATE,
       SUM(sales_amount) AS total_sales_amount
FROM sales
WHERE sales_transaction_date >= '2018-01-01'
  AND sales_transaction_date < '2019-01-01'
GROUP BY 1
ORDER BY 1;
The following is the output of the preceding code:
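The same pattern aggregates at any granularity; for example, monthly totals for 2018 can be computed with DATE_TRUNC (a variation, not part of the original activity):

SELECT DATE_TRUNC('month', sales_transaction_date) AS sales_month,
       SUM(sales_amount) AS total_sales_amount
FROM sales
WHERE sales_transaction_date >= '2018-01-01'
  AND sales_transaction_date < '2019-01-01'
GROUP BY 1
ORDER BY 1;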
Calculate the 30-day moving average of the number of daily deals using a window frame over the daily counts:

WITH daily_deals AS (
  SELECT sales_transaction_date::DATE,
         COUNT(*) AS total_deals
  FROM sales
  GROUP BY 1
),
moving_average_calculation_30 AS (
  SELECT sales_transaction_date,
         total_deals,
         AVG(total_deals) OVER (ORDER BY sales_transaction_date
                                ROWS BETWEEN 30 PRECEDING AND CURRENT ROW)
           AS deals_moving_average...
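The query above is truncated in the source. A sketch of how it could be completed, following the moving-average pattern used elsewhere in this book; the ROW_NUMBER guard (which blanks out the average until a full 30-day window exists) and the final column alias are assumptions:

WITH daily_deals AS (
  SELECT sales_transaction_date::DATE,
         COUNT(*) AS total_deals
  FROM sales
  GROUP BY 1
),
moving_average_calculation_30 AS (
  SELECT sales_transaction_date,
         total_deals,
         AVG(total_deals) OVER (ORDER BY sales_transaction_date
                                ROWS BETWEEN 30 PRECEDING AND CURRENT ROW)
           AS deals_moving_average,
         ROW_NUMBER() OVER (ORDER BY sales_transaction_date) AS row_number
  FROM daily_deals
)
SELECT sales_transaction_date,
       CASE WHEN row_number >= 30 THEN deals_moving_average END
         AS deals_moving_average_30
FROM moving_average_calculation_30
ORDER BY 1;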
Solution
Open a command line and run jupyter notebook (if you do not have a notebook server running already). In the browser window that pops up, create a new Python 3 notebook. In the first cell, type in the standard import statements and the connection information (replacing your_X with the appropriate parameters for your database connection):

from sqlalchemy import create_engine
import pandas as pd
%matplotlib inline
cnxn_string = ("postgresql+psycopg2://{username}:{pswd}" ...
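The connection string is cut off above. A minimal sketch of a typical completion, assuming the standard SQLAlchemy/psycopg2 URL format (every placeholder value here is yours to replace):

cnxn_string = ("postgresql+psycopg2://{username}:{pswd}"
               "@{host}:{port}/{database}")
engine = create_engine(cnxn_string.format(
    username="your_username",
    pswd="your_password",
    host="your_host",
    port=5432,
    database="your_database_name"))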
Solution
Create a materialized view over the customer_sales table that includes a search vector:

CREATE MATERIALIZED VIEW customer_search AS (
  SELECT customer_json -> 'customer_id' AS customer_id,
         customer_json,
         to_tsvector('english', customer_json) AS search_vector
  FROM customer_sales
);
Create a GIN index on the view to speed up the search:

CREATE INDEX customer_search_gin_idx ON customer_search USING GIN(search_vector);
Query the view for customers matching the terms Danny and Bat:

SELECT customer_id, customer_json
FROM customer_search
WHERE search_vector @@ plainto_tsquery('english', 'Danny Bat');
This results in eight matching rows:
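Keep in mind that a materialized view is a snapshot: if rows in customer_sales change later, neither the view nor its GIN index will reflect them until the view is refreshed:

REFRESH MATERIALIZED VIEW customer_search;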
Solution
Open your favorite SQL client and connect to the sqlda database:

C:\> psql sqlda
Use the EXPLAIN command to return the query plan for selecting all available records within the customers table:

sqlda=# EXPLAIN SELECT * FROM customers;
This query will produce the following output from the planner:
The setup cost is 0, the total query cost is 1536, the estimated number of rows is 50,000, and the width of each row is 140 bytes. Note that cost is measured in the planner's arbitrary cost units rather than time, the row figure is an estimate rather than an exact count, and the width is in bytes per row.
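To see how these cost units respond to a filter, you can EXPLAIN a narrower query (an extra illustration, not a step in this activity):

sqlda=# EXPLAIN SELECT * FROM customers WHERE state='FL';

The planner still performs a sequential scan over the whole table, so the estimated cost stays in the same range, while the estimated row count drops to the number of rows expected to match the filter.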
Repeat the query from the previous step, this time limiting the number of returned records to 15:

sqlda=# EXPLAIN SELECT * FROM customers LIMIT 15;
This query will produce the following output from the planner:
Two steps are involved in the query, and the limiting step costs 0.46 units...
Solution
Open your favorite SQL client and connect to the sqlda database:

$ psql sqlda
Compute the cumulative sum of sales using the OVER and ORDER BY statements. Insert the results into a new table called bat_sales_growth:

sqlda=# SELECT *, sum(count) OVER (ORDER BY sales_transaction_date) INTO bat_sales_growth FROM bat_sales_daily;
The following table shows the daily cumulative sum of sales:
Compute the 7-day lag of the sum column and insert all the columns of bat_sales_growth and the new lag column into a new table, bat_sales_daily_delay. This lag column indicates what sales were like 1 week before the given record:

sqlda=# SELECT *, lag(sum, 7) OVER (ORDER BY sales_transaction_date) INTO bat_sales_daily_delay FROM bat_sales_growth;
Inspect the first 15 rows of bat_sales_daily_delay:

sqlda=# SELECT * FROM bat_sales_daily_delay LIMIT 15;
The following is the output of the preceding code:
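From here, a week-over-week growth rate can be derived from the sum and lag columns; a minimal sketch (the weekly_growth alias and the NULLIF guard against division by zero are additions, not part of the original steps):

sqlda=# SELECT *, (b.sum - b.lag) / NULLIF(b.lag, 0) AS weekly_growth FROM bat_sales_daily_delay b;

For the first seven rows, lag is NULL, so weekly_growth is NULL as well.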