Title | SQL Appendix Exercise 1 |
---|---|
Author | Gianna DiGiovanni |
Course | Data Warehouses |
Institution | University of Southern California |
Pages | 10 |
File Size | 802.1 KB |
File Type | |
Total Downloads | 25 |
Total Views | 147 |
SQL Assignment...
SQL with Teradata
SQL APPENDIX: EXERCISE 1 ANALYZING DATA WITH SQL NITIN KALÉ, UNIVERSITY OF SOUTHERN CALIFORNIA NANCY JONES, SAN DIEGO STATE UNIVERSITY
OBJECTIVE To use SQL to query a Teradata Database.
ACTIVITIES
Connect to a Teradata database Use SELECT, FROM, WHERE and other clauses to derive information from one or multiple tables. Use JOINs
SOFTWARE PREREQUISITES
Remote desktop connection
UCC PRODUCTS REQUIRED
None
SYSTEM ACCESS REQUIRED
Access to Sam’s Club data stored in a Teradata database. Request access here http://walton.uark.edu/enterprise/teradata-university-network.php
Nitin Kalé & Nancy Jones © 2016
1
SQL with Teradata
SAM'S CLUB DATABASE
Sam's Club, a division of Wal-Mart Stores, Inc., is a warehouse club that specializes in selling to small businesses. A membership-based store, Sam's Club offers goods and services for consumers and business owners as well as affordable luxury merchandise. Sam's Club keeps prices low by selling merchandise in bulk and at very low profit margins. The Sam's Club Database contains retail sales information gathered from sales at Sam's Club stores. The process used to gather this information begins with a Sam's Club member gathering all of the items they intend to purchase during the current visit to Sam's Club. The member then proceeds to a register to check out. A Sam's Club associate scans the member's Sam's Club card, at which point a visit number (visit_nbr) is generated and stored in the store_visits table. The associate proceeds by scanning each item with a barcode reader. When all of the items have been scanned, summary information about each individual type of product (i.e. 6 packages of sop) purchased during that visit is recorded in the item_scan table. When payment is tendered for items purchased on that visit, summary information for the total order (transaction time & date, amount spent, number of unique items purchased, etc) is recorded in the store_visits table. Other tables are used to store information about stores, products, and members. REAL DATA AND DATA INTEGRITY
The retail sales information in the UA_SAMSCLUB database was provided to the Walton College of Business by Wal-Mart Stores, Inc. The database consists of 6 tables with more than 55 million rows populated and ready for use. This is a gifted dataset that is based on real operational data. Like many real databases, integrity problems may be noted. This can provide a unique opportunity not only to expose students to real data but also to illustrate the effects of data integrity problems. SAM’S CLUB DATABASE SCHEMA
Nitin Kalé & Nancy Jones © 2016
2
SQL with Teradata
THE METADATA FOR SAM'S CLUB DATABASE
Attribute
Description
Nitin Kalé & Nancy Jones © 2016
Values
3
SQL with Teradata
ACTIVITY_CD
Activity Code
Y, N
BRAND_NAME
Name of the brand associated with the item
Null, name of brand
BUS_CR_TYP_STAT_CD
Business Credit Type Status Code
0-10
CARD_HOLDER_NBR
Card holder within an account
1-99
CATEGORY_NBR
Number assigned to a category of items
Null, 0-99
CMPLMNTRY_CARD_CN T
Number of extra cards given to an account
0-4
COLOR_DESC
Color description of an item
White, Almond, etc
CREATE_DATE
Date the item was created
Date
EFFECTIVE_DATE
Date the item began to be sold
Date 0-4
ELITE_STAT_CODE EXPIRATION_DATE
Expiration date of an item
Date
FINELINE
Combination of category_nbr & sub_category_nbr
4 digit number
ISSUING_CLUB_NBR
The club that the member originally joined
1-150
ITEM_NBR
The number assigned to every different item for sale
Unique number (PK)
ITEM_QUANTITY
The quantity of a unique item that is scanned
JOIN_DATE
Date the member joined the club
Date
LAST_RENEWAL_DATE
Last date that the member renewed their membership
Date
MEMBER_CODE
1,A,D,E,G,V,W,X,Y
MEMBER_STATUS_CD
A,D,E,T
MEMBER_TYPE
1,A,E,G,V,W,X
MEMBERSHIP_NBR
The number assigned to the member upon joining the club
MFG_NBR
Number representing a manufacturer
OBSOLETE_DATE
The date an item is no longer sold
Nitin Kalé & Nancy Jones © 2016
4
Date
SQL with Teradata
OPERATOR_NBR PRIMARY_DESC
The description of an item
Teal X-Large etc Null, 015-3001
QUALIFY_ORG_CODE REFUND_CODE
Code to indicate a return transaction
0 = Not Return, 1= Return
REGISTER_NBR
The register identification number where the transaction took place
1-85
RENEWAL_DATE
Date a membership should be renewed
Date
SALES_TAX_AMT
Tax charged for total visit
SECONDARY_CARD_CNT
Number of cards other than primary card assigned to the membership
SECONDARY_DESC
Additional description of an item
Sweatshirt, gift set etc
SIC
Standard Industry Classification code
783700, 443700 etc
SIZE_DESC
Text description of the size of the item, including clothing and non-clothing items
15CUFT, LARGE, etc
STATUS_CHG_DATE
The date an item last changed its status code
Date
STATUS_CODE
Whether an item is active or deactive
A = Active, D = Deactive
STORE_NAME
The name of the store
STORE_NBR
Store identification number
SUB_CATEGORY_NBR
The number assigned to a sub_category of items
TAX_COLLECT_CODE
Purchase taxable or not
TENDER_AMT
The amount tendered for the purchase
TENDER_TYPE
Type of payment used
Nitin Kalé & Nancy Jones © 2016
1-150
0,1
0 - Cash 1 - Check 2 - Gift Card 3 - Discover 4 - Direct Credit 5 - Business Credit
5
SQL with Teradata
6 - Personal Credit TOT_SCAN_CNT
Total number of scanned items per transaction
TOT_UNIQUE_ITM_CNT
The number of unique items purchased per transaction
TOT_UNIT_COST
The cost of the item (scrubbed)
TOTAL_SCAN_AMOUNT
The total number of items scanned per visit number
TOTAL_VISIT_AMT
The total value of the entire transaction
TRANSACTION_DATE
Date of the transaction
TRANSACTION_TIME
The time of day that the transaction started
UNIT_COST_AMOUNT
Cost/Unit (scrubbed)
UNIT_RETAIL_AMOUNT
Purchase Price/Unit (scrubbed)
VENDOR_NBR
The number of the vendor that supplies the item
VISIT_NBR
Every time a member goes to the register and has their membership card scanned, this number is then created
VNPK_CUBIC_FT
How many cubic feet does a vendor pack take up
VNPK_QTY
The quantity of items in a vendor pack
ZIP_CODE
The zip-code of the store
ZIP_CODE
The zip-code of the member
0-84
9 digit #
INSTRUCTIONS
1. Follow the instructions in the document TUN_Connection.pdf to connect to University of Arkansas
Nitin Kalé & Nancy Jones © 2016
6
SQL with Teradata
2.
After connecting, the desktop is your starting point to access the Teradata SQL Assistant.
3.
To access Teradata SQL Assistant, double-click on Teradata SQL Assistant…
4.
The application will open. Click on the “Connect Icon”.
5.
Select “Machine Data Source” tab. Select “Walton College Teradata” Data Source Name.
6.
Provide your Username and Password.
Nitin Kalé & Nancy Jones © 2016
7
SQL with Teradata
7.
Click OK.
8.
You are now ready to start the process of running queries in the Teradata SQL Assistant.
9.
From here you can add databases you want to use. You have access to the following databases:
UA_DILLARDS UA_HALLUX UA_SAMSCLUB UA_SAMSCLUB_big UA_SAMSCLUB_small UA_SAMSCLUB_star 10.
Right click the Database Properties panel, add database. Enter UA_SAMSCLUB. Click OK
11.
If you need help on using Teradata SQL Assistant, please click on the Help menu.
12.
Expand the database name to see the tables, attributes etc.
13.
You can now start writing queries to query the database
14.
E.g. Right click the Store_Information table and choose Generate SQL – SELECT
15.
The program automatically generates the SQL to select all fields and all rows.
Nitin Kalé & Nancy Jones © 2016
8
SQL with Teradata
16.
Click Execute
17.
You see the result of the query in the Answerset Panel.
Now write SQL queries to answer the following questions. For your answers, paste your SQL statement, from Teradata, and query results (you can use ctrl-A the result table, then ctrl-C, ctrl-V in a Word document) Sample – Question: Write a query that lists all attributes for all stores. Answer: SELECT
Store_Nbr, Store_Name, Align_Sub_Division_Nbr, Region_Nbr, District_Nbr, Open_Date, Store_Type, Street_Addr, City, State, ZIP_Code, Phone_Nbr, Manager_Name, Open_Sunday_Flag, Geographic_Zone, Apparel_Zone, SizeClass, SalesClass, Store_Code, Delivery_Type, Mdse_Major_Zone, Mdse_Sub_Zone
FROM
UA_SAMSCLUB.store_information
1. Count the number of stores. 2.
Count the number of items.
3.
Count the number of customers.
4. Find the average unit retail amount for electronic items purchased by Advantage members at Store 20 in Jan, 2000. Hint: Advantage Members are indicated by member_type 'V'. Category 5 is Electronics. For dates use the format ‘20000220’ for Feb 20, 2000
Nitin Kalé & Nancy Jones © 2016
9
SQL with Teradata
5. Find the markup rate for items purchased in Jan, 2000 from any store in Wisconsin ‘WI’ for the item vendor is Vendor of 11110. 6. What is the total quantity of items sold in ‘WI’ between Jan.5,2000 and Jan.7, 2000? 7. How many different members visited stores in ‘WI’ between Jan.3, 2000 and Jan.7, 2000? 8. Count the number of different customers (members) who bought an item with a ‘CLEAR’ color in Texas and spent more than $60. 9. Count the number of different stores that sold ‘HAMBURGER BUNS’ on Jan 10, 2000 to members with a member status code of ‘A’. 10.
The Sam’s Club management wants –
a.
a count of how many unique members who shopped in Jan, 2000?
b.
a count of how many unique members who shopped in Store 4 on Jan 3, 2000?
11. Management needs to know the dates (if any) on which anyone purchased ‘2 ALARM CHILI’ in WI. 12. Sam’s Club management needs the purchasing member number, item description, and price for the most expensive item(s) purchased in Wisconsin ‘WI’ on January 5, 2000. 13. Management also needs a count of the number of times that any product whose description contains ‘BODYWASH’ was purchased (visits) in any store in Florida ‘FL’ in Jan, 2000. 14.
List the customer who have not bought anything
15.
List the products that have never been sold
Nitin Kalé & Nancy Jones © 2016
10...