Intro to SQL Chapter 2 PDF

Title Intro to SQL Chapter 2
Course Introduction to Scripting
Institution Southern New Hampshire University
Pages 29
File Size 1.3 MB
File Type PDF
Total Downloads 44
Total Views 147

Summary

Chapter 2 Information...


Description

2.1 The relational model Database models A database model is a conceptual framework for database software. Leading database models of the 1960s, such as the hierarchic and network models, optimized processing speed at the expense of programmer productivity. As computer speed increased and programs became more complex, programmer productivity became paramount. A simpler approach was needed. Edgar Codd created the relational model in the 1970s in response to the complexity of existing databases. The model was first implemented as System R at IBM in the late 1970s then released as DB2 in the early 1980s. Oracle and other relational databases were released around the same time. The simplicity of relational databases was apparent to computer professionals. However, early product versions were slow compared to established databases. As relational processing speed caught up in the 1980s, relational products rapidly dominated the database market. The advent of the internet in the 1990s generated massive volumes of online data, called big data, often with poorly structured or missing information. In response, Google introduced the MapReduce database model optimized for big data analysis. The relational model is not optimized for big data. As a result, new non-relational databases have come to market since 2000. However, relational continues to dominate applications that require an accurate record of every transaction, such as banking, airline reservation systems, and student records.

Table 2.1.1: Example database models.

Model

Hierarchical

Inventor

IBM

Initial products 1960s

Objectives Speed and storage

Example databases IMS

Network

Relational

MapReduce

PARTICIPATION ACTIVITY

Charles Bachman

Edgar Codd

Google

1960s

Speed and storage

IDMS

1980s

Productivity and simplicity

MySQL Oracle SQL Server DB2

2000s

Big data analytics

Hadoop MapR Cloudera



2.1.1: Database models.

1) Which database is relational? DB2 IDMS Hadoop 2) The relational model was originally developed for which types of applications? Big data storage and analysis Transactional applications like banking and airline reservations Desktop applications with small databases 3) What was the initial impediment to commercial adoption of relational databases in the early 1980s? Reliability Processing speed



Cost

Relational data structure The relational model is a database model based on mathematical principles, with three parts: 1. A data structure that prescribes how data is organized. 2. Operations that manipulate data structures. 3. Rules that govern valid relational data. The relational data structure and operations are based on set theory. A set is a collection of values, or elements, with no inherent order. Sets are denoted with braces. Ex: {apple, banana, lemon} is the set containing three kinds of fruit. Since sets have no order, {apple, banana, lemon} is the same set as {lemon, banana, apple}. The relational data structure is based on three mathematical concepts: • A domain is a named set of possible database values, such as integers, dictionary words, or logical values TRUE and FALSE. • A tuple is a finite sequence of values, each drawn from a fixed domain. Ex: (3, apple, TRUE) is a tuple drawn from domains (Integers, DictionaryWords, LogicalValues). • A relation is a named set of tuples, all drawn from the same sequence of domains. Ex: The relation below is named Grocery and contains three tuples. Since a relation is a set, the relation tuples have no inherent order. PARTICIPATION ACTIVITY

2.1.2: Relations have no inherent order.

Animation content: Slide 1. A Relation named Grocery appears with three tuples. The first tuple contains the values 3 apple and TRUE. The second tuple contains the values 8 orange and FALSE. The third tuple contains the values 0 lemon and FALSE. Slide 2. A second Relation also named Grocery appears to the right of the original Relation with thee three tuples. The first tuple contains the values 0 lemon and

FALSE. The second tuple contains the values 8 orange and FALSE. The third tuple contains the values 3 apple and TRUE. The first row of the first Grocery Relation and the third row of the second Grocery Relation are highlighted red. The second row of the first Grocery Relation and the second row of the second Grocery Relation are highlighted orange. The third row of the first Grocery Relation and the first row of the second Grocery Relation are highlighted yellow. An equals sign appears between both Relations.

Animation captions: 1. A relation is a set of tuples. The Grocery relation has 3 tuples. 2. Since set elements have no inherent order, the two relations are the same. In the relational model, each tuple position is called an attribute and given a unique name. Ex: In the Grocery relation, the first, second, and third positions might be named Quantity, FruitType, and OrganicCertification. Domain, tuple, relation, and attribute are mathematical terms. Data type, row, table, and column are equivalent terms, commonly used in database processing. This material uses the common database terms.

Table 2.1.2: Equivalent relational terms. Mathematical Database

Files

Domain

Data type

Data type

Tuple

Row

Record

Relation

Table

File

Attribute

Column

Field

Terminology: tuples.

Relational tuples differ from mathematical tuples in one respect. In mathematics, the order of tuple values is significant. Ex: The tuples (7, 2) and (2, 7) are different. In the relational model, the order of tuple values is not significant, since each value is associated with a different attribute name.

PARTICIPATION ACTIVITY

2.1.3: Relational data structure.

1) Which terms are commonly used in relational database processing? Tuple, relation, attribute Row, table, column Record, file, field 2) Are these relations the same? { (8, mango, FALSE), (-11, watermelon, FALSE) } { (-11, watermelon, FALSE), (8, mango, FALSE) } Yes No 3) In the relational data structure, which components are named? Domain, tuple, relation Domain, relation Domain, relation, attribute

Relational operations The relational model stipulates a set of operations on tables, collectively called relational algebra. Like the relational data structure, relational operations are based on set theory. Relational operations include: • • • • • •

Union — combines two tables into a larger table. Difference — removes all rows of one table from another table. Projection — eliminates one or more columns of a table. Selection — selects a subset of rows of a table. Join — combines two tables into one. Product — lists all possible combinations of rows of two tables.

Relational operations have a direct implementation in SQL statements. PARTICIPATION ACTIVITY

2.1.4: Selection and projection operations.

Animation content: Slide 1. Three lines of code appear. The third line of code is boxed and contains the text WHERE Salary is greater than 50000 semicolon. Slide 2. The third line of code is unboxed and the first row of text is boxed and contains the text SELECT Name.

Animation captions: 1. Selection appears in the WHERE clause, which identifies specific rows. 2. Projection appears in the SELECT clause, which identifies specific columns. The result of relational operations is always a table. Similarly, the result of a SQL statement is always a table. Although result tables are not stored in the database, result tables have the same data structure as stored tables. Understanding the mathematics of these operations is interesting but not necessary. Understanding the corresponding SQL statements is more practical and useful for database programming. This material focuses on SQL rather than the mathematical operations. PARTICIPATION ACTIVITY

2.1.5: Relational operations.

1) What is the result of a relational operation? A row A column A table 2) Name three operations of the relational algebra. Selection, projection, and union Square root, exponentiation, and logarithm Integration and differentiation

Relational rules Relational rules, also known as integrity rules, are logical constraints that ensure data is valid and conforms to business policy. Universal rules are relational rules that govern data in every relational database. The relational model stipulates a number of universal rules, such as: • Unique primary key — all tables should have a column with no repeated values, called the primary key and used to identify individual rows. • Unique column names — different columns of the same table must have different names. • No duplicate rows — no two rows of the same table may be have identical values in all columns. Business rules are relational rules specific to a particular database and application. Example business rules include: • Unique column values — in a particular column, values may not be repeated. • No missing values — in a particular column, all rows must have known values. • Delete cascade — when a row is deleted, automatically delete all related rows.

In relational databases, universal and business rules can be violated when data is entered or updated. To ensure data conforms to rules at all times, relational databases provide tools to specify and automatically enforce rules. PARTICIPATION ACTIVITY

2.1.6: Delete cascade business rule.

Animation content: Slide 1. Two tables appear named Employee and Task. Table Employee has three columns named ID Name and Salary. Table Task has two columns named Employee ID and Task Name. Row two of columns ID Name and Salary of Table Employee is highlighted and contains the values 5384 Sam Snead and 30500 respectively. Rows two and three of columns Employee ID and Task Name of Table Task are highlighted. Row two contains the values 5384 and Write annual report respectively. Row three contains the values 5384 and Submit Timesheet respectively. Slide 2. The rows are unhighlighted and then the same rows have a line struck through them.

Animation captions: 1. Sam Snead has two tasks. 2. Database user specifies the delete cascade rule: When an employee is deleted, the employee's tasks are also deleted.

PARTICIPATION ACTIVITY

2.1.7: Relational rules.

1) Unique primary key is an example of a universal rule. True False 2) Delete cascade is an example of a universal rule. True False

3) Data in a relational database can violate relational rules. True False 4) Integrity rules and relational rules are synonymous. True False

2.2 Tables, columns, and rows Tables, columns, and rows All data in a relational database is structured in tables: • A table is a collection of data organized as columns and rows. • A column is a set of values of the same type. Each column has a name, different from other column names in the table. • A row is a set of values, one for each column. • A cell is a single column of a single row. In relational databases, each cell contains exactly one value. A table must have at least one column and any number of rows. A table without rows is called an empty table. PARTICIPATION ACTIVITY

2.2.1: Tables, columns, and rows.

Animation content: Slide 1. A table named Employee appears with three columns named ID Name and Salary. ID is labeled Column 1, Name is labeled Column 2 and Salary is labeled Column 3. The table has four rows with the

first row labeled Column names, the second row labeled Row 1, the third row labeled Row 2, and the fourth row labeled Row 3. Rows 1 2 and 3 are all boxed. Slide 2. Row 1 of column ID is highlighted and contains the value 2538. Slide 3. Row 1 of column Name is highlighted and contains the value Lisa Ellison. Slide 4. Row 1 of column Salary is highlighted and contains the value 45000.

Animation captions: 1. The Employee table has 3 columns with the names ID, Name, and Salary. The table has 3 rows. 2. Each value appears in single cell. In the first row of the table, "2538" is the ID column's value. 3. In the first row of the table, "Lisa Ellison" is the Name column's value. 4. In the first row of the table, "45000" is the Salary column's value. In addition to a name, each column has a data type, which defines the format of the values stored in each row. The column name and data type are specified in SQL when the table is created. In the Employee table, the data types of the ID, Name, and Salary columns are integer, string, and decimal, respectively. Usually, a database administrator defines the column names and data types, which can change, but rarely do after definition. Rows, on the other hand, are added, deleted, and changed frequently by database users. PARTICIPATION ACTIVITY

2.2.2: Tables, columns, and rows.

1) Which choice is a column name? 6381 Sam Snead Salary 2) Which choice is a cell value? ID Elsa Brinks 30500

3) What are the components of a column? Name only Data type only Name and data type 4) Must a table have at least one row? Yes No 5) How often do column names and data types change? Never Occasionally Often 6) How does a database administrator specify column names and data types? In a special file managed by the database administrator With the Python language With the SQL language

Data types Relational databases support many different data types. Most data types fall into one of these categories: • Integer data types represent positive and negative integers. Several integer data types exist, varying by the number of bytes allocated for each value. Common integer data types include INT, implemented as 4 bytes of storage, and SMALLINT, implemented as 2 bytes.

• Decimal data types represent numbers with fractional values. Decimal data types vary by number of digits after the decimal point and maximum size. Common decimal data types include FLOAT and DECIMAL. • Character data types represent textual characters. Common character data types include CHAR, a fixed string of characters, and VARCHAR, a string of variable length up to a specified maximum size. • Date and time data types represent date, time, or both, and sometimes specify time zone. Some data and time data types represent an interval rather than a point in time. Common date and time data types include DATE, TIME, DATETIME, and TIMESTAMP. • Binary data types store data exactly as the data appears in memory or computer files, bit for bit. The database manages binary data as a series of zeros and ones. Common binary data types include BLOB, BINARY, VARBINARY, and IMAGE. • Spatial data types store geometric information, such as lines, polygons, and map coordinates. Examples include POLYGON, POINT, and GEOMETRY. Spatial data types are relatively new and consequently vary greatly across database systems. Databases support additional, miscellaneous data types for specialized uses. Examples include MONEY for currency values, XML for data in XML format, BOOLEAN for true-false values, BIT for zeros and ones, and ENUM for a small, fixed set of alternative values.

Table 2.2.1: Example data types and values. Category

Data type

Value

Integer

INT

-9281344

Decimal

FLOAT

3.1415

Character

CHAR

Chicago

Date and time

DATETIME

12/25/2020 10:35:00

Binary

BLOB

1001011101...

Spatial

POINT

(2.5, 33.44)

Miscellaneous

MONEY

99.95 US Dollars

PARTICIPATION ACTIVITY

2.2.3: Data types.

Match the value with the data type.

Nadia

16

09/12/1986

3.14

00011011001

CHAR INT DATE BLOB FLOAT Reset

PARTICIPATION ACTIVITY

2.2.4: Data types.

1) The CHAR data type represents a variable string of characters. True False 2) A binary data type stores data as an exact copy of computer memory. True False 3)

Some date and time data types include time zone. True False

Data type implementations Data types are implemented differently in various database systems. Similar data types may have different names in different databases. Some data types are not supported in all databases. The following table shows example data types available in commonly used databases. 

Table 2.2.2: Example data types in MySQL, Oracle, and SQL Server. Category

MySQL

Oracle

SQL Server

Integer

TINYINT SMALLINT MEDIUMINT BIGINT

INT NUMBER

TINYINT SMALLINT INT BIGINT

Decimal

FLOAT DOUBLE DECIMAL

FLOAT NUMBER

FLOAT NUMERIC DECIMAL

Date and Time

DATE DATETIME TIMESTAMP

DATE TIMESTAMP TIMESTAMP WITH TIMEZONE INTERVAL

DATE TIME DATETIME DATETIMEOFFSET

Character

CHAR VARCHAR TEXT

CHAR VARCHAR2 LONG

CHAR VARCHAR TEXT

TINYBLOB MEDIUMBLOB LONGBLOB

BLOB BFILE RAW

BINARY VARBINARY IMAGE

Binary



Spatial

POINT POLYGON GEOMETRY

Miscellaneous

ENUM BOOLEAN BIT

PARTICIPATION ACTIVITY

SDO_GEOMETRY

POINT POLYGON

XMLTYPE

MONEY XML BIT

2.2.5: Data type implementations.

1) NUMERIC is a decimal type supported by the Oracle database. True False 2) 3.1415 can be stored as an INT data type in the Oracle database. True False 3) 1/29/2020 14:30:00 might represent a DATETIME value in the MySQL database. True False

Creating a table in SQL The SQL CREATE TABLE statement creates a new table by specifying the table name, column names, and column data types. The animation below creates an Employee table with four columns using data types from MySQL. PARTICIPATION ACTIVITY

2.2.6: Creating an Employee table.

Animation content: Slide 1. There is code that states CREATE TABLE Employee left parenthesis with right parenthesis semicolon being five lines down. Text appears that states table name and points to the text Employee in the first line of code. An empty table named Employee appears to the right of the code. Slide 2. Four lines of code appear between the two lines of existing code. The first line of code states ID SMALL INT comma. The second line of code states Name VARCHAR left parenthesis 60 right parenthesis comma. The third line of code states Birth Date DATE comma. The fourth line of code states Salary DECIMAL left parenthesis 7 comma 2 right parenthesis. The words ID Name Birth Date and Salary are labeled column names and the words SMALL INT VAR CHAR left parenthesis 60 right parenthesis DATE and DECIMAL left parenthesis 7 comma 2 right parenthesis are labeled data types. The table Employee gets four columns named ID Name Birth Date and Salary.

Animation captions: 1. The CREATE TABLE statement names the new table "Employee". 2. The column names and data types are separated by commas.

PARTICIPATION ACTIVITY

2.2.7: Creating tables in SQL.

Refer to the animation above. 1) The Employee table is created with 4 different data types. True False 2) Only the ID column stores numbers. True False

3) The BirthDate column stores only a date and no time. True False 4) The Employee table is...


Similar Free PDFs