IT help - IT hjælp PDF

Title IT help - IT hjælp
Course Marketing
Institution Universidad Dominicana O&M
Pages 13
File Size 1 MB
File Type PDF
Total Downloads 71
Total Views 162

Summary

IT hjælp...


Description

Introduction to SQL Michael H. Larsen - 2018

Introduction to SQL

Michael H. Larsen

Table of contents Introduction and definitions ................................................................................................................... 2 Databases ............................................................................................................................................ 2 SQL ...................................................................................................................................................... 2 Rows and columns .............................................................................................................................. 2 Statements .......................................................................................................................................... 2 Schema ................................................................................................................................................ 2 Select ....................................................................................................................................................... 3 Introduction ........................................................................................................................................ 3 Selecting specific columns .................................................................................................................. 3 Ordering the results ............................................................................................................................ 4 Filtering the results ............................................................................................................................. 4 The LIKE clause ................................................................................................................................ 6 Multiple clauses .............................................................................................................................. 6 Limiting the number of results............................................................................................................ 7 Order of evaluation ............................................................................................................................. 7 Alias ..................................................................................................................................................... 8 Grouping ............................................................................................................................................. 8 Mathematical operations ................................................................................................................... 9 Joining tables ....................................................................................................................................10 Combining it all ................................................................................................................................. 12 How to access from Excel?............................................................. Fejl! Bogmærke er ikke defineret. Mac ............................................................................................Fejl! Bogmærke er ikke defineret. Windows .................................................................................... Fejl! Bogmærke er ikke defineret. Assignments ................................................................................... Fejl! Bogmærke er ikke defineret.

1

Introduction to SQL

Michael H. Larsen

Introduction and definitions Databases A database is, basically, a collection of data. Traditionally databases have been highly structured with a set of tables each containing a set of columns. Databases come in many flavors. Microsoft SQL Server, Oracle SQL Server, Oracle MySQL and MariaDB just to name a few SQL server variants. Common for the mentioned database servers is that they all rely on SQL – with some local variance – for their underlying language. While not all commands are available on all platforms, the basics are, more or less, the same. For the purposes of this introduction the differences are negligible. The databases that this introduction focuses on are known collectively as Relational Database Management Systems (RDBMS). Recently a different type of database has started emerging, known as NO-SQL databases. These will not be covered in this introduction. In this introduction a Microsoft SQL Server 2016 has been used. The Northwind dataset from Microsoft will be used for this introduction.

SQL Short for Structured Query Language and pronounced either as “sequel” or by letter S-Q-L. SQL is a programming language specifically used for the communication with SQL Servers. As mentioned above SQL is somewhat interchangeable between SQL Server types. The SQL is a standard under ANSI and ISO. The standards have been continually revised adding new features since their original standardization in 1986 and 1987 respectively. Despite the standardization local variance exists.

Rows and columns A database table is not unlike an Excel spreadsheet. It contains a number of rows and columns. Unlike Excel you have to define the data type for each column and the datatype is strictly enforced. This means that, for example, it is impossible to enter text in an integer column.

Statements SQL Statements – the general name for SQL queries to an SQL Server – always begin with a keyword that tells the database server what type of request is being performed. Many keywords exist, but not all will be covered in this introduction. This introduction will focus on the most used keywords: Select, Update, Delete and Insert.

Schema Some database servers use what is known as a schema. This is way to segment data on the database server. Some database servers use this more extensively than others. For this introduction the dbo schema will be used throughout.

2

Introduction to SQL

Michael H. Larsen

Select Introduction The select statement is the most widely used SQL statement. The select statement is used to fetch data from one or more tables. In the demonstration dataset the table Customers contains 91 rows representing 91 unique customers.

The table contains the columns CustomerID, CompanyName, ContactName, ContactTitle, Address, City, Region, PostalCode, Country, Phone and Fax. This SQL Statement selects the customers from the table Customers: SELECT * FROM Customers;

Both the ANSI and ISO standards for SQL require any statement to end with a semicolon. Most database servers will automatically append an invisible semicolon if the user omits it. It is, however, best practice to always end a statement with a semicolon.

Selecting specific columns In the example above all the columns from Customers a fetched using the *. It is possible to fetch only specific columns if need be. This can be very useful on large datasets where only a subset of the data is needed. The selection is done by adding the column names to the query in place of the *. For fetching, for example, the CompanyName, ContactName and Phone from the Customer table: SELECT CompanyName, ContactName, Phone FROM Customers;

3

Introduction to SQL

Michael H. Larsen

Ordering the results The results can be ordered by using the clause ORDER BY. Returning to the full fetch from Customers the results can, for example, be ordered by city by adding ORDER BY City to the statement: SELECT * FROM Customers ORDER BY City;

It is possible to order by more than one column by simply adding more columns to the ORDER BY clause. An example could be to order by Country first and then by City: SELECT * FROM Customers ORDER BY Country, City;

To order in reverse order the keyword DESC (DESC is short for Descending) can be added. SELECT * FROM Customers ORDER BY Country DESC, City;

Filtering the results Often only a specific subset of the data is required. Based on requirements for the specific task a clause can be created with the WHERE keyword. The WHERE keyword requires a following set of clauses that specify what the condition is. The conditions used by the WHERE keyword must use an operator. Following the SQL standards, the condition can use one of the following operators: Operator =

> < = IS NULL IS NOT NULL BETWEEN [X] AND [Y] != !> !< LIKE

Description Equals / ”Is identical with” Not equals / “Is not identical to” “Is larger than” “Is less than” “Is less than or equal to” “Is larger than or equal to” “Is empty” / “Is not defined” “Is not empty” / “Is defined” “Is between X and Y “Is not identical to” “Is not larger than” “Is not less than” “Is like” - % is used as wildcard

Using the WHERE clause it is then possible to, for example, find a customer with a specific CustomerID in the Customer table: SELECT * FROM Customers WHERE CustomerID = 'BOLID';

4

Introduction to SQL

Michael H. Larsen

Multiple clauses may be combined using the AND keyword to narrow the results even further: SELECT * FROM Customers WHERE Country = 'Brazil' AND City = 'Rio de Janeiro';

Note that both WHERE clauses shown above use text strings as the parameter. These text strings must be enclosed in single quote (‘). This is also the case for dates. If the WHERE clause was using a numerical comparison, the parameter should not be enclosed in single quotes, even though this will not result in an error. As there are no numerical columns in Customers this can be demonstrated using a query to the Orders table: SELECT * FROM Orders WHERE EmployeeID < 5;

SELECT * FROM Orders WHERE EmployeeID = 5;

The result is the same if, by mistake, the parameter is enclosed in single quotes – it just doesn’t represent “best practice”: SELECT * FROM Orders WHERE EmployeeID < '5';

Filtering on dates is done in the same way as filtering on text using single quotes. Note though the date format. SELECT * FROM Orders WHERE OrderDate > '1997-01-01';

5

Introduction to SQL

Michael H. Larsen

The LIKE clause Using the LIKE clause is especially powerful for searching through a database. The LIKE clause uses %, _ and [] as a wildcards and it is therefore possible to find records where only some of the information is known. If, for example, only the customers given name is known, a wildcard search can be performed. SELECT * FROM Customers WHERE ContactName LIKE 'Ann%';

Where % allows any number of unknown characters, _ is a single unknown character. If, for example, that the user is uncertain on whether the last name of Ann Devon is spelled Devon or Devan, the query could be changed: SELECT * FROM Customers WHERE ContactName LIKE 'Ann Dev_n';

The wildcard [] is used by enclosing the possibilities to accept in the brackets. Continuing on the example above. If, for example, that the user knows that Devon is spelled with either “o” or “e” and doesn’t want results that is outside this scope the query could again be changed: SELECT * FROM Customers WHERE ContactName LIKE 'Ann Dev[eo]n';

Multiple wildcards may be used in one parameter: SELECT * FROM Customers WHERE ContactName LIKE 'Ann%Dev%';

Multiple clauses As mentioned above it is possible to use multiple WHERE clauses using the AND keyword. Likewise, it is possible to use OR and IN. OR works as in any selection statement and may combine any of the clause types described above: SELECT * FROM Customers WHERE CustomerID = 'ALFKI' OR CustomerID = 'FAMIA' OR ContactName LIKE 'Hanna%';

6

Introduction to SQL

Michael H. Larsen

Using the IN clause is different than the above uses as this clause has a built-in OR. The IN clause allows the query to look for entries in the table where the value is one of the specified: SELECT * FROM Customers WHERE CustomerID IN ('ALFKI', 'FAMIA');

Limiting the number of results It is possible to limit the number of results returned by a query. This is one of the elements that has local variance between different database servers. As this introduction is based on Microsoft SQL Server the Microsoft syntax will be used. Oracle SQL Server, Oracle MySQL and MariaDB uses a different approach to limit the number of results. In Microsoft SQL Server the keyword TOP is used right after the SELECT keyword. If, for example, the first 10 customers from the Customers table is to be returned: SELECT TOP 10 * FROM Customers;

The TOP clause may be combined with other clauses – here the top 7 orders by EmployeeID 5 ordered descending by Freight: SELECT TOP 7 * FROM Orders WHERE EmployeeID = 5 ORDER BY Freight DESC;

Order of evaluation Just like in mathematics there is a specific order of precedent to the clauses. Clauses are evaluated individually, then NOT, AND and OR are evaluated in that order.

7

Introduction to SQL

Michael H. Larsen

Alias On occasion it comes in handy giving columns aliases. Often this is used when using constants, calculated columns or joins. An alias is simply another name for a column. The keyword AS is used to apply an alias. If, for example, CustomerID needs to be changed into UserID in a query, the query could look like this: SELECT CustomerID AS UserID, * FROM Customers;

Grouping Grouping data from the SQL Server is a powerful way of doing calculations on the backend rather than doing the calculation on the receiving end. Grouping is done with the keyword GROUP BY Grouping is often parred with SUM, COUNT, MIN, MAX or AVG. As an example, lets fetch the number of orders by each customer from the Orders table: SELECT COUNT(OrderID) AS NumberOfOrders, CustomerID FROM Orders GROUP BY CustomerID ORDER BY NumberOfOrders DESC;

As shown by to SQL Statement above GROUP BY is often combined with ORDER BY. The GROUP BY statement should always be combined with either SUM, COUNT, MIN, MAX or AVG. An example for the use of SUM could be to fetch to total freight weight for each customer from the Orders table: SELECT SUM(Freight) AS SumOfFreight, CustomerID FROM Orders GROUP BY CustomerID ORDER BY SumOfFreight DESC;

8

Introduction to SQL

Michael H. Larsen

Mathematical operations Just like SUM and COUNT used above it is possible to do mathematical operations directly in the query. The mathematical operations available are rather basic but should be used where possible as the SQL Server is much faster at the calculation than the time it would take to transfer the data to the receiver and do the calculation there. Operator + * / %

Description Addition Subtraction Multiplication Division Modulus. Returns the “rest” of a division.

The OrderDetails table contains information about each of the orders in the Orders table. The information includes OrderID, ProductID, UnitPrice, Quantity and Discount. To find the total order price, first the total price per product must be found. This can be done using the multiplication function: SELECT OrderID, ProductID, (UnitPrice*Quantity) AS ProductPrice, Discount FROM OrderDetails;

Should the discount also be applied the query can be changed: SELECT OrderID, ProductID, ((UnitPrice*Quantity)*(1-Discount)) AS ProductPrice FROM OrderDetails;

9

Introduction to SQL

Michael H. Larsen

Now that the product prices are known the total order price can be calculated: SELECT OrderID, SUM((UnitPrice*Quantity )*(1-Discount)) AS OrderPrice FROM OrderDetails GROUP BY OrderID;

Joining tables As shown above data is stored in multiple different tables. As discussed earlier databases are normally relational. This means that there is a relationship between the different tables in the database. For this example the tables share CustomerID and OrderID. Generally two different types of joins are used – inner and outer joins. For the following examples we are going to use the Orders table and the Customers table and the shared CustomerID. From a conceptual perspective Customers and Orders are two different sources of data. Each have their own data. CustomerID is shared among them.

If an inner join is used the subset returned will be only the data where CustomerID is present in both dataset. Should there be a customer without any orders this customer will not be shown in the results.

10

Introduction to SQL

Michael H. Larsen

Alternatively, an outer join could be used. The result of which will be the entire dataset from either the RIGHT dataset or the LEFT dataset enriched with the data from the LEFT and RIGHT tables respectively.

In practice the implementation of an INNER JOIN between the Orders table and the Customers table could look like this: SELECT * FROM Customers INNER JOIN Orders ON Customers.CustomerID = Orders.CustomerID;

As an example, lets now assume that all that is needed is to enrich the Orders data with the ContactName and ContactTitle from Customers: SELECT Orders.*, Customers.ContactName, Customers.ContactTitle FROM Customers INNER JOIN Orders ON Customers.CustomerID = Orders.CustomerID;

Here is a break down of each element in the SQL Statement above: SELECT Orders.* Customer.ContactName Customer.ContactTitle FROM Customers INNER JOIN Orders ON Customers.CustomerID=Orders.CustomerID

SELECT keyword tells the database that we are going to fetch data Fetch all columns from Orders Fetch ContactName from Customer Fetch ContactTitle from Customer FROM defines which table we are fetching from – in this case Customers Join Orders data to Customers data using an INNER JOIN Join Orders data to Customers data where the CustomerID is identical in the two tables.

Usually a LEFT/RIGHT JOIN is used when data may be present in one table only, but the information is still wanted. Here is the same SQL Statement with a LEFT JOIN. No change will occur in the data in this example: SELECT Orders.*, Customers.ContactName, Customers.ContactTitle FROM Customers LEFT JOIN Orders ON Customers.CustomerID = Orders.CustomerID;

11

Introduction to SQL

Michael H. Larsen

Combining it all With the information above it is now possible to create rather complex SQL Statements. For example, let’s assume that an overview of the 10 customers and their contacts that have purchased the most in 1996 can be created: SELECT TOP 10 Customers.ContactName , Customers.ContactTitle, SUM((UnitPrice*Quantity)*(1-Discount)) AS OrderPrice FROM Customers INNER JOIN Orders ON Customers.CustomerID = Orders.CustomerID INNER JOIN OrderDetails ON OrderDetails.OrderID=Orders.OrderID WHERE OrderDate >= '1996-01-01' AND OrderDate...


Similar Free PDFs