Title | MOD 01 - Intro DBMS - Lecture notes 1 |
---|---|
Author | The Odin |
Course | Modellierung |
Institution | Universität Stuttgart |
Pages | 62 |
File Size | 1.2 MB |
File Type | |
Total Downloads | 91 |
Total Views | 141 |
Introduction to
Database & Database
Management...
Introduction to Database&Database Management
Prof.Dr.FrankLeymann InstituteofArchitectureofApplicationSystems
Prof.Dr.BernhardMitschang Dr.MatthiasWieland IPVS– ApplicationSoftware
as
Anwendungssoftware
Overview
Why Database Management? Database Systems and Information Systems Classes of Database Applications
Transactions, OLTP Data Warehouse System
Realization of Information Systems
© IPVS AS, IAAS
2
What is aDatabaseManagementSystem?
Manages very large amounts of data Supports efficient access to very large amounts of data Supports concurrent access to very large amounts of data
Example: bank and its ATM machines
Supports secure atomic access to very large amounts of data
Contrast two people editing the same UNIX file - last to write "wins" - with the problem if two people deduct money from the same account via ATM machines at the same time - new balance (b) is wrong whichever writes last. read b b:=b-$100
© IPVS AS, IAAS
read b
b
b:=b-$250 3
Three Aspects to Studying DBMS´s
Modeling and Design of Databases
Programming: Queries and DB operations like „update“
Allows exploration of issues before committing to an implementation
SQL (Structured Query Language) = „query“ language for (relational) DBMSs
DBMS Implementation
© IPVS AS, IAAS
4
DataStructures?(1)
Well-known Data Structures:
Array List, Tuple Record, Dictionary Graph, Tree
mostly transient data:
maintained in main memory does only survive a single program execution
© IPVS AS, IAAS
5
DataStructures?(2)
New Aspect: usage of external memory (secondary memory, non-volatile memory)
Persistence: Data survives end of program, end of session, operating system uptime, ... New kind of access: Read operations and Write operations based on units of data blocks or units of tuples
data structures and algorithms (search and sort) that are efficient for main memory do not necessarily show efficiency for secondary memory
Value-based access (associative access) to large data sets Voluminous attribute values (e. g. pictures, documents)
© IPVS AS, IAAS
6
DataModels?(1)
Constructors to generate data structures and associated operations. For example, Tables (sets of equally structured records or tuples) CREATE TABLE STUDENT (MATRIKELNUMBER LASTNAME FIRSTNAME BIRTHDATE .... )
© IPVS AS, IAAS
INTEGER, VARCHAR(40), VARCHAR(40), CHAR(8),
7
DataModels?(2)
Important Role of Relationships Some data models have special relationships between tuples (records): hierarchies, aggregations, ... e.g.: students enrole in a course that is given by a lecturer student
enrol
course
give
lecturer
Notion of Model
Given set of language features to describe an universe of discourse
© IPVS AS, IAAS
8
Databases?(1)
Efficient Management of Huge Amounts of Data Data Independence (of the applications)
Data usage without any knowledge of the technical aspects and implementation (abstract data model table) Ease of use and powerful operations
© IPVS AS, IAAS
9
Databases?(2)
Openness to New Applications
Explicit constraints Single and neutral representation System enforced integrity
customer management transfers ATM
?
Transactions (ACID property)
Atomicity Consistency Isolated execution Durability
© IPVS AS, IAAS
bank data
10
Databases?(3)
Fault Tolerance
Logging of redundant data during normal operations Automatic repair (recovery) of data structures after program failure, system failure, or media failure
Undo of not finished transactions Redo of the effect of completed (committed) transactions
Multi User Operation
Simultaneous (concurrent) access of different users to the same data Synchronization
© IPVS AS, IAAS
11
Howmuchinformationisthere1?(1)
Several thousand PBytes² suffice to store all relevant information in the world There will be enough disk and tape capacity to store everything written, said, done, and photographed by mankind
This is already true today for written information In some years this will be true for the remaining amount of information
Computers store and manage information more effective than humans 1 http://www.lesk.com/mlesk/ksg97/ksg.html 2
© IPVS AS, IAAS
1 Gigabyte (GByte) = 1,000 Megabytes 1 Terabyte (TByte) = 1,000 Gigabytes 1 Petabyte (PByte) = 1,000 Terabytes 1 Exabyte (EByte) = 1,000 Petabytes 12
Howmuchinformationisthere?(2)
Consequences for the Future
In a few years one will be able to retain all information indefinitely, i. e., no information needs to be discarded Computers will do searching, storage, and processing automatically, without human intervention Today, digital libraries focus on input: scanning, compression, and OCR technology Tomorrow, the focus will shift to retrieval: selection, search, and quality assessment.
© IPVS AS, IAAS
13
Overview
Why Database Management? Database Systems and Information Systems Classes of Database Applications
Transactions, OLTP Data Warehouse System
Realization of Information Systems
© IPVS AS, IAAS
14
Computer‐basedinformationsystems(CIS)
Database system: key component for CIS applications
database system
CIS operating system hardware
A database system (DBS) covers
a database, i.e., a collection of stored data that is used by applications a database management system (DBMS) managing the database.
© IPVS AS, IAAS
15
DatabaseManagementSystemsasPartofaCIS
DBMS: A tool for creating and managing large amounts of data efficiently and allowing it to persist over long periods of time, safely. (Garcia-Molina et. al., 2002) Databases today are essential to every business, in order to:
present data to customers present data on the World Wide Web support commercial processes
DBMS
Some important capabilities:
Persistent storage for large amounts of data Programming interface for users and applications Transaction management
© IPVS AS, IAAS
Database
DBS
16
IssuesinBusinessInformationSystems(1)
Requirements on a business information system can be distinguished according to the following three levels:
Operational level (clerks) Middle management level (middle management) Strategic level (board of directors)
Operational Level: Enhancement of processes by usage of query systems, report systems, reservation systems, production systems and all their applications (enterprise resource planning, ERP) Characteristics:
huge amounts of data high update probability
© IPVS AS, IAAS
17
IssuesinBusinessInformationSystems(2) Middle Management Level: Support of and partial automations of business processes:
Interactive data analysis Automation of routine decisions Use of mathematical and statistical methods Characteristics:
partially unpredictable information need aggregated data no updates
Strategic Level: Data provision for mostly unpredictable information needs © IPVS AS, IAAS
18
ExamplesforInformationSystems(1) Airline Reservations Systems Data items: 1. Reservations by a single customer on a single flight, including such information as assigned seat or meal preference. 2. Information about flights - the airports they fly from and to, their departure and arrival times, or the aircraft flown, for example. 3. Information about ticket prices, requirements, and availability. Typical queries ask for flights leaving about a certain time from one given city to another, what seats are available, and at what prices. Typical data modifications include the booking of a flight for a customer, assigning a seat, or indicating a meal preference. Many agents will be accessing parts of the data at any given time. The DBMS must allow such concurrent accesses, prevent problems such as two agents assigning the same seat simultaneously, and protect against loss of records if the system suddenly fails. © IPVS AS, IAAS
19
ExamplesforInformationSystems(2) Banking Systems Data items include names and addresses of customers, accounts, loans, and their balances, and the connection between customers and their accounts and loans e.g., who has signature authority over which accounts. Queries for account balances are common, but far more common are modifications representing a single payment from or deposit to an account. As with the airline reservation system, we expect that many tellers and customers (through ATM machines) will be querying and modifying the bank’s data at once. It is vital that simultaneous accesses to an account not cause the effect of an ATM transaction to be lost. Failures cannot be tolerated. For example, once the money has been ejected from an ATM machine, the bank must record the debit, even if the power immediately fails. On the other hand it is not permissible for the bank to record the debit and then not deliver the money because the power fails. The proper way to handle this operation is far from obvious and can be regarded as one of the significant achievements of DBMS technology.
© IPVS AS, IAAS
20
Enhancement of processes by usage of query systems, report systems, reservation systems, production systems and all their applications (enterprise resource planning, ERP) Characteristics:
huge amounts of data high update probability
© IPVS AS, IAAS
IssuesinBusinessInformationSystems(2) Middle Management Level: Support of and partial automations of business processes:
17
Interactive data analysis Automation of routine decisions Use of mathematical and statistical methods Characteristics:
partially unpredictable information need aggregated data no updates
Strategic Level: Data provision for mostly unpredictable information needs © IPVS AS, IAAS
18
ExamplesforInformationSystems(1) Airline Reservations Systems Data items: 1. Reservations by a single customer on a single flight, including such information as assigned seat or meal preference. 2. Information about flights - the airports they fly from and to, their departure and arrival times, or the aircraft flown, for example. 3. Information about ticket prices, requirements, and availability. Typical queries ask for flights leaving about a certain time from one given city to another, what seats are available, and at what prices. Typical data modifications include the booking of a flight for a customer, assigning a seat, or indicating a meal preference. Many agents will be accessing parts of the data at any given time. The DBMS must allow such concurrent accesses, prevent problems such as two agents assigning the same seat simultaneously, and protect against loss of records if the system suddenly fails. © IPVS AS, IAAS
19
ExamplesforInformationSystems(2) Banking Systems Data items include names and addresses of customers, accounts, loans, and their balances, and the connection between customers and their accounts and loans e.g., who has signature authority over which accounts. Queries for account balances are common, but far more common are modifications representing a single payment from or deposit to an account. As with the airline reservation system, we expect that many tellers and customers (through ATM machines) will be querying and modifying the bank’s data at once. It is vital that simultaneous accesses to an account not cause the effect of an ATM transaction to be lost. Failures cannot be tolerated. For example, once the money has been ejected from an ATM machine, the bank must record the debit, even if the power immediately fails. On the other hand it is not permissible for the bank to record the debit and then not deliver the money because the power fails. The proper way to handle this operation is far from obvious and can be regarded as one of the significant achievements of DBMS technology.
© IPVS AS, IAAS
20
ExamplesforInformationSystems(3) What's the fundamental difference between the CIS of a car manufacturer and a bank? The role of CIS in banking systems: “In banking, by contrast, the data actually is the inventory - the two are synonymous. In increasingly many cases, the DB transaction is the financial transaction. There are no real, tangible tokens (greenbacks) moved as a result of the monetary transfer transaction. If the data is bad, money is lost or created. There is no possibility of counting the money (bits) in order to verify the status. Fiscal responsibility dictates that creating or destroying money - even temporarily - is unacceptable.”
(Mike Burman, Bank of America) © IPVS AS, IAAS
DatabaseSystems(1)
General Tasks of a DBS
Management of persistent data Efficient access to large amounts of data (GBytes, TBytes, …) Flexible multi-user support Join of objects of different types (type-spanning operations)
21
Classical Data Models
relational model
network model
hierarchical model
Department Department
Project Project Employee
relations/tables
data records OWNER/MEMBER
© IPVS AS, IAAS
DatabaseSystems(2)
data records hierarchy: parent/child-relationships 22
Data Structures
Formatted data structures, fixed record structure Record type, attributes and attribute values (Si/Aj/AVk) describe objects The description information (metadata) Aj and Si determine the meaning of each attribute value AVk . Schema Record type (Table, Relation) Student
Attributes MATRNR
SNAME
FNBR
BEGIN
Instance(State) STUDENT
© IPVS AS, IAAS
MATNR
SNAME
FNBR
BEGIN
123766
COY
F9
01.10.95
225332
MILLER
F5
15.04.87 23
DatabaseSystems(3)
DBS Interface Operations to define object types (description of objects) DB schema: which objects should be stored in the DB? Operations to find and update data Application interface: how to create, update and select DB objects.
Definition of integrity constraints ensuring of quality: which DB states are acceptable?
Definition of data control (e.g., access rights) which user is allowed to invoke which operation on which object with which parameters?
© IPVS AS, IAAS
24
DatabaseSystems(4)
The Nature of DB Languages Depends on the data model Formal language Navigation-based or descriptive Tuple- or set-oriented Selection power: minimum Predicate Logic (1st order)
Search Methods Character or value comparison: (FUNCTION = ‘ADMINISTRATOR’) AND (AGE > 60) Exact match query: Find exactly all records with the specified property Search for synonyms, fuzzy search, ... no support Natural language support, recognition of ambiguities no support in formatted DBS (see Information Retrieval Systems)
25
© IPVS AS, IAAS
Example Schema
Instance
Faculty
FNBR
FNAME
DEAN
Student
MATRNR
SNAME
FNBR
BEGIN
Examination
PNR
MATRNR
SUBJECT
DATE
Professor
PNR
NAME
FNBR
FACULTY
(state) STUDENT
FNBR
FNAME
DEAN
F9
BUSINESS ADMIN.
4711
F5
COMPUTER SCIENCE
2223
MATNR
SNAME
FNBR
BEGIN
123766
COY
F9
01.10.95
225332
MILLER
F5
15.04.87
654711
TAYLOR
F5
15.10.94
MARK
226302 196481
CANETTI F9 B EXAMINATION
01.10.95 PNR MATNR
SUBJECT
DATE
MARK
130680
S
5678
123766
DS
22.10.98
4
4711
123766
OS
16. 01.98
3
1234
654711
DB
17. 04.97
2
1234
123766
DB
17. 04.97
4
6780
654711
DS
19.09.97
1
6780
196481
OS
23.12.97
3 26
© IPVS AS, IAAS
Example Schema FACULTY
FNBR
FNAME
DEAN
STUDENT
MATRNR
SNAME
FNBR
BEGIN
EXAMINATION
PNR
MATNR
SUBJECT
DATUM
PNR
NAME
PROFESSOR
Instance (state)
FBNR
Q1: Find all students, who belong to F5 and had started studying before 1995. MARK
SELECT * FROM STUDENT WHERE FNBR=’F5’ AND BEGIN...