IS202 DM Notes - Summary Data Management PDF

Title IS202 DM Notes - Summary Data Management
Author Mingmiao Liu
Course Data Management
Institution Singapore Management University
Pages 17
File Size 1.3 MB
File Type PDF
Total Downloads 666
Total Views 1,015

Summary

Topics for DM FinalER (Design and Critique a ER Model)Good Business Rule Declarative – what, not how Precise – clear, agreed-upon meaning Atomic – one statement Consistent – internally and externally Expressible – structured, natural language Distinct – non-redundant Business-oriented – understood b...


Description

Topics for DM Final ER (Design and Critique a ER Model) Good Business Rule •

Declarative – what, not how



Precise – clear, agreed-upon meaning



Atomic – one statement



Consistent – internally and externally



Expressible – structured, natural language



Distinct – non-redundant



Business-oriented – understood by business people

Basic E-R Notation

Entity •



SHOULD BE: –

An object that will have many instances in the database



An object that will be composed of multiple attributes



An object that we are trying to model

SHOULD NOT BE: –

A user of the database system



An output of the database system (e.g. a report)

Identifier •



Identifier: –

An attribute (or combination of attributes) that uniquely identifies individual instances of an entity type



Identifier symbol: underline

Candidate Key: –



An attribute that could be an identifier… it satisfies the requirements for being an identifier

Characteristics –

Will not change in value



Will not be null: every instance must have a value for the identifier



Use simple identifiers instead of long, composite identifiers

Cardinality of Relationship •

One – to – One: Each entity in the relationship will have exactly one related entity



One – to – Many : An entity on one side of the relationship can have many related entities, but an entity on the other side will have a maximum of one related entity



Many – to – Many: Entities on both sides of the relationship can have many related entities on the other side



Constraints –



Minimum Cardinality •

If zero, then optional



If one or more, then mandatory

Maximum Cardinality •

The maximum number

Supertype and Subtypes •

Supertype: A generic entity type that has a relationship with one or more subtypes



Subtype: A subgrouping of the entities in an entity type which has attributes that are distinct from those in other subgroupings



Two Approach: Generalization and Specialization

Constraints in Subtype Completeness Constraints •

Whether an instance of a supertype must also be a member of at least one subtype –

Total Specialization Rule: Yes (double line)



Partial Specialization Rule: No (single line)

Total Specialization rule

Partial specialization rule

Disjointness Constraints •

Whether an instance of a supertype may simultaneously be a member of two (or more) subtypes –

Disjoint Rule: An instance of the supertype can be only ONE of the subtypes (represented by “d”)



Overlap Rule: An instance of the supertype could be more than one of the subtypes (represented by “o”)

Disjoint rule

Overlap Rule

Subtype Discriminators •

An attribute of the supertype whose values determine the target subtype(s) –

Disjoint – a simple attribute with alternative values to indicate the possible subtypes



Overlapping – a composite attribute whose subparts pertain to different subtypes. Each subpart contains a boolean value to indicate whether or not the instance belongs to the associated subtype

Disjoint rule

Overlap rule

LD (7 Steps to convert ER to LD Read the assumption given carefully!) Keys usually are used as indexes to speed up the response to user queries

Primary Key – Represent by underline -

-

An attribute (or a combination of attributes) that can uniquely identify each row in a relation -> Single fields ( Simple primary key) -> multiple field (Composite primary key) Each relation must have one primary key The primary key value for each row is unique

Foreign Key -

An attribute (or combination of attributes) in a relation that is a primary key of another relation

Integrity Constraints 1. Referential Integrity – Rules stating that any foreign key value MUST 1. 2.

Either match a primary key value in the relation it refer to Or be null

Referential integrity constraints are drawn via arrows from dependent table (in which it is the foreign key) to the parent table (in which it is the primary key)

2. Foreign Key Constraints – Deleting the information from parent relation 3. 4. 5.

Restrict: Deletion is not allowed, as long the data is involved in other relations Cascade: The data and other affected records are deleted Set-to-Null: the value of the data from the affected table is set to null

3. Domain Constraints 6.

Allowable values for an attributes

4. Entity Integrity 7.

No primary key attribute may be null. All primary key fields must have data

7 Steps to map EER diagram to relations 1. Mapping regular entities to relations a. Simple attributes – ER attributes map directly onto the relation

CUSTOMER entity type with simple attributes

CUSTOMER Relation

b. Composite attributes – Use only their simple, component attributes

CUSTOMER entity type with composite attributes

CUSTOMER relation with address detail

c. Multi-Valued attributes – a separate relation with foreign key taken from the superior entity

CUSTOMER entity type with multi-value attributes

Multi-valued attribute becomes a separate relation with foreign key. 1 – to – Many relationship between original entity and new relation

2. Mapping weak entities o o

Weak entity becomes a separate relation with a foreign key taken from the strong entity Primary key composed of partial identifier of weak entity and primary key of Strong entity

Weak entity DEPENDENT

Relations resulting from weak entity NOTE: the domain constraint for the foreign key should NOT allow null value if DEPENDENT is a weak entity Composite primary key

Foreign key

3. Mapping binary relationships a. One-to-Many: Primary key on the one side become a foreign key on the many side

Relationship between CUSTOMER and ORDER

Relation from one-to-many relationship

Foreign key

b. Many-to-Many: Create new relation with the primary keys of the two entity as it primary key

Many-to-Many Relationship

Resulting relations

Composite primary key

Foreign key

New intersection relation

Foreign key

c. One-to-One: Primary key on the mandatory side becomes a foreign key on the optional side

Binary One-to-One relationship

Resulting relation

4. Mapping associative entities a. Identifier Not Assign Default primary key for the associative is composed of the primary key of two entities

An Associative Entity

Resulting relation

b. Identifier Assign Have primary key of its own.

Shipment associative entity

Resulting Relation

5. Mapping unary relationships a. One-to-One & One-to-Many: Recursive foreign key in the same relation

One-to-One or One-to-Many Relationship

Resulting relation with recursive foreign key

b. Many-to-Many: Two relation (entity type & associative relation)

Many-to-Many Relationship

Resulting Relation

6. Mapping ternary relationships o o

One relation for each entity and one for the associative entity Associative entity has foreign keys to each entity in the relationship

Ternary relationship with associative entity

Resulting relation

7. Mapping supertype/subtype relationship o o o o o

One relation for supertype and for each subtype Supertype attributes (including identifier and subtype discriminator) go into supertype relation Subtype attributes go into each subtype primary key of supertype relation also becomes primary key of subtype relation 1:1 relationship established between supertype and each subtype, with supertype as primary table

Supertype/subtype relationship

Resulting relation

SQL (Not require to write but have to guess the output) SQL -> Structured Query Language DDL (data definition language): SQL commands to create databases, tables, drop tables etc. DML (data manipulation language): SQL commands used for querying databases

Create Table



char(n): a string with n characters



varchar(n): a string with at most n characters



All constraint names in a schema/database should have different names



Child tables can be created only when ALL the parent tables are created

How to define primary key -

Simple primary key: o

-

CONSTRAINT course_pk PRIMARY KEY (cid)

Composite primary key o

CONSTRAINT qualification_pk PRIMARY KEY (fid, cid)

How to define foreign key -

Single foreign key o CONSTRAINT qualification_fk1 FOREIGN KEY (cid) REFERENCES course(cid) Composite foreign key o CONSTRAINT test_FK1 FOREIGN KEY (StuID,CourseID) REFERENCES Registration(SID,CID))

Insert Data •

INSERT INTO Student VALUES (38214, ’Ama');

Delete Data

Update Data

Change table structure •

Add a new column –



Add a new constraint –



ALTER TABLE table_name ADD CONSTRAINT constraint_name constraint_definition

Drop an existing column –



ALTER TABLE table_name ADD column_name column_type [not null];

ALTER TABLE table_name DROP column_name

Drop an existing foreign key constraint –

ALTER TABLE table_name DROP FOREIGN KEY foreign_key_constraint_name

SELECT Statement •

Clauses of the SELECT statement: [look at DM_SQL Syntax.docx for the possible syntax] –

SELECT: list the columns (and expressions) that should be returned from the query



FROM: indicate the table(s) or view(s) from which data will be obtained



WHERE: indicate the conditions under which a row will be included in the result



GROUP BY: indicate categorization of results



HAVING: indicate the conditions under which a category (group) will be included



ORDER BY: sort the result according to specified criteria

Join Table Join

Inner-Join

Left Outer Join

Right Outer Join

Equal Join

Not-Equal Join

Sub-Query A subquery may occur in: -

A WHERE clause A FROM clause A HAVING clause

Normalization (Determine the functional dependency base on given assumption and data dictionary specified in the question) Well-Structured Relations -

Contains minimal data redundancy and allows users to insert, delete, and update rows without causing data inconsistencies Goal is to avoid anomalies o Insertion Anomaly: adding new rows forces user to create duplicate data o Delete Anomaly: adding new rows forces user to create duplicate data o Modification Anomaly: adding new rows forces user to create duplicate data

General rule of thumb: a table should not pertain to more than one entity type

Data Normalization -

-

Tool to validate and improve a logical design so that it satisfies certain constraints that avoid unnecessary duplication of data The problems of having duplication of data o Waste of space o Difficulty in consistency control The process of decomposing relations with anomalies to produce smaller, well-structured relations

Normalization Form All relations are in 1st Normal Form

1st Normal Form – 1NF -

No multivalued attributes, and every attribute value is atomic

2nd Normal Form – 2NF -

1NF + every non-key attribute is fully functionally dependent on the ENTIRE primary key o Every non-key attribute must be defined by the entire key, not by only part of the key o No partial functional dependencies Functional Dependency: The value of one attribute, or of combination of attributes, (the determinant) determines the value of another attribute (Determinants can be multiple keys)

Representing Functional Dependencies - Graphical representation: a link is issued from the determinant pointing to the other attribute

1NF on entire primary key Dependency

Dependencies on part of the key (partial functional dependencies) For example: EmpID is the determinant, and Name, DeptName, Salary is functionally dependent on EmpID - Text representation: EmpID -> Name, DeptName, Salary CourseID -> Course_Title EmpID, CourseID -> DateCompleted

3rd Normal Form – 3NF -

2NF PLUS no transitive dependencies A transitive dependency is a functional dependency between two (or more) non-key attributes.

Transitive dependencies are removed...


Similar Free PDFs