MIS Final - Summary Modern Database Management PDF

Title	MIS Final - Summary Modern Database Management
Course	Database Management Systems
Institution	Binghamton University
Pages	30
File Size	2 MB
File Type	PDF
Total Downloads	10
Total Views	128

Preview

CLICK TO PREVIEW PDF

Summary

Summary of lecture notes and relevant textbook chapters for final examination....

Description

Chapter 4: logical database design and the relational model ● Conceptual data modeling is about understanding the organization ● Logical database design is about creating stable database structures in a technical language ○ Goal: structuring the data for sound database processing. To translate the conceptual design (which represents an organization's requirements for data) into a logical database that can be implemented Components of relational model: ● Data structure ○ Tables (relations), row, columns ● Data manipulation ○ Powerful SQL operations for retrieving and modifying data ● Data integrity ○ Mechanisms for implementing business rules that maintain integrity of manipulated data Relational data model: represents data in the form of tables ● Most commonly used ● A form of logical data model. E-R data model is not a relational data model and may not obey the rules for a well structured relational data model, called normalization:  the process of designing well-structured relations which is an important component of logical design for the relational model ● Consists of three components: ○ Data structure: data are organized in the form of tables with rows and columns ○ Data manipulation: p  owerful operations are used to manipulate data stored in the relations ○ Data integrity: mechanisms to specify business rules that maintain the integrity of data when they are manipulated Correspondence with E-R model: ● Relations (tables) correspond with entity types and many-to-many relationship types ● Rows correspond with entity instances and with many-to-many relationship instances ● Columns correspond with attributes *note the word relation (in relational database) is NOT the same as the word relationship (in E-R model)lj Relational data structure: ● A Relation i s a named, 2-dimensional table of data ● A table consists of rows (records) and columns (attribute or field) ● Each relation (or table) consists of a set of named columns and an arbitrary number of unrelated rows ● An attribute is a named column of a relation ● Each row of a relation corresponds to a record that contains data (attribute) values for a single entity ● You can express the structure of a relation by using the name of the relation followed by (the names of attributes in the relation)

●

Employee(empID,name,deptname,salary)

Relational keys: keys are usually used as indexes to speed up the response to user queries ●

● ●

● ●

●

A primary key is an attribute or combo of attributes that uniquely identifies each row in a relation ○ Examples: employee numbers, social security # ○ this guarantees that all rows are unique The attribute or a collection of attributes indicated as an identifier in an E-R diagram may be the same that are primary keys Exceptions: ○ Associative entities do not have to have an identifier, and the partial identifier of a weak entity forms only part of a corresponding relations primary key ○ There may be several attributes of an entity that may serve as the associated relations primary key A composite key is a primary key that consists of more than one attribute ○ Keys can be simple (a single field) or composite (more than one field) A foreign key represents the relationship between two tables or relations. It is an attribute (possibly composite) in a relation that serves as the primary key of another relation ○ Foreign keys are identifiers that enable a dependent relation (on the many side of a relationship) to refer to its parent relation EMPLOYEE1(empID, name, deptname, salary) DEPARTMENT (deptname, location, fax) ● The attribute Deptname is a foreign key to employee1. It allows the user to associate any employee with the department to which he or she is assigned

Requirements for a table to qualify as a relations: Properties of relations:Two-dimensional tables of data. ◆ Each relation(or table) has a unique name ◆ Every attribute value must be atomic (not multivalued, not composite) ● An entry at the intersection of each row and column is atomic (single valued) there can only be one value associated with each attribute on a specific row of a table; no multivalued attributes are allowed in a relation

◆ Each row is unique (cant have two rows with exactly the same values for all the fields) ◆ Each attribute (or column) has a unique name ◆ The order of the columns is irrelevant ◆ The order of the rows is irrelevant ➔ NOTE: all relations are in 1st normal form Sample databases ◆ A relational database may consist of any number of relations. The structure is described through the use of a schema, which is a description of the overall logical structure of the database. Methods for expressing a schema: ● Short text statements where each relation is named and the names of its attributes follow in parenthesis. simple ● A graphical representation, where each relation is represented by a rectangle containing the attributes for the relation. Provides a better means of expressing referential integrity constraints ● Sample date allows you to test your assumptions about the design, check accuracy of your design, improve communications and develop prototype applications to test queries CUSTOMER(CustomerlD  , CustomerName, CustomerAddress, CustomerCity, CustomerState, CustomerPostalCode) ORDER(OrderlD,   OrderDate, CustomerID. ) ORDER LINE(OrderlD,  product id, O  rderedQuantity) PRODUCT(ProductlD  , ProductDescription, ProductFinish, ProductStandardPrice, ProductlinelD) ● Primary key for order line is a composite key consisting of attributes orderid and productid ● Customerid is a foreign key in the order relation ○ This allows the user to associate an order with the customer who submitted the order ○ Order line has two foreign keys that allow the user to associate each line on an order with the relevant order and product Integrity constraints: domain definitions enforce domain integrity constraints ● relational data model includes several rules limiting acceptable values and actions to facilitate maintaining the accuracy and integrity of data. ● The major types of integrity constraints are domain constraints, entity integrity, and referential integrity ● 1. Domain constraints: Allowable values for an attribute ○ all values in a column of a relation must be from the same domain. Domain is the set of values that may be assigned to an attribute. Domain definition usually consists of (domain name, meaning, data type, size, allowable values or allowable range).

●

●

2. Entity integrity: no primary attribute may be null. All primary keys MUST contain data values ○ designed to ensure that every relation has a primary key and that the data values for that primary key are all valid. ○ Guarantees that every primary key attribute is not null. Entity integrity rule: states that no primary key attribute (or component of a primary key attribute) may be null ○ Null: the relational data model allows you to assign a null value to an attribute. A null is a value that may be assigned to an attribute when no other value applies or when the applicable value is unknown. 3. Referential integrity: rules that maintain consistency between the rows of two related tables ○ In the relational mata model, associations between tables are defined through the use of foreign keys ○ Referential integrity constraint: rule that maintains consistency among the rows of two relations. ■ States that any foreign key (on the relation of the many side) MUST match the primary key value in the relation of the one side (or the foreign key can be null) ○ Null foreign keys: ■ You know if its allowed to be null when its defined ■ If each order must have a customer then the foreign key customerid cannot be null in the order relation ■ What is you want to delete a customer but you still want to see their sales ● Delete the associated orders (called cascading delete) → we lose not only the customer but also the sales history ● Prohibit deletion of the customer until all associated orders are first deleted (a safety check) ● Place a null value in the foreign key ■ Delete rules: ● Restrict- don't allow delete of “parent” side if related rows exist in “dependent side” ● Cascade- automatically delete “dependent” side rows that correspond with the “parent” side row to be deleted ● Set-to-null set the foreign key in the dependent side to null if deleting from the parent side → not allowed for weak entities

●

Well structured relations ○ Contains minimal redundancy and allows the user to insert, modify and delete the rows in a table with inconsistencies ○ Anomalies: errors or inconsistencies caused by redundancies in a table when a user attempts to update the data in the table ■ Insertion anomality: Suppose that we need to add a new employee to EMPLOYEE2.The primary key for this relation is the combination of EmpID and Co urseTitle (as noted earlier). Therefore, to insert a new row, the user must supply values for bothEmpID and CourseTitle (because primary key values cannot be null or nonexistent).This is an anomaly because the user should be able to enter employee data without supplying course data. ■ Deletion anomaly ■ Modification anomaly

●

Transforming eer diagrams into relations: ○ During logical design you transform diagrams into relational databases

○ ○

CASE tools often cant model more complex data relationships, such as ternary and supertype/subtype. You may have to do these manually Types of entities: ■ Regular entities: ■ Weak entities: c annot exist without a relationship ■ Associative entities:  formed by many-to-many relationships

Step 1: Map regular entities to relations: ○ Simple attributes: E-R attributes map directly onto the relation ○ Composite attributes: use their simple, component attributes ○ Multivalued attribute: becomes a separate relation with a foreign key taken from the superior entity ○ When it has composite attributes only the simple components are included (customername instead of last name, first name) ○ When it has multivalued attributes, two new relations (rather than one) are created. ● Composite attributes

Relation with address detail:

● Multivalued attributes Employee and Employee Skill Relations: ● Multivalued attribute becomes a separate relation with a foreign key ● One-to-many relationship between original entity and new relation

Step 2: Map weak entities: ➔ Become a separate relation with a foreign key taken from the superior entity ➔ Primary key composed of ◆ Partial identifier of the weak entity ◆ Primary keys of identifying relation (strong entity) ● Weak entity types do not have an independent existence but exist only through an identifying relationship with another entity called the owner ● Does not have a composite identifier but must have an attribute called the partial identifier that permits distinguishing the various occurrences of the weak entity for each owner entity instance ● Include the primary key of the identifying relation  as a foreign key attribute in the new relation ● The primary key of the new relation is the combination of this primary key of the identifying relation and the partial identifier of the weak entity type

Relations resulting from weak entity:

●

●

●

When to create a surrogate key: ○ Usually created to simplify the key structures ○ When there is a composite primary key ○ The natural primary key is inefficient. Ex: it might be very long and costly for database software to handle if it is used as a foreign key that references other tables ○ The natural primary key is recycled (reused or repeated periodically, so it may not actually be unique overtime). A more general statement is when the natural primary key cannot be guaranteed to be unique overtime Whenever a surrogate key is created, the natural key is always kept as nonkey data in the same relation because the natural key has organizational meaning that has to be captured in the database Natural keys are used as identifiers in searches

Step 3: map binary relationships: ● Mapping binary relationships: ○ One-to-many: primary key on the one side becomes a foreign key on the many side ○ Many-to-many: create a new relation with the primary keys of the two entities as a primary key ○ One-to-one: primary key on mandatory side becomes a foreign key on optional side ● Map binary one-to-one relationships: ○ Create a relation for each of the two entity types participating in the relationship ○ Include the primary key attribute(s) of the entity on one side of the relationship

Customer and order in relations with a foreign key in order:

●

Map binary many-to-many relationships

Three resulting relationships: the completes r elationship will need to become a separate relation

●

Map binary one-to-one relationships: ○ Two relations are created, one for each of the participating entity types ○ The primary key of one of the relations is included asa foreign key in the other ○ *often in 1:1 relationships, one direction is optional

Resulting relation

Step 4: map associative entities: ● When you encounter many to many relationships you may choose to model it as an associative entity in an E-R diagram ○ Most appropriate when the end user can best visualize the relationship as an entity type rather than as an M:N relationship ● Create three relations: one for each of the two participating entity types and a third for the associative entity ● The relation formed from the associative entity is called the associative relation ● Identifier: ● Mapping associative entities: ○ Identifier not assigned: ■ Default primary key for the association relation is composed of the primary keys of the two entities (as in M:N relationship) ○ Identifier assigned: ■ It is natural and familiar to end users ■ Default identifier may not be unique ○ If the identifier was not assigned:the  default primary key for the associative relation is a composite key that consists of the two primary key attributes from the other two relations. These attributes are then foreign keys that reference the other two relations

●

If the identifier is assigned ○ Sometimes you assign a single-attribute identifier to the associative entity type on the E-R diagram ○ It is natural and familiar to end-users ○ Default identifier may not be unique ○ Two reasons you may have assigned a single-attribute key during conceptual data modeling ■ The associative entity type has a natural single-attribute identifier that is familiar to end users ■ The default identifier (consisting of the identifiers for each of the participating entity types) ○ New associative relation is created to represent the associative entity ○ The primary key is the identifier assigned on the E-R diagram (rather than the default key) ○ The primary keys for the entity types are then included as foreign keys in the associative relation Example of mapping an associative entity with an identifier: a.) SHIPMENT associative entity

Resulting relations:

Step 5: map unary relationships ● One-to-many: recursive foreign key in the same relation ● Many-to-many: ○ One for each type ○ One for an associative relation in which the primary key has two attributes, both taken from the primary key of the entity ● Unary relationship: between instances of a single entity type ● Also called recursive relationships ○ Unary one-to-one relationships: ■ A foreign key attribute is added to the same relation which references the primary key values in the same relation (the foreign key must have the same domain as the primary key). This is also called a recursive foreign key (a) Employee entity with unary relationship

Employee relation with recursive foreign key

●

Unary many-to-many relationships ○ Two relations are created to represent the entity type in the relationship and the associative relationship to represent the M:N relationship itself

○

The primary key takes their values from the primary key of the other relation. Any nonkey attribute of the relationship is included in the associative entity (a) Bill-of-materials relationships (unary M:N)

Item and component relations:

Step 6: map ternary (and n-ary) relationships: ● One relation for each entity and one for the associative entity ● Associative entity has foreign keys to each entity in the relationship ● Ternary relationship is among three entity types, recommend you convert a ternary relationship to an associative entity to represent participation constraints more accurately ● Create a new associative relation ○ The default primary key consists of three primary key attributes for the participating entity types that act in the role of foreign keys that reference the individual primary keys of the participating entity types ○ Any attributes of the associative entity type become attributes of the new relation

Step 7: map supertype/subtype relationships ● One relation for supertype and for each subtype ● Supertype attributes (including identifier and subtype discriminator) go into supertype relation ● Subtype attributes go into each subtype; primary key of supertype relation also becomes primary key of subtype relation ● 1:1 relation established between supertype and each subtype, with supertype as primary table ● The relational data model does not support supertype/subtype relationships but you can ○ Create a separate relation for the supertype and for each of its subtypes ○ Assign to the relation created for the supertype the attributes that are common to all members of the supertype, including the primary key ○ Assign to the relation for each subtype the primary key of the supertype and only those attributes that are unique to that subtype ○ Assign one or more attributes of the supertype to function as the subtype discriminator

Example mapping supertype,subtype relationships to relations

Summary of EER- to-relational transformations:

Normalization: ○ Primary tool to validate and improve logical design so that it satisfies certain constraints that avoid unnecessary duplication of data ○ The process of decomposing relations with anomalities to produce smaller, well-structured relations ○ ○

Following the steps mentioned for transforming EER diagrams into relations typically results in well-structured relations Well-structured relations: ■ A relation that contains minimal data redundancy and allows users to insert, delete, and update rows without causing data inconsistencies ■ Goal is to avoid anomalities ● Insertion anomality: adding new rows forces users to create duplication data ● Deletion anomality: deleting rows may cause a loss of data that would be needed for other future rows

●

○ ○ ○

● ●

●

●

●

Modification anomallity: changing data in a row forces changes to other rows because of duplication ■ General rule of thumb: a table should not pertain to more than one entity type But, there is no guarantee that all anomalities are removed by following these steps Normalization is a formal process for deciding which attributes should be grouped together in a relation so that all anomalities are removed. Use normalization 1. During logical database design:  to verify the quality of the relations that are obtained from mapping E-R diagrams 2. When reverse-engineering older systems:  many of the tables and use views for older systems are redundant and subject to the anomalities Normalization is the process of successfully reducing relations with anomalities to produce smaller, well-structured relations. Goals to normalization: ○ Minimize data redundancy, thereby avoiding anomalities and conserving storage space ○ Simplify the enforcement of referential integrity constraints ○ Make it easier to maintain data (insert, update and delete) ○ Provide a better design that is an improved representation of the real w...