MIS 335 - W2-Std

MIS 335 - Database Systems
Entity-Relationship Model
http://www.mis.boun.edu.tr/durahim/
Ahmet Onur Durahim
Learning Objectives
• Database Design
• Main concepts in the ER model
• ER Diagrams
Database Design and ER Diagrams
• Requirements Analysis: find out what the users want from
the database
– What data is to be stored in the DB
– What applications must be built on top of it
– What operations are most frequent and subject to performance
requirements
• Conceptual Database Design: create a simple description of the data
that closely matches how users and developers think of the data
– A high-level (semantic) description of data to be stored in the DB
along with the constraints known to hold over this data
– Carried out using the ER Model
• Logical Database Design: choose a DBMS to implement the
conceptual database design
– Convert conceptual DB design (ER schema) into a DB schema in
the data model of the chosen DBMS (relational DB schema)
DB Design
• Schema Refinement
– Analyze the collection of relations in relational DB schema to
identify potential problems and refine it (- Normalization of the
relations)
• Physical DB Design
– Consider expected workloads to refine for meeting the desired
performance criteria
– Building indexes on tables
– Clustering some tables
– Redesign of parts of the DB schema
• Application and Security Design
– Identify entities (users, departments) and relevant roles of each
entity
– Enforce access rules: For each role, identify the parts of the DB
that must be accessible and must not be accessible
Entity-Relationship Model
• Before developing your database application,
you need to
– Collect the requirements
– Build a conceptual database design
• ER Model: used to describe the data involved
in an enterprise in terms of objects and
relationships
– Widely accepted standard for initial (conceptual)
database design
Entity-Relationship Model
• Conceptual DB design:
– What information about these entities and
relationships should we store in the database?
– What are the integrity constraints or business rules
that hold?
• A database ‘schema’ in the ER Model can be
represented pictorially
– ER diagrams
• Can map an ER diagram into a relational schema
Entity-Relationship Diagram
[Figure: ER diagram. Entity set Employees (ssn, name, lot) is connected through relationship set Policy (cost) to entity set Dependents (pname, age).]
Key Concepts of ER Model
• Entities
– An object that is capable of independent existence
and can be uniquely identified (can be
distinguished from other objects)
– Examples: Employee, Student, Item
• An entity is described using a set of attributes
– Examples: ssn (Employee), sid (Student), type (Item)
Key Concepts of ER Model
• Entity set
– A collection of similar entities
• share the same set of properties/attributes
• Reflects the level of detail at which we want to represent
information about entities
[Figure: entity set Students containing the entities Onur, Alp, Zubeyde, Esra, Arzu, and Ahmet.]
Key Concepts of ER Model
• Entity sets may overlap
– Any example?
[Figure: overlapping entity sets Students and Employees. Names shown: Onur, Alp, Mert, Zubeyde, Arzu, Esra, Ahmet, Emrecan, Mehmet.]
Key Concepts of ER Model
• Each entity set has attributes
• Each attribute has a domain
– Domain: set of permitted values
• name attribute – (set of 20-character strings)
• age attribute – (set of integers between 0 and 150)
• Each entity set has a key
– minimal set of attributes whose values uniquely identify an
entity in the set
– denoted by underlining the attribute name in the ER diagram
[Figure: entity set Employee with attributes ssn, name, and address.]
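The ideas of domain and key can be sketched in a few lines of Python. This is only an illustration, not part of the ER notation: the class names `Employee` and `EntitySet`, the sample values, and the reading of "20-character strings" as "at most 20 characters" are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Employee:
    ssn: str    # key attribute (hypothetical sample values below)
    name: str   # domain assumed here: strings of at most 20 characters
    age: int    # domain: integers between 0 and 150

    def __post_init__(self):
        # Reject values that fall outside the attribute domains.
        if len(self.name) > 20:
            raise ValueError("name is outside its domain")
        if not 0 <= self.age <= 150:
            raise ValueError("age is outside its domain")


class EntitySet:
    """A collection of similar entities, indexed by their key value."""

    def __init__(self):
        self._by_key = {}

    def add(self, entity):
        # The key must uniquely identify an entity within the set.
        if entity.ssn in self._by_key:
            raise KeyError(f"duplicate key: {entity.ssn}")
        self._by_key[entity.ssn] = entity

    def __len__(self):
        return len(self._by_key)


employees = EntitySet()
employees.add(Employee(ssn="123-22-3666", name="Ahmet", age=35))
```

Adding a second entity with the same ssn raises `KeyError`, and an age of 200 raises `ValueError`, which mirrors the key and domain rules stated above.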
Key Concepts of ER Model
• Relationships
– Association (relation) among two or more entities
– Example: Ahmet is enrolled in MIS335
[Figure: relationship sets labelled Enrolled and Works_In.]
• Relationship sets
– A collection of similar relationships
– Share the same properties
Key Concepts of ER Model
• Relationships can also have attributes
– Descriptive attributes: used to record the
information about the relationship
– Ahmet Works_In University since 2014
[Figure: entity set Employee (ssn, name, address) participating in relationship set Works_In, which has the descriptive attribute since.]
ER Model
[Figure: entity sets Student (sid, name) and Course (cid, cname) connected by relationship set Enrolled (semester).]
• Rectangles: entity sets
• Diamonds: relationship sets
• Ellipses (ovals): attributes
ER Model
• Degree of a relationship set is the number of
entity sets that participate in a relationship
• Binary relationship sets involve two entity sets
[Figure: binary relationship set Enrolled (semester) between Student (sid, name) and Course (cid, cname).]
ER Model
• Ternary relationship sets involve three entity
sets
[Figure: ternary relationship set Works_In (since) among Employee (ssn, name, address), Departments (did, dname, budget), and Locations (capacity).]
An Instance of the Works_In Relationship Set
ER Model
• The set of entities that participate in a relationship
set may belong to the same entity set
• Each entity plays a different role in such a relationship
[Figure: relationship set Reports_To on entity set Employees (ssn, name), with role indicators supervisor and subordinate. Reports_To is a unary relationship.]
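A minimal Python sketch of how role indicators keep the two participants of a unary relationship apart. The tuple type `ReportsTo`, the helper `subordinates_of`, and the ssn values e1 to e4 are hypothetical names chosen for this illustration.

```python
from collections import namedtuple

# Both fields refer to entities of the same entity set (Employees);
# the role names say which part each entity plays.
ReportsTo = namedtuple("ReportsTo", ["supervisor", "subordinate"])

reports_to = {
    ReportsTo(supervisor="e1", subordinate="e2"),
    ReportsTo(supervisor="e1", subordinate="e3"),
    ReportsTo(supervisor="e2", subordinate="e4"),
}


def subordinates_of(ssn):
    """Employees who report directly to the employee with this ssn."""
    return sorted(r.subordinate for r in reports_to if r.supervisor == ssn)
```

Here e2 appears once as a subordinate and once as a supervisor, which is only unambiguous because each position in the relationship carries a role name.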
ER Model
• The set of entities that participate in a relationship
set may belong to the same entity set
• Each entity plays a different role in such a relationship
[Figure: relationship set Helps on entity set Students (sid, name), with role indicators tutor and tutee.]
Cardinality Mappings
• One-to-One (1-1)
– One occurrence of an entity relates to only one occurrence in another
entity
– rarely exists in practice
• consider combining them into one entity
– Example: an employee is allocated a company car, which can only be
driven by that employee
• One-to-Many (1-M) / Many-to-One (M-1)
– One occurrence in an entity relates to many occurrences in another
entity
– Example: an employee works in one department but a department has
many employees.
Cardinality Mappings
• Many-to-Many (M-N)
– Many occurrences in an entity relate to many occurrences in
another entity
– Common in ER diagrams, but rarely survive unchanged into the
relational schema
• They often signal an intermediate entity that has not yet been
modelled.
– Example: an employee may work on several projects at the
same time and a project has a team of many employees.
– When the design is mapped to relations, this many-to-many is
resolved by the entity Project Team.
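The Project Team example can be tried out with Python's built-in `sqlite3` module. This is a sketch under assumptions: the table and column names (`Employee`, `Project`, `Project_Team`, `ssn`, `pid`, `title`) and the sample rows are invented for the illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (ssn TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE Project  (pid TEXT PRIMARY KEY, title TEXT);
    -- Project_Team replaces one M-N relationship with two 1-M relationships.
    CREATE TABLE Project_Team (
        ssn TEXT REFERENCES Employee(ssn),
        pid TEXT REFERENCES Project(pid),
        PRIMARY KEY (ssn, pid)
    );
    INSERT INTO Employee VALUES ('e1', 'Onur'), ('e2', 'Esra');
    INSERT INTO Project  VALUES ('p1', 'Payroll'), ('p2', 'Portal');
    INSERT INTO Project_Team VALUES ('e1', 'p1'), ('e1', 'p2'), ('e2', 'p1');
""")


def projects_of(ssn):
    """Projects one employee works on (many projects per employee)."""
    rows = conn.execute(
        "SELECT pid FROM Project_Team WHERE ssn = ? ORDER BY pid", (ssn,))
    return [pid for (pid,) in rows]


def team_of(pid):
    """Employees on one project (many employees per project)."""
    rows = conn.execute(
        "SELECT ssn FROM Project_Team WHERE pid = ? ORDER BY ssn", (pid,))
    return [ssn for (ssn,) in rows]
```

Each row of `Project_Team` pairs exactly one employee with exactly one project, so both directions of the many-to-many mapping are answered by a simple lookup.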
Cardinality Mappings
[Figure: diagrams of the four mappings: 1-to-1, 1-to-Many, Many-to-1, and Many-to-Many.]
ER Model – Key Constraints
[Figure: relationship set Works_In (since) between Employees (ssn, name) and Departments (did, dname).]
An employee can Work In multiple departments and
a department can have multiple employees.
What is the type of this relationship?
Many-to-Many
ER Model – Key Constraints
[Figure: relationship set Manages (since) between Employees (ssn, name) and Departments (did, dname), with an arrow marking the key constraint.]
An employee can Manage multiple departments, but a department
can be managed by only one employee (Manager)
What is the type of this relationship?
1-to-Many
This is called a key constraint (the restriction that each
department has at most one manager); it is denoted by an arrow
An Instance of the Manages Relationship Set
Department with did = ‘51’
violates the key constraint of
the Manages relationship
Instance of Manages relationship
that satisfies the key constraint
of the Manages relationship
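One common way to enforce this key constraint in a relational table is to make did the key of the Manages table. The sketch below uses Python's built-in `sqlite3` module; the helper `assign_manager` and the sample ssn values and years are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Manages (
        ssn   TEXT NOT NULL,
        did   INTEGER,
        since TEXT,
        PRIMARY KEY (did)  -- key constraint: at most one manager per department
    );
""")


def assign_manager(ssn, did, since):
    """Try to record that ssn manages did; return False if rejected."""
    try:
        conn.execute("INSERT INTO Manages VALUES (?, ?, ?)", (ssn, did, since))
        return True
    except sqlite3.IntegrityError:
        return False


first = assign_manager("123-22-3666", 51, "2014")
second = assign_manager("231-31-5368", 51, "2015")  # second manager for did 51
third = assign_manager("123-22-3666", 56, "2016")   # same employee, other department
```

The second call is rejected because department 51 already has a manager, while the third succeeds because an employee may manage several departments.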
Participation Constraints
• If every department is required to have a manager, this
requirement is a participation constraint
• The participation of the entity set Departments in the relationship
set Manages is total
• The participation of the entity set Employees in the relationship set
Manages is partial
– Since not every employee gets to manage a department
• Total participation of an entity set in a relationship set is
indicated by connecting them with a thick line
[Figure: Manages (since) between Employees (ssn, name) and Departments (did, dname); Departments is connected to Manages by a thick line.]
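A sketch of how total participation of Departments in Manages might be enforced once the design reaches a relational DBMS: the manager is stored in the department row and declared NOT NULL. It uses Python's built-in `sqlite3` module; the combined table name `Dept_Mgr`, the helper `add_department`, and all sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE Employees (ssn TEXT PRIMARY KEY, name TEXT);
    -- Departments and Manages combined: every department row needs a manager.
    CREATE TABLE Dept_Mgr (
        did   INTEGER PRIMARY KEY,
        dname TEXT,
        ssn   TEXT NOT NULL REFERENCES Employees(ssn),
        since TEXT
    );
    INSERT INTO Employees VALUES ('e1', 'Arzu'), ('e2', 'Alp');
""")


def add_department(did, dname, manager_ssn, since):
    """Insert a department; return False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO Dept_Mgr VALUES (?, ?, ?, ?)",
                     (did, dname, manager_ssn, since))
        return True
    except sqlite3.IntegrityError:
        return False


with_manager = add_department(51, "Finance", "e1", "2014")
without_manager = add_department(56, "Sales", None, None)      # no manager given
unknown_manager = add_department(60, "Support", "e9", "2015")  # e9 is not an employee
```

Employee e2 manages nothing and is still a valid row in Employees, which reflects the partial participation of Employees in Manages.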
Participation Constraints
• If each employee works in at least one department,
and if each department has at least one employee
– Is the participation of Employees and Departments in Works_In
total or partial? (Total for both, since every entity must take part)
[Figure: Employees (ssn, name) and Departments (did, dname) connected by both Works_In (since) and Manages (since).]
Class/ISA (“is a”) Hierarchies
• Classify entities into subclasses
• Every entity in a subclass also belongs
to the superclass (Employees)
• The attributes for the entity set
Employees are inherited by the entity
set Hourly_Emps
• Hourly_Emps ISA Employees
• Reasons for using ISA:
• To add descriptive attributes
specific to a subclass.
• To identify entities that
participate in a relationship
[Figure: ISA hierarchy. Employees (ssn, name) is the superclass; Hourly_Emps (hourly_wages, hours_worked) and Contract_Emps (contractid) are its subclasses.]
Class/ISA (“is a”) Hierarchies
• Specialization: process of identifying subsets
of an entity set (Employees) that share some
distinguishing characteristic
– Employees is specialized into subclasses
• Generalization: process of identifying some
common characteristics of a collection of entity
sets and creating a new entity set that contains
entities possessing these common
characteristics
– Hourly_Emps and Contract_Emps are
generalized by Employees
[Figure: ISA hierarchy with Employees (ssn, name) above Hourly_Emps (hourly_wages, hours_worked) and Contract_Emps (contractid).]
Class/ISA (“is a”) Hierarchies
• Overlap Constraints: determine whether two
subclasses are allowed to contain the same
entity
– Can Ahmet belong to both the Contract_Emps
entity set and Hourly_Emps?
• Covering Constraints: determine whether the
entities in the subclasses collectively include all
entities in the superclass
– Does every Employees entity have to belong to
one of Hourly_Emps and Contract_Emps?
[Figure: ISA hierarchy with Employees (ssn, name) above Hourly_Emps (hourly_wages, hours_worked) and Contract_Emps (contractid).]
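The two questions above can be phrased as set checks. The following Python sketch treats each entity set as a plain set of names; the functions `overlaps` and `covers` and the membership data are hypothetical, chosen only to show one instance where both questions have a visible answer.

```python
def overlaps(*subclasses):
    """Return the entities that appear in more than one subclass."""
    seen, shared = set(), set()
    for subclass in subclasses:
        shared |= seen & subclass
        seen |= subclass
    return shared


def covers(superclass, *subclasses):
    """True if the subclasses together include every superclass entity."""
    return superclass <= set().union(*subclasses)


# Hypothetical instance of the ISA hierarchy.
employees = {"Ahmet", "Esra", "Alp"}
hourly_emps = {"Ahmet", "Esra"}
contract_emps = {"Ahmet"}

shared = overlaps(hourly_emps, contract_emps)            # who is in both subclasses
covered = covers(employees, hourly_emps, contract_emps)  # is every employee classified
```

In this instance Ahmet is in both subclasses, so the data is only legal if overlap is allowed, and Alp is in neither, so it is only legal if the subclasses are not required to cover Employees.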
Weak Entities
• Weak entity set: an entity set that does not have a key of its own
• A weak entity can be identified uniquely only by
considering the primary key of another entity (called
identifying owner)
– Set of attributes of a weak entity set that uniquely identify a
weak entity for a given owner entity => partial key
• A weak entity set is denoted by a rectangle with thick lines
[Figure: Employees (ssn, name) connected through identifying relationship set Policy (cost) to weak entity set Dependents (pname, age).]
Weak Entities
• The relationship between a weak entity and the owner
entity is denoted by a diamond with thick lines
Weak Entities
• What can you say about the constraints on the
identifying relationship? (i.e., participation and key
constraints)
– Owner entity set and weak entity set must participate in a
one-to-many relationship set (one owner, many weak
entities)
– Weak entity set must have total participation in this
identifying relationship set
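These constraints suggest one possible relational treatment of a weak entity set, sketched here with Python's built-in `sqlite3` module: the partial key pname is combined with the owner's ssn, and dependents are deleted together with their owner. The combined table name `Dep_Policy`, the helper `dependents_of`, and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE Employees (ssn TEXT PRIMARY KEY, name TEXT);
    -- Weak entity set Dependents folded together with the Policy relationship.
    CREATE TABLE Dep_Policy (
        pname TEXT,               -- partial key
        age   INTEGER,
        cost  REAL,
        ssn   TEXT NOT NULL,      -- total participation: an owner is required
        PRIMARY KEY (pname, ssn), -- partial key plus the owner's key
        FOREIGN KEY (ssn) REFERENCES Employees(ssn) ON DELETE CASCADE
    );
    INSERT INTO Employees VALUES ('e1', 'Onur'), ('e2', 'Zubeyde');
    INSERT INTO Dep_Policy VALUES ('Mert', 5, 100.0, 'e1');
    INSERT INTO Dep_Policy VALUES ('Mert', 7, 120.0, 'e2');
""")


def dependents_of(ssn):
    rows = conn.execute(
        "SELECT pname FROM Dep_Policy WHERE ssn = ? ORDER BY pname", (ssn,))
    return [pname for (pname,) in rows]


before = dependents_of("e1")
conn.execute("DELETE FROM Employees WHERE ssn = 'e1'")
after = dependents_of("e1")   # the weak entity disappears with its owner
```

Two dependents named Mert can coexist because they belong to different owners, which is exactly what a partial key allows.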
Aggregation
• Used to indicate that a relationship set
(denoted by a dashed box) participates
in another relationship set
– Allows us to treat a relationship set as an
entity set for purposes of participation in
other relationships
• Aggregation vs. Ternary relationship:
– Monitors is a distinct relationship, with a
descriptive attribute
– Also, can say that each sponsorship is
monitored by at most one employee
[Figure: relationship set Sponsors (since) between Projects (pid, started_on, pbudget) and Departments (did, dname, budget) is enclosed in a dashed box; relationship set Monitors (until) connects Employees (ssn, name) to that box.]
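A sketch of the aggregation idea in relational terms, using Python's built-in `sqlite3` module: Monitors refers to a sponsorship (a did, pid pair) rather than to a project or department on its own. The helper `monitor` and the sample rows are assumptions for illustration, and the Employees, Projects, and Departments tables are left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.executescript("""
    CREATE TABLE Sponsors (
        did INTEGER, pid INTEGER, since TEXT,
        PRIMARY KEY (did, pid)
    );
    CREATE TABLE Monitors (
        did   INTEGER,
        pid   INTEGER,
        ssn   TEXT NOT NULL,
        until TEXT,
        PRIMARY KEY (did, pid),  -- at most one monitor per sponsorship
        FOREIGN KEY (did, pid) REFERENCES Sponsors(did, pid)
    );
    INSERT INTO Sponsors VALUES (51, 1, '2014');
""")


def monitor(did, pid, ssn, until):
    """Record that ssn monitors the sponsorship (did, pid); False if rejected."""
    try:
        conn.execute("INSERT INTO Monitors VALUES (?, ?, ?, ?)",
                     (did, pid, ssn, until))
        return True
    except sqlite3.IntegrityError:
        return False


first = monitor(51, 1, "e1", "2016")
second_monitor = monitor(51, 1, "e2", "2017")   # sponsorship already monitored
no_sponsorship = monitor(56, 2, "e1", "2017")   # no such sponsorship exists
```

The foreign key on the pair (did, pid) is what makes the sponsorship itself, not its parts, the thing being monitored.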
Conceptual Design Using the ER Model
• Design choices:
– Should a concept be modeled as an entity or an
attribute?
– Should a concept be modeled as an entity or a
relationship?
– Identifying relationships: Binary or ternary?
Aggregation?
• Constraints in the ER Model:
– A lot of data semantics can (and should) be captured
– But some constraints cannot be captured in ER
diagrams
Entity vs. Attribute
• Should address be an attribute of Employees or an
entity (connected to Employees by a relationship)?
• Depends upon the use we want to make of address
information, and the semantics of the data:
– If only one address is to be recorded per employee
• Use attribute ‘address’
– If we have several addresses per employee
• address must be an entity (since attributes cannot be set-valued)
– If we want to capture the structure (break down address
into country, city, street, etc.) of an address
• e.g., we want to retrieve employees in a given city
• address must be modeled as an entity (since attribute values are
atomic)
Entity vs. Attribute
• Works_In does not allow an employee to work in a department
for two or more periods
– This possibility is ruled out by the ER diagram’s semantics,
because a relationship is uniquely identified by the
participating entities (without reference to its descriptive
attributes)
• Similar to the problem of wanting to record several
addresses for an employee
– We want to record several values of the descriptive
attributes for each instance of this relationship
– Accomplished by introducing a new entity set, Duration
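[Figure (first design): relationship set Works_In with descriptive attributes from and to, between Employees (ssn, name) and Departments (did, dname, budget).]
[Figure (second design): ternary relationship set Works_In among Employees (ssn, name), Departments (did, dname, budget), and the new entity set Duration (from, to).]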
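The difference between the two designs can be seen with plain Python containers. In this sketch a relationship is identified by a tuple of its participating entities; the ssn value, department number, and years are hypothetical.

```python
# First design: a Works_In relationship is identified by (ssn, did) alone,
# and from/to are descriptive attributes stored with it.
works_in = {}
works_in[("e1", 51)] = {"from": 2010, "to": 2012}
works_in[("e1", 51)] = {"from": 2014, "to": 2016}  # replaces the first period

# Second design: Duration is a participating entity set, so the period is
# part of what identifies each relationship.
works_in_with_duration = set()
works_in_with_duration.add(("e1", 51, (2010, 2012)))
works_in_with_duration.add(("e1", 51, (2014, 2016)))
```

After these statements `works_in` holds one entry for employee e1 in department 51, while `works_in_with_duration` holds both periods.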
Entity vs. Relationship
• ER diagram is OK if a manager gets a separate discretionary
budget for each department
• What if a manager gets a discretionary budget that covers
all managed departments?
– Redundancy: dbudget stored for each dept managed by
manager
– Misleading: suggests dbudget is associated with the
department-manager combination
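[Figure (first design): relationship set Manages with attributes since and dbudget, between Employees (ssn, name) and Departments (did, dname, budget).]
[Figure (second design): Managers ISA Employees (ssn, name); dbudget is an attribute of Managers, and Manages (since) connects Managers to Departments (did, dname, budget).]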
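The redundancy can be made concrete with a small Python sketch. The variable names, the helper functions, and the amounts are all invented for this illustration.

```python
# First design: dbudget is an attribute of Manages, so it is repeated for
# every department the same manager runs.
manages_v1 = [
    {"ssn": "e1", "did": 51, "since": 2014, "dbudget": 10000},
    {"ssn": "e1", "did": 56, "since": 2016, "dbudget": 10000},
]

# Second design: dbudget is an attribute of the Managers entity set.
managers = {"e1": {"dbudget": 10000}}
manages_v2 = [
    {"ssn": "e1", "did": 51, "since": 2014},
    {"ssn": "e1", "did": 56, "since": 2016},
]


def copies_of_dbudget_v1(ssn):
    """How many times the first design stores this manager's budget."""
    return sum(1 for row in manages_v1 if row["ssn"] == ssn)


def set_dbudget_v2(ssn, amount):
    """One update, no matter how many departments are managed."""
    managers[ssn]["dbudget"] = amount
```

In the first design a budget change has to touch every Manages row of that manager, and missing one leaves the data inconsistent; the second design stores the value once.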
Entity vs. Relationship
• Redundancies are eliminated by the normalization technique
Binary vs. Ternary Relationship
• The diagram models the situation where:
– An employee can own several policies
– Each policy can be owned by several
employees
– Each dependent can be covered by
several policies
[Figure: ternary relationship set Covers among Employees (ssn, name), Policies (policyid, cost), and Dependents (pname, age).]
Binary vs. Ternary Relationship
• If we have additional requirements:
– A policy cannot be owned jointly by
two or more employees
– Every policy must be owned by
some employee
– Dependents is a weak entity set, uniquely
identified by taking pname in conjunction
with the policyid of a Policies entity
• With these requirements the ternary ER diagram is inaccurate
[Figure (bad design): ternary relationship set Covers among Employees (ssn, name), Policies (policyid, cost), and Dependents (pname, age).]
[Figure (better design): binary relationship sets Purchaser, between Employees (ssn, name) and Policies (policyid, cost), and Beneficiary, between Policies and Dependents (pname, age).]
• What are the additional
constraints in the 2nd diagram?
Binary vs. Ternary Relationship (Contd.)
• An example in the other direction:
• A ternary relation Contracts relates entity sets
Parts, Departments and Suppliers, and has
descriptive attribute qty.
• No combination of binary relationships is an
adequate substitute:
– S “can-supply” P, D “needs” P, and D “deals-with”
S does not imply that D has agreed to buy P from S
– How do we record qty?
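A small Python sketch of why the three binary relationships fall short. The supplier, part, and department identifiers and the quantity are hypothetical values chosen for the illustration.

```python
can_supply = {("S1", "P1"), ("S2", "P1")}   # (supplier, part)
needs      = {("D1", "P1")}                 # (department, part)
deals_with = {("D1", "S1"), ("D1", "S2")}   # (department, supplier)

# Ternary relationship set Contracts with its descriptive attribute qty.
contracts = {("P1", "D1", "S1"): 500}

# Every (part, department, supplier) triple that the three binary
# relationships are consistent with.
implied = {
    (p, d, s)
    for (s, p) in can_supply
    for (d, p2) in needs if p2 == p
    for (d2, s2) in deals_with if d2 == d and s2 == s
}

# Triples the binary design cannot rule out although no contract exists.
spurious = implied - set(contracts)
```

The binary relationships also admit the triple (P1, D1, S2), for which no contract was signed, and none of the three pairs offers a natural place to store qty.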
Summary of Conceptual Design
• Conceptual design follows requirements analysis
– Yields a high-level description of data to be stored
• ER model popular for conceptual design
– Constructs are expressive, close to the way people
think about their applications
• Basic constructs
– entities, relationships, and attributes (of entities and
relationships)
• Some additional constructs
– weak entities, ISA hierarchies, and aggregation
• Note: There are many variations on ER model
Summary of Conceptual Design
• Several kinds of integrity constraints can be
expressed in the ER model:
– key constraints
– participation constraints
– overlap/covering constraints for ISA hierarchies
• Some foreign key constraints are also implicit in
the definition of a relationship set
– Some constraints (notably, functional dependencies)
cannot be expressed in the ER model
• Constraints play an important role in determining
the best database design for an enterprise
Summary of Conceptual Design
• ER design is subjective
• There are often many ways (alternatives) to
model a given scenario
• Common choices include:
– Entity vs. attribute, Entity vs. relationship
– Binary or n-ary relationship
– Whether or not to use ISA hierarchies / aggregation
• To ensure good database design:
– Resulting relational schema should be analyzed and
refined further
– FD information and normalization techniques are
especially useful
ER Modeling Question - 0
• You should be able to explain the following terms:
– entity, relationship, entity set, relationship set,
– attribute, domain,
– one-to-many relationship, many-to-many
relationship,
– participation constraint, overlap constraint,
covering constraint,
– weak entity set, aggregation, role indicator.
ER Modeling Example - 1
• A university database contains information
about professors (identified by social security
number, or SSN) and courses (identified by
courseid)
– Professors teach courses; each of the following
situations concerns the Teaches relationship set.
– For each situation, draw an ER diagram that
describes it (assuming no further constraints hold)
– Professors can teach the same course in several
semesters, and each offering must be recorded
– Professors can teach the same course in several
semesters, and only the most recent such offering
needs to be recorded. (Assume this condition
applies in all subsequent questions.)
– Every professor must teach some course
– Every professor teaches exactly one course (no
more, no less)
– Every professor teaches exactly one course (no
more, no less), and every course must be taught
by some professor
– Now suppose that certain courses can be taught
by a team of professors jointly, but it is possible
that no one professor in a team can teach the
course. Model this situation, introducing
additional entity sets and relationship sets if
necessary
Different ER Modeling Notations
Chen vs. Crow’s Foot Notation
Crow’s Foot Notation
