CS 474 - Independent Study
Terminology
- Database
- A self-describing collection of records.
- Schema
- A complete logical view of the database.
- Meta data
- Data about data. Defines the structure of the database.
- Relation/Table
- A two dimensional array containing single-value entries and no
duplicate rows. Values for a given entity instance are shown in rows;
values of attributes of that entity are shown in columns. The order of
rows and columns is immaterial.
- Domain
- A named set of all possible values tahat an attribute can have.
Domains can be defined by listing their values or by defining a rule
for the allowed values.
- Column/Attribute/Field
- Define a property or characteristic of an entity. Values in a single
column must come from a single domain.
- Row/Tuple/Record
- Include all of the characteristics of a single entity instance in a
relation.
- Entity
- Loosely - Anything that we want to track.
- Entity class
- The collection of all individuals within a single entity.
- Entity instance
- A single tuple from an entity class.
- Entity-Relationship diagram
- A database model which describes the named entities in a database,
their characteristics and their relationships with other entities. A
tool for documenting or describing a database.
- Existence Dependency
- An existence dependency exists if there is one entity that depends on
another entity for its existence. In other words, one entity makes
sense in the database only in relation to another.
- Strong Entity
- In an existence dependency relationship, the strong entity is the
entity that is the parent or upon which the other entity depends.
- Weak Entity
- In an existence dependency relationship, the weak entity is the
entity who exists only in relation to its parent.
- Key
- Loosely - The attribute or attributes that can be used to identify
unique tuples in a relation.
- Superkey
- Any of a set of attributes which can be used to identify unique
tuples in a relation.
- Candidate key
- One or more attributes that uniquely identify tuples in a relation.
Candiate keys is a superkey such that no proper subset of the
attributes is itself a superkey.
- Primary key
- The single candidate key that is chosen to identify unique tuples in
a relation. A primary key may never have null values.
- Alternate key
- The candidate keys that are not chosen to identify unique tuples in a
relation.
- Foreign key
- A field in one table that has the same values as the primary key of
another.
- Secondary key - Index
- A way to access records in a table in a particular order. Does not
require unique values for all tuples in the relation.
- Composite key
- A key with multiple attributes.
- Surrogate key
- A field that is added to a table to serve as a "meaningless" key.
- Degree
- Number of attributes in a table.
- Cardinality
- Number of rows in a table.
- Cardinality of a Relationship
- Number of entities linked by a relationship (ie. binary - 2, ternary
- 3, n-ary n)
- Integrity
- Correctness and internal consistency of the data in the database by
not allowing users to enter data that would make the database
incorrect.
- Entity integrity
- A primary key constraint that requires that no attribute of the
primary key may contain null values.
- Referential integrity
- A foreign key constraint that requires that no foreign key value can
be used unless the value is found as the primary key in the home
table.
- Domain integrity
- Restriction on the allowed values for an attribute.
- Integrity constraints
- Rules or restrictions that apply to all instances of a database.
- Select σ
- A unary operation which produces a new table (or result set) which
meet a specified condition.
- Project π
- A unary operation which produces a new table (or result set) which
contains a subset of the attributes of the original table.
- Product X
- A binary operation which produces a new table (or result set) which
contains all rows of one table combined with all rows of the second
table. The new table will contain all of the attributes of both tables,
degree = (AttributesTableA + AttributesTableB), and the number of rows
- cardinality = (RowsTableA * RowsTableB).
- Join
- An binary operation which uses a Product result to put two tables
together in specific ways.
- Theta Join
- A join operation which is the result of performing a Select operation
on a Product result.
- EquiJoin
- A join operation in which the Select criterion is based on a common
element or elements between two tables having identical values.
- NaturalJoin
- An EquiJoin operation in which the values of the common elements are
only shown once in the resulting table.
- SemiJoin
- An EquiJoin where the result table is then Projected over only one of
the tables.
- RightSemiJoin/LeftSemiJoin
- A SemiJoin where in RightSemiJoin the rightmost operand is Projected
and in LeftSemiJoin the leftmost is Projected.
- OuterJoin
- A Join operation which extends EquiJoin or NaturalJoin by including
all of the rows of one table using nulls for attributes in the other
table for which is there is no match on the common elements.
- RightOuterJoin/LeftOuterJoin
- An OuterJoin in which the rows from the right operand are included in
their entirety is a RightOuterJoin and where the leftmost operand is
included in its entirety, we have a LeftOuterJoin.
- Division
- A binary operation on two tables where the entire structor of the
divisor table is contained in the attributes of the dividend. The
Division result set tells us which attributes are present for all
values of the divisor table contained in the dividend.
- Union
- A binary operation which combines two tables which contain the same
attributes. A Union results in a table with the same attributes as the
original tables and all of the unique tuples found in one or the other
or both.
- Union Compatible
- Two tables are union compatible if they contain the same
attributes.
- Intersection
- A binary operation which results in a table containing the rows that
are the same between two union compatible tables.
- Minus
- A binary operation which results in a table containing the rows in
the left operand that are not present in the right.
- Normalization
- The process of refining a database to remove anomalies.
- Update Anomaly
- Due to duplication, the update of one row in a table, results in
inconsistent data in other rows of the table where the meaning
(semantic) is the same.
- Insertion Anomaly
- We cannot insert new data into a table because of the dependency of
this data on information that is not yet present.
- Deletion Anomaly
- Deletion of one row in a table results in the loss of data.
- Functional Dependency
- If R is a relation schema and A and B are non-empty sets of
attributes in R, we say that B is functionally dependent on A if, and
only if, each value of A in R has associated with it exactly one value
of B in R.
- Determinant
- A determines B if B is functionally dependent on A. A is the
determinant.
- Dependent
- B is dependent on A if B is functionally dependent on A. B is the
dependent.
- First Normal Form
- A relation is in first normal form (1NF) if, and only if, every
attribute is single-valued for each tuple.
- Atomic
- A datum is atomic if it is indivisible...ie, it is not a collection
of other data.
- Fully Functionally Dependent
- In a relation R, attribute A of R is fully functionally dependent on
an attribute or set of attributes X of R if A is functionally dependent
on X but not functionally dependent on any proper subset of X.
- Second Normal Form
- A relation is in second normal form (2NF) if, and only if, it is in
1NF and all the non-key attributes are fully functionally dependent on
the key.
- Transitive Dependency
- If A, B, and C are attributes of relation R, such that A determines B
and B determines C, then C is transitively dependent on A.
- Third Normal Form
- A relation is in third normal form (3NF) if, and only if, it is in
2NF and whenever a non-trivial functional dependency X determines A
exists, then either X is a superkey or A is a member of some candidate
key.
- Boyce-Codd Nromal Form
- A relation is in Boyce-Codd normal form (BCNF) if, and only if, it is
in 2NF and whenever a non-trivial functional dependency X determines A
exists, then either X is a superkey.
- Lossless Decomponsition
- A decomposition of a relation R is called a lossless decomposition
for R if the natural join of the new relations produces exactly the
relation R.