contents
Chapter 13
Design Theory for Relational Databases
名詞定義
Functional Dependencies
- 定義
X -> Y is an assertion about a relation R that whenever two tuples of R agree on all the attributes of X , then they must also agree on all attributes in set Y . - 特性
- X->A1A2 … An holds for R exactly when each of X->A1, X->A2,…, X->An hold for R.
例子 Drinkers(name, addr, beersLiked, manf, favBeer)
FD : name -> addr favBeer & beersLiked -> manf
name -> addr favBeer = name -> addr and name -> favBeer .
- X->A1A2 … An holds for R exactly when each of X->A1, X->A2,…, X->An hold for R.
- 定義
Keys of Relations
換句話說 superkey 可能是複合的好幾個 key 構成,因此 key 一定也是 superkey,不論 superkey 或者是 key 都可以決定所有的 attribute。特別注意,key 是最小單位,因此 key 的子集不會是 key。- superkey
K is a superkey for relation R if K functionally determines all of R. - key
K is a key for R if K is a superkey, but no proper subset of K is a superkey. - 例子 Drinkers(name, addr, beersLiked, manf, favBeer)
FD : name -> addr favBeer & beersLiked -> manf
得到 {name, beersLiked} 是一個 superkey,發現 {name} 和 {beersLiked} 不是 superkey,得到 {name, beersLiked} 是一個 key。
- superkey
Inference Test
如何找到 inferring FD?所謂的 inferring FD 就是藉由遞移關係得到的 FD,可以由原本集合中的幾個 FD 推斷出來,因此這些 inferring FD 不是必要存在的。那就先把 Y -> B 移除,再由剩下的 FD 組合,查看是否存在可以推論出 B 的可能,如果任何情況都推不出 B,則 Y -> B 不是 inferring FD。
We are given FD’s X1 -> A1, X2 -> A2,…, Xn -> An , and we want to know whether an FD Y -> B must hold in any relation that satisfies the given FD’s.
例子 If A -> B and B -> C hold, surely A -> C.Closure Test
遞移閉包測試。
An easier way to test is to compute the closure of Y , denoted Y+ .- Basis: Y+ = Y.
- Induction: Look for an FD’s left side X that is a subset of the current Y+ . If the FD is X -> A, add A to Y+ .
Relational Schema Design
schema design 目標是避免異常、冗餘。
Goal of relational schema design is to avoid anomalies and redundancy.
- Update anomaly :
更新異常,當一個事實改變時,只會發生一次修改,而不會發生很多次的修改。意即針對一個關聯屬性修改,最多更動一個 row (entity),不會更動數行的 row。
one occurrence of a fact is changed, but not all occurrences. Deletion anomaly :
刪除異常,當一個 row 移除時,確定合法的事實被移除。如果發生關聯上屬性移除,但該屬性本應該存在,卻遭到移除而不存在資料庫中,就這麼地憑空消失。
valid fact is lost when a tuple is deleted.Boyce-Codd Normal Form
無論怎麼找 FD,都會發現到 FD 的 left hand side X 都是 superkey,那麼 R 就是一個 BCNF 。
We say a relation R is in BCNF if whenever X ->Y is a nontrivial FD that holds in R, X is a superkey.- nontrivial means Y is not contained in X.
- a superkey is any superset of a key (not necessarily a proper superset).
- 例子 Drinkers(name, addr, beersLiked, manf, favBeer)
FD : name -> addr favBeer & beersLiked -> manf
In each FD, the left side is not a superkey. Shows Drinkers is not in BCNF.
Decomposition into BCNF
|
|
Third Normal Form
接續上面的 Decomposition,將會產生一些問題。拆分上不合乎常理,發生過度的拆分,這問題發生在於給定的 FD 格式。
舉個例子 (street address, city, zip code)
- FD : AB -> C and C -> B.
Example: A = street address, B = city, C = zip code. - There are two keys, {A, B} and {A, C} .
- C -> B is a BCNF violation, so we must decompose into AC, BC.
- 顯然地 AC 有很嚴重的問題,既不存在 A -> C 也不存在 C -> A。
避免此問題
- An attribute is prime if it is a member of any key. 也就是 attribute 是某一 key 的 subset。
- X -> A violates 3NF if and only if X is not a superkey, and also A is not prime.
接續上個例子 (street address, city, zip code)
- There are two keys, {A, B} and {A, C} .
- Thus A, B, and C are each prime.
- Although C -> B violates BCNF, it does not violate 3NF.
properties of a decomposition
最後,來看看得到的 3NF 和 BCNF 給予什麼特性?
There are two important properties of a decomposition:
- Lossless Join :
it should be possible to project the original relations onto the decomposed schema, and then reconstruct the original. - Dependency Preservation :
it should be possible to check in the projected relations whether all the given FD’s are satisfied.
資料不失真,保證可以藉由投影、Join 將分解的 schema 組合成原本的資料。同時所有的相依關係仍然保留,可以把原本 FD 都滿足所有的 schema。