資料庫複習 Ch.13

contents

  1. 1. Chapter 13
    1. 1.1. 名詞定義
    2. 1.2. Relational Schema Design
    3. 1.3. Third Normal Form
    4. 1.4. properties of a decomposition

Chapter 13

Design Theory for Relational Databases

名詞定義

  • Functional Dependencies

    • 定義
      X -> Y is an assertion about a relation R that whenever two tuples of R agree on all the attributes of X , then they must also agree on all attributes in set Y .
    • 特性
      • X->A1A2 … An holds for R exactly when each of X->A1, X->A2,…, X->An hold for R.
        例子 Drinkers(name, addr, beersLiked, manf, favBeer)
        FD : name -> addr favBeer & beersLiked -> manf
        name -> addr favBeer = name -> addr and name -> favBeer .
  • Keys of Relations
    換句話說 superkey 可能是複合的好幾個 key 構成,因此 key 一定也是 superkey,不論 superkey 或者是 key 都可以決定所有的 attribute。特別注意,key 是最小單位,因此 key 的子集不會是 key。

    • superkey
      K is a superkey for relation R if K functionally determines all of R.
    • key
      K is a key for R if K is a superkey, but no proper subset of K is a superkey.
    • 例子 Drinkers(name, addr, beersLiked, manf, favBeer)
      FD : name -> addr favBeer & beersLiked -> manf
      得到 {name, beersLiked} 是一個 superkey,發現 {name} {beersLiked} 不是 superkey,得到 {name, beersLiked} 是一個 key。
  • Inference Test
    如何找到 inferring FD?所謂的 inferring FD 就是藉由遞移關係得到的 FD,可以由原本集合中的幾個 FD 推斷出來,因此這些 inferring FD 不是必要存在的。那就先把 Y -> B 移除,再由剩下的 FD 組合,查看是否存在可以推論出 B 的可能,如果任何情況都推不出 B,則 Y -> B 不是 inferring FD。
    We are given FD’s X1 -> A1, X2 -> A2,…, Xn -> An , and we want to know whether an FD Y -> B must hold in any relation that satisfies the given FD’s.
    例子 If A -> B and B -> C hold, surely A -> C.

  • Closure Test
    遞移閉包測試。
    An easier way to test is to compute the closure of Y , denoted Y+ .

    • Basis: Y+ = Y.
    • Induction: Look for an FD’s left side X that is a subset of the current Y+ . If the FD is X -> A, add A to Y+ .

Relational Schema Design

schema design 目標是避免異常、冗餘。

Goal of relational schema design is to avoid anomalies and redundancy.

  • Update anomaly :
    更新異常,當一個事實改變時,只會發生一次修改,而不會發生很多次的修改。意即針對一個關聯屬性修改,最多更動一個 row (entity),不會更動數行的 row。
    one occurrence of a fact is changed, but not all occurrences.
  • Deletion anomaly :
    刪除異常,當一個 row 移除時,確定合法的事實被移除。如果發生關聯上屬性移除,但該屬性本應該存在,卻遭到移除而不存在資料庫中,就這麼地憑空消失。
    valid fact is lost when a tuple is deleted.

  • Boyce-Codd Normal Form
    無論怎麼找 FD,都會發現到 FD 的 left hand side X 都是 superkey,那麼 R 就是一個 BCNF
    We say a relation R is in BCNF if whenever X ->Y is a nontrivial FD that holds in R, X is a superkey.

    • nontrivial means Y is not contained in X.
    • a superkey is any superset of a key (not necessarily a proper superset).
    • 例子 Drinkers(name, addr, beersLiked, manf, favBeer)
      FD : name -> addr favBeer & beersLiked -> manf
      In each FD, the left side is not a superkey. Shows Drinkers is not in BCNF.
  • Decomposition into BCNF

1
2
3
4
5
6
7
BCNF_Decompose(R, FD)
find X s.t.: X != (X+) != [all attributes] // X is one of FD's left hand side
if (not found) then “R is in BCNF”
let Y = (X+) - X
let Z = [all attributes] - (X+)
decompose R into R1(X union Y) and R2(X union Z)
continue to decompose recursively R1 and R2

Third Normal Form

接續上面的 Decomposition,將會產生一些問題。拆分上不合乎常理,發生過度的拆分,這問題發生在於給定的 FD 格式。

舉個例子 (street address, city, zip code)

  • FD : AB -> C and C -> B.
    Example: A = street address, B = city, C = zip code.
  • There are two keys, {A, B} and {A, C} .
  • C -> B is a BCNF violation, so we must decompose into AC, BC.
  • 顯然地 AC 有很嚴重的問題,既不存在 A -> C 也不存在 C -> A。

避免此問題

  • An attribute is prime if it is a member of any key. 也就是 attribute 是某一 key 的 subset。
  • X -> A violates 3NF if and only if X is not a superkey, and also A is not prime.

接續上個例子 (street address, city, zip code)

  • There are two keys, {A, B} and {A, C} .
  • Thus A, B, and C are each prime.
  • Although C -> B violates BCNF, it does not violate 3NF.

properties of a decomposition

最後,來看看得到的 3NF 和 BCNF 給予什麼特性?

There are two important properties of a decomposition:

  • Lossless Join :
    it should be possible to project the original relations onto the decomposed schema, and then reconstruct the original.
  • Dependency Preservation :
    it should be possible to check in the projected relations whether all the given FD’s are satisfied.

資料不失真,保證可以藉由投影、Join 將分解的 schema 組合成原本的資料。同時所有的相依關係仍然保留,可以把原本 FD 都滿足所有的 schema。