Relationship between Database Normalization and RDF#
Normalization - 1NF, 2NF, 3NF and 4NF, https://www.youtube.com/watch?v=UrYLYV7WSHM, User channel5567
Items: XBOX1, PS4, PSVITA
Bottom up#
Ausgangspunkt: Eine Tabelle. Ziel: mehrere normalisierte Tabellen.
NF 1#
nf1, Table Sales Records, 5:00, columns:
Cust ID, Cust Name, Shipping Address, Newsletter, Supplier, Supplier Phone, Price

NF 2#
nf2, 8:46
#C-CN-SA-N: #Cust ID, Cust Name, Shipping Address, Newsletter
#Item, Supplier, Supplier Phone, Price
C-I: Cust ID, Item (Table Transactions, keine ID)

NF 3#
Wir betrachten nur die rechte Hälfte … 2 Abbildungen, ähnliche Farbgebung wie im Video
2NF, 9:58, Tables:
Table #I-S-SP-P: #Item, has_Supplier, has_Supplier_Phone, has_Price
Wir haben hier eine einzige Tabelle, mithin eine Entity, mit 3 Attributen: Ein Item hat einen Supplier, ein Supplier Phone, einen Price …
Problem (9:58): wir entdecken eine logische Abhängigkeit zwischen den Attribute has_Supplier und has_Supplier_Phone , “seperate it out” … das Telefon des Suppliers ist nicht vom Item, sondern vom Supplier abhängig ; “obviously we have to sepererate it out” .. we will get:
3NF, Tables 12:06
Table #I-S-P: #Item, has_Supplier, has_Price
Table #S-SP: #Supplier, has_Supplier Phone

Was passiert hier? Wir erkennen, dass wir nicht nur eine einzige Entity Item haben, sondern dass eine weitere Entity Supplier gibt. Für diese legen wir eine eigene Tabelle an, und moven die Spalte `has_Supplier_Phone” in diese Tabelle.
Nachteil: Wenn wir in unsere Datenbank ein neues Item einpflegen, müssen wir bei Bedarf auch einen neuen Supplier einpflegen. Vorteil: Vermeidung von Inkonsistenzen wie z.B. zwei unterschiedliche Telefonnummern eine Providers.
ChatGPT#
(We simply upload the former image to ChatGPT withot any other question. This is the answer:)
This diagram is a visual representation of Third Normal Form (3NF) in database normalization. The key principle of 3NF is that:
All fields (columns) must depend only on the primary key, and nothing else.
Here’s how each table supports 3NF:
🟧 Customer Table
Primary Key:
Cust IDOther fields like
Cust Name,Shipping Address, andNewsletterare dependent only onCust ID.Ensures no transitive dependencies (i.e., no column depends on a non-key column).
customers = [
{"Cust_ID": "at_smith", "Cust_Name": "Alan Smith", "Shipping_Address": "35 Palm St, Miami", "Newsletter": "Xbox News"},
{"Cust_ID": "roger25", "Cust_Name": "Roger Banks", "Shipping_Address": "47 Campus Rd, Boston", "Newsletter": "PlayStation News"},
{"Cust_ID": "wilson44", "Cust_Name": "Evan Wilson", "Shipping_Address": "28 Rock Av, Denver", "Newsletter": "Xbox News"},
{"Cust_ID": "wilson44", "Cust_Name": "Evan Wilson", "Shipping_Address": "28 Rock Av, Denver", "Newsletter": "PlayStation News"},
{"Cust_ID": "am_smith", "Cust_Name": "Alan Smith", "Shipping_Address": "47 Campus Rd, Boston", "Newsletter": "PlayStation News"},
]
🟥 Customer-Item Relationship Table
Composite Primary Key:
Cust ID + ItemShows which customer bought or subscribes to which item.
Maintains many-to-many relationships between customers and items.
customer_items = [
{"Cust_ID": "at_smith", "Item": "Xbox One"},
{"Cust_ID": "roger25", "Item": "PlayStation 4"},
{"Cust_ID": "wilson44", "Item": "Xbox One"},
{"Cust_ID": "wilson44", "Item": "PS Vita"},
{"Cust_ID": "am_smith", "Item": "PlayStation 4"},
]
🟨 Item Table
Primary Key:
ItemHas a Foreign Key to the Supplier.
Priceis dependent only onItem, not onSupplier, preserving 3NF.
items = [
{"Item": "Xbox One", "Supplier": "Microsoft", "Price": 250},
{"Item": "PlayStation 4", "Supplier": "Sony", "Price": 300},
{"Item": "PS Vita", "Supplier": "Sony", "Price": 200},
]
🟩 Supplier Table
Primary Key:
SupplierSupplier Phoneis directly dependent on theSupplier.
suppliers = [
{"Supplier": "Microsoft", "Supplier_Phone": "(800) BUY-XBOX"},
{"Supplier": "Sony", "Supplier_Phone": "(800) BUY-SONY"},
]
✅ How It Meets 3NF
No transitive dependencies.
Each non-key attribute is fully functionally dependent on the whole key.
Repetition and redundancy are minimized.
Tables are modular and can be joined by keys when needed.
Of course! Here’s how you can convert each of those Python data structures into pandas DataFrames:
import pandas as pd
# 🟧 Customer Table
customers_df = pd.DataFrame([
{"Cust_ID": "at_smith", "Cust_Name": "Alan Smith", "Shipping_Address": "35 Palm St, Miami", "Newsletter": "Xbox News"},
{"Cust_ID": "roger25", "Cust_Name": "Roger Banks", "Shipping_Address": "47 Campus Rd, Boston", "Newsletter": "PlayStation News"},
{"Cust_ID": "wilson44", "Cust_Name": "Evan Wilson", "Shipping_Address": "28 Rock Av, Denver", "Newsletter": "Xbox News"},
{"Cust_ID": "wilson44", "Cust_Name": "Evan Wilson", "Shipping_Address": "28 Rock Av, Denver", "Newsletter": "PlayStation News"},
{"Cust_ID": "am_smith", "Cust_Name": "Alan Smith", "Shipping_Address": "47 Campus Rd, Boston", "Newsletter": "PlayStation News"},
])
# 🟥 Customer-Item Table
customer_items_df = pd.DataFrame([
{"Cust_ID": "at_smith", "Item": "Xbox One"},
{"Cust_ID": "roger25", "Item": "PlayStation 4"},
{"Cust_ID": "wilson44", "Item": "Xbox One"},
{"Cust_ID": "wilson44", "Item": "PS Vita"},
{"Cust_ID": "am_smith", "Item": "PlayStation 4"},
])
# 🟨 Item Table
items_df = pd.DataFrame([
{"Item": "Xbox One", "Supplier": "Microsoft", "Price": 250},
{"Item": "PlayStation 4", "Supplier": "Sony", "Price": 300},
{"Item": "PS Vita", "Supplier": "Sony", "Price": 200},
])
# 🟩 Supplier Table
suppliers_df = pd.DataFrame([
{"Supplier": "Microsoft", "Supplier_Phone": "(800) BUY-XBOX"},
{"Supplier": "Sony", "Supplier_Phone": "(800) BUY-SONY"},
])
You can now run these in any Python environment with pandas installed to manipulate and query the data.
Want an example of joining these tables together (like a SQL JOIN) using pandas?
Here’s how you can join all these tables together using pandas, similar to SQL joins.
We’ll perform the joins in this order:
customer_items_df⬅️ join withcustomers_dfonCust_IDResult ⬅️ join with
items_dfonItemResult ⬅️ join with
suppliers_dfonSupplier
# Step 1: Join customer_items with customers on Cust_ID
merged_df_1 = pd.merge(customer_items_df, customers_df, on="Cust_ID", how="left")
# Step 2: Join the result with items on Item
merged_df_2 = pd.merge(merged_df_1, items_df, on="Item", how="left")
# Step 3: Join the result with suppliers on Supplier
final_df = pd.merge(merged_df_2, suppliers_df, on="Supplier", how="left")
# Show the final merged DataFrame
print(final_df)
🔍 Final Output Columns#
The resulting DataFrame (final_df) will include:
Cust_IDItemCust_NameShipping_AddressNewsletterSupplierPriceSupplier_Phone
Let me know if you want to filter this data (e.g., only see PlayStation buyers or group by supplier).
NF 4#

Top-Down#
Ausgangspunkt: Ein Verständnis von Welt. Fragestellung: Wie modellieren wir das? ein Team setzt sich zusammen … legt Attribute auf den Tisch: has_Supplier, has_Supplier_Phone, has_Price … Brainstorming: Welche Dinge der Welt werden damit beschrieben? … Weltwissen: Handelswaren, die einen Preis haben, wollen wir Produkt (trading good, allgemeiner: Item) nennen; oder: typischerweise sind es Handelswaren, die einen Preis haben. Zum Attribut Telefonnummer: Typischerweise sind es Menschen oder Unternehmen, die eine Telefonnummer haben; Handelswaren haben i.A. keine Telefonnummer; die Benennungen der Attribut has_Supplier_Phone und has_Supplier weisen darauf hin, dass es auch eine Entity Supplier gibt; also scheint das Attribut has_Supplier_Phone ein Attribut eines Dings Supplier zu sein … wie kommunizieren, wie notieren wir das? Willy Chen hat hier eine Notation entworfen, nämlich ER-Diagramme …
BSP ER-Diagramm Situation 1
BSP ER-Diagramm Situation 2, ausführlich mit Relation has_Supplier; auch in UML
ER-Modelle gehen von einer Welt aus, in der es Dinge gibt, die beschrieben werden (a) durch ihre Eigenschaften und (b) ihre Beziehungen zueinander … Ziel ist ein sog. relationales Datenmodell, eine relationale Datenbank