Benutzer:MovGP0/Muster/SQL Antipatterns

aus Wikipedia, der freien Enzyklopädie
   MovGP0        Über mich        Hilfen        Artikel        Weblinks        Literatur        Zitate        Notizen        Programmierung        MSCert        Physik      


SQL Antipatterns

Jaywalking

Comma-Separated Lists

Ziel
mehrere Werte in einem Attribut speichern
CREATE TABLE Products (
   product_id SERIAL PRIMARY KEY, 
   product_name VARCHAR(1000), 
   account_id VARCHAR(100), -- comma separated list
   
);
Nachteile von Comma-Separated Lists
  • Abfrage nach einem Element das einen Bestimmten Wert besitzt benötigt Regex
SELECT * FROM Products WHERE account_id REGEXP '[[:<:]]12[[:>:]]'
  • Aggregate-Queries nur über Umwege möglich
SELECT product_id, LENGTH(account_id)  LENGTH(REPLACE(account_id, '','')) + 1
AS contacts_per_product
  • Keine Validierung möglich
INSERT INTO Products (product_id, product_name, account_id)
VALUES(DEVAULT, 'Valid Product Name', 'Invalid,Product,Names')
  • Eventuell Escape-Charater nötig
  • Nicht offensichtlich, welches Zeichen zur Trennung verwendet wird
  • Längenbegrenzung des Attributs begrenzt die mögliche Menge an Referenzen
Lösung: Intersection Table
CREATE TABLE Contacts (
   product_id BIGINT UNISGNED NOT NULL, 
account_id BIGINT UNSIGNED NOT NULL, 
   PRIMARY  KEY (product_id, account_id), 
   FOREIGN KEY (product_id) REFERENCES Products(product_id), 
   FOREIGN KEY (account_id) REFERENCES Account(account_id)
);
Verwendung
  • Abfrage eines Accounts
SELECT p.* 
FROM Products AS p JOIN Contacts AS c ON (p.product_id = c.product_id)
WHERE c.account_id = 34;
SELECT a.*
FROM Accounts AS a JOIN Contacts AS c ON (a.account_id = c.account_id)
WHERE c.product_id = 123;
  • Aggregate Queries
SELECT products_id, COUNT(*) AS accounts_per_product
FROM Contacts
GROUP BY product_id;
SELECT account_id, COUNT(*) AS accounts_per_product
FROM Contacts
GROUP BY product_id
SELECT account_id, COUNT(*) AS produts_per_account
FROM Contacts
GROUP BY account_id;
SELECT c.product_id, c.contacts_per_product
FROM (
   SELECT product_id, COUNT(*) AS accounts_per_product
   FROM Contacts
   GROUP BY product_id
) AS c
ORDER BY c.contacts_per_product DESC LIMIT 1;
  • Updating Contacts for a specific Product
INSERT INTO Contacts (product_id, account_id) VALUES (456, 34)
DELETE FROM Contacts WHERE product_id = 456 AND account_id = 34;
Ausnahme
Comma-Separated Lists können in denormalisierten Business-Intelligence-Datenbanken zum Einsatz kommen um eine Performanceverbesserung beim Auslesen der Datenbank zu erzielen.
In der OLTP-Datenbank wird dann weiterhin die normalisierte Variante eingesetzt.

Other Antipatterns

UPDATE instead of CASE
Blind Code Reuse
Pull more columns than needed
Double-Dip
  • Minimize queries to a single table
  • Don't query multiple times when its possible to query once
  • Use Temp-Tables when needed
No Pre-Staging of Report Data
  • Will cause large join operations
  • Create normalized tables and query from there
Recurring Delete or Update
  • Deletes and Updates should be done in batches
Cursors on Main Tables
  • Cursors are locking the table.
  • Use cursor on a temp table instead.
Nested Views
  • Views are hiding nested queries
  • Query optimizer can't optimize
  • More data load on the database
Scalar Valued Function in the SELECT statement
  • Use the APPLY operator instead.[1]
Copy to Shadow-Table
  • Avoid copying the whole table
  • Create partitions using the SWITCH statement
ORM Queries
  • Queries created by ORM are slow
  • Use ORM to access Stored Procedures
Large Operations involving many tables
  • Operation will lock every used table
  • Split into serval transactions
Triggers
  • Avoid triggers
Cluster on GUID
  • Sorting and clustering on GUID will cause massive fragmentation, since it's random
Count data to see if data exists
  • always traverses all columns
  • EXISTS stops on first match
    • faster with index
Don't Do
SET @numberOfRows = (SELECT COUNT(*) FROM dbo.Table);
If @numberOfRows > 0
BEGIN
<Do something>
END
If EXISTS (SELECT 1 FROM dbo.Table)
BEGIN
<Do something>
END
Searching for a negative
Don't Do
SELECT * 
FROM Customers 
WHERE RegionID <> 3
(
   SELECT * 
   FROM Customers 
   WHERE RegionID < 3 
) 
UNION ALL 
(
   SELECT * 
   FROM Customers 
   WHERE RegionID
)
Database as a Queue[2]
  • Use a specialized queuing solution instead.

Quellen

Referenzen

  1. Boost Your T-SQL with the APPLY Operator. In: Microsoft Virtual Academy. Microsoft, 29. Januar 2014, abgerufen am 21. Januar 2015 (englisch).
  2. Mike Hadlow: The Database As Queue Anti-Pattern. In: Code Rant. 26. April 2012, abgerufen am 21. Januar 2015 (englisch). }}