Database Design Principles

Core principles and patterns used in Smart Shelf database design

Database Design Principles

The Smart Shelf database follows established design principles to ensure scalability, performance, data integrity, and maintainability.

Normalization

Third Normal Form (3NF)

All tables are normalized to 3NF to reduce redundancy and maintain data consistency:

  • First Normal Form (1NF): Each table cell contains atomic values, no repeating groups
  • Second Normal Form (2NF): All non-key attributes depend on the entire primary key
  • Third Normal Form (3NF): No transitive dependencies between non-key attributes

Example: Product pricing is stored in the products table, while variant-specific pricing is in product_variants, eliminating redundancy.

Foreign Key Constraints

Referential integrity is enforced through foreign key constraints:

-- Example: Products must belong to valid categories
ALTER TABLE products 
ADD CONSTRAINT fk_products_category 
FOREIGN KEY (category_id) REFERENCES categories(id) ON DELETE SET NULL;

-- Example: Inventory entries must reference valid products and warehouses
ALTER TABLE inventory 
ADD CONSTRAINT fk_inventory_product 
FOREIGN KEY (product_id) REFERENCES products(id) ON DELETE CASCADE;

Junction Tables

Many-to-many relationships are handled through junction tables:

-- Example: Users can have access to multiple warehouses
CREATE TABLE user_warehouse_access (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    user_id UUID NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    warehouse_id UUID NOT NULL REFERENCES warehouses(id) ON DELETE CASCADE,
    access_level VARCHAR(50) DEFAULT 'read',
    granted_by UUID REFERENCES users(id) ON DELETE SET NULL,
    granted_at TIMESTAMPTZ DEFAULT now(),
    
    UNIQUE(user_id, warehouse_id)
);

Performance Optimization

Strategic Indexing

Indexes are created based on query patterns and performance requirements:

Single Column Indexes

-- Frequently searched columns
CREATE INDEX idx_products_sku ON products(sku);
CREATE INDEX idx_users_email ON users(email);
CREATE INDEX idx_orders_status ON purchase_orders(status);

Composite Indexes

-- Multi-column queries
CREATE INDEX idx_inventory_product_warehouse ON inventory(product_id, warehouse_id);
CREATE INDEX idx_stock_movements_product_date ON stock_movements(product_id, created_at);

Partial Indexes

-- Conditional indexes for specific use cases
CREATE INDEX idx_inventory_low_stock ON inventory(quantity_on_hand) 
WHERE quantity_on_hand <= reorder_point;

CREATE INDEX idx_active_products ON products(name) 
WHERE is_active = true;

Full-Text Search Indexes

-- Text search capabilities
CREATE INDEX idx_products_search ON products 
USING GIN(to_tsvector('english', name || ' ' || COALESCE(description, '')));

Table Partitioning

Large tables are partitioned to improve query performance:

-- Partition stock_movements by month
CREATE TABLE stock_movements (
    id UUID NOT NULL,
    created_at TIMESTAMPTZ NOT NULL,
    -- other columns
) PARTITION BY RANGE (created_at);

-- Create monthly partitions
CREATE TABLE stock_movements_y2024m01 PARTITION OF stock_movements
FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');

CREATE TABLE stock_movements_y2024m02 PARTITION OF stock_movements
FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');

Materialized Views

Complex analytics queries use materialized views for better performance:

-- Product performance summary
CREATE MATERIALIZED VIEW product_performance_summary AS
SELECT 
    p.id,
    p.name,
    p.sku,
    SUM(i.quantity_on_hand) AS total_stock,
    AVG(i.average_cost) AS avg_cost,
    COUNT(DISTINCT sm.warehouse_id) AS active_warehouses,
    SUM(CASE WHEN sm.movement_type = 'out' THEN sm.quantity ELSE 0 END) AS total_sold_ytd
FROM products p
LEFT JOIN inventory i ON p.id = i.product_id
LEFT JOIN stock_movements sm ON p.id = sm.product_id 
    AND sm.created_at >= date_trunc('year', CURRENT_DATE)
WHERE p.is_active = true
GROUP BY p.id, p.name, p.sku;

-- Refresh strategy
CREATE INDEX idx_product_performance_summary_id 
ON product_performance_summary(id);

Security Architecture

Row Level Security (RLS)

All user-accessible tables implement Row Level Security:

-- Enable RLS
ALTER TABLE products ENABLE ROW LEVEL SECURITY;
ALTER TABLE inventory ENABLE ROW LEVEL SECURITY;
ALTER TABLE purchase_orders ENABLE ROW LEVEL SECURITY;

-- Admin access policy
CREATE POLICY admin_full_access ON products
FOR ALL TO authenticated
USING (
    EXISTS (
        SELECT 1 FROM users 
        WHERE users.id = auth.uid() 
        AND users.role = 'admin'
        AND users.is_active = true
    )
);

-- Warehouse-based access policy
CREATE POLICY warehouse_inventory_access ON inventory
FOR ALL TO authenticated
USING (
    warehouse_id IN (
        SELECT warehouse_id 
        FROM user_warehouse_access 
        WHERE user_id = auth.uid()
    )
    OR EXISTS (
        SELECT 1 FROM users 
        WHERE users.id = auth.uid() 
        AND users.role IN ('admin', 'manager')
    )
);

Role-Based Access Control

Database permissions are mapped to application roles:

-- Create database roles
CREATE ROLE smart_shelf_admin;
CREATE ROLE smart_shelf_manager;
CREATE ROLE smart_shelf_employee;
CREATE ROLE smart_shelf_viewer;

-- Grant permissions by role
GRANT ALL PRIVILEGES ON ALL TABLES IN SCHEMA public TO smart_shelf_admin;
GRANT SELECT, INSERT, UPDATE ON ALL TABLES IN SCHEMA public TO smart_shelf_manager;
GRANT SELECT, INSERT, UPDATE ON inventory, stock_movements TO smart_shelf_employee;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO smart_shelf_viewer;

Audit Trails

Comprehensive audit logging for compliance and security:

-- Audit trigger function
CREATE OR REPLACE FUNCTION audit_trigger_function()
RETURNS TRIGGER AS $$
DECLARE
    audit_row audit_logs;
    include_values boolean = false;
    log_diffs boolean = false;
    h_old hstore;
    h_new hstore;
    excluded_cols text[] = ARRAY[]::text[];
BEGIN
    -- Determine what to log based on operation
    IF TG_ARGV[0]::boolean IS NOT NULL THEN
        include_values = TG_ARGV[0]::boolean;
    END IF;
    
    IF TG_ARGV[1]::boolean IS NOT NULL THEN
        log_diffs = TG_ARGV[1]::boolean;
    END IF;
    
    IF TG_ARGV[2] IS NOT NULL THEN
        excluded_cols = string_to_array(TG_ARGV[2], ',');
    END IF;

    audit_row = ROW(
        gen_random_uuid(),
        TG_TABLE_NAME::text,
        NULL, -- record_id filled below
        TG_OP::text,
        NULL, -- old_values
        NULL, -- new_values
        NULL, -- changed_fields
        auth.uid(),
        current_setting('app.session_id', true),
        inet_client_addr(),
        current_setting('app.user_agent', true),
        now()
    );

    IF TG_OP = 'DELETE' THEN
        audit_row.record_id = OLD.id;
        IF include_values THEN
            audit_row.old_values = to_jsonb(OLD);
        END IF;
        INSERT INTO audit_logs VALUES (audit_row.*);
        RETURN OLD;
        
    ELSIF TG_OP = 'UPDATE' THEN
        audit_row.record_id = NEW.id;
        IF include_values THEN
            audit_row.old_values = to_jsonb(OLD);
            audit_row.new_values = to_jsonb(NEW);
        END IF;
        INSERT INTO audit_logs VALUES (audit_row.*);
        RETURN NEW;
        
    ELSIF TG_OP = 'INSERT' THEN
        audit_row.record_id = NEW.id;
        IF include_values THEN
            audit_row.new_values = to_jsonb(NEW);
        END IF;
        INSERT INTO audit_logs VALUES (audit_row.*);
        RETURN NEW;
    END IF;
    
    RETURN NULL;
END;
$$ LANGUAGE plpgsql;

Scalability Patterns

UUID Primary Keys

All tables use UUID primary keys for distributed system compatibility:

-- Generate UUIDs for primary keys
CREATE TABLE example_table (
    id UUID PRIMARY KEY DEFAULT gen_random_uuid(),
    -- other columns
);

-- Benefits:
-- 1. No central ID generation bottleneck
-- 2. Merge-friendly for distributed systems
-- 3. Non-sequential for security
-- 4. Globally unique across databases

Soft Deletes

Critical tables use soft deletes to maintain referential integrity:

-- Add deleted_at column
ALTER TABLE products ADD COLUMN deleted_at TIMESTAMPTZ NULL;

-- Create partial index for active records
CREATE INDEX idx_products_active ON products(id) WHERE deleted_at IS NULL;

-- Update queries to filter deleted records
SELECT * FROM products WHERE deleted_at IS NULL;

-- Soft delete function
CREATE OR REPLACE FUNCTION soft_delete_product(product_uuid UUID)
RETURNS BOOLEAN AS $$
BEGIN
    UPDATE products 
    SET deleted_at = now(), is_active = false
    WHERE id = product_uuid AND deleted_at IS NULL;
    
    RETURN FOUND;
END;
$$ LANGUAGE plpgsql;

Timestamp Tracking

All tables include created_at and updated_at for change tracking:

-- Standard timestamp columns
created_at TIMESTAMPTZ DEFAULT now(),
updated_at TIMESTAMPTZ DEFAULT now()

-- Auto-update trigger
CREATE OR REPLACE FUNCTION update_updated_at_column()
RETURNS TRIGGER AS $$
BEGIN
    NEW.updated_at = now();
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

-- Apply to all tables
CREATE TRIGGER update_products_updated_at 
    BEFORE UPDATE ON products
    FOR EACH ROW EXECUTE FUNCTION update_updated_at_column();

Data Integrity Patterns

Check Constraints

Business rules are enforced at the database level:

-- Quantity constraints
ALTER TABLE inventory 
ADD CONSTRAINT chk_inventory_quantities 
CHECK (quantity_on_hand >= 0 AND quantity_allocated >= 0);

-- Status constraints
ALTER TABLE purchase_orders 
ADD CONSTRAINT chk_po_status 
CHECK (status IN ('draft', 'sent', 'confirmed', 'partial', 'received', 'cancelled'));

-- Price constraints
ALTER TABLE products 
ADD CONSTRAINT chk_product_prices 
CHECK (cost_price >= 0 AND selling_price >= 0);

Generated Columns

Calculated fields are stored as generated columns:

-- Available quantity calculation
quantity_available INTEGER GENERATED ALWAYS AS (quantity_on_hand - quantity_allocated) STORED,

-- Total cost calculation
total_cost DECIMAL(12,2) GENERATED ALWAYS AS (quantity_ordered * unit_cost) STORED,

-- Balance due calculation
balance_due DECIMAL(12,2) GENERATED ALWAYS AS (total_amount - amount_paid) STORED

Domain Types

Custom types ensure data validation:

-- Email validation
CREATE DOMAIN email_address AS VARCHAR(255)
CHECK (VALUE ~ '^[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}$');

-- Phone number validation
CREATE DOMAIN phone_number AS VARCHAR(20)
CHECK (VALUE ~ '^\+?[1-9]\d{1,14}$');

-- Monetary amounts
CREATE DOMAIN money_amount AS DECIMAL(12,2)
CHECK (VALUE >= 0);

-- Percentage values
CREATE DOMAIN percentage AS DECIMAL(5,2)
CHECK (VALUE >= 0 AND VALUE <= 100);

JSON and Flexible Schema

JSONB for Flexible Data

JSONB columns handle variable attributes:

-- Product custom fields
custom_fields JSONB DEFAULT '{}',

-- User preferences
preferences JSONB DEFAULT '{}',

-- Product tags
tags JSONB DEFAULT '[]',

-- Operating hours
operating_hours JSONB DEFAULT '{}'

JSONB Indexing

GIN indexes support efficient JSONB queries:

-- Index for JSON containment queries
CREATE INDEX idx_products_tags ON products USING GIN(tags);
CREATE INDEX idx_products_custom_fields ON products USING GIN(custom_fields);

-- Query examples
SELECT * FROM products WHERE tags @> '["electronics"]';
SELECT * FROM products WHERE custom_fields @> '{"color": "red"}';

Performance Monitoring

Query Performance Tracking

Built-in views for monitoring performance:

-- Slow query monitoring
SELECT 
    query,
    mean_time,
    calls,
    total_time
FROM pg_stat_statements 
WHERE mean_time > 100
ORDER BY mean_time DESC;

-- Index usage statistics
SELECT 
    schemaname,
    tablename,
    indexname,
    idx_scan,
    idx_tup_read,
    idx_tup_fetch
FROM pg_stat_user_indexes
ORDER BY idx_scan DESC;

Table Size Monitoring

Track table growth and storage usage:

-- Table sizes
SELECT 
    schemaname,
    tablename,
    pg_size_pretty(pg_total_relation_size(schemaname||'.'||tablename)) as size
FROM pg_tables 
WHERE schemaname = 'public'
ORDER BY pg_total_relation_size(schemaname||'.'||tablename) DESC;

Best Practices Summary

Design Guidelines

  1. Normalize to 3NF but denormalize when performance requires it
  2. Use UUIDs for all primary keys in distributed systems
  3. Implement soft deletes for critical business data
  4. Add audit trails for compliance and debugging
  5. Use check constraints to enforce business rules

Performance Guidelines

  1. Index strategically based on query patterns
  2. Partition large tables by logical boundaries (time, geography)
  3. Use materialized views for complex analytics
  4. Monitor query performance regularly
  5. Implement connection pooling for scalability

Security Guidelines

  1. Enable Row Level Security on all user-accessible tables
  2. Use role-based access control aligned with application roles
  3. Encrypt sensitive data at rest and in transit
  4. Implement comprehensive audit logging
  5. Regular security reviews and vulnerability assessments