Social Media Platform Database Design

Complete solution with self-referential relationships and complex interactions.

Problem Statement

Design the conceptual model for a simple social media platform:

Tasks:

  1. List all entities needed
  2. Identify attributes for each entity
  3. Define primary keys
  4. Map all relationships (type + cardinality)
  5. Identify which tables need junction tables

1. Analysis & Entities Identification

Core Entities

  1. USERS - User accounts
  2. POSTS - User-created content
  3. COMMENTS - Comments on posts
  4. HASHTAGS - Tags for categorization

Junction Tables (for Many-to-Many Relationships)

  1. FOLLOWS - Users following users (self-referential)
  2. LIKES - Users liking posts
  3. POST_HASHTAGS - Posts tagged with hashtags

Key Insights

2. Entity-Relationship Diagram

erDiagram USERS ||--o{ POSTS : "creates" USERS ||--o{ COMMENTS : "writes" USERS }o--o{ USERS : "follows via FOLLOWS" USERS }o--o{ POSTS : "likes via LIKES" POSTS ||--o{ COMMENTS : "has" POSTS }o--o{ HASHTAGS : "tagged with" USERS { int user_id PK varchar username UK varchar email UK varchar password_hash varchar display_name text bio varchar profile_picture_url varchar location date birth_date datetime created_at datetime last_active boolean is_verified boolean is_private } POSTS { int post_id PK int user_id FK text content varchar media_url datetime created_at datetime updated_at int likes_count int comments_count int shares_count varchar visibility } COMMENTS { int comment_id PK int post_id FK int user_id FK text comment_text datetime created_at datetime updated_at int likes_count } HASHTAGS { int hashtag_id PK varchar tag_name UK int usage_count datetime created_at } FOLLOWS { int follower_id FK int following_id FK datetime followed_at } LIKES { int user_id FK int post_id FK datetime liked_at } POST_HASHTAGS { int post_id FK int hashtag_id FK datetime tagged_at }

3. Entities & Attributes

USERS

Attributes:

POSTS

Attributes:

Note: Visibility values: public, private, friends. The likes_count and comments_count are denormalized for performance.

COMMENTS

Attributes:

Extension: Could add parent_comment_id for nested replies

HASHTAGS

Attributes:

Note: tag_name should be case-insensitive (#programming = #Programming)

FOLLOWS (Junction Table)

Attributes:

Business Rule: follower_id ≠ following_id (cannot follow yourself)

LIKES (Junction Table)

Attributes:

Note: Unique constraint prevents duplicate likes

POST_HASHTAGS (Junction Table)

Attributes:

4. Relationships Analysis

Relationship Type Implementation Description
Users ←→ Users (FOLLOWS) Many-to-Many (Self-referential) FOLLOWS junction table A user can follow many users, and a user can have many followers. Crow's Foot: }o--o{
Users ←→ Posts One-to-Many POSTS.user_id FK A user can create many posts, but each post is by one user. Crow's Foot: ||--o{
Users ←→ Comments One-to-Many COMMENTS.user_id FK A user can write many comments, but each comment is by one user. Crow's Foot: ||--o{
Posts ←→ Comments One-to-Many COMMENTS.post_id FK A post can have many comments, but each comment is on one post. Crow's Foot: ||--o{
Users ←→ Posts (LIKES) Many-to-Many LIKES junction table A user can like many posts, and a post can be liked by many users. Crow's Foot: }o--o{
Posts ←→ Hashtags Many-to-Many POST_HASHTAGS junction table A post can have many hashtags, and a hashtag can be on many posts. Crow's Foot: }o--o{

5. Complete SQL Schema

-- ============================================
-- SOCIAL MEDIA PLATFORM DATABASE SCHEMA
-- ============================================

-- 1. USERS Table
CREATE TABLE USERS (
    user_id INT AUTO_INCREMENT PRIMARY KEY,
    username VARCHAR(50) UNIQUE NOT NULL,
    email VARCHAR(255) UNIQUE NOT NULL,
    password_hash VARCHAR(255) NOT NULL,
    display_name VARCHAR(100),
    bio TEXT,
    profile_picture_url VARCHAR(500),
    location VARCHAR(100),
    birth_date DATE,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    last_active DATETIME,
    is_verified BOOLEAN DEFAULT FALSE,
    is_private BOOLEAN DEFAULT FALSE,
    CHECK (CHAR_LENGTH(username) >= 3),
    INDEX idx_username (username),
    INDEX idx_email (email),
    INDEX idx_created_at (created_at)
) ENGINE=InnoDB;

-- 2. POSTS Table
CREATE TABLE POSTS (
    post_id INT AUTO_INCREMENT PRIMARY KEY,
    user_id INT NOT NULL,
    content TEXT NOT NULL,
    media_url VARCHAR(500),
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    updated_at DATETIME ON UPDATE CURRENT_TIMESTAMP,
    likes_count INT DEFAULT 0,
    comments_count INT DEFAULT 0,
    shares_count INT DEFAULT 0,
    visibility VARCHAR(20) DEFAULT 'public',
    CHECK (visibility IN ('public', 'private', 'friends')),
    CHECK (likes_count >= 0),
    CHECK (comments_count >= 0),
    CHECK (shares_count >= 0),
    FOREIGN KEY (user_id) REFERENCES USERS(user_id)
        ON DELETE CASCADE
        ON UPDATE CASCADE,
    INDEX idx_user (user_id),
    INDEX idx_created_at (created_at),
    INDEX idx_visibility (visibility),
    FULLTEXT idx_content (content)
) ENGINE=InnoDB;

-- 3. COMMENTS Table
CREATE TABLE COMMENTS (
    comment_id INT AUTO_INCREMENT PRIMARY KEY,
    post_id INT NOT NULL,
    user_id INT NOT NULL,
    comment_text TEXT NOT NULL,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    updated_at DATETIME ON UPDATE CURRENT_TIMESTAMP,
    likes_count INT DEFAULT 0,
    CHECK (likes_count >= 0),
    FOREIGN KEY (post_id) REFERENCES POSTS(post_id)
        ON DELETE CASCADE
        ON UPDATE CASCADE,
    FOREIGN KEY (user_id) REFERENCES USERS(user_id)
        ON DELETE CASCADE
        ON UPDATE CASCADE,
    INDEX idx_post (post_id),
    INDEX idx_user (user_id),
    INDEX idx_created_at (created_at)
) ENGINE=InnoDB;

-- 4. HASHTAGS Table
CREATE TABLE HASHTAGS (
    hashtag_id INT AUTO_INCREMENT PRIMARY KEY,
    tag_name VARCHAR(100) UNIQUE NOT NULL,
    usage_count INT DEFAULT 0,
    created_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    CHECK (usage_count >= 0),
    INDEX idx_tag_name (tag_name),
    INDEX idx_usage_count (usage_count)
) ENGINE=InnoDB;

-- 5. FOLLOWS Junction Table (Self-Referential)
CREATE TABLE FOLLOWS (
    follower_id INT NOT NULL,
    following_id INT NOT NULL,
    followed_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (follower_id, following_id),
    CHECK (follower_id != following_id),
    FOREIGN KEY (follower_id) REFERENCES USERS(user_id)
        ON DELETE CASCADE
        ON UPDATE CASCADE,
    FOREIGN KEY (following_id) REFERENCES USERS(user_id)
        ON DELETE CASCADE
        ON UPDATE CASCADE,
    INDEX idx_follower (follower_id),
    INDEX idx_following (following_id),
    INDEX idx_followed_at (followed_at)
) ENGINE=InnoDB;

-- 6. LIKES Junction Table
CREATE TABLE LIKES (
    user_id INT NOT NULL,
    post_id INT NOT NULL,
    liked_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (user_id, post_id),
    FOREIGN KEY (user_id) REFERENCES USERS(user_id)
        ON DELETE CASCADE
        ON UPDATE CASCADE,
    FOREIGN KEY (post_id) REFERENCES POSTS(post_id)
        ON DELETE CASCADE
        ON UPDATE CASCADE,
    INDEX idx_user (user_id),
    INDEX idx_post (post_id),
    INDEX idx_liked_at (liked_at)
) ENGINE=InnoDB;

-- 7. POST_HASHTAGS Junction Table
CREATE TABLE POST_HASHTAGS (
    post_id INT NOT NULL,
    hashtag_id INT NOT NULL,
    tagged_at DATETIME DEFAULT CURRENT_TIMESTAMP,
    PRIMARY KEY (post_id, hashtag_id),
    FOREIGN KEY (post_id) REFERENCES POSTS(post_id)
        ON DELETE CASCADE
        ON UPDATE CASCADE,
    FOREIGN KEY (hashtag_id) REFERENCES HASHTAGS(hashtag_id)
        ON DELETE CASCADE
        ON UPDATE CASCADE,
    INDEX idx_post (post_id),
    INDEX idx_hashtag (hashtag_id)
) ENGINE=InnoDB;

-- ============================================
-- TRIGGERS FOR MAINTAINING COUNTS
-- ============================================

-- Increment likes_count when a like is added
DELIMITER //
CREATE TRIGGER after_like_insert
AFTER INSERT ON LIKES
FOR EACH ROW
BEGIN
    UPDATE POSTS
    SET likes_count = likes_count + 1
    WHERE post_id = NEW.post_id;
END//

-- Decrement likes_count when a like is removed
CREATE TRIGGER after_like_delete
AFTER DELETE ON LIKES
FOR EACH ROW
BEGIN
    UPDATE POSTS
    SET likes_count = likes_count - 1
    WHERE post_id = OLD.post_id;
END//

-- Increment comments_count when comment is added
CREATE TRIGGER after_comment_insert
AFTER INSERT ON COMMENTS
FOR EACH ROW
BEGIN
    UPDATE POSTS
    SET comments_count = comments_count + 1
    WHERE post_id = NEW.post_id;
END//

-- Decrement comments_count when comment is deleted
CREATE TRIGGER after_comment_delete
AFTER DELETE ON COMMENTS
FOR EACH ROW
BEGIN
    UPDATE POSTS
    SET comments_count = comments_count - 1
    WHERE post_id = OLD.post_id;
END//

-- Update hashtag usage_count when post_hashtag is added
CREATE TRIGGER after_post_hashtag_insert
AFTER INSERT ON POST_HASHTAGS
FOR EACH ROW
BEGIN
    UPDATE HASHTAGS
    SET usage_count = usage_count + 1
    WHERE hashtag_id = NEW.hashtag_id;
END//

-- Update hashtag usage_count when post_hashtag is removed
CREATE TRIGGER after_post_hashtag_delete
AFTER DELETE ON POST_HASHTAGS
FOR EACH ROW
BEGIN
    UPDATE HASHTAGS
    SET usage_count = usage_count - 1
    WHERE hashtag_id = OLD.hashtag_id;
END//

DELIMITER ;

-- ============================================
-- SAMPLE DATA
-- ============================================

INSERT INTO USERS (username, email, password_hash, display_name, bio, is_verified) VALUES
('alice_wonder', 'alice@example.com', 'hashed_pwd_1', 'Alice Smith', 'Photography enthusiast 📷', TRUE),
('bob_builder', 'bob@example.com', 'hashed_pwd_2', 'Bob Johnson', 'Tech blogger', FALSE),
('charlie_dev', 'charlie@example.com', 'hashed_pwd_3', 'Charlie Brown', 'Full-stack developer', TRUE);

INSERT INTO POSTS (user_id, content, visibility) VALUES
(1, 'Beautiful sunset at the beach! #travel #photography', 'public'),
(1, 'My new photography portfolio is live!', 'public'),
(2, 'Just published a new blog post about React hooks', 'public'),
(3, 'Working on an exciting new project #coding #javascript', 'public');

INSERT INTO HASHTAGS (tag_name) VALUES
('travel'), ('photography'), ('coding'), ('javascript'), ('react');

INSERT INTO POST_HASHTAGS (post_id, hashtag_id) VALUES
(1, 1), (1, 2),  -- Post 1: #travel #photography
(3, 3), (3, 4);  -- Post 3: #coding #javascript

INSERT INTO FOLLOWS (follower_id, following_id) VALUES
(2, 1),  -- Bob follows Alice
(3, 1),  -- Charlie follows Alice
(3, 2);  -- Charlie follows Bob

INSERT INTO LIKES (user_id, post_id) VALUES
(2, 1),  -- Bob likes Alice's post
(3, 1);  -- Charlie likes Alice's post

6. Design Decisions & Best Practices

Implemented Best Practices

Performance Optimizations

Common Mistakes Avoided

7. Common Queries

-- Get user's feed (posts from people they follow)
SELECT p.*, u.username, u.display_name
FROM POSTS p
JOIN USERS u ON p.user_id = u.user_id
JOIN FOLLOWS f ON p.user_id = f.following_id
WHERE f.follower_id = ? -- logged in user
ORDER BY p.created_at DESC
LIMIT 20;

-- Get trending hashtags
SELECT h.tag_name, h.usage_count
FROM HASHTAGS h
ORDER BY h.usage_count DESC
LIMIT 10;

-- Get post with all its hashtags
SELECT p.*, GROUP_CONCAT(h.tag_name SEPARATOR ', ') as hashtags
FROM POSTS p
LEFT JOIN POST_HASHTAGS ph ON p.post_id = ph.post_id
LEFT JOIN HASHTAGS h ON ph.hashtag_id = h.hashtag_id
WHERE p.post_id = ?
GROUP BY p.post_id;

-- Check if user likes a post
SELECT COUNT(*) as is_liked
FROM LIKES
WHERE user_id = ? AND post_id = ?;

-- Get followers count
SELECT COUNT(*) as follower_count
FROM FOLLOWS
WHERE following_id = ?;

-- Get following count
SELECT COUNT(*) as following_count
FROM FOLLOWS
WHERE follower_id = ?;