Vishal.dev
Full-Stack

TaskMesh — Collaboration Platform

Real-time collaborative task manager with WebSocket-based live updates, activity feeds, notification systems, and full-text search — built for team productivity at scale.

Next.js, TypeScript, React, PostgreSQL, Prisma, Redis, WebSockets, BullMQ
Real-time broadcast latency: <50ms
Connections per server: 10k+
Search query time: <200ms
Notification delivery: 99.9%

Domain Knowledge

What problem this project solves

Collaboration platforms face a unique challenge: teams create huge amounts of connected data (projects, tasks, comments, files, users) that must stay in sync across multiple clients in real time. TaskMesh solves this with an event-driven architecture where every action (task created, assigned, commented) emits an event that is broadcast to all connected clients via WebSockets, persisted to an activity feed, and optionally routed to the notification system.
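The fan-out described above can be sketched roughly as follows. This is a minimal illustration, not TaskMesh's actual code: the `DomainEvent` shape, its field names, and `makeDispatcher` are all assumptions made for the example.

```typescript
// Hypothetical event shape; field names are illustrative, not taken from TaskMesh.
type DomainEvent = {
  id: number;
  workspaceId: string;
  type: "task.created" | "task.assigned" | "comment.added";
  actorId: string;
  targetId: string;
  notifyUserIds: string[]; // empty means nothing is routed to notifications
};

type Sink = (event: DomainEvent) => void;

// Fan one event out to the three destinations named in the text:
// live broadcast, activity feed, and (optionally) notifications.
function makeDispatcher(broadcast: Sink, appendToFeed: Sink, notify: Sink): Sink {
  return (event) => {
    broadcast(event);
    appendToFeed(event);
    if (event.notifyUserIds.length > 0) {
      notify(event); // only events with recipients reach the notification system
    }
  };
}
```

Keeping the sinks as plain functions means each destination can be swapped or tested on its own.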

Architecture

How the system is structured

TaskMesh uses a layered architecture with CQRS influence. Write operations go through the API layer, persist to PostgreSQL, and emit domain events to Redis pub/sub. A WebSocket server subscribes to these events and broadcasts them to connected clients. Read operations can hit PostgreSQL directly or a Redis cache for frequently accessed data (workspace members, project lists). Activity feeds are materialized views updated by background workers. Search uses PostgreSQL full-text search with GIN indexes, with the option to swap to Elasticsearch as data grows.
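As an illustration of the search path, a parameterized full-text query might be built along these lines. The table and column names (`tasks`, `search_vector`, `workspace_id`) and the function `buildTaskSearch` are hypothetical. The `{ text, values }` shape is assumed to match what a driver such as node-postgres accepts. `websearch_to_tsquery`, `ts_rank`, and the `@@` operator are standard PostgreSQL.

```typescript
// Hypothetical schema: tasks(id, title, workspace_id, search_vector tsvector).
// Assumed supporting index:
//   CREATE INDEX tasks_search_idx ON tasks USING GIN (search_vector);
type SqlQuery = { text: string; values: unknown[] };

function buildTaskSearch(workspaceId: string, terms: string, limit = 20): SqlQuery {
  // User input is passed only as bind parameters, never concatenated into SQL.
  const text = `
    SELECT id, title, ts_rank(search_vector, query) AS rank
    FROM tasks, websearch_to_tsquery('english', $2) AS query
    WHERE workspace_id = $1 AND search_vector @@ query
    ORDER BY rank DESC
    LIMIT $3`;
  return { text, values: [workspaceId, terms, limit] };
}
```

Because the query is built in one place, swapping to another search backend later would only touch this function and its caller.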

Data Model

Schema design and data flow

The core entities form a hierarchy: Workspace → Project → Task → Comment. Each entity records its creator and audit timestamps; tasks also record an assignee. Task status transitions are tracked in a separate events table that doubles as the activity feed source. Notifications are denormalized per user for fast queries: each notification stores the event type, actor, target, and a read/unread flag.
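One way to picture this model is as a set of TypeScript types. Every name and field below is an assumption for illustration (including `TaskComment` and `UserNotification`, renamed to avoid clashing with browser globals), and the rule that actors are not notified about their own actions is also assumed, not stated in the source.

```typescript
type Id = string;
type TaskStatus = "todo" | "in_progress" | "done"; // hypothetical status set

interface Audit { creatorId: Id; createdAt: Date; updatedAt: Date }
interface Workspace extends Audit { id: Id; name: string }
interface Project extends Audit { id: Id; workspaceId: Id; name: string }
interface Task extends Audit {
  id: Id; projectId: Id; title: string; status: TaskStatus; assigneeId: Id | null;
}
interface TaskComment extends Audit { id: Id; taskId: Id; body: string }

// A row in the events table; it also feeds the activity feed.
interface TaskEvent {
  id: number; taskId: Id; actorId: Id;
  type: "status_changed"; from: TaskStatus; to: TaskStatus; at: Date;
}

// Denormalized: one row per recipient, so a user's inbox is a single lookup.
interface UserNotification {
  userId: Id; eventType: string; actorId: Id; targetId: Id; read: boolean;
}

function notificationsFor(event: TaskEvent, recipientIds: Id[]): UserNotification[] {
  return recipientIds
    .filter((userId) => userId !== event.actorId) // assumption: skip the actor
    .map((userId) => ({
      userId,
      eventType: event.type,
      actorId: event.actorId,
      targetId: event.taskId,
      read: false,
    }));
}
```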

Key Challenges

Hardest problems encountered

Scaling WebSocket connections was a major challenge. A single server can handle roughly 10k connections; beyond that, the system needs horizontal scaling with a pub/sub layer. The solution uses Redis pub/sub as the message bus between WebSocket servers: any server can publish an event, and every server receives it and forwards it to its own connected clients. The activity feed presented another challenge: every action generates an event, and querying the raw events table for a user's feed is slow. The solution materializes feeds per user via background workers.
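The cross-server fan-out can be sketched with an in-memory stand-in for the message bus. `InMemoryBus` and `SocketServer` are hypothetical names, and the bus only imitates the publish/subscribe behaviour that Redis provides; no real sockets or Redis client are involved.

```typescript
type Handler = (message: string) => void;

// Stand-in for Redis pub/sub: every subscriber on a channel gets every message.
class InMemoryBus {
  private handlers = new Map<string, Handler[]>();

  subscribe(channel: string, handler: Handler): void {
    const list = this.handlers.get(channel) ?? [];
    list.push(handler);
    this.handlers.set(channel, list);
  }

  publish(channel: string, message: string): void {
    (this.handlers.get(channel) ?? []).forEach((handler) => handler(message));
  }
}

// Stand-in for one WebSocket server process and its locally connected clients.
class SocketServer {
  private bus: InMemoryBus;
  private channel: string;
  private inboxes = new Map<string, string[]>(); // clientId -> messages delivered

  constructor(bus: InMemoryBus, channel: string) {
    this.bus = bus;
    this.channel = channel;
    bus.subscribe(channel, (message) => this.forward(message));
  }

  connect(clientId: string): void { this.inboxes.set(clientId, []); }
  inbox(clientId: string): string[] { return this.inboxes.get(clientId) ?? []; }

  // Any server can publish; delivery to its own clients also goes via the bus.
  publish(message: string): void { this.bus.publish(this.channel, message); }

  private forward(message: string): void {
    this.inboxes.forEach((inbox) => inbox.push(message));
  }
}
```

In this sketch a server never needs to know which other servers exist, which is what lets instances be added behind a load balancer.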

Scaling Strategy

How the system grows

WebSocket servers scale horizontally behind a load balancer with Redis pub/sub as the coordination layer. PostgreSQL handles the write path with read replicas for feed and search queries. Redis caches workspace metadata, member lists, and frequently accessed tasks. The notification system uses a separate queue to avoid blocking critical write operations.
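The caching of workspace metadata and member lists could follow a cache-aside pattern, sketched below. `CacheAside` is a hypothetical helper backed by an in-memory map, not Redis, and the loader is synchronous only to keep the example short; a real loader would be asynchronous. The TTL and explicit clock argument are assumptions.

```typescript
type Entry<V> = { value: V; expiresAt: number };

// Cache-aside: read from the cache first, fall back to the loader on a miss,
// then store the loaded value with a time-to-live.
class CacheAside<V> {
  private entries = new Map<string, Entry<V>>();
  private load: (key: string) => V;
  private ttlMs: number;

  constructor(load: (key: string) => V, ttlMs: number) {
    this.load = load;
    this.ttlMs = ttlMs;
  }

  get(key: string, nowMs: number): V {
    const hit = this.entries.get(key);
    if (hit && nowMs < hit.expiresAt) return hit.value; // fresh hit
    const value = this.load(key); // miss or expired: go to the source of truth
    this.entries.set(key, { value, expiresAt: nowMs + this.ttlMs });
    return value;
  }

  // Call after a write so the next read reloads from the database.
  invalidate(key: string): void {
    this.entries.delete(key);
  }
}
```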

Security

Defense-in-depth approach

Workspace-level isolation is enforced at the API layer: every query includes a workspace_id filter. JWTs carry workspace membership claims. WebSocket connections authenticate on connect and are authorized again on each subscription. File uploads are scanned and served through signed URLs that expire.
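A per-subscription authorization check might look like the sketch below. The `Claims` shape and the `workspace:<workspaceId>:<topic>` channel naming are assumptions; the source only states that tokens carry workspace membership claims. Token signature verification is assumed to have happened already at connect time.

```typescript
// Hypothetical decoded claims from an already-verified token.
type Claims = { userId: string; workspaceIds: string[]; expiresAt: number };

// Hypothetical channel format: "workspace:<workspaceId>:<topic>".
function workspaceOf(channel: string): string | null {
  const [prefix, workspaceId, topic, ...rest] = channel.split(":");
  if (prefix !== "workspace" || !workspaceId || !topic || rest.length > 0) {
    return null; // malformed channel names are rejected outright
  }
  return workspaceId;
}

// Runs on every subscribe request, not just once at connect time.
function canSubscribe(claims: Claims, channel: string, nowMs: number): boolean {
  if (nowMs >= claims.expiresAt) return false; // expired token
  const workspaceId = workspaceOf(channel);
  if (workspaceId === null) return false;
  return claims.workspaceIds.some((id) => id === workspaceId);
}
```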

Failure Handling

Resilience and recovery

If the WebSocket connection drops, the client reconnects with the last known event ID, and the server replays missed events from the event store. Queue workers retry failed notification deliveries. The activity feed materializer handles backpressure by batching events when the write rate exceeds the processing rate.
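The reconnect-and-replay step can be sketched as a lookup by last seen event ID. `EventStore` here is a hypothetical in-memory stand-in for the persisted event store, and it assumes event IDs increase monotonically, which the source does not state explicitly.

```typescript
type StoredEvent = { id: number; channel: string; payload: string };

// In-memory stand-in for the persisted event store.
class EventStore {
  private events: StoredEvent[] = [];
  private nextId = 1; // assumption: IDs are assigned in increasing order

  append(channel: string, payload: string): StoredEvent {
    const event: StoredEvent = { id: this.nextId, channel, payload };
    this.nextId += 1;
    this.events.push(event);
    return event;
  }

  // Everything on the channel that the client has not seen yet.
  replayAfter(channel: string, lastEventId: number): StoredEvent[] {
    return this.events.filter(
      (event) => event.channel === channel && event.id > lastEventId
    );
  }
}
```

On reconnect, the client would send the highest ID it has processed, and the server would send the result of `replayAfter` before resuming live delivery.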

Observability

Monitoring and debugging

Key metrics tracked: WebSocket connection count, messages per second, event processing latency, notification delivery time, search query latency. Structured logging with correlation IDs connects WebSocket events to the API requests that triggered them. Alerts fire when connection counts approach server limits or when event processing falls behind.

Trade-offs

Engineering decisions and alternatives

WebSockets were chosen over SSE for bidirectional communication (necessary for typing indicators and presence). Redis pub/sub was chosen over Kafka because the message bus itself doesn't need to provide message replay or long-term retention; clients that reconnect are replayed missed events from the event store instead. PostgreSQL full-text search was chosen over Elasticsearch for simplicity: the workload is well within PostgreSQL's capabilities at this scale.

Architecture Decisions

Key choices and what was rejected

Decision | Chosen | Rejected
Real-time protocol | WebSockets | SSE (unidirectional, no typing indicators)
Pub/sub | Redis pub/sub | Kafka (overkill for real-time broadcast)
Search | PostgreSQL full-text search | Elasticsearch (premature optimization)
Activity feed | Materialized view per user | Query raw events (too slow at scale)

Senior-Level Topics

Concepts this project explores

CQRS, Event Streams, WebSocket Scaling, Pub/Sub Architecture, Real-Time Systems, Search Engines, Materialized Views, Connection Management