Distributed job queue system using Redis and Spring Boot, designed to efficiently distribute computational tasks across multiple worker nodes.

nadimnesar/distributed-job-queue

Distributed Job Queue System with Redis and Spring Boot

Overview

This project implements a scalable distributed job queue system using Redis and Spring Boot, designed to efficiently distribute computational tasks across multiple worker nodes. The system tracks job status, tolerates worker failures, and handles job dependencies and failed jobs gracefully.

Features

  • Deploy multiple worker and producer nodes across different machines.
  • Use Nginx to load balance producer requests.
  • Enqueue jobs with priorities for efficient task processing.
  • Use Directed Acyclic Graph (DAG) based dependency management to ensure proper execution order.
  • Track job statuses and support job cancellation.
  • Support proper distributed locking to prevent duplicate job processing.
  • Support virtual threads for concurrent job execution.
  • Automatic retry mechanism for failed jobs.
  • Dead Letter Queue (DLQ) for unprocessable jobs, with support for reviving dead jobs.
  • Monitoring APIs for worker system metrics.
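The priority and DAG-based dependency features can be illustrated with a small in-memory sketch. It is not taken from the project's source: the `DagScheduler` class, the `Job` record, and its fields are hypothetical names chosen for illustration, and a plain `PriorityQueue` stands in for whatever Redis structure the project actually uses. A job is released only once all of its dependencies have run, and among ready jobs the highest priority goes first.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

/** In-memory sketch: release jobs in dependency order, highest priority first. */
class DagScheduler {
    /** Hypothetical job shape; the project's real job model is not shown in this README. */
    record Job(String id, int priority, List<String> dependsOn) {}

    /** Returns job ids in a valid execution order, or throws if some jobs can never run. */
    static List<String> executionOrder(List<Job> jobs) {
        Map<String, Integer> pending = new HashMap<>();      // unmet dependency count per job
        Map<String, List<Job>> dependents = new HashMap<>(); // dependency id -> jobs waiting on it
        for (Job job : jobs) {
            pending.put(job.id(), job.dependsOn().size());
            for (String dep : job.dependsOn()) {
                dependents.computeIfAbsent(dep, k -> new ArrayList<>()).add(job);
            }
        }

        // Ready jobs ordered by descending priority; ties broken by id for determinism.
        PriorityQueue<Job> ready = new PriorityQueue<>(
                Comparator.<Job>comparingInt(Job::priority).reversed().thenComparing(Job::id));
        for (Job job : jobs) {
            if (job.dependsOn().isEmpty()) ready.add(job);
        }

        List<String> order = new ArrayList<>();
        while (!ready.isEmpty()) {
            Job next = ready.poll();
            order.add(next.id());
            // Completing a job may unblock the jobs that were waiting on it.
            for (Job waiting : dependents.getOrDefault(next.id(), List.of())) {
                if (pending.merge(waiting.id(), -1, Integer::sum) == 0) ready.add(waiting);
            }
        }
        if (order.size() != jobs.size()) {
            throw new IllegalStateException("Dependency cycle or unknown dependency detected");
        }
        return order;
    }
}
```

With jobs A (priority 1), C (priority 9), B (priority 5, depends on A), and D (priority 3, depends on A and C), this sketch runs C, A, B, D: C goes first because it is the highest-priority job with no dependencies, and B and D become ready only after A completes.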

System Design

flowchart TD
    User((User)) --> Frontend[Frontend UI]
    Frontend --> Nginx[Load Balancer: Nginx]

    subgraph Producers[Producer Service Cluster]
        direction TB
        Producer1[Producer 1]
        Producer2[Producer 2]
        Producer3[Producer 3]
    end

    Nginx --> Producers
    Producers -->|Enqueue Job| Redis[("Redis: Job Queue")]

    subgraph Workers[Worker Cluster]
        direction TB
        Worker1[Worker 1]
        Worker2[Worker 2]
        Worker3[Worker 3]
        Worker4[Worker 4]
        Worker5[Worker 5]
    end

    Redis -->|Consume Job| Workers
    Workers -->|Update Job| PostgreSQL[("PostgreSQL: Job Metadata")]
    Producers -->|Store Job Metadata| PostgreSQL
    Workers --> Decision1{Successful?}
    Decision1 -->|Yes| Done([Done])
    Decision1 -->|No| Decision2{Limit Exceeded?}
    Decision2 -->|Yes: Move to DLQ| Redis
    Decision2 -->|No| Redis
    
    classDef producer fill: #8E44AD, stroke: #6C3483, color: white, stroke-width: 2px, font-weight: bold
    classDef worker fill: #F57C00, stroke: #E65100, color: white, stroke-width: 2px, font-weight: bold
    classDef database fill: #336791, stroke: #274472, color: white, stroke-width: 2px, font-weight: bold, stroke-dasharray: 5 2
    classDef cache fill: #D32F2F, stroke: #B71C1C, color: white, stroke-width: 2px, font-weight: bold, stroke-dasharray: 5 2
    classDef loadbalancer fill: #388E3C, stroke: #2E7D32, color: white, stroke-width: 2px, font-weight: bold
    
    class Producer1,Producer2,Producer3 producer
    class Worker1,Worker2,Worker3,Worker4,Worker5 worker
    class PostgreSQL database
    class Redis cache
    class Nginx loadbalancer
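The failure branch of the flowchart (retry until a limit is exceeded, then move the job to the DLQ) can be sketched in memory as follows. This is an illustration under stated assumptions rather than the project's implementation: `RetryingQueue`, `Job`, `maxAttempts`, and `reviveAll` are hypothetical names, and two `ArrayDeque` instances stand in for the Redis job queue and the dead letter queue.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.Predicate;

/** In-memory sketch of the retry / dead-letter decision shown in the flowchart. */
class RetryingQueue {
    /** Hypothetical job record: an id plus the number of failed attempts so far. */
    record Job(String id, int attempts) {}

    private final Deque<Job> queue = new ArrayDeque<>();       // stand-in for the Redis job queue
    private final Deque<Job> deadLetters = new ArrayDeque<>(); // stand-in for the DLQ
    private final int maxAttempts;                             // assumed retry limit

    RetryingQueue(int maxAttempts) {
        this.maxAttempts = maxAttempts;
    }

    void enqueue(String id) {
        queue.addLast(new Job(id, 0));
    }

    /** Processes jobs until the queue is empty; the handler returns true on success. */
    void drain(Predicate<Job> handler) {
        while (!queue.isEmpty()) {
            Job job = queue.pollFirst();
            boolean succeeded;
            try {
                succeeded = handler.test(job);
            } catch (RuntimeException e) {
                succeeded = false; // an exception counts as a failed attempt
            }
            if (succeeded) continue; // "Successful? Yes" branch: done

            Job failed = new Job(job.id(), job.attempts() + 1);
            if (failed.attempts() >= maxAttempts) {
                deadLetters.addLast(failed); // limit exceeded: move to DLQ
            } else {
                queue.addLast(failed);       // limit not exceeded: re-enqueue for retry
            }
        }
    }

    /** Moves every dead job back to the main queue with a fresh attempt count. */
    int reviveAll() {
        int revived = deadLetters.size();
        while (!deadLetters.isEmpty()) {
            queue.addLast(new Job(deadLetters.pollFirst().id(), 0));
        }
        return revived;
    }

    int queued() { return queue.size(); }

    int dead() { return deadLetters.size(); }
}
```

In this sketch the attempt count travels with the job, so any worker that picks the job up can make the retry-or-DLQ decision without shared state. A real deployment would also need the decision and the re-enqueue to happen atomically, which the sketch does not address.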

Technical Specifications

  • Backend: Spring Boot
  • Queue: Redis
  • Load Balancer: Nginx
  • Database: PostgreSQL
  • Database Migration: Liquibase

Getting Started

Prerequisites

  • Java 21
  • Docker 20.10.13 or higher

Build and Run the Project

  1. Clone the repository:

    git clone https://github.com/nadimnesar/distributed-job-queue-with-redis-and-spring.git
    cd distributed-job-queue-with-redis-and-spring
  2. Start the services:

    make up
  3. Stop the services:

    make down

API Endpoints

Download the Postman collection and environment for testing the APIs.

  • Postman Collection
  • Postman Environment

Logs

All logs are stored in the following directory: docs/logs/

Use the following commands to list the log files and follow a log in real time:

# List the log files for all producer instances
ls docs/logs/producer-*.log
# Follow the log of a specific producer instance (example: producer-062f8e6ef23)
tail -f docs/logs/producer-062f8e6ef23.log

Future Enhancements

  • Implement horizontal scaling for worker nodes based on queue length
  • Improve job distribution algorithm for better worker utilization
  • Implement job scheduling capabilities
  • Implement Redis Cluster for high availability
  • Add PostgreSQL replication for database redundancy
  • Add advanced monitoring with Prometheus and Grafana

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.
