
Overview

Workers are Celery processes that pick up tasks from the Redis queue and execute workflows. Workers run continuously in the background, processing tasks as they are created by schedules or manual workflow runs.
No API Endpoints: Workers are managed exclusively through the CLI. There are no REST API endpoints for worker management, status, or health checks.

Worker Architecture

Spark uses Celery workers to execute workflow tasks asynchronously. The architecture has four components (a minimal configuration sketch follows the list):
  1. Redis Queue: Stores pending tasks
  2. Worker Process: Picks up and executes tasks
  3. Workflow Source: External system (LangFlow, Hive, Agents)
  4. PostgreSQL: Stores task results and status
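The sketch below wires the same components together with Celery's public API. It is illustrative only; the module name, queue, and task are hypothetical, not automagik-spark internals:
# Illustrative sketch only: a minimal Celery app arranged like Spark's workers.
# Names here (spark_sketch, run_workflow) are hypothetical.
from celery import Celery

app = Celery(
    "spark_sketch",
    broker="redis://localhost:6379/0",   # 1. Redis queue stores pending tasks
    backend="redis://localhost:6379/0",  # task state for Celery (Spark also persists results to PostgreSQL)
)

@app.task
def run_workflow(workflow_id: str) -> dict:
    # 2. A worker process picks this task up and
    # 3. calls the external workflow source (LangFlow, Hive, Agents), then
    # 4. the result and status are stored for later inspection.
    return {"workflow_id": workflow_id, "status": "completed"}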

Managing Workers via CLI

All worker management is done through the automagik-spark CLI:

Start Workers

Start worker processes to begin executing tasks:
# Start single worker (development)
automagik-spark workers start

# Start multiple workers (production)
automagik-spark workers start --workers 4

# Start with custom concurrency
automagik-spark workers start --concurrency 2
Options:
  • --workers <n>: Number of worker processes (default: 1)
  • --concurrency <n>: Tasks per worker (default: 1)
  • --loglevel <level>: Logging level (debug, info, warning, error)

Check Worker Status

View running worker processes and their status:
# List active workers
automagik-spark workers status

# Detailed worker information
automagik-spark workers status --verbose
Example Output:
Worker: celery@hostname
  Status: Active
  Concurrency: 1
  Tasks Processed: 47
  Active Tasks: 2
  Uptime: 3h 24m

Stop Workers

Gracefully stop worker processes:
# Stop all workers
automagik-spark workers stop

# Force stop (not recommended)
automagik-spark workers stop --force
Graceful Shutdown: Always use graceful shutdown to allow workers to finish their current tasks. Force stopping may leave tasks in an inconsistent state.

View Worker Logs

Monitor worker activity and debug issues:
# View worker logs
automagik-spark workers logs

# Follow logs in real-time
automagik-spark workers logs --follow

# Filter by log level
automagik-spark workers logs --level error

Worker Configuration

Configure workers through environment variables in your .env file:

Concurrency Settings

# Number of concurrent tasks per worker
CELERY_WORKER_CONCURRENCY=1

# Worker pool type (prefork, solo, threads)
CELERY_WORKER_POOL=prefork

# Maximum tasks per worker before restart
CELERY_WORKER_MAX_TASKS_PER_CHILD=100

Resource Limits

# Task execution timeout (seconds)
CELERY_TASK_TIME_LIMIT=300

# Soft timeout warning (seconds)
CELERY_TASK_SOFT_TIME_LIMIT=280

# Memory limit per worker before restart (KB; 200000 ≈ 200 MB)
CELERY_WORKER_MAX_MEMORY_PER_CHILD=200000
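How Spark applies these variables internally is not shown here, but they correspond to Celery's standard configuration keys. A hedged sketch of that mapping, assuming the variables are read straight from the environment:
# Sketch of how the variables above map onto Celery's documented settings.
# automagik-spark's actual wiring may differ.
import os
from celery import Celery

app = Celery("spark_sketch", broker=os.getenv("CELERY_BROKER_URL", "redis://localhost:6379/0"))

app.conf.update(
    worker_concurrency=int(os.getenv("CELERY_WORKER_CONCURRENCY", "1")),
    worker_pool=os.getenv("CELERY_WORKER_POOL", "prefork"),
    worker_max_tasks_per_child=int(os.getenv("CELERY_WORKER_MAX_TASKS_PER_CHILD", "100")),
    task_time_limit=int(os.getenv("CELERY_TASK_TIME_LIMIT", "300")),
    task_soft_time_limit=int(os.getenv("CELERY_TASK_SOFT_TIME_LIMIT", "280")),
    worker_max_memory_per_child=int(os.getenv("CELERY_WORKER_MAX_MEMORY_PER_CHILD", "200000")),  # KB
)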

Queue Configuration

# Redis connection for task queue
CELERY_BROKER_URL=redis://localhost:6379/0

# Result backend (optional)
CELERY_RESULT_BACKEND=redis://localhost:6379/0

# Task serialization format
CELERY_TASK_SERIALIZER=json
CELERY_RESULT_SERIALIZER=json

Retry Settings

# Maximum retry attempts for tasks
CELERY_TASK_MAX_RETRIES=3

# Retry backoff strategy
CELERY_TASK_RETRY_BACKOFF=True
CELERY_TASK_RETRY_BACKOFF_MAX=120

# Retry delay (seconds)
CELERY_TASK_DEFAULT_RETRY_DELAY=5
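For context, the same retry behaviour can be expressed per task with Celery's decorator options. The task below is a hypothetical sketch, not one of Spark's actual tasks:
# Hypothetical task showing Celery's per-task equivalents of the settings above.
import requests
from celery import Celery

app = Celery("spark_sketch", broker="redis://localhost:6379/0")

@app.task(
    autoretry_for=(requests.RequestException,),  # retry on transient HTTP errors
    max_retries=3,              # CELERY_TASK_MAX_RETRIES
    retry_backoff=True,         # CELERY_TASK_RETRY_BACKOFF (exponential backoff)
    retry_backoff_max=120,      # CELERY_TASK_RETRY_BACKOFF_MAX (seconds)
    default_retry_delay=5,      # CELERY_TASK_DEFAULT_RETRY_DELAY (seconds)
)
def call_workflow_source(url: str) -> dict:
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    return response.json()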

Production Deployment

Docker Compose

Run workers as Docker services for production:
version: '3.8'

services:
  # API Server
  spark-api:
    image: automagik/spark:latest
    command: automagik-spark api start
    ports:
      - "8883:8883"
    environment:
      - DATABASE_URL=postgresql://user:pass@postgres:5432/spark
      - CELERY_BROKER_URL=redis://redis:6379/0
      - SPARK_API_KEY=${SPARK_API_KEY}
    depends_on:
      - postgres
      - redis

  # Celery Workers (scaled)
  spark-worker:
    image: automagik/spark:latest
    command: automagik-spark workers start --concurrency 2
    environment:
      - DATABASE_URL=postgresql://user:pass@postgres:5432/spark
      - CELERY_BROKER_URL=redis://redis:6379/0
    depends_on:
      - postgres
      - redis
    deploy:
      replicas: 3  # Run 3 worker containers

  # Celery Beat (scheduler)
  spark-beat:
    image: automagik/spark:latest
    command: automagik-spark beat start
    environment:
      - DATABASE_URL=postgresql://user:pass@postgres:5432/spark
      - CELERY_BROKER_URL=redis://redis:6379/0
    depends_on:
      - postgres
      - redis

  # PostgreSQL Database
  postgres:
    image: postgres:14
    environment:
      - POSTGRES_USER=user
      - POSTGRES_PASSWORD=pass
      - POSTGRES_DB=spark
    volumes:
      - postgres_data:/var/lib/postgresql/data

  # Redis Queue
  redis:
    image: redis:7
    volumes:
      - redis_data:/data

volumes:
  postgres_data:
  redis_data:

Systemd Service

Run workers as a systemd service:
[Unit]
Description=Spark Celery Worker
After=network.target redis.service postgresql.service

[Service]
Type=forking
User=spark
WorkingDirectory=/opt/spark
Environment="PATH=/opt/spark/venv/bin"
ExecStart=/opt/spark/venv/bin/automagik-spark workers start --workers 4 --concurrency 2
ExecStop=/opt/spark/venv/bin/automagik-spark workers stop
Restart=on-failure
RestartSec=5s

[Install]
WantedBy=multi-user.target
Enable and start:
sudo systemctl enable spark-worker
sudo systemctl start spark-worker
sudo systemctl status spark-worker

Process Management (Supervisor)

Use Supervisor to manage worker processes:
[program:spark-worker]
command=/opt/spark/venv/bin/automagik-spark workers start --workers 4
directory=/opt/spark
user=spark
autostart=true
autorestart=true
redirect_stderr=true
stdout_logfile=/var/log/spark/worker.log
environment=PATH="/opt/spark/venv/bin",DATABASE_URL="postgresql://..."

Scaling Workers

Horizontal Scaling

Run multiple worker processes to handle more concurrent tasks:
# Development: Single worker
automagik-spark workers start

# Production: Multiple workers with concurrency
automagik-spark workers start --workers 8 --concurrency 2
# This creates 8 worker processes, each handling 2 tasks = 16 total concurrent tasks
Formula:
Total Concurrent Tasks = Workers × Concurrency

Resource Considerations

CPU-Bound Tasks:
  • Workers = Number of CPU cores
  • Concurrency = 1 per worker
  • Example: 4 cores = 4 workers × 1 concurrency
I/O-Bound Tasks (typical for workflows):
  • Workers = 2-4 per CPU core
  • Concurrency = 2-4 per worker
  • Example: 4 cores = 8 workers × 2 concurrency = 16 concurrent tasks (see the sizing sketch below)
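The rules of thumb above can be condensed into a small sizing helper. This is only a sketch of the guidance in this section, not a utility shipped with Spark:
# Sizing sketch based on the guidance above (not part of automagik-spark).
def suggest_worker_layout(cpu_cores: int, io_bound: bool = True) -> dict:
    if io_bound:
        workers = cpu_cores * 2   # 2-4 workers per core; 2 is the conservative end
        concurrency = 2           # 2-4 tasks per worker
    else:
        workers = cpu_cores       # one worker per core for CPU-bound tasks
        concurrency = 1
    return {
        "workers": workers,
        "concurrency": concurrency,
        "total_concurrent_tasks": workers * concurrency,
    }

print(suggest_worker_layout(4))                   # 8 workers x 2 = 16 concurrent tasks
print(suggest_worker_layout(4, io_bound=False))   # 4 workers x 1 = 4 concurrent tasks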

Load Balancing

Workers automatically load balance by picking tasks from the shared Redis queue:
  1. Task created → added to Redis queue
  2. First available worker picks up task
  3. Worker executes and updates status
  4. Worker picks next task
No additional configuration needed for load balancing.

Monitoring Workers

Health Checks

Check if workers are running and responding:
# Quick health check
automagik-spark workers status

# Check Redis connection
redis-cli ping

# Check worker queue
redis-cli llen celery

# Check active workers
celery -A automagik_spark.celery inspect active
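The same checks can be scripted. The sketch below uses Celery's inspect API and redis-py; a throwaway Celery app pointed at the same broker can ping the real workers, and "celery" is Celery's default queue name:
# Sketch of a programmatic health check (assumes the broker settings above).
import redis
from celery import Celery

app = Celery("health_sketch", broker="redis://localhost:6379/0")

def check_health() -> None:
    replies = app.control.inspect(timeout=5).ping() or {}   # empty dict: no workers responded
    print(f"workers responding: {sorted(replies)}")

    queue_depth = redis.Redis(host="localhost", port=6379, db=0).llen("celery")
    print(f"pending tasks in queue: {queue_depth}")

if __name__ == "__main__":
    check_health()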

Performance Metrics

Monitor worker performance:
# Task processing rate
celery -A automagik_spark.celery inspect stats

# Active tasks
celery -A automagik_spark.celery inspect active

# Scheduled tasks (upcoming)
celery -A automagik_spark.celery inspect scheduled

# Reserved tasks (claimed but not started)
celery -A automagik_spark.celery inspect reserved
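To turn the stats output into something readable, the per-worker reply can be summarised in a few lines; again a hedged sketch, not a Spark command:
# Sketch: summarise task counts from Celery's stats() reply.
from celery import Celery

app = Celery("stats_sketch", broker="redis://localhost:6379/0")

def print_task_totals() -> None:
    stats = app.control.inspect(timeout=5).stats() or {}
    for worker, data in stats.items():
        totals = data.get("total", {})   # {task_name: count since the worker started}
        print(f"{worker}: {sum(totals.values())} tasks processed")

if __name__ == "__main__":
    print_task_totals()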

Logs and Debugging

Common issues and how to debug them:
Worker Not Starting:
# Check logs for errors
automagik-spark workers logs --level error

# Verify Redis connection
redis-cli -h localhost -p 6379 ping

# Check environment variables
env | grep CELERY_BROKER_URL
Tasks Not Processing:
# Check active workers
automagik-spark workers status

# Check queue length
redis-cli llen celery

# Inspect worker state
celery -A automagik_spark.celery inspect active
High Memory Usage:
# Restart workers with memory limit
export CELERY_WORKER_MAX_MEMORY_PER_CHILD=150000
automagik-spark workers stop
automagik-spark workers start --workers 4

Worker Pools

Celery supports different execution pools:

Prefork Pool (Default)

Best for CPU-bound tasks:
export CELERY_WORKER_POOL=prefork
automagik-spark workers start --workers 4 --concurrency 2
Pros: Process isolation, fault tolerance
Cons: Higher memory usage

Solo Pool

Single process, single task:
export CELERY_WORKER_POOL=solo
automagik-spark workers start
Pros: Simplicity, low memory
Cons: No concurrency

Threads Pool

Thread-based concurrency:
export CELERY_WORKER_POOL=threads
automagik-spark workers start --concurrency 4
Pros: Low memory, good for I/O-bound tasks
Cons: GIL limitations in Python
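If you invoke Celery directly instead of going through the Spark CLI, the pool can also be selected per invocation. A hedged sketch using Celery's worker_main, equivalent to celery -A <app> worker --pool=threads --concurrency=4:
# Sketch: starting a worker with an explicit pool from Python.
# The Spark CLI presumably passes CELERY_WORKER_POOL through for you.
from celery import Celery

app = Celery("pool_sketch", broker="redis://localhost:6379/0")

if __name__ == "__main__":
    # threads pool: low memory, suits I/O-bound workflow calls, limited by the GIL
    app.worker_main(argv=["worker", "--pool=threads", "--concurrency=4", "--loglevel=info"])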

Troubleshooting

Common Issues

Workers exit unexpectedly:
  • Check memory limits (CELERY_WORKER_MAX_MEMORY_PER_CHILD)
  • Review logs for errors
  • Verify database connections
Tasks stuck in pending:
  • Confirm workers are running: automagik-spark workers status
  • Check Redis queue: redis-cli llen celery
  • Restart workers: automagik-spark workers stop && automagik-spark workers start
High CPU usage:
  • Reduce concurrency per worker
  • Increase number of workers
  • Check for infinite loops in workflows
Connection errors:
  • Verify Redis is accessible
  • Check CELERY_BROKER_URL configuration
  • Ensure network connectivity to workflow sources

Best Practices

Graceful Shutdown

Always stop workers gracefully to finish in-progress tasks

Monitor Resources

Track CPU, memory, and task processing rates

Scale Appropriately

Match worker count to workload and available resources

Use Process Manager

Deploy with systemd, supervisor, or Docker for auto-restart

Next Steps