
Goal

By the end of this guide, you’ll have a workflow running automatically every 5 minutes in Spark. Total time: under 10 minutes. This is your “hello world” for Spark scheduling—the simplest possible path from installation to automated workflow execution.

Prerequisites

Before starting, ensure you have:
1. Spark installed and running

Complete the installation guide first.
Required services:
  • PostgreSQL (running on port 5432)
  • Redis (running on port 6379)
  • Spark API (running on port 8883)
2. A workflow source available

You need either:
  • LangFlow running on http://localhost:7860, OR
  • Automagik Hive running on http://localhost:8000
Don’t have one? Quickly install LangFlow:
pip install langflow
langflow run
Then create a simple flow (any flow works for this example).
3. API key from your source

Get an API key from your LangFlow or Hive instance. You’ll need this to connect Spark to your workflow source.

Quick Prerequisites Check

Run these commands to verify everything is ready:
# Check PostgreSQL
psql -h localhost -U postgres -c "SELECT version();"

# Check Redis
redis-cli ping
# Should output: PONG

# Check Spark API
curl http://localhost:8883/health
# Should output: {"status":"ok"}

# Check LangFlow (if using)
curl http://localhost:7860/health
All checks must pass before proceeding. If any fail, see common errors for fixes.
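If a service is still starting up, a short retry loop saves re-running the checks by hand. Here is a minimal sketch, assuming the Spark API /health endpoint shown above returns HTTP 200 once the API is ready:
# Wait up to 60 seconds for the Spark API to become healthy
# (sketch; assumes /health returns HTTP 200 when the API is ready)
for i in $(seq 1 12); do
  if curl -sf http://localhost:8883/health > /dev/null; then
    echo "Spark API is up"
    break
  fi
  echo "Waiting for Spark API... ($i/12)"
  sleep 5
done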

Step-by-Step: Schedule in 10 Minutes

Step 1: Add Your Workflow Source

Connect Spark to your LangFlow or Hive instance.
For LangFlow:
automagik-spark sources add \
  --name "my-langflow" \
  --type "langflow" \
  --url "http://localhost:7860" \
  --api-key "your-langflow-api-key"
For Hive:
automagik-spark sources add \
  --name "my-hive" \
  --type "automagik-agents" \
  --url "http://localhost:8000" \
  --api-key "your-hive-api-key"
Expected output:
Health check passed: status ok
Version check passed: 1.0.65
Successfully added source: http://localhost:7860
What this does: Registers your workflow source with Spark so it can discover and sync workflows.
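If you script this step, avoid hard-coding the key on the command line. A minimal sketch using only the commands above; LANGFLOW_API_KEY is just an example variable name, not something Spark requires:
# Keep the API key in an environment variable instead of the command line
# (LANGFLOW_API_KEY is an example name, not a Spark requirement)
export LANGFLOW_API_KEY="your-langflow-api-key"

automagik-spark sources add \
  --name "my-langflow" \
  --type "langflow" \
  --url "http://localhost:7860" \
  --api-key "$LANGFLOW_API_KEY"

# Confirm the source was registered
automagik-spark sources list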
Step 2: List Available Workflows

Discover what workflows are available to schedule.
automagik-spark workflows sync
Expected output:
┌────────────────────┬─────────────────────┬──────────────────────┬──────────┐
│ ID                 │ Name                │ Description          │ Source   │
├────────────────────┼─────────────────────┼──────────────────────┼──────────┤
│ flow-abc-123       │ daily-report        │ Generate daily stats │ langflow │
│ flow-def-456       │ data-processor      │ Process CSV data     │ langflow │
└────────────────────┴─────────────────────┴──────────────────────┴──────────┘

Command: sync <flow_id> • Sources: my-langflow
Note the workflow ID you want to schedule (e.g., flow-abc-123).
What this does: Run without a flow ID, the sync command queries your source for all available workflows without syncing any of them yet. This is your discovery phase.
Step 3: Sync the Workflow

Bring your chosen workflow into Spark’s database.
# Replace flow-abc-123 with your workflow ID
automagik-spark workflows sync flow-abc-123
Expected output:
Successfully synced flow flow-abc-123
Verify it was synced:
automagik-spark workflows list
Expected output:
┌────────────┬──────────────┬────────────┬───────────┬──────────┐
│ ID         │ Name         │ Latest Run │ Schedules │ Source   │
├────────────┼──────────────┼────────────┼───────────┼──────────┤
│ workflow-1 │ daily-report │ NEW        │ 0         │ langflow │
└────────────┴──────────────┴────────────┴───────────┴──────────┘
What this does: Copies the workflow metadata into Spark’s database so it can be scheduled and executed locally.
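If you want several workflows in Spark, you can loop the same sync command over their IDs, as in this minimal sketch (the IDs are the placeholders from the example output above):
# Sync multiple flows in one go (flow IDs are placeholders from the example above)
for FLOW_ID in flow-abc-123 flow-def-456; do
  automagik-spark workflows sync "$FLOW_ID"
done

# Confirm they all landed in Spark's database
automagik-spark workflows list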
Step 4: Create a Schedule (Every 5 Minutes)

Set up the workflow to run automatically every 5 minutes.
automagik-spark schedules create
Interactive prompts and responses:
Available Workflows:
0: daily-report (0 schedules)

Select a workflow: 0

Schedule Type:
  0: Interval (e.g., every 30 minutes)
  1: Cron (e.g., every day at 8 AM)
  2: One-time (run once at a specific time)

Select schedule type: 1

Cron Examples:
  * * * * *     - Every minute
  */5 * * * *   - Every 5 minutes
  0 * * * *     - Every hour
  0 0 * * *     - Every day at midnight

Enter cron expression: */5 * * * *

Enter input value (or press Enter to skip): {"message": "hello"}

Schedule created successfully with ID: schedule-abc-123
Cron expression breakdown: */5 * * * *
  • */5 = Every 5 minutes
  • * = Every hour
  • * = Every day of month
  • * = Every month
  • * = Every day of week
Translation: “Run every 5 minutes, always.”
What this does: Creates a database entry that Celery Beat monitors. Every 5 minutes, Beat creates a task that workers execute.
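Besides crontab.guru, you can sanity-check an expression locally before typing it into the prompt. A minimal sketch, assuming the croniter Python package is installed (pip install croniter); it prints the next three fire times so you can confirm the 5-minute spacing:
# Print the next three run times for a cron expression
# (assumes `pip install croniter`; croniter is not part of Spark)
python3 -c '
from datetime import datetime
from croniter import croniter

it = croniter("*/5 * * * *", datetime.now())
for _ in range(3):
    print(it.get_next(datetime))
'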
Step 5: Verify the Schedule

Confirm your schedule was created and is active.
automagik-spark schedules list
Expected output:
┌──────────────┬──────────────┬──────┬─────────────┬────────────┬────────┐
│ ID           │ Workflow     │ Type │ Expression  │ Next Run   │ Status │
├──────────────┼──────────────┼──────┼─────────────┼────────────┼────────┤
│ schedule-123 │ daily-report │ cron │ */5 * * * * │ In 5 mins  │ ACTIVE │
└──────────────┴──────────────┴──────┴─────────────┴────────────┴────────┘
Key things to check:
  • ✅ Status is ACTIVE (not PAUSED or STOPPED)
  • ✅ Next Run shows a future time
  • ✅ Expression matches what you entered
What this does: Shows you all configured schedules and when they’ll next execute. The “Next Run” time counts down to the next execution.
Step 6: Check Workers Are Running

Verify Spark workers and scheduler are running.
automagik-spark worker status
Expected output:
Worker is running (PID: 12345)
Beat scheduler is running (PID: 12346)
If workers are NOT running:
# Start workers and beat scheduler
automagik-spark worker start

# Verify they started
automagik-spark worker status
Critical: Without workers running, schedules will NOT execute. Always ensure workers are running.
What this does: Workers execute tasks. Beat creates tasks from schedules. Both must run for automation to work.
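In a setup script you can make this check self-healing by starting the workers whenever the status check fails. A minimal sketch, assuming worker status exits non-zero when nothing is running (the complete example script at the end of this guide uses the same pattern):
# Start workers only if the status check fails
# (assumes `worker status` exits non-zero when workers are not running)
automagik-spark worker status || automagik-spark worker start

# Re-check that both the worker and the beat scheduler are up
automagik-spark worker status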
Step 7: Wait for First Execution (Up to 5 Minutes)

Your schedule will fire within 5 minutes. Meanwhile, check the task list.
# Check tasks list (should be empty initially)
automagik-spark tasks list

# Wait 5 minutes, then check again
# You should see a new task
After 5 minutes, expected output:
┌──────────────┬──────────────┬─────────────┬────────┬────────────┬──────────┐
│ ID           │ Workflow     │ Schedule    │ Status │ Created    │ Duration │
├──────────────┼──────────────┼─────────────┼────────┼────────────┼──────────┤
│ task-001     │ daily-report │ schedule-12 │ ✓ OK   │ Just now   │ 1.2s     │
└──────────────┴──────────────┴─────────────┴────────┴────────────┴──────────┘
Timeline:
  • 0:00 - Schedule created
  • 0:00-5:00 - Waiting for first cron trigger
  • 5:00 - First task created and executed
  • 10:00 - Second task executed
  • Every 5 minutes - New task executed
What this does: Confirms your schedule is working. Each execution creates a task record with status, logs, and output.
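Instead of re-running the list command by hand, you can poll it until the first task shows up. A minimal sketch that checks every 30 seconds for up to 6 minutes; it greps the task list for the example workflow name, so adjust the pattern for your own workflow:
# Poll the task list every 30 seconds until a task for daily-report appears
# ("daily-report" is the example workflow name from this guide)
for i in $(seq 1 12); do
  if automagik-spark tasks list | grep -q "daily-report"; then
    echo "First task has been created:"
    automagik-spark tasks list
    break
  fi
  echo "No task yet, waiting... ($i/12)"
  sleep 30
done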
Step 8: View Execution Logs

Check the output from your workflow execution.
# Get the task ID from the list above (e.g., task-001)
automagik-spark tasks view task-001
Expected output:
Task ID: task-001
Workflow: daily-report
Schedule: schedule-abc-123
Status: SUCCESS
Created: 2025-11-04 15:30:00
Started: 2025-11-04 15:30:01
Completed: 2025-11-04 15:30:02
Duration: 1.2s

Input:
{"message": "hello"}

Output:
{"result": "Report generated successfully", "timestamp": "2025-11-04T15:30:02Z"}

Logs:
[2025-11-04 15:30:01] Task started
[2025-11-04 15:30:01] Executing workflow daily-report
[2025-11-04 15:30:02] Workflow completed successfully
What this does: Shows you everything that happened during execution—input sent, output received, logs, timing.
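If you only need the workflow's output, you can slice that section out of the task view. A minimal sketch, assuming the layout shown above (an "Output:" line followed by the JSON and a blank line):
# Print just the Output section of a task
# (assumes the "Output:" block layout shown in the example above)
automagik-spark tasks view task-001 | sed -n '/^Output:/,/^$/p'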
Step 9: Control the Schedule

Pause, resume, or modify your schedule.
Pause the schedule:
automagik-spark schedules update schedule-abc-123 pause
Resume the schedule:
automagik-spark schedules update schedule-abc-123 resume
Change the frequency:
# Run every 10 minutes instead of 5
automagik-spark schedules set-expression schedule-abc-123 "*/10 * * * *"
Delete the schedule:
automagik-spark schedules delete schedule-abc-123
What this does: Gives you full control over your automation. Pause during maintenance, adjust frequency based on needs, or clean up when done.
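A common pattern is to wrap maintenance work between a pause and a resume so no tasks fire mid-change. A minimal sketch built from the commands above (schedule-abc-123 is the example ID from this guide):
# Pause the schedule, perform maintenance, then resume it
automagik-spark schedules update schedule-abc-123 pause

# ... perform maintenance here (redeploy LangFlow, edit the flow, etc.) ...

automagik-spark schedules update schedule-abc-123 resume

# Confirm the status is ACTIVE again
automagik-spark schedules list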
Step 10: Clean Up (Optional)

When you’re done testing, remove the schedule and workflow.
# Delete the schedule
automagik-spark schedules delete schedule-abc-123

# Delete the workflow (also removes all associated tasks)
automagik-spark workflows delete workflow-1

# Optionally remove the source
automagik-spark sources delete my-langflow
What this does: Cleans up test resources. In production, you’d keep these running indefinitely.

Expected Timeline

Here’s what to expect when following this guide:
  • 0:00 - Start guide (prerequisites ready)
  • 2:00 - Source added, workflow synced (workflow now in Spark database)
  • 3:00 - Schedule created (automation configured)
  • 4:00 - Workers verified, waiting (Beat monitoring the schedule)
  • 5:00 - First execution (task created and executed)
  • 10:00 - Second execution (automation running on schedule)
  • Total: under 10 minutes (fully automated workflow)
Most time is waiting: Setup takes 3-4 minutes. The rest is waiting for the 5-minute schedule to fire.

Verification Commands

Use these commands to verify each step worked:
# After adding source
automagik-spark sources list

# After syncing workflow
automagik-spark workflows list

# After creating schedule
automagik-spark schedules list

# After first execution
automagik-spark tasks list

# To view specific task
automagik-spark tasks view <task-id>

# To check workers
automagik-spark worker status

# To follow worker logs in real-time
automagik-spark worker logs --follow

Common Mistakes

Problem: Workers not running or Beat scheduler not started.
Solution:
# Check worker status
automagik-spark worker status

# If not running, start workers
automagik-spark worker start

# Verify both worker and beat are running
automagik-spark worker status
What to look for: Both “Worker is running” AND “Beat scheduler is running” must appear in the output.
Problem: Invalid cron expression entered.
Common mistakes:
  • 5 * * * * (runs at minute 5 of every hour, not every 5 minutes)
  • * * * * * * (6 fields—cron uses 5 fields only)
  • */5 (incomplete expression)
Correct for every 5 minutes: */5 * * * *
Solution: Use crontab.guru to validate expressions before entering them.
Problem: Schedule was manually paused or created in a paused state.
Solution:
# Resume the schedule
automagik-spark schedules update schedule-abc-123 resume

# Verify it's now ACTIVE
automagik-spark schedules list
Problem: Workflow execution failed (source unreachable, bad input, etc.).
Solution:
# View task details to see error
automagik-spark tasks view task-abc-123

# Common causes:
# 1. LangFlow/Hive not running - check with curl
# 2. Invalid input data format - check input in task view
# 3. Workflow deleted from source - resync workflow

# Test workflow manually first
automagik-spark workflows run workflow-1 --input '{"test": "data"}'
Problem: Redis or PostgreSQL connection failure.
Solution:
# Check Redis
redis-cli ping

# Check PostgreSQL
psql -h localhost -U postgres -d automagik_spark -c "SELECT 1;"

# Check worker logs for specific error
automagik-spark worker logs

# Verify environment variables
echo $AUTOMAGIK_SPARK_DATABASE_URL
echo $AUTOMAGIK_SPARK_CELERY_BROKER_URL
See common errors for detailed connection troubleshooting.
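If those variables are empty, set them to point at your local services before starting the workers again. A minimal sketch with example values only; the exact URL formats and credentials depend on your installation:
# Example connection settings (placeholder credentials; adjust to your setup)
export AUTOMAGIK_SPARK_DATABASE_URL="postgresql://postgres:postgres@localhost:5432/automagik_spark"
export AUTOMAGIK_SPARK_CELERY_BROKER_URL="redis://localhost:6379/0"

# Start the workers so they pick up the new settings
automagik-spark worker start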

Troubleshooting

If something doesn’t work, check these in order:
1. Verify prerequisites

# All must return success
psql -h localhost -U postgres -c "SELECT 1;"
redis-cli ping
curl http://localhost:8883/health
curl http://localhost:7860/health
2. Check workers are running

automagik-spark worker status

# Both must show as running
# If not, start them:
automagik-spark worker start
3. Verify the schedule is ACTIVE

automagik-spark schedules list

# Status should be ACTIVE, not PAUSED
# If paused, resume:
automagik-spark schedules update <schedule-id> resume
4. Check worker logs

automagik-spark worker logs --follow

# Look for error messages
# Common: connection errors, invalid input, source unavailable
5. Test workflow manually

# Try running the workflow directly (bypasses schedule)
automagik-spark workflows run workflow-1 --input '{"test": "data"}'

# If this fails, the problem is with the workflow, not the schedule
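You can also branch on that manual run in a script, assuming the CLI exits non-zero when the execution fails (a reasonable but unverified assumption):
# Branch on the result of the manual run
# (assumes the CLI exits non-zero when the run fails)
if automagik-spark workflows run workflow-1 --input '{"test": "data"}'; then
  echo "Workflow runs fine; check the schedule and workers instead"
else
  echo "Workflow itself is failing; inspect worker logs and the source"
  automagik-spark worker logs
fi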
Still stuck? See the comprehensive common errors guide for detailed solutions.

What You’ve Accomplished

Congratulations! You’ve successfully:
  • Connected Spark to a workflow source
  • Synced a workflow into Spark
  • Created a schedule that runs every 5 minutes
  • Verified execution and viewed logs
  • Learned how to control and manage schedules
Your workflow is now running on autopilot. Every 5 minutes, Spark:
  1. ⏰ Beat scheduler triggers the schedule
  2. 📋 Creates a new task in the queue
  3. 👷 Worker picks up the task
  4. 🚀 Executes the workflow via adapter
  5. ✅ Logs results and completion

Quick Reference: Common Schedules

Copy-paste these cron expressions for common scheduling patterns:
  • Every minute: * * * * * (testing only; creates many tasks)
  • Every 5 minutes: */5 * * * * (frequent monitoring)
  • Every 15 minutes: */15 * * * * (regular polling)
  • Every 30 minutes: */30 * * * * (moderate frequency)
  • Every hour: 0 * * * * (hourly reports)
  • Every 2 hours: 0 */2 * * * (periodic checks)
  • Every day at 8 AM: 0 8 * * * (daily morning reports)
  • Every day at midnight: 0 0 * * * (daily batch processing)
  • Every weekday at 9 AM: 0 9 * * 1-5 (business day automation)
  • Every Monday at 10 AM: 0 10 * * 1 (weekly reports)
  • First day of month at 9 AM: 0 9 1 * * (monthly reports)
Use crontab.guru to build and validate complex cron expressions.

Complete Example Script

Here’s the entire guide as a single script you can copy and run:
#!/bin/bash
# Simple Spark Schedule - Complete Example

# Prerequisites check
echo "Checking prerequisites..."
redis-cli ping || { echo "Redis not running!"; exit 1; }
psql -h localhost -U postgres -c "SELECT 1;" || { echo "PostgreSQL not running!"; exit 1; }
curl -s http://localhost:8883/health || { echo "Spark API not running!"; exit 1; }

# Step 1: Add source
echo "Adding workflow source..."
automagik-spark sources add \
  --name "my-langflow" \
  --type "langflow" \
  --url "http://localhost:7860" \
  --api-key "your-api-key-here"

# Step 2: List workflows
echo "Listing available workflows..."
automagik-spark workflows sync

# Step 3: Sync workflow (replace flow-abc-123 with your workflow ID)
echo "Syncing workflow..."
read -p "Enter workflow ID to sync: " WORKFLOW_ID
automagik-spark workflows sync "$WORKFLOW_ID"

# Step 4: Get synced workflow ID
SYNCED_ID=$(automagik-spark workflows list | grep -o "workflow-[0-9]*" | head -1)
echo "Synced workflow ID: $SYNCED_ID"

# Step 5: Start workers if not running
echo "Checking workers..."
automagik-spark worker status || automagik-spark worker start

# Step 6: Create schedule (every 5 minutes)
echo "Creating schedule (every 5 minutes)..."
# Note: Use the interactive CLI for simplicity
automagik-spark schedules create

# Step 7: Verify schedule
echo "Schedule created. Verifying..."
automagik-spark schedules list

# Step 8: Monitor tasks
echo "Waiting for first execution (up to 5 minutes)..."
echo "Monitor with: automagik-spark tasks list"
echo "View logs with: automagik-spark worker logs --follow"

echo ""
echo "Setup complete! Your workflow will execute every 5 minutes."
echo "Check status with: automagik-spark tasks list"
To use this script:
  1. Save as setup-schedule.sh
  2. Make executable: chmod +x setup-schedule.sh
  3. Update API key in Step 1
  4. Run: ./setup-schedule.sh

You’ve now mastered the basics of Spark scheduling. From here, you can build complex automation workflows, integrate multiple sources, and scale to production deployments.