Quota System

The quota system ensures fair resource allocation by limiting concurrent jobs per user per service.

How Quotas Work

User 1 submits 10 jobs for "example" service
Quota: SERVICE_EXAMPLE_RUNS_PER_USER=5

┌─────────────────────────────────────────┐
│ Jobs 1-5:  Dispatched immediately       │
│ Jobs 6-10: Remain queued                │
└─────────────────────────────────────────┘

When Job 1 completes:
┌─────────────────────────────────────────┐
│ Job 6: Now dispatched (slot available)  │
└─────────────────────────────────────────┘

Configuration

Set quotas per service using environment variables:

SERVICE_<NAME>_RUNS_PER_USER=<limit>

Examples

# Allow 5 concurrent jobs per user for "example" service
SERVICE_EXAMPLE_RUNS_PER_USER=5

# Allow 3 concurrent jobs per user for "haddock" service
SERVICE_HADDOCK_RUNS_PER_USER=3

# Allow 10 concurrent jobs per user for "quick" service
SERVICE_QUICK_RUNS_PER_USER=10

Default Value

If not specified, the default quota is 5 concurrent jobs per user per service.

Quota Scope

Quotas are enforced per user, per service:

User	Service	Quota	Can Submit
user_1	example	5	Up to 5 concurrent
user_1	haddock	3	Up to 3 concurrent
user_2	example	5	Up to 5 concurrent (independent of user_1)

Users don’t compete with each other - each user has their own quota allocation.

Quota States

Jobs transition through these states relative to quotas:

┌──────────┐     Quota      ┌────────────┐
│  Queued  │ ──Available──▶ │ Processing │
└──────────┘                └────────────┘
     │                            │
     │ Quota Exhausted            │
     ▼                            ▼
┌──────────────────┐      ┌────────────┐
│ Remains Queued   │      │ Submitted  │
│ (waits for slot) │      │ (running)  │
└──────────────────┘      └────────────┘

Choosing Quota Values

Factors to Consider

Job Duration: Longer jobs need lower quotas
Resource Usage: CPU/memory intensive jobs need lower quotas
User Base: More users may need lower per-user quotas
Client Capacity: Match quotas to available compute resources

Guidelines

Job Type	Suggested Quota
Quick jobs (< 1 min)	10-20
Medium jobs (1-10 min)	5-10
Long jobs (10+ min)	2-5
Resource-intensive	1-3

Example Scenarios

Scientific Computing Platform

# Quick validation jobs - high quota
SERVICE_VALIDATE_RUNS_PER_USER=20

# Standard analysis - medium quota
SERVICE_ANALYZE_RUNS_PER_USER=5

# Heavy simulation - low quota
SERVICE_SIMULATE_RUNS_PER_USER=2

Educational Platform

# Student exercises - moderate quota
SERVICE_EXERCISE_RUNS_PER_USER=3

# Final projects - allow more
SERVICE_PROJECT_RUNS_PER_USER=5

Monitoring Quota Usage

Check Queue Status

Jobs waiting due to quota exhaustion remain in Queued status:

# Check how many jobs are queued vs running
curl http://localhost:5000/swagger-ui/  # Use API explorer

Server Logs

The server logs quota decisions:

INFO: User 1 has 5/5 jobs running for service 'example', job 123 remains queued
INFO: User 1 slot available, dispatching job 123 to service 'example'

Testing Quotas

Submit multiple jobs to observe throttling:

# Submit 10 jobs with quota of 5
for i in {1..10}; do
  echo '#!/bin/bash
sleep 30
echo "Job complete" > output.txt' > run.sh

  curl -s -X POST http://localhost:5000/upload \
    -F "file=@run.sh" \
    -F "user_id=1" \
    -F "service=example" | jq -r '.status'
done

You’ll see:

First 5 jobs: Quickly move to Submitted
Jobs 6-10: Stay in Queued until slots open

Fair Scheduling

The quota system provides basic fairness:

Per-user isolation: One user can’t starve others
Per-service isolation: Heavy service usage doesn’t block other services
Automatic queuing: No jobs are rejected, just delayed

Limitations

Current limitations (improvements planned):

No priority queues (FIFO within quota constraints)
No global quotas (only per-user)
No time-based quotas (e.g., jobs per hour)
No burst allowances

job-orchestrator