Introduction

The Batches API allows you to process large volumes of API requests asynchronously at half the cost of standard API calls. It is ideal for bulk operations, data processing pipelines, and overnight jobs that don't require immediate responses.

Save 50% with Batch Processing

The Batch API offers 50% cost savings compared to standard synchronous API calls. Process thousands of requests efficiently while you sleep!

Key Benefits

  • 💰 50% Cost Reduction: Significantly lower costs for bulk processing
  • ⚡ Async Processing: Submit jobs and retrieve results when ready
  • 📊 Progress Tracking: Monitor completion status in real-time
  • 🔄 Automatic Retries: Built-in retry logic for failed requests
  • 📁 Organized Results: Separate output files for successes and errors
  • ⏱️ 24-Hour Window: All batches complete within 24 hours

How It Works

  1. Upload a JSONL file containing your requests
  2. Create a batch job referencing the uploaded file
  3. Monitor progress as requests are processed asynchronously
  4. Download results from output files when complete

API Endpoints

Supported Endpoints

Batch processing is available for the following endpoints:

Chat Completions

/v1/chat/completions: Process conversations at scale

Embeddings

/v1/embeddings: Generate embeddings in bulk

Responses

/v1/responses: Batch response generation

Quick Start

Step 1: Prepare Your Input File

Create a JSONL file with your requests:
{"custom_id": "request-1", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "garda-beta-mini", "messages": [{"role": "user", "content": "Translate to Indonesian: Hello"}], "max_tokens": 100}}
{"custom_id": "request-2", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "garda-beta-mini", "messages": [{"role": "user", "content": "Translate to Indonesian: Goodbye"}], "max_tokens": 100}}
{"custom_id": "request-3", "method": "POST", "url": "/v1/chat/completions", "body": {"model": "garda-beta-mini", "messages": [{"role": "user", "content": "Translate to Indonesian: Thank you"}], "max_tokens": 100}}
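For larger inputs, the file is easier to generate programmatically than by hand. A minimal Python sketch that reproduces the three example lines above (the model name, endpoint, and file name are taken from the example; `build_batch_line` is an illustrative helper, not part of the API):

```python
import json

def build_batch_line(custom_id, prompt, model="garda-beta-mini", max_tokens=100):
    """Build one JSONL line for a /v1/chat/completions batch request."""
    return json.dumps({
        "custom_id": custom_id,
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": model,
            "messages": [{"role": "user", "content": prompt}],
            "max_tokens": max_tokens,
        },
    })

# Write one request per line, matching the example file above.
phrases = ["Hello", "Goodbye", "Thank you"]
with open("batch_input.jsonl", "w") as f:
    for i, phrase in enumerate(phrases, start=1):
        line = build_batch_line(f"request-{i}", f"Translate to Indonesian: {phrase}")
        f.write(line + "\n")
```

Each line must be a complete, self-contained JSON object; `json.dumps` guarantees that, which avoids the `invalid_jsonl` validation error described later.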

Step 2: Upload the File

curl https://api.neosantara.xyz/v1/files \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F purpose="batch" \
  -F file="@batch_input.jsonl"
Response:
{
  "id": "file-abc123",
  "purpose": "batch",
  "filename": "batch_input.jsonl",
  "bytes": 1024,
  "created_at": 1699564800
}

Step 3: Create the Batch

curl https://api.neosantara.xyz/v1/batches \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "input_file_id": "file-abc123",
    "endpoint": "/v1/chat/completions",
    "completion_window": "24h"
  }'
Response:
{
  "id": "batch-xyz789",
  "status": "validating",
  "request_counts": {
    "total": 3,
    "completed": 0,
    "failed": 0
  }
}
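When submitting from code, the request body can be assembled and sanity-checked client-side before sending. A sketch under the fields shown in the curl example above (the supported-endpoint check mirrors the Supported Endpoints list and is illustrative, not server behavior):

```python
import json

SUPPORTED_ENDPOINTS = {"/v1/chat/completions", "/v1/embeddings", "/v1/responses"}

def build_batch_request(input_file_id, endpoint="/v1/chat/completions",
                        completion_window="24h"):
    """Assemble the JSON body for POST /v1/batches."""
    if endpoint not in SUPPORTED_ENDPOINTS:
        # Fail fast client-side instead of getting an invalid_endpoint error back.
        raise ValueError(f"unsupported endpoint: {endpoint}")
    return {
        "input_file_id": input_file_id,
        "endpoint": endpoint,
        "completion_window": completion_window,
    }

payload = json.dumps(build_batch_request("file-abc123"))
```

The resulting `payload` string is what the `-d` flag carries in the curl example.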

Step 4: Monitor Progress

curl https://api.neosantara.xyz/v1/batches/batch-xyz789 \
  -H "Authorization: Bearer YOUR_API_KEY"
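A simple polling loop can wrap this status check until the batch reaches a terminal state. A hedged sketch, where `fetch_batch` is any callable you supply that performs the GET request above and returns the parsed JSON batch object:

```python
import time

# Terminal states from the Batch Status Lifecycle table.
TERMINAL_STATES = {"completed", "failed", "canceled", "expired"}

def wait_for_batch(batch_id, fetch_batch, interval=30.0, sleep=time.sleep):
    """Poll until the batch reaches a terminal status; return the final batch object."""
    while True:
        batch = fetch_batch(batch_id)
        if batch["status"] in TERMINAL_STATES:
            return batch
        sleep(interval)
```

Since all batches finish within 24 hours, a modest polling interval (tens of seconds) is plenty; the injectable `sleep` parameter just makes the loop easy to test.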

Step 5: Download Results

Once the batch status is completed:
# Download successful results
curl https://api.neosantara.xyz/v1/files/file-output-123/content \
  -H "Authorization: Bearer YOUR_API_KEY" \
  --output results.jsonl

# Download errors (if any)
curl https://api.neosantara.xyz/v1/files/file-errors-456/content \
  -H "Authorization: Bearer YOUR_API_KEY" \
  --output errors.jsonl

Batch Status Lifecycle

Status        Description
validating    Input file is being validated
in_progress   Requests are being processed
finalizing    Results are being compiled
completed     Batch finished successfully
failed        Batch encountered an error
canceled      Batch was manually canceled
expired       Batch exceeded the 24-hour window

Tier Limitations

Free Tier: Batch API is not available on the Free tier. Upgrade to Basic or higher to unlock batch processing.
Tier         Concurrent Batches   Status
Free         0                    ❌ Not Available
Basic        5                    ✅ Available
Pro          10                   ✅ Available
Enterprise   Custom               ✅ Available

Best Practices

  • Balance batch size against completion time. Larger batches (1,000+ requests) maximize cost savings, while smaller batches complete faster. Consider your use case requirements.
  • Assign meaningful custom_id values to each request. This helps you map results back to your original data when processing output files.
  • Always check both output and error files. Some requests may succeed while others fail; implement retry logic for failed requests if needed.
  • Stay within your tier's concurrent batch limit. Queue additional batches to start after current ones complete.
  • All batches complete within 24 hours. For time-sensitive operations, consider using standard API calls instead.
  • Ensure your JSONL file is properly formatted before creating a batch. An invalid format will cause immediate failure during validation.
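To balance batch size against completion time, a large request list can be split into fixed-size chunks, each written to its own input file. A sketch (the 10,000-request default follows the recommendation in Rate Limits; `write_chunks` is an illustrative helper):

```python
import json

def chunk_requests(requests, chunk_size=10_000):
    """Split a list of request dicts into fixed-size chunks, one per batch."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be >= 1")
    return [requests[i:i + chunk_size] for i in range(0, len(requests), chunk_size)]

def write_chunks(requests, prefix="batch_input", chunk_size=10_000):
    """Write each chunk to its own JSONL file; return the file names."""
    names = []
    for n, chunk in enumerate(chunk_requests(requests, chunk_size), start=1):
        name = f"{prefix}_{n}.jsonl"
        with open(name, "w") as f:
            for req in chunk:
                f.write(json.dumps(req) + "\n")
        names.append(name)
    return names
```

Each resulting file is then uploaded and submitted as its own batch, which also keeps you under the 100MB file-size cap and your tier's concurrent batch limit.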

Output File Format

Successful results are returned in JSONL format:
example.jsonl
{"id": "batch-req-1", "custom_id": "request-1", "response": {"status_code": 200, "body": {"id": "chatcmpl-123", "object": "chat.completion", "choices": [{"message": {"role": "assistant", "content": "Halo"}}]}}}
{"id": "batch-req-2", "custom_id": "request-2", "response": {"status_code": 200, "body": {"id": "chatcmpl-124", "object": "chat.completion", "choices": [{"message": {"role": "assistant", "content": "Selamat tinggal"}}]}}}
Error file format:
errors.jsonl
{"id": "batch-req-3", "custom_id": "request-3", "error": {"code": "invalid_request", "message": "Invalid model specified"}}
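When post-processing, the two files can be merged into a single mapping keyed by custom_id, which makes checking both files mechanical. A sketch assuming the line shapes shown above (`index_results` is an illustrative helper):

```python
import json

def index_results(output_lines, error_lines):
    """Map custom_id -> ("ok", response body) or ("error", error object)."""
    results = {}
    for line in output_lines:
        rec = json.loads(line)
        results[rec["custom_id"]] = ("ok", rec["response"]["body"])
    for line in error_lines:
        rec = json.loads(line)
        results[rec["custom_id"]] = ("error", rec["error"])
    return results
```

The custom_ids tagged `"error"` are exactly the requests to re-submit in a follow-up batch, per the retry advice in Best Practices.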

Use Cases

Data Labeling

Classify or label thousands of text samples for ML training datasets

Content Moderation

Analyze large volumes of user-generated content for policy compliance

Translation

Translate documentation or content into multiple languages at scale

Sentiment Analysis

Process customer feedback, reviews, or social media posts in bulk

Embeddings Generation

Create vector embeddings for entire document collections or knowledge bases

Report Generation

Generate hundreds of personalized reports from structured data

Error Handling

Common batch errors and solutions:
Error Code                Description                       Solution
batch_api_not_allowed     Free tier restriction             Upgrade to Basic tier or higher
missing_required_field    Required parameter missing        Include all required fields in the request
invalid_endpoint          Unsupported endpoint              Use supported endpoints only
invalid_input_file        File not found or wrong purpose   Ensure the file exists and has purpose="batch"
invalid_jsonl             JSONL validation failed           Check the file format for syntax errors
concurrent_batches_limit  Too many active batches           Wait for existing batches to complete
batch_not_found           Batch ID doesn't exist            Verify the batch_id is correct
cannot_cancel             Batch already finished            Only in-progress batches can be canceled

Rate Limits

Batch processing has different rate limits than standard API calls:
  • Concurrent Jobs: Based on your tier (see table above)
  • Requests per Batch: No hard limit, though 10,000 requests or fewer per batch is recommended for optimal performance
  • File Size: Maximum 100MB per input file
  • Processing Time: All batches complete within 24 hours
For processing more than 50,000 requests or custom concurrent limits, contact our enterprise team for a custom plan.
Last modified on December 4, 2025