Introduction
The Batches API allows you to process large volumes of API requests asynchronously at half the cost of standard API calls. Perfect for bulk operations, data processing pipelines, and overnight jobs that don't require immediate responses.
Save 50% with Batch Processing
The Batch API offers 50% cost savings compared to standard synchronous API calls. Process thousands of requests efficiently while you sleep!
Key Benefits
- 50% Cost Reduction: Significantly lower costs for bulk processing
- Async Processing: Submit jobs and retrieve results when ready
- Progress Tracking: Monitor completion status in real-time
- Automatic Retries: Built-in retry logic for failed requests
- Organized Results: Separate output files for successes and errors
- 24-Hour Window: All batches complete within 24 hours
How It Works
- Upload a JSONL file containing your requests
- Create a batch job referencing the uploaded file
- Monitor progress as requests are processed asynchronously
- Download results from output files when complete
API Endpoints
Create Batch
Start a new batch job for asynchronous request processing
Get Batch
Check the status and progress of a batch job
Cancel Batch
Cancel a running batch job before completion
List Batches
View all your batch jobs with filtering and pagination
Supported Endpoints
Batch processing is available for the following endpoints:
Chat Completions
/v1/chat/completions: Process conversations at scale
Embeddings
/v1/embeddings: Generate embeddings in bulk
Responses
/v1/responses: Batch response generation
Quick Start
Step 1: Prepare Your Input File
Create a JSONL file with your requests, one JSON object per line.
Step 2: Upload the File
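As a sketch of Steps 1 and 2: build the input file, then upload it with purpose="batch". The request-line fields (custom_id, method, url, body) and the model name below follow common batch-API conventions and are assumptions, not details confirmed on this page.

```python
import json

# One request object per line; custom_id / method / url / body follow the
# common batch request shape, which is an assumption here.
requests_data = [
    {
        "custom_id": f"request-{i}",
        "method": "POST",
        "url": "/v1/chat/completions",
        "body": {
            "model": "example-model",  # hypothetical model name
            "messages": [{"role": "user", "content": text}],
        },
    }
    for i, text in enumerate(["Hello!", "Summarize batching for me."], start=1)
]

# Serialize to JSONL: one compact JSON document per line.
jsonl_text = "\n".join(json.dumps(r) for r in requests_data) + "\n"
with open("batch_input.jsonl", "w") as f:
    f.write(jsonl_text)

# Step 2 would POST this file to the file-upload endpoint with
# purpose="batch"; the exact endpoint path and auth mechanism are
# not specified on this page, so the HTTP call is omitted.
print(len(jsonl_text.splitlines()))  # number of request lines
```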
Step 3: Create the Batch
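A minimal sketch of the create-batch request body. The parameter names (input_file_id, endpoint, completion_window) mirror the upload step and the 24-hour window described above, but the exact field names are assumptions.

```python
def create_batch_payload(input_file_id,
                         endpoint="/v1/chat/completions",
                         completion_window="24h"):
    """Build the JSON body for the Create Batch call.

    input_file_id: ID returned when the JSONL input file was uploaded.
    endpoint: one of the supported endpoints listed above.
    completion_window: the 24-hour window this page documents.
    """
    return {
        "input_file_id": input_file_id,
        "endpoint": endpoint,
        "completion_window": completion_window,
    }

payload = create_batch_payload("file-abc123")  # hypothetical file ID
print(payload)
```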
Step 4: Monitor Progress
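Monitoring can be sketched independently of any HTTP client by injecting a status-fetching function; `get_status` below is a stand-in for a real Get Batch call, and the terminal statuses come from the lifecycle table below.

```python
import time

# Terminal statuses from the Batch Status Lifecycle table.
TERMINAL_STATUSES = {"completed", "failed", "canceled", "expired"}

def wait_for_batch(get_status, poll_seconds=0, max_polls=1000):
    """Poll until the batch reaches a terminal status, then return it."""
    for _ in range(max_polls):
        status = get_status()
        if status in TERMINAL_STATUSES:
            return status
        time.sleep(poll_seconds)
    raise TimeoutError("batch did not reach a terminal status")

# Stub that walks through the documented lifecycle:
states = iter(["validating", "in_progress", "finalizing", "completed"])
print(wait_for_batch(lambda: next(states)))  # -> completed
```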
Step 5: Download Results
Once the batch status is completed, download and parse the output file.
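Parsing the downloaded output might look like this. The per-line shape (a custom_id plus a response object) is an assumed schema, based on the custom_id mapping advice in Best Practices below.

```python
import json

def index_results(output_jsonl_text):
    """Map each output line back to its custom_id for easy lookup."""
    results = {}
    for line in output_jsonl_text.splitlines():
        if not line.strip():
            continue  # tolerate blank lines
        record = json.loads(line)
        results[record["custom_id"]] = record.get("response")
    return results

# Illustrative output line (schema assumed, not confirmed by this page):
sample = '{"custom_id": "request-1", "response": {"status_code": 200}}\n'
print(index_results(sample)["request-1"]["status_code"])  # -> 200
```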
Batch Status Lifecycle
| Status | Description |
|---|---|
| validating | Input file is being validated |
| in_progress | Requests are being processed |
| finalizing | Results are being compiled |
| completed | Batch finished successfully |
| failed | Batch encountered an error |
| canceled | Batch was manually canceled |
| expired | Batch exceeded 24-hour window |
Tier Limitations
| Tier | Concurrent Batches | Status |
|---|---|---|
| Free | 0 | ❌ Not Available |
| Basic | 5 | ✅ Available |
| Pro | 10 | ✅ Available |
| Enterprise | Custom | ✅ Available |
Best Practices
Optimize Batch Size
Balance between batch size and completion time. Larger batches (1000+ requests) maximize cost savings, while smaller batches complete faster. Consider your use case requirements.
Use Custom IDs Effectively
Assign meaningful custom_id values to each request. This helps you map results back to your original data when processing output files.
Handle Partial Failures
Always check both output and error files. Some requests may succeed while others fail. Implement retry logic for failed requests if needed.
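One way to implement the retry advice above: collect the custom_ids that appear in the error file and re-emit only those requests into a fresh input file. The error-line schema here is an assumption.

```python
import json

def build_retry_input(original_requests, error_jsonl_text):
    """Return only the original request objects whose custom_id appears
    in the error file, ready to serialize into a new batch input JSONL."""
    failed_ids = {
        json.loads(line)["custom_id"]
        for line in error_jsonl_text.splitlines()
        if line.strip()
    }
    return [r for r in original_requests if r["custom_id"] in failed_ids]

original = [{"custom_id": "a"}, {"custom_id": "b"}, {"custom_id": "c"}]
# Illustrative error line (field names assumed):
errors = '{"custom_id": "b", "error": {"message": "timeout"}}'
print([r["custom_id"] for r in build_retry_input(original, errors)])  # -> ['b']
```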
Monitor Concurrent Limits
Stay within your tier's concurrent batch limit. Queue additional batches to start after current ones complete.
Set Appropriate Timeouts
All batches complete within 24 hours. For time-sensitive operations, consider using standard API calls instead.
Validate Input Format
Ensure your JSONL file is properly formatted before creating a batch. Invalid format will cause immediate failure during validation.
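A lightweight pre-flight check along these lines can catch invalid_jsonl failures before you create the batch. The required-field set below matches the request shape used in Quick Start and should be treated as an assumption.

```python
import json

# Assumed per-line required fields, matching the Quick Start request shape.
REQUIRED_FIELDS = {"custom_id", "method", "url", "body"}

def validate_jsonl(text):
    """Return a list of (line_number, problem) pairs; empty means valid."""
    problems = []
    for n, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # skip blank lines
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((n, f"invalid JSON: {exc.msg}"))
            continue
        missing = REQUIRED_FIELDS - record.keys()
        if missing:
            problems.append((n, f"missing fields: {sorted(missing)}"))
    return problems

good = '{"custom_id": "r1", "method": "POST", "url": "/v1/embeddings", "body": {}}'
print(validate_jsonl(good))            # -> []  (valid)
print(validate_jsonl('{"oops": 1}'))   # one problem reported
```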
Output File Format
Successful results are returned in JSONL format; failed requests are written to a separate error file in the same format.
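An illustrative shape for one success line and one error line. The exact field names are assumptions consistent with the custom_id mapping described above, not a confirmed schema.

```jsonl
{"custom_id": "request-1", "response": {"status_code": 200, "body": {"choices": [{"message": {"role": "assistant", "content": "Hello!"}}]}}}
{"custom_id": "request-2", "error": {"code": "invalid_request", "message": "example error message"}}
```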
Use Cases
Data Labeling
Classify or label thousands of text samples for ML training datasets
Content Moderation
Analyze large volumes of user-generated content for policy compliance
Translation
Translate documentation or content into multiple languages at scale
Sentiment Analysis
Process customer feedback, reviews, or social media posts in bulk
Embeddings Generation
Create vector embeddings for entire document collections or knowledge bases
Report Generation
Generate hundreds of personalized reports from structured data
Error Handling
Common batch errors and solutions:
| Error Code | Description | Solution |
|---|---|---|
| batch_api_not_allowed | Free tier restriction | Upgrade to Basic tier or higher |
| missing_required_field | Required parameter missing | Include all required fields in request |
| invalid_endpoint | Unsupported endpoint | Use supported endpoints only |
| invalid_input_file | File not found or wrong purpose | Ensure file exists and has purpose="batch" |
| invalid_jsonl | JSONL validation failed | Check file format for syntax errors |
| concurrent_batches_limit | Too many active batches | Wait for existing batches to complete |
| batch_not_found | Batch ID doesn't exist | Verify the batch_id is correct |
| cannot_cancel | Batch already finished | Can only cancel in-progress batches |
Rate Limits
Batch processing has different rate limits than standard API calls:
- Concurrent Jobs: Based on your tier (see table above)
- Requests per Batch: No hard limit, but 10,000 or fewer per batch is recommended for optimal performance
- File Size: Maximum 100MB per input file
- Processing Time: All batches complete within 24 hours
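To stay under the recommended 10,000-request ceiling and the 100MB input-file cap, a large request list can be split across several batches. The limits come from this page; the greedy packing logic is purely illustrative.

```python
import json

MAX_REQUESTS = 10_000           # recommended per-batch maximum
MAX_BYTES = 100 * 1024 * 1024   # 100MB input-file cap

def split_into_batches(requests_list, max_requests=MAX_REQUESTS,
                       max_bytes=MAX_BYTES):
    """Greedily pack serialized request lines into batch-sized chunks,
    starting a new chunk when either limit would be exceeded."""
    batches, current, current_bytes = [], [], 0
    for req in requests_list:
        line_bytes = len(json.dumps(req).encode()) + 1  # +1 for newline
        if current and (len(current) >= max_requests
                        or current_bytes + line_bytes > max_bytes):
            batches.append(current)
            current, current_bytes = [], 0
        current.append(req)
        current_bytes += line_bytes
    if current:
        batches.append(current)
    return batches

reqs = [{"custom_id": f"r{i}"} for i in range(25)]
print([len(b) for b in split_into_batches(reqs, max_requests=10)])  # -> [10, 10, 5]
```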