DeepSeek OCR

Optical Character Recognition

The DeepSeek OCR API extracts text from images with high accuracy. It accepts images as URLs or as base64-encoded data, which makes it suitable for document processing, receipt scanning, ID card extraction, and other text recognition tasks.

Key Features

High Accuracy: Advanced OCR model for precise text extraction
Flexible Input: Support for both URL and base64-encoded images
Fast Processing: Optimized for quick text extraction
Optional Image Return: Get back an annotated image with recognized text regions
OpenAI-Compatible: Standard API format for easy integration

API Endpoint

POST https://api.neosantara.xyz/v1/ocr

Authentication

Include your API key in the request header:

Authorization: Bearer YOUR_API_KEY

First Request Latency: The initial OCR request may experience a 10-15 second delay due to cold start initialization. Subsequent requests will process much faster (2-5 seconds). If you receive a timeout error on your first request, please retry after a few seconds.
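
Because a cold start can cause the first call to time out, one option is to wrap the request in a small retry helper. The following is a minimal sketch, not part of the API: `call_with_retry`, its parameters, and the default attempt count and delay are all hypothetical choices to adjust for your application.

```python
import time


def call_with_retry(send, attempts=3, delay_seconds=3.0,
                    retry_on=(TimeoutError,), sleep=time.sleep):
    """Call send() and retry when it raises one of the retry_on exceptions.

    Hypothetical helper: the name, defaults, and retry policy are
    assumptions for illustration, not part of the OCR API.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return send()
        except retry_on:
            if attempt == attempts:
                raise  # Out of attempts: surface the last timeout
            sleep(delay_seconds)  # Give the cold start time to finish
```

With the `requests` library, this could be called as `call_with_retry(lambda: requests.post(url, json=payload, headers=headers, timeout=30), retry_on=(requests.exceptions.Timeout,))`.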

Request Parameters

model

string

default:"deepseek-ocr"

The OCR model to use. Currently supports deepseek-ocr.

image

string

required

The image to process. Can be either:

A publicly accessible image URL (when image_type is "url")
Base64-encoded image data (when image_type is "base64")

image_type

string

default:"url"

The type of image data being provided. Must be either:

"url" - Image is provided as a URL
"base64" - Image is provided as base64-encoded data

return_image

boolean

default:false

Whether to return an annotated image with recognized text regions. When set to true, the response includes a result_image_base64 field containing the processed image.
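
The parameters above can be assembled and checked on the client before sending. This is a sketch under the defaults documented here; `build_ocr_payload` is a hypothetical helper name, and the client-side checks are an assumption about what is useful to catch early, not a description of the server's validation.

```python
def build_ocr_payload(image, image_type="url", return_image=False,
                      model="deepseek-ocr"):
    """Build the JSON body for POST /v1/ocr using the documented defaults.

    Hypothetical helper: the checks mirror the parameter descriptions,
    but the server remains the authority on what is valid.
    """
    if not isinstance(image, str) or not image:
        raise ValueError("image is required and must be a non-empty string")
    if image_type not in ("url", "base64"):
        raise ValueError('image_type must be "url" or "base64"')
    if not isinstance(return_image, bool):
        raise ValueError("return_image must be a boolean")
    return {
        "model": model,
        "image": image,
        "image_type": image_type,
        "return_image": return_image,
    }


payload = build_ocr_payload("https://example.com/scanned-document.jpg")
print(payload["image_type"])
```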

Response Format

The API returns a JSON response in OpenAI-compatible format with the following structure:

{
  "id": "ocr-abc123def456",
  "object": "ocr.completion",
  "created": 1234567890,
  "model": "deepseek-ocr",
  "choices": [
    {
      "index": 0,
      "text": "The recognized text from your image...",
      "finish_reason": "stop"
    }
  ],
  "result_image_base64": "iVBORw0KG...", // Only if return_image is true
  "usage": {
    "prompt_tokens": 50,
    "completion_tokens": 0,
    "total_tokens": 50
  }
}

Response Fields

id: Unique identifier for this OCR request
object: Type of response object, always "ocr.completion"
created: Unix timestamp of when the request was processed
model: The model used for OCR processing (e.g., "deepseek-ocr")
choices: Array containing the OCR results
- index: Choice index, always 0 for OCR
- text: The text extracted from the image
- finish_reason: Completion status, typically "stop"
result_image_base64: Base64-encoded annotated image (only included if return_image is true)
usage: Token usage information for billing
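
The fields listed above can be pulled out of a parsed response as follows. This is an illustrative sketch: `summarize_ocr_response` is a hypothetical helper, and the sample dictionary simply mirrors the example response shown earlier, without the optional annotated image.

```python
import base64


def summarize_ocr_response(result):
    """Collect the commonly used fields from a parsed OCR response.

    Hypothetical helper; it assumes the response shape documented above.
    """
    choice = result["choices"][0]  # OCR responses carry a single choice
    encoded_image = result.get("result_image_base64")  # Only with return_image
    return {
        "text": choice["text"],
        "finish_reason": choice.get("finish_reason"),
        "total_tokens": result.get("usage", {}).get("total_tokens"),
        "annotated_image": (
            base64.b64decode(encoded_image) if encoded_image else None
        ),
    }


sample = {
    "id": "ocr-abc123def456",
    "object": "ocr.completion",
    "created": 1234567890,
    "model": "deepseek-ocr",
    "choices": [
        {"index": 0, "text": "The recognized text from your image...",
         "finish_reason": "stop"}
    ],
    "usage": {"prompt_tokens": 50, "completion_tokens": 0, "total_tokens": 50},
}
summary = summarize_ocr_response(sample)
print(summary["text"], summary["total_tokens"])
```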

Quick Start Testing

Want to try the OCR API immediately? Use this test image URL to extract text from a sample receipt:

Test Image URL: https://res.cloudinary.com/dcwnn9c0u/image/upload/v1762880072/gh1nwfzslf4glu2cde9z.jpg

This sample image is a receipt from Mandiri BNI. You can use it to test the API without needing your own image. Expected result: merchant info, transaction details, amounts, and date/time.

Quick Test with Curl

curl -X POST https://api.neosantara.xyz/v1/ocr \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepseek-ocr",
    "image": "https://res.cloudinary.com/dcwnn9c0u/image/upload/v1762880072/gh1nwfzslf4glu2cde9z.jpg",
    "image_type": "url",
    "return_image": false
  }'

Remember: The first request may take 10-15 seconds due to cold start. If you get a timeout, wait a few seconds and try again.

Code Examples

import requests
import base64

# Example 1: OCR from URL
def ocr_from_url(image_url, api_key):
    url = "https://api.neosantara.xyz/v1/ocr"

    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }

    payload = {
        "model": "deepseek-ocr",
        "image": image_url,
        "image_type": "url",
        "return_image": False
    }

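    # Set a timeout that leaves room for the 10-15 second cold start
    response = requests.post(url, json=payload, headers=headers, timeout=60)
    response.raise_for_status()  # Raise on HTTP error responses
    result = response.json()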

    print("Extracted Text:", result["choices"][0]["text"])
    return result

# Example 2: OCR from base64 image
def ocr_from_base64(image_path, api_key):
    url = "https://api.neosantara.xyz/v1/ocr"

    # Read and encode image
    with open(image_path, "rb") as image_file:
        image_base64 = base64.b64encode(image_file.read()).decode('utf-8')

    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }

    payload = {
        "model": "deepseek-ocr",
        "image": image_base64,
        "image_type": "base64",
        "return_image": True  # Get annotated image back
    }

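    # Set a timeout that leaves room for the 10-15 second cold start
    response = requests.post(url, json=payload, headers=headers, timeout=60)
    response.raise_for_status()  # Raise on HTTP error responses
    result = response.json()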

    print("Extracted Text:", result["choices"][0]["text"])

    # Save annotated image if returned
    if "result_image_base64" in result:
        with open("annotated_image.png", "wb") as f:
            f.write(base64.b64decode(result["result_image_base64"]))
        print("Annotated image saved as 'annotated_image.png'")

    return result

# Usage
api_key = "your_api_key_here"

# OCR from URL (using test image)
result = ocr_from_url(
    "https://res.cloudinary.com/dcwnn9c0u/image/upload/v1762880072/gh1nwfzslf4glu2cde9z.jpg",
    api_key
)

# OCR from local file
result = ocr_from_base64("./document.png", api_key)

Use Cases

Document Processing

Extract text from scanned documents, PDFs converted to images, or photographed papers for digital archival and processing.

# Process a scanned document
result = ocr_from_url(
    "https://example.com/scanned-document.jpg",
    api_key
)
print("Document Text:", result["choices"][0]["text"])

Receipt Scanning

Automatically extract transaction details, amounts, and merchant information from receipt images for expense tracking.

# Extract receipt information (using test receipt)
result = ocr_from_url(
    "https://res.cloudinary.com/dcwnn9c0u/image/upload/v1762880072/gh1nwfzslf4glu2cde9z.jpg",
    api_key
)
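# Parse extracted text for specific fields
# (amount, merchant, date, etc.)
# Note: the values printed below are hard-coded for illustration,
# not parsed from result["choices"][0]["text"]
print("Merchant:", "DUNNY MANDIRI - BNI OBP")
print("Amount:", "Rp 100.000")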
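
Parsing fields out of the extracted text depends on how the receipt is laid out, so any parser is specific to the documents you process. As one hedged illustration, the sketch below pulls Rupiah amounts written like "Rp 100.000" out of a string. `extract_rupiah_amounts` and its pattern are hypothetical, and they assume a dot is used as the thousands separator.

```python
import re

# Assumed format: "Rp" followed by digits, with "." as thousands separator
_RUPIAH = re.compile(r"\bRp\.?\s*(\d{1,3}(?:\.\d{3})+|\d+)", re.IGNORECASE)


def extract_rupiah_amounts(text):
    """Return every Rupiah amount found in text as an integer.

    Hypothetical helper; real receipts may need a different pattern.
    """
    return [int(match.replace(".", "")) for match in _RUPIAH.findall(text)]


print(extract_rupiah_amounts("Amount: Rp 100.000"))
```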

ID Card & Document Verification

Extract information from ID cards, passports, and other identity documents for verification processes.

# Extract ID card information
result = ocr_from_base64("./id-card.jpg", api_key)
# Parse extracted text for name, ID number, etc.

Image Annotation

Use the return_image parameter to get back an annotated image showing detected text regions, useful for debugging and visualization.

# Get annotated image with text regions (using test receipt)
import requests

url = "https://api.neosantara.xyz/v1/ocr"
headers = {
    "Authorization": f"Bearer {api_key}",
    "Content-Type": "application/json"
}
payload = {
    "model": "deepseek-ocr",
    "image": "https://res.cloudinary.com/dcwnn9c0u/image/upload/v1762880072/gh1nwfzslf4glu2cde9z.jpg",
    "image_type": "url",
    "return_image": True  # Request annotated image
}

result = requests.post(url, json=payload, headers=headers, timeout=60).json()
# result["result_image_base64"] contains the annotated image
print("Has annotated image:", "result_image_base64" in result)

Error Handling

The API returns structured error responses for invalid requests:

{
  "error": {
    "message": "Detailed error message",
    "type": "invalid_request_error",
    "param": "image",
    "code": "parameter_missing"
  }
}

Common Error Codes

model_not_found: The specified model does not exist
unsupported_capability: The model does not support OCR
parameter_missing: Required parameter is missing
invalid_parameter: Parameter value is invalid
invalid_parameter_type: Parameter has wrong data type
invalid_url_format: Image URL is malformed
invalid_url_protocol: Image URL must use HTTP/HTTPS
invalid_base64_format: Base64 data is not valid
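
The codes above can be turned into readable messages on the client. This sketch is illustrative only: `explain_error` and `ERROR_HINTS` are hypothetical names, and the table simply restates the descriptions from the list above.

```python
# Descriptions copied from the error code list above
ERROR_HINTS = {
    "model_not_found": "The specified model does not exist",
    "unsupported_capability": "The model does not support OCR",
    "parameter_missing": "Required parameter is missing",
    "invalid_parameter": "Parameter value is invalid",
    "invalid_parameter_type": "Parameter has wrong data type",
    "invalid_url_format": "Image URL is malformed",
    "invalid_url_protocol": "Image URL must use HTTP/HTTPS",
    "invalid_base64_format": "Base64 data is not valid",
}


def explain_error(error_body):
    """Format a structured error response as a one-line message.

    Hypothetical helper; it assumes the error shape shown above.
    """
    error = error_body.get("error", {})
    code = error.get("code", "unknown")
    hint = ERROR_HINTS.get(code, "Unrecognized error code")
    param = error.get("param")
    location = f" ({param})" if param else ""
    return f"{code}{location}: {hint}"


sample_error = {
    "error": {
        "message": "Detailed error message",
        "type": "invalid_request_error",
        "param": "image",
        "code": "parameter_missing",
    }
}
print(explain_error(sample_error))
```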

Error Handling Example

import requests

def ocr_with_error_handling(image, api_key, image_type="url"):
    url = "https://api.neosantara.xyz/v1/ocr"

    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json"
    }

    payload = {
        "model": "deepseek-ocr",
        "image": image,
        "image_type": image_type,
        "return_image": False
    }

    try:
        # The timeout leaves room for the 10-15 second cold start
        response = requests.post(url, json=payload, headers=headers, timeout=60)

        # Check for HTTP errors
        if response.status_code != 200:
            error_data = response.json()
            error = error_data.get("error", {})
            print(f"Error: {error.get('message')}")
            print(f"Type: {error.get('type')}")
            print(f"Code: {error.get('code')}")
            return None

        result = response.json()
        return result["choices"][0]["text"]

    except requests.exceptions.RequestException as e:
        print(f"Request failed: {str(e)}")
        return None

# Usage
text = ocr_with_error_handling(
    "https://example.com/document.jpg",
    "your_api_key_here"
)

if text:
    print("Extracted:", text)

Pricing

The DeepSeek OCR API uses token-based pricing:

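Per Image Processing: A fixed charge per image, stated as 100 tokens equivalent (approximately 50 tokens); the usage field of each response reports the amount actually counted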
Token Cost: Charged according to your Neosantara AI pricing tier

Token usage is reported in the usage field of the response.

The return_image parameter does not affect token usage. Annotated images are provided at no additional cost.
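
Since each response reports its own usage, total consumption over a batch can be tallied from the responses themselves. A minimal sketch, assuming the response shape documented above; `total_tokens_used` is a hypothetical helper name.

```python
def total_tokens_used(results):
    """Sum usage.total_tokens over a list of parsed OCR responses.

    Hypothetical helper; responses without a usage field count as 0.
    """
    return sum(
        result.get("usage", {}).get("total_tokens", 0) for result in results
    )


batch = [
    {"usage": {"prompt_tokens": 50, "completion_tokens": 0, "total_tokens": 50}},
    {"usage": {"prompt_tokens": 50, "completion_tokens": 0, "total_tokens": 50}},
]
print(total_tokens_used(batch))
```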

Best Practices

First Request Latency: The first OCR request may take 10-15 seconds due to cold start initialization. Subsequent requests will be significantly faster (2-5 seconds). If you encounter a timeout on the first request, retry after a few seconds.
Image Quality: Use high-resolution, clear images for best results
Image Format: Supported formats include JPEG, PNG, and other common image formats
URL Accessibility: Ensure image URLs are publicly accessible when using image_type: "url"
Base64 Size: Base64 encoding increases payload size by roughly one third, so be mindful of request size when sending large images
Error Handling: Always implement proper error handling in your application, including retry logic for timeout errors
Rate Limits: Respect API rate limits according to your tier
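
For the base64 size point above, the encoded length can be computed before encoding: standard base64 emits 4 output characters for every 3 input bytes, with padding. The sketch below is illustrative, and `base64_encoded_size` is a hypothetical helper name. This page does not state a maximum request size, so any threshold you compare against is your own assumption.

```python
import base64
import math


def base64_encoded_size(raw_size_bytes):
    """Return the length of the padded base64 encoding of raw_size_bytes bytes."""
    return 4 * math.ceil(raw_size_bytes / 3)


raw = b"x" * 1000
# The estimate matches the real encoded length
print(base64_encoded_size(len(raw)), len(base64.b64encode(raw)))
```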

Rate Limits

Rate limits vary by your Neosantara AI subscription tier. Check your tier limits in the dashboard.

Upgrade your tier for higher rate limits and additional features. Contact support@neosantara.xyz for enterprise solutions.

Support

Need help with the OCR API? We are here to assist:

Email: support@neosantara.xyz
Documentation: API Reference
Community: Join our Discord community

Next Steps

Chat Completions API

Learn about our Chat Completions API for conversational AI

Error Codes

Complete reference of API error codes

Rate Limits

Understanding rate limits and quotas
