Create Job
Submit a document for parsing.
POST /v1/jobs
Request Body
| Parameter | Type | Required | Description |
|---|---|---|---|
| source_type | string | Yes | Document source type: "url" or "file". |
| source_url | string | Conditional | Required if source_type is "url". URL of the document. |
| file_name | string | Conditional | Required if source_type is "file". Name of the file, including its extension. |
| data_id | string | No | Your custom identifier for this document. Included in results for easy mapping. Max 128 characters; alphanumeric plus "-", "_", and ".". |
| parsing_params | object | No | Parsing configuration options. |
| parsing_params.model | string | No | Model to use: "base" (default) or "advanced". Different credit costs apply. |
| parsing_params.ocr_enabled | boolean | No | Enable OCR for scanned documents or images. Default: false. |
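The conditional rules in the table can be checked on the client before a request is sent. The following is a minimal sketch, not part of the API: `build_job_request` is a hypothetical helper, and it assumes that "alphanumeric" means ASCII letters and digits and that an empty data_id is not allowed.

```python
import re

# Constraints taken from the parameter table. Assumptions: "alphanumeric"
# means ASCII letters and digits, and data_id must be at least 1 character.
DATA_ID_PATTERN = re.compile(r"[A-Za-z0-9._-]{1,128}")
VALID_MODELS = {"base", "advanced"}


def build_job_request(source_type, source_url=None, file_name=None,
                      data_id=None, model=None, ocr_enabled=None):
    """Return a request body for POST /v1/jobs, or raise ValueError."""
    if source_type not in ("url", "file"):
        raise ValueError('source_type must be "url" or "file"')

    body = {"source_type": source_type}
    if source_type == "url":
        if not source_url:
            raise ValueError('source_url is required when source_type is "url"')
        body["source_url"] = source_url
    else:
        if not file_name:
            raise ValueError('file_name is required when source_type is "file"')
        body["file_name"] = file_name

    if data_id is not None:
        if not DATA_ID_PATTERN.fullmatch(data_id):
            raise ValueError("data_id must be 1-128 chars: letters, digits, -, _, .")
        body["data_id"] = data_id

    # Only send parsing_params when at least one option was set explicitly,
    # so the server-side defaults ("base", OCR off) apply otherwise.
    parsing_params = {}
    if model is not None:
        if model not in VALID_MODELS:
            raise ValueError('model must be "base" or "advanced"')
        parsing_params["model"] = model
    if ocr_enabled is not None:
        parsing_params["ocr_enabled"] = bool(ocr_enabled)
    if parsing_params:
        body["parsing_params"] = parsing_params
    return body
```

The returned dictionary can be passed directly as the `json=` argument of `requests.post`.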
Response
For source_type: "url"
Status: 202 Accepted
The job is immediately queued for processing.
{
  "job_id": "job_abc123def456",
  "status": "pending",
  "source_type": "url",
  "data_id": "my_document_001",
  "created_at": "2025-01-15T10:30:00Z"
}
For source_type: "file"
Status: 200 OK
Returns an upload URL for the file.
{
  "job_id": "job_xyz789ghi012",
  "status": "waiting-file",
  "source_type": "file",
  "data_id": "my_document_002",
  "upload_url": "https://storage.knowhereto.ai/uploads/job_xyz789ghi012?X-Amz-...",
  "upload_headers": {
    "Content-Type": "application/pdf"
  },
  "created_at": "2025-01-15T10:30:00Z"
}
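Because the two source types return different status codes and payloads, client code usually branches on the response. A minimal sketch of that branching, assuming only the fields shown in the two sample responses; `next_step` and its return tuples are hypothetical names, not part of the API:

```python
def next_step(status_code, job):
    """Decide what the client should do after POST /v1/jobs.

    Returns ("poll", job_id) for a queued URL job, or
    ("upload", upload_url, upload_headers) for a file job awaiting upload.
    """
    status = job.get("status")
    if status_code == 202 and status == "pending":
        # URL source: the job is already queued, so only polling remains.
        return ("poll", job["job_id"])
    if status_code == 200 and status == "waiting-file":
        # File source: the file must be uploaded before processing starts.
        return ("upload", job["upload_url"], job.get("upload_headers", {}))
    raise ValueError(f"unexpected response: HTTP {status_code}, status={status!r}")
```

Raising on any other combination keeps an unexpected response from being silently treated as a queued job.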
Examples
Parse Document from URL
cURL
curl -X POST https://api.knowhereto.ai/v1/jobs \
  -H "Authorization: Bearer $KNOWHERE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source_type": "url",
    "source_url": "https://arxiv.org/pdf/1706.03762.pdf",
    "data_id": "attention_paper",
    "parsing_params": {
      "model": "base",
      "ocr_enabled": false
    }
  }'
Python
import os

import requests

# Read the API key from the environment, matching the cURL example.
API_KEY = os.environ["KNOWHERE_API_KEY"]

response = requests.post(
    "https://api.knowhereto.ai/v1/jobs",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    },
    json={
        "source_type": "url",
        "source_url": "https://arxiv.org/pdf/1706.03762.pdf",
        "data_id": "attention_paper",
        "parsing_params": {
            "model": "base",
            "ocr_enabled": False
        }
    }
)
job = response.json()
print(f"Job ID: {job['job_id']}")
print(f"Status: {job['status']}")
Node.js
// Read the API key from the environment, matching the cURL example.
const API_KEY = process.env.KNOWHERE_API_KEY;

const response = await fetch('https://api.knowhereto.ai/v1/jobs', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${API_KEY}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    source_type: 'url',
    source_url: 'https://arxiv.org/pdf/1706.03762.pdf',
    data_id: 'attention_paper',
    parsing_params: {
      model: 'base',
      ocr_enabled: false
    }
  })
});
const job = await response.json();
console.log(`Job ID: ${job.job_id}`);
console.log(`Status: ${job.status}`);
Upload Local File
cURL
# Step 1: Create job
curl -X POST https://api.knowhereto.ai/v1/jobs \
  -H "Authorization: Bearer $KNOWHERE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source_type": "file",
    "file_name": "quarterly_report.pdf",
    "data_id": "q3_2025_report"
  }'
# Response:
# {
#   "job_id": "job_xyz789",
#   "status": "waiting-file",
#   "upload_url": "https://storage.knowhereto.ai/...",
#   "upload_headers": {"Content-Type": "application/pdf"}
# }

# Step 2: Upload the file to the returned upload_url, sending the returned upload_headers
curl -X PUT "https://storage.knowhereto.ai/..." \
  -H "Content-Type: application/pdf" \
  --data-binary @quarterly_report.pdf
Python
import os

import requests

# Read the API key from the environment, matching the cURL example.
API_KEY = os.environ["KNOWHERE_API_KEY"]

# Step 1: Create job and get upload URL
response = requests.post(
    "https://api.knowhereto.ai/v1/jobs",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json"
    },
    json={
        "source_type": "file",
        "file_name": "quarterly_report.pdf",
        "data_id": "q3_2025_report"
    }
)
job = response.json()
print(f"Job ID: {job['job_id']}")

# Step 2: Upload file to presigned URL
with open("quarterly_report.pdf", "rb") as f:
    upload_response = requests.put(
        job["upload_url"],
        headers=job.get("upload_headers", {}),
        data=f.read()
    )

if upload_response.status_code in [200, 204]:
    print("Upload successful!")
else:
    print(f"Upload failed: {upload_response.status_code}")
Node.js
import fs from 'fs';

// Read the API key from the environment, matching the cURL example.
const API_KEY = process.env.KNOWHERE_API_KEY;

// Step 1: Create job and get upload URL
const response = await fetch('https://api.knowhereto.ai/v1/jobs', {
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${API_KEY}`,
    'Content-Type': 'application/json'
  },
  body: JSON.stringify({
    source_type: 'file',
    file_name: 'quarterly_report.pdf',
    data_id: 'q3_2025_report'
  })
});
const job = await response.json();
console.log(`Job ID: ${job.job_id}`);

// Step 2: Upload file to presigned URL
const fileBuffer = fs.readFileSync('quarterly_report.pdf');
const uploadResponse = await fetch(job.upload_url, {
  method: 'PUT',
  headers: job.upload_headers || {},
  body: fileBuffer
});
if (uploadResponse.ok) {
  console.log('Upload successful!');
} else {
  console.log(`Upload failed: ${uploadResponse.status}`);
}
With Advanced Options
curl -X POST https://api.knowhereto.ai/v1/jobs \
  -H "Authorization: Bearer $KNOWHERE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source_type": "url",
    "source_url": "https://example.com/scanned_document.pdf",
    "data_id": "scanned_001",
    "parsing_params": {
      "model": "advanced",
      "ocr_enabled": true
    }
  }'
Errors
| Code | HTTP Status | Description |
|---|---|---|
| INVALID_ARGUMENT | 400 | Missing or invalid parameters |
| UNAUTHENTICATED | 401 | Invalid API key |
| RESOURCE_EXHAUSTED | 429 | Rate limit exceeded |
Example Error Response
{
  "success": false,
  "error": {
    "code": "INVALID_ARGUMENT",
    "message": "source_url must be a valid HTTP or HTTPS URL",
    "request_id": "req_abc123",
    "details": {
      "field": "source_url",
      "reason": "INVALID_URL_FORMAT"
    }
  }
}
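An error payload of this shape can be turned into a typed exception so callers can log the request_id and decide whether to retry. This is a sketch under assumptions: `JobApiError` and `parse_error` are hypothetical names, "UNKNOWN" is a placeholder for a missing code, and treating only RESOURCE_EXHAUSTED as retryable is a client-side choice rather than documented API behavior.

```python
# Assumption: only rate-limit errors are worth retrying automatically.
RETRYABLE_CODES = {"RESOURCE_EXHAUSTED"}


class JobApiError(Exception):
    """Wraps the "error" object of a failed POST /v1/jobs response."""

    def __init__(self, http_status, code, message, request_id=None, details=None):
        super().__init__(f"{code} (HTTP {http_status}): {message}")
        self.http_status = http_status
        self.code = code
        self.message = message
        self.request_id = request_id
        self.details = details or {}

    @property
    def retryable(self):
        return self.code in RETRYABLE_CODES


def parse_error(http_status, payload):
    """Build a JobApiError from a decoded JSON error body."""
    err = payload.get("error") or {}
    return JobApiError(
        http_status,
        err.get("code", "UNKNOWN"),  # placeholder when the body has no code
        err.get("message", ""),
        err.get("request_id"),
        err.get("details"),
    )
```

In practice a caller would invoke `parse_error(response.status_code, response.json())` whenever the status code is 400 or above, then raise the result or back off and retry when `retryable` is true.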
Supported File Formats
| Format | Extension | Max Size |
|---|---|---|
| PDF | .pdf | 100 MB |
| Word | .docx | 50 MB |
| Excel | .xlsx | 50 MB |
| PowerPoint | .pptx | 100 MB |
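These limits can be checked locally before a job is created, which avoids creating a job for a file that cannot be parsed. A minimal sketch, assuming 1 MB means 1024 × 1024 bytes, that a file exactly at the limit is accepted, and that extensions are matched case-insensitively; none of these details are stated in the table, and `check_file` is a hypothetical helper.

```python
import os

# Limits copied from the table above, in megabytes.
MAX_SIZE_MB = {".pdf": 100, ".docx": 50, ".xlsx": 50, ".pptx": 100}


def check_file(file_name, size_bytes):
    """Return the normalized extension, or raise ValueError if unsupported."""
    ext = os.path.splitext(file_name)[1].lower()
    if ext not in MAX_SIZE_MB:
        raise ValueError(f"unsupported file format: {ext or '(no extension)'}")
    # Assumption: 1 MB = 1024 * 1024 bytes, limit inclusive.
    limit_bytes = MAX_SIZE_MB[ext] * 1024 * 1024
    if size_bytes > limit_bytes:
        raise ValueError(f"{file_name} exceeds the {MAX_SIZE_MB[ext]} MB limit for {ext}")
    return ext
```

For a file on disk, the size argument can come from `os.path.getsize(path)`.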
Next Steps
- Get Job - Check job status and retrieve results
- File Upload Guide - Detailed upload instructions