Quick Start

Parse your first document with the Knowhere API in under 5 minutes.

Prerequisites

  • A Knowhere API key
  • A document to parse (PDF, DOCX, XLSX, or PPTX)

Step 1: Set Your API Key

export KNOWHERE_API_KEY="your_api_key_here"

Step 2: Create a Parsing Job

You have two options for submitting documents:

Option A: Parse from URL

If your document is publicly accessible:

curl -X POST https://api.knowhereto.ai/v1/jobs \
  -H "Authorization: Bearer $KNOWHERE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source_type": "url",
    "source_url": "https://example.com/document.pdf"
  }'
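
The same request can be sketched in Python using only the standard library. This is a minimal illustration, not an official client: the helper names build_url_job_request and send_json are hypothetical, and the shape of the JSON response is not documented in this guide, so it is returned as-is.

```python
import json
import urllib.request

API_BASE = "https://api.knowhereto.ai/v1"  # base URL taken from the curl example


def build_url_job_request(api_key: str, source_url: str) -> urllib.request.Request:
    """Build (but do not send) the POST /jobs request for a URL source."""
    body = json.dumps({"source_type": "url", "source_url": source_url})
    return urllib.request.Request(
        f"{API_BASE}/jobs",
        data=body.encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def send_json(request: urllib.request.Request) -> dict:
    """Send a request and decode the JSON body of the response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

To submit the job, pass the key from the KNOWHERE_API_KEY environment variable to build_url_job_request and hand the result to send_json. Keeping request construction separate from sending makes it possible to inspect the request without touching the network.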

Option B: Upload a Local File

For local files, first create a job to get an upload URL:

# First, create the job; the response includes upload_url and upload_headers
curl -X POST https://api.knowhereto.ai/v1/jobs \
  -H "Authorization: Bearer $KNOWHERE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "source_type": "file",
    "file_name": "document.pdf"
  }'

# Then upload the file to upload_url, along with any headers listed in upload_headers
# (Content-Type: application/pdf is shown here as an example)
curl -X PUT "UPLOAD_URL_FROM_RESPONSE" \
  -H "Content-Type: application/pdf" \
  --data-binary @document.pdf
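
The upload half of this flow might look like the following in Python. It is a sketch under one loud assumption: that upload_headers in the job response is a JSON object mapping header names to values. The guide names the field but does not show its shape. The function names are hypothetical.

```python
import urllib.request


def build_upload_request(
    upload_url: str, upload_headers: dict, payload: bytes
) -> urllib.request.Request:
    """Build the PUT request that sends the raw file bytes to the upload URL."""
    return urllib.request.Request(
        upload_url,
        data=payload,
        method="PUT",
        # ASSUMPTION: upload_headers is a {name: value} mapping.
        headers=dict(upload_headers),
    )


def upload_file(job_response: dict, path: str) -> int:
    """Read a local file and PUT it to the URL from the job response."""
    with open(path, "rb") as handle:
        payload = handle.read()
    request = build_upload_request(
        job_response["upload_url"],
        job_response.get("upload_headers", {}),
        payload,
    )
    with urllib.request.urlopen(request) as response:
        return response.status  # HTTP status code of the upload
```

Passing the returned headers through unchanged, rather than hard-coding a Content-Type, avoids a mismatch if the upload URL expects specific header values.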

Step 3: Poll for Results

# Replace JOB_ID with the ID of the job created in Step 2; repeat until the job is complete
curl https://api.knowhereto.ai/v1/jobs/JOB_ID \
  -H "Authorization: Bearer $KNOWHERE_API_KEY"
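
A polling loop around this request could be sketched as below. This guide does not name the status field or its values, so "status", "done", and "failed" are placeholders to adjust against a real response. fetch_job and wait_for_job are hypothetical helper names.

```python
import json
import time
import urllib.request

API_BASE = "https://api.knowhereto.ai/v1"  # base URL taken from the curl example

# ASSUMPTION: field name and values are placeholders, not documented here.
STATUS_FIELD = "status"
DONE_VALUES = {"done"}
FAILED_VALUES = {"failed"}


def fetch_job(job_id: str, api_key: str) -> dict:
    """GET /jobs/{job_id} and decode the JSON response."""
    request = urllib.request.Request(
        f"{API_BASE}/jobs/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def wait_for_job(get_job, interval_seconds=2.0, max_attempts=30, sleep=time.sleep):
    """Call get_job() until the job finishes, fails, or attempts run out."""
    for attempt in range(1, max_attempts + 1):
        job = get_job()
        state = job.get(STATUS_FIELD)
        if state in DONE_VALUES:
            return job
        if state in FAILED_VALUES:
            raise RuntimeError(f"job failed: {job}")
        if attempt < max_attempts:
            sleep(interval_seconds)  # wait before the next poll
    raise TimeoutError(f"job not finished after {max_attempts} attempts")
```

A typical call would be wait_for_job(lambda: fetch_job("JOB_ID", api_key)). Taking the fetch function as an argument keeps the loop independent of the HTTP layer, and the attempt cap prevents polling forever if a job stalls.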

Step 4: Download and Use Results

Once the job is complete, download the ZIP file from result_url:

# Download the result ZIP
curl -o result.zip "RESULT_URL_FROM_RESPONSE"

# Extract
unzip result.zip -d result/

# View the first chunk
jq '.chunks[0]' result/chunks.json
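
The download-and-inspect step can also be done without unzip or jq. The sketch below reads chunks.json straight out of the ZIP bytes and returns the list under its "chunks" key, the same key the jq filter uses. The function names are hypothetical, and the fields inside each chunk are not described in this guide, so none are assumed.

```python
import io
import json
import urllib.request
import zipfile


def download_bytes(url: str) -> bytes:
    """Fetch the result ZIP from result_url into memory."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def load_chunks(zip_bytes: bytes) -> list:
    """Return the "chunks" list from chunks.json inside the result ZIP."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        with archive.open("chunks.json") as handle:
            document = json.load(handle)
    return document["chunks"]
```

For example, load_chunks(download_bytes(result_url))[0] gives the same first chunk that the jq command prints.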

Result Structure

The downloaded ZIP contains:

result.zip
├── manifest.json # Metadata and file index
├── chunks.json # All chunks with content and metadata
├── content.md # Full document as Markdown (optional)
├── images/ # Extracted images
└── tables/ # Extracted tables as HTML
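
Because content.md is optional, it can be useful to check which parts of this layout a given archive actually contains before reading from it. The following is a small sketch with a hypothetical function name. It counts files under each directory entry and reports 1 or 0 for each top-level file.

```python
import io
import zipfile

# Entries taken from the tree above; a trailing slash marks a directory.
EXPECTED_ENTRIES = ("manifest.json", "chunks.json", "content.md", "images/", "tables/")


def summarize_result(zip_bytes: bytes) -> dict:
    """Map each expected entry to the number of matching files in the ZIP."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        names = archive.namelist()
    summary = {}
    for entry in EXPECTED_ENTRIES:
        if entry.endswith("/"):
            # Count files below the directory, skipping the directory entry itself.
            summary[entry] = sum(
                1 for name in names if name.startswith(entry) and name != entry
            )
        else:
            summary[entry] = int(entry in names)
    return summary
```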

See Result Handling for detailed documentation on the result format.

Next Steps