Knowhere API

Knowhere API is a Chunking-as-a-Service platform that transforms complex, unstructured documents into structured, RAG-ready data. The API extracts and preserves the semantic structure of your documents (paragraphs, tables, headings, and images) and delivers high-quality chunks intended to improve your RAG application's retrieval performance.

Core Value Proposition

Extract structure, not just text.

Traditional document parsing tools output plain text streams, losing critical semantic information. Knowhere API preserves the document's logical hierarchy, relationships between content blocks, and multi-modal elements (tables, images), enabling superior chunking and embedding quality.

Key Features

Intelligent Document Parsing

  • Structure Preservation: Maintains document hierarchy, headings, and logical sections
  • Multi-Modal Support: Extracts and processes text, tables, and images
  • RAG-Optimized Output: Delivers chunks with metadata, relationships, and semantic context

Supported File Formats

Format      Extension  Description
PDF         .pdf       Adobe Portable Document Format
Word        .docx      Microsoft Word documents
Excel       .xlsx      Microsoft Excel spreadsheets
PowerPoint  .pptx      Microsoft PowerPoint presentations
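
Before submitting a document, a client may want to check its extension against the supported formats listed above. The following is a minimal sketch of such a check. The function name `detect_format` and the lookup table are illustrative helpers, not part of the Knowhere API, and the check is case-insensitive by assumption.

```python
from pathlib import Path

# Extensions taken from the supported formats table above.
SUPPORTED_EXTENSIONS = {
    ".pdf": "PDF",
    ".docx": "Word",
    ".xlsx": "Excel",
    ".pptx": "PowerPoint",
}


def detect_format(filename: str) -> str:
    """Return the format name for a file, or raise ValueError if unsupported.

    Hypothetical client-side helper; the API may apply its own validation.
    """
    ext = Path(filename).suffix.lower()
    if ext not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported file extension: {ext or '(none)'}")
    return SUPPORTED_EXTENSIONS[ext]
```

Rejecting unsupported files locally avoids creating a job that cannot be parsed.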

Developer Experience

  • Simple REST API: Two endpoints to handle all your parsing needs
  • Async Processing: Submit jobs and retrieve results when ready
  • Multiple Language Support: Code examples in cURL, Python, and Node.js

How It Works

  1. Create a Job: Submit a parsing request with your document URL or request an upload URL
  2. Upload (if needed): For local files, upload directly to our secure storage using the presigned URL
  3. Poll or Wait: Check job status or wait for completion
  4. Get Results: Download structured results as a ZIP package containing chunks, images, and tables
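
Steps 3 and 4 above can be sketched in Python as follows. This is an outline under stated assumptions, not the official client: the status values `"done"` and `"failed"`, the `status` field name, and both function names are hypothetical, and the HTTP call itself is left to a caller-supplied `fetch_status` function because the endpoint paths are not given here.

```python
import io
import time
import zipfile
from typing import Any, Callable, Dict, List

# Hypothetical terminal status values; the real API may use different names.
TERMINAL_STATES = {"done", "failed"}


def wait_for_job(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll fetch_status() until the job reaches a terminal state.

    fetch_status is supplied by the caller and should return the job's
    current status payload as a dict (assumed to contain a "status" key).
    """
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch_status()
        if job.get("status") in TERMINAL_STATES:
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(interval_s)


def list_result_files(zip_bytes: bytes) -> List[str]:
    """Return the file names inside a downloaded result ZIP package."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as archive:
        return archive.namelist()
```

Passing the status fetcher in as a function keeps the polling logic independent of any particular HTTP library and makes it easy to test without network access.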

Base URL

All API requests should be made to:

https://api.knowhereto.ai
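
In client code, it can help to keep the base URL in one place and build request URLs from it. The sketch below shows one way to do that. The helper `build_url` is illustrative, and the path `jobs` in the usage note is a placeholder, since the actual endpoint paths are not listed in this section.

```python
BASE_URL = "https://api.knowhereto.ai"


def build_url(path: str) -> str:
    """Join the base URL and a request path with exactly one slash between them."""
    return f"{BASE_URL.rstrip('/')}/{path.lstrip('/')}"
```

For example, `build_url("/jobs")` and `build_url("jobs")` both yield `https://api.knowhereto.ai/jobs`, where `jobs` stands in for whichever endpoint path the API reference specifies.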

Need Help?