Which document formats are supported?

We support PDFs and images (PNG, JPG, TIFF). The system detects the format automatically and applies the extraction method suited to each type.
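Format detection happens on our side, but a quick client-side check can reject unsupported files before they use an upload. The sketch below is not Parsift's detection logic; it only compares a file's first bytes against the standard signatures of the four formats listed above. The name `sniff_format` is made up for illustration.

```python
from typing import Optional

# Standard leading bytes of each supported format. These come from the file
# formats themselves, not from Parsift.
SIGNATURES = {
    "pdf": [b"%PDF-"],
    "png": [b"\x89PNG\r\n\x1a\n"],
    "jpg": [b"\xff\xd8\xff"],
    "tiff": [b"II*\x00", b"MM\x00*"],  # little-endian and big-endian TIFF
}


def sniff_format(head: bytes) -> Optional[str]:
    """Return 'pdf', 'png', 'jpg', or 'tiff' based on the first bytes, else None."""
    for name, prefixes in SIGNATURES.items():
        if any(head.startswith(prefix) for prefix in prefixes):
            return name
    return None
```

Passing the first 8 bytes of a file is enough for every signature in the table.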

Are documents stored on your servers?

We keep uploaded files and extracted data only as long as needed to deliver the service, and minimize what we retain. Retention terms are defined per plan; Enterprise retention is set in your agreement.

How does the page limit work?

Each processed document page counts as 1 unit. A 10-page PDF consumes 10 units of your monthly limit. Unused pages do not roll over to the next month.
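As a rough sketch of that accounting: one page is one unit, and each month starts again from the full limit. The function name is illustrative, and clamping at zero is an assumption, since this page does not say what happens when a plan's limit is exceeded.

```python
from typing import Sequence


def remaining_pages(monthly_limit: int, pages_per_document: Sequence[int]) -> int:
    """Pages left this month after processing the given documents (1 page = 1 unit)."""
    used = sum(pages_per_document)
    # Assumption: the balance never goes negative. Unused pages are not carried
    # over, so next month's balance starts again at monthly_limit.
    return max(monthly_limit - used, 0)
```

For example, on a 1,000-page plan, processing documents of 10, 3, and 1 pages leaves `remaining_pages(1000, [10, 3, 1])`, which is 986 pages.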

Can I use the API in production?

Yes. Our API is built for production environments with high availability (99.95% observed uptime), configurable rate limiting, and webhook support for asynchronous processing.

How can I evaluate Parsift?

Contact our team for a guided evaluation. We'll walk you through the platform, discuss your use case, and help you find the plan that fits your needs.

How do I cancel my subscription?

You can cancel at any time, with no cancellation fee. You keep access until the end of your current billing period.

Document Intelligence API

Any PDF goes in, typed JSON comes out.

One REST call turns any invoice, receipt, or statement into typed JSON: fields, tables, and relations with per-field confidence and pixel-level traceability, read by our own proprietary extraction engine.

Join the waitlist · View pricing

Proprietary engine
Enterprise encryption
US data residency

Precision · 30d

98.4%

Latency · p50

1.2s/page

SLA observed · 12m

99.95%

Region

US · Azure

// measurement

Precision = exact field match on an internal benchmark of 12.4k labeled documents (invoices, contracts, forms). Latency is the p50 per page under mixed load. "SLA observed · 12m" is uptime measured across all tiers over the last 12 months; the contractual SLA is set per plan (99% to 99.99%). Request the full methodology.
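To make "exact field match" concrete, here is one plausible reading of the metric, not the full methodology: the share of extracted fields whose value is identical to the labeled value. The function name and the toy values in the usage note are made up for illustration.

```python
def exact_match_precision(predicted: dict, labeled: dict) -> float:
    """Share of predicted fields whose value exactly equals the labeled value."""
    if not predicted:
        return 0.0
    correct = sum(
        1
        for name, value in predicted.items()
        if name in labeled and labeled[name] == value
    )
    return correct / len(predicted)
```

With three predicted fields of which two match their labels exactly, the function returns 2/3. A benchmark figure would average this over every labeled document.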

How it works

Three steps, one REST call.

Send a document, the engine reads it, you get typed data back. No templates to maintain, no manual review in the loop.

STEP 01

Upload PDF

Drag & drop via dashboard or POST via API. We support PDFs and images with automatic format detection.

STEP 02

Engine reads it

Our proprietary engine identifies tables, key-value pairs, and entities with 98.4% field-level precision, measured on a 12.4k-document benchmark.

STEP 03

Structured output

Receive clean JSON, Markdown, or CSV, ready to drop straight into your database or LLM context.

Extraction

Three operations cover most document work.

Parsift exposes a small, well-defined set of semantic primitives that you compose into your product's flow. There are no models to train, no prompts to write, and no brittle templates to maintain.

KIE

Key Information Extraction

Identifies fields by meaning, not by position. Issuer, dates, amounts, identifiers, and clauses, returned as typed JSON with a per-field confidence score. Works on unseen layouts, mixed languages, and noisy scans.

coverage: 60+ fields

ontology: customizable

confidence: per field
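A per-field confidence score lets you decide, field by field, what to accept automatically. The sketch below assumes each field arrives as an object with `value` and `confidence` keys and that confidence is a number between 0 and 1; both are assumptions for illustration, as is the 0.9 threshold. Check the API reference for the actual response shape.

```python
from typing import Dict, Tuple


def split_by_confidence(fields: Dict[str, dict], threshold: float = 0.9) -> Tuple[dict, dict]:
    """Split fields into (accepted, flagged) by per-field confidence."""
    accepted, flagged = {}, {}
    for name, field in fields.items():
        # Assumed shape: {"value": ..., "confidence": 0.0-1.0}
        target = accepted if field["confidence"] >= threshold else flagged
        target[name] = field["value"]
    return accepted, flagged
```

Fields in `flagged` can be logged, re-requested, or handled by whatever fallback your product already has.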

Tables

Table Structure Recognition

Reconstructs tables with merged cells, hierarchical headers, and multi-page breaks. Output in JSON, CSV, or Markdown with traceability back to the original pixels, ready for validation or direct ingestion into your database.

formats: JSON · CSV · MD

multi-page: continuous

merged cells: supported
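If you request JSON and later need CSV for a spreadsheet or a bulk load, the conversion is a few lines with Python's standard `csv` module. The sketch assumes a table is available as a header list plus a list of rows; that shape, and the name `table_to_csv`, are illustrative rather than the documented output.

```python
import csv
import io
from typing import Sequence


def table_to_csv(header: Sequence, rows: Sequence[Sequence]) -> str:
    """Serialize one table (header plus rows of cell values) as CSV text."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(header)
    writer.writerows(rows)  # cells containing commas are quoted automatically
    return buffer.getvalue()
```

Using the `csv` module instead of joining strings by hand keeps cells that contain commas or quotes intact.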

Layout

Layout metadata

Detects paragraphs, lists, signatures, and stamps while preserving reading order. Every element carries coordinates, page, and original order, ideal for reprocessing, auditing, and composing into downstream LLMs.

bounding box: per element

reading order: preserved

trail: auditable
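Because every element carries its page and original order, rebuilding the text for an LLM prompt is a sort and a join. The keys `page`, `order`, and `text` below are assumed names for illustration; the real field names may differ.

```python
from typing import Dict, List


def in_reading_order(elements: List[Dict]) -> List[Dict]:
    """Sort layout elements by page, then by their original order on the page."""
    return sorted(elements, key=lambda e: (e["page"], e["order"]))


def to_llm_context(elements: List[Dict]) -> str:
    """Join element text in reading order, one element per line."""
    return "\n".join(e["text"] for e in in_reading_order(elements))
```

Sorting on the client is only needed if you filter or merge elements from several calls before composing the prompt.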

Developer Experience

From PDF to typed JSON in one call.

No custom regex parsers to maintain. A documented REST API you can call from any language. Rotation, skew, and low-quality scans are handled for you.

Request · POST /v1/extract

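# Turn an invoice PDF into typed JSON.
# The form field name ("file") and the path are placeholders; use the names from the API reference.
curl -X POST https://api.parsift.com/v1/extract \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -F "file=@invoice.pdf" \
  -F "schema_type=invoice"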

Response · 200 OK

{
  "status": "success",
  "data": {
    "vendor": "AWS Inc.",
    "total": 4500.00
  }
}
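On the client, the response above can be parsed straight into a typed object. This is a minimal sketch that relies only on the three keys shown in the sample (`status`, `data.vendor`, `data.total`). The `Invoice` class is illustrative, and treating any status other than "success" as an error is an assumption.

```python
import json
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Invoice:
    vendor: str
    total: Decimal


def parse_invoice(body: str) -> Invoice:
    """Parse the sample response body, failing loudly on an unexpected status."""
    # parse_float=Decimal keeps amounts such as 4500.00 exact instead of binary floats.
    payload = json.loads(body, parse_float=Decimal)
    if payload.get("status") != "success":
        raise ValueError(f"extraction did not succeed: {payload.get('status')!r}")
    data = payload["data"]
    return Invoice(vendor=str(data["vendor"]), total=Decimal(data["total"]))
```

Parsing money as `Decimal` is a design choice for the client side; it avoids rounding surprises when totals are summed or compared later.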

Typed JSON, every call

Fields, tables, and relations come back with per-field confidence, ready to validate against your own schema.

Webhooks for batch jobs

Submit large documents and get a callback when results are ready, so a long batch never blocks a request.
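A webhook receiver can stay very small. The callback payload is not specified on this page, so the sketch assumes a JSON body with hypothetical `job_id`, `status`, and `data` keys; treat every name here as a placeholder until you have the webhook reference.

```python
import json
from typing import Dict


def handle_callback(raw_body: bytes, results: Dict[str, dict]) -> int:
    """Store a finished job's data and return the HTTP status code to reply with."""
    try:
        event = json.loads(raw_body)
        job_id = event["job_id"]  # hypothetical key
    except (ValueError, KeyError, TypeError):
        return 400  # malformed body or missing job id
    if event.get("status") == "success":
        # Keyed by job id, so a retried delivery overwrites instead of duplicating.
        results[job_id] = event.get("data", {})
    return 200
```

Keeping the handler a pure function over the raw body makes it easy to plug into any web framework and to test without a network.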

Pixel-level traceability

Every value carries its coordinates on the page, so you can show users exactly where a number came from.
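To draw a highlight over a resized page preview, the box has to be scaled from the source page to the preview. The sketch assumes boxes are `(x0, y0, x1, y1)` in source-page pixels and that you know the source page size; the coordinate convention is an assumption, so confirm it against the API reference.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]


def scale_bbox(bbox: Box, source_size: Tuple[float, float], target_size: Tuple[float, float]) -> Box:
    """Map an (x0, y0, x1, y1) box from source-page pixels to a resized preview."""
    sx = target_size[0] / source_size[0]  # horizontal scale factor
    sy = target_size[1] / source_size[1]  # vertical scale factor
    x0, y0, x1, y1 = bbox
    return (x0 * sx, y0 * sy, x1 * sx, y1 * sy)
```

For a 1000×2000 page shown at 500×1000, the box `(100, 200, 300, 250)` maps to `(50.0, 100.0, 150.0, 125.0)`.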

Pricing

Pick the plan that matches your document volume.

Monthly · Yearly (-20%)

Trial

Test the API before you commit, no credit card required

Free

500 pages
14-day window
No credit card
Proprietary extraction engine
Table extraction
REST API access
JSON export

Join the waitlist

Starter

For developers and small projects shipping their first extractions

$10/mo

1,000 pages/month
Proprietary extraction engine
Built-in templates for common documents
REST API access
JSON export
Proprietary engine, processing you control
Email support

Join the waitlist

Pro · Recommended

For growing teams with steady extraction volume

$30/mo

3,000 pages/month
Everything in Starter
Add-on page packs: 1,000 pages / $10
Unlimited templates
REST API + webhooks
JSON, CSV and Markdown export
Schema validation
90-day extraction history
Priority support
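For budgeting, the Pro numbers above reduce to simple arithmetic: $30 covers 3,000 pages, and each add-on pack adds 1,000 pages for $10. The sketch assumes packs are bought whole and only for pages above the included volume, on monthly billing; it ignores the yearly discount.

```python
import math

PRO_BASE_USD = 30          # $30/mo
PRO_INCLUDED_PAGES = 3000  # pages included per month
PACK_PAGES = 1000          # add-on pack size
PACK_USD = 10              # price per add-on pack


def pro_monthly_cost(pages: int) -> int:
    """Monthly Pro cost in USD, buying whole add-on packs for pages above 3,000."""
    extra = max(pages - PRO_INCLUDED_PAGES, 0)
    packs = math.ceil(extra / PACK_PAGES)  # assumption: packs cannot be split
    return PRO_BASE_USD + packs * PACK_USD
```

A month with 4,500 pages needs two packs, so `pro_monthly_cost(4500)` is 50.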

Join the waitlist

Enterprise

For high-volume operations with compliance and deployment requirements

Contact sales

Unlimited pages/month
Everything in Pro
Custom templates
Guaranteed SLA
SSO / SAML
On-premise deployment
Custom integrations
Dedicated 24/7 support

Contact sales

All plans include TLS encryption in transit, encryption at rest, and a proprietary engine with processing you control.

Compliance

Built for the teams that have to prove compliance.

We operate with auditable practices and provide the contractual instruments needed for deployment in fintechs, law firms, insurers, and healthcare providers.

privacy

LGPD · GDPR

Designed to comply with the LGPD and to be compatible with the GDPR. A data processing agreement (DPA) and standard contractual clauses are available on request.

private

Private processing

Extraction runs on our own engine, and you control how your documents are processed: they are never sent to external model providers unless you choose to send them.

crypto

Encryption

Encrypted in transit with TLS and at rest in our storage and database layers. Access to production data is restricted, authenticated, and logged.

residency

Residency

Processing in US regions on Azure. Contractual guarantee that we never train on your documents.

Every control above is documented and verifiable. Read the full posture, or request the report and agreement your review needs.

Compliance overview · Request a DPA

Access · 500 pages to test

Five hundred pages to test the API.

Parsift is in early access. Join the waitlist and we will send your invite when account creation opens. Your account starts on a 14-day trial: 500 pages to test the API, no credit card to start.

Join the waitlist · Book a call
