Dev Tools · 1h ago
Parsewise Launches API to Extract Structured Data from Unstructured Documents
Parsewise, a YC-backed startup, offers an API that transforms unstructured documents such as PDFs and emails into schema-compliant JSON or CSV with traceable lineage. The platform uses self-improving agent definitions and vLLMs for parsing, and reports state-of-the-art results on the Databricks OfficeQA benchmark. To make extracted output easier to validate, it provides word-level citations and human-in-the-loop verification.
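To make "schema-compliant output with word-level citations" concrete, here is a minimal sketch in Python. It is not Parsewise's API or data model: the `Citation` and `ExtractedField` classes, the `SCHEMA` mapping, the `review_queue` function, and the sample invoice record are all hypothetical names invented for illustration. The sketch assumes each extracted value carries a list of citations pointing at a word span in a source document, and that any field failing the schema or lacking a citation is routed to a human reviewer.

```python
from dataclasses import dataclass, field


@dataclass
class Citation:
    """Hypothetical pointer to the words a value was read from."""
    document: str    # source file name
    page: int        # page number within the document
    word_start: int  # index of the first cited word on the page
    word_end: int    # index one past the last cited word


@dataclass
class ExtractedField:
    """Hypothetical extracted value plus its supporting citations."""
    value: object
    citations: list = field(default_factory=list)


# Hypothetical target schema: field name -> expected Python type.
SCHEMA = {"invoice_number": str, "total": float}


def review_queue(record, schema=SCHEMA):
    """Return the names of fields that should go to a human reviewer."""
    flagged = []
    for name, expected_type in schema.items():
        item = record.get(name)
        if item is None or not isinstance(item.value, expected_type):
            flagged.append(name)  # missing field or wrong type
        elif not item.citations:
            flagged.append(name)  # value has no traceable source
    return flagged


record = {
    "invoice_number": ExtractedField(
        "INV-001", [Citation("invoice.pdf", 1, 12, 13)]
    ),
    "total": ExtractedField(1250.0),  # no citation attached
}
print(review_queue(record))  # → ['total']
```

In this sketch the uncited `total` is flagged even though its type is correct, which is the kind of check that lineage makes possible: a value can satisfy the schema and still lack evidence in the source.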
Meridian48 take
Parsewise's focus on verifiability and lineage could differentiate it in the crowded document parsing space, but its success hinges on adoption beyond early-stage tinkerers.
Read the full reporting
Launch HN: Parsewise (YC P25) – Reason Across Documents with an API
Hacker News
document-parsing · api