Dev Tools · 1h ago
Zstd Frames Cut Cloud Egress Costs for Parquet Files
Schema discovery on a 4 GB Parquet file may need only 50 KB, yet a naive full-file download transfers all 4 GB and wastes more than 99.9% of the egress bytes. Using seekable Zstd frames and a jump table, HuskHoard enables partial reads via HTTP Range requests, cutting a $9,216 catalog sync to $9, a 1,024-fold reduction. The technique maps Parquet row groups to independently compressed frames, so a client can request exactly the bytes it needs.
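The jump-table lookup described above can be sketched roughly as follows. This is a minimal illustration, not HuskHoard's actual format or API: the `Frame` fields and the `frames_for_range` and `range_header` helpers are hypothetical names, and the sketch assumes the jump table lists frames in file order with their compressed offset, compressed size, and uncompressed size.

```python
from bisect import bisect_right
from dataclasses import dataclass
from itertools import accumulate


@dataclass(frozen=True)
class Frame:
    """One independently decompressible Zstd frame (hypothetical layout)."""
    compressed_offset: int   # byte position of the frame in the stored object
    compressed_size: int     # bytes the frame occupies in the stored object
    uncompressed_size: int   # bytes the frame expands to


def frames_for_range(table, start, length):
    """Return the frames covering uncompressed bytes [start, start + length)."""
    if not table or length <= 0:
        return []
    # ends[i] is the uncompressed offset just past frame i.
    ends = list(accumulate(f.uncompressed_size for f in table))
    if start < 0 or start + length > ends[-1]:
        raise ValueError("requested range is outside the file")
    first = bisect_right(ends, start)               # frame holding the first byte
    last = bisect_right(ends, start + length - 1)   # frame holding the last byte
    return table[first:last + 1]


def range_header(frames):
    """Build one HTTP Range header value spanning a run of adjacent frames."""
    first, last = frames[0], frames[-1]
    # HTTP byte ranges are inclusive on both ends.
    end = last.compressed_offset + last.compressed_size - 1
    return f"bytes={first.compressed_offset}-{end}"


# Toy jump table: three frames, each expanding to 100 bytes.
table = [Frame(0, 40, 100), Frame(40, 35, 100), Frame(75, 50, 100)]

one_frame = range_header(frames_for_range(table, start=150, length=20))
two_frames = range_header(frames_for_range(table, start=90, length=20))
```

Here `one_frame` is `bytes=40-74` (only the second frame is fetched) and `two_frames` is `bytes=0-74` (the request straddles the first two frames). A real client would send that value in a `Range` header, decompress the returned frames, and slice out the requested bytes; those steps are omitted here.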
Meridian48 take
The cost savings are real, but adoption depends on whether existing Parquet tooling integrates with frame-level seeking, a gap HuskHoard aims to fill.
parquet · zstd