Forma FAQs
Fast answers to common questions about online DDL, JSONB, MongoDB, and the EAV “anti-pattern,” plus Forma’s positioning: one JSON Schema drives validation, storage, and CRUD APIs, and CDC pushes the data into the lakehouse for OLAP.
What is Forma’s positioning?
- Streamline backend work end-to-end: one JSON Schema defines storage + validation → drives hot-table/EAV writes → exposes frontend-friendly querying/CRUD APIs → CDC pushes to the lakehouse for OLAP (Intro, Part 3).
- The goal is not “another database,” but to make field changes, type safety, query performance, and lakehouse integration part of the same low-friction development path.
Databases have online DDL—why bother with Forma?
- Online DDL reduces locking but does not shorten the delivery path: code changes, index design, and regression testing still take hours or even days, while AI-driven fields change daily (10–50 variants), so the two paces do not match (see Part 1).
- In Forma, adding a field means updating JSON Schema metadata; writes take effect immediately with no table or index changes (Part 1).
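To make "adding a field is a metadata update" concrete, here is a minimal sketch. It uses SQLite as a stand-in for the real database. The `schema` dict, the `eav` table layout, and the `write` and `table_ddl` helpers are all hypothetical names chosen for illustration, not Forma's actual API. The point is only that the table definition is identical before and after the new field appears.

```python
import sqlite3

# Hypothetical in-memory schema registry (illustrative, not Forma's metadata store).
schema = {
    "type": "object",
    "properties": {
        "title": {"type": "string"},
    },
}

conn = sqlite3.connect(":memory:")
# One generic EAV table: its DDL does not change when fields are added.
conn.execute(
    "CREATE TABLE eav (entity_id INTEGER, attr TEXT, value TEXT, "
    "PRIMARY KEY (entity_id, attr))"
)


def table_ddl(conn):
    """Return the CREATE TABLE statement SQLite has recorded for the EAV table."""
    return conn.execute(
        "SELECT sql FROM sqlite_master WHERE name = 'eav'"
    ).fetchone()[0]


def write(conn, schema, entity_id, record):
    """Reject fields the schema does not declare, then store one row per attribute."""
    unknown = set(record) - set(schema["properties"])
    if unknown:
        raise ValueError(f"fields not in schema: {sorted(unknown)}")
    conn.executemany(
        "INSERT OR REPLACE INTO eav VALUES (?, ?, ?)",
        [(entity_id, name, str(value)) for name, value in record.items()],
    )


ddl_before = table_ddl(conn)
write(conn, schema, 1, {"title": "hello"})

# "Adding a field" touches metadata only: no ALTER TABLE, no CREATE INDEX.
schema["properties"]["sentiment_score"] = {"type": "number"}
write(conn, schema, 2, {"title": "world", "sentiment_score": 0.87})

ddl_after = table_ddl(conn)
rows = conn.execute("SELECT * FROM eav ORDER BY entity_id, attr").fetchall()
```

Storing every value as text keeps the sketch short; a real EAV layer would more likely keep typed value columns.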
Beyond avoiding DDL, what workflow does Forma change?
- AI apps already need JSON Schema to steer LLM output and define API contracts; Forma reuses that single schema for type validation and storage mapping, eliminating duplicate modeling/migrations (Part 1).
- CRUD: the hot table + EAV layer use the schema to decide which fields map to B-tree-backed hot columns, reducing handwritten code (Part 1).
- Data path: CDC ships data to Parquet/DuckDB for OLAP/lakehouse without rebuilding another derived model (Part 3).
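The sketch below shows how a single schema could drive both type validation and the hot-column/EAV split described above. It is an assumption-laden illustration: the `x-hot-column` annotation, the column names `text_01` and `num_01`, and the `validate` and `route` helpers are invented here, and the validator covers only a small subset of JSON Schema.

```python
# Hypothetical schema: "x-hot-column" is an illustrative annotation,
# not a documented Forma or JSON Schema keyword.
SCHEMA = {
    "type": "object",
    "required": ["title"],
    "properties": {
        "title": {"type": "string", "x-hot-column": "text_01"},
        "price": {"type": "number", "x-hot-column": "num_01"},
        "tags": {"type": "array"},
        "summary": {"type": "string"},
    },
}

# JSON Schema type names mapped to Python types (subset, for illustration).
PY_TYPES = {
    "string": str,
    "number": (int, float),
    "integer": int,
    "boolean": bool,
    "array": list,
    "object": dict,
}


def validate(schema, record):
    """Return a list of error strings; an empty list means the record is valid."""
    errors = []
    for name in schema.get("required", []):
        if name not in record:
            errors.append(f"{name}: required field missing")
    for name, value in record.items():
        spec = schema["properties"].get(name)
        if spec is None:
            errors.append(f"{name}: not declared in schema")
            continue
        json_type = spec["type"]
        # bool is a subclass of int in Python, so reject it for numeric types.
        is_bool_as_number = isinstance(value, bool) and json_type in ("number", "integer")
        if is_bool_as_number or not isinstance(value, PY_TYPES[json_type]):
            errors.append(f"{name}: expected {json_type}")
    return errors


def route(schema, record):
    """Split a validated record into hot-column values and EAV (attr, value) rows."""
    hot, eav = {}, []
    for name, value in record.items():
        column = schema["properties"][name].get("x-hot-column")
        if column:
            hot[column] = value
        else:
            eav.append((name, value))
    return hot, eav


record = {"title": "Desk lamp", "price": 39.5, "tags": ["home"], "summary": "Warm light"}
errors = validate(SCHEMA, record)
hot, eav = route(SCHEMA, record)
bad_errors = validate(SCHEMA, {"price": "cheap"})
```

Because the same `SCHEMA` object could also be handed to an LLM as an output contract, no second model of the data is needed for storage.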
PostgreSQL has JSONB with GIN/B-tree—what’s the issue?
- Range/sort queries on JSONB need expression indexes, which are still DDL (Part 1).
- Partial updates rewrite the whole JSONB blob, causing write amplification and higher WAL/replication cost (Part 1).
- Portability is weak: JSONB features tie you to PostgreSQL; moving across databases/cloud services is harder (Part 1).
- Forma uses pre-indexed, typed hot columns: mapping a field to one is a metadata-only change via JSON Schema, while the EAV layer stays plain, portable SQL (Part 1).
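As a rough illustration of pre-indexed hot columns, the sketch below creates a hot table once, with generic typed columns that are indexed up front, and then serves range queries by translating field names through a mapping. SQLite stands in for PostgreSQL. The table layout, the `field_to_column` mapping, and both helper functions are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hot table created once, with generic typed columns that are indexed up front.
conn.executescript("""
    CREATE TABLE hot (
        entity_id INTEGER PRIMARY KEY,
        num_01 REAL, num_02 REAL, text_01 TEXT
    );
    CREATE INDEX hot_num_01 ON hot (num_01);
    CREATE INDEX hot_num_02 ON hot (num_02);
    CREATE INDEX hot_text_01 ON hot (text_01);
""")

# Hypothetical mapping metadata, as it might be derived from the JSON Schema.
field_to_column = {"title": "text_01", "price": "num_01"}


def insert(conn, entity_id, record):
    """Write a non-empty record into the hot columns its fields are mapped to."""
    columns = [field_to_column[name] for name in record]
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(
        f"INSERT INTO hot (entity_id, {', '.join(columns)}) VALUES (?, {placeholders})",
        [entity_id, *record.values()],
    )


def range_query(conn, field, low, high):
    """Range + sort on a mapped field; column names come from metadata, never user input."""
    column = field_to_column[field]
    sql = f"SELECT entity_id FROM hot WHERE {column} BETWEEN ? AND ? ORDER BY {column}"
    return [row[0] for row in conn.execute(sql, (low, high))]


insert(conn, 1, {"title": "lamp", "price": 39.5})
insert(conn, 2, {"title": "desk", "price": 120.0})
insert(conn, 3, {"title": "chair", "price": 75.0})
cheap = range_query(conn, "price", 30, 80)

# Mapping a new field claims a spare pre-indexed column: metadata only, no DDL.
field_to_column["rating"] = "num_02"
insert(conn, 4, {"title": "rug", "price": 60.0, "rating": 4.8})
well_rated = range_query(conn, "rating", 4.0, 5.0)
```

By contrast, a range query on a JSONB field would need a new expression index for each field, which is a DDL change.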
Why not just use MongoDB?
- MongoDB fits schema-free document workloads. When you need SQL JOINs, full ACID, the PostgreSQL ecosystem, and low-cost cold storage, Forma is a better fit (Part 1).
- Cold data lands in Parquet/DuckDB with predictable cost and consistency controls (Part 3).
Isn’t EAV an anti-pattern? How does Forma avoid the traps?
- Hot table: lift the hottest fields (roughly 20%) into physical columns with B-tree indexes to avoid full scans on range and sort queries (Part 1).
- Single-query aggregation: CTE + JSON_AGG collapses 101 round-trips into 1, eliminating the N+1 query problem (Part 2).
- Cold/hot split with consistency: DuckDB + Parquet for history, Anti-Join + Dirty Set for zero-dirty-read federated queries (Part 3).
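The single-query aggregation point can be sketched as follows. This is not Forma's query. It uses SQLite's `json_group_object` as a stand-in for PostgreSQL's JSON aggregate functions, and it assumes the SQLite build bundled with Python includes the JSON functions. Table names and helper names are hypothetical. With 100 entities, the naive loader issues 1 + 100 = 101 queries and the CTE version issues one.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entity (id INTEGER PRIMARY KEY, kind TEXT);
    CREATE TABLE eav (entity_id INTEGER, attr TEXT, value TEXT);
    CREATE INDEX eav_entity ON eav (entity_id);
""")
for i in range(1, 101):
    conn.execute("INSERT INTO entity VALUES (?, 'product')", (i,))
    conn.executemany(
        "INSERT INTO eav VALUES (?, ?, ?)",
        [(i, "name", f"item-{i}"), (i, "color", "red" if i % 2 else "blue")],
    )


def load_n_plus_one(conn):
    """Naive loader: one query for the ids, then one query per entity."""
    queries = 1
    ids = [row[0] for row in conn.execute("SELECT id FROM entity ORDER BY id")]
    result = {}
    for entity_id in ids:
        queries += 1
        result[entity_id] = dict(
            conn.execute("SELECT attr, value FROM eav WHERE entity_id = ?", (entity_id,))
        )
    return result, queries


def load_single_query(conn):
    """CTE selects the page of ids; a JSON aggregate folds each entity's rows into one object."""
    rows = conn.execute("""
        WITH page AS (SELECT id FROM entity ORDER BY id LIMIT 100)
        SELECT page.id, json_group_object(eav.attr, eav.value)
        FROM page JOIN eav ON eav.entity_id = page.id
        GROUP BY page.id
        ORDER BY page.id
    """).fetchall()
    return {entity_id: json.loads(attrs) for entity_id, attrs in rows}, 1


naive, naive_queries = load_n_plus_one(conn)
single, single_queries = load_single_query(conn)
```

Both loaders return the same mapping of entity id to attributes; only the number of round-trips differs.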