
Forma Engineering Blog


Why Forma?

Traditional databases weren't built for the AI era. When your AI agent outputs 12 fields today and 30 fields tomorrow, waiting 3-7 days for DDL approval isn't an option.

Forma solves this with a modern take on the EAV pattern:

| Problem | Traditional DB | Forma |
| --- | --- | --- |
| New field | ALTER TABLE (days) | JSON Schema update (seconds) |
| Schema change | Downtime required | Zero downtime |
| AI output | Manual adaptation | Direct JSON Schema mapping |
| N+1 queries | 101 round-trips | 1 round-trip |
| Historical data | Same table, same cost | Cold storage on S3 |

English Series

A three-part engineering blog series explaining how Forma solves the challenges of building flexible, high-performance data storage for AI applications.

Series Introduction: From EAV to Zero-Dirty-Read Lakehouse

What is Forma and What Problems Does It Solve?

Three posts that explain a flexible data storage engine designed for the AI era. Start here for an overview of Forma's architecture and the three core problems it solves.

Part 1: Why EAV is the Most Underrated Data Model for AI

JSON Schema + Hot Table = AI-Ready Infrastructure

JSON Schema isn't just a validation tool; it's the core of AI-Ready infrastructure. Learn how to build the pipeline: AI output → instant validation → zero-DDL storage.
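To make the "validate, then store without DDL" idea concrete, here is a minimal Python sketch. It is not Forma's implementation: the schema, the `validate` helper (which checks only `type` and `required`, and behaves as if `additionalProperties` were false), and the `(entity_id, attribute, value)` row layout are all assumptions made for illustration.

```python
import json
from typing import Any

# Hypothetical schema for illustration; a real JSON Schema supports far more keywords.
SCHEMA = {
    "type": "object",
    "required": ["id", "title"],
    "properties": {
        "id": {"type": "integer"},
        "title": {"type": "string"},
        "score": {"type": "number"},
    },
}

_PY_TYPES = {"integer": int, "number": (int, float), "string": str, "boolean": bool}


def _matches(value: Any, type_name: str) -> bool:
    # bool is a subclass of int in Python, so handle it before the generic check.
    if isinstance(value, bool):
        return type_name == "boolean"
    return isinstance(value, _PY_TYPES[type_name])


def validate(doc: dict[str, Any], schema: dict[str, Any]) -> list[str]:
    """Return error messages; an empty list means the document passed."""
    errors = [
        f"missing required field: {name}"
        for name in schema.get("required", [])
        if name not in doc
    ]
    for name, value in doc.items():
        spec = schema["properties"].get(name)
        if spec is None:
            errors.append(f"unknown field: {name}")
        elif not _matches(value, spec["type"]):
            errors.append(f"{name}: expected {spec['type']}")
    return errors


def to_eav_rows(entity_id: int, doc: dict[str, Any]) -> list[tuple[int, str, str]]:
    """Flatten a validated document into (entity_id, attribute, value) rows."""
    return [(entity_id, name, json.dumps(value)) for name, value in sorted(doc.items())]


doc = {"id": 7, "title": "hello", "score": 0.9}
if not validate(doc, SCHEMA):
    rows = to_eav_rows(7, doc)
```

In this sketch, supporting a new field means adding one entry to `SCHEMA["properties"]`; the row layout itself never changes, which is the property the "zero-DDL" claim relies on.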

Part 2: Killing N+1

How One SQL Trick Cut Our Latency by 40x

We reduced database round-trips from 101 to 1, and latency from 1000ms to 25ms, a 97.5% reduction. The secret is PostgreSQL's CTE + JSON_AGG.
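The shape of that change can be sketched in a few lines of Python. To stay self-contained, the example runs against an in-memory SQLite database and uses SQLite's `json_group_array` and `json_object` as stand-ins for PostgreSQL's `JSON_AGG` and `json_build_object`. The `entity` and `eav` tables and all column names are hypothetical, not Forma's actual schema.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entity (id INTEGER PRIMARY KEY);
    CREATE TABLE eav (entity_id INTEGER, attr TEXT, value TEXT);
    INSERT INTO entity (id) VALUES (1), (2), (3);
    INSERT INTO eav VALUES
        (1, 'title', 'alpha'), (1, 'status', 'open'),
        (2, 'title', 'beta'),  (2, 'status', 'done'),
        (3, 'title', 'gamma'), (3, 'status', 'open');
""")


def fetch_n_plus_one(limit):
    """One query for the page of ids, then one query per id: N+1 round-trips."""
    ids = [row[0] for row in conn.execute(
        "SELECT id FROM entity ORDER BY id LIMIT ?", (limit,))]
    round_trips = 1
    result = {}
    for entity_id in ids:
        rows = conn.execute(
            "SELECT attr, value FROM eav WHERE entity_id = ?", (entity_id,)
        ).fetchall()
        round_trips += 1
        result[entity_id] = dict(rows)
    return result, round_trips


# The CTE selects the page of ids; the aggregate folds each entity's
# attribute rows into a single JSON array, so one query returns everything.
SINGLE_QUERY = """
WITH page AS (
    SELECT id FROM entity ORDER BY id LIMIT ?
)
SELECT p.id,
       json_group_array(json_object('attr', e.attr, 'value', e.value)) AS attrs
FROM page AS p
JOIN eav AS e ON e.entity_id = p.id
GROUP BY p.id
ORDER BY p.id
"""


def fetch_single(limit):
    """Same result as fetch_n_plus_one, in a single round-trip."""
    result = {}
    for entity_id, attrs_json in conn.execute(SINGLE_QUERY, (limit,)):
        result[entity_id] = {a["attr"]: a["value"] for a in json.loads(attrs_json)}
    return result, 1
```

With a page of 100 entities, `fetch_n_plus_one` would issue 101 queries, which matches the round-trip count quoted above; `fetch_single` always issues one.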

Part 3: Zero Dirty Reads Lakehouse

Building a Trustworthy Lakehouse with DuckDB

PostgreSQL handles "the present," DuckDB + Parquet handles "the past." Learn how Anti-Join + Dirty Set mechanisms ensure zero dirty reads.
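The teaser does not spell out the mechanism, so the following Python sketch shows only one plausible reading. It assumes the dirty set holds the ids of entities changed or deleted in the hot store since the last export, and that the anti-join drops those ids from the cold side before merging. The function name and data shapes are hypothetical.

```python
def federated_read(hot_rows, cold_rows, dirty_ids):
    """Merge hot and cold rows without returning a stale cold row.

    hot_rows:  {id: row} currently in the hot store (assumed up to date)
    cold_rows: {id: row} from the last export (may be stale)
    dirty_ids: ids changed or deleted in the hot store since that export
    """
    # Anti-join: keep only cold rows whose id is NOT in the dirty set.
    clean_cold = {k: v for k, v in cold_rows.items() if k not in dirty_ids}
    # Hot rows take precedence over whatever remains on the cold side.
    return {**clean_cold, **hot_rows}


cold = {1: "v1", 2: "v1-stale", 3: "deleted-later"}
hot = {2: "v2", 4: "v1"}
dirty = {2, 3}  # 2 was updated, 3 was deleted after the export

merged = federated_read(hot, cold, dirty)
```

Entity 3 is the case that motivates the dirty set in this sketch: it no longer exists in the hot store, so "hot wins" alone would still surface its stale cold copy.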

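Chinese Series

Three engineering blog posts that thoroughly explain a flexible data storage engine designed for the AI era.

Series Introduction: From EAV to Zero-Dirty-Read Lakehouse

What is Forma? What Problems Does It Solve?

Three posts that thoroughly explain a flexible data storage engine designed for the AI era. Start here to learn about Forma's architecture and the three core problems it sets out to solve.

Part 1: Why EAV is the Most Underrated Data Model of the AI Era

JSON Schema + Hot Table = AI-Ready Infrastructure

JSON Schema is not just a validation tool; it is the core of AI-Ready infrastructure. The result: AI output → instant validation → zero-DDL ingestion.

Part 2: Killing N+1

How One SQL Optimization Cut Latency from 1 Second to 25 Milliseconds

We reduced the number of database queries from 101 to 1, and latency from 1000ms to 25ms. The secret is PostgreSQL's CTE + JSON_AGG.

Part 3: Zero-Dirty-Read Serverless Lakehouse

How We Solved the Consistency Problem with DuckDB

PostgreSQL handles "the present," DuckDB + Parquet handles "the past." The Anti-Join + Dirty Set mechanism ensures zero dirty reads in federated queries.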