LTBase Engineering Blog

Forma: A Flexible Data Storage Engine for the AI Era

Why Forma?

Traditional databases weren't built for the AI era. When your AI Agent outputs 12 fields today and 30 fields tomorrow, waiting 3-7 days for DDL approval isn't an option.

Forma solves this with a modern take on the EAV pattern:

Problem | Traditional DB | Forma
New field | ALTER TABLE (days) | JSON Schema update (seconds)
Schema change | Downtime required | Zero downtime
AI output | Manual adaptation | Direct JSON Schema mapping
N+1 queries | 101 round-trips | 1 round-trip
Historical data | Same table, same cost | Cold storage on S3

English Series

From Chaos to Clarity: The Strategic Roadmap for Building a Human-AI Partnership

Your AI Isn't "Stupid" — It Just Needs a Better Harness

A three-part engineering blog series explaining how Forma solves the challenges of building flexible, high-performance data storage for AI applications.

Series Introduction: From EAV to Zero-Dirty-Read Lakehouse

What is Forma and What Problems Does It Solve?

Three posts that explain a flexible data storage engine designed for the AI era. Start here for an overview of Forma's architecture and the three core problems it solves.

Part 1: Why EAV is the Most Underrated Data Model for AI

JSON Schema + Hot Table = AI-Ready Infrastructure

JSON Schema isn't just a validation tool—it's the core of AI-Ready infrastructure. Learn how to achieve: AI output → instant validation → zero-DDL storage.
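As a rough illustration of the "AI output → instant validation → zero-DDL storage" flow, the sketch below validates a record against a tiny hand-rolled subset of JSON Schema and flattens it into EAV rows. Everything here is an assumption made for illustration: the `SCHEMA` contents, the `validate` and `to_eav_rows` helpers, and the (entity_id, attribute, value) row shape are hypothetical and not Forma's actual API or table layout.

```python
from __future__ import annotations

from typing import Any

# Hypothetical schema: a tiny subset of JSON Schema ("required" and "type" only).
SCHEMA: dict[str, Any] = {
    "type": "object",
    "required": ["name"],
    "properties": {
        "name": {"type": "string"},
        "score": {"type": "number"},
    },
}


def _matches(value: Any, type_name: str) -> bool:
    """Check a value against one of the three type names this sketch supports."""
    if type_name == "number":
        # bool is a subclass of int in Python, so exclude it explicitly.
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    if type_name == "string":
        return isinstance(value, str)
    if type_name == "boolean":
        return isinstance(value, bool)
    return False


def validate(record: dict[str, Any], schema: dict[str, Any]) -> list[str]:
    """Return a list of validation errors; an empty list means the record is valid."""
    errors = [
        f"missing required field: {key}"
        for key in schema.get("required", [])
        if key not in record
    ]
    for key, value in record.items():
        spec = schema["properties"].get(key)
        if spec is None:
            errors.append(f"unknown field: {key}")
        elif not _matches(value, spec["type"]):
            errors.append(f"{key}: expected {spec['type']}")
    return errors


def to_eav_rows(entity_id: int, record: dict[str, Any]) -> list[tuple[int, str, Any]]:
    """Flatten one validated record into (entity_id, attribute, value) rows."""
    return [(entity_id, attr, value) for attr, value in record.items()]
```

In this sketch, accepting a new field means adding one entry to `SCHEMA["properties"]`; the row shape stays the same, so no table definition has to change. A production system would more likely use a full JSON Schema validator than the subset shown here.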

Part 2: Killing N+1

How One SQL Trick Cut Our Latency by 40x

We reduced database round-trips from 101 to 1, and latency from 1000ms to 25ms: a 40x speedup (a 97.5% reduction). The secret is PostgreSQL's CTE + JSON_AGG.
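To make the 101-versus-1 round-trip contrast concrete, here is a minimal sketch that runs against an in-memory SQLite database rather than PostgreSQL. SQLite's `json_group_array` and `json_object` stand in for PostgreSQL's `JSON_AGG`, and the `entities` and `attrs` tables are hypothetical, not Forma's real schema. It assumes a SQLite build with the JSON functions available, which is the default in recent versions.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE entities (id INTEGER PRIMARY KEY, kind TEXT);
    CREATE TABLE attrs (entity_id INTEGER, name TEXT, value TEXT);
""")
for i in range(1, 101):
    conn.execute("INSERT INTO entities VALUES (?, 'lead')", (i,))
    conn.execute("INSERT INTO attrs VALUES (?, 'name', ?)", (i, f"lead-{i}"))
    conn.execute("INSERT INTO attrs VALUES (?, 'score', ?)", (i, str(i)))


def fetch_n_plus_one(conn):
    """One query for the ids, then one query per entity: 1 + N round-trips."""
    round_trips = 1
    ids = [row[0] for row in conn.execute("SELECT id FROM entities ORDER BY id")]
    result = {}
    for entity_id in ids:
        round_trips += 1
        rows = conn.execute(
            "SELECT name, value FROM attrs WHERE entity_id = ?", (entity_id,)
        ).fetchall()
        result[entity_id] = dict(rows)
    return result, round_trips


def fetch_single_query(conn):
    """One statement: a CTE picks the page, aggregation folds attributes into JSON."""
    sql = """
        WITH page AS (SELECT id FROM entities ORDER BY id LIMIT 100)
        SELECT p.id,
               json_group_array(json_object('name', a.name, 'value', a.value))
        FROM page AS p
        JOIN attrs AS a ON a.entity_id = p.id
        GROUP BY p.id
        ORDER BY p.id
    """
    result = {}
    for entity_id, doc in conn.execute(sql).fetchall():
        # Each row carries one JSON array holding all attributes of the entity.
        result[entity_id] = {item["name"]: item["value"] for item in json.loads(doc)}
    return result, 1
```

Both functions return the same data; only the number of statements sent to the database differs. The latency figures quoted above come from the post itself and are not reproduced by this local example.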

Part 3: Zero Dirty Reads Lakehouse

Building a Trustworthy Lakehouse with DuckDB

PostgreSQL handles "the present," DuckDB + Parquet handles "the past." Learn how Anti-Join + Dirty Set mechanisms ensure zero dirty reads.
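One way to read "Anti-Join + Dirty Set" is sketched below in plain Python. This is an interpretation, not Forma's implementation: it assumes the dirty set holds the ids of every row changed or deleted in the hot store since the last export to cold storage, and the `federated_read` function and its dictionary inputs are hypothetical stand-ins for the PostgreSQL and DuckDB + Parquet sides.

```python
from typing import Any


def federated_read(
    hot: dict[int, Any],
    cold: dict[int, Any],
    dirty_ids: set[int],
) -> dict[int, Any]:
    """Merge hot and cold rows without ever returning a stale cold row.

    hot       -- current rows in the hot store (assumed: PostgreSQL)
    cold      -- exported snapshot rows (assumed: Parquet read via DuckDB)
    dirty_ids -- ids changed or deleted in the hot store since the export
    """
    # Anti-join: keep only cold rows whose id is NOT in the dirty set.
    clean_cold = {row_id: row for row_id, row in cold.items() if row_id not in dirty_ids}
    # Hot rows are authoritative for everything they contain.
    return {**clean_cold, **hot}


cold_rows = {1: "v1 (cold)", 2: "v1 (stale)", 3: "v1 (deleted later)"}
hot_rows = {2: "v2 (hot)", 4: "v1 (new)"}
dirty = {2, 3, 4}

merged = federated_read(hot_rows, cold_rows, dirty)
```

Here row 2 is served from the hot store, row 3 disappears because it was deleted after the export, and row 1 is still served from cold storage because nothing touched it.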

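Chinese Series

Three engineering blog posts that thoroughly explain a flexible data storage engine designed for the AI era.

Series Introduction: From EAV to a Zero-Dirty-Read Lakehouse

What Is Forma, and What Problems Does It Solve?

Three posts that thoroughly explain a flexible data storage engine designed for the AI era. Start here to learn about Forma's architecture and the three core problems it sets out to solve.

Part 1: Why EAV Is the Most Underrated Data Model of the AI Era

JSON Schema + Hot Table = AI-Ready Infrastructure

JSON Schema is more than a validation tool: it is the core of AI-Ready infrastructure. The pipeline: AI output → instant validation → zero-DDL storage.

Part 2: Killing N+1

How One SQL Optimization Cut Latency from 1 Second to 25 Milliseconds

We reduced the number of database queries from 101 to 1, and latency from 1000ms to 25ms. The secret is PostgreSQL's CTE + JSON_AGG.

Part 3: A Zero-Dirty-Read Serverless Lakehouse

How We Solved the Consistency Problem with DuckDB

PostgreSQL handles "the present"; DuckDB + Parquet handles "the past." The Anti-Join + Dirty Set mechanism ensures zero dirty reads in federated queries.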

LTSeq: A Fast, Memory-Efficient Engine for Ordered-Sequence Operations

English Series

A five-part engineering blog series explaining how LTSeq performs fast, memory-efficient operations on ordered datasets.