Although I work for VAST Data, these notes are my own personal notes and are not authoritative. They may be wrong.

VAST DataBase is a capability built into the VAST Element Store to represent structured (tabular) data in a way that leverages the strengths of VAST’s pools of SCM and flash. It combines row‑based transactions and column‑based analytics by writing new rows into SCM and, once enough data accumulates, transposing rows into small (~32 KB) columnar chunks as they are written down to flash.

Queries can read from the write buffers and from the columnar chunks, so the system delivers ACID transaction semantics while still providing column‑store performance.
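To make the write path concrete, here is a toy sketch of how I picture it. It is not VAST's implementation: the `HybridTable` class, the `chunk_rows` threshold, and the in-memory lists standing in for SCM and flash are all made up for illustration. It buffers rows, transposes them into columnar chunks once enough accumulate, and answers a column scan from both places.

```python
from typing import Any


class HybridTable:
    """Toy model: a row-oriented write buffer plus columnar chunks."""

    def __init__(self, columns: list[str], chunk_rows: int = 4) -> None:
        self.columns = columns
        self.chunk_rows = chunk_rows  # hypothetical flush threshold (rows)
        self.write_buffer: list[tuple] = []  # stands in for rows held in SCM
        self.chunks: list[dict[str, list]] = []  # stands in for chunks on flash

    def insert(self, row: tuple) -> None:
        """Append a row; flush to a columnar chunk once the buffer is full."""
        if len(row) != len(self.columns):
            raise ValueError("row width does not match the table schema")
        self.write_buffer.append(row)
        if len(self.write_buffer) >= self.chunk_rows:
            self._flush()

    def _flush(self) -> None:
        """Transpose buffered rows into one column-major chunk."""
        transposed = zip(*self.write_buffer)
        chunk = {name: list(vals) for name, vals in zip(self.columns, transposed)}
        self.chunks.append(chunk)
        self.write_buffer = []

    def scan(self, column: str) -> list[Any]:
        """Read one column from the chunks and from the unflushed rows."""
        idx = self.columns.index(column)
        values: list[Any] = []
        for chunk in self.chunks:
            values.extend(chunk[column])  # columnar read: touches one column only
        values.extend(row[idx] for row in self.write_buffer)  # row-wise read
        return values
```

A scan sees a row as soon as `insert` returns, whether or not it has been transposed yet, which is the property the paragraph above describes.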

The behavior of DataBase is pretty well documented in the VAST DataBase Administrator’s Guide.

Interfaces

Querying the VAST DataBase is most commonly done using either Trino or Spark SQL. VAST ships drivers for both that implement predicate pushdown.
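As a reminder of what predicate pushdown buys you, here is a generic sketch. It does not use the Trino or Spark drivers, and `storage_scan` and both query functions are hypothetical names. It compares filtering in the engine after fetching every row with handing the filter to the storage layer so that only matching rows are transferred.

```python
from typing import Callable, Optional

Row = dict[str, int]
Predicate = Callable[[Row], bool]


def storage_scan(rows: list[Row], predicate: Optional[Predicate] = None) -> list[Row]:
    """Hypothetical storage-side scan; applies the filter if one is pushed down."""
    if predicate is None:
        return list(rows)
    return [row for row in rows if predicate(row)]


def query_without_pushdown(rows: list[Row], predicate: Predicate) -> tuple[list[Row], int]:
    """Fetch everything, then filter in the query engine."""
    transferred = storage_scan(rows)
    result = [row for row in transferred if predicate(row)]
    return result, len(transferred)


def query_with_pushdown(rows: list[Row], predicate: Predicate) -> tuple[list[Row], int]:
    """Let the storage layer filter, so only matching rows are transferred."""
    transferred = storage_scan(rows, predicate)
    return transferred, len(transferred)
```

Both functions return the same result set plus a count of rows that crossed the storage boundary; the second number is the only thing that differs, and it is what pushdown is meant to shrink.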

The low-level interface into the VAST DataBase is provided by the VAST Arrow Database Connectivity (ADBC) driver.1 This exposes a DuckDB-like SQL interface.

Underneath this is a REST-based API that end users aren’t meant to use. I think it encodes messages using protobuf.

VAST Query Engine

VAST implements a query engine within DataBase that becomes more capable with each release. As of version 5.4, the VAST Query Engine supports:2

Feature                     Function
Vector search               The basis for the VAST vector database
Column permissions          Control which users can see which columns
Filter pushdown             Scalable filtering of simple predicates3
Nested datatype processing  Query elements within structures within columns

Footnotes

  1. https://vast-data.github.io/data-platform-field-docs/vast_vectordb/overview/overview.html#getting-started

  2. VAST Query Engine

  3. Supported Pushdowns