The Platform

A lakehouse engine built for all three workloads.

DataLynxr's query engine, streaming runtime, and ML layer all share the same storage cursor — not separate stores bolted together. That's the difference.


[Image: abstract representation of DataLynxr's unified storage layer, with three parallel channels funneling into a single data plane]

Unified Storage Cursor

SQL queries, streaming ingestion, and ML feature reads all target the same lakehouse tables via a single storage abstraction. No copies, no divergence.

Vectorized SQL Engine

Columnar, vectorized execution with JIT compilation. Sub-second P95 latency on 10 TB datasets against S3-backed Iceberg and Delta tables.

Streaming Ingest Runtime

Ingest from Kafka, Kinesis, and Pulsar directly into lakehouse tables with exactly-once semantics and ACID transaction support. Sub-5s end-to-end latency, no landing zone.
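One common way to get exactly-once delivery into a table is to commit the source offsets together with the data, so a replayed batch can be recognized and skipped. The sketch below illustrates that general idea only. `ToyTable` and `commit_batch` are hypothetical names, not DataLynxr APIs, and a single method call stands in for an atomic table commit.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTable:
    """In-memory stand-in for a lakehouse table (hypothetical, not a DataLynxr API)."""
    rows: list = field(default_factory=list)
    # Highest source offset already committed, keyed by source partition.
    committed_offsets: dict = field(default_factory=dict)

    def commit_batch(self, partition: int, first_offset: int, records: list) -> int:
        """Append records and advance the committed offset in one step.

        Returns the number of records actually written. Records at or below
        the committed offset are treated as replays and skipped.
        """
        last_committed = self.committed_offsets.get(partition, -1)
        new_records = [
            rec for off, rec in enumerate(records, start=first_offset)
            if off > last_committed
        ]
        if not new_records:
            return 0
        # In a real table format both changes would land in one atomic commit;
        # here a single method call plays that role.
        self.rows.extend(new_records)
        self.committed_offsets[partition] = first_offset + len(records) - 1
        return len(new_records)


table = ToyTable()
table.commit_batch(partition=0, first_offset=0, records=["a", "b", "c"])
# The source redelivers an overlapping batch after a simulated failure.
table.commit_batch(partition=0, first_offset=1, records=["b", "c", "d"])
print(table.rows)  # ['a', 'b', 'c', 'd']
```

Because the offset and the rows advance together, redelivery after a crash cannot produce duplicates in this model.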

ML Feature Layer

The Python SDK exposes point-in-time-correct reads from lakehouse tables. Training and inference use the same read path, so training/serving skew is ruled out by architecture.
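To make "point-in-time correct" concrete: for each training label, the feature value used must be the latest one known at or before the label's timestamp, never a later one. The following is a minimal sketch of that rule in plain Python. It does not use the DataLynxr SDK; the function name and the sample data are invented for illustration.

```python
from bisect import bisect_right


def point_in_time_lookup(feature_history, as_of):
    """Return the latest feature value with timestamp <= as_of, or None.

    feature_history: list of (timestamp, value) tuples sorted by timestamp.
    """
    timestamps = [ts for ts, _ in feature_history]
    idx = bisect_right(timestamps, as_of)
    if idx == 0:
        return None  # no value was known yet at that time
    return feature_history[idx - 1][1]


# Hypothetical feature: a user's purchase count, updated over time.
history = [(100, 2), (200, 5), (300, 9)]

# Label events observed at different times.
label_events = [(150, "no_churn"), (300, "churn"), (50, "no_churn")]

training_rows = [
    (ts, point_in_time_lookup(history, ts), label) for ts, label in label_events
]
print(training_rows)
# [(150, 2, 'no_churn'), (300, 9, 'churn'), (50, None, 'no_churn')]
```

The label at time 150 sees the value written at 100, not the one written at 200, which is what prevents future data from leaking into training rows.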

ACID Transactions

Full ACID guarantees across all workload types. Snapshot isolation, time-travel queries (up to 7 days back), and Z-order clustering applied during file compaction.
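Snapshot isolation and time travel both follow from keeping immutable table snapshots: a reader pins one snapshot and is unaffected by later commits, and an "as of" query simply selects an older snapshot. Here is a toy model of that behavior. The `SnapshotTable` class is hypothetical, timestamps are plain integers, and the 7-day retention window is not modeled.

```python
class SnapshotTable:
    """Toy append-only snapshot log (illustrative; not a real table format)."""

    def __init__(self):
        # Each entry is (commit_time, immutable tuple of rows).
        self._snapshots = [(0, ())]

    def commit(self, commit_time, new_rows):
        """Create a new snapshot; earlier snapshots are never modified."""
        last_time, last_rows = self._snapshots[-1]
        if commit_time <= last_time:
            raise ValueError("commit times must be strictly increasing")
        self._snapshots.append((commit_time, last_rows + tuple(new_rows)))

    def read(self, as_of=None):
        """Return rows from the latest snapshot committed at or before as_of."""
        if as_of is None:
            return self._snapshots[-1][1]
        visible = [rows for ts, rows in self._snapshots if ts <= as_of]
        if not visible:
            raise LookupError("no snapshot exists at or before as_of")
        return visible[-1]


orders = SnapshotTable()
orders.commit(10, ["order-1"])
pinned = orders.read()           # a reader pins the current snapshot
orders.commit(20, ["order-2"])   # a concurrent writer commits afterwards

print(pinned)                    # ('order-1',)
print(orders.read())             # ('order-1', 'order-2')
print(orders.read(as_of=15))     # ('order-1',)
```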

Multi-Cloud Storage

S3, GCS, and ADLS Gen2 native connectors. Data stays in your account. No storage copy, no lock-in. Bring your own bucket policy.

Apache Iceberg

Full read/write support including partition evolution, schema evolution, hidden partitioning, and branching/tagging.

Partition evolution
Schema evolution
Hidden partitioning
Branch & tag
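Hidden partitioning means the partition value is derived from a regular column by a transform (for example, the day of a timestamp), so queries filter on the column itself and the engine prunes partitions behind the scenes. The sketch below models only that idea in memory. The class, the `event_ts` column, and the pruning logic are illustrative assumptions, not Iceberg's or DataLynxr's implementation.

```python
from collections import defaultdict
from datetime import datetime, date


def day_transform(ts: datetime) -> date:
    """Partition transform: map a timestamp to its calendar day."""
    return ts.date()


class HiddenPartitionedTable:
    """Toy table that partitions rows by day(event_ts) without exposing it."""

    def __init__(self):
        self._partitions = defaultdict(list)  # partition day -> rows

    def append(self, row: dict):
        self._partitions[day_transform(row["event_ts"])].append(row)

    def scan(self, start: datetime, end: datetime):
        """Return (rows, partitions_scanned) for start <= event_ts < end."""
        scanned = 0
        matched = []
        for part_day, rows in self._partitions.items():
            # Prune partitions that cannot contain rows in the requested range.
            if part_day < start.date() or part_day > end.date():
                continue
            scanned += 1
            matched.extend(r for r in rows if start <= r["event_ts"] < end)
        return matched, scanned


events = HiddenPartitionedTable()
for day in (1, 2, 3):
    events.append({"id": day, "event_ts": datetime(2024, 1, day, 12, 0)})

# The query filters on event_ts only; no partition column is mentioned.
rows, scanned = events.scan(datetime(2024, 1, 2, 6, 0), datetime(2024, 1, 2, 18, 0))
print([r["id"] for r in rows], scanned)  # [2] 1
```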

Delta Lake

Full Delta protocol support: DML operations (MERGE, UPDATE, DELETE), Change Data Feed, liquid clustering.

DML (MERGE / UPDATE / DELETE)
Change Data Feed
Liquid clustering
Column stats

Apache Hudi

Copy-on-Write and Merge-on-Read tables. Incremental queries and bootstrapping from existing Hudi datasets.

Copy-on-Write
Merge-on-Read
Incremental queries
Bootstrap from existing

Architecture

Stateless compute. Decoupled storage.

DataLynxr separates compute from storage at the protocol level. Your query nodes are stateless and horizontally autoscalable. Your data lives in your cloud object storage, not in our infrastructure.

The catalog layer tracks all table metadata and transaction logs. Every read and write, whether SQL, streaming, or ML, goes through the same catalog transaction layer before touching storage. That's how ACID is enforced across workload types.
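A catalog can serialize commits from different workloads by holding one "current snapshot" pointer per table and only advancing it with an atomic compare-and-swap. The following sketch shows that optimistic-commit pattern under stated assumptions: `ToyCatalog` and `commit_with_retry` are hypothetical names, snapshot ids are plain integers, and a lock stands in for whatever atomic primitive a real catalog would use.

```python
import threading


class ToyCatalog:
    """Maps table name -> current snapshot id (hypothetical, for illustration)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._current = {}

    def current_snapshot(self, table):
        with self._lock:
            return self._current.get(table, 0)

    def compare_and_swap(self, table, expected, new):
        """Atomically move the table pointer if nobody else committed first."""
        with self._lock:
            if self._current.get(table, 0) != expected:
                return False
            self._current[table] = new
            return True


def commit_with_retry(catalog, table, max_attempts=3):
    """Optimistic commit: read the base snapshot, then swap the pointer."""
    for _ in range(max_attempts):
        base = catalog.current_snapshot(table)
        # ... a writer would stage new data files against `base` here ...
        if catalog.compare_and_swap(table, expected=base, new=base + 1):
            return base + 1
    raise RuntimeError("too many concurrent commits")


catalog = ToyCatalog()
first = commit_with_retry(catalog, "events")       # e.g. a SQL writer
# A streaming writer that still holds the old base snapshot is rejected...
stale_ok = catalog.compare_and_swap("events", expected=0, new=1)
# ...and succeeds once it retries against the new base.
second = commit_with_retry(catalog, "events")
print(first, stale_ok, second)  # 1 False 2
```

Readers in this model would call `current_snapshot` once and read only files belonging to that snapshot, which is what keeps them isolated from in-flight writes.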


dlx-arch-overview

┌──────────────────────────────────┐
│  DataLynxr Compute Layer         │
│  (stateless, autoscale)          │
│  ┌──────┐  ┌────────┐  ┌──────┐  │
│  │ SQL  │  │ Stream │  │  ML  │  │
│  └──────┘  └────────┘  └──────┘  │
└──────────────────────────────────┘
          ↓ same cursor ↓
┌──────────────────────────────────┐
│   Catalog + Transaction Layer    │
│   (Iceberg / Delta / Hudi)       │
└──────────────────────────────────┘
          ↓ your bucket ↓
┌──────────────────────────────────┐
│   S3 / GCS / ADLS (your account) │
└──────────────────────────────────┘

Python SDK

JDBC / ODBC

Apache Airflow

dbt Core

Apache Superset

Apache Kafka

Amazon Kinesis

Apache Pulsar


Not a data warehouse

DataLynxr does not store your data in proprietary columnar storage. Your data lives in your S3/GCS/ADLS bucket in open Parquet format. If you need a traditional OLAP warehouse with its own storage layer, Snowflake or Redshift is the right tool.

Not a BI or dashboarding tool

DataLynxr provides the SQL query interface. Visualization and dashboarding are handled by tools that connect to it: Tableau, Metabase, Apache Superset, or any JDBC/ODBC-compatible client. DataLynxr doesn't build charts.

Not a managed Spark service

There is no Spark runtime in DataLynxr. The query engine is purpose-built and vectorized. If your workload depends on Spark-specific APIs (RDD operations, Spark MLlib, SparkR), you need a Spark-native platform such as Databricks or Amazon EMR.

One platform. Three workloads. No ETL.

Connect your cloud storage and run your first query in under 10 minutes.
