Built by engineers who built too many ETL pipelines.
We spent years debugging Airflow DAGs that copied data from one warehouse to another just so analysts could query it. DataLynxr is the platform we wished we had.
Why we started DataLynxr
In late 2021, Ryan Whitaker was the data engineering lead on a 200TB lakehouse migration for a mid-size logistics company in Denver. The architecture was modern on paper: Apache Iceberg tables in S3, dbt transformations, Airflow orchestration. But there were still six separate Airflow DAGs whose only job was copying data from the lake into a Snowflake schema so the analytics team could query it — because the analysts needed Snowflake's SQL interface, and the lakehouse didn't have one.
Jordan Park, an Apache Iceberg contributor who'd spent three years building query engine internals, saw the same pattern from the other side: perfectly good Parquet tables in S3, and teams paying to copy them somewhere else just to run SQL. The format spec could support direct SQL queries. The tooling to make that production-grade wasn't there.
They left their respective roles in early 2022, moved to a shared workspace in downtown Denver, and started building DataLynxr. The goal was specific: a compute platform that runs SQL, streaming ingestion, and ML feature reads against the same Iceberg or Delta tables, without requiring a copy of the data to land somewhere else first.
DataLynxr is self-funded. We have not raised venture capital. We're profitable on our operational infrastructure and growing through word of mouth among data engineering teams at 50TB+ companies. That's the business we chose to build — deliberately sized for durability, not for a fundraising narrative.
What we stand for
- Open formats, always. Your data stays in your S3/GCS/ADLS bucket in Apache Iceberg or Delta Lake format. We don't have a proprietary storage format. If you ever want to leave, point any Iceberg- or Delta-compatible tool at your tables and carry on without us.
- Reproducible benchmarks. Every performance number we publish includes the test harness, the TPC-DS query files, the table DDL, and the Z-ordering configuration. No marketing math.
- Honest scope. DataLynxr is not a BI tool, not a managed Spark service, not a full data platform. It is a compute layer for SQL, streaming, and ML against lakehouse storage. If that's not your problem, we'll say so.
- Bootstrapped independence. We answer to our customers, not to a funding schedule. Features get prioritized by customer impact, not by what would help a future fundraising story.
The people building it
Ryan Whitaker
CEO & Co-founder
Former data engineering lead at an enterprise analytics company. Spent 8 years building data infrastructure before founding DataLynxr.
Jordan Park
CTO & Co-founder
Query engine architect. Contributed to open-source Apache Iceberg before co-founding DataLynxr to build a unified lakehouse runtime.
Maya Chen
Head of Product
Product background in developer tooling. Joined DataLynxr to build the workflows data engineers actually want to use.
Alex Torres
Developer Advocate
Data engineer turned community builder. Maintains the DataLynxr notebook library and runs the community Slack.
Headquartered in Denver, CO
We're based in downtown Denver, at the heart of Colorado's growing data engineering community. The team is remote-first, with engineers distributed across North America.
1600 Stout Street, Suite 1100
Denver, CO 80202
+1 (720) 504-8312
Want to talk shop?
Ryan and Jordan talk to every prospective team before onboarding. No sales process — just an engineer-to-engineer conversation.