Embucket

Run Snowflake SQL on your data lake in 30 seconds. Zero dependencies.


Quick start

Start Embucket and run your first query in 30 seconds:

docker run --name embucket --rm -p 8080:8080 -p 3000:3000 embucket/embucket

Open localhost:8080 (login: embucket/embucket) and run:

CREATE TABLE sales (id INT, product STRING, revenue DECIMAL(10,2));
INSERT INTO sales VALUES (1, 'Widget A', 1250.00), (2, 'Widget B', 899.50);
SELECT product, revenue FROM sales WHERE revenue > 1000;

Done. You just ran Snowflake SQL on Apache Iceberg tables with zero configuration.
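To preview what the final SELECT should return, the three statements can be replayed against an in-memory SQLite database from Python. This is only a stand-in sketch: SQLite is not Embucket, its type handling differs from Snowflake's, and the snippet never talks to the container started above.

```python
import sqlite3

# In-memory SQLite used purely as a stand-in engine (an assumption for
# illustration; Embucket itself is not involved here).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (id INT, product STRING, revenue DECIMAL(10,2))")
conn.execute(
    "INSERT INTO sales VALUES (1, 'Widget A', 1250.00), (2, 'Widget B', 899.50)"
)

# Same filter as the quick-start query: keep rows with revenue above 1000.
rows = conn.execute(
    "SELECT product, revenue FROM sales WHERE revenue > 1000"
).fetchall()

for product, revenue in rows:
    print(product, revenue)
conn.close()
```

Only the Widget A row passes the filter, since 899.50 is below the 1000 threshold.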

What just happened?

Embucket provides a single binary that gives you a Snowflake-compatible lakehouse:

  • Snowflake SQL & API: Use your existing queries, dbt projects, and BI tools
  • Apache Iceberg storage: Your data stays in open formats on object storage
  • Zero dependencies: No databases, no clusters, no configuration files
  • Query-per-node: Each instance handles complete queries independently

Perfect for teams who want Snowflake's simplicity without the vendor lock-in.

Architecture

[Figure: Embucket architecture diagram]

Zero-disk lakehouse: all data and metadata live in object storage. Nodes stay stateless and replaceable.
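Because nodes hold no state and each one runs a complete query independently, a client or load balancer can in principle send any query to any node. A minimal sketch of that idea, assuming simple round-robin dispatch; the node URLs and the `assign_round_robin` helper are hypothetical and are not part of Embucket:

```python
from itertools import cycle

# Hypothetical node addresses, for illustration only.
NODES = ["http://node-a:3000", "http://node-b:3000", "http://node-c:3000"]


def assign_round_robin(queries, nodes):
    """Pair each query with one node, cycling through the nodes in order."""
    node_cycle = cycle(nodes)
    return [(query, next(node_cycle)) for query in queries]


queries = ["SELECT 1", "SELECT 2", "SELECT 3", "SELECT 4"]
assignments = assign_round_robin(queries, NODES)

for query, node in assignments:
    print(f"{query!r} -> {node}")
```

With three nodes and four queries, the fourth query wraps around to the first node. Adding a node to the list spreads the same queries across more instances, and no node needs to know about the others.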

Built on proven open source:

Why Embucket?

Escape the dilemma of choosing between vendor lock-in (Snowflake) and operational complexity (a do-it-yourself lakehouse).

  • Radical simplicity - Single binary deployment
  • Snowflake compatibility - Works with your existing tools
  • Open data - Apache Iceberg format, no lock-in
  • Horizontal scaling - Add nodes for more throughput
  • Zero operations - No external dependencies to manage

Next steps

Ready for more? Check out the comprehensive documentation:

  • Quick Start Guide - Detailed setup and first queries
  • Architecture Guide - How the zero-disk lakehouse works
  • Configuration - Production deployment options
  • dbt Integration - Run existing dbt projects

From source:

git clone https://github.com/Embucket/embucket.git
cd embucket && cargo build
./target/debug/embucketd

Contributing

Contributions welcome. To get involved:

  1. Fork the repository on GitHub
  2. Create a new branch for your feature or bug fix
  3. Submit a pull request with a detailed description

For more details, see CONTRIBUTING.md.

License

This project uses the Apache 2.0 License. See LICENSE for details.