Designed by Agile Lab, Witboost is a versatile platform that addresses a wide range of sophisticated data engineering challenges. It enables businesses to discover, enhance, and productize their data, fostering the creation of automated data platforms that adhere to the highest standards of data governance. Want to know more about Witboost? Check it out here or contact us!
This repository is a guide to our Starter Kit, which showcases Witboost's integration capabilities and provides a "batteries-included" product.
- Tech Adapters
- Templates
- Computational Governance Platform
- Data Contract
- Data Catalog Plugins
- Other Integrations
- Practice Shaper Presets
- Access Control Request Template
A Tech Adapter (formerly called Specific Provisioner) is a microservice which is in charge of deploying components that use a specific technology. When the deployment of a system (e.g. a Data Product) is triggered, the platform generates its descriptor and orchestrates the deployment of every component contained in the system. For every such component the platform knows which Tech Adapter is responsible for its deployment, and can thus send a provisioning request with the descriptor to it so that the Tech Adapter can perform whatever operation is required to fulfill this request and report back the outcome to the platform.
The old name, Specific Provisioner, still appears in most of the repositories while we complete the rename. Whenever you encounter Specific Provisioner, read it as Tech Adapter.
You can learn more about how the Tech Adapters fit in the broader picture here.
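The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the descriptor layout (`components`, `id`, `technology`, `name`), the `ProvisioningResult` shape and the `example-storage` technology are hypothetical names chosen for the example, not the actual Witboost descriptor schema or Tech Adapter API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class ProvisioningResult:
    """Outcome reported back to the platform (hypothetical shape)."""
    component_id: str
    status: str  # "COMPLETED" or "FAILED"
    message: str = ""


# Hypothetical registry: technology name -> function that deploys one component.
Handler = Callable[[dict], None]
HANDLERS: Dict[str, Handler] = {}


def handles(technology: str) -> Callable[[Handler], Handler]:
    """Register the deployment function responsible for one technology."""
    def register(func: Handler) -> Handler:
        HANDLERS[technology] = func
        return func
    return register


@handles("example-storage")
def deploy_example_storage(component: dict) -> None:
    # A real adapter would call the target technology's API here.
    if "name" not in component:
        raise ValueError("component is missing a 'name'")


def provision(descriptor: dict) -> List[ProvisioningResult]:
    """Send every component to its responsible handler and collect the outcomes."""
    results = []
    for component in descriptor.get("components", []):
        component_id = component.get("id", "<unknown>")
        handler = HANDLERS.get(component.get("technology", ""))
        if handler is None:
            results.append(
                ProvisioningResult(component_id, "FAILED", "no adapter for this technology")
            )
            continue
        try:
            handler(component)
            results.append(ProvisioningResult(component_id, "COMPLETED"))
        except Exception as exc:  # report the failure instead of aborting the whole run
            results.append(ProvisioningResult(component_id, "FAILED", str(exc)))
    return results
```

Keeping one handler per technology mirrors the idea that each Tech Adapter owns a single technology, while the orchestration loop only needs to know which handler to call and how to record its outcome.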
We provide two main kinds of projects:
- Tech Adapters: microservice implementations for a specific technology that you can customize to suit your needs
- Scaffolds: base projects that you can start from if you want to implement a tech adapter yourself
| Kind | Project | Scope | Supported components | Notes |
|---|---|---|---|---|
| Tech Adapter | Airbyte Adapter | ELT - Airbyte | Workload | |
| Tech Adapter | Airflow Adapter | Scheduling - Airflow/MWAA | Workload | |
| Tech Adapter | Azure ADLS Storage Area Adapter | Object Storage - Azure Data Lake Storage | Storage Area | Deployable with Azure ADLS Umbrella Chart |
| Tech Adapter | Azure ADLS Output Port Adapter | Object Storage - Azure Data Lake Storage | Output Port | |
| Tech Adapter | CDP Impala Adapter | SQL Query Engine - CDP Impala | Output Port | |
| Tech Adapter | CDP S3 Adapter | Object Storage - CDP S3 | Output Port | |
| Tech Adapter | CDP Spark Adapter | Data Processing - CDP Spark | Workload | |
| Tech Adapter | CDP HDFS Adapter | Distributed File System - CDP HDFS | Output Port, Storage Area | |
| Tech Adapter | Databricks Adapter | Data Processing - Databricks Spark | Workload, Output Port | |
| Tech Adapter | Hasura Adapter | GraphQL - Hasura | Output Port | Needs the Hasura Authentication Webhook and Role Mapper |
| Tech Adapter | Snowflake Adapter | DWH - Snowflake | Output Port, Storage Area | |
| Tech Adapter | Azure Data Factory Adapter | ETL - Azure Data Factory | Workload | |
| Tech Adapter | Microsoft Fabric Output Port Adapter | DWH - Microsoft Fabric | Output Port | |
| Tech Adapter | Great Expectations Guardian Adapter | Data Quality - Great Expectations | Workload | |
| Tech Adapter | Confluent Kafka Adapter | Streaming - Confluent Kafka | Output Port | |
| Tech Adapter | Google BigQuery Adapter | DWH - Google BigQuery | Output Port | |
| Tech Adapter | AWS Athena Adapter | Serverless Query Engine - Amazon Athena | Output Port | |
| Tech Adapter | AWS Glue Adapter | Serverless Integration Service - AWS Glue | Workload | |
| Tech Adapter | AWS S3 Adapter | Object Storage - AWS S3 | Storage Area | |
| Scaffold | Java Scaffold | Generic - Java | NA | Uses the Java Tech Adapter Framework library. |
| Scaffold | Python Scaffold | Generic - Python | NA | |
| Scaffold | Terraform Scaffold | Generic - Terraform | NA | |
A Template is a tool that helps create components inside Witboost under a specific Data Landscape (e.g. Data Mesh). Templates help establish a standard across the organization. This standard leads to easier understanding, management and maintenance of components. Templates provide a predefined structure so that developers don't have to start from scratch each time, which leads to faster development and allows them to focus on other aspects, such as testing and business logic.
For more information, please refer to the official documentation.
| Component | Project | Scope | Tech Adapter | Notes |
|---|---|---|---|---|
| Data Product | Data Product | NA | No Tech Adapter needed | |
| Storage Area | Azure ADLS Storage Area | Data Lake Storage - Azure | Azure ADLS Storage Area Adapter | |
| Output Port | Azure ADLS Output Port | Data Lake Storage - Azure | Azure ADLS Output Port Adapter | |
| Output Port | CDP CDW Impala Output Port | SQL Query Engine - CDP CDW Impala | CDP Impala Adapter | |
| Output Port | CDP DL S3 Output Port | Object Storage - CDP DL S3 | CDP S3 Adapter | |
| Storage Area | CDP HDFS Storage Area | Distributed File System - CDP HDFS | CDP HDFS Adapter | |
| Output Port | CDP HDFS Output Port | Distributed File System - CDP HDFS | CDP HDFS Adapter | |
| Output Port | Hasura Output Port | GraphQL - Hasura | Hasura Adapter | |
| Output Port | Snowflake Output Port | DWH - Snowflake | Snowflake Adapter | |
| Storage Area | Snowflake Storage Area | DWH - Snowflake | Snowflake Adapter | |
| Workload | Snowflake SQL Workload | Data processing - Snowflake | No Tech Adapter needed | It's triggered by an orchestrator through the Airflow Workload Template |
| Workload | Airbyte Workload | ELT - Airbyte | Airbyte Adapter | |
| Workload | CDP CDE Spark Workload | Data Processing - CDP CDE Spark | CDP Spark Adapter | |
| Workload | DBT Workload | Data processing - DBT | No Tech Adapter needed | |
| Workload | Airflow Workload | Scheduling - Airflow/MWAA | Airflow Adapter | |
| Workload | Azure Data Factory Workload | ETL - Azure Data Factory | Azure Data Factory Adapter | |
| Workload | Great Expectations Guardian Workload | Data Quality - Great Expectations | Great Expectations Guardian Adapter | |
| Output Port | Confluent Kafka Output Port | Streaming - Confluent Kafka | Confluent Kafka Adapter | |
| Output Port | Google BigQuery Output Port | DWH - Google BigQuery | Google BigQuery Adapter | |
| Output Port | AWS Athena Output Port | Serverless Query Engine - Amazon Athena | AWS Athena Adapter | |
| Workload | AWS Glue Workload | Serverless Integration Service - AWS Glue | AWS Glue Adapter | |
| Storage Area | AWS S3 Storage Area | Object Storage - AWS S3 | AWS S3 Adapter | |
Looking to build your own template? Check out the Templates Gallery, which contains howtos and practical examples to kickstart the process.
The Computational Governance Platform module enables a true shift left of data governance within the software and data development processes.
It allows the platform team to create, evolve and enforce computational policies and metrics. Data governance is therefore no longer just a set of guidelines: it is enforced automatically through code and cannot be bypassed.
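To make the idea of governance enforced through code concrete, here is a minimal sketch of a policy check that gates a deployment. It is an illustration only: the descriptor fields (`components`, `id`, `owner`, `description`), the policy names and the 10-character threshold are assumptions made for this example, and the sketch does not reflect the policy language or descriptor schema that Witboost actually uses.

```python
from typing import Callable, Dict, List

# A policy inspects one component and returns the reasons it fails (empty if it passes).
Policy = Callable[[dict], List[str]]


def require_owner(component: dict) -> List[str]:
    return [] if component.get("owner") else ["component has no owner"]


def require_description(component: dict) -> List[str]:
    description = component.get("description", "")
    if len(description.strip()) >= 10:
        return []
    return ["description is shorter than 10 characters"]


# Hypothetical policy set maintained by the platform team.
POLICIES: Dict[str, Policy] = {
    "owner-required": require_owner,
    "description-required": require_description,
}


def evaluate(descriptor: dict) -> List[str]:
    """Run every policy against every component and collect the violations."""
    violations = []
    for component in descriptor.get("components", []):
        component_id = component.get("id", "<unknown>")
        for name, policy in POLICIES.items():
            for reason in policy(component):
                violations.append(f"{component_id}: {name}: {reason}")
    return violations


def can_deploy(descriptor: dict) -> bool:
    """Deployment is allowed only when no policy is violated."""
    return not evaluate(descriptor)
```

Because `can_deploy` is evaluated before anything is provisioned, a failing policy blocks the deployment instead of producing a warning that someone may ignore.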
For more information, please refer to the official documentation.
Looking to build your own policies that can be used in Witboost either as-is or as a starting point for your custom policies? Check out the Policies Gallery, which contains howtos and practical examples to kickstart the process.
To implement an end-to-end Data Contract mechanism with a pluggable architectural pattern, three components are needed.
- Data Contract Definition Template: Lets the end user define the contract metadata. You can adopt any data contract specification you want (Bitol or others).
- Data Contract Guardian: An autonomous agent that Witboost injects into every Data Contract to actively monitor it and raise alerts in case of Contract Drift. The guardian can be implemented with any technology and architectural pattern, which leaves a great deal of freedom. Because each guardian runs autonomously, the overall solution scales without a central bottleneck.
| Guardian | Project | Notes |
|---|---|---|
| Batch With Circuit Break | Great Expectations Guardian Workload | Great Expectations Guardian Tech Adapter |
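As an illustration of what a guardian checks, the sketch below compares the schema declared in a contract with the schema observed in the data and reports any drift. It assumes a contract reduced to a simple column-to-type mapping; the function name and the finding messages are hypothetical, and real guardians (such as the Great Expectations one listed above) may check much more than the schema.

```python
from typing import Dict, List


def detect_drift(contract_schema: Dict[str, str], observed_schema: Dict[str, str]) -> List[str]:
    """Return a list of differences between the declared and the observed schema."""
    findings = []
    # Columns promised by the contract must exist with the declared type.
    for column, expected_type in contract_schema.items():
        actual_type = observed_schema.get(column)
        if actual_type is None:
            findings.append(f"missing column: {column}")
        elif actual_type != expected_type:
            findings.append(f"type changed: {column} ({expected_type} -> {actual_type})")
    # Columns that appear in the data but were never declared are also drift.
    for column in observed_schema:
        if column not in contract_schema:
            findings.append(f"undeclared column: {column}")
    return findings


def should_alert(contract_schema: Dict[str, str], observed_schema: Dict[str, str]) -> bool:
    """A guardian would raise an alert whenever any drift is found."""
    return bool(detect_drift(contract_schema, observed_schema))
```

A guardian running this check on a schedule next to each contract needs no shared coordinator, which is the property the autonomy argument above relies on.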
A Data Catalog Plugin is an extension point for Witboost that allows publishing entities on an external, pluggable Data Catalog. It is invoked at the end of the provisioning flow and receives the whole information about the entity descriptor, provisioning info, etc.
You can learn more about how Data Catalog plugins fit in the broader picture here.
| Kind | Project | Scope | Notes |
|---|---|---|---|
| Data Catalog Plugin | Collibra Data Catalog Plugin | Data Catalog - Collibra | |
In this section you can find a list of possible integrations. They are not as production-ready as the ones above, but they are still a good starting point for addressing specific use cases and understanding Witboost's capabilities.
| Kind | Project | Scope | Supported components | Notes |
|---|---|---|---|---|
| Tech Adapter | Ice Panel | C4 Architecture Diagram | Data Product | Needs an IcePanel license |
| Tech Adapter | Tonic.ai | Synthetic Data Generation | Output Port | Needs a Tonic.ai license |
| Tech Adapter | DCAT - OWL - RDF | Data Catalog | Output Port | Needs an RDF Triple Store endpoint |
| Tech Adapter | GoodData | Analytics | Output Port | Needs a GoodData license |
| Extension | MCP Server | Agentic AI | | Helps connect third-party agents to Witboost |
The Practice Shaper is the main and most impactful Witboost setting that models entities (domains, systems, components, templates) as nodes of a fully-configurable property graph.
This enables data-oriented organizations to shape Witboost based on their unique use cases, structure, and needs.
Thanks to the Practice Shaper, a company can approach any project scenario in data (Data Landscape), such as Data Mesh, Business Intelligence, Machine Learning and others, by defining which practices are enabled and regulated, with the possibility to define technological and methodological guardrails.
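The property-graph idea can be pictured with a small, self-contained sketch. Everything in it is an assumption made for illustration: the `Node` and `PropertyGraph` classes, the `kind` values and the `contains` relation are hypothetical and do not describe how the Practice Shaper is implemented or configured.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Node:
    """An entity such as a domain, system, component or template."""
    id: str
    kind: str
    properties: Dict[str, str] = field(default_factory=dict)


@dataclass
class PropertyGraph:
    nodes: Dict[str, Node] = field(default_factory=dict)
    # Each edge is (source id, relation name, target id).
    edges: List[Tuple[str, str, str]] = field(default_factory=list)

    def add_node(self, node: Node) -> None:
        self.nodes[node.id] = node

    def relate(self, source: str, relation: str, target: str) -> None:
        """Connect two existing nodes with a named relation."""
        if source not in self.nodes or target not in self.nodes:
            raise KeyError("both nodes must exist before they can be related")
        self.edges.append((source, relation, target))

    def related(self, source: str, relation: str) -> List[Node]:
        """Return the nodes reachable from `source` through `relation`."""
        return [self.nodes[t] for s, r, t in self.edges if s == source and r == relation]


# Example: a Data Mesh style landscape with one domain, one system and one component.
graph = PropertyGraph()
graph.add_node(Node("finance", "domain"))
graph.add_node(Node("billing", "system", {"landscape": "Data Mesh"}))
graph.add_node(Node("invoices", "component", {"type": "Output Port"}))
graph.relate("finance", "contains", "billing")
graph.relate("billing", "contains", "invoices")
```

Since node kinds and relation names are plain data rather than fixed classes, a different landscape can be modeled by changing the data alone.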
Refer to the Witboost documentation to learn more about Practice Shaper and Data Landscapes.
The Practice Shaper Presets repository provides some ready-to-import Data Landscapes, allowing organizations to quickly set up and customize their Witboost environment to suit specific business needs.
The Access Control Request Template is the Witboost mechanism for configuring the information that users must provide when requesting access to consumables on the Marketplace.
Refer to the Witboost documentation to learn more about Access Control Request Templates.
Check out the Access Control Request Template repository, which provides a base set of access control request templates that you can customize further for your platform.
This project is available under the Apache License, Version 2.0; see LICENSE for full details.
Witboost is a cutting-edge Data Experience platform that streamlines complex data projects across various platforms, enabling seamless data production and consumption. This unified approach empowers you to fully utilize your data without platform-specific hurdles, fostering smoother collaboration across teams.
It seamlessly blends business-relevant information, data governance processes, and IT delivery, ensuring technically sound data projects aligned with strategic objectives. Witboost facilitates data-driven decision-making while maintaining data security, ethics, and regulatory compliance.
Moreover, Witboost maximizes data potential through automation, freeing resources for strategic initiatives. Apply your data for growth, innovation and competitive advantage.