Platform Roadmap
High-level timeline for the development and expansion of the KBase Data Lakehouse (K-BERDL).
Q1 2025: Foundation
- Launch: Initial production release of the Lakehouse Core.
- Ingestion: General availability of the Data Transfer Service (DTS) for bulk uploads.
- Tenants: Onboarding of JGI and NMDC as pilot tenants.
- Catalog: Release of the Unified Metadata Catalog (Alpha).
Q2 2025: Federation
- Cross-Tenant Queries: Support for SQL joins across authorized tenant schemas.
- Federated Search: Ability to search external repositories (NCBI, EBI) from within the Lakehouse interface.
- Data Commons: Launch of the "BER Public Data Commons" for open datasets.
Q3 2025: AI & Automation
- Agent SDK: Release of the developer kit for building custom AI agents.
- Auto-Curator: Deployment of AI models for automated metadata tagging and quality improvement.
- Vector Search: Full semantic search capabilities across Narrative text and documentation.
Q4 2025: Advanced Analytics
- User Warehouses: Ability for users to spin up ephemeral SQL warehouses for ad-hoc analysis.
- Graph Analytics: Native support for large-scale knowledge graph queries.
- Real-time Dashboards: Integrated visualization tools (Superset/Grafana) for data metrics.
Note: Timelines and features are subject to change based on user feedback and DOE priorities.