Telemetry
oleander collects telemetry via OpenLineage and OpenTelemetry, the two open standards already running in your stack. No agents are deployed in your environment, and no direct access to your infrastructure is required. Point your existing tools at oleander’s ingest endpoint and telemetry starts flowing immediately. Every event is written to the oleander.telemetry namespace in the Iceberg catalog within seconds of emission.
| Standard | What it captures |
|---|---|
| OpenLineage | Job runs, dataset inputs and outputs, lineage edges, schema versions |
| OpenTelemetry | Spans, traces, logs, and resource attributes from pipeline execution |
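To make the OpenLineage half of this concrete, here is a minimal sketch of the kind of run event a tool might emit. The field names follow the OpenLineage RunEvent shape. The ingest URL, producer URI, bearer-token header, and helper names are assumptions for illustration only and are not documented oleander values.

```python
import json
import urllib.request
import uuid
from datetime import datetime, timezone

# Hypothetical placeholders: substitute the values for your organization.
INGEST_URL = "https://oleander.example/ingest/lineage"  # assumption, not a documented URL
PRODUCER = "https://example.com/my-pipeline"            # assumption: URI naming the emitting tool


def build_run_event(event_type, job_namespace, job_name, run_id=None, inputs=(), outputs=()):
    """Return an OpenLineage RunEvent as a plain dict.

    `inputs` and `outputs` are iterables of (namespace, name) pairs.
    """
    def dataset(ref):
        namespace, name = ref
        return {"namespace": namespace, "name": name}

    return {
        "eventType": event_type,  # e.g. START, COMPLETE, FAIL
        "eventTime": datetime.now(timezone.utc).isoformat(),
        "producer": PRODUCER,
        "schemaURL": "https://openlineage.io/spec/2-0-2/OpenLineage.json#/$defs/RunEvent",
        "run": {"runId": run_id or str(uuid.uuid4())},
        "job": {"namespace": job_namespace, "name": job_name},
        "inputs": [dataset(d) for d in inputs],
        "outputs": [dataset(d) for d in outputs],
    }


def send_event(event, url=INGEST_URL, token="<your-token>"):
    """POST the event as JSON. The auth scheme here is an assumption."""
    request = urllib.request.Request(
        url,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json", "Authorization": f"Bearer {token}"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

In practice most tools emit these events through an existing OpenLineage integration, so hand-building events like this is mainly useful for custom scripts or for checking what an integration sends.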
Catalog
Every oleander organization is provisioned a private Apache Iceberg REST catalog named oleander, backed by Lakekeeper and stored on S3. It exposes a standard Iceberg REST endpoint, making it natively compatible with Spark, DuckDB, and any Iceberg-aware tool.
The catalog contains two namespaces out of the box:
| Namespace | Purpose |
|---|---|
| oleander.telemetry | Platform events written automatically: run_events, traces, logs |
| oleander.default | Your own tables, created via SQL, Parquet upload, or S3 sync |
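Because the catalog speaks the standard Iceberg REST protocol, an external Spark session can be pointed at it with ordinary Iceberg catalog settings. The sketch below only assembles those settings. The configuration keys are the standard Iceberg Spark catalog properties, while the URI, warehouse name, and token are placeholders you would need to replace, and the helper name is hypothetical.

```python
def iceberg_rest_catalog_conf(catalog_name, uri, warehouse, token):
    """Build Spark settings that register an Iceberg REST catalog under `catalog_name`."""
    prefix = f"spark.sql.catalog.{catalog_name}"
    return {
        "spark.sql.extensions": "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "rest",
        f"{prefix}.uri": uri,
        f"{prefix}.warehouse": warehouse,
        f"{prefix}.token": token,
    }


conf = iceberg_rest_catalog_conf(
    catalog_name="oleander",
    uri="https://catalog.example/iceberg",  # assumption: placeholder endpoint
    warehouse="<your-warehouse>",           # assumption: placeholder warehouse name
    token="<your-token>",                   # assumption: bearer-token auth
)

# Applying the settings requires pyspark and the Iceberg Spark runtime jar:
#
#   from pyspark.sql import SparkSession
#   builder = SparkSession.builder.appName("oleander-catalog")
#   for key, value in conf.items():
#       builder = builder.config(key, value)
#   spark = builder.getOrCreate()
#   spark.sql("SELECT * FROM oleander.telemetry.run_events LIMIT 10").show()
```

Whether the catalog authenticates with a static token or an OAuth2 credential is not stated here, so treat the `token` property as one possible option.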
Graph
oleander reads from the catalog and builds a versioned context graph: a unified object model where every run, dataset, schema, query, cost, and lineage edge is a first-class, interconnected node. The graph is updated automatically after every execution. You can query it as it exists now or as it existed at any point in time, diff any two runs, and trace exactly what changed and when.
| Object | What it contains |
|---|---|
| Runs | Every DAG run, Spark job, dbt model: duration, status, logs, lineage |
| Datasets | Tables and schemas versioned at every write, with upstream and downstream lineage |
| Costs | Every warehouse query attributed to the workflow and dataset that drove it |
| Changes | Schema diffs and PR-level impact before they cascade downstream |
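To illustrate what a schema diff between two dataset versions amounts to, here is a small standalone sketch. It is not oleander's API or implementation: the function name and the column-to-type dict representation of a schema are assumptions chosen to keep the example self-contained.

```python
def diff_schemas(before, after):
    """Compare two schemas given as {column_name: type_name} dicts."""
    added = {col: typ for col, typ in after.items() if col not in before}
    removed = {col: typ for col, typ in before.items() if col not in after}
    changed = {
        col: (before[col], after[col])
        for col in before.keys() & after.keys()
        if before[col] != after[col]
    }
    return {"added": added, "removed": removed, "changed": changed}


# Two hypothetical versions of the same table's schema.
v1 = {"order_id": "long", "amount": "int", "status": "string"}
v2 = {"order_id": "long", "amount": "decimal(10,2)", "created_at": "timestamp"}

diff = diff_schemas(v1, v2)
# diff["added"]   == {"created_at": "timestamp"}
# diff["removed"] == {"status": "string"}
# diff["changed"] == {"amount": ("int", "decimal(10,2)")}
```

A removed or retyped column is the kind of change that can break downstream consumers, which is why pairing the diff with lineage is useful.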
Compute
oleander provides two managed compute surfaces that read and write from the Iceberg catalog with automatic lineage. Every read and write is captured in the context graph without any instrumentation required.
Serverless Spark
Submit PySpark applications to fully managed infrastructure. No clusters to provision, no configuration required. Spark jobs can read from and write to both oleander.default and external Iceberg catalogs. Lineage and cost flow into the context graph from the first run.
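As a rough sketch of what such an application could look like, the functions below read one table in oleander.default and write an aggregate back to the same namespace. The table and column names (`orders`, `order_status_counts`, `status`) are hypothetical, and the sketch assumes the `oleander` catalog is already registered in the Spark session it receives.

```python
def summary_sql(source, target):
    """Build a CREATE OR REPLACE TABLE ... AS SELECT statement for an Iceberg table."""
    return (
        f"CREATE OR REPLACE TABLE {target} USING iceberg AS "
        f"SELECT status, COUNT(*) AS order_count FROM {source} GROUP BY status"
    )


def run_summary(
    spark,
    source="oleander.default.orders",               # hypothetical input table
    target="oleander.default.order_status_counts",  # hypothetical output table
):
    """Aggregate `source` into `target` using the given SparkSession."""
    spark.sql(summary_sql(source, target))


# Entry point for a real submission (requires pyspark):
#
#   from pyspark.sql import SparkSession
#   spark = SparkSession.builder.appName("order-status-counts").getOrCreate()
#   run_summary(spark)
#   spark.stop()
```

Keeping the SQL construction separate from the session makes the job easy to check locally before submitting it.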
Tasks
Run arbitrary Python scripts in oleander’s managed environment. Tasks can connect to the Iceberg catalog and external data sources, and can be triggered via webhooks or on demand. Every read and write to the catalog generates automatic lineage.
| Status | Description |
|---|---|
| QUEUED | Waiting to execute |
| WORKING | Currently running |
| COMPLETED | Finished successfully |
| FAILED | Encountered an error or non-zero exit code |
| CANCELED | Manually canceled |
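Client code that waits on a run usually needs to separate in-progress states from terminal ones. The sketch below models the five statuses above as an enum and polls until a terminal state is reached. The `fetch_status` callable is a hypothetical stand-in for however you retrieve a run's status (API, CLI, or SDK), since no specific call is described here.

```python
import time
from enum import Enum


class TaskStatus(str, Enum):
    QUEUED = "QUEUED"
    WORKING = "WORKING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"
    CANCELED = "CANCELED"


# States after which the run will not change again.
TERMINAL_STATUSES = frozenset(
    {TaskStatus.COMPLETED, TaskStatus.FAILED, TaskStatus.CANCELED}
)


def wait_for_task(fetch_status, poll_seconds=2.0, timeout_seconds=600.0, sleep=time.sleep):
    """Poll `fetch_status()` until it returns a terminal status, then return it.

    `fetch_status` is a hypothetical zero-argument callable returning a status string.
    Raises TimeoutError if no terminal status is seen within `timeout_seconds`.
    """
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = TaskStatus(fetch_status())
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task still {status.value} after {timeout_seconds}s")
        sleep(poll_seconds)
```

Passing `sleep` as a parameter keeps the loop easy to exercise in tests without real delays.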
Query
Engineers and AI agents query the same context graph and catalog from any surface. The query layer exposes four access patterns, all backed by the same underlying graph and catalog.
| Surface | Use case |
|---|---|
| MCP server | AI agents (Cursor, Copilot, Claude, BYO) query the context graph directly |
| API | Programmatic access to the graph, catalog, and SQL proxy |
| CLI | Terminal-based queries, Spark submission, lake management |
| SDK | Custom integrations and automated workflows |
The SQL proxy lets you query BigQuery, Snowflake, and other warehouses directly via DuckDB, without leaving oleander. Queries are attributed to the workflow that triggered them and appear in the context graph.