Polars is in beta. Observability (lineage, traces, logs) is not yet captured for Polars workloads.
oleander’s Polars integration lets you run Polars workloads directly against your Iceberg lake tables — no infrastructure to provision. Workloads execute in an isolated sandbox and can optionally be dispatched to Polars Cloud for distributed execution.
Modes
SQL query
Run a Polars SQL expression against one or more registered lake tables. This is the fastest way to explore data without writing Python.
Python script
Write a Python script using the Polars DataFrame API. oleander injects the runtime, auth, and catalog — your script only needs to assign result.
The following are available in scope without any imports:
| Variable | Type | Description |
|---|---|---|
| pl | module | The polars module |
| scan(table) | function | Returns a LazyFrame for namespace.table with credentials wired in |
| params | dict | Key-value parameters passed at runtime |
| catalog | object | The underlying pyiceberg catalog (advanced use) |
Your script must assign its output to result, which must be a Polars LazyFrame or DataFrame. oleander handles .collect() and distributed dispatch, so do not call .collect() or dispatch the query yourself.
table = params.get("table", "default.flowers")
limit = int(params.get("limit", 50))
flowers = scan(table)
result = (
    flowers.group_by("species")
    .agg(
        pl.len().alias("count"),
        pl.col("sepal_length").mean().round(2).alias("avg_sepal_length"),
        pl.col("petal_length").mean().round(2).alias("avg_petal_length"),
    )
    .sort("avg_sepal_length", descending=True)
    .head(limit)
)
Distributed execution
Enable distributed mode to run your workload on Polars Cloud instead of a local sandbox. This is useful for large datasets that exceed sandbox memory.
Distributed execution incurs Polars Cloud compute costs in addition to your oleander usage. Start with a small cluster size and scale up as needed.
Saving results
Results can be written back to a table in your Iceberg catalog. Choose overwrite to replace the table or append to add rows to an existing one.
CLI & API
Polars workloads can also be triggered programmatically: