Run a Polars workload
Execute a Polars workload against your Iceberg lake tables. Supports two modes: query (Polars SQL run against registered tables) and script (user-authored Python that assigns result to a LazyFrame or DataFrame). Workloads run in an isolated sandbox; pass distributed: true to offload execution to Polars Cloud.
Beta: Observability (lineage, traces, logs) is not yet captured for Polars workloads.
Cost note: Distributed execution incurs Polars Cloud compute charges in addition to oleander usage. Start small and scale up as needed.
Authorizations
Bearer authentication header of the form Bearer <token>, where <token> is your auth token.
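A minimal sketch of building that header in Python. The environment variable name OLEANDER_TOKEN and the fallback value are placeholders chosen for illustration, not documented names.

```python
import os

# OLEANDER_TOKEN is a hypothetical variable name used only in this sketch.
token = os.environ.get("OLEANDER_TOKEN", "example-token")

# Bearer authentication: the scheme name, one space, then the token.
headers = {"Authorization": f"Bearer {token}"}
```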
Body
Execution mode. query runs a Polars SQL expression against registered tables. script executes a Python script that has pl, scan(table), params, and catalog in scope and must assign result.
Valid values: query, script
Polars SQL query string. Required when mode is query.
"SELECT species, avg(sepal_length) AS avg_sepal_length, avg(petal_length) AS avg_petal_length FROM flowers GROUP BY species ORDER BY avg_sepal_length DESC"
Tables to register for query mode. Each entry is either alias=namespace.table or an object {alias, table}. Required when mode is query.
"flowers=default.flowers"
Python script source for script mode. The script runs with pl (polars), scan(table), params, and catalog in scope. It must assign result to a Polars LazyFrame or DataFrame. Do not call .collect() or .remote() inside the script — oleander handles execution.
"table = params.get('table', 'default.flowers')\nlimit = int(params.get('limit', 50))\n\nflowers = scan(table)\n\nresult = (\n flowers.group_by('species')\n .agg(\n pl.len().alias('count'),\n pl.col('sepal_length').mean().round(2).alias('avg_sepal_length'),\n pl.col('petal_length').mean().round(2).alias('avg_petal_length'),\n )\n .sort('avg_sepal_length', descending=True)\n .head(limit)\n)"
Key-value parameters available inside a script as the params dict.
{ "table": "default.flowers", "limit": "25" }
Run the workload on Polars Cloud instead of a local sandbox.
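The two body shapes described above can be sketched as follows. This is an illustration under stated assumptions: mode and params are names used in the text, while the keys "query", "tables", and "script" are guesses at the field names, and normalize_table is a hypothetical helper. No request is sent; the sketch only builds and prints the JSON.

```python
import json

def normalize_table(entry):
    """Accept 'alias=namespace.table' or {'alias': ..., 'table': ...}."""
    if isinstance(entry, dict):
        return {"alias": entry["alias"], "table": entry["table"]}
    alias, sep, table = entry.partition("=")
    if not (sep and alias and table):
        raise ValueError(f"expected alias=namespace.table, got {entry!r}")
    return {"alias": alias, "table": table}

# Query mode. "query" and "tables" are assumed key names.
query_body = {
    "mode": "query",
    "query": "SELECT species, avg(sepal_length) AS avg_sepal_length "
             "FROM flowers GROUP BY species",
    "tables": [normalize_table("flowers=default.flowers")],
}

# Script mode. The script is ordinary Python source held in a string.
script = """\
table = params.get('table', 'default.flowers')
limit = int(params.get('limit', 50))

result = (
    scan(table)
    .group_by('species')
    .agg(pl.len().alias('count'))
    .head(limit)
)
"""
# Syntax check only; pl, scan, and params exist only inside the sandbox.
compile(script, "<workload>", "exec")

# "script" is an assumed key name; "params" is named in the text.
script_body = {"mode": "script", "script": script, "params": {"limit": "25"}}

# json.dumps escapes the newlines in the script for transport.
print(json.dumps(query_body, indent=2))
print(json.dumps(script_body, indent=2))
```

The example params value passes limit as the string "25", which is why the script converts it with int() before use. The script leaves result lazy and never calls .collect().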
Polars Cloud instance type. Only used when distributed is true.
"t4g.medium"
Number of Polars Cloud nodes. Only used when distributed is true.
4
Sandbox vCPU tier for local (non-distributed) execution. Valid values: 2, 4, 8, 16, 32.
Catalog name to query against.
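The compute settings above interact: the instance type and node count matter only for distributed runs, and the vCPU tier only for local ones. A client-side check along these lines might catch mismatches before a request is sent. It is a sketch, not part of the API; check_compute and its argument names are hypothetical.

```python
VALID_CPU_TIERS = {2, 4, 8, 16, 32}

def check_compute(distributed=False, cpu=None, instance_type=None, nodes=None):
    """Return a list of human-readable notes; an empty list means no issues."""
    notes = []
    if distributed:
        # The sandbox tier is documented for local execution only.
        if cpu is not None:
            notes.append("cpu tier applies to local execution only")
    else:
        if cpu is not None and cpu not in VALID_CPU_TIERS:
            notes.append(f"cpu tier must be one of {sorted(VALID_CPU_TIERS)}")
        # Both settings are documented as used only when distributed is true.
        if instance_type is not None or nodes is not None:
            notes.append("instance type and node count are only used "
                         "when distributed is true")
    return notes

print(check_compute(cpu=4))
print(check_compute(cpu=3))
print(check_compute(distributed=True, instance_type="t4g.medium", nodes=4))
```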
Write results to a lake table. Format: namespace.table. If omitted, results are returned inline.
"default.flower_summary"
How to write to destination. overwrite replaces the table; append adds rows.
Valid values: overwrite, append
Response
Workload executed successfully
Whether the workload executed successfully.
Inline results (omitted when destination is set and distributed is true).
Total number of rows in the result.
Wall-clock time for the workload (e.g. "340ms").
Error message if success is false.
Present when destination was set.
Compute environment details.
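A caller would typically branch on the success flag before touching the results. The sketch below assumes the response is a JSON object; success is the only key name taken from the text, while "error" and "results" are guesses at the field names, and WorkloadError and inline_rows are hypothetical.

```python
import json

class WorkloadError(RuntimeError):
    """Raised when the response reports success as false."""

def inline_rows(payload):
    """Return inline results, or None when the server omitted them."""
    if not payload.get("success"):
        # "error" is an assumed key for the error message field.
        raise WorkloadError(payload.get("error") or "workload failed")
    # "results" is an assumed key; inline results may be absent, for
    # example when the output was written to a destination table.
    return payload.get("results")

# Hypothetical payloads for illustration only.
ok = json.loads('{"success": true, "results": [{"species": "setosa"}]}')
failed = json.loads('{"success": false, "error": "table not found"}')

print(inline_rows(ok))
try:
    inline_rows(failed)
except WorkloadError as exc:
    print(f"workload failed: {exc}")
```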