# Polars

> Run Polars SQL queries and Python scripts against your Iceberg lake.

<Warning>
  Polars is in **beta**. Observability (lineage, traces, logs) is not yet captured for Polars workloads.
</Warning>

oleander's Polars integration lets you run Polars workloads directly against your Iceberg lake tables — no infrastructure to provision. Workloads execute in an isolated sandbox and can optionally be dispatched to Polars Cloud for distributed execution.

## Modes

### SQL query

Run a Polars SQL query against one or more registered lake tables. This is the fastest way to explore data without writing Python.

<img className="block dark:hidden rounded-md" src="https://mintcdn.com/oleander/hwtYU1ps4RmiVAc4/images/lake-light.png?fit=max&auto=format&n=hwtYU1ps4RmiVAc4&q=85&s=3e440cbe69dfce1a75bdd3d5866fcaae" alt="Lake query" width="3618" height="2418" data-path="images/lake-light.png" />

<img className="hidden dark:block rounded-md" src="https://mintcdn.com/oleander/hwtYU1ps4RmiVAc4/images/lake-dark.png?fit=max&auto=format&n=hwtYU1ps4RmiVAc4&q=85&s=2079fac241cd394f67a35941c5bed2a8" alt="Lake query dark" width="3612" height="2418" data-path="images/lake-dark.png" />

### Python script

Write a Python script using the Polars DataFrame API. oleander injects the runtime, auth, and catalog — your script only needs to assign `result`.

The following are available in scope without any imports:

| Variable      | Type     | Description                                                           |
| ------------- | -------- | --------------------------------------------------------------------- |
| `pl`          | module   | The `polars` module                                                   |
| `scan(table)` | function | Returns a `LazyFrame` for `namespace.table` with credentials wired in |
| `params`      | `dict`   | Key-value parameters passed at runtime                                |
| `catalog`     | object   | The underlying pyiceberg catalog (advanced use)                       |

Your script must assign its output to `result`, which can be a Polars `LazyFrame` or `DataFrame`. oleander calls `.collect()` and handles distributed dispatch for you, so do not do either in your script.

```python
table = params.get("table", "default.flowers")
limit = int(params.get("limit", 50))

flowers = scan(table)

result = (
    flowers.group_by("species")
    .agg(
        pl.len().alias("count"),
        pl.col("sepal_length").mean().round(2).alias("avg_sepal_length"),
        pl.col("petal_length").mean().round(2).alias("avg_petal_length"),
    )
    .sort("avg_sepal_length", descending=True)
    .head(limit)
)
```

## Distributed execution

Enable distributed mode to run your workload on Polars Cloud instead of in the default sandbox. This is useful for datasets too large to fit in sandbox memory.

<Warning>
  Distributed execution incurs Polars Cloud compute costs in addition to your oleander usage. Start with a small cluster size and scale up as needed.
</Warning>

## Saving results

Results can be written back to a table in your Iceberg catalog. Choose `overwrite` to replace the table or `append` to add rows to an existing one.
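
The difference between the two modes can be pictured with a toy model in plain Python. This is only an illustration of the semantics described above, using lists of rows. It is not how oleander or Iceberg performs the write, and it says nothing about schema handling, which this page does not cover. The `apply_save_mode` helper is hypothetical.

```python
def apply_save_mode(existing_rows, new_rows, mode):
    """Toy model: return the table's rows after saving new_rows in the given mode."""
    if mode == "overwrite":
        # The previous contents are replaced by the new result.
        return list(new_rows)
    if mode == "append":
        # The new result is added after the existing rows.
        return list(existing_rows) + list(new_rows)
    raise ValueError(f"unsupported save mode: {mode!r}")


existing = [{"species": "setosa", "count": 50}]
new = [{"species": "virginica", "count": 50}]

overwritten = apply_save_mode(existing, new, "overwrite")
appended = apply_save_mode(existing, new, "append")
print(len(overwritten), len(appended))  # prints "1 2"
```

In practice, `append` suits incremental runs whose results accumulate, while `overwrite` suits results that are recomputed in full each time.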

## CLI & API

Polars workloads can also be triggered programmatically:

* **CLI:** [`oleander polars`](/cli/polars)
* **API:** [`POST /api/v1/warehouse/polars`](/api-reference/endpoint/polars)
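
As a rough sketch of what calling the API from Python might look like, the snippet below builds (but does not send) a `POST` request to the endpoint path listed above using only the standard library. The base URL, the bearer-token header, and every field in the payload are hypothetical placeholders; check the API reference for the actual host, authentication scheme, and request schema.

```python
import json
import urllib.request

# Assumptions: BASE_URL, API_TOKEN, the Authorization scheme, and all payload
# fields are hypothetical. Only the endpoint path comes from this page.
BASE_URL = "https://oleander.example"
API_TOKEN = "YOUR_API_TOKEN"

payload = {
    "script": 'result = scan("default.flowers").head(10)',  # hypothetical field
    "params": {"limit": "10"},  # hypothetical field
}

request = urllib.request.Request(
    url=f"{BASE_URL}/api/v1/warehouse/polars",
    data=json.dumps(payload).encode("utf-8"),
    headers={
        "Authorization": f"Bearer {API_TOKEN}",
        "Content-Type": "application/json",
    },
    method="POST",
)

# The request is only constructed here. Sending it would look like:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
print(request.get_method(), request.full_url)
```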
