Run your PySpark applications directly on oleander’s managed Spark infrastructure. Upload your scripts, manage job execution, and automatically capture lineage metadata for complete observability of your data transformations.

Installation

Using Homebrew

Install the oleander CLI to manage your Spark jobs from the command line:
brew tap OleanderHQ/tap
brew install oleander-cli

Configuration

Authenticate with your oleander account using your API key. You can find your API key in your oleander settings.
oleander configure --api-key <YOUR_API_KEY>

Managing Spark Jobs

List your Spark jobs

View all your uploaded PySpark scripts and their current status:
oleander spark job list
This command displays all your Spark jobs, making it easy to see which scripts are available for execution.

Upload your PySpark script

Upload your PySpark application to oleander. Your script will be stored and ready to execute on demand:
oleander spark job upload <your_script_path>
Example:
oleander spark job upload ./transformations/process_sales_data.py
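For reference, an uploadable script in this style might look like the sketch below. The file name, dataset paths, and column names are illustrative only and not part of the oleander API; the tax helper is kept in plain Python so it can be tested without a Spark cluster.

```python
# process_sales_data.py -- hypothetical example of a script you might upload.
# Paths and column names below are illustrative.

def with_tax(amount, rate=0.08):
    """Return the amount with sales tax applied, rounded to cents."""
    return round(amount * (1 + rate), 2)

def main():
    # Import Spark lazily so the helper above stays importable without pyspark.
    from pyspark.sql import SparkSession
    from pyspark.sql import functions as F

    spark = SparkSession.builder.appName("process_sales_data").getOrCreate()
    sales = spark.read.parquet("s3://example-bucket/raw/sales")  # illustrative input
    processed = sales.withColumn("total", F.round(F.col("amount") * 1.08, 2))
    processed.write.mode("overwrite").parquet("s3://example-bucket/processed/sales")
    spark.stop()

if __name__ == "__main__":
    main()
```

Because the script reads and writes named datasets through the standard Spark APIs, the lineage of those datasets can be captured automatically when the job runs.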

Include Python dependencies

If your PySpark script requires additional Python modules, package them in a ZIP archive and include them with the --py-files option:
oleander spark job upload <your_script_path> --py-files <module_archive_zip>
Example:
oleander spark job upload ./etl_pipeline.py --py-files ./dependencies.zip
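One way to produce such an archive is sketched below. The package name `mylib` and the paths are assumptions for illustration; the key detail is zipping the package with paths relative to its parent directory, so that `import mylib` resolves on the executors.

```python
# build_deps.py -- sketch: package a local Python package (e.g. "mylib/")
# into a ZIP archive suitable for --py-files. Paths are illustrative.
import os
import zipfile

def build_zip(package_dir, archive_path):
    """Zip a package directory, keeping paths relative to its parent
    so the package is importable once the archive is on sys.path."""
    root = os.path.dirname(os.path.abspath(package_dir))
    with zipfile.ZipFile(archive_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for dirpath, _dirnames, filenames in os.walk(package_dir):
            for name in filenames:
                if name.endswith(".py"):
                    full = os.path.join(dirpath, name)
                    zf.write(full, os.path.relpath(full, root))
    return archive_path

# Example: build_zip("./mylib", "./dependencies.zip")
```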

Update an existing job

To replace an existing Spark job with a new version of your script, use the --overwrite flag:
oleander spark job upload <your_script_path> --overwrite
This is useful when iterating on your data transformations or fixing bugs in your PySpark code.

Delete a Spark job

Remove a Spark job that you no longer need:
oleander spark job delete <script_name>
Example:
oleander spark job delete process_sales_data

Submit and execute a Spark job

Run your uploaded PySpark script on oleander’s compute infrastructure. The --wait flag keeps the command running until the job completes, so you can monitor execution in real time:
oleander spark job submit <script_name> --wait
Example:
oleander spark job submit process_sales_data --wait
When your Spark job runs, oleander automatically captures OpenLineage metadata, tracking data lineage, transformations, and dependencies. View the results and lineage graph in your oleander dashboard.
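OpenLineage is an open specification, so the captured events follow a standard shape. The dict below is a rough sketch of a COMPLETE run event for a job like the one above; all field values are made up for illustration, and the authoritative schema lives in the OpenLineage spec.

```python
# Illustrative shape of an OpenLineage run event (all values are examples,
# not real output from oleander). Lineage tooling reads the "inputs" and
# "outputs" dataset lists to draw the lineage graph.
run_event = {
    "eventType": "COMPLETE",
    "eventTime": "2024-01-15T10:30:00.000Z",
    "run": {"runId": "d290f1ee-6c54-4b01-90e6-d701748f0851"},
    "job": {"namespace": "spark", "name": "process_sales_data"},
    "inputs": [{"namespace": "s3://example-bucket", "name": "raw/sales"}],
    "outputs": [{"namespace": "s3://example-bucket", "name": "processed/sales"}],
}
```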