Installation
Using Homebrew
Install the oleander CLI:
Configuration
Authenticate with your API key. Find it in your oleander settings.
Oleander Managed Spark
Upload, list, and delete artifacts only on the oleander-managed cluster.
Initialize a PySpark workspace
Create a new PySpark job workspace:
The workspace includes:
entrypoint.py as the Spark job entrypoint
mylib/ for Python modules packaged as pyFiles
pyproject.toml and uv.lock for project and dependency management with uv
Makefile targets for building deployable artifacts
Use uv to manage dependencies:
Use make to build the deployment artifacts:
out/pyfiles.zip
out/environment.tar.gz
List your Spark artifacts
List your uploaded Spark artifacts:
Upload your Spark artifact
Upload a local .py or .jar artifact to oleander:
Include Python dependencies
If your Python artifact needs additional Python modules, package them in a ZIP and include them with --py-files:
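The packaging step described above (putting extra Python modules in a ZIP) could be sketched in Python roughly as follows. This is an illustration only: the function name build_pyfiles_zip is hypothetical, the mylib and out/pyfiles.zip paths are borrowed from the workspace layout, and the generated Makefile may build its ZIP differently.

```python
import zipfile
from pathlib import Path


def build_pyfiles_zip(package_dir: Path, zip_path: Path) -> list[str]:
    """Zip every .py file under package_dir and return the archive entry names.

    The package folder itself is kept as the top-level entry (for example
    mylib/util.py) so that the modules stay importable as a package.
    """
    package_dir = package_dir.resolve()
    root = package_dir.parent
    zip_path.parent.mkdir(parents=True, exist_ok=True)

    names = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(package_dir.rglob("*.py")):
            # Store paths relative to the parent of the package folder.
            arcname = path.relative_to(root).as_posix()
            archive.write(path, arcname)
            names.append(arcname)
    return names


if __name__ == "__main__":
    # Hypothetical paths that mirror the workspace layout.
    print(build_pyfiles_zip(Path("mylib"), Path("out/pyfiles.zip")))
```

Keeping the package folder as the top-level entry is a design choice in this sketch: it lets code import the package by name once the ZIP is on the Python path.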
Include a virtual environment
If your Python artifact depends on a packaged virtual environment, include it with --virtualenv:
Delete a Spark artifact
Delete a Spark artifact:
Submit and execute a Spark job
Submit your uploaded artifact to the oleander-managed cluster. Use the exact uploaded filename without the path, such as process_sales_data.py or analytics-batch.jar. The --wait flag keeps the command running until the job finishes.
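Because submit expects only the uploaded filename, it can help to derive that name from the local path used at upload time. A minimal Python sketch follows. The helper name uploaded_name and the example paths are hypothetical, and the sketch assumes the uploaded artifact keeps the base name of the local file.

```python
from pathlib import PurePosixPath

# The text mentions .py and .jar artifacts only.
ALLOWED_SUFFIXES = {".py", ".jar"}


def uploaded_name(local_path: str) -> str:
    """Return the filename without its directory, rejecting other file types."""
    path = PurePosixPath(local_path)
    if path.suffix not in ALLOWED_SUFFIXES:
        raise ValueError(f"expected a .py or .jar artifact, got {local_path!r}")
    return path.name


print(uploaded_name("jobs/sales/process_sales_data.py"))  # process_sales_data.py
print(uploaded_name("target/analytics-batch.jar"))        # analytics-batch.jar
```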
Common submit options
--cluster: Cluster name. Defaults to the oleander-managed cluster when omitted.
--namespace (required): Namespace for the job, a logical group such as a team or project.
--name (required): Job name. Runs with the same namespace and name are grouped under the same job.
--args: Spark job entrypoint arguments.
--sparkConf: Spark configurations without --conf, for example spark.default.parallelism=8. Separate multiple configurations with whitespace.
--packages: Extra package coordinates.
--jobTags: Job-specific tags in key=value form. Separate multiple tags with whitespace.
--runTags: Run-specific tags.
--wait: Wait until the job finishes.
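The whitespace-separated key=value format described for --sparkConf and --jobTags can be produced from a mapping with a small helper. This is a sketch under assumptions: the helper name join_key_values and the example tag names are hypothetical, and it only builds the option value. How that value must be quoted on the command line is not specified here.

```python
def join_key_values(pairs: dict[str, str]) -> str:
    """Render a mapping as whitespace-separated key=value items."""
    items = []
    for key, value in pairs.items():
        # Whitespace separates items, so it cannot appear inside one.
        if any(ch.isspace() for ch in key + value):
            raise ValueError(f"whitespace is not allowed in {key!r}={value!r}")
        items.append(f"{key}={value}")
    return " ".join(items)


spark_conf = join_key_values({
    "spark.default.parallelism": "8",
    "spark.sql.shuffle.partitions": "64",
})
job_tags = join_key_values({"team": "analytics", "env": "prod"})  # hypothetical tags

print(spark_conf)  # spark.default.parallelism=8 spark.sql.shuffle.partitions=64
print(job_tags)    # team=analytics env=prod
```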
Oleander-managed submit options
--driverMachineType: oleander Spark driver machine type.
--executorMachineType: oleander Spark executor machine type.
--executorNumbers: Number of executor instances.
Registered EMR Serverless Spark
Register your EMR Serverless cluster and target it by name when submitting jobs. Include --cluster <name> and provide the S3 entrypoint to a .py or .jar artifact.
Register an EMR Serverless cluster
Register options
--region: AWS region of the EMR Serverless application.
--account-id: AWS account ID of the EMR Serverless application.
--controller-role-arn: IAM role ARN oleander assumes to start job runs. Add this to the role’s trust policy so oleander can assume it:
--execution-role-arn: IAM role ARN the job uses; the Spark application runs with this role’s permissions.
--application-id: EMR Serverless application ID.
--log-bucket: S3 bucket for job logs.
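For orientation, the general shape of an IAM trust policy for the controller role can be sketched in Python as below. This is not oleander's actual policy statement: the principal is a placeholder, the function name build_trust_policy is hypothetical, and any extra conditions oleander requires are not shown. Only the standard IAM structure (Version, Statement, Effect, Principal, Action) is assumed.

```python
import json


def build_trust_policy(principal_arn: str) -> str:
    """Return a JSON trust policy letting principal_arn assume the role."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Placeholder principal: use the value oleander provides.
                "Principal": {"AWS": principal_arn},
                "Action": "sts:AssumeRole",
            }
        ],
    }
    return json.dumps(policy, indent=2)


print(build_trust_policy("<OLEANDER_PRINCIPAL_ARN>"))
```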
Submit a job to EMR Serverless
Submit options
--cluster (required): Name of the registered cluster.
--namespace (required): Namespace for the job, a logical group such as a team or project.
--name (required): Job name. Runs with the same namespace and name are grouped under the same job.
--args: Spark job entrypoint arguments.
--sparkConf: Spark configurations without --conf, for example spark.default.parallelism=8. Separate multiple configurations with whitespace.
--packages: Extra package coordinates.
--jobTags: Job-specific tags in key=value form. Separate multiple tags with whitespace.
--runTags: Run-specific tags.
--executionIamPolicy: IAM policy for job permissions. Final permissions are the intersection of the job execution role and this policy.
--pyFiles: Extra pyFiles for the PySpark job. Mutually exclusive with --mainClass.
--virtualenv: Virtual environment archive for Python jobs.
--mainClass: Entrypoint main class for the Java/Scala Spark job. Use instead of Python-specific options such as --pyFiles and --virtualenv.
--wait: Wait until the job finishes.
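The intersection rule for --executionIamPolicy can be pictured with plain sets. The sketch below is a deliberately simplified model: it treats permissions as flat action names and ignores resources, conditions, and explicit denies, all of which real IAM evaluation takes into account. The function name effective_actions and the action lists are illustrative only.

```python
def effective_actions(role_actions: set[str], policy_actions: set[str]) -> set[str]:
    """Actions allowed by both the execution role and the per-job policy."""
    return role_actions & policy_actions


# Hypothetical permissions, for illustration only.
role = {"s3:GetObject", "s3:PutObject", "glue:GetTable"}
policy = {"s3:GetObject", "s3:DeleteObject"}

# s3:PutObject is dropped because the policy does not allow it;
# s3:DeleteObject is dropped because the role does not allow it.
print(sorted(effective_actions(role, policy)))  # ['s3:GetObject']
```

In this model the per-job policy can only narrow what the execution role already grants. It never adds permissions.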
Registered Glue Spark
Register your Glue cluster and target it by name when submitting jobs. Include --cluster <name>. Submit uses the existing Glue job name in your environment.
Register a Glue cluster
Register options
--controller-role-arn: IAM role ARN oleander assumes to start job runs. Add this to the role’s trust policy so oleander can assume it:
Submit a job to Glue
Use --cluster to select the registered cluster:
Submit options
--cluster (required): Name of the registered cluster.
--namespace (required): Namespace for the job, a logical group such as a team or project.
--name (required): Job name. Runs with the same namespace and name are grouped under the same job.
--args: Spark job entrypoint arguments.
--sparkConf: Spark configurations without --conf, for example spark.default.parallelism=8. Separate multiple configurations with whitespace.
--packages: Extra package coordinates.
--jobTags: Job-specific tags in key=value form. Separate multiple tags with whitespace.
--runTags: Run-specific tags.
--executionIamPolicy: IAM policy for job permissions. Final permissions are the intersection of the job execution role and this policy.
--workerType: Glue worker type.
--numberOfWorkers: Number of Glue workers.
--enableAutoScaling: Set to true for auto scaling, false otherwise.
--executionClass: Glue execution class. Either STANDARD or FLEX.
--timeoutMinutes: Glue job timeout in minutes.
--wait: Wait until the job finishes.
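A few of the Glue options above accept a closed set of values, so a script that wraps the CLI could check them before submitting. The following Python sketch is one way to do that. The function name validate_glue_options is hypothetical, and treating numberOfWorkers and timeoutMinutes as positive integers is an assumption of this sketch, not something the option descriptions state.

```python
EXECUTION_CLASSES = {"STANDARD", "FLEX"}
BOOLEAN_VALUES = {"true", "false"}


def validate_glue_options(options: dict[str, str]) -> list[str]:
    """Return a list of problems found in the given option values."""
    problems = []

    execution_class = options.get("executionClass")
    if execution_class is not None and execution_class not in EXECUTION_CLASSES:
        problems.append("executionClass must be STANDARD or FLEX")

    auto_scaling = options.get("enableAutoScaling")
    if auto_scaling is not None and auto_scaling not in BOOLEAN_VALUES:
        problems.append("enableAutoScaling must be true or false")

    # Assumption: both counts must be positive integers.
    for key in ("numberOfWorkers", "timeoutMinutes"):
        value = options.get(key)
        if value is not None and not (value.isdecimal() and int(value) > 0):
            problems.append(f"{key} must be a positive integer")

    return problems


print(validate_glue_options({"executionClass": "FLEX", "numberOfWorkers": "4"}))  # []
print(validate_glue_options({"executionClass": "SPOT", "timeoutMinutes": "0"}))
```

Options that are absent are skipped, since the descriptions do not mark any of them as required.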