list_spark_jobs(options?)
List your Spark job scripts with pagination.
Parameters
- Number of scripts to return per page. Must be greater than 0.
- Number of scripts to skip for pagination. Must be 0 or greater.
Return type: ListSparkJobsResult
| Field | Type | Description |
|---|---|---|
| scripts | list[str] | Script names for the current page |
| has_more | bool | Whether more scripts are available |
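The `has_more` flag drives pagination. A minimal sketch of walking every page, assuming the client takes `page_size` and `offset` keyword arguments (those names are assumptions, as is the stub client standing in for the real SDK):

```python
from dataclasses import dataclass

@dataclass
class ListSparkJobsResult:
    scripts: list
    has_more: bool

class StubClient:
    """Illustrative stand-in for the real client, serving canned pages."""
    _all = ["etl.py", "report.py", "cleanup.py"]

    def list_spark_jobs(self, page_size=2, offset=0):
        page = self._all[offset:offset + page_size]
        return ListSparkJobsResult(
            scripts=page,
            has_more=offset + page_size < len(self._all),
        )

def list_all_scripts(client, page_size=2):
    """Collect script names across pages until has_more is False."""
    scripts, offset = [], 0
    while True:
        result = client.list_spark_jobs(page_size=page_size, offset=offset)
        scripts.extend(result.scripts)
        if not result.has_more:
            return scripts
        offset += page_size
```

The loop stops as soon as a page reports `has_more` false, so it issues the minimum number of requests.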
submit_spark_job(options)
Submit a Spark job for execution. Returns a SparkJobRun with a run_id that you can use with get_run() to poll status.
Parameters
- Job namespace. Must be non-empty.
- Job name. Must be non-empty.
- Name of the script to execute. Must be non-empty.
- Arguments to pass to the script.
- Machine type for the Spark driver.
- Machine type for Spark executors.
- Number of executors (1-20).
- Tags applied to the job.
- Tags applied to this run.
Return type: SparkJobRun
| Field | Type | Description |
|---|---|---|
| run_id | str | The ID of the submitted run |
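Submission only returns a `run_id`; status comes later from `get_run()`. A hedged sketch of a call, where the client object, keyword names, and the stub below are illustrative assumptions rather than the real SDK surface:

```python
from dataclasses import dataclass

@dataclass
class SparkJobRun:
    run_id: str

class StubClient:
    """Illustrative stand-in that validates inputs and returns a run ID."""
    def submit_spark_job(self, *, namespace, name, script, arguments=None,
                         driver_machine_type="spark.2.b",
                         executor_machine_type="spark.2.b",
                         num_executors=2, tags=None, run_tags=None):
        # Mirror the documented constraints.
        assert namespace and name and script, "namespace, name, script must be non-empty"
        assert 1 <= num_executors <= 20, "num_executors must be 1-20"
        return SparkJobRun(run_id="run-123")

client = StubClient()
run = client.submit_spark_job(
    namespace="analytics",
    name="daily-etl",
    script="etl.py",
    arguments=["--date", "2024-01-01"],
    executor_machine_type="spark.8.m",
    num_executors=10,
)
```

With the real client, `run.run_id` would then be passed to `get_run()` to poll status.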
Machine types
The SparkMachineType enum covers compute-optimized (c), balanced (b), and memory-optimized (m) options:
| Type | vCPUs | Category |
|---|---|---|
| spark.1.c / spark.1.b / spark.1.m | 1 | Compute / Balanced / Memory |
| spark.2.c / spark.2.b / spark.2.m | 2 | Compute / Balanced / Memory |
| spark.4.c / spark.4.b / spark.4.m | 4 | Compute / Balanced / Memory |
| spark.8.c / spark.8.b / spark.8.m | 8 | Compute / Balanced / Memory |
| spark.16.c / spark.16.b / spark.16.m | 16 | Compute / Balanced / Memory |
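Each value encodes its vCPU count in the middle segment, so the count can be recovered from the string. A sketch; the enum member names below are assumptions, only the string values come from the table above:

```python
from enum import Enum

class SparkMachineType(str, Enum):
    # Subset of the documented values; member names are assumed.
    SPARK_1_B = "spark.1.b"
    SPARK_4_C = "spark.4.c"
    SPARK_16_M = "spark.16.m"

def vcpus(machine_type: str) -> int:
    """Extract the vCPU count from a 'spark.<vcpus>.<category>' value."""
    return int(machine_type.split(".")[1])
```

Because the enum subclasses `str`, members can be passed anywhere a plain machine-type string is expected.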
submit_spark_job_and_wait(options)
Submit a Spark job and poll until it reaches a terminal state (COMPLETE, FAIL, or ABORT). Raises TimeoutError if the timeout is exceeded.
All submit_spark_job parameters, plus:
- Milliseconds between status polls. Must be greater than 0.
- Maximum time to wait in milliseconds before raising TimeoutError.

Return type: SubmitAndWaitResult
| Field | Type | Description |
|---|---|---|
| run_id | str | The ID of the submitted run |
| state | str | Terminal state (COMPLETE, FAIL, or ABORT) |
| run | RunResponse | Full run details |
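Callers need to handle both a terminal state and the TimeoutError. A hedged sketch, where the client, keyword names, and stub behavior are assumptions standing in for the real SDK:

```python
from dataclasses import dataclass

@dataclass
class SubmitAndWaitResult:
    run_id: str
    state: str
    run: object

class StubClient:
    """Illustrative stand-in whose run 'completes' immediately."""
    def submit_spark_job_and_wait(self, *, namespace, name, script,
                                  poll_interval_ms=5000, timeout_ms=600000):
        if timeout_ms <= 0:
            # The real client raises when the deadline is exceeded.
            raise TimeoutError(f"run did not finish within {timeout_ms} ms")
        return SubmitAndWaitResult(run_id="run-456", state="COMPLETE", run=None)

client = StubClient()
try:
    result = client.submit_spark_job_and_wait(
        namespace="analytics", name="daily-etl", script="etl.py",
        poll_interval_ms=2000, timeout_ms=300000,
    )
    outcome = result.state  # COMPLETE, FAIL, or ABORT
except TimeoutError:
    outcome = "TIMEOUT"
```

Note that a timeout does not imply the job stopped; it only means the wait gave up, so the run may still need to be checked or cancelled separately.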
get_run(run_id)
Get the current status of a run. Use this to poll a job submitted with submit_spark_job().
Return type: RunResponse
| Field | Type | Description |
|---|---|---|
| id | str | Run ID |
| state | Optional[str] | Current state (COMPLETE, FAIL, ABORT, etc.) |
| started_at | Optional[str] | ISO timestamp when the run started |
| queued_at | Optional[str] | ISO timestamp when the run was queued |
| scheduled_at | Optional[str] | ISO timestamp when the run was scheduled |
| ended_at | Optional[str] | ISO timestamp when the run ended |
| duration | Optional[float] | Run duration in seconds |
| error | Optional[Any] | Error details if the run failed |
| tags | list[RunTag] | List of tags with key, value, and optional source |
| job | RunJobInfo | Job info with id, name, namespace |
| pipeline | RunPipelineInfo | Pipeline info with id, name, namespace |
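A manual polling loop over get_run() just waits for one of the terminal states. A sketch under stated assumptions; the stub client below is illustrative, not the real SDK:

```python
import time
from dataclasses import dataclass
from typing import Optional

TERMINAL_STATES = {"COMPLETE", "FAIL", "ABORT"}

@dataclass
class RunResponse:
    id: str
    state: Optional[str]

class StubClient:
    """Illustrative stand-in whose run 'completes' on the third poll."""
    def __init__(self):
        self._polls = 0

    def get_run(self, run_id):
        self._polls += 1
        state = "COMPLETE" if self._polls >= 3 else "RUNNING"
        return RunResponse(id=run_id, state=state)

def wait_for_run(client, run_id, poll_interval_s=0.01):
    """Poll get_run() until the run reaches a terminal state."""
    while True:
        run = client.get_run(run_id)
        if run.state in TERMINAL_STATES:
            return run
        time.sleep(poll_interval_s)
```

For most uses, submit_spark_job_and_wait() already implements this loop; a hand-rolled version is only needed when you want custom backoff or to interleave other work between polls.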