## getSparkCluster({ name })

Look up Spark cluster details before submitting a job. Use `oleander` for the built-in managed cluster, or the name of a registered external cluster.

```ts
const cluster = await client.getSparkCluster({ name: "emr-prod" });
console.log(cluster.type, cluster.properties);
```
### Return shape

| Field | Type | Description |
|---|---|---|
| `name` | `string` | Cluster name |
| `type` | `string` | `oleander`, `emr-serverless`, or `glue` |
| `properties` | `unknown` | Cluster-specific configuration |
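Because the `entrypoint` conventions differ by cluster type (see `submitSparkJob` below), it can be useful to sanity-check an entrypoint against the cluster's `type` before submitting. The following helper is a sketch, not part of the client API; the validation rules simply mirror the conventions documented here.

```ts
type ClusterType = "oleander" | "emr-serverless" | "glue";

// Hypothetical pre-submit check: does this entrypoint look right for the
// given cluster type? oleander expects an uploaded script name, EMR
// Serverless an S3 URI, and Glue a plain job name.
function entrypointLooksValid(type: ClusterType, entrypoint: string): boolean {
  switch (type) {
    case "oleander":
      // Uploaded script name, e.g. "etl_pipeline.py" -- no URI scheme.
      return entrypoint.length > 0 && !entrypoint.includes("://");
    case "emr-serverless":
      // S3 URI, e.g. "s3://bucket/jobs/script.py".
      return entrypoint.startsWith("s3://");
    case "glue":
      // Glue job name -- no URI scheme.
      return entrypoint.length > 0 && !entrypoint.includes("://");
  }
}
```

A caller could pair this with `getSparkCluster` to fail fast before a submission that the target cluster would reject.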
## listSparkJobs(options?)

List your uploaded Spark job scripts with pagination.

```ts
const { scripts, hasMore } = await client.listSparkJobs();

// Page through all scripts:
let offset = 0;
const allScripts: string[] = [];
while (true) {
  const page = await client.listSparkJobs({ limit: 50, offset });
  allScripts.push(...page.scripts);
  if (!page.hasMore) break;
  offset += 50;
}
```
### Parameters

- `limit`: Number of scripts to return per page.
- `offset`: Number of scripts to skip for pagination.
### Return type: `ListSparkJobsResult`

| Field | Type | Description |
|---|---|---|
| `scripts` | `string[]` | Script names for the current page |
| `hasMore` | `boolean` | Whether more scripts are available |
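The `hasMore`/`offset` loop above can be wrapped in a small reusable helper. This is a sketch, not part of the client API; it works against any object exposing a `listSparkJobs` method with the shape documented here.

```ts
interface ListSparkJobsResult {
  scripts: string[];
  hasMore: boolean;
}

interface SparkJobLister {
  listSparkJobs(options?: { limit?: number; offset?: number }): Promise<ListSparkJobsResult>;
}

// Collect every script name by paging until hasMore is false.
async function listAllSparkJobs(client: SparkJobLister, pageSize = 50): Promise<string[]> {
  const all: string[] = [];
  let offset = 0;
  for (;;) {
    const page = await client.listSparkJobs({ limit: pageSize, offset });
    all.push(...page.scripts);
    if (!page.hasMore) break;
    offset += pageSize;
  }
  return all;
}
```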
## submitSparkJob(options)

Submit a Spark job for execution. `cluster` defaults to `"oleander"`.

For `oleander`-managed Spark, `entrypoint` is the uploaded script name. For external clusters, `entrypoint` is cluster-specific, such as an S3 URI for EMR Serverless or a Glue job name for Glue.

```ts
const { runId } = await client.submitSparkJob({
  namespace: "my-namespace",
  name: "daily-etl",
  entrypoint: "etl_pipeline.py",
  args: ["--date", "2026-03-11"],
  executorNumbers: 4,
});

const run = await client.getRun(runId);
```
### Common options

| Option | Type | Default | Description |
|---|---|---|---|
| `cluster` | `string` | `"oleander"` | Managed cluster or registered cluster name |
| `namespace` | `string` | | Job namespace |
| `name` | `string` | | Job name |
| `entrypoint` | `string` | | Script name, S3 URI, or Glue job name, depending on cluster |
| `args` | `string[]` | `[]` | Entrypoint arguments |
| `sparkConf` | `string[]` | `[]` | Spark configuration values |
| `packages` | `string[]` | `[]` | Extra package coordinates |
| `jobTags` | `string[]` | `[]` | Tags applied to the job |
| `runTags` | `string[]` | `[]` | Tags applied to this run |
### Oleander-managed options

| Option | Type | Default | Description |
|---|---|---|---|
| `driverMachineType` | `SparkMachineType` | `spark.1.b` | Driver machine type |
| `executorMachineType` | `SparkMachineType` | `spark.1.b` | Executor machine type |
| `executorNumbers` | `number` | `2` | Number of executors, from 1 to 20 |
### EMR Serverless options

| Option | Type | Description |
|---|---|---|
| `pyFiles` | `string` | Zip archive of Python dependencies |
| `mainClass` | `string` | Main class for JVM jobs |
| `executionIamPolicy` | `string` | IAM policy applied to execution |
### Glue options

| Option | Type | Default | Description |
|---|---|---|---|
| `workerType` | `string` | | Glue worker type |
| `numberOfWorkers` | `number` | `1` | Number of Glue workers |
| `enableAutoScaling` | `boolean` | | Enable Glue auto scaling |
| `timeoutMinutes` | `number` | | Timeout in minutes |
| `executionClass` | `string` | | Glue execution class, such as `STANDARD` or `FLEX` |
| `executionIamPolicy` | `string` | | IAM policy applied to execution |
For Glue jobs, `args` are converted into key-value pairs. Pass them as alternating entries such as `["--source", "s3://bucket/input", "--target", "s3://bucket/output"]`.
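If your Glue parameters live in a plain object, a tiny helper can flatten them into the alternating array form described above. The helper is a hypothetical convenience, not part of the client API.

```ts
// Flatten a params object into the alternating key/value args array that
// Glue jobs expect, preserving insertion order.
function toGlueArgs(params: Record<string, string>): string[] {
  return Object.entries(params).flatMap(([key, value]) => [key, value]);
}
```

For example, `toGlueArgs({ "--source": "s3://bucket/input", "--target": "s3://bucket/output" })` produces the alternating array shown above, ready to pass as `args`.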
### External cluster example

```ts
const { runId } = await client.submitSparkJob({
  cluster: "emr-prod",
  namespace: "finance",
  name: "daily-etl",
  entrypoint: "s3://my-bucket/jobs/etl_pipeline.py",
  args: ["--date", "2026-03-11"],
  pyFiles: "s3://my-bucket/jobs/deps.zip",
  packages: ["org.example:my-lib:1.0.0"],
});
```
### Machine types

The `SparkMachineType` enum covers compute-optimized (`c`), balanced (`b`), and memory-optimized (`m`) options:

| Type | vCPUs | Category |
|---|---|---|
| `spark.1.c` / `spark.1.b` / `spark.1.m` | 1 | Compute / Balanced / Memory |
| `spark.2.c` / `spark.2.b` / `spark.2.m` | 2 | Compute / Balanced / Memory |
| `spark.4.c` / `spark.4.b` / `spark.4.m` | 4 | Compute / Balanced / Memory |
| `spark.8.c` / `spark.8.b` / `spark.8.m` | 8 | Compute / Balanced / Memory |
| `spark.16.c` / `spark.16.b` / `spark.16.m` | 16 | Compute / Balanced / Memory |
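The naming pattern in the table is regular (`spark.<vCPUs>.<category>`), so a machine type string can be composed programmatically. This helper is an illustration of that pattern, not part of the client API.

```ts
type MachineCategory = "c" | "b" | "m"; // compute / balanced / memory

// Compose a SparkMachineType string from a vCPU count and a category letter,
// following the spark.<vCPUs>.<category> pattern in the table above.
function sparkMachineType(vcpus: 1 | 2 | 4 | 8 | 16, category: MachineCategory): string {
  return `spark.${vcpus}.${category}`;
}
```

For instance, `sparkMachineType(8, "m")` yields `"spark.8.m"`, suitable for `executorMachineType` on a memory-heavy job.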
## submitSparkJobAndWait(options)

Submit a Spark job and poll until it reaches a terminal state (`COMPLETE`, `FAIL`, or `ABORT`). Throws an error if the timeout is exceeded.

```ts
const { runId, state, run } = await client.submitSparkJobAndWait({
  namespace: "my-namespace",
  name: "daily-etl",
  entrypoint: "etl_pipeline.py",
  pollIntervalMs: 5000,
  timeoutMs: 300000,
});

if (state === "COMPLETE") {
  const elapsed = run.duration;
  // proceed with downstream work ...
} else {
  throw new Error(`Run ${runId} ended with state: ${state}`);
}
```
Accepts all `submitSparkJob` options plus:

- `pollIntervalMs`: Milliseconds between status polls.
- `timeoutMs`: Maximum time to wait, in milliseconds, before throwing a timeout error.
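Conceptually, `submitSparkJobAndWait` behaves like a polling loop over `getRun`. The sketch below illustrates those semantics against the documented `getRun` shape; it is not the actual implementation, and the `RunLike`/`RunClient` names are invented here.

```ts
interface RunLike {
  state: string | null;
}

interface RunClient {
  getRun(runId: string): Promise<RunLike>;
}

const TERMINAL_STATES = new Set(["COMPLETE", "FAIL", "ABORT"]);

// Poll getRun until the run reaches a terminal state, or throw once the
// timeout budget is exhausted.
async function waitForRun(
  client: RunClient,
  runId: string,
  { pollIntervalMs = 5000, timeoutMs = 300000 } = {},
): Promise<RunLike> {
  const deadline = Date.now() + timeoutMs;
  for (;;) {
    const run = await client.getRun(runId);
    if (run.state !== null && TERMINAL_STATES.has(run.state)) return run;
    if (Date.now() >= deadline) {
      throw new Error(`Timed out waiting for run ${runId}`);
    }
    await new Promise((resolve) => setTimeout(resolve, pollIntervalMs));
  }
}
```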
## getRun(runId)

Get the current status of a run. Use this to poll a job submitted with `submitSparkJob()`.

```ts
const run = await client.getRun(runId);

if (run.state === "COMPLETE") {
  const duration = run.duration;
  const jobName = run.job.name;
  // handle completion ...
} else if (run.state === "FAIL") {
  const error = run.error;
  // handle failure ...
}
```
### Return type: `RunResponse`

| Field | Type | Description |
|---|---|---|
| `id` | `string` | Run ID |
| `state` | `string \| null` | Current state |
| `started_at` | `string \| null` | ISO timestamp when the run started |
| `queued_at` | `string \| null` | ISO timestamp when the run was queued |
| `scheduled_at` | `string \| null` | ISO timestamp when the run was scheduled |
| `ended_at` | `string \| null` | ISO timestamp when the run ended |
| `duration` | `number \| null` | Run duration in seconds |
| `error` | `unknown` | Error details if the run failed |
| `tags` | `array` | Array of `{ key, value, source }` objects |
| `job` | `object` | Job info with `id`, `name`, `namespace` |
| `pipeline` | `object` | Pipeline info with `id`, `name`, `namespace` |
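Because `duration` is nullable, a caller may want to fall back to deriving elapsed seconds from the timestamps. A minimal sketch, assuming only the nullable fields documented in the table above:

```ts
interface RunTimestamps {
  duration: number | null;
  started_at: string | null;
  ended_at: string | null;
}

// Prefer the reported duration; otherwise derive seconds from the ISO
// timestamps. Returns null when neither source is available.
function runDurationSeconds(run: RunTimestamps): number | null {
  if (run.duration !== null) return run.duration;
  if (run.started_at && run.ended_at) {
    return (Date.parse(run.ended_at) - Date.parse(run.started_at)) / 1000;
  }
  return null;
}
```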