listSparkJobs(options?)

List your Spark job scripts with pagination.
const { scripts, hasMore } = await client.listSparkJobs();

// Paginate through all scripts
let offset = 0;
const allScripts: string[] = [];
while (true) {
  const page = await client.listSparkJobs({ limit: 50, offset });
  allScripts.push(...page.scripts);
  if (!page.hasMore) break;
  offset += 50;
}

Parameters

options.limit
number
default:"20"
Number of scripts to return per page.
options.offset
number
default:"0"
Number of scripts to skip for pagination.

Return type: ListSparkJobsResult

Field   | Type     | Description
scripts | string[] | Script names for the current page
hasMore | boolean  | Whether more scripts are available

submitSparkJob(options)

Submit a Spark job for execution. Returns a run ID that you can use with getRun() to poll status.
const { runId } = await client.submitSparkJob({
  namespace: "my-namespace",
  name: "daily-etl",
  scriptName: "etl_pipeline.py",
  args: ["--date", "2025-01-15"],
  executorNumbers: 4,
});

// Check the run's current status (call again to keep polling)
const run = await client.getRun(runId);

Parameters

options.namespace
string
required
Job namespace.
options.name
string
required
Job name.
options.scriptName
string
required
Name of the script to execute.
options.args
string[]
default:"[]"
Arguments to pass to the script.
options.driverMachineType
SparkMachineType
default:"spark.1.b"
Machine type for the Spark driver.
options.executorMachineType
SparkMachineType
default:"spark.1.b"
Machine type for Spark executors.
options.executorNumbers
number
default:"2"
Number of executors (1-20).
options.jobTags
string[]
default:"[]"
Tags applied to the job.
options.runTags
string[]
default:"[]"
Tags applied to this run.

Machine types

The SparkMachineType enum covers compute-optimized (c), balanced (b), and memory-optimized (m) options:
Type                                 | vCPUs | Category
spark.1.c / spark.1.b / spark.1.m    | 1     | Compute / Balanced / Memory
spark.2.c / spark.2.b / spark.2.m    | 2     | Compute / Balanced / Memory
spark.4.c / spark.4.b / spark.4.m    | 4     | Compute / Balanced / Memory
spark.8.c / spark.8.b / spark.8.m    | 8     | Compute / Balanced / Memory
spark.16.c / spark.16.b / spark.16.m | 16    | Compute / Balanced / Memory
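
For example, a memory-heavy run might pair a balanced driver with memory-optimized executors. A minimal sketch; the namespace, job, and script names are placeholders, and the machine types are written as string literals on the assumption that the SDK accepts them interchangeably with the SparkMachineType enum values:

const { runId } = await client.submitSparkJob({
  namespace: "my-namespace",        // placeholder namespace
  name: "wide-aggregation",         // placeholder job name
  scriptName: "aggregate.py",       // placeholder script
  driverMachineType: "spark.4.b",   // balanced, 4 vCPUs
  executorMachineType: "spark.8.m", // memory-optimized, 8 vCPUs
  executorNumbers: 10,
});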

submitSparkJobAndWait(options)

Submit a Spark job and poll until it reaches a terminal state (COMPLETE, FAIL, or ABORT). Throws an error if the timeout is exceeded.
const { runId, state, run } = await client.submitSparkJobAndWait({
  namespace: "my-namespace",
  name: "daily-etl",
  scriptName: "etl_pipeline.py",
  pollIntervalMs: 5000,
  timeoutMs: 300000,
});

if (state === "COMPLETE") {
  const elapsed = run.duration; // seconds
  // proceed with downstream work ...
} else {
  throw new Error(`Run ${runId} ended with state: ${state}`);
}
Accepts all submitSparkJob parameters plus:
options.pollIntervalMs
number
default:"10000"
Milliseconds between status polls.
options.timeoutMs
number
default:"600000"
Maximum time to wait in milliseconds before throwing a timeout error.
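
Because the call throws when timeoutMs is exceeded, it is worth wrapping in a try/catch. A minimal sketch, assuming the timeout surfaces as an ordinary Error and using placeholder job values:

try {
  const { runId, state } = await client.submitSparkJobAndWait({
    namespace: "my-namespace",
    name: "daily-etl",
    scriptName: "etl_pipeline.py",
    pollIntervalMs: 5000,
    timeoutMs: 60_000, // give up after one minute
  });
  console.log(`Run ${runId} finished with state ${state}`);
} catch (err) {
  // Assumption: timeout (and submission) failures are thrown as regular Errors.
  console.error("Job did not reach a terminal state in time:", err);
}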

getRun(runId)

Get the current status of a run. Use this to poll a job submitted with submitSparkJob().
const run = await client.getRun(runId);

if (run.state === "COMPLETE") {
  const duration = run.duration; // seconds
  const jobName = run.job.name;
  // handle completion ...
} else if (run.state === "FAIL") {
  const error = run.error;
  // handle failure ...
}

Return type: RunResponse

Field        | Type    | Description
id           | string  | Run ID
state        | string  | Current state (COMPLETE, FAIL, ABORT, etc.)
started_at   | string  | ISO timestamp when the run started
queued_at    | string  | ISO timestamp when the run was queued
scheduled_at | string  | ISO timestamp when the run was scheduled
ended_at     | string  | ISO timestamp when the run ended
duration     | number  | Run duration in seconds
error        | unknown | Error details if the run failed
tags         | Tag[]   | Array of { key, value, source } objects
job          | object  | Job info with id, name, namespace
pipeline     | object  | Pipeline info with id, name, namespace
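
If you need more control than submitSparkJobAndWait() provides, a manual polling loop over getRun() might look like the sketch below. The terminal-state list and the sleep helper are assumptions for illustration, not part of the SDK:

const TERMINAL_STATES = ["COMPLETE", "FAIL", "ABORT"];
const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

let run = await client.getRun(runId);
while (!TERMINAL_STATES.includes(run.state)) {
  await sleep(10_000); // matches the default pollIntervalMs
  run = await client.getRun(runId);
}
console.log(`Run ${run.id} ended in ${run.state} after ${run.duration}s`);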