Data Flow: Application

Manage OCI Data Flow applications. This page is generated from checked-in package metadata, CRD schemas, and sample manifests.

Resource Snapshot

Field	Value
Service	`dataflow`
Resource	`Application`
API Version	`dataflow.oracle.com/v1beta1`
Package	Data Flow
Support Status	Preview
Latest Released Version	`v2.0.0-alpha`
Install Namespace	`oci-service-operator-dataflow-system`

Quick Links

Resource Guide Index
Setup Guide
Package Page
API Reference
Spec Reference
Status Reference
Rendered Sample (config/samples/dataflow_v1beta1_application.yaml)

Spec Fields

This summary shows the top-level spec fields. Use the full API reference for nested fields, defaults, and enum values.

Field	Description	Type	Required
`applicationLogConfig`	ApplicationLogConfig defines nested fields for Application.ApplicationLogConfig.	`object`	No
`archiveUri`	A comma-separated list of one or more archive files, given as Oracle Cloud Infrastructure URIs. For example, `oci://path/to/a.zip,oci://path/to/b.zip`. Each URI points to an archive.zip file containing custom dependencies that may be used to support the execution of a Python, Java, or Scala application. See https://docs.oracle.com/iaas/Content/API/SDKDocs/hdfsconnector.htm#uriformat.	`string`	No
`arguments`	The arguments passed to the running application as command-line arguments. An argument is either plain text or a placeholder. Placeholders are replaced using values from the parameters map. Each placeholder must have a matching entry in the parameters map; otherwise the request (POST or PUT) fails with an HTTP 400 status code. Placeholders are specified as `${name}`, where `name` is the name of the parameter. Example: `[ "--input", "${input_file}", "--name", "John Doe" ]`. If "input_file" has a value of "mydata.xml", the example above is translated to `--input mydata.xml --name "John Doe"`.	`list[string]`	No
`className`	The class for the application.	`string`	No
`compartmentId`	The OCID of a compartment.	`string`	Yes
`configuration`	The Spark configuration passed to the running process. See https://spark.apache.org/docs/latest/configuration.html#available-properties. Example: { "spark.app.name" : "My App Name", "spark.shuffle.io.maxRetries" : "4" } Note: Not all Spark properties are permitted to be set. Attempting to set a property that is not allowed to be overwritten will cause a 400 status to be returned.	`map[string, string]`	No
`definedTags`	Defined tags for this resource. Each key is predefined and scoped to a namespace. For more information, see Resource Tags (https://docs.oracle.com/iaas/Content/General/Concepts/resourcetags.htm). Example: `{"Operations": {"CostCenter": "42"}}`	`map[string, map[string, string]]`	No
`description`	A user-friendly description. Avoid entering confidential information.	`string`	No
`displayName`	A user-friendly name. It does not have to be unique. Avoid entering confidential information.	`string`	Yes
`driverShape`	The VM shape for the driver. Sets the driver cores and memory.	`string`	Yes
`driverShapeConfig`	ApplicationDriverShapeConfig defines nested fields for Application.DriverShapeConfig.	`object`	No
`execute`	The input used for the spark-submit command. For more details, see https://spark.apache.org/docs/latest/submitting-applications.html#launching-applications-with-spark-submit. Supported options include `--class`, `--files`, `--jars`, `--conf`, `--py-files`, and the main application file with arguments. Example: `--jars oci://path/to/a.jar,oci://path/to/b.jar --files oci://path/to/a.json,oci://path/to/b.csv --py-files oci://path/to/a.py,oci://path/to/b.py --conf spark.sql.crossJoin.enabled=true --class org.apache.spark.examples.SparkPi oci://path/to/main.jar 10` Note: If execute is specified together with applicationId, className, configuration, fileUri, language, arguments, or parameters during application create/update or run create/submit, the Data Flow service uses only the information derived from the execute input.	`string`	No
`executorShape`	The VM shape for the executors. Sets the executor cores and memory.	`string`	Yes
`executorShapeConfig`	ApplicationExecutorShapeConfig defines nested fields for Application.ExecutorShapeConfig.	`object`	No
`fileUri`	An Oracle Cloud Infrastructure URI of the file containing the application to execute. See https://docs.oracle.com/iaas/Content/API/SDKDocs/hdfsconnector.htm#uriformat.	`string`	No
`freeformTags`	Free-form tags for this resource. Each tag is a simple key-value pair with no predefined name, type, or namespace. For more information, see Resource Tags (https://docs.oracle.com/iaas/Content/General/Concepts/resourcetags.htm). Example: `{"Department": "Finance"}`	`map[string, string]`	No
`idleTimeoutInMinutes`	The timeout value in minutes used to manage Runs. A Run is stopped after it has been inactive for this amount of time. Note: This parameter currently applies only to Runs of type `SESSION`. The default value is 2880 minutes (2 days).	`integer (int64)`	No
`language`	The Spark language.	`string`	Yes
`logsBucketUri`	An Oracle Cloud Infrastructure URI of the bucket where the Spark job logs are to be uploaded. See https://docs.oracle.com/iaas/Content/API/SDKDocs/hdfsconnector.htm#uriformat.	`string`	No
`maxDurationInMinutes`	The maximum duration in minutes for which an Application should run. A Data Flow Run is terminated once it reaches this duration, measured from the time it transitions to the `IN_PROGRESS` state.	`integer (int64)`	No
`metastoreId`	The OCID of OCI Hive Metastore.	`string`	No
`numExecutors`	The number of executor VMs requested.	`integer`	Yes
`parameters`	An array of name/value pairs used to fill placeholders found in properties like `Application.arguments`. The name must be a string of one or more word characters (a-z, A-Z, 0-9, _). The value can be a string of 0 or more characters of any kind. Example: [ { name: "iterations", value: "10"}, { name: "input_file", value: "mydata.xml" }, { name: "variable_x", value: "${x}"} ]	`list[object]`	No
`poolId`	The OCID of a pool. A unique ID that identifies a Data Flow pool resource.	`string`	No
`privateEndpointId`	The OCID of a private endpoint.	`string`	No
`sparkVersion`	The Spark version utilized to run the application.	`string`	Yes
`type`	The Spark application processing type.	`string`	No
`warehouseBucketUri`	An Oracle Cloud Infrastructure URI of the bucket to be used as default warehouse directory for BATCH SQL runs. See https://docs.oracle.com/iaas/Content/API/SDKDocs/hdfsconnector.htm#uriformat.	`string`	No
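
The interaction between `arguments` and `parameters` described above can be approximated locally. The following is a minimal sketch, assuming placeholders take the `${name}` form used in the `${input_file}` example; the function name `resolve_arguments` is hypothetical and this is not the service's implementation. It raises an error for an unmatched placeholder, mirroring the documented HTTP 400 failure.

```python
import re

# Matches ${name}, where name is one or more word characters (a-z, A-Z, 0-9, _).
PLACEHOLDER = re.compile(r"\$\{(\w+)\}", flags=re.ASCII)


def resolve_arguments(arguments, parameters):
    """Replace ${name} placeholders in arguments using the parameters list.

    Raises KeyError when a placeholder has no matching parameter. This is a
    local approximation of the documented behavior, not the service's code.
    """
    values = {p["name"]: p["value"] for p in parameters}

    def substitute(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"placeholder ${{{name}}} has no entry in parameters")
        return values[name]

    return [PLACEHOLDER.sub(substitute, arg) for arg in arguments]


# Values taken from the examples in the arguments and parameters rows.
arguments = ["--input", "${input_file}", "--name", "John Doe"]
parameters = [
    {"name": "iterations", "value": "10"},
    {"name": "input_file", "value": "mydata.xml"},
]
resolved = resolve_arguments(arguments, parameters)
print(resolved)  # ['--input', 'mydata.xml', '--name', 'John Doe']
```

Running a check like this before creating or updating an Application may surface a missing parameter earlier than the API call would.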

Status Fields

This summary shows the top-level status fields. Use the full API reference for nested fields, defaults, and enum values.

Field	Description	Type	Required
`applicationLogConfig`	ApplicationLogConfig defines nested fields for Application.ApplicationLogConfig.	`object`	No
`archiveUri`	A comma-separated list of one or more archive files, given as Oracle Cloud Infrastructure URIs. For example, `oci://path/to/a.zip,oci://path/to/b.zip`. Each URI points to an archive.zip file containing custom dependencies that may be used to support the execution of a Python, Java, or Scala application. See https://docs.oracle.com/iaas/Content/API/SDKDocs/hdfsconnector.htm#uriformat.	`string`	No
`arguments`	The arguments passed to the running application as command-line arguments. An argument is either plain text or a placeholder. Placeholders are replaced using values from the parameters map. Each placeholder must have a matching entry in the parameters map; otherwise the request (POST or PUT) fails with an HTTP 400 status code. Placeholders are specified as `${name}`, where `name` is the name of the parameter. Example: `[ "--input", "${input_file}", "--name", "John Doe" ]`. If "input_file" has a value of "mydata.xml", the example above is translated to `--input mydata.xml --name "John Doe"`.	`list[string]`	No
`className`	The class for the application.	`string`	No
`compartmentId`	The OCID of a compartment.	`string`	No
`configuration`	The Spark configuration passed to the running process. See https://spark.apache.org/docs/latest/configuration.html#available-properties. Example: { "spark.app.name" : "My App Name", "spark.shuffle.io.maxRetries" : "4" } Note: Not all Spark properties are permitted to be set. Attempting to set a property that is not allowed to be overwritten will cause a 400 status to be returned.	`map[string, string]`	No
`definedTags`	Defined tags for this resource. Each key is predefined and scoped to a namespace. For more information, see Resource Tags (https://docs.oracle.com/iaas/Content/General/Concepts/resourcetags.htm). Example: `{"Operations": {"CostCenter": "42"}}`	`map[string, map[string, string]]`	No
`description`	A user-friendly description.	`string`	No
`displayName`	A user-friendly name. This name is not necessarily unique.	`string`	No
`driverShape`	The VM shape for the driver. Sets the driver cores and memory.	`string`	No
`driverShapeConfig`	ApplicationDriverShapeConfig defines nested fields for Application.DriverShapeConfig.	`object`	No
`execute`	The input used for the spark-submit command. For more details, see https://spark.apache.org/docs/latest/submitting-applications.html#launching-applications-with-spark-submit. Supported options include `--class`, `--files`, `--jars`, `--conf`, `--py-files`, and the main application file with arguments. Example: `--jars oci://path/to/a.jar,oci://path/to/b.jar --files oci://path/to/a.json,oci://path/to/b.csv --py-files oci://path/to/a.py,oci://path/to/b.py --conf spark.sql.crossJoin.enabled=true --class org.apache.spark.examples.SparkPi oci://path/to/main.jar 10` Note: If execute is specified together with applicationId, className, configuration, fileUri, language, arguments, or parameters during application create/update or run create/submit, the Data Flow service uses only the information derived from the execute input.	`string`	No
`executorShape`	The VM shape for the executors. Sets the executor cores and memory.	`string`	No
`executorShapeConfig`	ApplicationExecutorShapeConfig defines nested fields for Application.ExecutorShapeConfig.	`object`	No
`fileUri`	An Oracle Cloud Infrastructure URI of the file containing the application to execute. See https://docs.oracle.com/iaas/Content/API/SDKDocs/hdfsconnector.htm#uriformat.	`string`	No
`freeformTags`	Free-form tags for this resource. Each tag is a simple key-value pair with no predefined name, type, or namespace. For more information, see Resource Tags (https://docs.oracle.com/iaas/Content/General/Concepts/resourcetags.htm). Example: `{"Department": "Finance"}`	`map[string, string]`	No
`id`	The application ID.	`string`	No
`idleTimeoutInMinutes`	The timeout value in minutes used to manage Runs. A Run is stopped after it has been inactive for this amount of time. Note: This parameter currently applies only to Runs of type `SESSION`. The default value is 2880 minutes (2 days).	`integer (int64)`	No
`language`	The Spark language.	`string`	No
`lifecycleState`	The current state of this application.	`string`	No
`logsBucketUri`	An Oracle Cloud Infrastructure URI of the bucket where the Spark job logs are to be uploaded. See https://docs.oracle.com/iaas/Content/API/SDKDocs/hdfsconnector.htm#uriformat.	`string`	No
`maxDurationInMinutes`	The maximum duration in minutes for which an Application should run. A Data Flow Run is terminated once it reaches this duration, measured from the time it transitions to the `IN_PROGRESS` state.	`integer (int64)`	No
`metastoreId`	The OCID of OCI Hive Metastore.	`string`	No
`numExecutors`	The number of executor VMs requested.	`integer`	No
`ownerPrincipalId`	The OCID of the user who created the resource.	`string`	No
`ownerUserName`	The username of the user who created the resource. If the username of the owner does not exist, `null` will be returned and the caller should refer to the ownerPrincipalId value instead.	`string`	No
`parameters`	An array of name/value pairs used to fill placeholders found in properties like `Application.arguments`. The name must be a string of one or more word characters (a-z, A-Z, 0-9, _). The value can be a string of 0 or more characters of any kind. Example: [ { name: "iterations", value: "10"}, { name: "input_file", value: "mydata.xml" }, { name: "variable_x", value: "${x}"} ]	`list[object]`	No
`poolId`	The OCID of a pool. A unique ID that identifies a Data Flow pool resource.	`string`	No
`privateEndpointId`	The OCID of a private endpoint.	`string`	No
`sparkVersion`	The Spark version utilized to run the application.	`string`	No
`status`	-	`object`	Yes
`timeCreated`	The date and time the resource was created, expressed in RFC 3339 (https://tools.ietf.org/html/rfc3339) timestamp format. Example: `2018-04-03T21:10:29.600Z`	`string`	No
`timeUpdated`	The date and time the resource was updated, expressed in RFC 3339 (https://tools.ietf.org/html/rfc3339) timestamp format. Example: `2018-04-03T21:10:29.600Z`	`string`	No
`type`	The Spark application processing type.	`string`	No
`warehouseBucketUri`	An Oracle Cloud Infrastructure URI of the bucket to be used as default warehouse directory for BATCH SQL runs. See https://docs.oracle.com/iaas/Content/API/SDKDocs/hdfsconnector.htm#uriformat.	`string`	No
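
The `timeCreated` and `timeUpdated` values are RFC 3339 strings, so a client reading the status may want them as timezone-aware datetimes. The sketch below assumes the timestamp always carries fractional seconds and a `Z` suffix, as in the documented example; the helper name `parse_status_timestamp` is hypothetical, and timestamps in other RFC 3339 variants would need additional handling.

```python
from datetime import datetime


def parse_status_timestamp(value):
    """Parse a timestamp shaped like 2018-04-03T21:10:29.600Z into an aware datetime.

    Assumption: fractional seconds and a trailing Z are always present.
    The %z directive accepts a literal Z on Python 3.7 and later.
    """
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%f%z")


# Example value copied from the timeCreated row.
created = parse_status_timestamp("2018-04-03T21:10:29.600Z")
print(created.isoformat())  # 2018-04-03T21:10:29.600000+00:00
```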

Sample Manifest

This example is generated from the checked-in sample manifest at config/samples/dataflow_v1beta1_application.yaml. Replace placeholder values before applying it.

Open the rendered sample page

#
# Copyright (c) 2021, Oracle and/or its affiliates. All rights reserved.
# Licensed under the Universal Permissive License v 1.0 as shown at http://oss.oracle.com/licenses/upl.
#

#
# Replace the OCI identifiers and Object Storage URI below before running e2e.
# Update metadata.name and spec.displayName if you want to force a fresh create
# instead of reusing an existing Application with the same display name in the
# same compartment.
# Replace the starter shapes and Spark version with values currently supported in
# your region if needed.
#
apiVersion: dataflow.oracle.com/v1beta1
kind: Application
metadata:
  name: application-sample
spec:
  compartmentId: ocid1.compartment.oc1..exampleuniqueID
  displayName: "application-sample"
  driverShape: "VM.Standard.E4.Flex"
  executorShape: "VM.Standard.E4.Flex"
  language: "PYTHON"
  numExecutors: 2
  sparkVersion: "3.5.0"
  fileUri: "oci://bucket@namespace/app/main.py"
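
Before applying a manifest like the one above, a small pre-flight check can confirm that every field marked as required in the Spec Fields table is present and that the placeholder compartment OCID has been replaced. This is a minimal sketch: the function names are hypothetical, the manifest is written inline as a Python dict rather than loaded from the YAML file, and the `exampleuniqueID` suffix test is a heuristic based only on the sample's placeholder value.

```python
# Required top-level spec fields, taken from the Required column of the Spec Fields table.
REQUIRED_SPEC_FIELDS = (
    "compartmentId",
    "displayName",
    "driverShape",
    "executorShape",
    "language",
    "numExecutors",
    "sparkVersion",
)


def missing_required_fields(manifest):
    """Return the required spec fields that are absent or empty in the manifest."""
    spec = manifest.get("spec") or {}
    return [name for name in REQUIRED_SPEC_FIELDS if spec.get(name) in (None, "")]


def has_placeholder_compartment(manifest):
    """Heuristic (assumption): the sample placeholder OCID ends in 'exampleuniqueID'."""
    spec = manifest.get("spec") or {}
    return str(spec.get("compartmentId", "")).endswith("exampleuniqueID")


# The sample manifest, mirrored as a dict for illustration.
manifest = {
    "apiVersion": "dataflow.oracle.com/v1beta1",
    "kind": "Application",
    "metadata": {"name": "application-sample"},
    "spec": {
        "compartmentId": "ocid1.compartment.oc1..exampleuniqueID",
        "displayName": "application-sample",
        "driverShape": "VM.Standard.E4.Flex",
        "executorShape": "VM.Standard.E4.Flex",
        "language": "PYTHON",
        "numExecutors": 2,
        "sparkVersion": "3.5.0",
        "fileUri": "oci://bucket@namespace/app/main.py",
    },
}

print(missing_required_fields(manifest))      # []
print(has_placeholder_compartment(manifest))  # True
```

With the unmodified sample, the first check passes and the second flags the compartment OCID as still being a placeholder.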