Add spark k8s operator launcher #1225

Merged: 3 commits into feast-dev:master from k8s-operator-support, Dec 16, 2020

oavdeev (Collaborator) commented Dec 11, 2020

What this PR does / why we need it:
This allows Feast to run Spark jobs using spark-k8s-operator.

To enable it:

  1. Set spark_launcher="k8s".
  2. Set spark_k8s_use_incluster_config to True if Feast runs in the same k8s cluster as Spark, or to False otherwise. Make sure the Feast service account has permission to create SparkApplication resources. If Feast runs outside the cluster, provide KUBECONFIG.
  3. You typically also need to set spark_staging_location and historical_feature_output_location; use the s3a:// URL scheme if they point to S3.
  4. Finally, set spark_k8s_job_template_path to a YAML template containing the Spark application configuration. Feast ships with a default template, but in production you will likely want to provide a custom one.
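
For illustration, the four steps could be collected into one set of options like this. This is only a sketch: the option names come from this PR, but the bucket, the template path, and the idea that each option maps to a FEAST_-prefixed upper-case environment variable are assumptions, and the two helper functions are hypothetical.

```python
# Hypothetical values throughout; only the option names come from this PR.
FEAST_K8S_OPTIONS = {
    "spark_launcher": "k8s",
    # "True" when Feast runs inside the same k8s cluster as Spark.
    "spark_k8s_use_incluster_config": "True",
    "spark_staging_location": "s3a://example-bucket/staging/",
    "historical_feature_output_location": "s3a://example-bucket/historical/",
    "spark_k8s_job_template_path": "/etc/feast/spark-job-template.yaml",
}


def to_env_vars(options):
    """Map option names to environment variable names.

    Assumption: each option is read from a FEAST_-prefixed,
    upper-cased environment variable.
    """
    return {f"FEAST_{name.upper()}": value for name, value in options.items()}


def locations_needing_s3a(options):
    """Return the location options that still use s3:// instead of s3a://."""
    keys = ("spark_staging_location", "historical_feature_output_location")
    return [key for key in keys if options.get(key, "").startswith("s3://")]


if __name__ == "__main__":
    for key, value in to_env_vars(FEAST_K8S_OPTIONS).items():
        print(f"export {key}={value}")
```

The check in locations_needing_s3a only flags plain s3:// locations, since step 3 asks for s3a:// when the launcher is OSS Spark.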

Additional changes:

  • The storage client now supports the s3a:// URL scheme. OSS Spark uses it to access S3; we didn't need it for EMR, which understands the s3:// scheme just fine.
  • For historical pyspark jobs, arguments are now passed as base64-encoded JSON. Previously they were only JSON-encoded, which caused quoting issues with OSS Spark (I think the issue is somewhere in the Docker entrypoint script provided with Spark, not in the k8s operator itself).
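
A rough sketch of both ideas, not the actual code in this PR: the function names and the sample arguments are hypothetical. Wrapping the JSON in base64 leaves a single token with no quotes or spaces for a shell entrypoint to mangle, and the URL helper simply treats s3a:// the same way as s3://.

```python
import base64
import json
from urllib.parse import urlparse


def encode_job_args(args):
    """JSON-encode, then base64-encode, so the result has no quotes or spaces."""
    raw = json.dumps(args).encode("utf-8")
    return base64.b64encode(raw).decode("ascii")


def decode_job_args(encoded):
    """Reverse of encode_job_args, as the pyspark job side would do."""
    return json.loads(base64.b64decode(encoded).decode("utf-8"))


def split_s3_url(url):
    """Return (bucket, key) for both s3:// and s3a:// URLs."""
    parsed = urlparse(url)
    if parsed.scheme not in ("s3", "s3a"):
        raise ValueError(f"unsupported URL scheme: {parsed.scheme!r}")
    return parsed.netloc, parsed.path.lstrip("/")


if __name__ == "__main__":
    # Hypothetical job arguments containing characters that need quoting.
    args = {"feature_tables": [{"name": "driver stats", "note": 'say "hi"'}]}
    token = encode_job_args(args)
    print(token)
    print(decode_job_args(token) == args)
    print(split_s3_url("s3a://example-bucket/staging/job.py"))
```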

Does this PR introduce a user-facing change?:

Feast now supports launching Spark jobs using the Spark k8s operator.

oavdeev (Collaborator, Author) commented Dec 11, 2020

/kind feature

@feast-ci-bot feast-ci-bot added kind/feature New feature or request size/XL and removed needs-kind labels Dec 11, 2020
@oavdeev oavdeev force-pushed the k8s-operator-support branch 2 times, most recently from e77ebd4 to a5f7960 Compare December 11, 2020 18:17
oavdeev (Collaborator, Author) commented Dec 11, 2020

/retest

@oavdeev oavdeev force-pushed the k8s-operator-support branch from a5f7960 to 51a63f4 Compare December 12, 2020 03:04
@oavdeev oavdeev changed the title WIP add spark k8s operator launcher Add spark k8s operator launcher Dec 12, 2020
@oavdeev oavdeev force-pushed the k8s-operator-support branch from 51a63f4 to ca466af Compare December 15, 2020 01:48
Signed-off-by: Oleg Avdeev <oleg.v.avdeev@gmail.com>
@oavdeev oavdeev force-pushed the k8s-operator-support branch from ca466af to d855776 Compare December 15, 2020 22:33
Signed-off-by: Oleg Avdeev <oleg.v.avdeev@gmail.com>
@oavdeev oavdeev force-pushed the k8s-operator-support branch from bef6877 to d463fba Compare December 16, 2020 04:24
@@ -265,7 +264,7 @@ def _stage_file(self, file_path: str, job_id: str) -> str:
         return blob_uri_str

     def dataproc_submit(
-        self, job_params: SparkJobParameters
+        self, job_params: SparkJobParameters, extra_properties: Dict[str, str]
woop (Member) commented Dec 16, 2020

I'm not sure if you saw #1198. Should we close that PR after yours has been merged, or first merge that one?

oavdeev (Collaborator, Author)

No, I haven't. It looks like it solves a slightly different problem, namely user-specified extra options. I don't feel strongly about which one to merge first (and which one to rebase).

woop (Member) commented Dec 16, 2020

@oavdeev looks good overall 👍

The only things that stand out are docs (tracked elsewhere in #1231) and tests. I'd feel a bit more comfortable having the tests as part of this PR, or did you have another plan (minikube/kind based)?

Signed-off-by: Oleg Avdeev <oleg.v.avdeev@gmail.com>
oavdeev (Collaborator, Author) commented Dec 16, 2020

> I'd feel a bit more comfortable having the tests as part of this PR, or did you have another plan (minikube/kind based)?

I'd rather add tests in a follow-up PR. I plan to add another integration test alongside test-end-to-end-[aws,gcp], and setting this up tends to make a PR nearly unreadable because of all the tweaks, rebases, and retests.

feast-ci-bot (Collaborator)

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: oavdeev, woop

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

woop (Member) commented Dec 16, 2020

/lgtm

@feast-ci-bot feast-ci-bot merged commit 92800d9 into feast-dev:master Dec 16, 2020

4 participants