Skip to content

add functions for creating ray with oauth proxy in front of the dashboard #298

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Oct 20, 2023

Conversation

KPostOffice
Copy link
Collaborator

@KPostOffice KPostOffice commented Aug 10, 2023

Issue link

closes: #174

What changes have been made

I added functions which create the necessary objects and update the appwrapper so that the SDK can create a ray cluster with an OAuth Proxy in front of the dashboard.

  • OAuth sidecar container and tls secret mount
  • Ingress with annotations for tls passthrough
  • Service which points to OAuth port in the sidecar container and annotations for autogenerated tls certs
  • Service account which contains the OAuth Callback
  • ClusterRoleBinding giving OAuth sidecar necessary permissions

Verification steps

go through notebook but set openshift_oauth to true in the config

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • Testing is not required for this change

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Aug 10, 2023
@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Aug 10, 2023
@KPostOffice
Copy link
Collaborator Author

This isn't quite finished yet. I'm having tls related issues between the Ray API and the OAuth proxy.

@KPostOffice KPostOffice added the tide/merge-blocker Denotes an issue that blocks the tide merge queue for a branch while it is open. label Aug 14, 2023
@KPostOffice
Copy link
Collaborator Author

KPostOffice commented Aug 14, 2023

Blocked due to required changes upstream with the Ray Job Submission Client and TorchX

(edit: submission client has necessary changes. I was using an older version locally)

@openshift-merge-robot openshift-merge-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 11, 2023
@blublinsky
Copy link

This URL is used not only for dashboard access but also for the APIs - currently jobs and In future also Serve. Oath proxy requires manual log in, which will break these APIs

@KPostOffice KPostOffice changed the title WIP: add functions for creating ray with oauth proxy in front of the dashboard add functions for creating ray with oauth proxy in front of the dashboard Sep 14, 2023
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Sep 14, 2023
@KPostOffice KPostOffice removed the tide/merge-blocker Denotes an issue that blocks the tide merge queue for a branch while it is open. label Sep 14, 2023
@KPostOffice
Copy link
Collaborator Author

@blublinsky I've launched OAuth so that it supports both GUI manual login as well as authentication using an auth bearer token. Can you PTAL?

@KPostOffice KPostOffice force-pushed the issue174v1 branch 2 times, most recently from c064fec to 85a153e Compare September 14, 2023 20:39
@blublinsky
Copy link

@blublinsky I've launched OAuth so that it supports both GUI manual login as well as authentication using an auth bearer token. Can you PTAL?

How do you get the auth bearer token to use?

@KPostOffice
Copy link
Collaborator Author

@blublinsky I've launched OAuth so that it supports both GUI manual login as well as authentication using an auth bearer token. Can you PTAL?

How do you get the auth bearer token to use?

https://github.com/project-codeflare/codeflare-sdk/pull/298/files#diff-5706a96c5346dd54a22370211260411363c4ed41d798fddf240eb8983b1366eaR82-R86

It is the same token used for interacting with the kubernetes api with the kubectl client. I grab it here based on the currently logged in user or in cluster config. I believe it is the same token as the one you generate from the copy login command page in the OpenShift client.

@blublinsky
Copy link

It is the same token used for interacting with the kubernetes api with the kubectl client. I grab it here based on the currently logged in user or in cluster config. I believe it is the same token as the one you generate from the copy login command page in the OpenShift client.

So basically the only thing that you are validating is that user is logged in. Not very secure, really

@KPostOffice
Copy link
Collaborator Author

It is the same token used for interacting with the kubernetes api with the kubectl client. I grab it here based on the currently logged in user or in cluster config. I believe it is the same token as the one you generate from the copy login command page in the OpenShift client.

So basically the only thing that you are validating is that user is logged in. Not very secure, really

https://github.com/project-codeflare/codeflare-sdk/pull/298/files#diff-279fd895cfcb8295f1b610826f8eb7a40fd543ff873661acd9dc452cef3d9d14R423

The OAuth Proxy supports checking for authorization based on RBAC. Here I check to make sure the authenticated user has authorization to get pods in the given namespace.

@blublinsky
Copy link

So it is basically a single cluster solution

@KPostOffice
Copy link
Collaborator Author

So it is basically a single cluster solution

I'm unsure what you are expecting. How would you propose we authenticate for a multi-cluster design here? AFAIK, the codeflare-sdk expects the user to be logged into a specific k8 cluster.

@blublinsky
Copy link

I'm unsure what you are expecting. How would you propose we authenticate for a multi-cluster design here? AFAIK, the codeflare-sdk expects the user to be logged into a specific k8 cluster.

I do not think authentication/authorization should be linked to a given cluster, It should rather be for a given user, who can be logged in any of the several clusters, but that's just my opinion

@KPostOffice
Copy link
Collaborator Author

I was running into issues with the generate CA cert from the service annotation. Putting this here so I remember to look into the ca-injector.

https://cert-manager.io/docs/concepts/ca-injector/

@openshift-merge-robot openshift-merge-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Sep 18, 2023
@KPostOffice
Copy link
Collaborator Author

KPostOffice commented Sep 18, 2023

I'm unsure what you are expecting. How would you propose we authenticate for a multi-cluster design here? AFAIK, the codeflare-sdk expects the user to be logged into a specific k8 cluster.

I do not think authentication/authorization should be linked to a given cluster, It should rather be for a given user, who can be logged in any of the several clusters, but that's just my opinion

Where do you propose authentication and authorization occur in this case?

@KPostOffice KPostOffice force-pushed the issue174v1 branch 4 times, most recently from 761a2df to 8a83cbc Compare September 22, 2023 20:48
@MichaelClifford
Copy link
Collaborator

Tested out starting a ray cluster from local machine and submitting jobs. Seems to work as expected when logging in via sdk authentication or oc login.

lgtm

@Maxusmusti could you please review as well.

@KPostOffice KPostOffice force-pushed the issue174v1 branch 9 times, most recently from 8cec67f to 104f77c Compare October 5, 2023 21:36
@anishasthana
Copy link
Contributor

@KPostOffice could you create a follow-on task to switch this over to istio (once we have istio available ootb in ODH). Or at least make it configurable somehow. Alternatively, the follow-on winds up being "Remove this, and update the kuberay operator to allow auth enablement"

Copy link
Contributor

@Bobbins228 Bobbins228 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Verified working on OSD cluster awesome work Kevin /approve

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Oct 20, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Oct 20, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Bobbins228

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 20, 2023
@Bobbins228 Bobbins228 merged commit e44dd8e into project-codeflare:main Oct 20, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. lgtm Indicates that a PR is ready to be merged.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

OAuth Proxy for Ray Dashboard
6 participants