-
Notifications
You must be signed in to change notification settings - Fork 5
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[feat] Implement posit workbench credentials strategy and make credentials strategy fallback options more explicit #384
base: main
Are you sure you want to change the base?
Conversation
Hey there! 👋 We noticed that the title of your pull request doesn't follow the Conventional Commits specification. To ensure consistency, we kindly ask you to adjust the title accordingly. Here are the details:
|
|
||
|
||
class CredentialsStrategy(abc.ABC): | ||
class CredentialsStrategyWrapper(CredentialsStrategy): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Need to revisit this. I'd rather not depend on the databricks sdk but this violates the TYPE_CHECKING
import above.
return self._connect_strategy(*args, **kwargs) | ||
else: | ||
print() | ||
# log and continue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like move these log messages into the init so that they aren't printed everytime the user requests credentials
return positLocalContentCredentialsProvider( | ||
self._token_endpoint_url, | ||
self._client_id, | ||
self._client_secret, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you remind me again why we need this? Shouldn't we be able to just pass the user session token to client.oauth.get_credentials()?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we be able to just pass the user session token to client.oauth.get_credentials()?
cient.oauth.get_credentials
and client.oauth.get_content_credentials
are only available when the content is running on Connect. This helper would allow content running locally (or in Workbench) to obtain the same service principal credential that it would use when running on a Connect server.
Originally we had this because the databricks-cli doesn't support service principal auth however after looking more closely at the databricks-sdk, I think we should be able to use some of the provided service_principal credentials strategies rather than implementing this ourselves.
""" | ||
def __init__(self, | ||
client: Optional[Client] = None, | ||
user_session_token: Optional[str] = None, | ||
): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
""" | |
def __init__(self, | |
client: Optional[Client] = None, | |
user_session_token: Optional[str] = None, | |
): | |
""" | |
def __init__(self, | |
client: Optional[Client] = None, | |
*, | |
user_session_token: Optional[str] = None, | |
content_session_token: Optional[str] = None, | |
): |
Would keep these explicitly keyword arguments and then allows the dev to pass the token with an appropriately named param.
(1). "Posit", "Credentials", "Strategy", "posit_strategy", "credentials_strategy", "connect_strategy", "PositCredentials", "PositConnectCredentials", "PositWorkbenchCredentials" ... |
(2). |
(3). |
(4) |
(5). |
(6). |
(7) here's where I completely lost the logic trail. |
@@ -25,12 +29,11 @@ def server(i: Inputs, o: Outputs, session: Session): | |||
session_token = session.http_conn.headers.get("Posit-Connect-User-Session-Token") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A nit, and this applies to all the examples:
From the developer’s POV, I am in local/workbench first. It feels out of order to have the Connect-specific session token defined so early. My mental model is to get through the parts about defining how to handle creds in development vs deployment, then I’d start putting Connect-specific info.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could be a constraint given the custom nature of this integration. The way it is shown here, this code can be written once and then adapt to the environment it is running it. But that means there is a lowest common denominator in terms of DX as a tradeoff. Ideally, devs are thinking about how their code would work in production as well if that is where they plan to deploy to.
Referencing your other comments though, if we had a way to grab the user session token for the dev if/when it is needed then that could be done behind the scenes at the appropriate time simplifying this immensely.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
posit_strategy = PositCredentialsStrategy(
local_strategy=databricks_cli,
workbench_strategy=PositWorkbenchCredentialsStrategy(Config(profile="workbench")),
connect_strategy=PositConnectCredentialsStrategy(user_session_token=session_token),
)
could become:
posit_strategy = PositCredentialsStrategy()
allowing for overrides to defaults, but otherwise it just grabs what it needs under the hood.
🪱 🥫 |
from posit.connect.external.databricks import ( | ||
PositCredentialsStrategy, | ||
PositConnectCredentialsStrategy, | ||
PositWorkbenchCredentialsStrategy, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a passing observation: it's odd to see PositWorkbenchCredentialsStrategy
inside the posit.connect
module. They aren't really relevant for Connect, right? Is it time to create posit.workbench
to contain these?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll move this before the PR becomes final. For now it's easier to iterate with everything inside a single module.
Great point! I think the trouble with doing this for people though is that its location is dependent on the app/framework they are using. Would be neat to automatically find it though, I agree. |
Any of the various |
We can do my best to hide some of this from the user but this is an artifact of the databricks SDK, not something we are imposing in our client.
Right. As Matt said, this is framework-dependent. We could consider adding a helper for each framework but that feels like it will lead to even more confusion like we see with
Good call. I think we can do
We can revisit this but if the choice is The way this is implemented at the moment, if
If you want to write content that only works on workbench then you don't need the More broadly, the word choices "CredentialsStrategy" and "CredentialsProvider" are constructs from the databricks-sdk, not something we came up with here. We tried to be consistent with their naming in the SDK to avoid confusion but we can call these things whatever we want.
This one is just a var name so an easy fix. How about we just call this
This is a major part of the friction with implementing these helpers. We are trying to make something easy to do for our users when some of these libraries (databricks-sdk and databricks-sql) aren't even compatible inside of Databricks' own ecosystem of tools. |
Oh an regarding the circular reference mentioned in |
d695abf
to
3d0e55b
Compare
@kmasiello could you take another look at this when you have a minute? I've done some refactoring based on your feedback. One of the main issues we had before was that there was this circular dependency between a import streamlit as st
from databricks.sdk.core import ApiClient
from databricks.sdk.service.iam import CurrentUserAPI
from posit.connect.external.databricks import (
new_config,
ConnectStrategy,
)
session_token = st.context.headers.get("Posit-Connect-User-Session-Token")
cfg = new_config(
posit_connect_strategy=ConnectStrategy(
user_session_token=session_token
),
)
databricks_user = CurrentUserAPI(ApiClient(cfg)).me()
st.write(f"Hello, {databricks_user.display_name}!") Unfortunately we can't really get rid of the need to pass in the session_token arg when specifying the connect strategy using viewer auth but hopefully this is an improvement. If you want to use a Service Account oauth integration when running on Connect then the empty default configuration should be sufficient. This code should work locally, on workbench, and on Connect: import streamlit as st
from databricks.sdk.core import ApiClient
from databricks.sdk.service.iam import CurrentUserAPI
from posit.connect.external.databricks import new_config
cfg = new_config()
databricks_user = CurrentUserAPI(ApiClient(cfg)).me()
st.write(f"Hello, {databricks_user.display_name}!") |
Closes #382
Adding this draft PR for initial review before we commit the changes to these APIs.
These changes make the separation between the different credential strategies more explicit and tries to be better about detecting the difference between local and workbench content.
It also combines the
Content
andViewer
strategy into a single new class:PositConnectCredentialStrategy
. Ifuser-session-token
is provided then we use the viewer implementation. If not then we should fall back to the content credentials implementation. I'm not sure if this is the right choice so looking for feedback on this.I also modified some of the naming in the hopes of the choice of which strategy to use more obvious to end users of these helpers.