Fix thrift client connection for Kerberos Hive Client #1747
IMHO, it would be better to define the client as a `cached_property` and just return it.
+1 for moving the init out of the `__init__` func. Converting the client to a `cached_property` will help with this, since the user will then be able to use the client with and without a context manager.
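A minimal sketch of that suggestion, using stand-in classes instead of the real Thrift types. `FakeTransport`, `FakeClient`, and `LazyHiveClient` are hypothetical names for illustration only; they are not pyiceberg's classes, and the real `_HiveClient` may be structured differently.

```python
from functools import cached_property


class FakeTransport:
    """Stand-in for a Thrift transport (hypothetical, not the real class)."""

    def __init__(self) -> None:
        self._open = False

    def isOpen(self) -> bool:
        return self._open

    def open(self) -> None:
        self._open = True

    def close(self) -> None:
        self._open = False


class FakeClient:
    """Stand-in for the generated metastore client."""

    def __init__(self, transport: FakeTransport) -> None:
        self.transport = transport

    def get_all_databases(self) -> list:
        if not self.transport.isOpen():
            raise RuntimeError("Transport not open")
        return ["default"]


class LazyHiveClient:
    """__init__ only stores state; the client is built on first access."""

    def __init__(self, uri: str) -> None:
        self.uri = uri
        self.transport = FakeTransport()

    @cached_property
    def client(self) -> FakeClient:
        # Runs once; later accesses return the cached instance.
        if not self.transport.isOpen():
            self.transport.open()
        return FakeClient(self.transport)

    def __enter__(self) -> FakeClient:
        if not self.transport.isOpen():
            self.transport.open()
        return self.client

    def __exit__(self, *exc_info: object) -> None:
        if self.transport.isOpen():
            self.transport.close()
```

With this shape, `LazyHiveClient(...).client.get_all_databases()` works without a `with` block, and `with LazyHiveClient(...) as client:` works too. One caveat of the sketch: after the `with` block exits, the cached client still wraps the closed transport.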
@kevinjqliu I think this one is also good for 0.9.1
Gentle ping @kevinjqliu. Any thoughts on the
Hi @hussein-awala, just to make sure I have this right, would this mean moving most of the logic in

Edit: I'm asking because there might be cases where the above could actually result in a user invoking methods on a client whose underlying transport has been closed. For example:

    hive_client = _HiveClient(...)
    p_client = hive_client.client
    with hive_client as open_client:
        print(open_client.get_all_databases())
    print(p_client.get_all_databases())  # Results in TTransportException: Transport not open

So it's likely that this is not what you meant.
Thanks for chiming in @hussein-awala @mnzpk. Please take a look at the new implementation. The context manager ( @mnzpk could you give this a try?
6b8d2ee to 7b21b5b
CI is currently failing for the main branch; see https://github.com/apache/iceberg-python/pull/1899/files#r2040915222
7b21b5b to d788c8d
Cool, CI passes now. @mnzpk please take a look when you have time :)
Thanks for fixing this @kevinjqliu 🙌
Closes #1744
`TSaslClientTransport` cannot be reopened. This PR changes the behavior to recreate a `TSaslClientTransport` when it's already closed.
Note: `_HiveClient` should be used with a context manager, but can be used without one.
I was getting this exact error. Thanks for fixing this issue. Any idea how I can provide Kerberos-related details while creating the catalog, like kerberos principal, kerberos_keytab, kerberos_service_name, kerberos_user? I tried searching the documentation but could not find anything except `"hive.kerberos-authentication": "true"`.
@abhisheksinha-pty I'm not really sure how those params are passed into Hive/Kerberos. Here is where we create the Hive client; I would assume you have to pass the params in there somehow. Do you know how they're passed into a regular Hive client?
Apologies for not reviewing this sooner, but unfortunately, installing from `main`, I can still reproduce the reported issue. I've left a few comments with what I think would fix that. I've also added a few tests here that would allow testing this without having Kerberos auth or a kerberized metastore instance set up. Not sure how useful you think those would be, but let me know!
Thanks so much for all your work here.
    """Make sure the transport is initialized and open."""
    if not self._transport.isOpen():
        try:
            self._transport.open()
It seems that we're still going to try re-opening the transport once it has been closed, and since the exception raised in that case would be a `TypeError`, it would not be caught by the `except` below.
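The point can be reproduced without Thrift. In this sketch, `TransportError` stands in for `TTransport.TTransportException`, and `SingleUseTransport` is a hypothetical transport whose `open()` raises `TypeError` once it has been closed. That failure mode is taken from the comment above and is assumed rather than verified against the real `TSaslClientTransport`.

```python
class TransportError(Exception):
    """Stand-in for TTransport.TTransportException."""


class SingleUseTransport:
    """Hypothetical transport that cannot be reopened after close()."""

    def __init__(self) -> None:
        self._open = False
        self._closed_once = False

    def isOpen(self) -> bool:
        return self._open

    def open(self) -> None:
        if self._closed_once:
            # Assumed failure mode: a non-transport exception on reopen.
            raise TypeError("transport state was discarded on close")
        self._open = True

    def close(self) -> None:
        self._open = False
        self._closed_once = True


def _fresh_transport() -> SingleUseTransport:
    """Build and open a replacement transport."""
    fresh = SingleUseTransport()
    fresh.open()
    return fresh


def reopen_narrow(transport: SingleUseTransport) -> SingleUseTransport:
    """Catches only TransportError, so the TypeError escapes."""
    try:
        transport.open()
        return transport
    except TransportError:
        return _fresh_transport()


def reopen_broad(transport: SingleUseTransport) -> SingleUseTransport:
    """Also catches TypeError and falls back to a fresh transport."""
    try:
        transport.open()
        return transport
    except (TransportError, TypeError):
        return _fresh_transport()
```

Calling `reopen_narrow` on a transport that has already been closed propagates the `TypeError`, while `reopen_broad` returns a new, open transport. Whether widening the `except` or skipping the reopen attempt altogether is the better fix is a design choice this sketch does not settle.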
            self._transport.open()
        except TTransport.TTransportException:
            # reinitialize _transport
            self._transport = self._init_thrift_transport()
We're re-initializing the transport, but since `self._client` is a `cached_property`, it'd still point to the old (and now closed) transport, so I think we'd also want to `del`ete `self._client` so that it gets re-created?
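A small sketch of why the cached value goes stale and how dropping it helps. `Transport` and `Holder` are hypothetical stand-ins, and the "client" here is just a dict recording which transport it was built on.

```python
from functools import cached_property


class Transport:
    """Hypothetical transport; instances are only compared by identity."""


class Holder:
    def __init__(self) -> None:
        self._transport = Transport()

    @cached_property
    def _client(self) -> dict:
        # Built once from whatever self._transport is at first access.
        return {"transport": self._transport}

    def reinit_stale(self) -> None:
        # Replaces the transport but leaves the cached client untouched.
        self._transport = Transport()

    def reinit_fresh(self) -> None:
        self._transport = Transport()
        # Drop the cached value so the next access rebuilds the client.
        # pop() is used instead of `del self._client` because `del` raises
        # AttributeError when the property has not been accessed yet.
        self.__dict__.pop("_client", None)
```

After `reinit_stale()`, `holder._client["transport"]` is still the old object; after `reinit_fresh()`, the next access to `holder._client` builds a client on the new transport.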
Closes #1744 (second try)

# Rationale for this change
First try (#1747) did not fully resolve the issue. See #1747 (review)

# Are these changes tested?
yes

# Are there any user-facing changes?

---------
Co-authored-by: mnzpk <[email protected]>