@bijanhoule commented Sep 5, 2024

PR Checklist

  • A description of the changes is added to the description of this PR.
  • If there is a related issue, make sure it is linked to this PR.
  • If you've fixed a bug or added code that should be tested, add tests!
  • If you've added or modified a feature, documentation in docs is updated

Description of changes

Updates the Spark connector to support vended credentials for Azure cloud storage (abfs / abfss).

Requires hadoop-azure, e.g.:

spark.jars.packages=org.apache.hadoop:hadoop-azure:3.3.6

Note: this currently doesn't work with hadoop-azure:3.4.0. This issue might be addressed in 3.4.1: https://issues.apache.org/jira/browse/HADOOP-19208

@bijanhoule marked this pull request as ready for review September 5, 2024 21:43
"fs.azure.account.auth.type" -> "SAS",
"fs.azure.account.hns.enabled" -> "true",
"fs.azure.sas.token.provider.type" -> "io.unitycatalog.connectors.spark.AbfsVendedTokenProvider",
"fs.azure.sas.fixed.token" -> azCredentials.getSasToken,
Collaborator

Shall we put these string constants in AbfsVendedTokenProvider?

Collaborator Author

Good call! I cleaned that up, and pulled in the constants from abfs in cases where we were using already established conf values.
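
For readers following this thread, the refactor being discussed could look roughly like the sketch below. It is an illustration only, not the code that was merged: the class name `AbfsConfKeys`, the constant names, and the `sasConf` helper are hypothetical, and only the key strings and values are taken from the diff excerpt above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical holder for the conf keys quoted in the diff; not the PR's actual class. */
final class AbfsConfKeys {
    // Key names and values are copied from the diff excerpt in this thread.
    static final String AUTH_TYPE = "fs.azure.account.auth.type";
    static final String HNS_ENABLED = "fs.azure.account.hns.enabled";
    static final String SAS_PROVIDER_TYPE = "fs.azure.sas.token.provider.type";
    static final String SAS_FIXED_TOKEN = "fs.azure.sas.fixed.token";
    static final String PROVIDER_CLASS =
        "io.unitycatalog.connectors.spark.AbfsVendedTokenProvider";

    private AbfsConfKeys() {}

    /** Builds the four conf entries shown in the diff for a given vended SAS token. */
    static Map<String, String> sasConf(String sasToken) {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put(AUTH_TYPE, "SAS");
        conf.put(HNS_ENABLED, "true");
        conf.put(SAS_PROVIDER_TYPE, PROVIDER_CLASS);
        conf.put(SAS_FIXED_TOKEN, sasToken);
        return conf;
    }
}
```

Keeping the keys in one place means the code that writes the conf and the provider that reads it cannot drift apart on a string literal.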


import static java.lang.String.format;

public class AbfsVendedTokenProvider implements SASTokenProvider {
Collaborator

Not related to this PR: do we have documentation that tells users how to set this provider in Spark configs?

Collaborator Author

Not yet -- @dennyglee and I have been working on docs for this. I'm planning on getting that polished up once we get changes in and confirm everything is working as expected
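
Until those docs exist, the sketch below may help readers picture what a fixed-token provider of this kind does. It assumes the provider simply hands back the token stored under `fs.azure.sas.fixed.token`; the merged implementation may differ. The real interface is `org.apache.hadoop.fs.azurebfs.extensions.SASTokenProvider`, which is initialized with a Hadoop `Configuration`. To keep the example compilable without hadoop-azure, a hypothetical stand-in interface (`SasTokenSource`) and a plain `Map` are used instead.

```java
import java.util.Map;

/** Hypothetical stand-in for Hadoop's SASTokenProvider so this compiles without hadoop-azure. */
interface SasTokenSource {
    void initialize(Map<String, String> conf, String accountName);

    String getSASToken(String account, String fileSystem, String path, String operation);
}

/** Hypothetical provider that returns the single token placed in the conf. */
final class FixedSasTokenSource implements SasTokenSource {
    // Key name taken from the diff excerpt earlier in this conversation.
    static final String SAS_FIXED_TOKEN = "fs.azure.sas.fixed.token";

    private String token;

    @Override
    public void initialize(Map<String, String> conf, String accountName) {
        token = conf.get(SAS_FIXED_TOKEN);
        if (token == null) {
            throw new IllegalStateException("missing conf key " + SAS_FIXED_TOKEN);
        }
    }

    @Override
    public String getSASToken(String account, String fileSystem, String path, String operation) {
        if (token == null) {
            throw new IllegalStateException("initialize was not called");
        }
        // The same vended token is returned regardless of path or operation.
        return token;
    }
}
```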

@cloud-fan merged commit ee66190 into unitycatalog:main Sep 6, 2024
4 checks passed
vksx pushed a commit to vksx/unitycatalog that referenced this pull request Oct 7, 2024
**PR Checklist**

- [x] A description of the changes is added to the description of this
PR.
- [ ] If there is a related issue, make sure it is linked to this PR.
- [x] If you've fixed a bug or added code that should be tested, add
tests!
- [ ] If you've added or modified a feature, documentation in `docs` is
updated

**Description of changes**

Updates the Spark connector to support vended credentials for Azure
cloud storage (`abfs` / `abfss`).

Requires `hadoop-azure`, e.g.:
```
spark.jars.packages=org.apache.hadoop:hadoop-azure:3.3.6
```

*Note*: this currently doesn't work with hadoop-azure:3.4.0. This issue
might be addressed in 3.4.1:
https://issues.apache.org/jira/browse/HADOOP-19208

Signed-off-by: Vikas Sharma <[email protected]>
kevinzwang pushed a commit to kevinzwang/unitycatalog that referenced this pull request Oct 10, 2024
Signed-off-by: Kevin Wang <[email protected]>