-
Notifications
You must be signed in to change notification settings - Fork 9.1k
HADOOP-18786. Use CDN instead of ASF archive #5789
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
* Use Yetus 0.14.1 from downloads.apache.org in yetus-wrapper * Use Maven 3.8.8 from downloads.apache.org in Win 10 Dockerfile * Point users to downloads.apache.org for JVSC * Use Solr 8.11.2 from downloads.apache.org in YARN Dockerfile
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mostly looks good. The solr update in the yarn application catalog is a little concerning to me because it's a major version update, and I am not sure if there are tests covering the Docker image usage. Too bad, it seems the no one's taking care of the YARN application catalog code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, I guess we want to this change in all active branches? i.e. branch-3.3 branch-3.2 and branch-2.10.
As a stop-gap, it would be good to prevent any downstream users from hitting this on all new releases, as it can trigger a site-wide ban if they've used the archives too much. Building Hadoop releases from source shouldn't cause users to be banned from access to the ASF. But, this is only a stop-gap solution, regardless of where it is applied. Once these dependencies roll over to the archives, then you'll have the problem of users being unable to build Hadoop releases from source without first patching its build in some way. So, a more complete solution still needs to be created. The Dockerfile changes are probably okay (as I imagine those are optional, and users can derive their own Dockerfiles easily enough from these as reference), and the generic message about JVSC is certainly okay. You could probably get away with just applying those changes to the trunk. The main problem to address across all branches is the yetus-wrapper's use of the archives. Perhaps one of the following would work?
|
|
Unfortunately, I'm pretty sure it's against policy. This is the same issue that people face when they want to include the gradlew or mvnw jars in their sources. |
yetus is special compared to gradle as it is asf, but then so would thriftc and mvn be. |
From https://www.apache.org/legal/release-policy.html#source-packages: So, it's fine if you're just bundling and redistributing Yetus scripts, which are also source. I'm not sure if Yetus contains compiled code, or if it's just scripts, and I'm not sure which parts are used by Hadoop's build. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds fair enough to do, but I don't think we should mix this with version upgrades, should do them separately
The version bumps are merely incidental to the actual issue this PR and JIRA are intending to expose: that the use of the archives should be avoided. If committers want to bump the versions first, I can rebase this PR once that is done. However, these changes are pretty trivial, so I'm not really needed for that part. Once the committers decide to act on this, they can rebase or merge however they see fit. Up to now, these "should" comments about what to do about the version bumps have been very unclear to me... it looks like discussion among yourselves... but if it's a request for me to change something in this PR, please state the request clearly. |
reviewing this. @ctubbsii do we need to bump up the versions for the move to downloads to work? |
@steveloughran wrote:
The old versions only exist in the archives, so yes. But, as previously said, this is really only a temporary fix, as it forces the build to depend on a transient state on downloads.a.o. The Hadoop devs need to figure out a more permanent solution. This is just a stop-gap. |
This was previously approved, and I've answered all the questions raised. I just resolved the merge conflicts from upstream, where some lines got moved around in the Dockerfile for Windows. Is anybody willing to merge this? |
The main thing I want to be sure is from this build, what gets into the distro? only stuff from the maven repo, right? that is: this PR MUST NOT force updates in the binaries we ship. |
I don't quite understand the question. The premise seems to be that the current build is only grabbing stuff from the Maven repo. However, that's not true currently, and that's part of the problem. The build currently grabs stuff from the archives, and not just from the Maven repo. Those are the URLs that this PR changes... to use the ASF CDN instead of the archives. The only change that might affect the distro is a couple of tools do not have that version available in the CDN anymore, so a version bump was necessary to be able to grab it from the CDN instead of from the archives. However, I don't know if those affect the binaries in the distro either, or if those are only used as unshipped build tools. But even if it does change the binaries in some way, the current situation of automatically going to the ASF archives cannot continue... it makes offline builds very hard, and the download of things from the archives causes frequent builds to trigger automated bans of ASF services, because the archives aren't meant to be used this way (for routine builds). |
ok, let's merge and see what happens. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
+1
thanks, in trunk. can you do a pr cherrypicking to branch-3.4 so we can keep that in sync. No need for more reviews, just a yetus test run. |
* Use Yetus 0.14.1 from downloads.apache.org in yetus-wrapper * Use Maven 3.8.8 from downloads.apache.org in Win 10 Dockerfile * Point users to downloads.apache.org for JVSC * Use Solr 8.11.2 from downloads.apache.org in YARN Dockerfile Contributed by Christopher Tubbs
* Use Yetus 0.14.1 from downloads.apache.org in yetus-wrapper * Use Maven 3.8.8 from downloads.apache.org in Win 10 Dockerfile * Point users to downloads.apache.org for JVSC * Use Solr 8.11.2 from downloads.apache.org in YARN Dockerfile Contributed by Christopher Tubbs
I'd be hesitant to backport this to previous versions. I'd feel more comfortable establishing it as the norm going forward, than risk changing the build process for previous release versions. |
Description of PR
How was this patch tested?
Download links verified to work.
For code changes: