lucqui
Contributor

@lucqui lucqui commented Jun 24, 2025

Which issue does this PR close?

Closes #16532

Rationale for this change

We sometimes forget to add the breaking change label, so it would be helpful to have a script that detects breaking changes automatically.

Follow this page when developing the script: https://datafusion.apache.org/contributor-guide/api-health.html#breaking-changes

What changes are included in this PR?

Are these changes tested?

Local testing was performed. We will see this working in the CI process as well.

Are there any user-facing changes?

No. This is a developer experience enhancement.

@github-actions github-actions bot added the development-process Related to development process of DataFusion label Jun 24, 2025
@xudong963 xudong963 self-requested a review June 25, 2025 01:51
@lucqui lucqui force-pushed the detect-breaking-changes-script branch from 2af7bee to ff0dd7e Compare June 25, 2025 15:50
Member

@xudong963 xudong963 left a comment

Thank you @lucqui

After this is ready for review, I suggest changing a pub API to see if it'll trigger the script

@alamb alamb marked this pull request as draft June 26, 2025 18:01
@alamb
Contributor

alamb commented Jun 26, 2025

Thanks @lucqui and @xudong963 -- this looks very cool

@lucqui
Contributor Author

lucqui commented Jun 27, 2025

Thank you @lucqui

After this is ready for review, I suggest changing a pub API to see if it'll trigger the script

Thanks @xudong963 will do!

@lucqui lucqui force-pushed the detect-breaking-changes-script branch 8 times, most recently from bb61f72 to b950e65 Compare June 27, 2025 19:39
@lucqui lucqui marked this pull request as ready for review June 27, 2025 19:43
@lucqui lucqui requested a review from xudong963 June 27, 2025 19:43
@lucqui
Contributor Author

lucqui commented Jun 27, 2025

@xudong963 report generation worked locally with a public API change.

@lucqui lucqui force-pushed the detect-breaking-changes-script branch from 1f0004f to 4e0e2f0 Compare June 27, 2025 19:52
@lucqui lucqui changed the title Adds script to detect breaking changes using semver Adds script to detect breaking changes Jun 28, 2025
@alamb alamb changed the title Adds script to detect breaking changes Adds script to detect breaking API changes/ semver Jun 30, 2025
Contributor

@alamb alamb left a comment

Thank you for this contribution @lucqui -- if we can get it working it will be super valuable

Local testing was performed. We will see this working in the CI process as well.

I think you can test it with a fork -- aka merge this code to your fork's main branch and then make a PR to your fork with a breaking change. That should trigger the workflow and we can see it in action!


  - name: Install GitHub CLI
    run: |
      if ! command -v gh &> /dev/null; then
Contributor

As I understand it, the GitHub-hosted runners already have gh installed on them -- thus this command to set up and install it via apt-get seems unnecessary
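
If the workflow still wants a guard, one option is to fail fast when a tool is missing instead of installing it. This is only a sketch, not what the PR's workflow does; the helper name `require_tool` is hypothetical.

```shell
# Hypothetical helper: succeed if the named tool is on PATH,
# otherwise print a message to stderr and return a non-zero status.
require_tool() {
  if ! command -v "$1" >/dev/null 2>&1; then
    echo "required tool not found: $1" >&2
    return 1
  fi
}

# Example use inside a workflow `run:` step (assumes gh is pre-installed):
#   require_tool gh || exit 1
```

This keeps the step short and surfaces a clear error if a runner image ever stops shipping the tool.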

runs-on: ubuntu-latest
steps:
  - uses: actions/checkout@v4
    with:
Contributor

why is this needed?

Contributor Author

We need the full history I believe.
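
To illustrate why the full history matters: a git-based check typically diffs the PR head against its merge base with the base branch, and a shallow clone usually does not contain that base commit. The sketch below assumes the base branch is available as `origin/<branch>` (which `fetch-depth: 0` provides); the function names and the `main` default are hypothetical and not taken from the PR's script.

```shell
# Build a three-dot range: changes on HEAD since its merge base with the base branch.
# GitHub Actions sets GITHUB_BASE_REF on pull_request events; "main" is an assumed fallback.
pr_diff_range() {
  base_ref="${1:-main}"
  printf 'origin/%s...HEAD\n' "$base_ref"
}

# List the Rust files the PR touches. This needs the base branch history locally,
# which is why a shallow checkout is not enough.
changed_rust_files() {
  git diff --name-only "$(pr_diff_range "$1")" -- '*.rs'
}

# Example use in a workflow step:
#   changed_rust_files "$GITHUB_BASE_REF"
```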

      fetch-depth: 0
      token: ${{ secrets.GITHUB_TOKEN }}

  - name: Install Rust toolchain
Contributor

The other workflows already have an action to set up Rust -- can we reuse the same one, please?

…analysis to detect API breaking changes in DataFusion PRs:
- LogicalPlan enum modifications and variant removals
- DataFrame API public method changes
- Generates detailed reports and integrates with GitHub Actions
Addresses apache#16532
@lucqui lucqui force-pushed the detect-breaking-changes-script branch from bfc31bb to becb32b Compare July 21, 2025 22:33
lucqui added 3 commits July 21, 2025 19:05
…st toolchain installation (not needed for git-based analysis)
- Remove GitHub CLI installation (pre-installed on GitHub runners)
- Improve script robustness with consistent git diff range syntax
- Make DataFrame API detection more specific to avoid false positives
Addresses reviewer comments on PR apache#16541

…Remove unnecessary token parameter (not needed for public repo)
- Add clear comments explaining why fetch-depth: 0 is required
- Improve base ref handling for GitHub Actions PR context
- Use origin/$GITHUB_BASE_REF for proper remote branch reference
Addresses reviewer question about checkout configuration necessity.
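
The commit messages above describe a git-diff-based analysis of public API changes. The PR's actual script is not shown in this conversation, so the following is only a minimal sketch of the general idea: scan a unified diff for removed lines that declare public items. The function name `removed_pub_items` and the item kinds it matches are assumptions.

```shell
# Read a unified diff on stdin and print removed lines that declare a public
# fn, struct, enum, or trait. File headers such as "--- a/file.rs" do not match
# because the pattern requires "pub" right after the leading "-" and whitespace.
removed_pub_items() {
  grep -E '^-[[:space:]]*pub[[:space:]]+(fn|struct|enum|trait)[[:space:]]' \
    | sed -E 's/^-[[:space:]]*//'
}

# Example use (assumes the base branch has been fetched):
#   git diff "origin/$GITHUB_BASE_REF...HEAD" -- '*.rs' | removed_pub_items
```

A changed signature shows up as one removed and one added line, so this flags candidates for the breaking change label rather than proving a break; formatting-only edits would also be reported.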
@Jefffrey
Contributor

Have we considered using a tool like cargo-semver-checks? It might be better to offload this semver checking logic to a tool dedicated for that purpose rather than rolling our own shell script.

@alamb
Contributor

alamb commented Sep 22, 2025

DataFusion doesn't currently maintain semver (we crank out major releases with semver breaking changes every month or two). So maybe we just need to ensure that we are flagging each PR that has breaking changes appropriately (rather than blocking the CI)

cargo-semver-checks sounds good to me, though we need to make sure it is on the ASF approved list

@Jefffrey
Contributor

It seems we also have #15408 and #13665 👀

I guess we need to centralize this discussion again

though we need to make sure it is on the ASF approved list

I'll admit I was thinking of this from the POV of running cargo-semver-checks manually before a release instead of it being an automated CI check; though it would increase the workload for performing releases 🙁
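
For reference, a manual pre-release run could look roughly like the sketch below. It assumes cargo-semver-checks is already installed (for example via `cargo install cargo-semver-checks --locked`) and that comparing against a git revision with `--baseline-rev` suits the release flow; the wrapper name `semver_check`, the `DRY_RUN` switch, and the `origin/main` default are hypothetical.

```shell
# Hypothetical wrapper: compare the current checkout against a baseline git revision.
# With DRY_RUN=1 it only prints the command it would run.
semver_check() {
  baseline="${1:-origin/main}"
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "cargo semver-checks --baseline-rev $baseline"
  else
    cargo semver-checks --baseline-rev "$baseline"
  fi
}

# Example: preview the command, then run it against the previous release tag.
#   DRY_RUN=1 semver_check <previous-release-tag>
#   semver_check <previous-release-tag>
```

Since DataFusion releases breaking changes regularly, the output would serve as a list of changes to label and document rather than as a gate.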

Labels
development-process Related to development process of DataFusion
Development

Successfully merging this pull request may close these issues.

Implement a script to detect breaking changes automatically
4 participants