
[DISCUSSION]: move sqlparser to Apache (DataFusion) governance #1294

Closed
@alamb


(Disclaimer: I am biased, being both the one who merges sqlparser PRs and the Apache DataFusion PMC chair.)

Problem Statement

sqlparser seems to have become the de facto SQL parsing library in Rust (5.5M downloads at the time of this writing) 🎉

However the sqlparser-rs project doesn't have sufficient maintainer capacity. I (@alamb) do enough to keep it from going entirely dormant, but that is really not sufficient for a healthy project.

Here are the specific problems:

  1. Having contributors wait weeks for review feedback is a bad experience for everyone involved and for that I apologize.
  2. There is not enough capacity to drive large projects (e.g. token locations) forward

Challenges with current governance structure (or lack thereof)

  1. There is no clear way to add additional maintainers
  2. Some employers (for example Apple) only permit contributions to explicitly vetted projects with clear governance (e.g. ASF)

Past discussions:

When DataFusion was part of the Apache Arrow project, we didn't have the right place to bring in sqlparser.

Now that DataFusion is its own top-level project (with @andygrove and myself on the PMC), there is a natural space to do this.

Specific Proposal:

  1. Move the sqlparser-rs code (and commit history) into the Apache DataFusion project and under its governance. This would require running an IP clearance process and would take time.
  2. Move the sqlparser-rs repository to apache/datafusion-sqlparser
  3. Archive this repository, and leave links to apache/datafusion-sqlparser
  4. Continue to release sqlparser versions approximately monthly.

Benefits of ASF governance:

  1. More people can approve/merge PRs (committers to DataFusion)
  2. Clear governance structure (rather than sqlparser-rs today, which seems to be mostly me)
  3. Clear path to add additional maintainers (e.g. committers)

Drawbacks

  1. There is a danger that sqlparser becomes "captured" by DataFusion and only accepts features needed for DataFusion
  2. The ASF process adds overhead (releases, in particular, take non-trivial extra effort)

There is plenty of experience with the ASF release process in DataFusion, so I don't think the release overhead is a major hurdle. I also think DataFusion in general and sqlparser in particular have a long history of accepting features that benefit all users, not just maintainers, so I am not worried about sqlparser being "captured" either (but I am of course biased).

cc @Dandandan @tobyhede @andygrove @maxcountryman @nickolay
