Description
(disclaimer: I am biased, being the one who merges sqlparser PRs and also the Apache DataFusion PMC chair)
Problem Statement
sqlparser seems to have become the de facto SQL parsing library in Rust (5.5M downloads at the time of this writing) 🎉
However, the sqlparser-rs project doesn't have sufficient maintainer capacity. I (@alamb) do enough to keep it from going entirely dormant, but that is really not sufficient for a healthy project.
Here are the specific problems:
- Having contributors wait weeks for review feedback is a bad experience for everyone involved and for that I apologize.
- There is not enough capacity to drive large projects (e.g. token locations) forward
Challenges with current governance structure (or lack thereof)
- There is no clear way to add additional maintainers
- Some employers (for example Apple) only permit contributions to explicitly vetted projects with clear governance (e.g. ASF)
Past discussions:
- @andygrove has brought this up on the arrow mailing list: https://lists.apache.org/thread/q80j49poyg99x2c01900312qz7ps9wgp
- Discussed adding @lovasoa, @jmhain, and @iffyio as maintainers in "Discussion: adding iffyio, jmhain, and/or lovasoa as maintainer of sqlparser-rs" (#1243), which drew reactions but no real discussion
- Discussed adding @AugustoFKL as maintainer in "Propose adding AugustoFKL as maintainer of sqlparser-rs" (#808)
When DataFusion was part of the Apache Arrow project, we didn't have the right place to bring sqlparser into.
Now that DataFusion is its own top-level project (with @andygrove and myself on the PMC), there is a natural space to do this.
Specific Proposal:
- Move the sqlparser-rs code (and commit history) into the Apache DataFusion project and under its governance. This would require running an IP clearance process and would take time.
- Move the sqlparser-rs repository to apache/datafusion-sqlparser
- Archive this repository, and leave links to apache/datafusion-sqlparser
- Continue to release sqlparser versions approximately monthly.
Benefits of ASF governance:
- More people can approve/merge PRs (committers to DataFusion)
- Clear governance structure (rather than sqlparser-rs today which seems to be mostly me)
- Clear path to add additional maintainers (e.g. committers)
Drawbacks
- There is a danger that sqlparser becomes "captured" by DataFusion and only accepts features needed for DataFusion
- There is additional overhead to the ASF process (releases, in particular, take non-trivial extra effort)
There is plenty of experience with the ASF release process in DataFusion, so I don't think that is a major hurdle. I also think DataFusion in general and sqlparser in particular have a long history of accepting features that benefit all users, not just maintainers, so I am not worried about sqlparser being "captured" either (but I am of course biased).