-
Notifications
You must be signed in to change notification settings - Fork 710
WIP/RFC: Parse .cabal files to a source code representation #6621
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
We already have `Field` structure which can be used to format .cabal files. See how `cabal-fmt` is implemented.
…Sent from my iPhone
On 29. Mar 2020, at 11.49, Matt Renaud ***@***.***> wrote:
NOTE: This is a work in progress, looking for input on direction/structuring (see Open Questions below).
TL;DR; Don't perform all simplification steps during parsing, do them as a follow-up transformation.
Why
Allows cabal format to maintain key pieces of source code (common stanzas for example #5734)
Separation of concerns; simplifies parser logic (no longer need to strange FromBuildInfo typeclass while parsing)
How
Create a source representation in the AST (PackageSourceDescription) for common stanza sections (CommonStanza) and for import directives within sections (CommonStanzaImports).
Don't perform inlining of sections during parsing, leave that to another step. The transformation from PackageSourceDescription is not yet implemented, but should be pretty trivial.
Another alternative is to update GenericPackageDescription in place, but I thought it may be useful (if not more code) to have a separate representation. May be too complicated but we could look at the "trees that grow" approach that is used in GHC as well.
NOTE: For simplicity I forked the code from GenericPackageDescription instead of modifying in place to keep things separate, so there's a lot of duplication right now. We'll likely want to refactor several modules as pre-steps to this.
NOTE: Ignore the HasCommonStanzaImports typeclass, that's an artifact from another approach I took and isn't used currently.
Testing
I've updated the cabal format command to round trip thorough PackageSourceDescription instead of GenericPackageDescription and it appears to work for all the common cases I've tested (including common stanzas that import other common stanzas).
Open Questions
What's up with PackageDescription and GenericPackageDescription? Why do they both exist? Historical reasons or separate concerns?
Currently I've added a CommonStanzaImports field to each of the section types that can have a common stanza, this results in having to insert memptys in places where these sections are created manually (and thus will never have imports). A more principled way would be to create separate types for ExecutableSource, LibrarySource etc. which has the CommonStanzaImports fields, and leave the existing section types used in GenericPackageDescription unchanged.
As a first step should we keep this separate from the GenericPackageDescription parser and just use it for cabal format?
Is there good test coverage for parsing to GenericPackageDescription? In other words, if I replace the String -> GenericPackageDescription parser with a String -> PackageSourceDescription parser and then a separate PackageSourceDescription -> GenericPackageDescription transform, how confident can I be that it works?
/cc @phadej @gbaz
Please include the following checklist in your PR:
Patches conform to the coding conventions.
Any changes that could be relevant to users have been recorded in the changelog.
The documentation has been updated, if necessary.
If the change is docs-only, [ci skip] is used to avoid triggering the build bots.
Please also shortly describe how you tested your change. Bonus points for added tests!
You can view, comment on, or merge this pull request online at:
#6621
Commit Summary
Initial check-in of PackageSourceDescription.
Checkpoint: common stanzas definitions maintained.
Checkpoint: Maintain common stanza imports within sections.
Simplify parser.
Clean up code, update cabal format command.
File Changes
M Cabal/Cabal.cabal (8)
M Cabal/Distribution/PackageDescription/FieldGrammar.hs (59)
A Cabal/Distribution/PackageDescription/PackageSourceDescriptionParser.hs (794)
A Cabal/Distribution/PackageDescription/PackageSourceDescriptionPrettyPrint.hs (219)
M Cabal/Distribution/PackageDescription/Parsec.hs (5)
M Cabal/Distribution/Simple/Build.hs (4)
M Cabal/Distribution/Simple/Haddock.hs (2)
M Cabal/Distribution/Types/Benchmark.hs (10)
M Cabal/Distribution/Types/Benchmark/Lens.hs (5)
A Cabal/Distribution/Types/CommonStanza.hs (55)
A Cabal/Distribution/Types/CommonStanza/Lens.hs (28)
A Cabal/Distribution/Types/CommonStanzaImports.hs (56)
A Cabal/Distribution/Types/CommonStanzaImports/Lens.hs (18)
M Cabal/Distribution/Types/Executable.hs (7)
M Cabal/Distribution/Types/Executable/Lens.hs (4)
M Cabal/Distribution/Types/ForeignLib.hs (9)
M Cabal/Distribution/Types/ForeignLib/Lens.hs (5)
A Cabal/Distribution/Types/Import.hs (34)
M Cabal/Distribution/Types/Lens.hs (4)
M Cabal/Distribution/Types/Library.hs (8)
M Cabal/Distribution/Types/Library/Lens.hs (5)
A Cabal/Distribution/Types/PackageSourceDescription.hs (82)
A Cabal/Distribution/Types/PackageSourceDescription/Lens.hs (132)
M Cabal/Distribution/Types/TestSuite.hs (8)
M Cabal/Distribution/Types/TestSuite/Lens.hs (5)
M Cabal/tests/Instances/TreeDiff.hs (4)
M cabal-install/main/Main.hs (8)
Patch Links:
https://github.com/haskell/cabal/pull/6621.patch
https://github.com/haskell/cabal/pull/6621.diff
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub, or unsubscribe.
|
Yeah, that's exactly what I'm using. The problem is that The open questions outline some various trade-offs we can make regarding embedding the source structure within the existing field types ( I imagine that this source representation will grow to include comments as well for preservation, and this could also be used to provide an intermediary form of the package description which could have processing steps performed on them (maybe via "cabal plugins" or similar). This could provide a possible place to address things like automatic module discovery (#5343) in a more generic way and also integrate with |
To clarify, https://hackage.haskell.org/package/cabal-fmt is not the
is incorrect. |
Ahhhh I see, I wasn't aware of that package, I misread that that as It seems like it would be beneficial to have this functionality as part of the cabal itself, this way as the Cabal format changes the formatting can be kept in sync. It also removes the discover-ability barrier of having to find and install another package. |
I agree. Yet, I'm not ready for bikeshedding. In Otherwise there will be issues like phadej/cabal-fmt#11, in a sense the fact no-one uses |
That's fair, but I would also imagine that a package named
I don't think it's reasonable to wait until some unspecified point in the future where a parser rewrite happens before making incremental improvements now. It's true that we'll eventually want some degree of configuration when formatting Cabal files (and some small subset of the community may have incredibly specific needs), but I imagine there is a pretty large group of the community that would be happy with cabal coming with a formatter with a couple of nobs (even just indentation). It doesn't make sense to me to tie creating a better internal representation to rewriting the parser for it. Also, I'd be willing to build out the configuration infrastructure as well, it's not a particularly difficult problem to solve in the grand scheme of things. There are some improvements that I think would be useful as well such as separating the folding in of common stanza data into the other sections, imho that should be a separate step from parsing.
As a generally available tool that we're going out and saying "everyone start using this now!" I agree. But, at the same time I disagree because it fills a real hole in the current tooling for those who want it (even if its not feature complete or fully configurable now). We could get it to a point where it handles most use cases (really just the fact that it inlines common stanzas right now is probably the only reason more folks aren't using it), and then make an announcement that it is available as an early access tool, and to expect changes to come in the future until a point at which all the configurability is available. So, what I'm looking for are concrete next steps here, I don't want to block all improvements in this space on waiting for the perfect end state to be defined, and as you said if no one is using the tool we should feel free to make changes and see what we like and don't like. Are there specific implementation details that you don't like here? Would you prefer if I do more prefactoring work to clean up the code before making these changes (to make it more clear how much is actually changing)? Would you like to see a write-up of a longer term plan for formatting in cabal? Can I help with the parser changes you mentioned above? Is there a group of people I should reach out to for wider input on this? I'd be happy to do any and all of these, I'm just looking for guidance and actionable input to move this forward. |
Most functionality of The internal representation, i.e.
Please take a lead, champion the functionality and be responsive when follow up issues and requests are reported and created. The progress is not forbidden. But there have to be a plan. It's true that since E.g. comments were added so Yet, I don't have time to deal with formatting related functionality in Some alternative to Importantly, I don't think the latter (GPD-like structure losing ordering and comments) is what people would want, even as incremental "update". Some people would also like to have I'd suggest you to look into #6187 and #5555, the annotations support I mention in #5555 is now in |
Marking this PR as draft 🙂 |
Hello, I am going through old PRs to check whether they are stale. If this PR is still “live”, write a comment and I will remove the |
TL;DR; Don't perform all simplification steps during parsing, do them as a follow-up transformation.
Why
Allows
cabal format
to maintain key pieces of source code (common stanzas for examplecabal format
inlines and removescommon
stanzas #5734)Separation of concerns; simplifies parser logic (no longer need to strange
FromBuildInfo
typeclass while parsing)How
Create a source representation in the AST (
PackageSourceDescription
) for common stanza sections (CommonStanza
) and forimport
directives within sections (CommonStanzaImports
).Don't perform inlining of sections during parsing, leave that to another step. The transformation from
PackageSourceDescription
is not yet implemented, but should be pretty trivial.Another alternative is to update
GenericPackageDescription
in place, but I thought it may be useful (if not more code) to have a separate representation. May be too complicated but we could look at the "trees that grow" approach that is used in GHC as well.Testing
I've updated the
cabal format
command to round trip thoroughPackageSourceDescription
instead ofGenericPackageDescription
and it appears to work for all the common cases I've tested (includingcommon
stanzas thatimport
other common stanzas).Open Questions
What's up with
PackageDescription
andGenericPackageDescription
? Why do they both exist? Historical reasons or separate concerns?Currently I've added a
CommonStanzaImports
field to each of the section types that can have a common stanza, this results in having to insertmempty
s in places where these sections are created manually (and thus will never haveimport
s). A more principled way would be to create separate types forExecutableSource
,LibrarySource
etc. which has theCommonStanzaImports
fields, and leave the existing section types used in GenericPackageDescription unchanged.As a first step should we keep this separate from the
GenericPackageDescription
parser and just use it forcabal format
?Is there good test coverage for parsing to
GenericPackageDescription
? In other words, if I replace theString -> GenericPackageDescription
parser with aString -> PackageSourceDescription
parser and then a separatePackageSourceDescription -> GenericPackageDescription
transform, how confident can I be that it works?/cc @phadej @gbaz
Please include the following checklist in your PR:
[ci skip]
is used to avoid triggering the build bots.Please also shortly describe how you tested your change. Bonus points for added tests!