RFC: new intermediate cabal file representation. #3614

phadej · 2016-07-25T11:21:31Z

Problem: Ultimately we want formatting-preserving programmatic refactorings of .cabal files.

Current situation:

The parsec parser's Field structure is annotated with source position (parametrised over ann), but
- values are ByteStrings
- Field could represent any cabal-file-like structure (e.g. nested sections)
GenericPackageDescription
- doesn't have annotations
- has already had some preprocessing applied, e.g. instances of the same field (such as build-depends) on the same level are merged.

Solution:

Introduce a new structure (CabalAst ann) with source annotations, which represents valid (to some degree) cabal files.
Change parsing pipeline from Field ann → GenericPackageDescription to Field ann → CabalAst ann → GenericPackageDescription
As a proof-of-concept, change cabal gen-bounds to work on CabalAst ann
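
To make the proposed pipeline concrete, here is a minimal sketch of the three stages. Everything in it is an assumption for illustration: the types `Position`, `Field`, `CabalAst`, `Item`, `AstField` and `StanzaKind`, and the functions `toAst` and `mergedBuildDepends`, are hypothetical stand-ins and not the real Cabal definitions. Values are plain `String`s rather than ByteStrings, and conditionals (`if`/`else` blocks) are left out.

```haskell
-- Hypothetical stand-ins for illustration; these are not the real Cabal types.

-- | Source position (row, column), used as the annotation in this sketch.
data Position = Position Int Int
  deriving (Eq, Show)

-- | Generic parser output: any cabal-file-like tree with raw values.
data Field ann
  = Field ann String [String]             -- field name, raw value lines
  | Section ann String String [Field ann] -- section name, argument, children
  deriving (Eq, Show)

data StanzaKind = Library | Executable | TestSuite
  deriving (Eq, Show)

-- | A field that survived validation, still carrying its annotation.
data AstField ann = AstField ann String [String]
  deriving (Eq, Show)

-- | Intermediate structure: only the shapes this sketch accepts, in file order.
data Item ann
  = TopLevel (AstField ann)
  | Stanza ann StanzaKind String [AstField ann]
  deriving (Eq, Show)

newtype CabalAst ann = CabalAst [Item ann]
  deriving (Eq, Show)

-- | Field ann -> CabalAst ann: reject structures a .cabal file cannot contain.
toAst :: [Field ann] -> Either String (CabalAst ann)
toAst = fmap CabalAst . traverse item
  where
    item (Field ann name ls) = Right (TopLevel (AstField ann name ls))
    item (Section ann name arg fs) = do
      kind <- stanzaKind name
      Stanza ann kind arg <$> traverse inner fs

    inner (Field ann name ls) = Right (AstField ann name ls)
    inner (Section _ name _ _) =
      Left ("nested section not supported in this sketch: " ++ name)

stanzaKind :: String -> Either String StanzaKind
stanzaKind "library"    = Right Library
stanzaKind "executable" = Right Executable
stanzaKind "test-suite" = Right TestSuite
stanzaKind other        = Left ("unknown stanza: " ++ other)

-- | Toy last stage: drop annotations and merge repeated build-depends
-- fields across stanzas, mimicking the merging described above.
mergedBuildDepends :: CabalAst ann -> [String]
mergedBuildDepends (CabalAst items) =
  [ l | Stanza _ _ _ fs <- items, AstField _ "build-depends" ls <- fs, l <- ls ]
```

The point of the middle type is that a refactoring tool could edit `CabalAst ann` while the annotations are still present, and only the last step throws them away.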

This RFC doesn't propose what CabalAst would look like specifically, as some experimentation on the implementation is needed. Related work: ghc-exactprint.

ping @alanz @hvr @dcoutts

23Skidoo · 2016-07-25T11:24:56Z

Would be nice to have something like that for config files as well.

phadej · 2016-07-25T11:26:50Z

@23Skidoo, aren't config files flat, i.e. no sections? We could experiment on them first, as they are simpler.

phadej · 2016-07-25T11:44:02Z

From IRC discussion: starting with ~/.cabal/config would also be easier, as the field types are much simpler. I have no good idea yet how to represent build-depends in an editable yet formatting-preserving way.

Also: @dcoutts' experiment: http://code.haskell.org/~duncan/cabal-ast-experiment/

phadej · 2016-07-25T11:51:27Z

@dcoutts also proposed to change GenericPackageDescription into CabalAst i.e. into something which supports refactoring.

23Skidoo · 2016-07-25T12:08:21Z

@phadej No, they have sections: repository, install-dirs, program-locations, etc.

phadej · 2016-07-25T12:55:00Z

@23Skidoo: yes, but sections are predefined and used to group fields. i.e. the file could be flat.

dcoutts · 2016-07-25T15:22:09Z

@phadej so when I thought about it some time ago I concluded that there were only a few kinds of fields. Let's see if I can remember what they are:

free-form text like description etc
single atom, like category, license
list of atoms, like module lists, source files, ghc options
single compound expression, cabal-version
multiple comma-separated compound expressions, like build-depends

The point is, it may be possible to pre-split the list fields and then only keep pos info at that level, not within the AST of the field entries.
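
A rough sketch of that idea, under stated assumptions: `FieldValue` and `splitCommas` are hypothetical names, not part of Cabal, and the splitter only handles a single-line value. It records one column per list entry and keeps the entry itself as raw text, so nothing inside an entry needs position info.

```haskell
import Data.Char (isSpace)

-- | Hypothetical classification of field values, one constructor per kind
-- in the list above. Only list entries carry an annotation.
data FieldValue ann
  = FreeText String               -- e.g. description
  | Atom String                   -- e.g. category, license
  | AtomList [(ann, String)]      -- e.g. module lists, ghc-options
  | Compound String               -- e.g. cabal-version
  | CompoundList [(ann, String)]  -- e.g. build-depends
  deriving (Eq, Show)

-- | Pre-split a single-line comma-separated value. The first argument is the
-- 1-based column where the value starts; each entry is returned with the
-- column of its first non-space character. Empty entries are skipped.
splitCommas :: Int -> String -> [(Int, String)]
splitCommas = go
  where
    go _ [] = []
    go col str =
      let (chunk, rest) = break (== ',') str
          lead          = length (takeWhile isSpace chunk)
          entry         = trimEnd (drop lead chunk)
          next          = col + length chunk + 1  -- skip past the comma
          more          = case rest of
                            []      -> []
                            (_ : r) -> go next r
      in [ (col + lead, entry) | not (null entry) ] ++ more

    trimEnd = reverse . dropWhile isSpace . reverse
```

For example, `CompoundList (splitCommas 1 "base >=4, containers")` gives two entries, each a raw string with its starting column. Compound expressions inside an entry stay unparsed here, which is exactly the part that remains tricky.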

phadej · 2016-07-25T18:31:23Z

@dcoutts makes sense. Compound expressions are still tricky.

phadej · 2016-07-25T18:31:30Z

Also to remember: https://github.com/alanz/cabal-ast-play

alanz · 2016-07-26T19:57:28Z

Here is a brain dump from me

Based on my GHC / ghc-exactprint experiences I would suggest using an initial (from the parser) AST that has a clear mapping to the underlying cabal file, and generally keeps things in order. So resist putting all like things together, especially if they can be interleaved in the file.

Then have an ann parameter, which can have location info initially, and delta info later to be pretty printed.

If the initial AST gets processed for use, it makes sense to track in some clear way how each part is derived from the initial one, even if that is done via an annotation at that level too.

So the annotations are never part of the main line of processing, only ever used for round-tripping (with possible modification of the cabal file).
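
A minimal sketch of the "location info initially, delta info later" step, loosely in the spirit of ghc-exactprint but not its actual API. The names `Pos`, `Delta`, `toDeltas` and `render` are assumptions for illustration, and each annotated item is simplified to a single-line token.

```haskell
-- | Absolute position from the parser: row and column, both 1-based.
data Pos = Pos Int Int
  deriving (Eq, Show)

-- | Relative layout for printing: number of newlines to emit, then number of
-- spaces. On a new line the spaces are counted from column 1.
data Delta = Delta Int Int
  deriving (Eq, Show)

-- | Replace absolute positions with deltas relative to the end of the
-- previous token. Tokens are assumed to be in file order and single-line.
toDeltas :: [(Pos, String)] -> [(Delta, String)]
toDeltas = go (Pos 1 1)
  where
    go _ [] = []
    go (Pos r c) ((Pos r' c', t) : rest) =
      let d | r' == r   = Delta 0 (c' - c)
            | otherwise = Delta (r' - r) (c' - 1)
      in (d, t) : go (Pos r' (c' + length t)) rest

-- | Pretty print from deltas alone; absolute positions are no longer needed.
render :: [(Delta, String)] -> String
render = concatMap piece
  where
    piece (Delta dl dc, t) = replicate dl '\n' ++ replicate dc ' ' ++ t
```

Once positions are deltas, a token's text can be replaced without touching its neighbours: the following tokens keep their spacing relative to the edited one, which is what makes round-tripping with modification workable.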

phadej · 2019-06-23T22:03:10Z

I updated the description. Also, to complement @dcoutts' comment: the current parsec approach parses the ByteString directly into whatever the field type is. There is no intermediate step, e.g. parsing to a list of tokens first. Maybe there should be (I even think it could help with performance, or at least not ruin it).

phadej mentioned this issue Jul 25, 2016

Extend .cabal format with "common" definition stanzas #2832

Closed

phadej added the type: discussion label Jul 25, 2016

debug-ito mentioned this issue Apr 15, 2017

Add support for cabal-2.0 caret operator & trailing-zero normalisation debug-ito/staversion#2

Closed

Mikolaj added the exact-print label Jun 26, 2021

emilypi mentioned this issue Aug 14, 2021

Meta: Exact-printer Mega-issue #7544

Open
