Versioned grammars + leading comma support for build-depends #4953

Closed
wants to merge 2 commits into from

Conversation

phadej
Collaborator

@phadej phadej commented Dec 12, 2017

We need versioned grammars:

Leading comma is just a simple extension to experiment with :)

I'm not 100% sure that singletons-ish approach is the best, but it has good features:

  • we use a dictionary as an argument, so the approach should be at least as good (performance-wise) as passing the version explicitly:
    Versioned v => ... -> a vs. Version -> ... -> a.
  • The dictionaries can be specialised (I still have to check the Core to see whether that actually happens)
  • Types are harder to change accidentally, a little like having MonadReader Version => ... -> m a.
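
To make the comparison in the list above concrete, here is a minimal sketch of the two styles side by side. The names `Version`, `Spec124`, `Spec22`, `versionOf` and the 2.2 cut-off are illustrative assumptions, not the definitions used in this PR.

```haskell
import Data.Proxy (Proxy (..))

-- Hypothetical spec versions as plain values.
data Version = V1_24 | V2_0 | V2_2
  deriving (Eq, Ord, Show)

-- Hypothetical type-level tags, one per spec version.
data Spec124
data Spec22

-- The class dictionary carries the version, so callers never pass it by hand.
class Versioned v where
  versionOf :: Proxy v -> Version

instance Versioned Spec124 where versionOf _ = V1_24
instance Versioned Spec22  where versionOf _ = V2_2

-- Style 1: the version is an ordinary argument.
allowsLeadingCommaArg :: Version -> Bool
allowsLeadingCommaArg v = v >= V2_2

-- Style 2: the version arrives through the 'Versioned' dictionary.
allowsLeadingComma :: Versioned v => Proxy v -> Bool
allowsLeadingComma p = versionOf p >= V2_2
```

In the second style the version is fixed by the type, so it is hard to swap it for another one halfway through a grammar by accident.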

It works: this is easy to see by changing cabal-version to 2.0 in leading-comma.cabal, after which the tests fail. I originally planned to add negative tests after the common-stanza branch was merged; a negative test has since been added.

cc @23Skidoo @hvr

Surprisingly(?), this seems to be as performant as current master when parsing all Hackage packages starting with "a", including acme-everything :)

this PR

% /usr/bin/time -v .../parser-hackage-tests parse-parsec a +RTS -s
Reading index from: /home/ogre/.cabal/packages/hackage.haskell.org/01-index.tar
6714 files processed
  27,946,371,856 bytes allocated in the heap
   2,082,050,064 bytes copied during GC
       6,101,392 bytes maximum residency (970 sample(s))
       1,084,576 bytes maximum slop
              14 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0     25627 colls,     0 par    2.304s   2.300s     0.0001s    0.0032s
  Gen  1       970 colls,     0 par    0.067s   0.067s     0.0001s    0.0009s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    4.228s  (  4.225s elapsed)
  GC      time    2.371s  (  2.366s elapsed)
  EXIT    time   -0.000s  ( -0.000s elapsed)
  Total   time    6.598s  (  6.591s elapsed)

  %GC     time      35.9%  (35.9% elapsed)

  Alloc rate    6,610,560,408 bytes per MUT second

  Productivity  64.1% of total user, 64.1% of total elapsed

this PR without SPECIALIZED in D.PD.FieldGrammar

6714 files processed
  27,921,069,992 bytes allocated in the heap
   2,105,778,208 bytes copied during GC
       7,756,992 bytes maximum residency (983 sample(s))
         731,304 bytes maximum slop
              17 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0     25473 colls,     0 par    2.275s   2.271s     0.0001s    0.0057s
  Gen  1       983 colls,     0 par    0.067s   0.067s     0.0001s    0.0009s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    4.188s  (  4.185s elapsed)
  GC      time    2.341s  (  2.338s elapsed)
  EXIT    time   -0.000s  ( -0.000s elapsed)
  Total   time    6.529s  (  6.522s elapsed)

  %GC     time      35.9%  (35.8% elapsed)

  Alloc rate    6,667,467,459 bytes per MUT second

  Productivity  64.1% of total user, 64.2% of total elapsed

master

% /usr/bin/time -v .../parser-hackage-tests parse-parsec a +RTS -s
Reading index from: /home/ogre/.cabal/packages/hackage.haskell.org/01-index.tar
6714 files processed
  27,943,265,024 bytes allocated in the heap
   2,079,495,768 bytes copied during GC
       7,922,800 bytes maximum residency (974 sample(s))
         396,456 bytes maximum slop
              17 MB total memory in use (0 MB lost due to fragmentation)

                                     Tot time (elapsed)  Avg pause  Max pause
  Gen  0     25618 colls,     0 par    2.246s   2.242s     0.0001s    0.0049s
  Gen  1       974 colls,     0 par    0.065s   0.065s     0.0001s    0.0010s

  INIT    time    0.000s  (  0.000s elapsed)
  MUT     time    4.277s  (  4.272s elapsed)
  GC      time    2.311s  (  2.307s elapsed)
  EXIT    time   -0.001s  ( -0.001s elapsed)
  Total   time    6.587s  (  6.578s elapsed)

  %GC     time      35.1%  (35.1% elapsed)

  Alloc rate    6,533,837,864 bytes per MUT second

  Productivity  64.9% of total user, 64.9% of total elapsed

A leading comma is accepted for CommaFSep/CommaVSep fields, i.e. fields with a mandatory comma. These are (atm):

  • build-depends
  • build-tool-depends
  • build-tools
  • mixins
  • pkgconfig-depends
  • reexported-modules
  • setup-depends
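
As a rough illustration of what the extension means for the fields listed above, here is a small sketch of a comma-separated list parser that optionally accepts a leading comma. It uses `ReadP` from base as a stand-in for the real Parsec-based parsers, and `CommaStyle`, `item`, `commaList` and `parseList` are hypothetical names, not Cabal's API.

```haskell
import Data.Char (isAlphaNum)
import Text.ParserCombinators.ReadP

-- Hypothetical switch: does the spec version allow a leading comma?
data CommaStyle = StrictCommas | LeadingCommaAllowed

-- A simplified item (think: a package name without version bounds).
item :: ReadP String
item = munch1 (\c -> isAlphaNum c || c == '-') <* skipSpaces

-- A separator comma followed by optional whitespace.
comma :: ReadP ()
comma = char ',' *> skipSpaces

-- Non-empty comma-separated list; the leading comma is only tried
-- when the style permits it.
commaList :: CommaStyle -> ReadP [String]
commaList style = do
  skipSpaces
  case style of
    StrictCommas        -> pure ()
    LeadingCommaAllowed -> optional comma
  item `sepBy1` comma

-- Run the parser and keep only parses that consume the whole input.
parseList :: CommaStyle -> String -> Maybe [String]
parseList style s =
  case [xs | (xs, "") <- readP_to_S (commaList style <* eof) s] of
    (xs : _) -> Just xs
    []       -> Nothing
```

With `LeadingCommaAllowed`, a field body such as `", base, containers"` parses to the same list as `"base, containers"`, while `StrictCommas` rejects the former.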

Please include the following checklist in your PR:

  • Patches conform to the coding conventions.
  • Any changes that could be relevant to users have been recorded in the changelog.
  • The documentation has been updated, if necessary.
  • If the change is docs-only, [ci skip] is used to avoid triggering the build bots.

Please also shortly describe how you tested your change. Bonus points for added tests!

@phadej phadej requested review from 23Skidoo and hvr December 12, 2017 19:27
@phadej phadej force-pushed the leading-comma branch 2 times, most recently from 24f4402 to 4533308 Compare December 13, 2017 11:42
@phadej
Collaborator Author

phadej commented Dec 13, 2017

I won't amend the user manual yet, as e.g. the cabal.project parser doesn't benefit from the comma change yet (it's still ReadP-based). And there are e.g. constraints where you'd like to use a leading comma, IIRC.

@phadej
Collaborator Author

phadej commented Dec 13, 2017

ping @ezyang: could you check the "Implement availableSince" commit? I'll merge it in a week anyway, but if you want to comment on it before then, there's time.

Member

@23Skidoo 23Skidoo left a comment

LGTM, just some minor comments about the design.

import Distribution.Parsec.Class (Parsec (..), ParsecParser)

-- A class to select how to parse different fields.
class CabalSpecVersion v where
Member

Looks like it should be easy to convert this type class to a GADT? Then we could have

cabalSpecVersionOld :: CabalSpecVersion
cabalSpecVersionOld = ...


cabalSpecVersion22 :: CabalSpecVersion
cabalSpecVersion22 = ...

etc.

Member

If you really need data CabalSpecOld = CabalSpecOld / ..., you can make that a phantom type parameter.

Collaborator Author

hmm, GADT is not a bad idea. I however abuse implicit dictionary passing, so I'd need a class for that anyway... :)
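
For reference, the two ideas can coexist: a GADT singleton can be paired with a small class that recovers the singleton implicitly. The following is only a sketch with hypothetical names (`Old`, `V22`, `SpecVersion`, `KnownSpec`), not the code in this PR.

```haskell
{-# LANGUAGE GADTs #-}

import Data.Proxy (Proxy (..))

-- Hypothetical type-level tags for spec versions.
data Old
data V22

-- Singleton GADT: matching on the constructor reveals which tag 'v' is.
data SpecVersion v where
  SpecOld :: SpecVersion Old
  Spec22  :: SpecVersion V22

-- The class hands out the singleton implicitly, via the dictionary.
class KnownSpec v where
  specVersion :: SpecVersion v

instance KnownSpec Old where specVersion = SpecOld
instance KnownSpec V22 where specVersion = Spec22

-- Explicit style: branch on the GADT value.
describeSpec :: SpecVersion v -> String
describeSpec SpecOld = "pre-2.2 grammar"
describeSpec Spec22  = "2.2 grammar"

-- Pick the singleton that matches the tag named by the proxy.
specFor :: KnownSpec v => proxy v -> SpecVersion v
specFor _ = specVersion

-- Implicit style: the caller only names the type.
describeImplicit :: KnownSpec v => proxy v -> String
describeImplicit = describeSpec . specFor
```

Code that wants a value to branch on can take `SpecVersion v`, while code that prefers dictionary passing can keep a `KnownSpec v` constraint.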

-- "Booleans"
-------------------------------------------------------------------------------

data HasElif = HasElif | NoElif
Member

Alternatively, we could just do something like

data CabalSpecFeature = Elif | CommonStanzas | ...

data CabalSpecVersion where
    ...
    cabalSpecVersionFeatures :: [CabalSpecFeature]
    ...

hasElif :: CabalSpecVersion -> Bool
hasElif = elem Elif . cabalSpecVersionFeatures

hasCommonStanzas :: CabalSpecVersion -> Bool
...

to make it more consistent with Compiler and Extension.
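
A self-contained version of this suggestion might look like the sketch below. The record layout, the `cabalSpecVersionName` field, the `hasFeature` helper and the assignment of features to versions are assumptions made for illustration.

```haskell
-- Features that a spec version may or may not provide.
data CabalSpecFeature = Elif | CommonStanzas
  deriving (Eq, Show)

-- A spec version described as data: a name plus its feature list.
data CabalSpecVersion = CabalSpecVersion
  { cabalSpecVersionName     :: String
  , cabalSpecVersionFeatures :: [CabalSpecFeature]
  }

-- Assumed feature sets, for illustration only.
cabalSpecVersionOld :: CabalSpecVersion
cabalSpecVersionOld = CabalSpecVersion "old" []

cabalSpecVersion22 :: CabalSpecVersion
cabalSpecVersion22 = CabalSpecVersion "2.2" [Elif, CommonStanzas]

-- Generic membership test; the per-feature predicates are thin wrappers.
hasFeature :: CabalSpecFeature -> CabalSpecVersion -> Bool
hasFeature f = elem f . cabalSpecVersionFeatures

hasElif, hasCommonStanzas :: CabalSpecVersion -> Bool
hasElif          = hasFeature Elif
hasCommonStanzas = hasFeature CommonStanzas
```

Adding a feature then means adding a constructor and listing it in the relevant versions, rather than adding a new Boolean-like type per feature.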

Collaborator Author

@phadej phadej Dec 19, 2017

Yes, have to think how to deal with expanding feature set. But not now. (maybe in follow-up PR)

Collaborator Author

One way is to encode availableSince using CabalSpecFeature, then it will definitely make sense!

-- | 'parsec' /could/ consume trailing spaces, this function /must/ consume.
lexemeParsec :: ParsecParser a
lexemeParsec = parsec <* P.spaces
parsec22 :: ParsecParser a
Member

This needs a Haddock comment.

Member

I wonder if instead of adding a new method each time the spec is updated we could tag the ParsecParser with a phantom type corresponding to the spec version. Just a thought, no need to implement it.
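
A rough sketch of what such a phantom tag could look like, using `ReadP` from base in place of `ParsecParser`. All names here (`SpecParser`, `andThen`, `leadingCommaOld`, `leadingComma22`, `runSpec`) are hypothetical.

```haskell
import Text.ParserCombinators.ReadP

-- Hypothetical type-level tags for spec versions.
data Old
data V22

-- A parser tagged with the spec version it belongs to; 'spec' is phantom.
newtype SpecParser spec a = SpecParser (ReadP a)

-- Sequencing keeps the tag, so parsers for different specs cannot be mixed.
andThen :: SpecParser spec a -> SpecParser spec b -> SpecParser spec b
andThen (SpecParser p) (SpecParser q) = SpecParser (p *> q)

-- Shared by all specs: a run of non-comma characters.
word :: SpecParser spec String
word = SpecParser (munch1 (/= ','))

-- The old grammar has no leading comma; the 2.2 grammar accepts one.
leadingCommaOld :: SpecParser Old ()
leadingCommaOld = SpecParser (pure ())

leadingComma22 :: SpecParser V22 ()
leadingComma22 = SpecParser (optional (char ',' *> skipSpaces))

-- Run a tagged parser, keeping only parses that consume the whole input.
runSpec :: SpecParser spec a -> String -> Maybe a
runSpec (SpecParser p) s =
  case [x | (x, "") <- readP_to_S (p <* eof) s] of
    (x : _) -> Just x
    []      -> Nothing
```

Here `leadingCommaOld `andThen` leadingComma22` is a type error, which is the kind of accidental mixing the tag is meant to rule out.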

Collaborator Author

The idea is that if the CabalSpecVersion dictionary is inlined (in only one place atm!), then all resolution is pointer chasing (which GHC might be able to eliminate if the inliner sees enough), and there is no branching on the version anywhere (except for availableSince).

NB: this is speculation, as I haven't benchmarked.

Collaborator Author

Note: I will look at the core, but not now.

@23Skidoo
Member

Surprisingly? this seems to be as performant as current master, parsing all Hackage packages starting with a: (including acme-everything:)

So SPECIALIZED makes it go a little bit slower, but use less memory? Interesting result.

@phadej
Collaborator Author

phadej commented Dec 19, 2017

The point of the benchmark was to validate that the added abstraction doesn't add a noticeable slowdown.

@23Skidoo
Member

Feel free to merge in the current state and tweak the design in a later separate PR (if you feel like it).

@phadej
Collaborator Author

phadej commented Dec 19, 2017

I'll add the missing Haddock comment and rebase when #4962 is merged.

@phadej phadej force-pushed the leading-comma branch 2 times, most recently from 78f8b71 to aedb82c Compare December 21, 2017 20:38
The CabalSpecVersion type class allows gathering per-spec conditionals.
Currently it's used for selecting parsers / grammatical structure.

Leading (or trailing) commas are accepted for CommaFSep/CommaVSep fields,
i.e. fields with a mandatory comma. These are (atm):

- build-depends
- build-tool-depends
- build-tools
- mixins
- pkgconfig-depends
- reexported-modules
- setup-depends

Tag Backpack fields (mixins, signatures) to `availableSince [2,0]`.
This "fixes" haskell#4448, as the fields
are recognised and warned about, but parsed as empty if cabal-version < 2.0
(the actual cut-off is `! (>= 1.25)`). For example, a file with

    cabal-version: >=1.10
    library
      mixins: foo-indef requires (Foo42 as FooImpl)

will be accepted with a warning, and the parsed `mixins` in `BuildInfo` will be
an empty list.

Also availableSince is removed from `build-tool-depends`,
as we **want** to parse it (and not warn about it) in old Cabal files.
It can be thought of as added retrospectively to old specs, but old `Cabal`s
don't know how to use it.
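
The "recognised, warned, parsed as empty" behaviour described in this commit message can be sketched as a small pure function. `gateSince` and `showVer` are hypothetical helpers, not Cabal's actual `availableSince`, and the plain `>=` comparison simplifies the real cut-off mentioned above.

```haskell
import Data.List (intercalate)

-- Hypothetical helper: keep the parsed value if the declared cabal-version
-- is new enough; otherwise emit a warning and fall back to 'mempty'.
gateSince
  :: Monoid a
  => [Int]          -- version that introduced the field, e.g. [2,0]
  -> [Int]          -- cabal-version declared by the file, e.g. [1,10]
  -> String         -- field name, used in the warning
  -> a              -- value produced by the field parser
  -> ([String], a)  -- (warnings, value actually used)
gateSince since declared name parsed
  | declared >= since = ([], parsed)
  | otherwise         = ([msg], mempty)
  where
    msg = "The field " ++ show name
       ++ " is available since Cabal " ++ showVer since

-- Render a version such as [2,0] as "2.0".
showVer :: [Int] -> String
showVer = intercalate "." . map show
```

For the example file above, `gateSince [2,0] [1,10] "mixins" parsedMixins` yields one warning and an empty list, whatever `parsedMixins` contained.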
@phadej
Collaborator Author

phadej commented Dec 25, 2017

Merged as #4971
