Globbing support for mixins and reexported-modules #7290

ekmett opened this issue Feb 21, 2021 · 13 comments
@ekmett (Member)

ekmett commented Feb 21, 2021

Describe the bug
Using backpack often requires large numbers of reexported-modules and mixins stanzas.

If I use a backpack package that produces 17 modules in some namespace, I have to repeat myself 17 times to rename them in the mixins stanza, and another 17 times in the reexported-modules stanza.

If I then go to use that backpack package 17 times against another that provides 17 definitions, I'm now stuck with 289 lines, and the exponent keeps growing from there. I'm currently using a program to produce my cabal files. This is not "the Way".

I don't want to use a custom setup to produce the boilerplate, because @phadej will be mad at me, but also because it gets in the way of incremental recompilation in this mode, as I understand the world.

This is currently the biggest blocker I have to using backpack 'in bulk' in my code.

I'd like to be able to rename in the mixins statement on the provides/requires sides with a wildcard notation of some sort:

in a sort of Makefile-like vocabulary:

consDefs (% as Foo.%) requires (% as Baz.%)

or since lowercase identifiers aren't a thing:

mixins: consDefs (x as Foo.x) requires (x as Baz.x)

or

mixins: consDefs (* as Foo.*) requires (* as Baz.*)

which gives an option of allowing recursive globbing or not

mixins: consDefs (** as Foo.**) requires (** as Baz.**)
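Whichever spelling wins, the intended expansion is the same. As a sketch with hypothetical module names (suppose consDefs exposes A and B and leaves signatures S and T unfilled):

```cabal
-- glob form (proposed, not real Cabal syntax):
mixins: consDefs (* as Foo.*) requires (* as Baz.*)

-- would expand to the explicit renamings Cabal accepts today:
mixins: consDefs (A as Foo.A, B as Foo.B)
        requires (S as Baz.S, T as Baz.T)
```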

Similarly,

reexported-modules:
  Foo.x as Bar.Internal.x

is practically required, as I have to re-export the things I get from all these bulk-work backpack modules.

Options include using * versus ** to control whether globbing is recursive. Another option would be a scheme like parameter-pack syntax. I don't care what the syntax is, I just want something.

This would require cabal to be able to query a backpacked package for what signatures it has outstanding.

To Reproduce
Try to write large backpacked packages.

Alternatives explored
I am currently generating the entire cabal file with software, like I do with the gl package.

@danidiaz (Collaborator)

This would require cabal to be able to query a backpacked package for what signatures it has outstanding.

This would be useful on its own.

@ezyang (Contributor)

ezyang commented Feb 21, 2021

cc'ing myself

@ezyang (Contributor)

ezyang commented Feb 21, 2021

It'll be helpful to know exactly how the use case in question is working, so I can more easily tell if the specific request here is the right way to go about doing it.

One really important thing to remember about Backpack instantiation is that module identity is structural, not generative. So if you reinstantiate a package the same way it was instantiated internally inside another module, you can witness type equality across them. This means that you don't need to reexport internal modules; if someone needs them, you can just reimport the relevant internal module, and assuming that the requires are all named the same, they'll get it in scope. I'm loathe to predict that this exactly solves your problem, but I hope it gives you some ideas.
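To illustrate the structural-identity point with a sketch (hypothetical package and signature names): two components that fill the same requirement of the same indefinite library in the same way see literally the same modules, so nothing needs to be re-exported for types to line up.

```cabal
library user-a
  build-depends: indef, impl
  -- instantiate indef's signature Sig with impl's Impl.Concrete
  mixins: indef requires (Sig as Impl.Concrete)

library user-b
  build-depends: indef, impl, user-a
  -- the same instantiation yields the very same modules user-a saw,
  -- so types from indef are interchangeable between user-a and user-b
  mixins: indef requires (Sig as Impl.Concrete)
```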

@ekmett (Member, Author)

ekmett commented Feb 22, 2021

It'll be helpful to know exactly how the use case in question is working, so I can more easily tell if the specific request here is the right way to go about doing it.

Ok. Both barrels it is. =)

I'm writing code that is polymorphic over RuntimeRep, to hack around levity-polymorphism limitations.

To that end I go to define a module

Unlifted.Internal.Class.

In it I have classes that look like builtin haskell typeclasses, but parameterized by RuntimeRep.

e.g.

class Eq (a :: TYPE r) where
  (==), (/=) :: a -> a -> Bool

But because a has unknown levity, it can't actually supply the default definitions. However, while a has unknown levity, a -> a -> Bool has kind Type, so I can supply default definitions of that type; I just can't bind the arguments. So in goes

  default (==) :: EqRep r => a -> a -> Bool
  (==) = eqDef

  default (/=) :: EqRep r => a -> a -> Bool
  (/=) = neDef
  {-# MINIMAL (/=) | (==) #-}

and I make a class

class EqRep (r :: RuntimeRep) where
  eqDef, neDef :: forall (a :: TYPE r). Eq a => a -> a -> Bool

Repeat a nauseating number of times for different typeclasses for base, and add automatic lifting for things in TYPE 'LiftedRep, and add a bunch of machinery so I can extend printing in ghci to work with unlifted data types as well.

For any particular RuntimeRep I can write:

instance EqRep Rep where
  eqDef x y = not (x /= y)
  neDef x y = not (x == y)

But if it isn't concrete? no dice.

However, this is all predicated on the existence of instances of EqRep and a gajillion other classes for each RuntimeRep. But the universe of RuntimeReps is big, and while it isn't open, it does contain things like ('TupleRep '[ 'IntRep, 'IntRep, 'TupleRep '[], 'AddrRep ]), so I, being a merely finite being, will probably never finish writing them.
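For instance, following the template above, the instance at an exotic rep is entirely mechanical, but still has to be written out per rep (a sketch):

```haskell
instance EqRep ('TupleRep '[ 'IntRep, 'IntRep ]) where
  eqDef x y = not (x /= y)
  neDef x y = not (x == y)
```

Multiply that by every class and every rep anyone might need, and the combinatorics get out of hand quickly.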

This is our common package.

library reps
  import: base
  hs-source-dirs: reps
  exposed-modules:
    Rep.A Rep.D Rep.F
    Rep.I Rep.I8 Rep.I16 Rep.I32 Rep.I64
    Rep.L Rep.U
    Rep.S0 Rep.T0
    Rep.W Rep.W8 Rep.W16 Rep.W32 Rep.W64
    Nil

I have a backpack "functor", def: it takes in a signature Rep and spits out a module Def with these gajillion instances.

library defs
  import: base
  build-depends: unlifted:{def, core, reps}
  hs-source-dirs: defs
  mixins:
    unlifted:def (Def as Def.A) requires (Rep as Rep.A),
    unlifted:def (Def as Def.D) requires (Rep as Rep.D),
    unlifted:def (Def as Def.F) requires (Rep as Rep.F),
    unlifted:def (Def as Def.I) requires (Rep as Rep.I),
    unlifted:def (Def as Def.I8) requires (Rep as Rep.I8),
    unlifted:def (Def as I16) requires (Rep as Rep.I16),
    unlifted:def (Def as I32) requires (Rep as Rep.I32),
    unlifted:def (Def as I64) requires (Rep as Rep.I64),
    unlifted:def (Def as Def.U) requires (Rep as Rep.U),
    unlifted:def (Def as S0) requires (Rep as Rep.S0),
    unlifted:def (Def as Def.T0) requires (Rep as Rep.T0),
    unlifted:def (Def as Def.W) requires (Rep as Rep.W),
    unlifted:def (Def as W8) requires (Rep as Rep.W8),
    unlifted:def (Def as W16) requires (Rep as Rep.W16),
    unlifted:def (Def as W32) requires (Rep as Rep.W32),
    unlifted:def (Def as W64) requires (Rep as Rep.W64)
  -- implemented
  exposed-modules: A D F I I8 T0 L U W
  -- TODO
  reexported-modules: I16, I32, I64, S0, W8, W16, W32, W64

Now we instantiate the def functor as a mixin on each of the 17 or so basic runtime reps, and defs exports 17 modules.

(Some of those I haven't gotten around to populating all the instances for things like Num Int16# in, hence the mixture of exposed-modules and reexported-modules)

Now, I can also define a signature for a list of RuntimeReps, and make a backpack library cons that takes the list, conses a new element and produces sums and products reps for that list. Internally, it has to work in two stages, one which conses onto the list to produce the rep, then the other which uses def on that rep to construct the swarm of instances.

So given any particular instance you want there is a single canonical tower of mixin instantiations to find it in. You the user just have to write a linear amount of mixin boilerplate once, and because the tower of things will match what anyone else writes to get the same instances, they aren't conflicting orphans.

It looks just like it does for anyone else who depends on parsers for the types and parsers-parsec for some orphan instances, and then depends on two libraries that use parsers-parsec's orphans. It all commutes and everyone is happy.

But you know what? It still kinda sucks. So I'd like to have the outermost package just go through and enumerate the first couple of possible conses onto the TupleRep. That'll handle most of the things that come out of GHC.Prim, and sanely let folks work with my unboxed maybes and the like that live in TYPE ('SumRep '[ 'TupleRep '[], r ]), because they are an unlifted

newtype Maybe# (a :: TYPE r) = Maybe# (# (##) | a #)

So I can make a package that does all 17 conses each of the 2 ways, producing 17 modules containing instances of tuple-stuff and pattern synonyms for working at those reps, and, drama, 17 for working with sums, and 17 for just the naked list of RuntimeReps I just extended. The names are all super-stylized, and could either be joined with '.' onto a name for the tail of the list, or concatenated as part of the identifier, to get something like

Unlifted.Rep.TupleRep2.AddrRep.IntRep

or

T2AI

but since I have to type a thousand of these by hand or get code to generate my cabal file? I'm going with the short one, and I'll gnaw off my leg before I go to 3-tuples. That said, right now I'm barely willing to get the unary case working, because of the pain of the cabal-side syntactic overhead.

For any RuntimeRep the chain of invocations of the def modules and rep modules is canonical. So while this big mess of open must-be-compiled-exactly-at-a-fixed-type instances in these packages is messy, it is also unique. They are orphans but in a weird zen-like way they are not.

But I'd like to be able to produce 1200 module names in one or a dozen lines of cabal file, rather than the 2400 lines of cabal I have to use today.

When/if we get class-associated pattern synonyms I can reduce this a fair bit. (But right now pattern synonyms can't be used here polymorphically in the levity at all!) For now I need to let the user open the module in question to get at pattern synonyms that can't be shared, or to access the individual data constructors for one flavor of the data family, because backpack is going to produce them all with the same names. So I can't currently aggregate the output of the package that is producing 50+ 'temporary' instantiations of the mixins into one module that bulk imports and just re-exports the instances.

Depending on how globbing worked, I might have to swap consing for snocing or something, but the canonical uniqueness of where I'd expect you to get an instance from would remain, and I really, really don't want to write out all those names.

If this was a one-off problem I'd lump it and leave it.

Users can then write code monomorphically against any RuntimeRep i predefined, use the classes without defaults at any RuntimeRep they want, and use a couple of mixins to construct any more exotic instance they need in a canonical location so that it doesn't conflict with any other user of the library. But that only works when they want to make their new extensions that build on the library fully monomorphic.

If they want to themselves be polymorphic, e.g. making a new monad that is runtime-rep agnostic, they have to repeat this dance! This is where C++ templates have Haskell beat.

Smaller examples arise when I go to export, say, a half-dozen modules

library relative
  import: base, bifunctors, comonad, data-default, ghc-prim, hashable, lens, mtl, profunctors, text
  signatures: Delta
  hs-source-dirs: src/relative
  exposed-modules: Absolute Cat List Located Map Relative Queue Semi
  build-depends: common

Right now I write my backpack packages sloppy and ML-style in the top level namespace. I do this because I have to type all those names to move them.

library algebra
  import: base, data-default, hashable, lens, profunctors, text
  build-depends: common, relative
  hs-source-dirs: src/algebra
  mixins:
    relative
     (Absolute as Relative.Absolute,
      Cat as Relative.Cat,
      List as Relative.List,
      Located as Relative.Located,
      Map as Relative.Map,
      Queue as Relative.Queue,
      Relative as Relative.Class,
      Semi as Relative.Semi)
    requires (Delta as Relative.Delta.Type)
  exposed-modules:
    Relative.Delta
    Rev
  reexported-modules:
    Algebra.Ordered,
    Algebra.Zero,
    FingerTree,
    Relative.Absolute,
    Relative.Cat,
    Relative.Class,
    Relative.Delta.Type,
    Relative.List,
    Relative.Located,
    Relative.Map,
    Relative.Queue,
    Relative.Semi

and I have to move them because I'll often instantiate the same library several times and then use them in the same program.

This pattern of bulk importing, renaming to a new namespace and then re-exporting, is very painful, even before I explode the effort using the pattern from the RuntimeRep story above.

@ekmett (Member, Author)

ekmett commented Feb 22, 2021

The fact that equality is structural is the very reason why I'm using backpack and not template haskell here.

@ekmett (Member, Author)

ekmett commented Feb 23, 2021

Picking one of the straw-man syntax proposals above, the * notation would collapse the example down to something like:

library algebra
  import: base, data-default, hashable, lens, profunctors, text
  build-depends: common, relative
  hs-source-dirs: src/algebra
  mixins: relative (* as Relative.*) requires (Delta as Relative.Delta.Type)
  exposed-modules: Relative.Delta Rev
  reexported-modules:
    Algebra.Ordered,
    Algebra.Zero,
    FingerTree,
    Relative.*

@ezyang (Contributor)

ezyang commented Feb 23, 2021

I'm still grokking the use case. However, I checked up on some of the relevant code for the concrete proposals.

It looks like wildcarding in reexported modules will be relatively simple to implement. In LinkedComponent.hs:

    -- OK, compute the reexports
    -- TODO: This code reports the errors for reexports one reexport at
    -- a time.  Better to collect them all up and report them all at
    -- once.
    let hdl :: [Either Doc a] -> LogProgress [a]
        hdl es =
            case partitionEithers es of
                ([], rs) -> return rs
                (ls, _) ->
                    dieProgress $
                     hang (text "Problem with module re-exports:") 2
                        (vcat [hang (text "-") 2 l | l <- ls])
    reexports_list <- hdl . (flip map) src_reexports $ \reex@(ModuleReexport mb_pn from to) -> do
      case Map.lookup from (modScopeProvides linked_shape) of
        Just cands@(x0:xs0) -> do
          -- Make sure there is at least one candidate
          (x, xs) <-
            case mb_pn of
              Just pn ->
                let matches_pn (FromMixins pn' _ _)     = pn == pn'
                    matches_pn (FromBuildDepends pn' _) = pn == pn'
                    matches_pn (FromExposedModules _) = pn == packageName this_pid
                    matches_pn (FromOtherModules _)   = pn == packageName this_pid
                    matches_pn (FromSignatures _)     = pn == packageName this_pid
                in case filter (matches_pn . getSource) cands of
                    (x1:xs1) -> return (x1, xs1)
                    _ -> Left (brokenReexportMsg reex)
              Nothing -> return (x0, xs0)
          -- Test that all the candidates are consistent
          case filter (\x' -> unWithSource x /= unWithSource x') xs of
            [] -> return ()
            _ -> Left $ ambiguousReexportMsg reex x xs
          return (to, unWithSource x)
        _ ->
          Left (brokenReexportMsg reex)

At the point we process reexports, we know all the modules in scope (modScopeProvides linked_shape) so it would be straightforward to do some sort of wild-cardy thing.

I think wildcards in mixins should be doable too. We'd augment IncludeRenaming with another case and then update

convertInclude
    :: ComponentInclude (OpenUnitId, ModuleShape) IncludeRenaming
    -> UnifyM s (ModuleScopeU s,
                 Either (ComponentInclude (UnitIdU s) ModuleRenaming) {- normal -}
                        (ComponentInclude (UnitIdU s) ModuleRenaming) {- sig -})

to know how to do wildcarding. The current implementation makes a lot of assumptions about what IncludeRenaming looks like, though, so it'd probably just have to get rewritten from scratch.
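The expansion step itself could be quite small. A standalone sketch (using plain strings instead of Cabal's ModuleName type, and a simplified provides map, so none of this matches the real cabal internals):

```haskell
import qualified Data.Map as Map
import Data.List (stripPrefix)

-- Expand a glob rename like ("Relative.*", "Algebra.*") against the
-- modules in scope; a non-glob pattern falls back to an exact lookup.
expandGlob :: Map.Map String v -> String -> String -> [(String, v)]
expandGlob provides fromPat toPat =
  case (stripStar fromPat, stripStar toPat) of
    (Just fromPre, Just toPre) ->
      [ (toPre ++ "." ++ rest, v)
      | (m, v) <- Map.toList provides
      , Just rest <- [stripPrefix (fromPre ++ ".") m]
      ]
    _ -> maybe [] (\v -> [(toPat, v)]) (Map.lookup fromPat provides)
  where
    -- "Relative.*" -> Just "Relative"
    stripStar s = reverse <$> stripPrefix "*." (reverse s)
```

The hard part, as noted above, is threading such an expansion through IncludeRenaming and convertInclude, not the matching itself.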

@ezyang (Contributor)

ezyang commented Feb 23, 2021

Another thought: I remember wondering whether or not we should add some sort of functor instantiation syntax to Cabal files, and I was talked out of it for want of a compelling use case. Is this the compelling use case?

@ekmett (Member, Author)

ekmett commented Feb 24, 2021

What would such a functor instantiation syntax look like?

@danidiaz (Collaborator)

A glob-less (and less expressive) alternative syntax could use a new keyword, like:

somepackage (Foo as Bar recursively) requires (Wee as Wii recursively)

Meaning "rename Foo to Bar, including sub-modules, and rename Wee to Wii, including sub-signatures".

For example this

    test-pipes-streaming
            (Test.PipesStreaming as Test.Streaming.PipeStreaming) 
            requires (Test.PipesStreaming.Streamy as Streamy.Streaming,
                      Test.PipesStreaming.Streamy.Bytes as Streamy.Streaming.Bytes)

could be written as this

    test-pipes-streaming
            (Test.PipesStreaming as Test.Streaming.PipeStreaming) 
            requires (Test.PipesStreaming.Streamy as Streamy.Streaming recursively)

One problem with this syntax is that it doesn't lend itself well to moving all top-level modules.

@ezyang (Contributor)

ezyang commented Feb 24, 2021

Yeah, I'm extremely uncertain about syntax. The current syntax tracks pretty well with Haskell module import syntax and anything else we add here is gonna be stuck permanently. (cough maybe we can copy rust cough)

ezyang added a commit that referenced this issue Feb 24, 2021
Qualified module renamings let you bring all modules from a library
into scope, but qualified under some module prefix.  If you write
'pkg qualified Prefix', then if pkg exposes A and B, you will be
able to access them as Prefix.A and Prefix.B.  This functionality
doesn't require any GHC changes; Cabal takes care of desugaring
the qualified syntax into an explicit list of renamings.

Partially address #7290

Signed-off-by: Edward Z. Yang <[email protected]>
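Applied to the relative library shown earlier in the thread, the qualified form would, per the commit description, desugar along these lines (a sketch; the exact desugaring isn't shown in the commit):

```cabal
mixins: relative qualified Relative
-- desugars to:
mixins: relative (Absolute as Relative.Absolute, Cat as Relative.Cat,
                  List as Relative.List, Located as Relative.Located,
                  Map as Relative.Map, Queue as Relative.Queue,
                  Relative as Relative.Relative, Semi as Relative.Semi)
```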
@danidiaz (Collaborator)

Another datapoint: version 0.3.0 of the cryptographic library raaz uses Backpack internally, and the author has stated that he finds the mixins syntax verbose:

I am not sure why backpack performs the mixing and matching in the .cabal file. I feel it makes the .cabal file rather large and unwieldy. The raaz.cabal is close to 900 lines of code with no real way to split it up. It is probably one of the biggest file in the repository

Looking at the cabal file, there are mixin sections like:

mixins: bench-prim (Benchmark.Primitive as Benchmark.Blake2b.CPortable)
        requires   (Implementation as Blake2b.CPortable)

      , bench-prim (Benchmark.Primitive as Benchmark.Blake2b.CHandWritten)
        requires   (Implementation as Blake2b.CHandWritten)

      , bench-prim (Benchmark.Primitive as Benchmark.Blake2s.CHandWritten)
        requires   (Implementation as Blake2s.CHandWritten)

      , bench-prim (Benchmark.Primitive as Benchmark.ChaCha20.CPortable)
        requires   (Implementation as ChaCha20.CPortable)

@ezyang (Contributor)

ezyang commented May 26, 2021

Yeah, so this mixin section is an example where good old fashioned functor-style syntax would be better.
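Purely as a straw man (none of this is real Cabal syntax), an ML-style functor application would let each instantiation name only the varying parts:

```cabal
mixins:
  bench-prim(Implementation := Blake2b.CPortable)
    as Benchmark.Blake2b.CPortable,
  bench-prim(Implementation := Blake2b.CHandWritten)
    as Benchmark.Blake2b.CHandWritten
```

Each entry carries just the implementation being plugged in and the name of the result, rather than repeating the full provides/requires renaming pair.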
