Skip to content

Put libraries into $store/lib #4656

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Aug 22, 2017
Merged

Conversation

angerman
Copy link
Collaborator

@angerman angerman commented Aug 2, 2017

Please include the following checklist in your PR:

  • Patches conform to the coding conventions.
  • Any changes that could be relevant to users have been recorded in the changelog.
  • The documentation has been updated, if necessary.

Please also shortly describe how you tested your change. Bonus points for added tests!

This is a stab at #4263 as an alternative to #4426.

The general strategy is:

  • put all libraries into $store/lib. This though makes them not prefix relocatable anymore.
    I'm fine with that for macOS.
  • As that is not enough to stay under the load command limit for my tiny sample app, we
    mutilate the package name as well. We cut the abi hash to 4 byte 😱 and drop vowels.
    We can not drop the -ghcX.Y.Z suffix it seems as GHC seems to append that on it's own.

This then produces only one @rpath entry to $HOME/.cabal/store/ghc-x.y.z/lib. And thus
saving quite a bit already.

The alternative strategy I though of was to rename only the dynamic library to libHS-ghcX.Y.Z.dylib, and try to turn the reference to @rpath/name-version-hash/lib/libHS-ghcX.Y.Z.dylib. That would also
reduce the @rpath entry to one, but suffer from the same long filename issue.

@hvr
Copy link
Member

hvr commented Aug 2, 2017

How does this interact w/ #4400?

@angerman
Copy link
Collaborator Author

angerman commented Aug 2, 2017

@hvr, it basically assumes that if we didn't bail out, and are renaming the directory already, we can copy any left over files as well. I'm open for better solutions.

@23Skidoo
Copy link
Member

23Skidoo commented Aug 2, 2017

/cc @dcoutts @christiaanb

Judging by the description, this seems to be similar to an earlier fix that got rejected (see discussion in #4426). I'm inclined to accept this PR as a pragmatic "temporary" solution if it unborks new-build on macOS, since it's been broken for quite a while and no-one managed to fix it in "proper" way.

@angerman
Copy link
Collaborator Author

angerman commented Aug 2, 2017

Note: this patch, while ugly, at least allows me to work with new-build on macOS.
In general this clearly is not the optimal solution. If GHC would fall back to static archives
when no dynamic ones could be found for TH, we might just get away with static only
on macOS. Not that I consider that a clean option either.

Maybe we have to do some dylib chaining. (e.g. build intermediate dylibs to load more dylibs.)

@angerman
Copy link
Collaborator Author

angerman commented Aug 2, 2017

Also to note: this as well as the alternative strategy only give us a bit more wiggle room until this issues comes up again. If someone has a massive list of dependencies and manages to even with mutilated names and everything in "$store/lib", to blow past the load command limit, we are back to square one, and need to try and engineer a proper(?) solution? If one exists? Maybe via chain loading dylibs?

@23Skidoo
Copy link
Member

23Skidoo commented Aug 2, 2017

Also to note: this as well as the alternative strategy only give us a bit more wiggle room until this issues comes up again.

This applies to old-build too, right?

@angerman
Copy link
Collaborator Author

angerman commented Aug 3, 2017

As @christiaanb also mentioned. The issue is essentially that any linked library will require an
@rpath entry and a library entry relative to @rpath. What this and the alternative
strategy would do is to ensure that only a single @rpath entry is needed (that is, all libraries
are relative to the same @rpath.) This saves about ~50%. As we now require only 1 @rpath
entry for N libraries. Where we needed N @rpath entries before.

However, the length of the library name/path relative to the @rpath also takes up space. With the hashes in the library name (and path), this easily trips up the linker.

For old-build, where the library names are shorter, this is a less prominent issue, but eventually, if
a project manages to link in enough other libraries, this would become a problem with old-build as well. This is why the mutilating of the library names and shortening of the hashes helps with
new-build to at least for my project stay under the limit.

I haven't looked at it yet in detail, and I do not know if we flatten out the list of dependent libraries, and
if this is needed. Maybe I'll get around to investigating this today.

After giving the strategies some additional thought, I must admit, I start liking the $store/lib one over the alternative, because it makes linking by hand a bit easier, as only one -L flag instead of N -L are required to link N Haskell libraries.

@hvr
Copy link
Member

hvr commented Aug 3, 2017

If this gets merged, can we get a verbose source-code comment note somewhere which explains the rationale of this hack (and maybe its tradeoffs, e.g. increased hash collision risk; potential consequences for future garbage-collecting code as well as remote artifact upload/downloading code; IOW, what invariants future code needs to take into account so that it doesn't break new-build for OSX, especially for those of us who develop on the blessed Linux arch which is less burdened and would assume everything works just fine everywhere if it works on Linux =) ).

@23Skidoo
Copy link
Member

@angerman Can you please fix the compile failures on older GHCs?

This is done, such that we have fewer issue with
the load command limit with Mach-O files on macOS.
@23Skidoo
Copy link
Member

@angerman Thanks! I think that this is good to go, please merge once the CI is green.

@angerman
Copy link
Collaborator Author

@23Skidoo if this goes green, I'll squash the last four commits, and merge if still green afterwards.

@angerman angerman merged commit 3da83a0 into haskell:master Aug 22, 2017
@dcoutts
Copy link
Contributor

dcoutts commented Sep 2, 2017

I'm still not happy :-)

I want us to fix ghc so that it does not impose any .dynlib naming convention on us, and then we can use a scheme that does allow the (imho important) property of the store that each package lives in its own directory.

The alternative strategy I though of was to rename only the dynamic library to libHS-ghcX.Y.Z.dylib, and try to turn the reference to @rpath/name-version-hash/lib/libHS-ghcX.Y.Z.dylib. That would also
reduce the @rpath entry to one, but suffer from the same long filename issue.

I think we can do this but make it better...

We don't need the ghcX.Y.Z bit if we change ghc to get rid of that convention.

https://ghc.haskell.org/trac/ghc/ticket/14182

We don't actually need the HS either. We can use short lib names. As for the name-version-hash bit, we could compress this, or we could at least add compressed symlinks that we still think are safe to some probability of non-clashing.

For example a compressed id need not include the name or version, and just N bits of the hash, rendered as baseXX rather than hex.

@angerman
Copy link
Collaborator Author

angerman commented Sep 3, 2017

@dcoutts I'm not very happy with this either. But it at least allows me to work. I hope you will be at icfp, and we might be able to have a chat about this.

Changing the naming is obviously only a solution that will work until someone ends up with massive amounts of dependencies and runs out again.

The dynamic linker on macOS is, as far as I can see, capable of recursively linking dynamic libraries. See the following example, in which we have

main -> b -> a

instead of

main --> a
     `-> b
$ otool -L main
main:
	lib/libb.dylib (compatibility version 0.0.0, current version 0.0.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1238.60.2)

$ otool -L lib/libb.dylib
lib/libb.dylib:
	lib/libb.dylib (compatibility version 0.0.0, current version 0.0.0)
	lib/liba.dylib (compatibility version 0.0.0, current version 0.0.0)
	/usr/lib/libSystem.B.dylib (compatibility version 1.0.0, current version 1238.60.2)

$ otool -L lib/liba.dylib
lib/liba.dylib:
	lib/liba.dylib (compatibility version 0.0.0, current version 0.0.0)

The source for this is in angerman/dylib-linking.

I therefore believe we might want to pursue a recursive solution on macOS.

Note: libtool on macOS (man page), is quite different from libtool other platforms I believe.

@dcoutts
Copy link
Contributor

dcoutts commented Sep 3, 2017

I hope you will be at icfp, and we might be able to have a chat about this.

I'm hoping to at least make it for HS+HIW.

Changing the naming is obviously only a solution that will work until someone ends up with massive amounts of dependencies and runs out again.

True enough, though it's still a good idea to go for shortening and sharing. It's good for linking performance on ELF if nothing else.

The dynamic linker on macOS is, as far as I can see, capable of recursively linking dynamic libraries.

Worth looking into. It wouldn't surprise me though if any automatic splitting is really tricky.

Along similar lines, we should perhaps look into getting GHC to avoid directly linking to libraries that are only needed indirectly. I rather suspect currently GHC will link each lib & exe to the transitive closure of all the Haskell libs, because any of them could be the source for symbols that are used (due to re-exports & inlining). But in practice I suspect in really big projects that there are a lot of indirect deps that need not be direct deps. GHC ought to be able to work this out by collecting the set of packages used in "original" names.

@cartazio
Copy link
Contributor

What precludes having a symlink dir in a new build ? And have that point into the new build store ? That seems to avoid most of the problems ?

@cartazio
Copy link
Contributor

To clarify: the dylibs are needed by default only for ghci and template Haskell right ?

@angerman
Copy link
Collaborator Author

The issue is that we run over the load command size. This is a fixed size that includes the name of the library, as well as (all) relative paths. By putting all libs into the same directory, we can cut down the relative path entries. We still have those really long library names.

And even if we shortened the library names, at some point one might link in just enough libraries, that even the short names will overflow the load command limit. I do believe that recursive linking is the right approach here, but as @dcoutts mentioned during ICFP, this would need to be done in GHC, and I haven't gotten around to look into this.

Regarding the need for dynamic libraries. I have not been able to produce a purely static ghc on macOS, and ghci should (via the internal linker) be perfectly able to load archives instead of dylibs. And on the other hand I saw the cross compiler I built produce dynamic libraries by default. (hence the recent addition of -staticlib)

@ghost
Copy link

ghost commented Apr 10, 2018

@angerman

his would need to be done in GHC, and I haven't gotten around to look into this.

Why does this need to be done in GHC? Would it be possible to use a modified linker that does recursive linking like the way nixpkgs does it, and then tell GHC to use that wrapper with -pgml?

@jship
Copy link

jship commented Apr 18, 2018

@dogirardo: I put a comment over on #5220 that I'll link here. #5220 (comment)

angerman added a commit to stable-haskell/cabal that referenced this pull request Feb 23, 2025
andreabedini pushed a commit to stable-haskell/cabal that referenced this pull request Mar 6, 2025
andreabedini pushed a commit to andreabedini/cabal that referenced this pull request Mar 6, 2025
andreabedini pushed a commit to stable-haskell/cabal that referenced this pull request Mar 7, 2025
andreabedini pushed a commit to andreabedini/cabal that referenced this pull request Mar 13, 2025
andreabedini pushed a commit to andreabedini/cabal that referenced this pull request Mar 13, 2025
andreabedini pushed a commit to stable-haskell/cabal that referenced this pull request Mar 18, 2025
andreabedini pushed a commit to andreabedini/cabal that referenced this pull request Apr 7, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants