This repository was archived by the owner on Oct 7, 2020. It is now read-only.

improving the build/installation process #1002

Closed
2mol opened this issue Dec 18, 2018 · 10 comments
Labels
build Continuous integration and building type: discussion type: enhancement type: refactor Refactor and tidy up internals.
Milestone

Comments

@2mol

2mol commented Dec 18, 2018

Hi everybody!

I spent some time trying to figure out how HIE is built and I found a bunch of problems that might be pretty easy to fix.

This is me getting started on that and soliciting feedback.

My end goal is to create a PR that makes the installation more pleasant for first-time users, including some readme polish. Please note that I'm not extremely experienced with makefiles, other build tools, or tricky dependency handling, so I'm not promising that I'll be extremely fast. I will track progress here.

The problems. Numbered so you can comment on them:

  1. Building everything seems robust-ish, but it is very time-intensive and uses up loads of disk space.
  2. building a targeted ghc version is broken in my opinion:
    • The submodules will use different resolvers. The redundancy in downloads and compilations is pretty nuts. (Edit: this is incorrect; they will use the resolver from the stack build command.)
    • The selection of ghc version doesn't propagate to the other build commands.
    • The command to check the ghc version (GHC := $(shell stack path --compiler-exe)) will use the main stack.yaml instead of the targeted one. This will also attempt to download an entire resolver set, just to tell you its ghc version.
  3. The resolvers in stack.yaml and stack-8.6.3.yaml are two different nightly ones, which I think could be aligned to use the same one, improving build-all.
  4. It's unclear what has to be present on my system and what can be reused. If I already have the latest cabal, there should be no need to compile the whole thing.
  5. happy gets compiled and installed with each version, effectively overwriting itself 7 times.

I really think installing one, maybe two specific versions is the right use-case for most intermediate users, especially those installing hie for the first time.

As an example, I managed to get an amazingly fast install experience (for ghc 8.4.4) by doing the following:

  • remove the dependencies on cabal and submodules from hie-%: in the Makefile
    • not fully necessary, but nice in case we already have the right cabal version.
  • copy stack-8.4.4.yaml over stack.yaml
  • change the resolvers of submodules/HaRe and submodules/brittany to use the same one as HIE (and create a stack-8.4.4.yaml for the latter)
    • not necessary
  • make sure that cabal is installed. I happened to have it via ghcup, so I have no idea whether it even needed cabal update; I only ran cabal new-update. Installing it via stack doesn't seem to be mandatory either.
  • make build-doc-8.4.4 at the end.
  • everything works in projects with 8.4.4. Amazing!
    • space cost: 200 MB. It would be 100 MB, but there is both hie-8.4 and hie-8.4.4. Does it need both? Who knows. md5 says they're the same thing. Would a symlink do?

Now the challenge is to make it equally nice without having to tweak any files.

I gotta run now, but I will update this post with goals and a todo list.

Chime in if you have comments!

@Anrock
Collaborator

Anrock commented Dec 18, 2018

Regarding 2) - I'm not sure changing the resolver of the submodules is actually OK. I think at least some code there (ghc-mod, probably) relies on being built with a specific GHC; @alanz can tell better.

However, propagating the specific stack-x.y.z.yaml everywhere to reuse existing installations is probably a good way to reduce wasted time and space. Inferring the proper --stack-yaml parameter from the requested GHC version in every stack invocation should remove the need to copy a specific stack-x.y.z.yaml over stack.yaml. This would make a good first step.
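A minimal sketch of that inference, assuming the per-version config files keep the stack-x.y.z.yaml naming used in this thread. The helper names stack_yaml_for and stack_cmd_for are hypothetical, and the command is printed rather than executed so the sketch runs without stack installed:

```shell
# Hypothetical helper: map a GHC version to its stack config file name.
stack_yaml_for() {
  printf 'stack-%s.yaml\n' "$1"
}

# Hypothetical helper: print (not run) the stack command line for a given
# GHC version, followed by whatever stack subcommand and flags are passed.
stack_cmd_for() {
  ghc_version=$1
  shift
  printf 'stack --stack-yaml=%s %s\n' "$(stack_yaml_for "$ghc_version")" "$*"
}

stack_cmd_for 8.4.4 path --compiler-exe
# prints: stack --stack-yaml=stack-8.4.4.yaml path --compiler-exe
```

The same helper would cover the GHC version check, so that it reads the targeted yaml instead of the main stack.yaml.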

Regarding 3) - I think most redundancies are handled by stack's cache. At least for cabal: in my experience stack won't rebuild cabal if it is already installed. However, afaik, hie may require a specific version of cabal, and a globally installed one may not be suitable. So stack install cabal is okay here: it either installs the proper cabal or does nothing.

Regarding specific libraries (like libgmp), however, i'm not sure we can do anything better than case-by-case individual manually written checks based on user reports.

> but there is both hie-8.4 and hie-8.4.4. Does it need both? Who knows. md5 says they're the same thing. Would a symlink do?

I think it will. Not sure about portability, though. Symlinks aren't widely used on Windows for some reason but in my experience they work okay, so we can give it a shot.

@2mol
Author

2mol commented Dec 19, 2018

Thanks for the feedback @Anrock! I have a few questions for you at the end.

  2. If I stay in the same overall LTS range, i.e. 12.*, then it should be ok. I think the increasing LTS versions are supersets of the previous ones, iirc? I don't know yet how to pass the choice of resolver to them.

I agree that copying over stack.yaml is not a solution, it was only my hack to prove that it's possible to massively reduce the amount of building that has to be done.

  3. To be slightly more precise, stack will install cabal over the old one, but if it was already compiled with that resolver, then it just copies the artifact it already has anyway. So yes, only one compilation. But we could make it zero, maybe by checking cabal --version somehow.
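One way the cabal --version check could look, as a sketch only. It assumes the first line of the output has the form "cabal-install version X.Y.Z.W" and that an exact version match is what is wanted; the function names are hypothetical. The version text is read from stdin so the logic can be tried without cabal installed:

```shell
# Extract the version number from `cabal --version`-style output on stdin.
# Assumption: the first line looks like "cabal-install version 2.4.0.1".
cabal_version_from_output() {
  sed -n '1s/^cabal-install version \([0-9.]*\).*$/\1/p'
}

# Exit status 0 when the version on stdin differs from the wanted one
# (or cannot be parsed), i.e. when cabal still has to be built.
needs_cabal_build() {
  wanted=$1
  installed=$(cabal_version_from_output)
  [ "$installed" != "$wanted" ]
}

# Demo with canned output instead of a real cabal call.
printf 'cabal-install version 2.4.0.1\ncompiled using version 2.4.0.1 of the Cabal library\n' \
  | needs_cabal_build 2.4.0.1 || echo "cabal 2.4.0.1 already present, skipping build"
```

In a real build step the input would come from `cabal --version 2>/dev/null`; a missing cabal then yields empty input and is treated as "needs build".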

To your last point, a question: I now have three executables: hie, hie-8.4, and hie-8.4.4. What would happen if I installed all versions? Wouldn't several hie-8.4.* just keep pointlessly overwriting hie-8.4? To me that means that hie-8.4 cannot be used anywhere. Or could you explain to me how those 3 are needed?

IMHO in a perfect world we would only have hie-8.4.4 and have hie-wrapper do the rest.

I also became aware of #991, so I will limit the scope of my attempt to just listing here what I think is broken with build-all, and attempting to fix that.

Will report back with an overview of my understanding of make build-all.

@Anrock
Collaborator

Anrock commented Dec 19, 2018

@2mol
2) Probably. As I said, I'm not sure. For example, at the moment hie always builds cabal-2.4.0.1, which I believe is available only for GHC 8.6.
3) That may work, yeah.

> wouldn't several hie-8.4.* just keep pointlessly overwriting hie-8.4?

They would.

> To me that means that hie-8.4 cannot be used anywhere.

Why, tho?

> Or could you explain to me how those 3 are needed?

You mean hie, hie-8.4 and hie-8.4.4? This is a common convention in the *nix world for having multiple versions of the same library installed simultaneously.
Basically you have an arbitrarily large set of libfoo-x.y.z.so files anywhere and a chain of symlinks, like /usr/lib/libfoo.so -> /usr/lib/libfoo-x.so -> /usr/lib/libfoo-x.y.so -> /whatever/libfoo-x.y.z.so. By pointing the symlinks at different lib files, the user can select the "default" version of libfoo, and build systems can look for dependencies in a semi-standard way, like "I need any version of libfoo, so I just link with /usr/lib/libfoo.so" or "I need at least libfoo-2.4, so I link with /usr/lib/libfoo-2.4.so". The symlink chain ultimately resolves to some concrete file.
Hie follows this convention, but copies binaries instead of symlinking, for some reason. And hie-wrapper here acts as the "build system" from the example.
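Applied to hie, the chain could be built with symlinks instead of copies roughly like this. It is a sketch, not what the Makefile does: link_hie_versions is a hypothetical name, and the demo uses an empty placeholder file in a temporary directory in place of a real hie-8.4.4 binary:

```shell
# Hypothetical helper: given an installed hie-X.Y.Z in bin_dir, create
# relative symlinks hie-X.Y -> hie-X.Y.Z and hie -> hie-X.Y.
link_hie_versions() {
  bin_dir=$1
  full_version=$2                    # e.g. 8.4.4
  minor_version=${full_version%.*}   # strips the last component, e.g. 8.4
  ln -sf "hie-$full_version" "$bin_dir/hie-$minor_version"
  ln -sf "hie-$minor_version" "$bin_dir/hie"
}

# Demo in a throwaway directory with a placeholder "binary".
demo_bin=$(mktemp -d)
: > "$demo_bin/hie-8.4.4"
link_hie_versions "$demo_bin" 8.4.4
readlink "$demo_bin/hie"       # prints: hie-8.4
readlink "$demo_bin/hie-8.4"   # prints: hie-8.4.4
rm -r "$demo_bin"
```

Relative link targets keep the chain valid if the whole bin directory is moved. This covers Linux and macOS only; Windows would need separate handling.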

@alanz
Collaborator

alanz commented Dec 19, 2018

> hie, hie-8.4, and hie-8.4.4.

It just struck me that the original Makefile built from the oldest GHC to the newest, so the hie and hie-n executables would be for the most recent GHC version. This is not the case for the recent change to the Makefile.
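To restore the oldest-to-newest order regardless of how the versions are listed, the version list could be sorted numerically per component before building. A sketch using only POSIX sort options; the function name is hypothetical and the version list is illustrative:

```shell
# Sort dotted GHC versions numerically by major, minor, then patch component,
# so that e.g. 8.10.1 sorts after 8.6.3 (a plain text sort would not do that).
sort_ghc_versions() {
  sort -t . -k 1,1n -k 2,2n -k 3,3n
}

printf '%s\n' 8.6.3 8.2.2 8.4.4 8.2.1 8.10.1 | sort_ghc_versions
# prints, one per line: 8.2.1 8.2.2 8.4.4 8.6.3 8.10.1
```

Building in that order means the last copy to hie and hie-x.y always comes from the newest GHC in each series.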

And for, say, the hie-8.6.3 one, the logic is that GHC 8.6.3 should be backward compatible with 8.6.2 and with 8.6.1.
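Under that assumption, a wrapper-style lookup for a project's GHC version could fall back from the exact binary to the minor-series one and finally to plain hie. The order below is an assumption for illustration, not taken from hie-wrapper's source, and pick_hie is a hypothetical name:

```shell
# Print the first existing candidate among hie-X.Y.Z, hie-X.Y and hie in
# bin_dir; return 1 if none of them exists.
pick_hie() {
  bin_dir=$1
  ghc_version=$2
  for name in "hie-$ghc_version" "hie-${ghc_version%.*}" "hie"; do
    if [ -e "$bin_dir/$name" ]; then
      printf '%s\n' "$name"
      return 0
    fi
  done
  return 1
}

# Demo: only the 8.6.3 build is installed, and the project uses GHC 8.6.1.
demo_bin=$(mktemp -d)
: > "$demo_bin/hie-8.6.3"
: > "$demo_bin/hie-8.6"
pick_hie "$demo_bin" 8.6.1   # prints: hie-8.6
rm -r "$demo_bin"
```

Whether the hie-8.6 picked this way actually works for an 8.6.1 project depends on the backward compatibility described above.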

@alanz
Collaborator

alanz commented Dec 19, 2018

> building a targeted ghc version is broken in my opinion:
>
>   • The submodules will use different resolvers. The redundancy in downloads and compilations is pretty nuts.

In a multi-directory project, stack will build the submodules with the resolver specified in the root. Otherwise you would not be able to use a version 8.6.3 compiled module with say a GHC 8.4.4 compiled one.

>   • The selection of ghc version doesn't propagate to the other build commands.

The actual compilation step is driven by

## Builds hie for GHC version % only
hie-%: submodules cabal
	stack --stack-yaml=stack-$*.yaml install happy
	stack --stack-yaml=stack-$*.yaml build
	stack --stack-yaml=stack-$*.yaml install                                \
		&& cp '$(STACKLOCALBINDIR)/hie' '$(STACKLOCALBINDIR)/hie-$*'    \
		&& cp '$(STACKLOCALBINDIR)/hie-$*' '$(STACKLOCALBINDIR)/hie-$(basename $*)'
.PHONY: hie-%

So each of the invocations of stack uses exactly the same stack yaml file.

>   • Even the command to check the ghc version (GHC := $(shell stack path --compiler-exe)) will use the main stack.yaml instead of the targeted one. This will also attempt to download an entire resolver set, just to tell you its ghc version (!)

This bit is done on the assumption that everything is being built, so the specific version being used is not important, as it will be installed anyway. Its original purpose is to make sure that the cabal executable is available in the path, so that ghc-mod / cabal-helper can use it while trying to determine the project type.

@2mol
Author

2mol commented Dec 21, 2018

Thanks @alanz, I appreciate you chiming in.

> And for say the hie-8.6.3 one, the logic is that GHC 8.6.3 should be backward compatible with 8.6.2, and with 8.6.1

Ah, that makes sense, very good to know! That solves my comment to @Anrock about me thinking that 8.4 couldn't really be used.

But I don't fully understand: why then compile 8.6.1 if you could just compile 8.6.3 and use that for every version below it? I guess I really should look into the code for hie-wrapper at some point.

Either way, what do you think about switching to symlinks? My understanding is that the build for Windows is done with a PowerShell batch file anyway. So we could avoid copying the binaries, no?

> In a multi-directory project, stack will build the submodules with the resolver specified in the root. Otherwise you would not be able to use a version 8.6.3 compiled module with say a GHC 8.4.4 compiled one.

Also something I didn't know! Some of my original points are therefore not valid and my step-by-step thing can be simplified. I started measuring compilation times and disk usage, to avoid being too wrong about how much can be improved :)

@alanz
Collaborator

alanz commented Dec 21, 2018

@2mol I think we would have to experiment to confirm whether the GHC 8.6.3 binary does in fact work for GHC 8.6.1 projects.

I do know there was a breaking AST change from GHC 8.2.1 to 8.2.2, which could cause an issue.

And I have no problem with symlinks, it perhaps makes it clearer what is going on, to anyone inspecting the bin directory.

We just need to make sure they are actually supported on all platforms using hie.

@alanz alanz added this to the 2019-01 milestone Dec 24, 2018
@ingun37

ingun37 commented Dec 26, 2018

I think building them concurrently would help a lot. I don't know if Makefiles support that, but Gradle does for sure. (Gradle is also much better in general than Makefiles.)

@2mol
Author

2mol commented Dec 29, 2018

Ok, tested out the first simple idea to make the single build faster. You can see the change in #1013. More progress to follow.

@alanz symlinks should be totally fine on Mac and Linux, and I can test on both.

@ingun37 I think compiling already uses up all my cores, so I'm not sure if we can gain much by building concurrently. But I'd be happy to be proven wrong.

@lukel97 lukel97 added the build Continuous integration and building label Jan 1, 2019
@alanz alanz modified the milestones: 2019-01, 2019-02 Feb 2, 2019
@alanz alanz modified the milestones: 2019-02, 2019-03 Mar 2, 2019
@alanz alanz modified the milestones: 2019-03, 2019-04 Apr 6, 2019
@fendor
Collaborator

fendor commented Apr 21, 2019

I suppose, since the Makefile does not exist anymore and #1168 has been merged, this can be closed now.

@fendor fendor closed this as completed Apr 21, 2019
6 participants