GHC-internal modules in base
#146
Comments
|
There are two axes of distinction proposed between external and internal:
I strictly disagree with the second point. If a change is breaking, then it requires a major version bump. I don't care whether the module is marked as "implementation specific internal". If you expose it, then it's part of the public API. And as far as I understand, it's the first point which causes most friction. GHC devs need to change something in GHC-specific stuff, but have to go through the CLC. And on that I agree, the clarification will make everyone's life easier. But I repeat: do not bundle these two considerations, they are not related. As an (occasional) user of low-level Finally, if it's really internal modules (i.e. possible future
then whole thing is not worth doing. You probably want to word that differently. |
As a matter of principle, I have no objection to continuing to bump the version of However, until this happens (which could happen in time for GHC 9.8, if we move quickly) we will need to be pragmatic (as we have been in the past). Concretely, this means accepting the (small) possibility that breaking changes in internal modules may occur in minor version bumps. Of course, we will do due diligence to minimize the damage, but if there is a change needed in an internal declaration for a soundness fix (or something of similar severity) then we will make the change with a minor bump. Doing otherwise would merely cause undue harm to users for no perceivable benefit. |
Can you provide a short summary of such breaking changes deep in EDIT: I know that |
Such cases are (and should be) rare. Producing a comprehensive summary of these changes would require a fair amount of effort not because there are numerous cases but because they tend to be small and subtle. On a cursory look through the last few releases I was unable to find a single one. However, the principle stands: if we need to change, e.g., the type of To reiterate, ideally the likes of |
Does that module need to be public? If it's used in desugaring, can't GHC access symbols in non-public modules? At least TH can AFAIK. EDIT: I understand that you try to be prepared for very unexpected things. But if you find it hard to find a convincing example, maybe it's a sign that asking CLC whether doing slight PVP sin isn't really not that bad? EDIT2: Or if it's in non-CLC module, then just doing it and warning users in |
Yes, this is precisely the pragmatic approach that I was suggesting in the original comment. |
Quite right. I've made this change. |
But that is different from what the proposal says. Especially as proposal says that The bar to make a breaking change (in |
I think one thing that might be good to draw out is how we will communicate the status of base modules going forward. How would a first time contributor to Some possibilities:
Another question is what the procedure will be for modules moving from one group to another, or for the introduction of new modules. For instance would adding a new internal module need CLC review? What about moving a module from internal to external? |
To be clear, this proposal explicitly /does not/ propose to split out the internal modules of As long as internal modules remain in
Yes, clear communication will be essential. I have added a small note suggesting that we introduce a new Haddock field to mark internal modules. Adding a mention to the MR template also seems like a reasonable idea.
This proposal seeks to establish the /concept/ of an internal module and a roadmap for what modules we would like to move to internal status in the future. While it would be great if the CLC would summarily accept a swath of these changes, we are willing to propose concrete changes in future, smaller CLC proposals if necessary. |
@bgamari are you suggesting this as a stop gap or a permanent solution? |
Do note that I have dropped the section entitled "where to place internal declarations" as it contained language from the editing process which was not intended to be in this proposal (namely the renaming of existing modules). |
I like the idea of code being marked as stable/unstable, however: So this can only work if the "this module is stable" documentation is kept accurate somehow. The reason I'd be undecided without such a mechanism, is that we already have a stability field in haddock and it is widely ignored anyway. (The other question is for some sort of linting about whether unstable modules are used in a project, but that's not in scope of this proposal afaict.) |
I see. In that case I'm leaning towards -1 on this proposal, since I believe violating PVP in the standard library sets a bad example (and as indicated in the PVP ticket, I'm also against formalizing it into PVP). The only pragmatic way forward is the base split to me. I'm starting to find it hard to follow all the inter-connected tickets about this, though. |
@bgamari could you please elaborate on this? There is no necessity to expose functions used by the list fusion framework, and indeed no other function of it (e. g., |
Yes, this is a fair point; Looking beyond |
(Assuming for a moment that More generally, my question is this. I understand that many fragile entities have been exposed from |
These cases are indeed largely due to historical accident. In many cases modules have been exposed via |
Is it difficult to keep modules which have been exposed due to historical accident stable (or at least avoid breaking changes)? And contain new work to |
@Bodigrim it very much depends upon the case; this is essentially what I try to capture in the "stability risk" column of the spreadsheet. Anything assessed to be 0 or 1 will likely be quite easy to keep stable (and consequently I generally propose that these be "stabilized" except in cases where there is no evidence of external usage). Modules assessed to be a higher stability risk would be harder. In some cases we propose that these be stabilized despite this (in particular, in cases where we find high degrees of dependence in the ecosystem). However, there are certainly a number of modules which we would prefer to hide. Regardless, yes, we will need to be more careful to contain new work to |
What does Stability risk grade 3 stand for? |
These assessments are a qualitative, fairly subjective grading. Grade 3 essentially corresponds to things that not only are likely to change but that we would also actively discourage users from relying on. |
I like the proposal overall. Here's my constructive feedback.
I don't think we need a new field. I suggest reusing I suggest opening a PR to Haddock with the description of these four fields as the immediate next step.
Currently, the Base stability spreadsheet wants to either hide or internalize a total of 38 modules. Creating 38 CLC proposals (one per module) sounds like a huge amount of work. I suggest to at least split this process into batches. In fact, I believe we can agree on most of the modules in this CLC proposal directly.
My general view:
I would like to provide my comments to all other individual modules:
|
Currently my approach is to judge every proposal on a case-by-case basis taking into account the precedent of previous decisions, so I don't expect anything to change from my point of view, vis-a-vis the decision-making process. That is to say, if this proposal is passed it sets the precedent that CLC members ought to consider those modules as less stable, and I'm happy to take that precedent as a strong indicator of how I ought to make decisions. |
I'm okay with lowering my standards if it helps Haskell and GHC users to be more productive. In fact, I already didn't have high requirements for changes in the GHC-internal (as I see) modules. Formalizing this convention and having a list of such modules explicit and uniform for everyone sounds like a step forward. |
Dear CLC members, let's vote on the proposal to add the following disclaimer
to these modules:
(I'd usually ask proposers to put up an MR, but in this case it seems better to hold a vote first. If there is a delay, I'll prepare an MR myself) @tomjaguarpaw @chshersh @hasufell @mixphix @angerman @parsonsmatt +1 from me. |
+1 |
+1 |
+1 |
+1 |
+1 |
Thanks all, 6 votes in favour are enough to approve. @bgamari please go ahead with an MR (and leave a link here for the reference). |
@bgamari this is a gentle reminder to prepare an MR implementing the proposal. |
@Bodigrim thanks for the ping. I have not forgotten; to the contrary, we have been hard at work getting the groundwork in place to implement this and the rest of our proposal. |
To warn users that these modules are internal and their interfaces may change with little warning. As proposed in Core Libraries Committee #146 [CLC146]. [CLC146]: haskell/core-libraries-committee#146
Could someone perhaps link to (or copy outright) the content of #146 (comment) into the issue description, since it is what actually ended up being voted on for this issue? :) This thread goes for a wild ride that perhaps most future readers don't need to go through. In the end the proposed change was literally just documentation, if I'm not mistaken, while many other things of far greater import were argued for and against, as everyone gradually came to understand each other over time. |
I think we are waiting for @bgamari to formulate a concrete proposal (indeed along the lines of the comment you cite). Ben, it may well be better to start a new issue, to avoid the "wild ride". |
I'm somewhat confused. There was a very specific proposal voted on in this thread; it's an MR which is missing, but no reason to start a new issue for it. |
Yes, I am also rather confused. My understanding is that the proposal as-written in the summary is accepted. The operative changes are described in #146 (comment) and #146 (comment). However, I now realize that I did forget to notify the CLC of the implementation of the proposal, which was carried out in !10422; apologies for that. This MR introduced |
Ben, now I'm confused even more. The approved changes we voted on are in the second comment you linked only, not the first one. I thought I asked both you and Simon, and repeated again the precise scope in the call for voting. If you want to introduce (I've been very slow recently and accumulated a huge backlog of work here, and I'm sorry for this. I hope to clear up things during the upcoming months) |
Apologies, this was my miscommunication; I floated the idea in a comment but failed to express it in the proposal. I have raised a separate proposal for
Quite alright. You have a lot on your plate and yet manage it admirably. |
(The proposal eventually approved in this thread is #146 (comment) — Bodigrim, Sep 2023)
1. Background
Currently the base package exposes many internal implementation details of base functionality. By "internal implementation details" we mean functions and data types that are part of GHC's realisation of some exposed function, but which were never intended to be directly used by clients of the base library. For instance, the GHC.Base.mapFB function is a necessarily exposed part of the fusion framework for map, but one which GHC's authors never intended users to call.

This lack of clarity is bad for several reasons:
Users have no way to know which functions are part of the "intended, stable API" and which are part of the "internal, implementation details". Consequently, they may accidentally rely on the latter; they simply have no way to tell.
GHC's developers are hampered in modifying the implementation because too much is exposed. This imposes a high backward-compatibility burden, one that is an accident of history.

This status quo leaves much to be desired: users tend to rely on any interface available to them and therefore GHC developers are susceptible to breaking users when changing implementation details within base. On the other hand, there is a clear need to be able to iterate on the implementation of GHC and its base library: fixing compiler bugs may require the introduction of new internal yet exposed primitives (cf. the changes made in the implementation of unsafeCoerce in GHC 9.0) and improving runtime performance may require changes in the types of exposed internal implementation (cf. GHC #22946).

These difficulties are discussed in CLC #105.
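To make the mapFB case concrete, here is a minimal sketch of user code that depends on it. The helper name incrementAll is hypothetical and exists only for illustration; the only thing taken from the text above is that GHC.Base exposes mapFB. The sketch assumes a base version in which GHC.Base still exports mapFB with its current type.

```haskell
module Main (main) where

-- This import compiles only because GHC.Base is an exposed module,
-- even though mapFB was never meant to be called by users.
import GHC.Base (mapFB)

-- Hypothetical helper: increments every element by hand-assembling the
-- kind of fold that the map fusion rules normally produce internally.
incrementAll :: [Int] -> [Int]
incrementAll = foldr (mapFB (:) (+ 1)) []

main :: IO ()
main = print (incrementAll [1, 2, 3])
```

Nothing in the import tells the author of such code that it relies on an implementation detail. If the type of mapFB changed, this program would stop compiling, which is exactly the accidental dependency described above.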
2. Proposal
We propose to classify the modules of base into three groups:
Hidden: these are simply the existing non-exposed modules (other-modules in Cabal terms). No change here.
External: these modules comprise the public API of base (exposed-modules Cabal section).
Internal: these modules are part of the internal implementation of base functions (exposed-modules Cabal section).

As of today, all modules are either Hidden or External; the CLC policy is that the API of all exposed modules is subject to CLC review.
The main payload of this proposal is
2.1 Codifying the Internal vs External split
Our proposal is simply to declare whether a module is Internal or External, using some out-of-band mechanism like a publicly visible list.

However, future reorganizations (notably HF tech proposal #47) might split base into two packages:
ghc-base, all of whose exposed modules are Internal.
base, all of whose exposed modules are External.

That would codify the distinction between Internal and External, which would be a Good Thing. But the burden of this proposal is simply to make that distinction in the first place, and start a dialogue about which modules belong in each category.

Incidentally, the Stability Haddock field of a module is not the same as the Internal vs External distinction. A module could be External (i.e. designed for external callers), and yet experimental and not yet stable. That seems to be the intended purpose of the Stability field, although it is not well described anywhere (please tell us there is a good specification).

We propose to document internal modules via a yet-to-be-named Haddock field.
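For readers unfamiliar with Haddock module headers, the sketch below shows where the existing Stability field sits. The module name GHC.Example.Internal and its contents are hypothetical. Because the proposed field has not been named, the internal status is written here as plain prose in the module description; that wording is an assumption, not the text adopted by the proposal.

```haskell
{- |
Module      : GHC.Example.Internal
Description : Hypothetical module, used only to illustrate the header layout.
Stability   : experimental
Portability : non-portable (GHC extensions)

/Assumed wording, for illustration only:/ this module is part of the
internal implementation of base and its interface may change without notice.
-}
module GHC.Example.Internal (placeholder) where

-- Hypothetical export so that the module has some content.
placeholder :: ()
placeholder = ()
```

Note that Stability says how settled the interface is, while the prose note says who the interface is for. These are the two separate questions that the paragraph above distinguishes.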
2.2 Module by module summary
To make the discussion concrete, we have characterized each of the exposed modules in the GHC.* namespace along three axes:

These findings, along with the stability indicated by the modules' Stability Haddock field, are summarized in this spreadsheet. We then used these assessments to define an action plan (seen in the "Action" column) which will bring us closer to the goal of clearly delineating the stable interface of base. We do not intend to pursue this plan as one atomic change; rather, we intend for this plan to be an aspiration which we will iteratively approach over the course of the coming years, largely driven by the needs of the GHC developers.

The proposed actions fall into a few broad buckets:
In the sections below we will discuss some of the reasoning behind these proposed actions and draw attention to some open questions.
3. The question of GHC.Exts
Historically GHC.Exts has been the primary entry-point for users wanting access to all of the primitives that GHC exposes (e.g. primitive types, operations, and other magic). This widely-used module poses a conundrum since, while many of these details are quite stable (e.g. Int#), a few others truly are exposing implementation details which cannot be safely used in a GHC-version-agnostic way (e.g. mkApUpd0#, unpackClosure#, threadStatus#). There are at least two ways by which this might be addressed:
Int#, Weak#, newArray#, etc.) in GHC.Exts, leaving the rest to only be exposed via GHC.Prim (which should not be used by end-users), or
GHC.Exts to be unstable and export the stable subset from another namespace (e.g. Word# and its operations could be exposed by GHC.Unboxed.Word)

4. Non-normative interfaces
Several interfaces exposed by base intentionally reflect internal details of GHC's implementation and, by their nature, should change to reflect changes in the underlying implementation. Here we call such interfaces "non-normative" as they are defined not by a specification of desired Haskell interfaces but rather by the system that they reflect.

One such module is GHC.Stats, which allows the user to reflect on various statistics about the operation of the runtime system. If the runtime system were to change (e.g. by adding a new phase of garbage collection), users would expect the module to change as well. For this reason, we mark such non-normative interfaces as "internal".
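As an illustration of what a non-normative interface looks like from the user's side, the following sketch reads a few runtime statistics through GHC.Stats. It assumes the RTSStats record fields gcs, major_gcs and max_live_bytes as found in current versions of base. These fields mirror the runtime system's own bookkeeping, so they are the kind of detail that could change if the collector did.

```haskell
module Main (main) where

import GHC.Stats (RTSStats (..), getRTSStats, getRTSStatsEnabled)

main :: IO ()
main = do
  -- Statistics are only collected when the program runs with +RTS -T
  -- (or -s, -S); getRTSStats throws an exception otherwise.
  enabled <- getRTSStatsEnabled
  if enabled
    then do
      stats <- getRTSStats
      putStrLn ("GCs so far:     " ++ show (gcs stats))
      putStrLn ("Major GCs:      " ++ show (major_gcs stats))
      putStrLn ("Max live bytes: " ++ show (max_live_bytes stats))
    else putStrLn "RTS statistics are disabled; rerun with +RTS -T"
```

A program like this is a reasonable consumer of an internal module: it wants to follow the runtime system wherever it goes, and does not expect a stable, specification-driven API.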