runtime: Additional Allocator Metrics in runtime.MemStats
#11890
/cc @RLH @aclements
Go seems to have exposed the internal implementation concept of a span in MemStats, but not in any meaningful way, such as indicating the size of a span. This is good, since it allows future implementations the freedom to change the size of a span. Go should preserve this freedom. This proposal seems to imply that one will need to know the size of a span for it to be useful. Is that true?
Yes, that is true. The core reason is that the size class is a useful hint about the size of the individual objects involved in a measurement; for instance, it helps differentiate a low volume of large-object churn from a high volume of small-object churn. As for hiding the implementation details of the runtime, let me offer a few remarks:
I think there is a middle ground; I'm just not sure what it is.
Details aside, there's been some discussion of the philosophy of MemStats over on #10323. I agree that we have to expose implementation details for it to be truly useful, but we've already made the mistake of exposing details in MemStats that are no longer really relevant, yet are covered under the Go 1 compatibility promise.
Yep, that bug had been the impetus for filing this one (just get the need …). Is it possible to, for instance, keep the legacy fields in the struct and …
The current HeapSys/HeapAlloc/HeapInuse/HeapIdle/HeapReleased fields allow answering questions 1, 2, and 3 (minus the by-size-class part, which is more of an implementation detail; and when you think about RSS, size classes are more or less irrelevant).
Kindly ping @matttproud 😉 |
The original motivation was spelled out in the top-level filing, but in the interim, canary analysis (CAS) has been explicated in the public domain:
https://research.google/pubs/pub46908/
The motivation is to determine whether a release regresses in terms of resource efficiency at runtime. Knowing information about size-class activity and liveness has been useful in determining whether design assumptions about memory lifetime are correct, which matters for scalable, low-latency server design. I materially needed this when building Prometheus, which had multi-modal memory lifetimes. In particular: is my server using memory in a way that will promote heap fragmentation in a containerized environment where an out-of-memory (OOM) killer will terminate it unceremoniously?
A lot of these are now doable with …
Hi, I am wondering if it would be amenable to include several additional core metrics in runtime.MemStats, namely the following measures:

1. No. of spans pending release, by size class.
   This helps server authors understand the discrepancy between the reported heap reservation/allocation and the process RSS.
2. No. of live spans (with active allocations contained therein), by size class.
   An essential corollary to no. 1.
3. Sum of span occupancy, by size class.
   This helps server and runtime authors understand the level of span reuse and divine potential problems with heap fragmentation vis-à-vis no. 2.
4. Sum of span allocations, by size class (not of inner object allocations).
   This helps server authors understand the aggregate throughput of memory flow in real time, a measure of efficiency that is useful when comparing releases and in automated canary tests.
5. Sum of span releases, by size class (not of inner object allocations).
   A useful corollary to no. 4.
6. Summary statistics about span age, by size class: min, median, average, max.
   The throughput in nos. 4 and 5 is useful, but this takes the detail a level deeper.
7. Cumulative sums of individual allocations made for each span size class.
   A useful throughput measure for individual allocations.
These would be inordinately beneficial in gross measurement of fleet memory allocator performance as well as offer server authors deeper telemetric insights into the lifecycles of their allocations. pprof is great for one-off diagnosis but not real-time analytics of the fleet.
I would be happy to come to a compromise on these, especially to refine the language and requirements, and possibly to volunteer time implementing these representations should we come to an agreement.