diff --git a/documentation/memory/README.md b/documentation/memory/README.md
index d6de2c6..fa4a7d1 100644
--- a/documentation/memory/README.md
+++ b/documentation/memory/README.md
@@ -1,71 +1,3 @@
 # Memory
 
-In this document you can learn how to debug memory-related issues.
-
-- [Memory](#memory)
-  - [My process runs out of memory](#my-process-runs-out-of-memory)
-    - [Symptoms](#symptoms)
-    - [Side Effects](#side-effects)
-    - [Debugging](#debugging)
-  - [My process utilizes memory inefficiently](#my-process-utilizes-memory-inefficiently)
-    - [Symptoms](#symptoms-1)
-    - [Side Effects](#side-effects-1)
-    - [Debugging](#debugging-1)
-
-## My process runs out of memory
-
-Node.js _(JavaScript)_ is a garbage collected language, so memory
-leaks are possible through retainers. As Node.js applications are usually
-multi-tenant, business critical, and long-running, providing an accessible and
-efficient way of finding a memory leak is essential.
-
-### Symptoms
-
-The user observes continuously increasing memory usage _(can be fast or slow,
-over days or even weeks)_, then sees the process crash and get restarted by the
-process manager. The process may be running slower than before, and the
-restarts cause some requests to fail _(load balancer responds with 502)_.
-
-### Side Effects
-
-- The process restarts due to memory exhaustion and requests are dropped on the
-  floor
-- Increased GC activity leads to higher CPU usage and slower response time
-- Increased memory swapping slows down the process
-- There may not be enough available memory to take a Heap Snapshot
-
-### Debugging
-
-To debug a memory issue we need to be able to see how much space each specific
-type of object takes, and which variables retain them and prevent garbage collection.
-For effective debugging we also need to know the allocation pattern of our
-variables over time.
-
-- [Using Heap Profiler](./step1/using_heap_profiler.md)
-- [Using Heap Snapshot](./step2/using_heap_snapshot.md)
-- [GC Traces](./step3/using_gc_traces.md)
-- [Native Tools](./step4/using_native_tools.md)
-
-## My process utilizes memory inefficiently
-
-### Symptoms
-
-The application uses an unexpected amount of memory and/or we observe elevated
-garbage collector activity.
-
-### Side Effects
-
-- An elevated number of page faults
-- Higher GC activity and CPU usage
-
-### Debugging
-
-To debug a memory issue we need to be able to see how much space each specific
-type of object takes, and which variables retain them and prevent garbage collection.
-For effective debugging we also need to know the allocation pattern of our
-variables over time.
-
-- [Using Heap Profiler](./step1/using_heap_profiler.md)
-- [Using Heap Snapshot](./step2/using_heap_snapshot.md)
-- [GC Traces](./step3/using_gc_traces.md)
-- [Native Tools](./step4/using_native_tools.md)
\ No newline at end of file
+See: https://github.com/nodejs/nodejs.org/tree/main/locale/en/docs/guides/diagnostics/memory
diff --git a/documentation/memory/case_study.md b/documentation/memory/case_study.md
deleted file mode 100644
index 8ad1dfb..0000000
--- a/documentation/memory/case_study.md
+++ /dev/null
@@ -1 +0,0 @@
-//TODO
diff --git a/documentation/memory/investigation_flow.md b/documentation/memory/investigation_flow.md
deleted file mode 100644
index 8ad1dfb..0000000
--- a/documentation/memory/investigation_flow.md
+++ /dev/null
@@ -1 +0,0 @@
-//TODO
diff --git a/documentation/memory/setup.md b/documentation/memory/setup.md
deleted file mode 100644
index 8ad1dfb..0000000
--- a/documentation/memory/setup.md
+++ /dev/null
@@ -1 +0,0 @@
-//TODO
diff --git a/documentation/memory/step1/using_heap_profiler.md b/documentation/memory/step1/using_heap_profiler.md
deleted file mode 100644
index 7aa6ebf..0000000
--- a/documentation/memory/step1/using_heap_profiler.md
+++ /dev/null
@@ -1,88 +0,0 @@
-# Using Heap Profiler
-
-To debug a memory issue we need to be able to see how much space each specific type of object takes, and which variables retain them and prevent garbage collection. For effective debugging we also need to know the allocation pattern of our variables over time.
-
-The heap profiler runs on top of V8 to capture memory allocations over time. In this document, we will cover memory profiling using:
-
-1. Allocation Timeline
-2. Sampling Heap Profiler
-
-Unlike the heap dump covered in [step 2](../step2/using_heap_snapshot.md), the idea of real-time profiling is to understand allocations in a given time frame.
-
-## Heap Profiler - Allocation Timeline
-
-Heap Profiler is similar to the Sampling Heap Profiler, except it tracks every allocation. It has
-higher overhead than the Sampling Heap Profiler, so it is not recommended for use in production.
-
-> You can use [@mmarchini/observe](https://www.npmjs.com/package/@mmarchini/observe) to do it programmatically.
-
-### How To
-
-Start the application:
-
-```console
-node --inspect index.js
-```
-
-> `--inspect-brk` is a better choice for scripts.
-
-Connect to the dev-tools instance and then:
-
-- Select the `Memory` tab
-- Select `Allocation instrumentation timeline`
-- Start profiling
-
-![image](https://user-images.githubusercontent.com/26234614/136712329-ac9fc581-af2b-4a94-8849-b959ebea0a59.png)
-
-Once heap profiling is running, it is strongly recommended to generate load in order to identify memory issues. For this example, we will use `Apache Benchmark` to produce load on the application.
-
-> In this example, we assume the application being profiled is a web application.
-
-```console
-ab -n 1000 -c 5 http://localhost:3000
-```
-
-Then press the stop button when the expected load is complete.
-
-![image](https://user-images.githubusercontent.com/26234614/136714198-867632e0-2417-4336-9e6c-828fcf5be6b7.png)
-
-Then examine the memory allocations in the snapshot data.
-
-![image](https://user-images.githubusercontent.com/26234614/136846720-65bf7073-eddc-4afd-9753-e21ef75e0243.png)
-
-Check the [useful links](#useful-links) section for further information about memory terminology.
-
-## Sampling Heap Profiler
-
-Sampling Heap Profiler tracks the memory allocation pattern and reserved space over time. As it is
-sampling based, it has a low enough overhead to be used in production systems.
-
-> You can use the module [`heap-profile`](https://www.npmjs.com/package/heap-profile) to do it programmatically.
-
-### How To
-
-Start the application:
-
-```console
-node --inspect index.js
-```
-
-> `--inspect-brk` is a better choice for scripts.
-
-Connect to the dev-tools instance and then:
-
-- Select the `Memory` tab
-- Select `Allocation sampling`
-- Start profiling
-
-![image](https://user-images.githubusercontent.com/26234614/136847038-1cb6dfd4-26d4-4e2a-8dd1-8d74c2151360.png)
-
-Produce some load and stop the profiler. It generates a summary of allocations grouped by stack trace, so you can look up the functions with the most heap allocations in a timespan. See the example below:
-
-![image](https://user-images.githubusercontent.com/26234614/136849337-1dd4c46e-b479-48a8-a995-422bb3f17f56.png)
-
-## Useful Links
-
-- https://developer.chrome.com/docs/devtools/memory-problems/memory-101/
-- https://github.com/v8/sampling-heap-profiler
-- https://developer.chrome.com/docs/devtools/memory-problems/allocation-profiler/
diff --git a/documentation/memory/step2/_cursor.png b/documentation/memory/step2/_cursor.png
deleted file mode 100644
index b2515e5..0000000
Binary files a/documentation/memory/step2/_cursor.png and /dev/null differ
diff --git a/documentation/memory/step2/compare.png b/documentation/memory/step2/compare.png
deleted file mode 100644
index d8ca934..0000000
Binary files a/documentation/memory/step2/compare.png and /dev/null differ
diff --git a/documentation/memory/step2/load-snapshot.png
b/documentation/memory/step2/load-snapshot.png
deleted file mode 100644
index 73c8605..0000000
Binary files a/documentation/memory/step2/load-snapshot.png and /dev/null differ
diff --git a/documentation/memory/step2/snapshot.png b/documentation/memory/step2/snapshot.png
deleted file mode 100644
index 43ac686..0000000
Binary files a/documentation/memory/step2/snapshot.png and /dev/null differ
diff --git a/documentation/memory/step2/tools.png b/documentation/memory/step2/tools.png
deleted file mode 100644
index 0238020..0000000
Binary files a/documentation/memory/step2/tools.png and /dev/null differ
diff --git a/documentation/memory/step2/using_heap_snapshot.md b/documentation/memory/step2/using_heap_snapshot.md
deleted file mode 100644
index 8803d2d..0000000
--- a/documentation/memory/step2/using_heap_snapshot.md
+++ /dev/null
@@ -1,128 +0,0 @@
-# Using Heap Snapshot
-
-You can take a Heap Snapshot from your running application and load it into
-Chrome Developer Tools to inspect certain variables or check retainer size. You
-can also compare multiple snapshots to see differences over time.
-
-## Warning
-
-To create a snapshot, all other work in your main thread is stopped. Depending on the heap contents, it could even take more than a minute.
-The snapshot is built in memory, so it can double the heap size, which can fill up the entire memory and crash the app.
-
-If you're going to take a heap snapshot in production, make sure the process you're taking it from can crash without impacting your application's availability.
-
-## How To
-
-### Get the Heap Snapshot
-
-1. via inspector
-2. via external signal and command-line flag
-3. via a writeHeapSnapshot call within the process
-4. via inspector protocol
-
-#### 1. Use memory profiling in inspector
-
-> Works in all actively maintained versions of Node.js
-
-Run node with the `--inspect` flag. Open the inspector.
-![open inspector](./tools.png)
-
-The simplest way to get a Heap Snapshot is to connect an inspector to your process running locally, go to the Memory tab, and choose to take a heap snapshot.
-
-![take a heap snapshot](./snapshot.png)
-
-#### 2. Use `--heapsnapshot-signal` flag
-
-> Works in v12.0.0 or later
-
-You can start node with a command-line flag that makes the process create a heap snapshot in reaction to a signal.
-
-```
-$ node --heapsnapshot-signal=SIGUSR2 index.js
-```
-
-```
-$ ps aux
-USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND
-node 1 5.5 6.1 787252 247004 ? Ssl 16:43 0:02 node --heapsnapshot-signal=SIGUSR2 index.js
-$ kill -USR2 1
-$ ls
-Heap.20190718.133405.15554.0.001.heapsnapshot
-```
-
-For details, see the latest documentation of the [heapsnapshot-signal flag](https://nodejs.org/api/cli.html#--heapsnapshot-signalsignal).
-
-
-#### 3. Use `writeHeapSnapshot` function
-
-> Works in v11.13.0 or later
-> Can work in older versions with [heapdump package](https://www.npmjs.com/package/heapdump)
-
-If you need a snapshot from a working process, like an application running on a server, you can implement getting it using:
-
-```js
-require('v8').writeHeapSnapshot()
-```
-
-Check the [writeHeapSnapshot docs](https://nodejs.org/api/v8.html#v8_v8_writeheapsnapshot_filename) for file name options.
-
-You need a way to invoke it without stopping the process, so calling it in an HTTP handler or in reaction to a signal from the operating system is advised.
-Be careful not to expose the HTTP endpoint that triggers a snapshot. It should not be possible for anybody else to access it.
-
-For versions of Node.js before v11.13.0 you can use the [heapdump package](https://www.npmjs.com/package/heapdump).
-
-#### 4. Trigger Heap Snapshot using inspector protocol
-
-The inspector protocol can be used to trigger a Heap Snapshot from outside of the process.
-
-It's not necessary to run the actual inspector from Chromium to use the API.
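
The same protocol messages can also be posted from inside the process through the built-in `inspector` module. The following is a minimal sketch, not part of the original guide: the helper name `takeSnapshotViaInspector` and the output file name are hypothetical, and the warning about snapshot cost applies here as well.

```javascript
const fs = require('fs');
const inspector = require('inspector');

// Hypothetical helper: writes a heap snapshot to `file` using inspector
// protocol messages, then calls `done(err)`.
function takeSnapshotViaInspector(file, done) {
  const session = new inspector.Session();
  const fd = fs.openSync(file, 'w');

  session.connect();

  // The snapshot is delivered as a series of string chunks.
  session.on('HeapProfiler.addHeapSnapshotChunk', (message) => {
    fs.writeSync(fd, message.params.chunk);
  });

  // The callback runs once the snapshot has been fully delivered.
  session.post('HeapProfiler.takeHeapSnapshot', null, (err) => {
    session.disconnect();
    fs.closeSync(fd);
    done(err);
  });
}
```

A call such as `takeSnapshotViaInspector('out.heapsnapshot', (err) => { if (err) throw err; })` would typically be wired to a signal handler or a protected endpoint rather than run unconditionally.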
-
-Here's an example snapshot trigger in bash, using `websocat` and `jq`:
-
-```bash
-#!/bin/bash
-set -e
-
-kill -USR1 "$1"
-rm -f fifo out
-mkfifo ./fifo
-websocat -B 10000000000 "$(curl -s http://localhost:9229/json | jq -r '.[0].webSocketDebuggerUrl')" < ./fifo > ./out &
-exec 3>./fifo
-echo '{"method": "HeapProfiler.enable", "id": 1}' > ./fifo
-echo '{"method": "HeapProfiler.takeHeapSnapshot", "id": 2}' > ./fifo
-while jq -e "[.id != 2, .result != {}] | all" < <(tail -n 1 ./out); do
-  sleep 1s
-  echo "Capturing Heap Snapshot..."
-done
-
-echo -n "" > ./out.heapsnapshot
-while read -r line; do
-  f="$(echo "$line" | jq -r '.params.chunk')"
-  echo -n "$f" >> out.heapsnapshot
-  i=$((i+1))
-done < <(cat out | tail -n +2 | head -n -1)
-
-exec 3>&-
-```
-
-A non-exhaustive list of memory profiling tools usable with the inspector protocol:
-
-- [OpenProfiling for Node.js](https://github.com/vmarchaud/openprofiling-node)
-
-
-## How to find a memory leak with Heap Snapshots
-
-To find a memory leak, one compares two snapshots. It's important to make sure the snapshot diff doesn't contain unnecessary information. The following steps should produce a clean diff between snapshots.
-
-1. Let the process load all sources and finish bootstrapping. It should take a few seconds at most.
-1. Start using the functionality you suspect of leaking memory. It's likely it makes some initial allocations that are not the leaking ones.
-1. Take one heap snapshot.
-1. Continue using the functionality for a while, preferably without running anything else in between.
-1. Take another heap snapshot. The difference between the two should mostly contain what was leaking.
-1. Open Chromium/Chrome dev tools and go to the *Memory* tab.
-1. Load the older snapshot file first and the newer one second. ![Load button in tools](./load-snapshot.png)
-1. Select the newer snapshot and switch the mode in the dropdown at the top from *Summary* to *Comparison*. ![Comparison dropdown](./compare.png)
-1.
Look for large positive deltas and explore the references that caused them in the bottom panel.
-
-
-Practice capturing heap snapshots and finding memory leaks with [a heap snapshot exercise](https://github.com/naugtur/node-example-heapdump)
diff --git a/documentation/memory/step3/using_gc_traces.md b/documentation/memory/step3/using_gc_traces.md
deleted file mode 100644
index 80ffbde..0000000
--- a/documentation/memory/step3/using_gc_traces.md
+++ /dev/null
@@ -1,189 +0,0 @@
-# Tracing garbage collection
-
-There's a lot to learn about how the garbage collector works, but if you learn one thing it's that when GC is running, your code is not.
-
-You may want to know how often and how long the garbage collection is running.
-
-## Running with garbage collection traces
-You can see traces for garbage collection in the console output of your process using the `--trace_gc` flag.
-
-```
-node --trace_gc app.js
-```
-
-You might want to avoid getting traces from the entire lifetime of your process running on a server. In that case, set the flag from within the process.
-
-Here's how to print GC events to stdout for one minute.
-```js
-const v8 = require('v8');
-v8.setFlagsFromString('--trace_gc');
-setTimeout(() => { v8.setFlagsFromString('--notrace_gc'); }, 60e3);
-```
-
-### Examining a trace with `--trace_gc`
-
-The obtained garbage collection traces look like the following lines.
-
-```
-[19278:0x5408db0] 44 ms: Scavenge 2.3 (3.0) -> 1.9 (4.0) MB, 1.2 / 0.0 ms (average mu = 1.000, current mu = 1.000) allocation failure
-
-[23521:0x10268b000] 120 ms: Mark-sweep 100.7 (122.7) -> 100.6 (122.7) MB, 0.15 / 0.0 ms (average mu = 0.132, current mu = 0.137) deserialize GC in old space requested
-```
-
-This is how to interpret the trace data (for the second line):
-
-| Token value | Interpretation |
-| --- | --- |
-| 23521 | PID of the running process |
-| 0x10268b000 | Isolate (JS heap instance) |
-| 120 | Time since the process start in ms |
-| Mark-sweep | Type / Phase of GC |
-| 100.7 | Heap used before GC in MB |
-| 122.7 | Total heap before GC in MB |
-| 100.6 | Heap used after GC in MB |
-| 122.7 | Total heap after GC in MB |
-| 0.15 / 0.0 (average mu = 0.132, current mu = 0.137) | Time spent in GC in ms |
-| deserialize GC in old space requested | Reason for GC |
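
The interpretation above can be turned into a small parser when traces need to be processed in bulk. This is an illustrative sketch only: `parseTraceGcLine` and its field names are hypothetical, and the regular expression targets just the line layout shown in this document, so it may need adjusting for other V8 versions.

```javascript
// Matches the trace layout shown above:
// [pid:isolate] <ms> ms: <type> <used> (<total>) -> <used> (<total>) MB,
// <gc time> / <n> ms (<mu stats>) <reason>
const TRACE_GC_LINE =
  /^\[(\d+):(0x[0-9a-f]+)\]\s+(\d+) ms: ([\w-]+) ([\d.]+) \(([\d.]+)\) -> ([\d.]+) \(([\d.]+)\) MB, ([\d.]+) \/ [\d.]+ ms\s+(?:\(.*?\)\s+)?(.*)$/;

// Hypothetical helper: returns the parsed fields, or null for lines that do
// not look like a GC trace.
function parseTraceGcLine(line) {
  const m = TRACE_GC_LINE.exec(line.trim());
  if (!m) return null;
  return {
    pid: Number(m[1]),
    isolate: m[2],
    sinceStartMs: Number(m[3]),
    type: m[4],
    usedBeforeMb: Number(m[5]),
    totalBeforeMb: Number(m[6]),
    usedAfterMb: Number(m[7]),
    totalAfterMb: Number(m[8]),
    gcTimeMs: Number(m[9]),
    reason: m[10],
  };
}
```

Feeding each line of the process output through `parseTraceGcLine` and ignoring `null` results gives a list of records that can be sorted or charted.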
-
-## Using performance hooks to trace garbage collection
-
-For Node.js v8.5.0 or later, you can use [performance hooks](https://nodejs.org/api/perf_hooks.html) to trace garbage collection.
-
-```js
-const { PerformanceObserver } = require('perf_hooks');
-
-// Create a performance observer
-const obs = new PerformanceObserver((list) => {
-  const entry = list.getEntries()[0];
-  /*
-  The entry would be an instance of PerformanceEntry containing
-  metrics of garbage collection.
-  For example:
-  PerformanceEntry {
-    name: 'gc',
-    entryType: 'gc',
-    startTime: 2820.567669,
-    duration: 1.315709,
-    kind: 1
-  }
-  */
-});
-
-// Subscribe to GC notifications
-obs.observe({ entryTypes: ['gc'] });
-
-// Stop the subscription
-obs.disconnect();
-```
-
-### Examining a trace with performance hooks
-
-You can get GC statistics as a [PerformanceEntry](https://nodejs.org/api/perf_hooks.html#perf_hooks_class_performanceentry) from the callback in [PerformanceObserver](https://nodejs.org/api/perf_hooks.html#perf_hooks_class_performanceobserver).
-
-For example:
-
-```
-PerformanceEntry {
-  name: 'gc',
-  entryType: 'gc',
-  startTime: 2820.567669,
-  duration: 1.315709,
-  kind: 1
-}
-```
-
-| Property | Interpretation |
-| --- | --- |
-| name | The name of the performance entry. |
-| entryType | The type of the performance entry. |
-| startTime | The high resolution millisecond timestamp marking the starting time of the Performance Entry. |
-| duration | The total number of milliseconds elapsed for this entry. |
-| kind | The type of garbage collection operation that occurred. |
-| flags | Additional information about garbage collection operation. |
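
To make the numeric `kind` readable, it can be mapped to the constants exported by `perf_hooks`. This is a hedged sketch: `describeGcEntry` and the label strings are hypothetical, and reading `kind` from either `entry.detail` or the entry itself is an assumption meant to cover different Node.js versions.

```javascript
const { PerformanceObserver, constants } = require('perf_hooks');

// Labels are illustrative; the keys are the constants exported by perf_hooks,
// which avoids hard-coding numeric values.
const GC_KIND_NAMES = new Map([
  [constants.NODE_PERFORMANCE_GC_MINOR, 'minor'],
  [constants.NODE_PERFORMANCE_GC_MAJOR, 'major'],
  [constants.NODE_PERFORMANCE_GC_INCREMENTAL, 'incremental'],
  [constants.NODE_PERFORMANCE_GC_WEAKCB, 'weak callbacks'],
]);

// Hypothetical helper: formats one 'gc' performance entry as a single line.
function describeGcEntry(entry) {
  // Assumption: `kind` lives under `entry.detail` when that object exists,
  // and directly on the entry otherwise.
  const kind = entry.detail ? entry.detail.kind : entry.kind;
  const label = GC_KIND_NAMES.get(kind) || `unknown (${kind})`;
  return `${label} GC at ${entry.startTime.toFixed(1)} ms took ${entry.duration.toFixed(3)} ms`;
}

// Log every GC entry that the observer receives.
const obs = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    console.log(describeGcEntry(entry));
  }
});
obs.observe({ entryTypes: ['gc'] });
```

Call `obs.disconnect()` once enough entries have been collected.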
-
-For more information, you can refer to [the documentation about performance hooks](https://nodejs.org/api/perf_hooks.html).
-
-## Examples of diagnosing memory issues with the trace option
-
-A. How to get context of bad allocations
-  1. Suppose we observe that the old space is continuously increasing.
-  2. Due to heavy GC, the heap limit is not hit, but the process is slow.
-  3. Review the trace data and figure out how large the total heap is before and after the GC.
-  4. Reduce `--max-old-space-size` such that the total heap is closer to the limit.
-  5. Allow the program to run until it hits the out-of-memory condition.
-  6. The produced log shows the failing context.
-
-B. How to assert whether there is a memory leak when heap growth is observed
-  1. Suppose we observe that the old space is continuously increasing.
-  2. Due to heavy GC, the heap limit is not hit, but the process is slow.
-  3. Review the trace data and figure out how large the total heap is before and after the GC.
-  4. Reduce `--max-old-space-size` such that the total heap is closer to the limit.
-  5. Allow the program to run and see if it runs out of memory.
-  6. If it hits OOM, increase the heap size by ~10% or so and repeat a few times. If the same pattern is observed, it is indicative of a memory leak.
-  7. If there is no OOM, then freeze the heap size at that value. A packed heap reduces memory footprint and compaction latency.
-
-C. How to assert whether too many GCs are happening or too many GCs are causing an overhead
-  1. Review the trace data, specifically the time between consecutive GCs.
-  2. Review the trace data, specifically the time spent in GC.
-  3. If the time between two GCs is less than the time spent in GC, the application is severely starving.
-  4. If the time between two GCs and the time spent in GC are both very high, the application can probably use a smaller heap.
-  5. If the time between two GCs is much greater than the time spent in GC, the application is relatively healthy.
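
The comparison in example C can be sketched in code. Everything here is illustrative: `summarizeGc` and `assessGc` are hypothetical helpers, the input is assumed to be a list of `{ startTime, duration }` samples in milliseconds (for instance, collected from `gc` performance entries), and the factor of 10 used for "much greater" is an arbitrary assumption rather than a documented threshold.

```javascript
// Computes the average GC duration and the average gap between the end of
// one GC and the start of the next.
function summarizeGc(samples) {
  const sorted = [...samples].sort((a, b) => a.startTime - b.startTime);
  let gcTime = 0;
  let gapTime = 0;
  for (let i = 0; i < sorted.length; i++) {
    gcTime += sorted[i].duration;
    if (i > 0) {
      const previousEnd = sorted[i - 1].startTime + sorted[i - 1].duration;
      gapTime += Math.max(0, sorted[i].startTime - previousEnd);
    }
  }
  const gaps = sorted.length - 1;
  return {
    count: sorted.length,
    avgGcMs: sorted.length > 0 ? gcTime / sorted.length : 0,
    avgGapMs: gaps > 0 ? gapTime / gaps : 0,
  };
}

// Applies the rules of thumb from example C.
function assessGc(samples, healthyFactor = 10) {
  const { count, avgGcMs, avgGapMs } = summarizeGc(samples);
  if (count < 2) return 'not enough data';
  if (avgGapMs < avgGcMs) return 'starving';
  if (avgGapMs > healthyFactor * avgGcMs) return 'relatively healthy';
  return 'inconclusive';
}
```

An `inconclusive` result means the samples fall between the two rules, so a longer observation window is probably needed.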
-
diff --git a/documentation/memory/step4/using_native_tools.md b/documentation/memory/step4/using_native_tools.md
deleted file mode 100644
index a58bb5c..0000000
--- a/documentation/memory/step4/using_native_tools.md
+++ /dev/null
@@ -1,14 +0,0 @@
-# Investigating native memory leaks
-
-While JavaScript is a garbage collected language, memory is also allocated natively
-in C/C++ code. These allocations may occur in the Node.js source code, addons or
-in the Node.js dependencies (for example V8).
-
-[Valgrind](https://valgrind.org/) is a tool that can be used to instrument an
-application such that you get reports on memory usage, including potential memory
-leaks, once the application terminates. Valgrind supports Linux as well as some Unix-like
-variants including macOS.
-
-We recommend using Valgrind to investigate native memory leaks. More
-detailed information on how to use it with Node.js is available in
-[Investigating Memory Leaks with valgrind](https://github.com/nodejs/node/blob/master/doc/contributing/investigating_native_memory_leak.md)