There are many third-party tools available for profiling Node.js applications but, in many cases, the easiest option is to use the Node.js built-in profiler. The built-in profiler uses the [profiler inside V8](https://developers.google.com/v8/profiler_example), which samples the stack at regular intervals during program execution. It records the results of these samples, along with important optimization events such as JIT compiles, as a series of ticks.
In the past, you needed the V8 source code to be able to interpret the ticks. Luckily, tools have recently been introduced into Node.js 4.1.1 that facilitate the consumption of this information without separately building V8 from source. Let's see how the built-in profiler can help provide insight into application performance.
To illustrate the use of the tick profiler, we will work with a simple Express application. Our application will have two handlers, one for adding new users to our system:
```javascript
app.get('/newUser', function (req, res) {
  var username = req.query.username || '';
  var password = req.query.password || '';

  username = username.replace(/[!@#$%^&*]/g, '');

  if (!username || !password || users[username]) {
    return res.sendStatus(400);
  }

  var salt = crypto.randomBytes(128).toString('base64');
  var hash = crypto.pbkdf2Sync(password, salt, 10000, 512);

  users[username] = {
    salt: salt,
    hash: hash
  };

  res.sendStatus(200);
});
```
and another for validating user authentication attempts:
```javascript
app.get('/auth', function (req, res) {
  var username = req.query.username || '';
  var password = req.query.password || '';

  username = username.replace(/[!@#$%^&*]/g, '');

  if (!username || !password || !users[username]) {
    return res.sendStatus(400);
  }

  var hash = crypto.pbkdf2Sync(password, users[username].salt, 10000, 512);

  if (users[username].hash.toString() === hash.toString()) {
    res.sendStatus(200);
  } else {
    res.sendStatus(401);
  }
});
```
*Please note that these are NOT recommended handlers for authenticating users in your Node.js applications and are used purely for illustration purposes. You should not be trying to design your own cryptographic authentication mechanisms in general. It is much better to use existing, proven authentication solutions.*
Now assume that we've deployed our application and users are complaining about high latency on requests. We can easily run the app with the built-in profiler:
```
NODE_ENV=production node --prof app.js
```
and put some load on the server using ab:
```
curl -X GET "http://localhost:8080/newUser?username=matt&password=password"
ab -k -c 20 -n 250 "http://localhost:8080/auth?username=matt&password=password"
```
and get an ab output of:
```
Concurrency Level:      20
Time taken for tests:   46.932 seconds
...
Time per request:       187.728 [ms] (mean, across all concurrent requests)
Transfer rate:          1.05 [Kbytes/sec] received

...

Percentage of the requests served within a certain time (ms)
  50%   3755
  66%   3804
...
  99%   3875
 100%   4225 (longest request)
```
From this output, we see that we're only managing to serve about 5 requests per second and that the average request takes just under 4 seconds round trip. In a real world example, we could be doing lots of work in many functions on behalf of a user request but even in our simple example, time could be lost compiling regular expressions, generating random salts, generating unique hashes from user passwords, or inside the Express framework itself.
Since we ran our application using the `--prof` option, a tick file was generated in the same directory as your local run of the application. It should have the form `isolate-0x124353456789-v8.log`. In order to make sense of this file, we need to use the tick processor included in the Node.js source at `<nodejs_dir>/tools/v8-prof/tick-processor.js`. It is important that the version of the tick processor you run comes from the same version of the Node.js source as the version of Node.js used to generate the isolate file. This will no longer be a concern when the tick processor is [installed by default](https://github.com/nodejs/node/pull/3032). The raw tick output can be processed using this tool by running:
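Based on the paths and filenames above, the invocation would look something like this (the hex portion of the isolate filename varies per run, and `<nodejs_dir>` stands in for wherever your matching Node.js source checkout lives):

```
node <nodejs_dir>/tools/v8-prof/tick-processor.js isolate-0x124353456789-v8.log > processed.txt
```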
Opening processed.txt in your favorite text editor will give you a few different types of information. The file is broken up into sections which are again broken up by language. First, we look at the summary section and see:
```
...
   767    2.0%          Shared libraries
   215    0.6%          Unaccounted
```
This tells us that 97% of all samples gathered occurred in C++ code and that, when viewing other sections of the processed output, we should pay most attention to work being done in C++ (as opposed to JavaScript). With this in mind, we next find the [C++] section, which contains information about which C++ functions are taking the most CPU time, and see:
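The [C++] section is not reproduced here in full; reconstructed from the percentages and function names cited in the next paragraph, its top entries look roughly like this (raw tick counts omitted, and the fully qualified form of the PBKDF2 symbol is illustrative):

```
 ticks  total  name
   ...  51.8%  node::crypto::PBKDF2(v8::FunctionCallbackInfo<v8::Value> const&)
   ...  11.9%  _sha1_block_data_order
   ...   8.4%  _malloc_zone_malloc
```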
We see that the top 3 entries account for 72.1% of CPU time taken by the program. From this output, we immediately see that at least 51.8% of CPU time is taken up by a function called PBKDF2 which corresponds to our hash generation from a user's password. However, it may not be immediately obvious how the lower two entries factor into our application (or if it is we will pretend otherwise for the sake of example). To better understand the relationship between these functions, we will next look at the [Bottom up (heavy) profile] section which provides information about the primary callers of each function. Examining this section, we find:
Parsing this section takes a little more work than the raw tick counts above. Within each of the "call stacks" above, the percentage in the parent column tells you the percentage of samples for which the function in the row above was called by the function in the current row. For example, in the middle "call stack" above for _sha1_block_data_order, we see that _sha1_block_data_order occurred in 11.9% of samples, which we knew from the raw counts above. However, here, we can also tell that it was always called by the pbkdf2 function inside the Node.js crypto module. We see that similarly, _malloc_zone_malloc was called almost exclusively by the same pbkdf2 function. Thus, using the information in this view, we can tell that our hash computation from the user's password accounts not only for the 51.8% from above but also for all CPU time in the top 3 most sampled functions since the calls to _sha1_block_data_order and _malloc_zone_malloc were made on behalf of the pbkdf2 function.
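As a rough sketch of the layout being described, the middle "call stack" has this shape (tick counts omitted, and the exact form of the `pbkdf2` caller frame is illustrative):

```
 ticks parent  name
   ...  11.9%  _sha1_block_data_order
   ... 100.0%    LazyCompile: *pbkdf2 crypto.js
```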
At this point, it is very clear that the password-based hash generation should be the target of our optimization. Thankfully, you've fully internalized the benefits of [asynchronous programming](https://nodesource.com/blog/why-asynchronous) and you realize that the work to generate a hash from the user's password is being done in a synchronous way, tying down the event loop. This prevents us from working on other incoming requests while computing a hash.
To remedy this issue, you make a small modification to the above handlers to use the asynchronous version of the pbkdf2 function:
```javascript
app.get('/auth', function (req, res) {
  var username = req.query.username || '';
  var password = req.query.password || '';

  username = username.replace(/[!@#$%^&*]/g, '');

  if (!username || !password || !users[username]) {
    return res.sendStatus(400);
  }

  crypto.pbkdf2(password, users[username].salt, 10000, 512, function (err, hash) {
    if (users[username].hash.toString() === hash.toString()) {
      res.sendStatus(200);
    } else {
      res.sendStatus(401);
    }
  });
});
```
A new run of the ab benchmark above with the asynchronous version of your app yields:
```
Concurrency Level:      20
Time taken for tests:   12.846 seconds
...
Requests per second:    19.46 [#/sec] (mean)
Time per request:       1027.689 [ms] (mean)
Time per request:       51.384 [ms] (mean, across all concurrent requests)
Transfer rate:          3.82 [Kbytes/sec] received

...

Percentage of the requests served within a certain time (ms)
  50%   1018
  66%   1035
...
  99%   1071
 100%   1079 (longest request)
```
Yay! Your app is now serving about 20 requests per second, roughly four times more than it was with the synchronous hash generation. Additionally, the average latency is down from the roughly 4 seconds before to just over 1 second.
Hopefully, through the performance investigation of this (admittedly contrived) example, you've seen how the V8 tick processor can help you gain a better understanding of the performance of your Node.js applications.