You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: README.md
+24-4Lines changed: 24 additions & 4 deletions
Original file line number
Diff line number
Diff line change
@@ -59,6 +59,12 @@ List of supported problem daemons:
59
59
|[KernelMonitor](https://github.com/kubernetes/node-problem-detector/blob/master/config/kernel-monitor.json)| KernelDeadlock | A system log monitor monitors kernel log and reports problem according to predefined rules. |
60
60
|[AbrtAdaptor](https://github.com/kubernetes/node-problem-detector/blob/master/config/abrt-adaptor.json)| None | Monitor ABRT log messages and report them further. ABRT (Automatic Bug Report Tool) is health monitoring daemon able to catch kernel problems as well as application crashes of various kinds occurred on the host. For more information visit the [link](https://github.com/abrt). |
61
61
|[CustomPluginMonitor](https://github.com/kubernetes/node-problem-detector/blob/master/config/custom-plugin-monitor.json)| On-demand(According to users configuration) | A custom plugin monitor for node-problem-detector to invoke and check various node problems with user defined check scripts. See proposal [here](https://docs.google.com/document/d/1jK_5YloSYtboj-DtfjmYKxfNnUxCAvohLnsH5aGCAYQ/edit#). |
62
+
|[SystemStatsMonitor](https://github.com/kubernetes/node-problem-detector/blob/master/config/system-stats-monitor.json)| None(Could be added in the future) | A system stats monitor for node-problem-detector to collect various health-related system stats as metrics. See proposal [here](https://docs.google.com/document/d/1SeaUz6kBavI283Dq8GBpoEUDrHA2a795xtw0OvjM568/edit). |
63
+
64
+
# Exporter
65
+
66
+
An exporter is a component of node-problem-detector. It reports node problems and/or metrics to
67
+
certain back end (e.g. Kubernetes API server, or Prometheus scrape endpoint).
62
68
63
69
# Usage
64
70
@@ -67,16 +73,21 @@ List of supported problem daemons:
67
73
*`--version`: Print current version of node-problem-detector.
68
74
*`--address`: The address to bind the node problem detector server.
69
75
*`--port`: The port to bind the node problem detector server. Use 0 to disable.
70
-
*`--system-log-monitors`: List of paths to system log monitor configuration files, comma separated, e.g.
76
+
*`--config.system-log-monitor`: List of paths to system log monitor configuration files, comma separated, e.g.
flag of [Heapster](https://github.com/kubernetes/heapster).
82
93
For example, to run without auth, use the following config:
@@ -85,6 +96,14 @@ For example, to run without auth, use the following config:
85
96
```
86
97
Refer [heapster docs](https://github.com/kubernetes/heapster/blob/master/docs/source-configuration.md#kubernetes) for a complete list of available options.
87
98
*`--hostname-override`: A customized node name used for node-problem-detector to update conditions and emit events. node-problem-detector gets node name first from `hostname-override`, then `NODE_NAME` environment variable and finally fall back to `os.Hostname`.
99
+
*`--prometheus-address`: The address to bind the Prometheus scrape endpoint, default to `127.0.0.1`.
100
+
*`--prometheus-port`: The port to bind the Prometheus scrape endpoint, default to 20257. Use 0 to disable.
101
+
102
+
### Deprecated Flags
103
+
104
+
*`--system-log-monitors`: List of paths to system log monitor config files, comma separated. This option is deprecated, replaced by `--config.system-log-monitor`, and will be removed. NPD will panic if both `--system-log-monitors` and `--config.system-log-monitor` are set.
105
+
106
+
*`--custom-plugin-monitors`: List of paths to custom plugin monitor config files, comma separated. This option is deprecated, replaced by `--config.custom-plugin-monitor`, and will be removed. NPD will panic if both `--custom-plugin-monitors` and `--config.custom-plugin-monitor` are set.
88
107
89
108
## Build Image
90
109
@@ -149,12 +168,13 @@ For example, to test [KernelMonitor](https://github.com/kubernetes/node-problem-
2.```kubectl proxy --port=8080``` (make a running cluster's API server available locally)
151
170
3. Update [KernelMonitor](https://github.com/kubernetes/node-problem-detector/blob/master/config/kernel-monitor.json)'s ```logPath``` to your local kernel log directory. For example, on some Linux systems, it is ```/run/log/journal``` instead of ```/var/log/journal```.
152
-
3.```./bin/node-problem-detector --logtostderr --apiserver-override=http://127.0.0.1:8080?inClusterConfig=false --system-log-monitors=config/kernel-monitor.json --port=20256``` (or point to any API server address:port)
171
+
3.```./bin/node-problem-detector --logtostderr --apiserver-override=http://127.0.0.1:8080?inClusterConfig=false --config.system-log-monitor=config/kernel-monitor.json --config.system-stats-monitor=config/system-stats-monitor.json --port=20256 --prometheus-port=20257``` (or point to any API server address:port and Prometheus port)
153
172
4.```sudo sh -c "echo 'kernel: BUG: unable to handle kernel NULL pointer dereference at TESTING' >> /dev/kmsg"```
154
173
5. You can see ```KernelOops``` event in the node-problem-detector log.
155
174
6.```sudo sh -c "echo 'kernel: INFO: task docker:20744 blocked for more than 120 seconds.' >> /dev/kmsg"```
156
175
7. You can see ```DockerHung``` event and condition in the node-problem-detector log.
157
176
8. You can see ```DockerHung``` condition at [http://127.0.0.1:20256/conditions](http://127.0.0.1:20256/conditions).
177
+
9. You can see disk related system metrics in Prometheus format at [http://127.0.0.1:20257/metrics](http://127.0.0.1:20257/metrics).
158
178
159
179
**Note**:
160
180
- You can see more rule examples under [test/kernel_log_generator/problems](https://github.com/kubernetes/node-problem-detector/tree/master/test/kernel_log_generator/problems).
[]string{}, "List of paths to system log monitor config files, comma separated.")
90
+
fs.MarkDeprecated("system-log-monitors", "replaced by --config.system-log-monitor. NPD will panic if both --system-log-monitors and --config.system-log-monitor are set.")
[]string{}, "List of paths to custom plugin monitor config files, comma separated.")
93
+
fs.MarkDeprecated("custom-plugin-monitors", "replaced by --config.custom-plugin-monitor. NPD will panic if both --custom-plugin-monitors and --config.custom-plugin-monitor are set.")
94
+
fs.BoolVar(&npdo.EnableK8sExporter, "enable-k8s-exporter", true, "Enables reporting to Kubernetes API server.")
0 commit comments