Commit 1a90ffd

feat(parse): on_skip documentation
1 parent e47d395 commit 1a90ffd

3 files changed

+123
-66
lines changed

csv

Submodule csv updated 601 files

src/md/parse/options.md

Lines changed: 69 additions & 65 deletions
@@ -1,7 +1,8 @@
11
---
22
title: Options
33
description: Options relative to the csv-parse package
4-
keywords: ['csv', 'parse', 'options', 'delimiter', 'columns', 'comment', 'escape']
4+
keywords:
5+
['csv', 'parse', 'options', 'delimiter', 'columns', 'comment', 'escape']
56
sort: 5
67
---
78

@@ -13,101 +14,104 @@ All options are optional. The options from the [Node.js Stream Writable](https:/
1314

1415
## Available options
1516

16-
* [`bom`](/parse/options/bom/) (boolean)
17-
_Since version 4.4.0_
17+
- [`bom`](/parse/options/bom/) (boolean)
18+
_Since version 4.4.0_
1819
If true, detect and exclude the byte order mark (BOM) from the CSV input if present.
19-
* [`cast`](/parse/options/cast/) (boolean|function)
20-
_Since version 2.2.0_
20+
- [`cast`](/parse/options/cast/) (boolean|function)
21+
_Since version 2.2.0_
2122
Alter the value of a field. If true, the parser will attempt to convert input strings to native types. A function must return the new value and the received arguments are the value to cast and the context object. This option was named `auto_parse` until version 2.
22-
* [`cast_date`](/parse/options/cast_date/) (boolean|function)
23-
_Since version 1.0.5_
23+
- [`cast_date`](/parse/options/cast_date/) (boolean|function)
24+
_Since version 1.0.5_
2425
Convert the CSV field to a date. It requires the `cast` option to be active. This option was named `auto_parse_date` until version 2.
25-
* [`columns`](/parse/options/columns/) (array|boolean|function)
26-
_Since early days_
26+
- [`columns`](/parse/options/columns/) (array|boolean|function)
27+
_Since early days_
2728
Generate records as object literals instead of arrays. List of fields as an array, a user defined callback accepting the first line and returning the column names, or `true` if auto-discovered in the first CSV line. Defaults to `null`. Affects the result data set in the sense that records will be objects instead of arrays. A value of "false", "null", or "undefined" inside the columns array skips the column from the output.
28-
* [`group_columns_by_name`](/parse/options/group_columns_by_name/) (boolean)
29+
- [`group_columns_by_name`](/parse/options/group_columns_by_name/) (boolean)
2930
_Since version 4.10.0_
3031
Convert values into an array of values when columns are activated and when multiple columns of the same name are found.
31-
* [`comment`](/parse/options/comment/) (string|buffer)
32-
_Since early days_
32+
- [`comment`](/parse/options/comment/) (string|buffer)
33+
_Since early days_
3334
Treat all the characters after this one as a comment. The value can be one or multiple characters; the option is disabled by default, which corresponds to an empty string `""`.
34-
* [`comment_no_infix`](/parse/options/comment_no_infix/) (boolean)
35-
_Since 5.5.0_
35+
- [`comment_no_infix`](/parse/options/comment_no_infix/) (boolean)
36+
_Since 5.5.0_
3637
Restricts the definition of comments to full lines.
37-
* [`delimiter`](/parse/options/delimiter/) (string|Buffer|[string|Buffer])
38-
_Since version 0.0.1_
38+
- [`delimiter`](/parse/options/delimiter/) (string|Buffer|[string|Buffer])
39+
_Since version 0.0.1_
3940
Set one or several field delimiters containing one or several characters. It defaults to `,` (comma).
40-
* [`encoding`](/parse/options/encoding/) (string|Buffer)
41-
_Since version 4.13.0_
41+
- [`encoding`](/parse/options/encoding/) (string|Buffer)
42+
_Since version 4.13.0_
4243
Set the input and output encodings. Using `null` or `false` outputs the raw buffer instead of a string. It defaults to `utf8`.
43-
* [`escape`](/parse/options/escape/) (string|Buffer)
44-
_Since version 0.0.1_
44+
- [`escape`](/parse/options/escape/) (string|Buffer)
45+
_Since version 0.0.1_
4546
Set the escape character as one character/byte only. It only applies to quote and escape characters inside quoted fields and it defaults to `"` (double quote).
46-
* [`from`](/parse/options/from/) (number)
47-
_Since version 1.2.0_
47+
- [`from`](/parse/options/from/) (number)
48+
_Since version 1.2.0_
4849
Start handling records from a requested number of records. Count is 1-based; for example, provide `1` (and not `0`) to emit the first record.
49-
* [`from_line`](/parse/options/from_line/) (number)
50-
_Since version 4.0.0_
50+
- [`from_line`](/parse/options/from_line/) (number)
51+
_Since version 4.0.0_
5152
Start handling records from a requested line number.
52-
* [`ignore_last_delimiters`](/parse/options/ignore_last_delimiters/) (boolean)
53-
_Since version 4.15.0_
53+
- [`ignore_last_delimiters`](/parse/options/ignore_last_delimiters/) (boolean)
54+
_Since version 4.15.0_
5455
Disregard any delimiters present in the last field of the record. Requires the [`columns`](/parse/options/columns/) option when `true`.
55-
* [`info`](/parse/options/info/) (boolean)
56-
_Since version 4.0.0_
56+
- [`info`](/parse/options/info/) (boolean)
57+
_Since version 4.0.0_
5758
Generate two properties `info` and `record` where `info` is a snapshot of the info object at the time the record was created and `record` is the parsed array or object; note, it can be used conjointly with the `raw` option.
58-
* [`ltrim`](/parse/options/ltrim/) (boolean)
59-
_Since early days_
59+
- [`ltrim`](/parse/options/ltrim/) (boolean)
60+
_Since early days_
6061
If `true`, ignore whitespace immediately following the delimiter (i.e. left-trim all fields). Defaults to `false`. Does not remove whitespace in a quoted field.
61-
* [`max_record_size`](/parse/options/max_record_size/) (integer)
62-
_Since version 4.0.0_
62+
- [`max_record_size`](/parse/options/max_record_size/) (integer)
63+
_Since version 4.0.0_
6364
Maximum number of characters to be contained in the field and line buffers before an exception is raised. It was previously named "max_limit_on_data_read".
64-
* [`objname`](/parse/options/objname/) (string|Buffer)
65-
_Since early days_
65+
- [`objname`](/parse/options/objname/) (string|Buffer)
66+
_Since early days_
6667
Name of header-record title to name objects by; the string or Buffer value must not be empty and it must match a header value.
67-
* [`on_record`](/parse/options/on_record/) (function)
68-
_Since 4.7.0_
68+
- [`on_record`](/parse/options/on_record/) (function)
69+
_Since 4.7.0_
6970
Alter and filter records by executing a user defined function.
70-
* [`quote`](/parse/options/quote/) (char|Buffer|boolean)
71-
_Since version 0.0.1_
71+
- [`on_skip`](/parse/options/on_skip/) (function)
72+
_Since 5.1.0_
73+
Track invalid records without interrupting the parsing process.
74+
- [`quote`](/parse/options/quote/) (char|Buffer|boolean)
75+
_Since version 0.0.1_
7276
Optional character surrounding a field as one character only; disabled if null, false or empty; defaults to double quote.
73-
* [`raw`](/parse/options/raw/) (boolean)
74-
_Since version 1.1.6_
77+
- [`raw`](/parse/options/raw/) (boolean)
78+
_Since version 1.1.6_
7579
Generate two properties `raw` and `record` where `raw` is the original CSV content and `record` is the parsed array or object; note, it can be used conjointly with the `info` option.
76-
* [`record_delimiter`](/parse/options/record_delimiter/) (chars|array)
77-
_Since version 4.0.0_
80+
- [`record_delimiter`](/parse/options/record_delimiter/) (chars|array)
81+
_Since version 4.0.0_
7882
One or multiple characters used to delimit records; defaults to auto discovery if not provided. Supported auto discovery methods are Linux ("\n"), Apple ("\r") and Windows ("\r\n") row delimiters. It was previously named `rowDelimiter`.
79-
* [`relax_column_count`](/parse/options/relax_column_count/) (boolean)
80-
_Since version 1.0.6_
83+
- [`relax_column_count`](/parse/options/relax_column_count/) (boolean)
84+
_Since version 1.0.6_
8185
Discard inconsistent columns count; disabled if null, false or empty; defaults to `false`.
82-
* [`relax_column_count_less`](/parse/options/relax_column_count/) (boolean)
83-
_Since version 4.8.0_
86+
- [`relax_column_count_less`](/parse/options/relax_column_count/) (boolean)
87+
_Since version 4.8.0_
8488
Similar to `relax_column_count` but only applies when the record contains fewer fields than expected.
85-
* [`relax_column_count_more`](/parse/options/relax_column_count/) (boolean)
86-
_Since version 4.8.0_
89+
- [`relax_column_count_more`](/parse/options/relax_column_count/) (boolean)
90+
_Since version 4.8.0_
8791
Similar to `relax_column_count` but only applies when the record contains more fields than expected.
88-
* [`relax_quotes`](/parse/options/relax_quotes/) (boolean)
89-
_Since version 0.0.1_
92+
- [`relax_quotes`](/parse/options/relax_quotes/) (boolean)
93+
_Since version 0.0.1_
9094
Preserve quotes inside unquoted field (be warned, it doesn't make coffee).
91-
* [`rtrim`](/parse/options/rtrim/) (boolean)
92-
_Since early days_
93-
If `true`, ignore whitespace immediately preceding the delimiter (i.e. right-trim all fields). Defaults to `false`. Does not remove whitespace in a quoted field.
94-
* [`skip_empty_lines`](/parse/options/skip_empty_lines/) (boolean)
95-
_Since version 0.0.5_
95+
- [`rtrim`](/parse/options/rtrim/) (boolean)
96+
_Since early days_
97+
If `true`, ignore whitespace immediately preceding the delimiter (i.e. right-trim all fields). Defaults to `false`. Does not remove whitespace in a quoted field.
98+
- [`skip_empty_lines`](/parse/options/skip_empty_lines/) (boolean)
99+
_Since version 0.0.5_
96100
Don't generate records for empty lines (line matching `/^$/`), defaults to `false`.
97-
* [`skip_records_with_empty_values`](/parse/options/skip_records_with_empty_values/) (boolean)
98-
_Since version 1.1.8_
101+
- [`skip_records_with_empty_values`](/parse/options/skip_records_with_empty_values/) (boolean)
102+
_Since version 1.1.8_
99103
Don't generate records for lines containing empty values (column matching `/\s*/`), an empty Buffer, or values equal to `null` or `undefined` once cast; defaults to `false`.
100-
* [`skip_records_with_error`](/parse/options/skip_records_with_error/) (boolean)
101-
_Since version 2.1.0_
104+
- [`skip_records_with_error`](/parse/options/skip_records_with_error/) (boolean)
105+
_Since version 2.1.0_
102106
Skip a line in which an error is found and move directly to the next line.
103-
* [`to`](/parse/options/to/) (number)
104-
_Since version 1.2.0_
107+
- [`to`](/parse/options/to/) (number)
108+
_Since version 1.2.0_
105109
Stop handling records after the requested number of records.
106-
* [`to_line`](/parse/options/to_line/) (number)
107-
_Since version 4.0.0_
110+
- [`to_line`](/parse/options/to_line/) (number)
111+
_Since version 4.0.0_
108112
Stop handling records after the requested line number.
109-
* [`trim`](/parse/options/trim/) (boolean)
110-
_Since early days_
113+
- [`trim`](/parse/options/trim/) (boolean)
114+
_Since early days_
111115
If `true`, ignore whitespace immediately around the delimiter. Defaults to `false`. Does not remove whitespace in a quoted field.
112116

113117
## Choose your style

src/md/parse/options/on_skip.md

Lines changed: 53 additions & 0 deletions
@@ -0,0 +1,53 @@
1+
---
2+
title: Option on_skip
3+
navtitle: on_skip
4+
description: Option "on_skip" provides a callback to handle skipped records that contain invalid data.
5+
keywords: ['csv', 'parse', 'options', 'skip', 'error', 'invalid', 'record']
6+
---
7+
8+
# Option `on_skip`
9+
10+
The `on_skip` option provides a way to track invalid records without interrupting the parsing process. It defines a function called when records are skipped due to parsing errors.
11+
12+
- Type: `function`
13+
- Optional
14+
- Default: `undefined`
15+
- Since: 5.1.0
16+
- Related: [`on_record`](/parse/options/on_record/), [`skip_records_with_error`](/parse/options/skip_records_with_error/), [`raw`](/parse/options/raw/); see [Available Options](/parse/options/#available-options)
17+
18+
The `on_skip` option works at the record level and requires the `skip_records_with_error` option to be enabled.
19+
20+
## Use cases
21+
22+
Use this option to:
23+
24+
- Log skipped records for later analysis
25+
- Track parsing errors while maintaining the parsing process
26+
- Monitor data quality issues in your CSV files
27+
28+
## Usage
29+
30+
The user function receives the error object as an argument. If the `raw` option is enabled, a second argument contains the CSV string being currently processed.
31+
32+
- `error`: Error encountered during parsing
33+
  - `message`: A descriptive error message
34+
  - `code`: The error code (e.g., "CSV_RECORD_INCONSISTENT_FIELDS_LENGTH")
35+
  - `record`: The raw record that caused the error
36+
- `buffer`: Current processing buffer encoded as a UTF-8 string
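
As an illustration of the callback signature described above, here is a minimal sketch of an options object that collects one report per skipped record. The `skipped` array and the shape of each report are hypothetical choices for this example. The final call only simulates what the parser is expected to do when it skips a record; it does not invoke csv-parse itself.

```javascript
// Hypothetical store: one report per skipped record.
const skipped = [];

// Options object intended to be passed to csv-parse's `parse` function.
const options = {
  skip_records_with_error: true, // required for `on_skip` to be called
  raw: true, // enables the second argument of `on_skip`
  on_skip: (error, buffer) => {
    skipped.push({
      code: error.code,
      message: error.message,
      buffer: buffer, // undefined when the `raw` option is disabled
    });
  },
};

// Simulated invocation, standing in for the parser skipping an invalid record.
const simulatedError = Object.assign(new Error("Invalid Record Length"), {
  code: "CSV_RECORD_INCONSISTENT_FIELDS_LENGTH",
});
options.on_skip(simulatedError, "a,b,c\n");
console.log(skipped.length, skipped[0].code);
```

Keeping the reports in memory is only suitable for small inputs; a logger or a counter may be a better fit for large files.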
37+
38+
## Example with inconsistent field lengths
39+
40+
The following example demonstrates how to handle records with inconsistent field counts:
41+
42+
`embed:packages/csv-parse/samples/option.on_skip.js`
43+
44+
## Error handling
45+
46+
The `on_skip` function is called after the parser has determined that a record should be skipped. It works in conjunction with the `skip_records_with_error` option:
47+
48+
1. When `skip_records_with_error` is `true`, invalid records are skipped and trigger the `on_skip` callback
49+
2. When `skip_records_with_error` is `false` (default), parsing errors will cause the parser to emit an error and stop
50+
51+
## Error behaviour
52+
53+
Errors thrown inside the `on_skip` function are caught by the parser and handled as if `skip_records_with_error` was not enabled. It's recommended to implement proper error handling within your callback function to prevent the parsing process from being interrupted.
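
One way to follow this recommendation is to wrap the handler in a `try`/`catch`. The sketch below is an assumption about how such a guard could look, not part of csv-parse: `safeOnSkip` is a hypothetical helper, and the final call simulates the parser invoking the callback.

```javascript
// Hypothetical helper: wraps a handler so that its exceptions never reach the parser.
const safeOnSkip = (handler, onFailure = console.error) => (error, buffer) => {
  try {
    handler(error, buffer);
  } catch (failure) {
    // Report the failure of the handler itself instead of rethrowing it.
    onFailure(failure);
  }
};

// Failures of the handler are collected here for the purpose of the example.
const failures = [];

// Options object intended to be passed to csv-parse's `parse` function.
const options = {
  skip_records_with_error: true,
  on_skip: safeOnSkip(
    (error) => {
      // Simulates a handler that breaks, e.g. because a logger is unavailable.
      throw new Error(`logger unavailable while reporting ${error.code}`);
    },
    (failure) => failures.push(failure.message),
  ),
};

// Simulated invocation: the exception is contained by the wrapper.
options.on_skip({ code: "CSV_RECORD_INCONSISTENT_FIELDS_LENGTH" });
console.log(failures);
```

With this guard in place, a failing handler is reported through `onFailure` and the callback returns normally.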