
Commit 60c76c7

Documentation updates
1 parent 0398665 commit 60c76c7


3 files changed: +57 -22 lines changed

README.md

Lines changed: 36 additions & 7 deletions
@@ -4,7 +4,7 @@
44
[![npm](https://img.shields.io/npm/v/regex-utilities)](https://www.npmjs.com/package/regex-utilities)
55
[![bundle size](https://deno.bundlejs.com/badge?q=regex-utilities&treeshake=[*])](https://bundlejs.com/?q=regex-utilities&treeshake=[*])
66

7-
Tiny utilities shared by the [`regex`](https://github.com/slevithan/regex) library and its extensions. Useful for parsing and processing regular expressions, when you don't need a full regex AST builder.
7+
Tiny utilities shared by the [regex](https://github.com/slevithan/regex) library and its plugins. Useful for parsing and processing regular expression syntax in a lightweight way, when you don't need a full regex AST.
88

99
## Constants
1010

@@ -17,24 +17,53 @@ Frozen object with the following properties for tracking regex syntax context:
1717

1818
## Functions
1919

20-
See documentation in the source code for more details.
20+
For all of the following functions, argument `expression` is the target string, and `needle` is the pattern to search for.
21+
22+
- Argument `expression` is assumed to be a flag-`v`-mode regex pattern string (in other words, nested character classes are allowed when determining the context for a match).
23+
- Argument `needle` is a regex pattern as a string, and is applied with flags `su`.
24+
- If argument `context` is not provided, matches are allowed in all contexts (in other words, inside and outside of character classes).
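As a quick illustration of what applying `needle` with flags `su` implies, here is a small sketch that uses only plain `RegExp` behavior (no functions from this library). Flag `s` lets `.` match line breaks, and flag `u` makes matching work on whole code points. The variable names are chosen for illustration only.

```js
// Plain RegExp behavior; a needle is passed as a pattern string, not a RegExp
const needle = '.';

// Flag s (dotAll): `.` also matches line terminators
const withS = new RegExp(needle, 'su').test('\n'); // true
const withoutS = new RegExp(needle, 'u').test('\n'); // false

// Flag u (unicode): `.` consumes a whole code point rather than half of a surrogate pair
const emoji = '\u{1F600}';
const withU = emoji.match(new RegExp(needle, 'su'))[0].length; // 2
const withoutU = emoji.match(new RegExp(needle, 's'))[0].length; // 1
```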
2125

2226
### `execUnescaped`
2327

24-
Returns a match object for the first unescaped instance of a pattern that is in the given context. Else, returns `null`.
28+
Arguments: `expression, needle, [pos = 0], [context]`
29+
30+
Returns a match object for the first unescaped instance of a regex pattern in the given context, or `null`.
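The following is a simplified sketch of how such a search could work, not the library's actual implementation. It assumes the `needle|(?<skip>\\?.)` pattern shown in `src/index.js` and tracks character class depth by counting unescaped brackets. `execUnescapedSketch` and the local `Context` object are hypothetical stand-ins for illustration.

```js
// Hypothetical stand-in for the library's Context constant
const Context = Object.freeze({DEFAULT: 'DEFAULT', CHAR_CLASS: 'CHAR_CLASS'});

// Simplified sketch; a needle that defines a group named `skip` would conflict
function execUnescapedSketch(expression, needle, pos = 0, context) {
  // Try the needle first; otherwise consume one token (`\` plus a char, or any single char)
  const re = new RegExp(`${needle}|(?<skip>\\\\?.)`, 'gsu');
  re.lastIndex = pos;
  let numCharClassesOpen = 0;
  let match;
  while ((match = re.exec(expression))) {
    const {0: m, groups: {skip}} = match;
    // A needle match counts only if no context was given or the current context fits
    if (!skip && (!context || (context === Context.DEFAULT) === !numCharClassesOpen)) {
      return match;
    }
    if (m === '[') {
      numCharClassesOpen++;
    } else if (m === ']' && numCharClassesOpen) {
      numCharClassesOpen--;
    }
    // Avoid an infinite loop on zero-length needle matches
    if (re.lastIndex === match.index) {
      re.lastIndex++;
    }
  }
  return null;
}

const expr = '.\\.\\\\.[[\\.].].';
execUnescapedSketch(expr, '\\.').index; // → 0
execUnescapedSketch(expr, '\\.', 1).index; // → 5
execUnescapedSketch(expr, '\\.', 0, Context.CHAR_CLASS).index; // → 11
execUnescapedSketch('\\.', '\\.'); // → null (the only dot is escaped)
```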
2531

2632
### `hasUnescaped`
2733

28-
Checks whether an unescaped instance of a pattern appears in the given context.
34+
Arguments: `expression, needle, [context]`
35+
36+
Checks whether an unescaped instance of a regex pattern appears in the given context.
2937

3038
### `forEachUnescaped`
3139

32-
Runs a callback for each unescaped instance of a pattern that is in the given context.
40+
Arguments: `expression, needle, callback, [context]`
41+
42+
Runs a callback for each unescaped instance of a regex pattern in the given context.
3343

3444
### `replaceUnescaped`
3545

36-
Replaces all unescaped instances of a pattern that are in the given context.
46+
Arguments: `expression, needle, replacement, [context]`
47+
48+
Replaces all unescaped instances of a regex pattern in the given context, using a replacement string or callback.
49+
50+
<details>
51+
<summary>Examples</summary>
52+
53+
```js
54+
replaceUnescaped('.\\.\\\\.[[\\.].].', '\\.', '~');
55+
// → '~\\.\\\\~[[\\.]~]~'
56+
57+
replaceUnescaped('.\\.\\\\.[[\\.].].', '\\.', '~', Context.DEFAULT);
58+
// → '~\\.\\\\~[[\\.].]~'
59+
60+
replaceUnescaped('.\\.\\\\.[[\\.].].', '\\.', '~', Context.CHAR_CLASS);
61+
// → '.\\.\\\\.[[\\.]~].'
62+
```
63+
</details>
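To make the difference between a replacement string and a replacement callback concrete, here is a minimal sketch that mirrors the behavior documented in the spec file. It is a simplified stand-in, not the library's implementation: `replaceUnescapedSketch` is a hypothetical name, and as a simplification the callback receives only the match object.

```js
// Hypothetical stand-in for the library's Context constant
const Context = Object.freeze({DEFAULT: 'DEFAULT', CHAR_CLASS: 'CHAR_CLASS'});

// Simplified sketch of the replace loop
function replaceUnescapedSketch(expression, needle, replacement, context) {
  const re = new RegExp(`${needle}|(?<skip>\\\\?.)`, 'gsu');
  let numCharClassesOpen = 0;
  let result = '';
  for (const match of expression.matchAll(re)) {
    const {0: m, groups: {skip}} = match;
    if (!skip && (!context || (context === Context.DEFAULT) === !numCharClassesOpen)) {
      // Strings are inserted literally ($1 and $<name> are not expanded);
      // callbacks receive the match object
      result += typeof replacement === 'function' ? replacement(match) : replacement;
      continue;
    }
    if (m === '[') {
      numCharClassesOpen++;
    } else if (m === ']' && numCharClassesOpen) {
      numCharClassesOpen--;
    }
    // Everything that is not replaced is copied through unchanged
    result += m;
  }
  return result;
}

replaceUnescapedSketch('%1 %22', '%(?<num>\\d+)', ({groups: {num}}) => `\\${num}`);
// → '\\1 \\22'

replaceUnescapedSketch('ab', '(.)(?<a>.)', '~$1$<a>~');
// → '~$1$<a>~'
```

Use a callback whenever the replacement needs captured text, since a replacement string is never treated as a template.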
3764

3865
### `getGroupContents`
3966

40-
Returns the contents of the group within the given pattern, with the group being identified by the position where its contents start (i.e., just *after* the group's opening delimiter). Accounts for escaped characters, nested groups, and character classes. Returns the rest of the string if the group is unclosed.
67+
Arguments: `expression, contentsStartPos`
68+
69+
Extracts the full contents of a group (subpattern) from the given expression, accounting for escaped characters, nested groups, and character classes. The group is identified by the position where its contents start (the string index just after the group's opening delimiter). Returns the rest of the string if the group is unclosed.
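One way the described behavior could be implemented is sketched below: walk the expression token by token from `contentsStartPos`, count open groups and character classes, and stop at the `)` that closes the starting group. `getGroupContentsSketch` is a hypothetical name, and the code is an illustration under those assumptions rather than the library's source.

```js
// Simplified sketch; assumes flag-v-mode syntax, so character classes can nest
function getGroupContentsSketch(expression, contentsStartPos) {
  // One token is an escaped char (`\` plus the next char) or any single char
  const token = /\\?./gsu;
  token.lastIndex = contentsStartPos;
  let contentsEndPos = expression.length;
  let numCharClassesOpen = 0;
  // The search starts inside an already opened group
  let numGroupsOpen = 1;
  let match;
  while ((match = token.exec(expression))) {
    const [m] = match;
    if (m === '[') {
      numCharClassesOpen++;
    } else if (!numCharClassesOpen) {
      if (m === '(') {
        numGroupsOpen++;
      } else if (m === ')') {
        numGroupsOpen--;
        if (!numGroupsOpen) {
          contentsEndPos = match.index;
          break;
        }
      }
    } else if (m === ']') {
      numCharClassesOpen--;
    }
  }
  return expression.slice(contentsStartPos, contentsEndPos);
}

// Contents start at index 4, just after the opening `(?:`
getGroupContentsSketch('a(?:b(c)[)]d)e', 4); // → 'b(c)[)]d'
// An escaped `)` does not close the group
getGroupContentsSketch('x(a\\)b)', 2); // → 'a\\)b'
// An unclosed group returns the rest of the string
getGroupContentsSketch('(abc', 1); // → 'abc'
```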

spec/utilities-spec.js

Lines changed: 5 additions & 0 deletions
@@ -37,6 +37,11 @@ describe('replaceUnescaped', () => {
3737
it('should replace all using a replacement function and named backrefs', () => {
3838
expect(replaceUnescaped('%1 %22', '%(?<num>\\d+)', ({groups: {num}}) => `\\${num}`)).toBe('\\1 \\22');
3939
});
40+
41+
// Just documenting current behavior
42+
it('should replace with a literal string (no backreferences) if given a replacement string', () => {
43+
expect(replaceUnescaped('ab', '(.)(?<a>.)', '~$1$<a>~')).toBe('~$1$<a>~');
44+
});
4045
});
4146

4247
describe('forEachUnescaped', () => {

src/index.js

Lines changed: 16 additions & 15 deletions
@@ -5,7 +5,8 @@ export const Context = Object.freeze({
55
});
66

77
/**
8-
Replaces all unescaped instances of a pattern that are in the given context.
8+
Replaces all unescaped instances of a regex pattern in the given context, using a replacement
9+
string or callback.
910
1011
Doesn't skip over complete multicharacter tokens (only `\` plus its following char) so must be used
1112
with knowledge of what's safe to do given regex syntax. Assumes UnicodeSets-mode syntax.
@@ -15,12 +16,12 @@ with knowledge of what's safe to do given regex syntax. Assumes UnicodeSets-mode
1516
@param {'DEFAULT' | 'CHAR_CLASS'} [context] All contexts if not specified
1617
@returns {string} Updated expression
1718
@example
18-
replaceUnescaped(String.raw`.\.\\.[[\.].].`, '\\.', '~');
19-
// → String.raw`~\.\\~[[\.]~]~`
20-
replaceUnescaped(String.raw`.\.\\.[[\.].].`, '\\.', '~', Context.DEFAULT);
21-
// → String.raw`~\.\\~[[\.].]~`
22-
replaceUnescaped(String.raw`.\.\\.[[\.].].`, '\\.', '~', Context.CHAR_CLASS);
23-
// → String.raw`.\.\\.[[\.]~].`
19+
replaceUnescaped('.\\.\\\\.[[\\.].].', '\\.', '~');
20+
// → '~\\.\\\\~[[\\.]~]~'
21+
replaceUnescaped('.\\.\\\\.[[\\.].].', '\\.', '~', Context.DEFAULT);
22+
// → '~\\.\\\\~[[\\.].]~'
23+
replaceUnescaped('.\\.\\\\.[[\\.].].', '\\.', '~', Context.CHAR_CLASS);
24+
// → '.\\.\\\\.[[\\.]~].'
2425
*/
2526
export function replaceUnescaped(expression, needle, replacement, context) {
2627
const re = new RegExp(`${needle}|(?<skip>\\\\?.)`, 'gsu');
@@ -47,7 +48,7 @@ export function replaceUnescaped(expression, needle, replacement, context) {
4748
}
4849

4950
/**
50-
Runs a callback for each unescaped instance of a pattern that is in the given context.
51+
Runs a callback for each unescaped instance of a regex pattern in the given context.
5152
5253
Doesn't skip over complete multicharacter tokens (only `\` plus its following char) so must be used
5354
with knowledge of what's safe to do given regex syntax. Assumes UnicodeSets-mode syntax.
@@ -62,8 +63,8 @@ export function forEachUnescaped(expression, needle, callback, context) {
6263
}
6364

6465
/**
65-
Returns a match object for the first unescaped instance of a pattern that is in the given context.
66-
Else, returns `null`.
66+
Returns a match object for the first unescaped instance of a regex pattern in the given context, or
67+
`null`.
6768
6869
Doesn't skip over complete multicharacter tokens (only `\` plus its following char) so must be used
6970
with knowledge of what's safe to do given regex syntax. Assumes UnicodeSets-mode syntax.
@@ -101,7 +102,7 @@ export function execUnescaped(expression, needle, pos = 0, context) {
101102
}
102103

103104
/**
104-
Checks whether an unescaped instance of a pattern appears in the given context.
105+
Checks whether an unescaped instance of a regex pattern appears in the given context.
105106
106107
Doesn't skip over complete multicharacter tokens (only `\` plus its following char) so must be used
107108
with knowledge of what's safe to do given regex syntax. Assumes UnicodeSets-mode syntax.
@@ -116,10 +117,10 @@ export function hasUnescaped(expression, needle, context) {
116117
}
117118

118119
/**
119-
Returns the contents of the group within the given pattern, with the group being identified by the
120-
position where its contents start (i.e., just *after* the group's opening delimiter). Accounts for
121-
escaped characters, nested groups, and character classes. Returns the rest of the string if the
122-
group is unclosed.
120+
Extracts the full contents of a group (subpattern) from the given expression, accounting for
121+
escaped characters, nested groups, and character classes. The group is identified by the position
122+
where its contents start (the string index just after the group's opening delimiter). Returns the
123+
rest of the string if the group is unclosed.
123124
124125
Assumes UnicodeSets-mode syntax.
125126
@param {string} expression Search target
