Commit 9c5471a

tr/abc/de/: Properly handle longer lhs in in-place calc
A tr/// can be done in place if the target string doesn't contain a character whose transliterated representation is longer than the original. Otherwise, writing the new value would destroy the next character we need to read.

In general, we can't know whether a particular string contains such a character without keeping a list of the problematic characters and scanning the string ahead of time for occurrences of them. Instead, we determine at compilation time whether, for a given transliteration, there exists any possible target string that could have an overwriting problem. If none exists, we edit in place. Otherwise, we first make a copy.

Prior to this commit, the code failed to account for the case where the rhs is shorter than the lhs, so that any unmatched lhs characters map to the final rhs one. The reason the code didn't consider this is that I didn't think of this possibility when writing it.

This fixes #17654 and #17643.
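The compile-time decision described above can be illustrated with a small standalone sketch. This is not the code in op.c: `utf8_len` and `can_edit_in_place` are hypothetical helpers, the lists are plain arrays of code points rather than the range maps that S_pmtrans builds, deletion (/d) is not modelled, and lengths assume standard UTF-8 up to U+10FFFF. The check is conservative: it also rejects a duplicate lhs entry that Perl would ignore.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to encode a code point in standard UTF-8
 * (assumption: cp <= 0x10FFFF). */
static size_t utf8_len(uint32_t cp)
{
    if (cp < 0x80)    return 1;
    if (cp < 0x800)   return 2;
    if (cp < 0x10000) return 3;
    return 4;
}

/* Hypothetical helper: lhs[i] maps to rhs[i]; when the rhs is shorter,
 * the remaining lhs entries map to the final rhs entry.  Returns true
 * only if no replacement can be longer than what it replaces. */
static bool can_edit_in_place(const uint32_t *lhs, size_t lhs_n,
                              const uint32_t *rhs, size_t rhs_n)
{
    assert(rhs_n > 0);  /* an empty rhs is outside this sketch */

    for (size_t i = 0; i < lhs_n; i++) {
        /* Unmatched lhs entries fall back to the last rhs entry. */
        uint32_t to = rhs[i < rhs_n ? i : rhs_n - 1];

        if (utf8_len(to) > utf8_len(lhs[i]))
            return false;   /* some string could be overwritten */
    }
    return true;
}
```

For the transliteration in the new test, tr{aabc}{d\x{d0000}}, the unmatched "b" and "c" fall back to \x{d0000}, which needs four bytes instead of one, so this sketch reports that a copy is needed.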
1 parent d1fb523 commit 9c5471a

2 files changed, +13 -2 lines changed


op.c

+6 -1
@@ -7475,7 +7475,12 @@ S_pmtrans(pTHX_ OP *o, OP *expr, OP *repl)
             t_cp_end = MIN(IV_MAX, t_cp + span - 1);
 
             if (r_cp == TR_SPECIAL_HANDLING) {
-                r_cp_end = TR_SPECIAL_HANDLING;
+
+                /* If unmatched lhs code points map to the final map, use that
+                 * value.  This being set to TR_SPECIAL_HANDLING indicates that
+                 * we don't have a final map: unmatched lhs code points are
+                 * simply deleted */
+                r_cp_end = (del) ? TR_SPECIAL_HANDLING : final_map;
             }
             else {
                 r_cp_end = MIN(IV_MAX, r_cp + span - 1);

t/op/tr.t

+7 -1
@@ -13,7 +13,7 @@ BEGIN {
 
 use utf8;
 
-plan tests => 314;
+plan tests => 315;
 
 # Test this first before we extend the stack with other operations.
 # This caused an asan failure due to a bad write past the end of the stack.
@@ -1187,4 +1187,10 @@ for ("", nullrocow) {
     is($d, "\x{105}", '104 -> 105');
 }
 
+{
+    my $c = "cb";
+    eval '$c =~ tr{aabc}{d\x{d0000}}';
+    is($c, "\x{d0000}\x{d0000}", "Shouldn't generate valgrind errors");
+}
+
 1;
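
To see why the new test's input cannot be rewritten in place, the following sketch simulates a naive left-to-right rewrite and reports the first input byte that would be overwritten before it is read. It is an illustration only: `utf8_len` and `first_clobbered` are hypothetical names, the input is assumed to be ASCII, the 128-entry `map` table is an assumed representation of the transliteration, and lengths assume standard UTF-8 up to U+10FFFF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Bytes needed to encode a code point in standard UTF-8
 * (assumption: cp <= 0x10FFFF). */
static size_t utf8_len(uint32_t cp)
{
    if (cp < 0x80)    return 1;
    if (cp < 0x800)   return 2;
    if (cp < 0x10000) return 3;
    return 4;
}

/* Hypothetical helper: simulate writing each replacement over the same
 * buffer that is being read.  map[b] is the replacement code point for
 * ASCII byte b.  Returns the index of the first input byte that is
 * overwritten before being read, or -1 if the rewrite is safe. */
static long first_clobbered(const char *in, size_t n,
                            const uint32_t map[128])
{
    size_t write = 0;   /* bytes of output produced so far */

    for (size_t read = 0; read < n; read++) {
        unsigned char b = (unsigned char)in[read];
        assert(b < 128);    /* ASCII-only input in this sketch */

        write += utf8_len(map[b]);

        /* The next unread byte lives at read + 1; if the output has
         * grown past it, that byte is already gone. */
        if (read + 1 < n && write > read + 1)
            return (long)(read + 1);
    }
    return -1;
}
```

With "c" and "b" both mapped to \x{d0000}, the first replacement occupies four bytes, so the "b" at index 1 is overwritten before it can be read. That is the situation the copy-first path avoids.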
