regex remembers failure, refuses to match later #6936

Closed
p5pRT opened this issue Nov 18, 2003 · 4 comments

@p5pRT

p5pRT commented Nov 18, 2003

Migrated from rt.perl.org#24517 (status was 'resolved')

Searchable as RT24517$

@p5pRT

p5pRT commented Nov 18, 2003

From japhy@perlmonk.org

Created by japhy@perlmonk.org

I've been working with someone on CLPM, trying to develop a program
that creates regexes for matching DNA-like strings. I have found a
problem in the regex engine that prevents a string from matching even
though it is a valid match:

  "GGAAACCAAA" =~ m{
  ^
  ( [ACGU]{1,3} ) (?{ print "1=[$1] $'\n" })
  ( [ACGU]{1,3} ) (?{ print " 2=[$2] $'\n" })
  ( [ACGU]{1,3} ) (?{ print " 3=[$3] ($3) $'\n" })
  ( \3 ) (?{ print " 4=[$4] ($2) $'\n" })
  ( \2 ) (?{ print " 5=[$5] $'\n" })
  $
  }x and print "OK!";

It SHOULD match like so:

  $1 should be 'GG'
  $2 should be 'AAA'
  $3 should be 'C'
  $4 should be 'C'
  $5 should be 'AAA'

However, the output from the program shows that it doesn't try to
match 'C' when it should:

1=[GGA] AACCAAA
  2=[AAC] CAAA
  3=[CAA] (CAA) A
  3=[CA] (CA) AA
  3=[C] (C) AAA
  2=[AA] CCAAA
  3=[C] (C) CAAA
  4=[C] (AA) AAA
  5=[AA] A
  2=[A] ACCAAA
  3=[AC] (AC) CAAA
  3=[A] (A) CCAAA
1=[GG] AAACCAAA
  2=[AAA] CCAAA
  3=[CCA] (CCA) AA
  3=[CC] (CC) AAA
<<< HERE >>>
  2=[AA] ACCAAA
  3=[AC] (AC) CAAA
  3=[A] (A) CCAAA
<<< snipped >>>

Where I've marked with "<<< HERE >>>", it should be trying

  3=[C] (C) CAAA
  4=[C] (AAA) AAA
  5=[AAA]

whereupon it should print "OK!". It never tries this; I think this
is because earlier in the regex, when it was here, it failed, so it
doesn't even bother. But now it has different values for $2 and $3,
so it *should* try. I'm not sure I have the ability to fix this one.
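
The expected captures listed above can be checked independently of the tracing output. The following is only a minimal sketch, not part of the original report: it uses the same pattern with the `(?{ ... })` blocks removed, wrapped in a hypothetical helper named `captures`. On a perl where the match succeeds it should print "OK!" followed by the five captures; whether the failure on 5.8.0 also occurs without the code blocks is not something the report establishes.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the report): the reporter's pattern with the
# (?{ ... }) tracing blocks removed. In list context it returns the five
# captures, or an empty list when the match fails.
sub captures {
    my ($str) = @_;
    return $str =~ m{
        ^
        ( [ACGU]{1,3} )
        ( [ACGU]{1,3} )
        ( [ACGU]{1,3} )
        ( \3 )
        ( \2 )
        $
    }x;
}

my @caps = captures("GGAAACCAAA");
print @caps ? "OK! @caps\n" : "no match\n";
```

Keeping the pattern in one sub makes it easy to run the same check against several perl versions and compare results.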

Perl Info

Flags:
    category=core
    severity=medium

Site configuration information for perl v5.8.0:

Configured by Debian Project at Fri Jun  6 00:10:15 EST 2003.

Summary of my perl5 (revision 5.0 version 8 subversion 0) configuration:
  Platform:
    osname=linux, osvers=2.4.20-xfs+ti1211, archname=i386-linux-thread-multi
    uname='linux kosh 2.4.20-xfs+ti1211 #1 sat nov 30 19:19:08 est 2002 i686 gnulinux '
    config_args='-Dusethreads -Duselargefiles -Dccflags=-DDEBIAN -Dcccdlflags=-fPIC -Darchname=i386-linux -Dprefix=/usr -Dprivlib=/usr/share/perl/5.8.0 -Darchlib=/usr/lib/perl/5.8.0 -Dvendorprefix=/usr -Dvendorlib=/usr/share/perl5 -Dvendorarch=/usr/lib/perl5 -Dsiteprefix=/usr/local -Dsitelib=/usr/local/share/perl/5.8.0 -Dsitearch=/usr/local/lib/perl/5.8.0 -Dman1dir=/usr/share/man/man1 -Dman3dir=/usr/share/man/man3 -Dman1ext=1 -Dman3ext=3perl -Dpager=/usr/bin/sensible-pager -Uafs -Ud_csh -Uusesfio -Uusenm -Duseshrplib -Dlibperl=libperl.so.5.8.0 -Dd_dosuid -des'
    hint=recommended, useposix=true, d_sigaction=define
    usethreads=define use5005threads=undef useithreads=define usemultiplicity=define
    useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
    use64bitint=undef use64bitall=undef uselongdouble=undef
    usemymalloc=n, bincompat5005=undef
  Compiler:
    cc='cc', ccflags ='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fno-strict-aliasing -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
    optimize='-O3',
    cppflags='-D_REENTRANT -D_GNU_SOURCE -DDEBIAN -fno-strict-aliasing'
    ccversion='', gccversion='3.3 (Debian)', gccosandvers=''
    intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
    d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
    ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
    alignbytes=4, prototype=define
  Linker and Libraries:
    ld='cc', ldflags =' -L/usr/local/lib'
    libpth=/usr/local/lib /lib /usr/lib
    libs=-lgdbm -ldb -ldl -lm -lpthread -lc -lcrypt
    perllibs=-ldl -lm -lpthread -lc -lcrypt
    libc=/lib/libc-2.3.1.so, so=so, useshrplib=true, libperl=libperl.so.5.8.0
    gnulibc_version='2.3.1'
  Dynamic Linking:
    dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags='-rdynamic'
    cccdlflags='-fPIC', lddlflags='-shared -L/usr/local/lib'

Locally applied patches:
    


@INC for perl v5.8.0:
    /etc/perl
    /usr/local/lib/perl/5.8.0
    /usr/local/share/perl/5.8.0
    /usr/lib/perl5
    /usr/share/perl5
    /usr/lib/perl/5.8.0
    /usr/share/perl/5.8.0
    /usr/local/lib/site_perl
    .


Environment for perl v5.8.0:
    HOME=/home/japhy
    LANG=C
    LANGUAGE (unset)
    LD_LIBRARY_PATH (unset)
    LOGDIR (unset)
    PATH=/usr/local/bin:/usr/bin:/usr/ucb:/usr/sbin:/usr/openwin/bin:/bin:/usr/local/netscape:/usr/ccs/bin:/home/japhy/bin:.
    PERL_BADLANG (unset)
    SHELL=/bin/tcsh

@p5pRT

p5pRT commented Nov 19, 2003

From @Abigail

On Tue, Nov 18, 2003 at 08:47:40PM -0000, japhy@perlmonk.org (via RT) wrote:

# New Ticket Created by japhy@perlmonk.org
# Please include the string: [perl #24517]
# in the subject line of all future correspondence about this issue.
# <URL: http://rt.perl.org/rt2/Ticket/Display.html?id=24517 >

This is a bug report for perl from japhy@perlmonk.org,
generated with the help of perlbug 1.34 running under perl v5.8.0.

-----------------------------------------------------------------
[Please enter your report here]

I've been working with someone on CLPM, trying to develop a program
that creates regexes for matching DNA-like strings. I have found a
problem in the regex engine that prevents a string from matching even
though it is a valid match:

"GGAAACCAAA" =~ m{
^
( [ACGU]{1,3} ) (?{ print "1=[$1] $'\n" })
( [ACGU]{1,3} ) (?{ print " 2=[$2] $'\n" })
( [ACGU]{1,3} ) (?{ print " 3=[$3] ($3) $'\n" })
( \3 ) (?{ print " 4=[$4] ($2) $'\n" })
( \2 ) (?{ print " 5=[$5] $'\n" })
$
}x and print "OK!";

It SHOULD match like so:

$1 should be 'GG'
$2 should be 'AAA'
$3 should be 'C'
$4 should be 'C'
$5 should be 'AAA'

However, the output from the program shows that it doesn't try to
match 'C' when it should:

[ SNIP ]

whereupon it should print "OK!". It never tries this; I think this
is because earlier in the regex, when it was here, it failed, so it
doesn't even bother. But now it has different values for $2 and $3,
so it *should* try. I'm not sure I have the ability to fix this one.

In 5.8.1 and 5.8.2 it prints:

1=[GGA] AACCAAA
  2=[AAC] CAAA
  3=[CAA] (CAA) A
  3=[CA] (CA) AA
  3=[C] (C) AAA
  2=[AA] CCAAA
  3=[CCA] (CCA) AA
  3=[CC] (CC) AAA
  3=[C] (C) CAAA
  4=[C] (AA) AAA
  5=[AA] A
  2=[A] ACCAAA
  3=[ACC] (ACC) AAA
  3=[AC] (AC) CAAA
  3=[A] (A) CCAAA
1=[GG] AAACCAAA
  2=[AAA] CCAAA
  3=[CCA] (CCA) AA
  3=[CC] (CC) AAA
  3=[C] (C) CAAA
  4=[C] (AAA) AAA
  5=[AAA]
OK!

So, I'd say the bug has already been fixed.

Abigail

@p5pRT

p5pRT commented Nov 19, 2003

From [email protected]

On Nov 19, abigail@abigail.nl (via RT) said:

On Tue, Nov 18, 2003 at 08:47:40PM -0000, japhy@perlmonk.org (via RT) wrote:

# New Ticket Created by japhy@perlmonk.org
# Please include the string: [perl #24517]
# in the subject line of all future correspondence about this issue.
# <URL: http://rt.perl.org/rt2/Ticket/Display.html?id=24517 >

whereupon it should print "OK!". It never tries this; I think this
is because earlier in the regex, when it was here, it failed, so it
doesn't even bother. But now it has different values for $2 and $3,
so it *should* try. I'm not sure I have the ability to fix this one.

In 5.8.1 and 5.8.2 it prints:
[snip]
So, I'd say the bug has already been fixed.

Excellent.

--
Jeff "japhy" Pinyan japhy@pobox.com http://www.pobox.com/~japhy/
RPI Acacia brother #734 http://www.perlmonks.org/ http://www.cpan.org/
<stu> what does y/// stand for? <tenderpuss> why, yansliterate of course.
[ I'm looking for programming work. If you like my work, let me know. ]

@p5pRT

p5pRT commented Nov 19, 2003

@rgs - Status changed from 'new' to 'resolved'
