Description
With Cygwin/MSYS2 perl, regular expression matching on lines read from an in-memory scalar takes orders of magnitude longer than matching the same lines read from a disk file. The slowdown shows up when the regex matches some of the lines.
Steps to Reproduce
Below is code to demonstrate the behavior. An example file is attached to run with the code. In this file about 16% of the lines match.
#!/usr/bin/env perl
use warnings;
use strict;
use Time::HiRes qw( time );

my $file = shift @ARGV;
my ($fh, $time, $n);

# Pass 1: read lines from the disk file and apply the regex.
open $fh, "<", $file or die "Cannot open $file: $!";
$time = time;
$n = 0;
while(<$fh>) {
    /^ ?Query/;
    $n++;
}
printf "%f read lines from disk and do RE; n=$n.\n", time - $time;

# Copy the file contents into a scalar.
seek $fh, 0, 0;
my $s = "";
while(<$fh>) {
    $s .= $_;
}
# my $s = do {
#     local $/;
#     <$fh>;
# };

# Pass 2: read the same lines through an in-memory filehandle.
open $fh, "<", \$s or die "Cannot open in-memory handle: $!";
$time = time;
$n = 0;
while(<$fh>) {
    /^ ?Query/;
    $n++;
}
printf "%f read lines from in-memory file and do RE; n=$n.\n", time - $time;
On my cygwin system this prints:
0.122725 read lines from disk and do RE; n=570694.
27.238712 read lines from in-memory file and do RE; n=570694.
So the in-memory file is about 220 times slower (27.238712 s / 0.122725 s ≈ 222).
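The attached example file is not reproduced in this text. As a stand-in, the sketch below writes a synthetic file with the same number of lines as the n= value above, with 16% of the lines starting with "Query". The helper name write_sample_file, the default file name sample.txt, and the line contents are all assumptions made for illustration. The real file's line lengths and contents are unknown and may affect the timings.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not part of the original report): write $lines lines
# to $path, of which $match_pct out of every 100 start with "Query".
# Returns the number of matching lines written.
sub write_sample_file {
    my ($path, $lines, $match_pct) = @_;
    open my $out, ">", $path or die "Cannot open $path: $!";
    my $matching = 0;
    for my $i (1 .. $lines) {
        # Deterministic spread: a line matches when its number modulo 100
        # is below $match_pct.
        if ($i % 100 < $match_pct) {
            print {$out} "Query $i some payload text\n";
            $matching++;
        }
        else {
            print {$out} "Other $i some payload text\n";
        }
    }
    close $out or die "Cannot close $path: $!";
    return $matching;
}

# 570_694 is the line count printed by the reproducer; 16 is the approximate
# match percentage stated in the report. The payload text is made up.
my $path = @ARGV ? $ARGV[0] : "sample.txt";
my $matching = write_sample_file($path, 570_694, 16);
print "wrote $path with $matching matching lines\n";
```

The resulting file can be passed as the first argument to the reproducer script.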
Expected behavior
I would expect the two times to be roughly in the same ballpark.
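Since the description ties the slowdown to the regex actually matching, one way to narrow this down might be to time the in-memory loop twice: once with the pattern from the reproducer and once with a pattern that matches no line. The sketch below does this on synthetic data. The helper name time_in_memory_scan, the line count, the line contents, and the non-matching pattern are assumptions for illustration, and no particular output is implied.

```perl
#!/usr/bin/env perl
use strict;
use warnings;
use Time::HiRes qw( time );

# Hypothetical helper (not part of the original report): read the string
# behind $data_ref line by line through an in-memory filehandle, apply $re
# to each line via $_, and return (elapsed seconds, line count, match count).
sub time_in_memory_scan {
    my ($data_ref, $re) = @_;
    open my $fh, "<", $data_ref or die "Cannot open in-memory handle: $!";
    my ($lines, $hits) = (0, 0);
    local $_;    # the loop below assigns to $_, as in the reproducer
    my $start = time;
    while (<$fh>) {
        $hits++ if /$re/;
        $lines++;
    }
    my $elapsed = time - $start;
    close $fh;
    return ($elapsed, $lines, $hits);
}

# Synthetic stand-in data (an assumption): every 6th line starts with
# "Query", which is close to the roughly 16% match rate in the report.
my $n = 200_000;
my $data = join "",
    map { ($_ % 6 == 0 ? "Query" : "Other") . " line $_\n" } 1 .. $n;

my @cases = (
    [ "pattern that matches some lines", qr/^ ?Query/ ],
    [ "pattern that matches no line",    qr/^ ?NoSuchPrefix/ ],
);
for my $case (@cases) {
    my ($label, $re) = @$case;
    my ($elapsed, $lines, $hits) = time_in_memory_scan(\$data, $re);
    printf "%f s, %s; lines=%d, matches=%d\n", $elapsed, $label, $lines, $hits;
}
```

If the two timings differ widely on an affected system, that would point at the match itself rather than at reading from the in-memory handle.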
Perl configuration
Summary of my perl5 (revision 5 version 36 subversion 3) configuration:
Platform:
osname=cygwin
osvers=3.4.10-1.x86_64
archname=x86_64-cygwin-threads-multi
uname='cygwin_nt-10.0-22631 walter 3.4.10-1.x86_64 2023-11-29 12:12 utc x86_64 cygwin '
config_args='-des -Dprefix=/usr -Dmksymlinks -Darchname=x86_64-cygwin-threads -Dlibperl=cygperl5_36.dll -Dcc=gcc -Dld=g++ -Accflags=-ggdb -O2 -pipe -Wall -Werror=format-security -D_FORTIFY_SOURCE=2 -fstack-protector-strong --param=ssp-buffer-size=4 -fdebug-prefix-map=/mnt/share/cygpkgs/perl/perl.x86_64/build=/usr/src/debug/perl-5.36.3-1 -fdebug-prefix-map=/mnt/share/cygpkgs/perl/perl.x86_64/src/perl-5.36.3=/usr/src/debug/perl-5.36.3-1 -fwrapv'
hint=recommended
useposix=true
d_sigaction=define
useithreads=define
usemultiplicity=define
use64bitint=define
use64bitall=define
uselongdouble=undef
usemymalloc=n
default_inc_excludes_dot=define
Compiler:
cc='gcc'
ccflags ='-DPERL_USE_SAFE_PUTENV -U__STRICT_ANSI__ -D_GNU_SOURCE -ggdb -O2 -pipe -Wall -Werror=format-security -D_FORTIFY_SOURCE=2 -fstack-protector-strong --param=ssp-buffer-size=4 -fdebug-prefix-map=/mnt/share/cygpkgs/perl/perl.x86_64/build=/usr/src/debug/perl-5.36.3-1 -fdebug-prefix-map=/mnt/share/cygpkgs/perl/perl.x86_64/src/perl-5.36.3=/usr/src/debug/perl-5.36.3-1 -fwrapv -fno-strict-aliasing'
optimize='-O3'
cppflags='-DPERL_USE_SAFE_PUTENV -U__STRICT_ANSI__ -D_GNU_SOURCE -ggdb -O2 -pipe -Wall -Werror=format-security -D_FORTIFY_SOURCE=2 -fstack-protector-strong --param=ssp-buffer-size=4 -fdebug-prefix-map=/mnt/share/cygpkgs/perl/perl.x86_64/build=/usr/src/debug/perl-5.36.3-1 -fdebug-prefix-map=/mnt/share/cygpkgs/perl/perl.x86_64/src/perl-5.36.3=/usr/src/debug/perl-5.36.3-1 -fwrapv -fno-strict-aliasing'
ccversion=''
gccversion='11.4.0'
gccosandvers=''
intsize=4
longsize=8
ptrsize=8
doublesize=8
byteorder=12345678
doublekind=3
d_longlong=define
longlongsize=8
d_longdbl=define
longdblsize=16
longdblkind=3
ivtype='long'
ivsize=8
nvtype='double'
nvsize=8
Off_t='off_t'
lseeksize=8
alignbytes=8
prototype=define
Linker and Libraries:
ld='g++'
ldflags =' -Wl,--enable-auto-import -Wl,--export-all-symbols -Wl,--enable-auto-image-base -fstack-protector-strong'
libpth=/usr/lib
libs=-lpthread -lnsl -lgdbm -ldb -ldl -lcrypt -lgdbm_compat
perllibs=-lpthread -lnsl -ldl -lcrypt
libc=/usr/lib/libcygwin.a
so=dll
useshrplib=true
libperl=cygperl5_36.dll
gnulibc_version=''
Dynamic Linking:
dlsrc=dl_dlopen.xs
dlext=dll
d_dlsymun=undef
ccdlflags=' '
cccdlflags=' '
lddlflags=' --shared -Wl,--enable-auto-import -Wl,--export-all-symbols -Wl,--enable-auto-image-base -fstack-protector-strong'
Characteristics of this binary (from libperl):
Compile-time options:
HAS_TIMES
MULTIPLICITY
PERLIO_LAYERS
PERL_COPY_ON_WRITE
PERL_DONT_CREATE_GVSV
PERL_OP_PARENT
PERL_PRESERVE_IVUV
PERL_USE_SAFE_PUTENV
USE_64_BIT_ALL
USE_64_BIT_INT
USE_ITHREADS
USE_LARGE_FILES
USE_LOCALE
USE_LOCALE_COLLATE
USE_LOCALE_CTYPE
USE_LOCALE_NUMERIC
USE_LOCALE_TIME
USE_PERLIO
USE_PERL_ATOF
USE_REENTRANT_API
USE_THREAD_SAFE_LOCALE
Locally applied patches:
Cygwin: README
Cygwin: use auto-image-base instead of fixed DLL base address
Cygwin: modify hints
Cygwin: Configure correct libsearch
Cygwin: Configure correct libpth
Cygwin: Win32 correct UTF8 handling
Built under cygwin
Compiled at Nov 30 2023 21:40:29
%ENV:
PERL5LIB="/home/dwrice/perl"
CYGWIN="winsymlinks:nativestrict"
@INC:
/home/dwrice/perl
/usr/local/lib/perl5/site_perl/5.36/x86_64-cygwin-threads
/usr/local/share/perl5/site_perl/5.36
/usr/lib/perl5/vendor_perl/5.36/x86_64-cygwin-threads
/usr/share/perl5/vendor_perl/5.36
/usr/lib/perl5/5.36/x86_64-cygwin-threads
/usr/share/perl5/5.36