Skip to content

String plus zero inconsistent across platforms #4046

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
p5pRT opened this issue May 29, 2001 · 49 comments
Closed

String plus zero inconsistent across platforms #4046

p5pRT opened this issue May 29, 2001 · 49 comments

Comments

@p5pRT
Copy link

p5pRT commented May 29, 2001

Migrated from rt.perl.org#7059 (status was 'resolved')

Searchable as RT7059$

@p5pRT
Copy link
Author

p5pRT commented May 29, 2001

From [email protected]

This is a bug report for perl from kpt3@​twicom.com,
generated with the help of perlbug 1.33 running under perl v5.6.1.


The following line is inconsistent across different platforms​:

print "0x1" + 0 . "\n";

Perl on Linux (v5.6.0) gives 1
Perl in cygwin(v5.6.1) gives 0
ActiveState (v5.6.0) gives 0

The Linux Perl came with RedHat 7.1

This seems to me to be a rather fundemental problem.

Kevin Twidle



Flags​:
  category=core
  severity=high


Site configuration information for perl v5.6.1​:

Configured by Fifer at Thu Dec 21 21​:37​:10 2000.

Summary of my perl5 (revision 5.0 version 6 subversion 1) configuration​:
  Platform​:
  osname=cygwin, osvers=1.1.6(0.3032), archname=cygwin
  uname='cygwin_nt-5.0 fifer 1.1.6(0.3032) 2000-11-21 21​:00 i686 unknown '
  config_args='-de'
  hint=recommended, useposix=true, d_sigaction=define
  usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
  useperlio=undef d_sfio=undef uselargefiles=define usesocks=undef
  use64bitint=undef use64bitall=undef uselongdouble=undef
  Compiler​:
  cc='gcc', ccflags ='-DPERL_USE_SAFE_PUTENV -DHAS_SBRK_PROTO -fno-strict-aliasing',
  optimize='-O2',
  cppflags='-DPERL_USE_SAFE_PUTENV -DHAS_SBRK_PROTO -fno-strict-aliasing'
  ccversion='', gccversion='2.95.2-5 19991024 (cygwin experimental)', gccosandvers=''
  intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
  d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
  ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=4
  alignbytes=8, usemymalloc=y, prototype=define
  Linker and Libraries​:
  ld='ld2', ldflags =' -s -L/usr/local/lib'
  libpth=/usr/local/lib /usr/lib /lib
  libs=-lgdbm -lcrypt
  perllibs=-lcrypt
  libc=/usr/lib/libc.a, so=dll, useshrplib=true, libperl=libperl5_6_1.a
  Dynamic Linking​:
  dlsrc=dl_dlopen.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' -s'
  cccdlflags=' ', lddlflags=' -s -L/usr/local/lib'

Locally applied patches​:
  v5.6.1-TRIAL1


@​INC for perl v5.6.1​:
  /usr/lib/perl5/5.6.1/cygwin
  /usr/lib/perl5/5.6.1
  /usr/lib/perl5/site_perl/5.6.1/cygwin
  /usr/lib/perl5/site_perl/5.6.1
  /usr/lib/perl5/site_perl
  .


Environment for perl v5.6.1​:
  HOME=/home/kevin
  LANG (unset)
  LANGUAGE (unset)
  LD_LIBRARY_PATH (unset)
  LOGDIR (unset)
  PATH=/usr/local/bin​:/usr/bin​:/bin​:/cygdrive/c/WINNT/system32​:/cygdrive/c/WIN
NT​:/cygdrive/c/WINNT/System32/Wbem
  PERL_BADLANG (unset)
  SHELL=/bin/sh

@p5pRT
Copy link
Author

p5pRT commented May 29, 2001

From @rspier

This appears to be a system-specific problem with whatever is used to
implement Atof (i.e. Perl_my_atof). It's strtold on some systems,
(Or atof or a bunch of other things). (And it's the _implementation_
thats the problem, not the choice of function, afaict.)

Some of the systems _do_ parse 0x1 as 1, others parse it as 0.

The bad conversion (in this example) happens at Line 1822 in sv.c
(5.6.1) (line 2331 in perl-current) in Perl_sv_2nv.

I seem to remember some discussion for the inclusion of a standard
strtold (or similar) function into the core having to do with
64bitness stuff. That would also serve this bug.

-R

p.s. Jarrko- What do you think about including the mod_perl .gdbinit
file in the core? It's less than 5k and it makes debugging much
easier.

Linux
5.00503
5.6.0
5.6.1
perl-current
  all print 1.

FreeBSD
  1

5.6.0 Windows (AS 623)
  prints 0

Solaris​:
4.036
5.00503
5.6.0
all print 0

Irix​:
4.036
5.001
all print 0.

Kevin Twidle writes​:

This is a bug report for perl from kpt3@​twicom.com,
generated with the help of perlbug 1.33 running under perl v5.6.1.

-----------------------------------------------------------------
The following line is inconsistent across different platforms​:

print "0x1" + 0 . "\n";

Perl on Linux (v5.6.0) gives 1
Perl in cygwin(v5.6.1) gives 0
ActiveState (v5.6.0) gives 0

The Linux Perl came with RedHat 7.1

This seems to me to be a rather fundemental problem.

Kevin Twidle
-----------------------------------------------------------------
---
Flags​:
category=core
severity=high
---
Site configuration information for perl v5.6.1​:

Configured by Fifer at Thu Dec 21 21​:37​:10 2000.

Summary of my perl5 (revision 5.0 version 6 subversion 1) configuration​:
Platform​:
osname=cygwin, osvers=1.1.6(0.3032), archname=cygwin
uname='cygwin_nt-5.0 fifer 1.1.6(0.3032) 2000-11-21 21​:00 i686 unknown '
config_args='-de'
hint=recommended, useposix=true, d_sigaction=define
usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
useperlio=undef d_sfio=undef uselargefiles=define usesocks=undef
use64bitint=undef use64bitall=undef uselongdouble=undef
Compiler​:
cc='gcc', ccflags ='-DPERL_USE_SAFE_PUTENV -DHAS_SBRK_PROTO -fno-strict-aliasing',
optimize='-O2',
cppflags='-DPERL_USE_SAFE_PUTENV -DHAS_SBRK_PROTO -fno-strict-aliasing'
ccversion='', gccversion='2.95.2-5 19991024 (cygwin experimental)', gccosandvers=''
intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=1234
d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=12
ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=4
alignbytes=8, usemymalloc=y, prototype=define
Linker and Libraries​:
ld='ld2', ldflags =' -s -L/usr/local/lib'
libpth=/usr/local/lib /usr/lib /lib
libs=-lgdbm -lcrypt
perllibs=-lcrypt
libc=/usr/lib/libc.a, so=dll, useshrplib=true, libperl=libperl5_6_1.a
Dynamic Linking​:
dlsrc=dl_dlopen.xs, dlext=dll, d_dlsymun=undef, ccdlflags=' -s'
cccdlflags=' ', lddlflags=' -s -L/usr/local/lib'

Locally applied patches​:
v5.6.1-TRIAL1

---
@​INC for perl v5.6.1​:
/usr/lib/perl5/5.6.1/cygwin
/usr/lib/perl5/5.6.1
/usr/lib/perl5/site_perl/5.6.1/cygwin
/usr/lib/perl5/site_perl/5.6.1
/usr/lib/perl5/site_perl
.

---
Environment for perl v5.6.1​:
HOME=/home/kevin
LANG (unset)
LANGUAGE (unset)
LD_LIBRARY_PATH (unset)
LOGDIR (unset)
PATH=/usr/local/bin​:/usr/bin​:/bin​:/cygdrive/c/WINNT/system32​:/cygdrive/c/WIN
NT​:/cygdrive/c/WINNT/System32/Wbem
PERL_BADLANG (unset)
SHELL=/bin/sh

@p5pRT
Copy link
Author

p5pRT commented May 29, 2001

From @jhi

Hmmm... I really, really vaguely remember seeing this portability bug
manifesting itself earlier.

The bad conversion (in this example) happens at Line 1822 in sv.c
(5.6.1) (line 2331 in perl-current) in Perl_sv_2nv.

I seem to remember some discussion for the inclusion of a standard
strtold (or similar) function into the core having to do with
64bitness stuff. That would also serve this bug.

Yup.

-R

p.s. Jarrko- What do you think about including the mod_perl .gdbinit

s/rrk/rkk/

file in the core? It's less than 5k and it makes debugging much
easier.

Sure. How uptodate it is?

@p5pRT
Copy link
Author

p5pRT commented May 29, 2001

From @jhi

Ho-hum. It's even more fun than that. Linux/x86 perl-current​:

papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x0"'
Argument "0x0" isn't numeric in addition (+) at -e line 1.
Argument "0x1" isn't numeric in addition (+) at -e line 1.
1
papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x1"'
Argument "0x1" isn't numeric in addition (+) at -e line 1.
Argument "0x1" isn't numeric in addition (+) at -e line 1.
2

So far, so good... now it gets interesting.

papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x2"'
Argument "0x2" isn't numeric in addition (+) at -e line 1.
Argument "0x1" isn't numeric in addition (+) at -e line 1.
2
papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x3"'
Argument "0x3" isn't numeric in addition (+) at -e line 1.
Argument "0x1" isn't numeric in addition (+) at -e line 1.
2.5
papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x4"'
Argument "0x4" isn't numeric in addition (+) at -e line 1.
Argument "0x1" isn't numeric in addition (+) at -e line 1.
2
papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x5"'
Argument "0x5" isn't numeric in addition (+) at -e line 1.
Argument "0x1" isn't numeric in addition (+) at -e line 1.
2.25
papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x6"'
Argument "0x6" isn't numeric in addition (+) at -e line 1.
Argument "0x1" isn't numeric in addition (+) at -e line 1.
2.5
papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x7"'
Argument "0x7" isn't numeric in addition (+) at -e line 1.
Argument "0x1" isn't numeric in addition (+) at -e line 1.
2.75
papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x8"'
Argument "0x8" isn't numeric in addition (+) at -e line 1.
Argument "0x1" isn't numeric in addition (+) at -e line 1.
2
papaija​:/tmp/jhi/perl ;

@p5pRT
Copy link
Author

p5pRT commented May 29, 2001

From [Unknown Contact. See original ticket]

GNU Libc's stdlib.h defines strtold (and strtod) thus​:

  extern __inline long double
  strtold (__const char * __nptr, char ** __endptr)
  {
  return __strtold_internal (__nptr, __endptr, 0);
  }

For some reason, strtold does not take a "base" parameter, unlike, say,
strtoul. It behaves as if a base of 0 was specified, and converts "0x1"
to 1 (see strtol(3) for details).

(The 0 argument to __strtold_internal above indicates whether or not to
parse locale-dependent number groupings. It has nothing to do with the
base.)

So far, so good... now it gets interesting. [...]

papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x5"'
Argument "0x5" isn't numeric in addition (+) at -e line 1.
Argument "0x1" isn't numeric in addition (+) at -e line 1.
2.25

Similar interesting behaviour is displayed by any hexadecimal number in
the string, apparently due to a bug in __strtold_internal. A cursory
glance at the glibc 2.1.2 source suggests that the bug is probably not
trivial to fix.

Is anyone working on an strtold replacement for the core?

- ams

@p5pRT
Copy link
Author

p5pRT commented May 30, 2001

From @rspier

Looks decently up to date. It could bear some expansion, but it's a
great start.

I removed some mod_perl specific stuff, but that wasn't much.

Here's a file-creation patch.

Inline Patch
--- /dev/null	Fri Mar 23 23:37:44 2001
+++ Porting/gdbmacros	Wed May 30 11:26:27 2001
@@ -0,0 +1,190 @@
+#
+# GDB debugging macros file
+# (thanks to the mod_perl team for developing this.)
+#
+# either copy this file to .gdbinit in the directory you are running
+# the gdb from, or at the gdb prompt:
+# (gdb) source path-to-perl-source/Porting/gdbmacros
+#
+#some handy debugging macros, hopefully you'll never need them
+#some don't quite work, like dump_hv and hv_fetch, 
+#where's the bloody manpage for .gdbinit syntax?
+
+define STpvx
+   print ((XPV*) (PL_stack_base [ax + ($arg0)] )->sv_any )->xpv_pv
+end
+
+define TOPs
+    print ((XPV*) (**sp)->sv_any )->xpv_pv
+end
+
+define curstash
+   print ((XPVHV*) (PL_curstash)->sv_any)->xhv_name
+end
+
+define defstash
+   print ((XPVHV*) (PL_defstash)->sv_any)->xhv_name
+end
+
+define curinfo
+   printf "%d:%s\n", PL_curcop->cop_line, \
+   ((XPV*)(*(XPVGV*)PL_curcop->cop_filegv->sv_any)\
+   ->xgv_gp->gp_sv->sv_any)->xpv_pv 
+end
+
+define SvPVX
+print ((XPV*) ($arg0)->sv_any )->xpv_pv
+end
+
+define SvCUR
+   print ((XPV*)  ($arg0)->sv_any )->xpv_cur 
+end
+
+define SvLEN
+   print ((XPV*)  ($arg0)->sv_any )->xpv_len 
+end
+
+define SvEND
+   print (((XPV*)  ($arg0)->sv_any )->xpv_pv + ((XPV*)($arg0)->sv_any )->xpv_cur) - 1
+end
+
+define SvSTASH
+   print ((XPVHV*)((XPVMG*)($arg0)->sv_any )->xmg_stash)->sv_any->xhv_name
+end
+
+define SvTAINTED
+   print ((($arg0)->sv_flags  & (0x00002000 |0x00004000 |0x00008000 ))  && Perl_sv_tainted ($arg0)) 
+end
+
+define SvTRUE
+   print (	!$arg0	? 0	:    (($arg0)->sv_flags  & 0x00040000 ) 	?   ((PL_Xpv  = (XPV*)($arg0)->sv_any ) &&	(*PL_Xpv ->xpv_pv > '0' ||	PL_Xpv ->xpv_cur > 1 ||	(PL_Xpv ->xpv_cur && *PL_Xpv ->xpv_pv != '0'))	? 1	: 0)	:	(($arg0)->sv_flags  & 0x00010000 ) 	? ((XPVIV*)  ($arg0)->sv_any )->xiv_iv  != 0	:   (($arg0)->sv_flags  & 0x00020000 ) 	? ((XPVNV*)($arg0)->sv_any )->xnv_nv  != 0.0	: Perl_sv_2bool ($arg0) ) 
+end
+
+define GvHV
+   set $hv = (((((XPVGV*)($arg0)->sv_any ) ->xgv_gp) )->gp_hv) 
+end
+
+define GvSV
+ print ((XPV*) ((((XPVGV*)($arg0)->sv_any ) ->xgv_gp) ->gp_sv )->sv_any )->xpv_pv
+end
+
+define GvNAME
+   print (((XPVGV*)($arg0)->sv_any ) ->xgv_name)
+end
+
+define GvFILEGV
+   print ((XPV*) ((((XPVGV*)$arg0->filegv)->xgv_gp)->gp_sv)->sv_any)->xpv_pv
+end
+
+define CvNAME
+   print ((XPVGV*)(((XPVCV*)($arg0)->sv_any)->xcv_gv)->sv_any)->xgv_name
+end
+
+define CvSTASH
+   print ((XPVHV*)(((XPVGV*)(((XPVCV*)($arg0)->sv_any)->xcv_gv)->sv_any)->xgv_stash)->sv_any)->xhv_name
+end
+
+define CvDEPTH
+   print ((XPVCV*)($arg0)->sv_any )->xcv_depth 
+end
+
+define CvFILEGV
+   print ((XPV*) ((((XPVGV*)((XPVCV*)($arg0)->sv_any )->xcv_filegv)->xgv_gp)->gp_sv)->sv_any)->xpv_pv
+end
+
+define SVOPpvx
+   print ((XPV*) ( ((SVOP*)$arg0)->op_sv)->sv_any )->xpv_pv
+end
+
+define HvNAME
+   print ((XPVHV*)$arg0->sv_any)->xhv_name
+end
+
+define HvKEYS
+   print ((XPVHV*)  ($arg0)->sv_any)->xhv_keys
+end
+
+define AvFILL
+   print ((XPVAV*)  ($arg0)->sv_any)->xav_fill
+end
+
+define dumpav
+    set $n = ((XPVAV*)  ($arg0)->sv_any)->xav_fill
+    set $i = 0
+    while $i <= $n
+        set $sv = *Perl_av_fetch($arg0, $i, 0)
+        printf "[%u] -> `%s'\n", $i, ((XPV*) ($sv)->sv_any )->xpv_pv
+        set $i = $i + 1
+    end
+end
+
+define dumphv
+    set $n = ((XPVHV*)  ($arg0)->sv_any)->xhv_keys
+    set $i = 0
+    set $key = 0
+    set $klen = 0
+    Perl_hv_iterinit($arg0)
+    while $i <= $n
+        set $sv = Perl_hv_iternextsv($arg0, &$key, &$klen)
+        printf "%s = `%s'\n", $key, ((XPV*) ($sv)->sv_any )->xpv_pv
+        set $i = $i + 1
+    end
+end
+
+define hvfetch
+   set $klen = strlen($arg1)
+   set $sv = *Perl_hv_fetch($arg0, $arg1, $klen, 0)
+   printf "%s = `%s'\n", $arg1, ((XPV*) ($sv)->sv_any )->xpv_pv
+end
+
+define hvINCval
+   set $hv = (((((XPVGV*)(PL_incgv)->sv_any)->xgv_gp))->gp_hv)
+   set $klen = strlen($arg0)
+   set $sv = *Perl_hv_fetch($hv, $arg0, $klen, 0)
+   printf "%s = `%s'\n", $arg0, ((XPV*) ($sv)->sv_any )->xpv_pv
+end
+
+define dumpany
+   set $sv = Perl_newSVpv("use Data::Dumper; Dumper \\",0)
+   set $void = Perl_sv_catpv($sv, $arg0)
+   set $dump = perl_eval_pv(((XPV*) ($sv)->sv_any )->xpv_pv, 1)
+   printf "%s = `%s'\n", $arg0, ((XPV*) ($dump)->sv_any )->xpv_pv
+end
+
+define dumpanyrv
+   set $rv = Perl_newRV((SV*)$arg0)
+   set $rvpv = perl_get_sv("main::DumpAnyRv", 1)
+   set $void = Perl_sv_setsv($rvpv, $rv)
+   set $sv = perl_eval_pv("use Data::Dumper; Dumper $::DumpAnyRv",1)
+   printf "`%s'\n", ((XPV*) ($sv)->sv_any )->xpv_pv
+end
+
+define svpeek
+   set $pv = Perl_sv_peek((SV*)$arg0)
+   printf "%s\n", $pv
+end
+
+define caller
+   set $sv = perl_eval_pv("scalar caller", 1)
+   printf "caller = %s\n", ((XPV*) ($sv)->sv_any )->xpv_pv
+end
+
+define cluck
+   set $sv = perl_eval_pv("Carp::cluck(); `tail '$Apache::ErrLog'`", 1)
+   printf "%s\n", ((XPV*) ($sv)->sv_any )->xpv_pv
+end
+
+define longmess
+   set $sv = perl_eval_pv("Carp::longmess()", 1)
+   printf "%s\n", ((XPV*) ($sv)->sv_any )->xpv_pv
+end
+
+define shortmess
+   set $sv = perl_eval_pv("Carp::shortmess()", 1)
+   printf "%s\n", ((XPV*) ($sv)->sv_any )->xpv_pv
+end
+
+define perl_get_sv
+    set $sv = perl_get_sv($arg0, 0)
+    printf "%s\n", $sv ? ((XPV*) ((SV*)$sv)->sv_any)->xpv_pv : "undef"
+end

@p5pRT
Copy link
Author

p5pRT commented May 30, 2001

From [Unknown Contact. See original ticket]

file in the core? It's less than 5k and it makes debugging much
easier.

Sure. How uptodate it is?

Looks decently up to date. It could bear some expansion, but it's a
great start.

as i mentioned in my reply to simon's original message, many of the macros
do not work with -Dusethreads. i would really like to see something
crafted with Devel​::DebugInit to generate as many macros as possible. and
for the handcrafted ones, replaces PL_foo with my_perl->Ifoo.

From dougm@​covalent.net Wed Apr 18 09​:32​:22 2001 -0700
Status​:
X-Status​:
X-Keywords​:
Date​: Wed, 18 Apr 2001 09​:32​:21 -0700 (PDT)
From​: Doug MacEachern <dougm@​covalent.net>
To​: Simon Cozens <simon@​netthink.co.uk>
cc​: perl5-porters@​perl.org
Subject​: Re​: The mod_perl .gdbinit
In-Reply-To​: <20010418093850.A8131@​netthink.co.uk>
Message-ID​: <Pine.LNX.4.21.0104180927220.17271-100000@​mako.covalent.net>
MIME-Version​: 1.0
Content-Type​: TEXT/PLAIN; charset=US-ASCII

On Wed, 18 Apr 2001, Simon Cozens wrote​:

I've just had a look at these, and they're *SOOOO* useful. Is there any
chance
we can steal some of them for the core?

most of them were originally generated by Devel​::DebugInit, some crafted
by hand (e.g. curinfo, my favorite). i think it would be worth revisiting
Devel​::DebugInit to get all of them generated, including the handcrafted
ones to deal with PL_* so they are expanded properly for ithread Perls and
other flavors.

@p5pRT
Copy link
Author

p5pRT commented May 30, 2001

From [Unknown Contact. See original ticket]

String Binary Binary fraction Decimal fraction
0x1 1 1.0 1.0
0x2 10 1.0 1.0
0x3 11 1.1 1.5
0x4 100 1.00 1.0
0x5 101 1.01 1.25
0x6 110 1.10 1.5
0x7 111 1.11 1.75
0x8 100 1.00 1.0
0x9 1001 1.001 1.125
0xA 1010 1.010 1.25
0xB 1011 1.011 1.375

Perl is taking the second number as a binary fraction with a fixed decimal
point.
I assume that it is reading the first number in a similar way.

I cannot test it because trying
perl -wle 'print "0x4" + "0x5"'

Linux (5.6.0) gives
Argument "0x4" isn't numeric in addition (+) at -e line 1.
Argument "0x5" isn't numeric in addition (+) at -e line 1.
9

Cygwin(5.6.1) gives
Argument "0x4" isn't numeric in addition (+) at -e line 1.
Argument "0x5" isn't numeric in addition (+) at -e line 1.
0

Kevin

----- Original Message -----
From​: "Jarkko Hietaniemi" <jhi@​iki.fi>
To​: "Robert Spier" <rspier@​pobox.com>
Cc​: "Kevin Twidle" <kevin@​twicom.com>; <perl5-porters@​perl.org>;
<perlbug@​rfi.net>
Sent​: Tuesday, May 29, 2001 8​:45 PM
Subject​: Re​: [ID 20010529.006] String plus zero inconsistent across
platforms

Linux
5.00503
5.6.0
5.6.1
perl-current
all print 1.

Ho-hum. It's even more fun than that. Linux/x86 perl-current​:

papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x0"'
Argument "0x0" isn't numeric in addition (+) at -e line 1.
Argument "0x1" isn't numeric in addition (+) at -e line 1.
1
papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x1"'
Argument "0x1" isn't numeric in addition (+) at -e line 1.
Argument "0x1" isn't numeric in addition (+) at -e line 1.
2

So far, so good... now it gets interesting.

papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x2"'
Argument "0x2" isn't numeric in addition (+) at -e line 1.
Argument "0x1" isn't numeric in addition (+) at -e line 1.
2
papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x3"'
Argument "0x3" isn't numeric in addition (+) at -e line 1.
Argument "0x1" isn't numeric in addition (+) at -e line 1.
2.5
papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x4"'
Argument "0x4" isn't numeric in addition (+) at -e line 1.
Argument "0x1" isn't numeric in addition (+) at -e line 1.
2
papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x5"'
Argument "0x5" isn't numeric in addition (+) at -e line 1.
Argument "0x1" isn't numeric in addition (+) at -e line 1.
2.25
papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x6"'
Argument "0x6" isn't numeric in addition (+) at -e line 1.
Argument "0x1" isn't numeric in addition (+) at -e line 1.
2.5
papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x7"'
Argument "0x7" isn't numeric in addition (+) at -e line 1.
Argument "0x1" isn't numeric in addition (+) at -e line 1.
2.75
papaija​:/tmp/jhi/perl ; ./perl -wle 'print "0x1" + "0x8"'
Argument "0x8" isn't numeric in addition (+) at -e line 1.
Argument "0x1" isn't numeric in addition (+) at -e line 1.
2
papaija​:/tmp/jhi/perl ;

--
$jhi++; # http​://www.iki.fi/jhi/
# There is this special biologist word we use for 'stable'.
# It is 'dead'. -- Jack Cohen

@p5pRT
Copy link
Author

p5pRT commented May 30, 2001

From @jhi

==== //depot/perl/sv.c#402 - /u/vieraat/vieraat/jhi/pp4/perl/sv.c ====
Index​: perl/sv.c

Inline Patch
--- perl/sv.c.~1~	Thu May 31 05:04:49 2001
+++ perl/sv.c	Thu May 31 05:04:49 2001
@@ -1871,6 +1871,8 @@
 #endif
 	    {
 		NV d;
+		char *s;
+		bool firstzero;
 #ifdef HAS_STRTOL
 		/* Hopefully trace flow will optimise this away where possible
 		 */
@@ -1878,7 +1880,15 @@
 #endif
 		/* It wasn't an integer, or it overflowed, or we don't have
 		   strtol. Do things the slow way - check if it's a UV etc. */
-		d = Atof(SvPVX(sv));
+		s = SvPVX(sv);
+		firstzero = s[0] == '0';
+		d =
+		  (firstzero && toLOWER(s[1]) == 'x') ?
+		    (NV)Strtoul(s + 2, 0, 16) :
+		    (firstzero && toLOWER(s[1]) == 'b' ?
+		      (NV)Strtoul(s + 2, 0, 2) :
+		      (firstzero && s[1] >= '0' && s[1] <= '7' ?
+		        (NV)Strtoul(s + 1, 0, 8) : Atof(s)));
 
 		if (SvTYPE(sv) < SVt_PVNV)
 		    sv_upgrade(sv, SVt_PVNV);
@@ -2326,7 +2336,9 @@
 #endif
     }
     else if (SvPOKp(sv) && SvLEN(sv)) {
-	if (ckWARN(WARN_NUMERIC) && !SvIOKp(sv) && !looks_like_number(sv))
+        int numtype = looks_like_number(sv);
+      
+	if (ckWARN(WARN_NUMERIC) && !SvIOKp(sv) && !numtype)
 	    not_a_number(sv);
 	SvNVX(sv) = Atof(SvPVX(sv));
 #ifdef NV_PRESERVES_UV
@@ -2348,7 +2360,6 @@
 	} else {
 	    /* Is it something we can run through strtol etc (ie no
 	       trailing exponent part)? */
-	    int numtype = looks_like_number(sv);
 	    /* XXX probably should cache this if called above */
 
 	    if (!(numtype &
End of Patch.

@p5pRT
Copy link
Author

p5pRT commented May 30, 2001

From @vanstyn

I don't think we do​: for a long time we have taught people that "37xa"
numifies to 37, and I think it is unreasonable and unhelpful now to
change "0xa" to numify to something other than 0. I think your example
should numify to 0 + 10 + 0 = 10.

Hugo

@p5pRT
Copy link
Author

p5pRT commented May 31, 2001

From @jhi

Well, in that case we must go in the other direction and stop some
Atof()s from numifying "0xa" to 10.

Hugo

@p5pRT
Copy link
Author

p5pRT commented May 31, 2001

From @vanstyn

Attached is a start, which passes all but pragma/locale.t here. I think
it needs the following changes​:
- overflow/underflow checking
- locale support
- ? unicode support
- inf/nan support
- whitespace tolerance (not sure what is permitted)
- (optional) rewrite 'apply the exponent' to be faster
- add tests​: it is surprising that only locale of the above missing features
  caused any test failures

I'd suggest also that Perl_atof2 be redefined - it is confusing having
a macro that converts a variable reference to its address without it
being named so as obviously to be a macro. Since the only references
to it are in the Perl_my_atof function, I think this should be safe.

This patch requires a 'make regen_headers'.

Hugo

Inline Patch
--- embed.pl.old	Sun May 27 00:32:32 2001
+++ embed.pl	Thu May 31 15:28:58 2001
@@ -2592,3 +2592,4 @@
 Apd	|STRLEN	|sv_utf8_upgrade_flags|SV *sv|I32 flags
 Apd	|char*	|sv_pvn_force_flags|SV* sv|STRLEN* lp|I32 flags
 Ap	|char*	|sv_2pv_flags	|SV* sv|STRLEN* lp|I32 flags
+Ap	|char*	|my_atof2	|const char *s|NV* value
--- perl.h.old	Thu May 31 02:17:32 2001
+++ perl.h	Thu May 31 15:34:23 2001
@@ -1311,24 +1311,8 @@
 #   endif
 #endif
 
-#if !defined(Perl_atof) && defined(USE_LONG_DOUBLE) && defined(HAS_LONG_DOUBLE)
-#   if !defined(Perl_atof) && defined(HAS_STRTOLD)
-#       define Perl_atof(s) (NV)strtold(s, (char**)NULL)
-#   endif
-#   if !defined(Perl_atof) && defined(HAS_ATOLF)
-#       define Perl_atof (NV)atolf
-#   endif
-#   if !defined(Perl_atof) && defined(PERL_SCNfldbl)
-#       define Perl_atof PERL_SCNfldbl
-#       define Perl_atof2(s,f) sscanf((s), "%"PERL_SCNfldbl, &(f))
-#   endif
-#endif
-#if !defined(Perl_atof)
-#   define Perl_atof atof /* we assume atof being available anywhere */
-#endif
-#if !defined(Perl_atof2)
-#   define Perl_atof2(s,f) ((f) = (NV)Perl_atof(s))
-#endif
+#define Perl_atof(s) Perl_my_atof(s)
+#define Perl_atof2(s, np) Perl_my_atof2(s, &(np))
 
 /* Previously these definitions used hardcoded figures.
  * It is hoped these formula are more portable, although
--- util.c.old	Thu May 31 13:43:03 2001
+++ util.c	Thu May 31 15:50:57 2001
@@ -4033,6 +4033,63 @@
     return x;
 }
 
+char*
+Perl_my_atof2(pTHX_ const char* orig, NV* value)
+{
+    NV result = 0.0;
+    bool negative = 0;
+    char* s = (char*)orig;
+
+    switch (*s) {
+	case '-':
+	    negative = 1;
+	    /* fall through */
+	case '+':
+	    ++s;
+    }
+    while (isDIGIT(*s))
+	result = result * 10 + (*s++ - '0');
+    if (*s == '.') {
+	NV decimal = 1.0;
+	++s;
+	while (isDIGIT(*s)) {
+	    decimal /= 10;
+	    result += (*s++ - '0') * decimal;
+	}
+    }
+    if (*s == 'e' || *s == 'E') {
+	I32 exponent = 0;
+	I32 expnegative = 0;
+	I32 bit;
+	NV power;
+
+	++s;
+	switch (*s) {
+	    case '-':
+		expnegative = 1;
+		/* fall through */
+	    case '+':
+		++s;
+	}
+	while (isDIGIT(*s))
+	    exponent = exponent * 10 + (*s++ - '0');
+
+	/* now apply the exponent */
+	power = (expnegative) ? 0.1 : 10.0;
+	for (bit = 1; exponent; bit <<= 1) {
+	    if (exponent & bit) {
+		exponent ^= bit;
+		result *= power;
+	    }
+	    power *= power;
+	}
+    }
+    if (negative)
+	result = -result;
+    *value = result;
+    return s;
+}
+
 void
 Perl_report_evil_fh(pTHX_ GV *gv, IO *io, I32 op)
 {

@p5pRT
Copy link
Author

p5pRT commented May 31, 2001

From @simoncozens

Do we actually want this? It'd slow numeric conversion down horribly, and
numeric conversion's pretty hot.

@p5pRT
Copy link
Author

p5pRT commented May 31, 2001

From @vanstyn

I assume we need a mechanism capable of parsing a number out of a unicode
string, but I don't think we need to expand the normal concept of what a
digit is, so it shouldn't be too bad.

Hugo

@p5pRT
Copy link
Author

p5pRT commented May 31, 2001

From @rspier

I agree with Hugo, but the portability problem remains.

Might something that does something similar to​:

  sub Atof {
  my $_ = shift;
  s/^\s*(\d+(?​:\.\d+)).*$/$1/;
  # $_ is now _only_ digits.
  return strtod($_);
  }

be a happy solution?

-R

@p5pRT
Copy link
Author

p5pRT commented May 31, 2001

From @simoncozens

I'm reasonably sure the code in your patch was Unicode-happy. I'll have to
check, though.

@p5pRT
Copy link
Author

p5pRT commented May 31, 2001

From @nwc10

I like that idea better.

Attached is a start, which passes all but pragma/locale.t here. I think
it needs the following changes​:
- overflow/underflow checking
- locale support
- ? unicode support
- inf/nan support
- whitespace tolerance (not sure what is permitted)
- (optional) rewrite 'apply the exponent' to be faster
- add tests​: it is surprising that only locale of the above missing features
caused any test failures

There wasn't a test for 1 + 1 == 2 until recently.
The first test of converting a string containing a negative floating point
value to a number isn't until lib/storable-something.

+char*
+Perl_my_atof2(pTHX_ const char* orig, NV* value)
+{
+ NV result = 0.0;
+ bool negative = 0;
+ char* s = (char*)orig;
+
+ switch (*s) {
+ case '-'​:
+ negative = 1;
+ /* fall through */
+ case '+'​:
+ ++s;
+ }
+ while (isDIGIT(*s))
+ result = result * 10 + (*s++ - '0');
+ if (*s == '.') {
+ NV decimal = 1.0;
+ ++s;
+ while (isDIGIT(*s)) {
+ decimal /= 10;
+ result += (*s++ - '0') * decimal;
+ }
+ }
+ if (*s == 'e' || *s == 'E') {
+ I32 exponent = 0;
+ I32 expnegative = 0;
+ I32 bit;
+ NV power;
+
+ ++s;
+ switch (*s) {
+ case '-'​:
+ expnegative = 1;
+ /* fall through */
+ case '+'​:
+ ++s;
+ }
+ while (isDIGIT(*s))
+ exponent = exponent * 10 + (*s++ - '0');
+
+ /* now apply the exponent */
+ power = (expnegative) ? 0.1 : 10.0;
+ for (bit = 1; exponent; bit <<= 1) {
+ if (exponent & bit) {
+ exponent ^= bit;
+ result *= power;
+ }
+ power *= power;
+ }
+ }
+ if (negative)
+ result = -result;
+ *value = result;
+ return s;
+}

Do we really want to do this?
IIRC Nick I-S recently posted the descriptions of 2 papers here, about how
to do floating point conversion (to and from) accurately. I suspect that
there are several subtle pitfalls. (Yes. This is arm waving. I've not read
them. Please shut me up if I'm wrong)

Nicholas Clark

@p5pRT
Copy link
Author

p5pRT commented May 31, 2001

From @vanstyn

Attached is a continuation, which passes all current tests here.

:I think it needs the following changes​:
:- overflow/underflow checking

Not done; I don't know what is needed here.

:- locale support

Done, I think - we have support for locale 'decimal_point', but not
for 'thousands_sep'. I suspect we should also be allowing some
underscores in there, but we don't at the moment.

:- ? unicode support

I think this is only a problem if a) the locale-specific decimal point
is not utf8 invariant and b) the string isn't using the same encoding
as the decimal point string. I only know of "." and "," dp strings;
what others are there?

:- inf/nan support
:- whitespace tolerance (not sure what is permitted)

Not done, I don't know what is needed here.

:- (optional) rewrite 'apply the exponent' to be faster

The current algorithm is acceptable but not fast​: we do one fp multiply
per bit set in the absolute value of the exponent, plus one per bit
(set or clear) up to the most significant set bit; for 6.02E23 we'll
do 5 fp and two integer multiplies while parsing the digits, and 9 more
fp multiplies while applying the exponent.[0] We could perhaps gain some
speed by having a table of 10**(2**n) for all relevant n; I also don't
know how much accuracy we may be losing through the repeated mutiplies
in both the decimal places and the exponent.

:- add tests​: it is surprising that only locale of the above missing features
: caused any test failures

I think we need tests for bad numbers like "+23E" and "+E23" for example.

:I'd suggest also that Perl_atof2 be redefined - it is confusing having
:a macro that converts a variable reference to its address without it
:being named so as obviously to be a macro. Since the only references
:to it are in the Perl_my_atof function, I think this should be safe.

Done this.

:This patch requires a 'make regen_headers'.

This patch replaces the last one, and still requires a 'make regen_headers'.

Hugo
[0] My Maths dictionary doesn't have Avogadro's number; my English
dictionary has an entry for it that _doesn't say what its value is_. :(

Inline Patch
--- embed.pl.old	Sun May 27 00:32:32 2001
+++ embed.pl	Thu May 31 15:28:58 2001
@@ -2592,3 +2592,4 @@
 Apd	|STRLEN	|sv_utf8_upgrade_flags|SV *sv|I32 flags
 Apd	|char*	|sv_pvn_force_flags|SV* sv|STRLEN* lp|I32 flags
 Ap	|char*	|sv_2pv_flags	|SV* sv|STRLEN* lp|I32 flags
+Ap	|char*	|my_atof2	|const char *s|NV* value
--- perl.h.old	Thu May 31 02:17:32 2001
+++ perl.h	Thu May 31 20:23:23 2001
@@ -1311,24 +1311,8 @@
 #   endif
 #endif
 
-#if !defined(Perl_atof) && defined(USE_LONG_DOUBLE) && defined(HAS_LONG_DOUBLE)
-#   if !defined(Perl_atof) && defined(HAS_STRTOLD)
-#       define Perl_atof(s) (NV)strtold(s, (char**)NULL)
-#   endif
-#   if !defined(Perl_atof) && defined(HAS_ATOLF)
-#       define Perl_atof (NV)atolf
-#   endif
-#   if !defined(Perl_atof) && defined(PERL_SCNfldbl)
-#       define Perl_atof PERL_SCNfldbl
-#       define Perl_atof2(s,f) sscanf((s), "%"PERL_SCNfldbl, &(f))
-#   endif
-#endif
-#if !defined(Perl_atof)
-#   define Perl_atof atof /* we assume atof being available anywhere */
-#endif
-#if !defined(Perl_atof2)
-#   define Perl_atof2(s,f) ((f) = (NV)Perl_atof(s))
-#endif
+#define Perl_atof(s) Perl_my_atof(s)
+#define Perl_atof2(s, np) Perl_my_atof2(s, np)
 
 /* Previously these definitions used hardcoded figures.
  * It is hoped these formula are more portable, although
--- util.c.old	Thu May 31 13:43:03 2001
+++ util.c	Thu May 31 20:43:29 2001
@@ -4018,19 +4018,86 @@
     if ((PL_hints & HINT_LOCALE) && PL_numeric_local) {
 	NV y;
 
-	Perl_atof2(s, x);
+	Perl_atof2(s, &x);
 	SET_NUMERIC_STANDARD();
-	Perl_atof2(s, y);
+	Perl_atof2(s, &y);
 	SET_NUMERIC_LOCAL();
 	if ((y < 0.0 && y < x) || (y > 0.0 && y > x))
 	    return y;
     }
     else
-	Perl_atof2(s, x);
+	Perl_atof2(s, &x);
 #else
-    Perl_atof2(s, x);
+    Perl_atof2(s, &x);
 #endif
     return x;
+}
+
+char*
+Perl_my_atof2(pTHX_ const char* orig, NV* value)
+{
+    NV result = 0.0;
+    bool negative = 0;
+    char* s = (char*)orig;
+    char* point = ".";	/* locale-dependent decimal point equivalent */
+    STRLEN pointlen = 1;
+    bool seendigit = 0;
+
+    if (PL_numeric_radix_sv)
+	point = SvPV(PL_numeric_radix_sv, pointlen);
+
+    switch (*s) {
+	case '-':
+	    negative = 1;
+	    /* fall through */
+	case '+':
+	    ++s;
+    }
+    while (isDIGIT(*s)) {
+	result = result * 10 + (*s++ - '0');
+	seendigit = 1;
+    }
+    if (memEQ(s, point, pointlen)) {
+	NV decimal = 0.1;
+
+	s += pointlen;
+	while (isDIGIT(*s)) {
+	    result += (*s++ - '0') * decimal;
+	    decimal *= 0.1;
+	    seendigit = 1;
+	}
+    }
+    if (seendigit && *s == 'e' || *s == 'E') {
+	I32 exponent = 0;
+	I32 expnegative = 0;
+	I32 bit;
+	NV power;
+
+	++s;
+	switch (*s) {
+	    case '-':
+		expnegative = 1;
+		/* fall through */
+	    case '+':
+		++s;
+	}
+	while (isDIGIT(*s))
+	    exponent = exponent * 10 + (*s++ - '0');
+
+	/* now apply the exponent */
+	power = (expnegative) ? 0.1 : 10.0;
+	for (bit = 1; exponent; bit <<= 1) {
+	    if (exponent & bit) {
+		exponent ^= bit;
+		result *= power;
+	    }
+	    power *= power;
+	}
+    }
+    if (negative)
+	result = -result;
+    *value = result;
+    return s;
 }
 
 void

@p5pRT
Copy link
Author

p5pRT commented May 31, 2001

From @vanstyn

I think so, but it's always possible that we don't want to do it in
exactly this way. An alternative approach is to apply only the 'valid
number' logic, copying into a new string along the way, and then
calling atof() on the new string. However I think we have a chance
of making our own version right, and I know we have no chance of that
with every operating system's own versions.

:IIRC Nick I-S recently posted the descriptions of 2 papers here, about how
:to do floating point conversion (to and from) accurately. I suspect that
:there are several subtle pitfalls. (Yes. This is arm waving. I've not read
:them. Please shut me up if I'm wrong)

There are several obvious pitfalls, which I'm hoping someone else will
sort out - checking for over and underflow, and loss of accuracy due
to repeated multiplications and additions.

To be honest, I think we should ignore underflow - if the user gets zero,
that's accurate enough by my lights; overflow we can check for easily
enough, at the cost of a bit of cruft. Accuracy will be the bugbear,
and I think we can get most of the way there with a couple of
precalculated tables. I'm not sure how we ensure​:
  print fp_really_equal(1e-300, "." . ("0" x 299) . "1") ? "ok" : "not ok";
.. unless we have 300 entries in the precalculated table.

I guess there's also some potential for accumulating error because we
add a separate component to the total for each decimal place, but I'd
expect the error there to be within tolerance; if not, we could look
to combine the digit string into a single number, then combine the
decimals-offset with the exponent to multiply/divide in one go. Having
access to big enough integers for the mantissa (ignoring the decimal
point) would help a lot with that. I think fp-to-string is much nastier,
or I'd have removed the remaining C-library sprintf call a long time
ago.

I hope that the patch will be applied, and that we'll then get some
people who really need this stuff coming up with test cases for us
to aim for.

Hugo

@p5pRT
Copy link
Author

p5pRT commented Jun 1, 2001

From @jhi

Yup. It's one Unicode character but two bytes.

Unicode book). There might be others, however.

See my other message about other possible digits, however (who knows
whether tb_CN (or whatever) locale might not define Tibetan digits as
being isdigit()?).

Cheers,
Philip
--
Philip Newton <Philip.Newton@​gmx.net>

@p5pRT
Copy link
Author

p5pRT commented Jun 1, 2001

From @jhi

Do you mean

  if (seendigit && (*s == 'e' || *s == 'E')) {

If so, the -Wall is good for something :-)

@p5pRT
Copy link
Author

p5pRT commented Jun 1, 2001

From @jhi

Assuming that the separator is only one byte is wrong.
See util.c​:Perl_set_numeric_radix().

@p5pRT
Copy link
Author

p5pRT commented Jun 1, 2001

From @jhi

The patch applied now, however.

@p5pRT
Copy link
Author

p5pRT commented Jun 1, 2001

From @vanstyn

Yes.

:If so, the -Wall is good for something :-)

Yes.

Hugo

@p5pRT
Copy link
Author

p5pRT commented Jun 1, 2001

From @vanstyn

I don't, do I?

:See util.c​:Perl_set_numeric_radix().

I did​:

  point = SvPV(PL_numeric_radix_sv, pointlen);
[...]
  if (memEQ(s, point, pointlen)) {

Hugo

@p5pRT
Copy link
Author

p5pRT commented Jun 1, 2001

From @jhi

That's good, then.

Hugo

@p5pRT
Copy link
Author

p5pRT commented Jun 2, 2001

From @jhi

Hmmm. This patch has a strange effect​: the posix.t #14 (strtod)
starts failing in tru64/alpha/cc, tru64/alpha/gcc, and solaris/sparc/gcc--
but not in linux/x86/gcc (this effectively means, at least in the platforms
I routinely use, that the ones that had Atof() that returned 3 for "0x3"
started failing, and the one platform that was wrong (Linux, is working
correctly for the strtod test...) The failure is bizarre in that if I use
Devel​::Peek​::Dump() on the $n in the test, it correctly is NV of 3.14159
(and $x is 6). So is it the == that got broken...?

@p5pRT
Copy link
Author

p5pRT commented Jun 2, 2001

From @vanstyn

I think you need to dump not $n but the literal; try something like​:
  $pi = 3.14159;
  unless ($n == $pi && $x == 6) {
  Dump($n); Dump($pi);
  print "not ";
  }

Hugo

@p5pRT
Copy link
Author

p5pRT commented Jun 2, 2001

From @jhi

Inline Patch
--- t/lib/posix.t       Sat Jun  2 20:11:44 2001
+++ p.t Sat Jun  2 22:29:40 2001
@@ -79,6 +79,9 @@
 if ($Config{d_strtod}) {
     $lc = &POSIX::setlocale(&POSIX::LC_NUMERIC, 'C') if $Config{d_setlocale};
     ($n, $x) = &POSIX::strtod('3.14159_OR_SO');
+    use Devel::Peek;
+    $pi = 3.14159;
+    Dump($pi); Dump($n); Dump($x);
 # Using long double NVs may introduce greater accuracy than wanted.
     $n =~ s/^3.14158999\d*$/3.14159/
         if $Config{uselongdouble} eq 'define';

ok 13 SV = NV\(0x127fb8\) at 0x1c0d54   REFCNT = 1   FLAGS = \(NOK\,pNOK\)   NV = 3\.14159 SV = NV\(0x127fc8\) at 0x1c0d00   REFCNT = 1   FLAGS = \(NOK\,pNOK\)   NV = 3\.14159 SV = IV\(0x18c71c\) at 0x1c0d18   REFCNT = 1   FLAGS = \(IOK\,pIOK\)   IV = 6 not ok 14

This on the solaris/sparc/gcc.

akvavitix​:/tmp/jhi/perl ; ./perl -Ilib -MPOSIX=strtod -MDevel​::Peek -wle '$n=strtod("3.14159");Dump($n);print "ok" if $n == 3.14159;print $n - 3.14159'
SV = NV(0x124728) at 0x10b018
  REFCNT = 1
  FLAGS = (NOK,pNOK)
  NV = 3.14159
-4.44089209850063e-16
akvavitix​:/tmp/jhi/perl ;

With 5.6.0 (don't have 5.6.1 compiled)​:

akvavitix​:/tmp/jhi/perl ; /p/bin/perl -MPOSIX=strtod -MDevel​::Peek -wle '$n=strtod("3.14159");Dump($n);print "ok" if $n == 3.14159;print $n - 3.14159'
SV = NV(0x10bc58) at 0x150fc8
  REFCNT = 1
  FLAGS = (NOK,pNOK)
  NV = 3.14159
ok
0
akvavitix​:/tmp/jhi/perl ;

@p5pRT
Copy link
Author

p5pRT commented Jun 2, 2001

From @jhi

Also an interesting question is why POSIX​::strtod()?

@p5pRT
Copy link
Author

p5pRT commented Jun 2, 2001

From @vanstyn

Presumably, because that is what's doing the same thing more
accurately.

Hugo

@p5pRT
Copy link
Author

p5pRT commented Jun 2, 2001

From @vanstyn

Could you show, say, a %.32f of $n and of 3.14159 please?

Hugo

@p5pRT
Copy link
Author

p5pRT commented Jun 2, 2001

From @vanstyn

For example, this is what I get here​:
crypt% ./perl -wle 'printf "%.32f\n", 3.14159+$_ for 0, 2e-16, 3e-16;use POSIX;printf "%.32f\n", POSIX​::strtod("3.14159")+$_ for 0, 2e-16, 3e-16'
3.14158999999999988261834005243145
3.14158999999999988261834005243145
3.14159000000000032670754990249407
3.14158999999999988261834005243145
3.14158999999999988261834005243145
3.14159000000000032670754990249407
crypt%

Hugo

@p5pRT
Copy link
Author

p5pRT commented Jun 2, 2001

From @vanstyn

Ok, I've now built with -Duselongdouble, and I have this​:
crypt% ./perl -Ilib -wle 'use POSIX; printf "%.32f\n",$_ for 3.14159, scalar POSIX​::strtod("3.14159")'
3.14159000000000001033735597122387
3.14158999999999988261834005243145
crypt%

.. which suggests to me that the tests are failing because the new routine
is way _more_ accurate than the C library strtod. That's not unreasonable,
because strtod is not expected to work to long double accuracy, but it
implies that (some at least of) the tests are wrong. I'll start trying to
fix them, and see what else is left. I think I will need to improve the
decimal handling, because being only 10 times more accurate than strtod
probably isn't good enough.

Hugo

@p5pRT
Copy link
Author

p5pRT commented Jun 2, 2001

From @jhi

akvavitix​:/tmp/jhi/perl ; ./perl -Ilib -MPOSIX=strtod -MDevel​::Peek -wle '$n=strtod("3.14159");Dump($n);printf "%.32f\n", $n;printf "%.32f\n", 3.14159'
SV = NV(0x124720) at 0x10b018
  REFCNT = 1
  FLAGS = (NOK,pNOK)
  NV = 3.14159
3.14158999999999988261834005243145
3.14159000000000032670754990249407
akvavitix​:/tmp/jhi/perl ; ./perl -Ilib -wle 'printf "%.32f\n", 3.14159+$_ for 0, 2e-16, 3e-16;use POSIX;printf "%.32f\n", POSIX​::strtod("3.14159")+$_ for 0, 2e-16, 3e-16'
3.14159000000000032670754990249407
3.14159000000000032670754990249407
3.14159000000000077079675975255668
3.14158999999999988261834005243145
3.14158999999999988261834005243145
3.14159000000000032670754990249407
akvavitix​:/tmp/jhi/perl ; ./myconfig
Summary of my perl5 (revision 5.0 version 7 subversion 1 patchlevel 10380) configuration​:
  Platform​:
  osname=solaris, osvers=2.7, archname=sun4-solaris
  uname='sunos akvavitix.hut.fi 5.7 generic_106541-05 sun4u sparc '
  config_args='-des -Dusedevel -Dcc=gcc'
  hint=recommended, useposix=true, d_sigaction=define
  usethreads=undef use5005threads=undef useithreads=undef usemultiplicity=undef
  useperlio=define d_sfio=undef uselargefiles=define usesocks=undef
  use64bitint=undef use64bitall=undef uselongdouble=undef
  Compiler​:
  cc='gcc', ccflags ='-fno-strict-aliasing -D_LARGEFILE_SOURCE -D_FILE_OFFSET_BITS=64',
  optimize='-O',
  cppflags='-Wall -fno-strict-aliasing'
  ccversion='', gccversion='2.95.2 19991024 (release)', gccosandvers='solaris2.7'
  intsize=4, longsize=4, ptrsize=4, doublesize=8, byteorder=4321
  d_longlong=define, longlongsize=8, d_longdbl=define, longdblsize=16
  ivtype='long', ivsize=4, nvtype='double', nvsize=8, Off_t='off_t', lseeksize=8
  alignbytes=8, usemymalloc=n, prototype=define
  Linker and Libraries​:
  ld='gcc', ldflags =' '
  libpth=/usr/lib /usr/ccs/lib
  libs=-lsocket -lnsl -ldl -lm -lc
  perllibs=-lsocket -lnsl -ldl -lm -lc
  libc=/lib/libc.so, so=so, useshrplib=false, libperl=libperl.a
  Dynamic Linking​:
  dlsrc=dl_dlopen.xs, dlext=so, d_dlsymun=undef, ccdlflags=' '
  cccdlflags='-fPIC', lddlflags='-G'

akvavitix​:/tmp/jhi/perl ;

Hugo

@p5pRT
Copy link
Author

p5pRT commented Jun 2, 2001

From @vanstyn

Not sure which config that is, but I'm a little surprised that's so far
out. I'll work on trying to improve atof2(), but I suspect this is going
to prove as troublesome a test candidate as any of the _ld platforms.

Hugo

@p5pRT
Copy link
Author

p5pRT commented Jun 3, 2001

From @vanstyn

Third attempt; this passes all tests here with i64 + ld, and I'm
working on testing the other combinations.

I believe gcc gives me a 64-bit mantisssa for long doubles; I'd be
particularly interested in test results from anyone that has greater
precision available to them.

Hugo

Inline Patch
--- util.c.old	Thu May 31 20:43:29 2001
+++ util.c.new	Sun Jun  3 14:49:25 2001
@@ -4033,6 +4033,35 @@
     return x;
 }
 
+NV
+S_mulexp10(NV value, I32 exponent)
+{
+    NV result = value;
+    NV power = 10.0;
+    I32 bit;
+
+    if (exponent > 0) {
+	for (bit = 1; exponent; bit <<= 1) {
+	    if (exponent & bit) {
+		exponent ^= bit;
+		result *= power;
+	    }
+	    power *= power;
+	}
+    }
+    else if (exponent < 0) {
+	exponent = -exponent;
+	for (bit = 1; exponent; bit <<= 1) {
+	    if (exponent & bit) {
+		exponent ^= bit;
+		result /= power;
+	    }
+	    power *= power;
+	}
+    }
+    return result;
+}
+
 char*
 Perl_my_atof2(pTHX_ const char* orig, NV* value)
 {
@@ -4042,10 +4071,26 @@
     char* point = ".";	/* locale-dependent decimal point equivalent */
     STRLEN pointlen = 1;
     bool seendigit = 0;
+    I32 expextra = 0;
+    I32 exponent = 0;
+    I32 i;
+/* this is arbitrary */
+#define PARTLIM 6
+/* we want the largest integers we can usefully use */
+#if defined(HAS_QUAD) && defined(USE_64_BIT_INT)
+#   define PARTSIZE ((int)TYPE_DIGITS(U64)-1)
+    U64 part[PARTLIM];
+#else
+#   define PARTSIZE ((int)TYPE_DIGITS(U32)-1)
+    U32 part[PARTLIM];
+#endif
+    I32 ipart = 0;	/* index into part[] */
+    I32 offcount;	/* number of digits in least significant part */
 
     if (PL_numeric_radix_sv)
 	point = SvPV(PL_numeric_radix_sv, pointlen);
 
+    /* sign */
     switch (*s) {
 	case '-':
 	    negative = 1;
@@ -4053,23 +4098,77 @@
 	case '+':
 	    ++s;
     }
+
+    part[0] = offcount = 0;
+    if (isDIGIT(*s)) {
+	seendigit = 1;	/* get this over with */
+
+	/* skip leading zeros */
+	while (*s == '0')
+	    ++s;
+    }
+
+    /* integer digits */
     while (isDIGIT(*s)) {
-	result = result * 10 + (*s++ - '0');
-	seendigit = 1;
+	if (++offcount > PARTSIZE) {
+	    if (++ipart < PARTLIM) {
+		part[ipart] = 0;
+		offcount = 1;	/* ++0 */
+	    }
+	    else {
+		/* limits of precision reached */
+		--ipart;
+		--offcount;
+		if (*s >= '5')
+		    ++part[ipart];
+		while (isDIGIT(*s)) {
+		    ++expextra;
+		    ++s;
+		}
+		/* warn of loss of precision? */
+		break;
+	    }
+	}
+	part[ipart] = part[ipart] * 10 + (*s++ - '0');
     }
-    if (memEQ(s, point, pointlen)) {
-	NV decimal = 0.1;
 
+    /* decimal point */
+    if (memEQ(s, point, pointlen)) {
 	s += pointlen;
+	if (isDIGIT(*s))
+	    seendigit = 1;	/* get this over with */
+
+	/* decimal digits */
 	while (isDIGIT(*s)) {
-	    result += (*s++ - '0') * decimal;
-	    decimal *= 0.1;
-	    seendigit = 1;
+	    if (++offcount > PARTSIZE) {
+		if (++ipart < PARTLIM) {
+		    part[ipart] = 0;
+		    offcount = 1;	/* ++0 */
+		}
+		else {
+		    /* limits of precision reached */
+		    --ipart;
+		    --offcount;
+		    if (*s >= '5')
+			++part[ipart];
+		    while (isDIGIT(*s))
+			++s;
+		    /* warn of loss of precision? */
+		    break;
+		}
+	    }
+	    --expextra;
+	    part[ipart] = part[ipart] * 10 + (*s++ - '0');
 	}
     }
+
+    /* combine components of mantissa */
+    for (i = 0; i <= ipart; ++i)
+	result += S_mulexp10((NV)part[ipart - i],
+		i ? offcount + (i - 1) * PARTSIZE : 0);
+
     if (seendigit && *s == 'e' || *s == 'E') {
-	I32 exponent = 0;
-	I32 expnegative = 0;
+	bool expnegative = 0;
 	I32 bit;
 	NV power;
 
@@ -4083,17 +4182,15 @@
 	}
 	while (isDIGIT(*s))
 	    exponent = exponent * 10 + (*s++ - '0');
-
-	/* now apply the exponent */
-	power = (expnegative) ? 0.1 : 10.0;
-	for (bit = 1; exponent; bit <<= 1) {
-	    if (exponent & bit) {
-		exponent ^= bit;
-		result *= power;
-	    }
-	    power *= power;
-	}
+	if (expnegative)
+	    exponent = -exponent;
     }
+
+    /* now apply the exponent */
+    exponent += expextra;
+    result = S_mulexp10(result, exponent);
+
+    /* now apply the sign */
     if (negative)
 	result = -result;
     *value = result;

@p5pRT
Copy link
Author

p5pRT commented Jun 3, 2001

From @vanstyn

i32 + ld (which I've never tried before) gives me one test failure,
which I don't think is my fault​:
crypt% ( cd t ; op/int.t )
1..14
ok 1
ok 2
ok 3
ok 4
ok 5
ok 6
ok 7
not ok 8 # int(4294967303.15) is 4294967302, not 4294967303
not ok 9 # int(4294967303.15) is -4294967302, not -4294967303
ok 10
ok 11
ok 12
ok 13
ok 14
crypt%

Nick, could this be related to your numeric conversion work?

Hugo

@p5pRT
Copy link
Author

p5pRT commented Jun 3, 2001

From @nwc10

Not directly. It looks like ID 20010118.017, which Abigail found was failing
on released 5.7.0. [Jarkko added a patch of mine to beef up the tests in
op/int.t sometime after 5.7.0 which appears to have turned up a bug in
some x86 linux systems.]

Quote Abigail from <20010119203420.828.qmail@​foad.org>​:

Same errors with the released 5.7.0.

$ ./perl -I../lib ~abigail/Src/perl/t/op/int.t
1..14
ok 1
ok 2
ok 3
ok 4
ok 5
ok 6
ok 7
not ok 8 # int(4294967303.15) is 4294967302, not 4294967303
not ok 9 # int(4294967303.15) is -4294967302, not -4294967303
ok 10
ok 11
ok 12
ok 13
ok 14

So it seems that you are able to re-create bug 20010118.017. I wasn't.
Not sure where to proceed from here - I think it's a gcc or libc issue,
but as I couldn't make that test fail on long doubles I couldn't track it
further.

Nicholas Clark

@p5pRT
Copy link
Author

p5pRT commented Jun 3, 2001

From @jhi

At the moment I'm not using -ld at all.

out. I'll work on trying to improve atof2(), but I suspect this is going
to prove as troublesome a test candidate as any of the _ld platforms.

Hugo

@p5pRT
Copy link
Author

p5pRT commented Jun 3, 2001

From @vanstyn

Ok, I'll try to come back to this one later.

Hugo

@p5pRT
Copy link
Author

p5pRT commented Jun 3, 2001

From @vanstyn

Whoops​:
: if (seendigit && *s == 'e' || *s == 'E') {

I forgot to add the parens here.

Hugo

@p5pRT
Copy link
Author

p5pRT commented Jun 3, 2001

From @jhi

Even with the above, the "3rd try", and the handy.h patch I still
get in solaris/sparc/gcc​:

akvavitix​:/tmp/jhi/perl ; ./perl -Ilib -MPOSIX=strtod -MDevel​::Peek -wle '$n=strtod("3.14159");Dump($n);printf "%.32f\n", $n;printf "%.32f\n", 3.14159'
SV = NV(0x124dd0) at 0x10b2b0
  REFCNT = 1
  FLAGS = (NOK,pNOK)
  NV = 3.14159
3.14158999999999988261834005243145
3.14159000000000032670754990249407
akvavitix​:/tmp/jhi/perl ; ./perl -Ilib -wle 'printf "%.32f\n", 3.14159+$_ for 0, 2e-16, 3e-16;use POSIX;printf "%.32f\n", POSIX​::strtod("3.14159")+$_ for 0, 2e-16, 3e-16'
3.14159000000000032670754990249407
3.14159000000000032670754990249407
3.14159000000000077079675975255668
3.14158999999999988261834005243145
3.14158999999999988261834005243145
3.14159000000000032670754990249407

and the t/lib/posix.t #14 failing. I'm not using long doubles.

Hugo

@p5pRT
Copy link
Author

p5pRT commented Jun 3, 2001

From @jhi

Change 10408 by jhi@​alpha on 2001/06/03 16​:52​:41

  Tweak the test to be happy if the accuracy is 'good enough'.

Affected files ...

... //depot/perl/t/lib/posix.t#20 edit

Differences ...

==== //depot/perl/t/lib/posix.t#20 (xtext) ====
Index​: perl/t/lib/posix.t

Inline Patch
--- perl/t/lib/posix.t.~1~	Sun Jun  3 20:52:46 2001
+++ perl/t/lib/posix.t	Sun Jun  3 20:52:46 2001
@@ -82,7 +82,8 @@
 # Using long double NVs may introduce greater accuracy than wanted.
     $n =~ s/^3.1415(8999|9000)\d*$/3.14159/
         if $Config{uselongdouble} eq 'define';
-    print (($n == 3.14159) && ($x == 6) ? "ok 14\n" : "not ok 14\n");
+    print ((abs($n - 3.14159) < 0.00001) && ($x == 6) ?
+          "ok 14\n" : "not ok 14\n");
     &POSIX::setlocale(&POSIX::LC_NUMERIC, $lc) if $Config{d_setlocale};
 } else { print "# strtod not present\n", "ok 14\n"; }
 
End of Patch.

@p5pRT
Copy link
Author

p5pRT commented Jun 4, 2001

From [Unknown Contact. See original ticket]

  ...

Here's a file-creation patch.
  ...
+++ Porting/gdbmacros Wed May 30 11​:26​:27 2001

Is there progress being made on this issue?

I've noticed that the patch hasn't been applied yet.
(are there concerns with the patch?)

@p5pRT
Copy link
Author

p5pRT commented Jun 5, 2001

From @vanstyn

Ok, I think this is happening because we calculate 314159/10/10000.
Attached patch improves this to calculate instead 314159/(10*10000),
which should give the more accurate result.

Hugo

Inline Patch
--- util.c.old	Mon Jun  4 00:46:14 2001
+++ util.c	Tue Jun  5 12:51:11 2001
@@ -4036,30 +4036,25 @@
 NV
 S_mulexp10(NV value, I32 exponent)
 {
-    NV result = value;
+    NV result = 1.0;
     NV power = 10.0;
+    bool negative = 0;
     I32 bit;
 
-    if (exponent > 0) {
-	for (bit = 1; exponent; bit <<= 1) {
-	    if (exponent & bit) {
-		exponent ^= bit;
-		result *= power;
-	    }
-	    power *= power;
-	}
-    }
+    if (exponent == 0)
+	return value;
     else if (exponent < 0) {
+	negative = 1;
 	exponent = -exponent;
-	for (bit = 1; exponent; bit <<= 1) {
-	    if (exponent & bit) {
-		exponent ^= bit;
-		result /= power;
-	    }
-	    power *= power;
+    }
+    for (bit = 1; exponent; bit <<= 1) {
+	if (exponent & bit) {
+	    exponent ^= bit;
+	    result *= power;
 	}
+	power *= power;
     }
-    return result;
+    return negative ? value / result : value * result;
 }
 
 char*

@p5pRT
Copy link
Author

p5pRT commented Jun 5, 2001

From @jhi

Doug MacEachern had issues with the proposed macros.

@p5pRT
Copy link
Author

p5pRT commented Sep 20, 2001

From @rspier

Inline Patch
--- pod/perltodo.pod.orig	Thu Sep 20 22:29:06 2001
+++ pod/perltodo.pod	Thu Sep 20 22:42:50 2001
@@ -824,4 +824,16 @@
     Collation?     http://www.unicode.org/unicode/reports/tr10/
     Normalization? http://www.unicode.org/unicode/reports/tr15/
 
+=head2 Create debugging macros
+
+Debugging macros (like printsv, dump) can make debugging perl inside a
+C debugger much easier.  A good set for gdb comes with mod_perl.
+Something similar should be distributed with perl.
+
+The proper way to do this is to use and extend Devel::DebugInit.
+Devel::DebugInit also needs to be extended to support threads.
+
+See p5p archives for late May/early June 2001 for a recent discussion
+on this topic.
+
 =cut

@p5pRT
Copy link
Author

p5pRT commented Sep 21, 2001

From @jhi

Thanks, applied.

--- pod/perltodo.pod.orig Thu Sep 20 22​:29​:06 2001
+++ pod/perltodo.pod Thu Sep 20 22​:42​:50 2001
@​@​ -824,4 +824,16 @​@​
Collation? http​://www.unicode.org/unicode/reports/tr10/
Normalization? http​://www.unicode.org/unicode/reports/tr15/

+=head2 Create debugging macros
+
+Debugging macros (like printsv, dump) can make debugging perl inside a
+C debugger much easier. A good set for gdb comes with mod_perl.
+Something similar should be distributed with perl.
+
+The proper way to do this is to use and extend Devel​::DebugInit.
+Devel​::DebugInit also needs to be extended to support threads.
+
+See p5p archives for late May/early June 2001 for a recent discussion
+on this topic.
+
=cut

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

1 participant