The default KIND of some size/position intrinsic functions should not be default integer #72


Open
klausler opened this issue Nov 5, 2019 · 20 comments
Labels
Clause 16 Standard Clause 16: Intrinsic procedures and modules

Comments

@klausler

klausler commented Nov 5, 2019

Fortran mandates that the default kind of INTEGER occupy one numeric storage unit; so does the default kind of REAL. In 2019, default REAL is expected to be IEEE-754 single precision. Consequently, the default kind of INTEGER ends up having to be 32 bits wide.

This is a problem for real applications using arrays that are (or can be) very large, as the default KIND= parameter values for intrinsic functions like SIZE, SHAPE, LBOUND, UBOUND, FINDLOC, MAXLOC, and MINLOC (*) are all defined to be the default kind of INTEGER. One must determine the proper kind of "long" integer (or assume it) and remember to specify that KIND= on every use of these intrinsic functions in order to ensure that they work with large arrays.

I propose that the default result kinds of these intrinsic functions be redefined to be processor-dependent, so that a processor supporting large memories can do the obvious right thing.

(*) this list is probably incomplete; maybe it should include LEN but that's less of an issue and it would be the hardest to change

@certik
Member

certik commented Nov 5, 2019

I wasn't aware that size(A) will return an incorrect number if A is larger than 4 GB. In fact, it already fails for arrays larger than 2 GB; here is an example with gfortran that fails:

program test_size
  real, allocatable :: A(:)
  ! kind 16 is gfortran's 128-bit integer kind; int64 would also suffice
  allocate(A(3000000000_16))
  A = 1
  print *, size(A)
  print *, size(A, kind=16)
end

When compiled and executed, it prints:

 -1294967296
 3000000000

The array gets correctly allocated and assigned to, but the default size(A) fails to return the correct result (it returns a wrapped around 32 bit integer value). The size(A, kind=16) works correctly.

This needs to be fixed.

@gronki

gronki commented Nov 5, 2019 via email

@FortranFan
Member

Please see #78 with a 2013 proposal by the UK national body that I think addresses the concern in this thread quite well.

@klausler
Author

> Please see #78 with a 2013 proposal by the UK national body that I think addresses the concern in this thread quite well.

There's some overlap, but it's not the same essential problem. Whether or not the program is able to define the default kinds of intrinsic types, the default kinds of the results of the particular intrinsic functions related to size should not be the default kind of INTEGER.

@sblionel
Member

This is why these intrinsics now have an optional KIND= argument. The problem with changing the default behavior is that it would break some existing programs, which is almost always a killer.

One might think that implementations would start shifting default integer from 32 to 64 bits much the way that it changed from 16 to 32 bits in the late 70s. (I am ignoring old platforms with 36, 48 and 60-bit word sizes.) But then you run into the issue @klausler originally noted that this would also change the size of default REAL, and I don't think people are ready for that.

Offhand, I am not in favor of any proposal that adds a new implicit behavior. I understand that dealing with constants of non-default kind can be messy, and things such as SIZE can be problematic, but the programmer already needs to be aware when an array might exceed a default integer extent, and use larger kind integers throughout the code.

I know this general topic was discussed when the KIND arguments were added, but I can't find details in the 2014 papers.

@certik
Member

certik commented Nov 11, 2019

> I know this general topic was discussed when the KIND arguments were added, but I can't find details in the 2014 papers.

To address the problem that the discussion around a new feature gets lost, I plan to capture any such future (technical) discussion that happens in person at the committee and document it here in the relevant issues, so that the wider community, as well as committee members, can later reference the arguments that were made and build upon the previous work.

@klausler
Author

klausler commented Nov 11, 2019

I understand full well that those intrinsics have KIND= arguments. Unfortunately, a program using large arrays must specify an adequately sized KIND= value for every call to these intrinsics, and must also ensure that the libraries it calls into are free of missing or inadequate KIND= arguments.

Allowing implementations the ability to determine the default values of these KIND= arguments may lead to warnings and errors when codes that can't handle large arrays are recompiled. That seems preferable to mysterious and hard-to-debug crashes: when compiling for large-memory targets, an implementation must pick either the "emit a message" option or the "crash mysteriously at runtime" option. Either way, the codes that will fail have not been "broken" by the compiler, and the former choice seems more consistent with a desire to promote portability.

@gronki

gronki commented Nov 11, 2019 via email

@FortranFan
Member

@klausler wrote:

> .. the default kinds of the results of the particular intrinsic functions related to size should not be the default kind of INTEGER.

The way things are, it might be too late now; user-definable kinds, as in the UK proposal, appear to be the only option for future programmers who seek brevity and cleanliness along with safety in their codes.

WG5 is unlikely to ever agree to a change to intrinsics such as SIZE as suggested in the original post, but it is conceivable that WG5 may proceed with the UK proposal for user-definable default KINDs at some stage.

@klausler
Author

> @klausler wrote:
>
> .. the default kinds of the results of the particular intrinsic functions related to size should not be the default kind of INTEGER.
>
> The way things are, it might be too late now; user-definable kinds, as in the UK proposal, appear to be the only option for future programmers who seek brevity and cleanliness along with safety in their codes.
>
> WG5 is unlikely to ever agree to a change to intrinsics such as SIZE as suggested in the original post, but it is conceivable that WG5 may proceed with the UK proposal for user-definable default KINDs at some stage.

I repeat, these features are solving largely distinct problems. If the UK proposal were adopted, there would still be a problem.

I would hope that WG5 would approve a change to the specification of the default kinds of the results of these intrinsic functions; that would allow the f18 compiler to remove an item from its documented list of "intentional violations of the standard". Standardized or not, it seems like the right thing to do in an implementation for modern targets.

@FortranFan
Member

@klausler wrote:

> ..
> I repeat, these features are solving largely distinct problems. If the UK proposal were adopted, there would still be a problem.
>
> I would hope that WG5 would approve a change to the specification of the default kinds of the results of these intrinsic functions; that would allow the f18 compiler to remove an item from its documented list of "intentional violations of the standard". Standardized or not, it seems like the right thing to do in an implementation for modern targets.

Well, some might argue that what f18 is trying to address is also "solving largely distinct problems", particularly with the SIZE intrinsic, since it now includes the optional KIND argument. And if f18 is adopting "intentional violations of the standard", then there must be users out there concerned about this, including those who would prefer a new processor such as f18 to adopt strict consistency with the standard by default. Of course, when an implementation feels strongly about certain stipulations in the standard, it can separately offer its users an alternate path forward, say an option to pursue another dialect, e.g., as GCC/gfortran does with -std=gnu.

@certik
Member

certik commented Nov 12, 2019

One workaround can be that in Debug mode the compiler emits code to check the size of the array at runtime and produces a warning (or an error, if instructed) when size is called with a kind too small for the result, so that at least users have good means to ensure their code is not broken.

But as @sblionel suggested, even if you loop over the array with a default integer, the code will still break, so the user must be aware of this anyway. Still, a compiler (in Debug mode) can check this and tell users to fix their code (by adding the appropriate kind).

@klausler
Author

> One workaround can be that in Debug mode the compiler emits code to check the size of the array at runtime and produces a warning (or an error, if instructed) when size is called with a kind too small for the result, so that at least users have good means to ensure their code is not broken.
>
> But as @sblionel suggested, even if you loop over the array with a default integer, the code will still break, so the user must be aware of this anyway. Still, a compiler (in Debug mode) can check this and tell users to fix their code (by adding the appropriate kind).

It's useful to the user to detect potential problems in their code before execution time, when possible.

@certik
Member

certik commented Nov 12, 2019

> It's useful to the user to detect potential problems in their code before execution time, when possible.

The code can read the size from an input file, in which case it will not be known until runtime. The only way that I can think of to give a warning at compile time is to keep track of how the array is allocated, and if at any point it is allocated using, say, integer(int64) (even if the value is read from an input file), then it will give warnings to all usages of size that do not have kind=int64 with it, as well as all integers used for iteration over the array. The only possible issue is if the array gets allocated in code that the compiler does not have access to. Otherwise it might actually be possible to check this at compile time.

@gronki

gronki commented Nov 12, 2019

In C size_t is 8 bytes on x86_64.

#include <stdio.h>
#include <stddef.h>

int main(void) {
    size_t a;
    /* %zu is the correct conversion for size_t; %d invokes undefined behavior */
    printf("sizeof(a) = %zu\n", sizeof(a));
}

output: sizeof(a) = 8

The fact is, backwards compatibility is a poor argument here. No decently written code will break just because size returns an 8-byte instead of a 4-byte integer. And understanding backwards compatibility as "all codes that previously worked must still work" is inherently at odds with the Fortran language having forgiven poor coding practices for years (as was recently pointed out in one of the publications).

@certik
Member

certik commented Nov 12, 2019

> And understanding backwards compatibility as "all codes that previously worked must still work" is inherently at odds with the Fortran language having forgiven poor coding practices for years (as was recently pointed out in one of the publications).

Can you please point me to the publication you are referring to?

That's one of the great strengths of Fortran: old code continues running and does not require a massive rewrite like the one Python 3 forced on all Python 2 code. So we want to keep that feature. But there might be a way to get what we want without breaking old code.

@gronki

gronki commented Nov 12, 2019 via email

@certik
Member

certik commented Nov 12, 2019

> I still see common blocks, data statements and gotos in recently developed codes which I think is terrible.

I agree with you that those should not be used in new codes. But I think compilers should still support them, so that old codes continue to work.

@klausler
Author

> It's useful to the user to detect potential problems in their code before execution time, when possible.
>
> The code can read the size from an input file, in which case it will not be known until runtime. The only way that I can think of to give a warning at compile time is to keep track of how the array is allocated, and if at any point it is allocated using, say, integer(int64) (even if the value is read from an input file), then it will give warnings to all usages of size that do not have kind=int64 with it, as well as all integers used for iteration over the array. The only possible issue is if the array gets allocated in code that the compiler does not have access to. Otherwise it might actually be possible to check this at compile time.

What we can warn about at compilation time are things like DO J=1,SIZE(A) when the 64-bit size value must be truncated to a 32-bit default-kind integer J.

@jme52

jme52 commented Jan 17, 2020

I have some questions:

  • Aren't the dimensions of arrays already part of current array descriptors? If so, would it be that difficult, or a bad idea, for the standard to demand in the definition of intrinsic procedures that if the kind of the result of a call to size cannot correctly represent the value that size would return, the runtime aborts?
    Yes, this would break some existing programs, but only those that are poorly designed:
    program p
    use, intrinsic :: iso_fortran_env
    logical :: big
    real, allocatable :: A(:)
    read(*,*) big
    if (big) then
       allocate(A(2_int64**40_int64))
    else
       allocate(A(1))
    end if
    write(*,*) size(A)
    end program
    If the call to size above is legal Fortran 2018, it is at least a bad idea (if big is true, is the result processor dependent?).
  • Rather than changing the default kind of the result of these functions to be processor dependent as you propose (or to the integer kind with the largest decimal exponent range, which the standard requires to be at least 18), would it be possible to remove in the near future the requirement that default integer, default real, and default logical have the same length? My understanding was that this is mainly a requirement to support common blocks, and they are already obsolescent.

@certik certik added the Clause 16 Standard Clause 16: Intrinsic procedures and modules label Apr 23, 2022