[flang] Accommodate historic preprocessing usage #78868

Merged on Jan 26, 2024 (1 commit)

klausler
Contributor

Some Fortran codes use line continuation as a form of token pasting; see #78797. This works in compilers that run a C-like preprocessor and then apply line continuation to its output; f18 implements line continuation during tokenization and preprocessing, but can still handle this case.

In the rare case when an identifier is split across two or more continuation lines, this patch allows its parts to be distinct preprocessing tokens for the purpose of macro replacement. They (or their replacement texts) can be effectively rejoined later as a single identifier when the cooked character stream is tokenized in parsing.

Fixes #78797.
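
To make the ordering issue concrete, here is a minimal toy sketch of the two pipelines described above. It is not flang code: `Macros`, `Expand`, `ReplaceThenJoin`, and `JoinThenReplace` are hypothetical names, and the model assumes object-like macros with no rescanning.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy model; these are not flang's data structures.
using Macros = std::map<std::string, std::string>;

// Replace a token with its macro body if it names a macro
// (object-like macros only, no rescanning of the result).
std::string Expand(const std::string &token, const Macros &macros) {
  auto it = macros.find(token);
  return it == macros.end() ? token : it->second;
}

// "C-like preprocessor first, line continuation second": each fragment is
// its own token during macro replacement; the pieces are joined afterwards.
std::string ReplaceThenJoin(
    const std::vector<std::string> &fragments, const Macros &macros) {
  std::string result;
  for (const auto &fragment : fragments) {
    result += Expand(fragment, macros);
  }
  return result;
}

// "Line continuation first": the fragments already form one identifier
// by the time macro replacement looks at them.
std::string JoinThenReplace(
    const std::vector<std::string> &fragments, const Macros &macros) {
  std::string joined;
  for (const auto &fragment : fragments) {
    joined += fragment;
  }
  return Expand(joined, macros);
}
```

With the macro `B` defined as `D` and the fragments `A`, `B`, `C`, the first function yields `ADC` while the second yields `ABC`, because no macro named `ABC` exists. The patch lets f18 produce the first result for identifiers split this way.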

@klausler requested a review from vdonaldson January 21, 2024 01:10
@llvmbot added the flang (Flang issues not falling into any other category) and flang:parser labels Jan 21, 2024
@llvmbot
Member

llvmbot commented Jan 21, 2024

@llvm/pr-subscribers-flang-parser

Author: Peter Klausler (klausler)

Changes

Some Fortran codes use line continuation as a form of token pasting; see #78797. This works in compilers that run a C-like preprocessor and then apply line continuation to its output; f18 implements line continuation during tokenization and preprocessing, but can still handle this case.

In the rare case when an identifier is split across two or more continuation lines, this patch allows its parts to be distinct preprocessing tokens for the purpose of macro replacement. They (or their replacement texts) can be effectively rejoined later as a single identifier when the cooked character stream is tokenized in parsing.

Fixes #78797.


Full diff: https://github.com/llvm/llvm-project/pull/78868.diff

3 Files Affected:

  • (modified) flang/lib/Parser/prescan.cpp (+21-2)
  • (modified) flang/lib/Parser/prescan.h (+2-1)
  • (added) flang/test/Preprocessing/pp133.F90 (+9)
diff --git a/flang/lib/Parser/prescan.cpp b/flang/lib/Parser/prescan.cpp
index 68d7d9f0c53c47..6bccfc3f9baea9 100644
--- a/flang/lib/Parser/prescan.cpp
+++ b/flang/lib/Parser/prescan.cpp
@@ -438,7 +438,8 @@ void Prescanner::NextChar() {
 // character is reached; handles C-style comments in preprocessing
 // directives, Fortran ! comments, stuff after the right margin in
 // fixed form, and all forms of line continuation.
-void Prescanner::SkipToNextSignificantCharacter() {
+bool Prescanner::SkipToNextSignificantCharacter() {
+  auto anyContinuationLine{false};
   if (inPreprocessorDirective_) {
     SkipCComments();
   } else {
@@ -449,6 +450,7 @@ void Prescanner::SkipToNextSignificantCharacter() {
       mightNeedSpace = *at_ == '\n';
     }
     for (; Continuation(mightNeedSpace); mightNeedSpace = false) {
+      anyContinuationLine = true;
       ++continuationLines_;
       if (MustSkipToEndOfLine()) {
         SkipToEndOfLine();
@@ -458,6 +460,7 @@ void Prescanner::SkipToNextSignificantCharacter() {
       tabInCurrentLine_ = true;
     }
   }
+  return anyContinuationLine;
 }
 
 void Prescanner::SkipCComments() {
@@ -625,7 +628,23 @@ bool Prescanner::NextToken(TokenSequence &tokens) {
     }
     preventHollerith_ = false;
   } else if (IsLegalInIdentifier(*at_)) {
-    while (IsLegalInIdentifier(EmitCharAndAdvance(tokens, *at_))) {
+    int parts{1};
+    do {
+      EmitChar(tokens, *at_);
+      ++at_, ++column_;
+      if (SkipToNextSignificantCharacter() && IsLegalIdentifierStart(*at_)) {
+        tokens.CloseToken();
+        ++parts;
+      }
+    } while (IsLegalInIdentifier(*at_));
+    if (parts >= 3) {
+      // Subtlety: When an identifier is split across three or more continuation
+      // lines, its parts are kept as distinct pp-tokens so that macro
+      // replacement operates on them independently.  This trick accommodates
+      // the historic practice of using line continuation for token pasting
+      // after replacement.
+    } else if (parts == 2) {
+      tokens.ReopenLastToken();
     }
     if (InFixedFormSource()) {
       SkipSpaces();
diff --git a/flang/lib/Parser/prescan.h b/flang/lib/Parser/prescan.h
index 84e046c1b102f0..7442b5d2263354 100644
--- a/flang/lib/Parser/prescan.h
+++ b/flang/lib/Parser/prescan.h
@@ -159,7 +159,8 @@ class Prescanner {
   void SkipToEndOfLine();
   bool MustSkipToEndOfLine() const;
   void NextChar();
-  void SkipToNextSignificantCharacter();
+  // True when input flowed to a continuation line
+  bool SkipToNextSignificantCharacter();
   void SkipCComments();
   void SkipSpaces();
   static const char *SkipWhiteSpace(const char *);
diff --git a/flang/test/Preprocessing/pp133.F90 b/flang/test/Preprocessing/pp133.F90
new file mode 100644
index 00000000000000..01e7b010d426ec
--- /dev/null
+++ b/flang/test/Preprocessing/pp133.F90
@@ -0,0 +1,9 @@
+! RUN: %flang -E %s 2>&1 | FileCheck %s
+! CHECK: print *, ADC
+#define B D
+implicit none
+real ADC
+print *, A&
+  &B&
+  &C
+end
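
As a reading aid for the `NextToken` hunk, the part-counting rule can be modeled outside the prescanner. This is a toy sketch only: `ToPreprocessingTokens` and `Cook` are hypothetical helpers, the fragments stand for the pieces of one identifier separated by line continuation, and macro handling is reduced to object-like lookup with no rescanning.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical toy model; these are not flang's data structures.
using Macros = std::map<std::string, std::string>;

// Mirrors the branches at the end of the hunk: with three or more parts the
// fragments stay distinct pp-tokens; with two parts the closed token is
// reopened, so both fragments end up in a single token.
std::vector<std::string> ToPreprocessingTokens(
    const std::vector<std::string> &fragments) {
  if (fragments.size() >= 3) {
    return fragments;
  }
  std::string whole;
  for (const auto &fragment : fragments) {
    whole += fragment;
  }
  return {whole};
}

// Apply macro replacement per pp-token, then concatenate the results, as the
// later tokenization of the cooked character stream effectively does.
std::string Cook(
    const std::vector<std::string> &fragments, const Macros &macros) {
  std::string cooked;
  for (const auto &token : ToPreprocessingTokens(fragments)) {
    auto it = macros.find(token);
    cooked += it == macros.end() ? token : it->second;
  }
  return cooked;
}
```

Under this model, with `B` defined as `D`, the fragments `A`, `B`, `C` from pp133.F90 cook to `ADC`, matching the `CHECK` line. A two-fragment split such as `A`, `B` stays `AB`, since the rejoined token `AB` is not a macro name.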

@klausler merged commit 4299d9b into llvm:main Jan 26, 2024
@klausler deleted the bug78797 branch January 26, 2024 00:01
@vvereschaka
Contributor

Hi @klausler,

just letting you know that there are two tests whose names differ only in case after your commit: pp133.F90 and pp133.f90.
I'm not sure that is what you intended.
Also, I suppose we could get some weird results for them on some systems.

https://github.com/llvm/llvm-project/tree/main/flang/test/Preprocessing
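
A collision like this matters on case-insensitive file systems, where both names refer to the same file. A minimal sketch of a check for such pairs follows, assuming ASCII-only case folding (real file systems may fold more characters). `FoldCase` and `CaseCollisions` are hypothetical helpers, not part of any LLVM tooling.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <map>
#include <string>
#include <vector>

// Lower-case a name using ASCII rules only (an assumption of this sketch).
std::string FoldCase(std::string name) {
  std::transform(name.begin(), name.end(), name.begin(), [](unsigned char c) {
    return static_cast<char>(std::tolower(c));
  });
  return name;
}

// Group names by their folded form and return every group with more than
// one member, i.e. names that differ only in letter case.
std::vector<std::vector<std::string>> CaseCollisions(
    const std::vector<std::string> &names) {
  std::map<std::string, std::vector<std::string>> byFolded;
  for (const auto &name : names) {
    byFolded[FoldCase(name)].push_back(name);
  }
  std::vector<std::vector<std::string>> collisions;
  for (const auto &entry : byFolded) {
    if (entry.second.size() > 1) {
      collisions.push_back(entry.second);
    }
  }
  return collisions;
}
```

Given the two names above plus an unrelated, made-up `other.f90`, the function reports a single group containing `pp133.F90` and `pp133.f90`.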

@klausler
Contributor Author

I'll rename the new one to avoid a case conflict. Thanks for noticing.

@klausler
Contributor Author

... done.

Development

Successfully merging this pull request may close these issues.

[flang] Failed substitution with preprocessor
4 participants