[test_runner] test.py should presumably support errors in a non-entry-point library #44990


Closed
eernstg opened this issue Feb 12, 2021 · 28 comments
Assignees
Labels
area-test Cross-cutting test issues (use area- labels for specific failures; not used for package:test).

Comments

@eernstg
Member

eernstg commented Feb 12, 2021

I'd expect the test runner to recognize an error expectation in a library which is not the entry point, and consider an actual occurrence of such an error as a successful test run. For example:

// Store this as library 'tests/language/scratch_test.dart'.
import 'scratch_lib.dart';
main() => C();

// Store this as library 'tests/language/scratch_lib.dart'.
class C extends C {}
//    ^
// [analyzer] unspecified
// [cfe] unspecified

Running a fresh dartanalyzer from commit 65fab23, we get the following error message:

ERROR|COMPILE_TIME_ERROR|RECURSIVE_INTERFACE_INHERITANCE_EXTENDS|/usr/local/google/home/eernst/devel/dart/work/sdk/tests/language/scratch_lib.dart|1|7|1|'C' can't extend itself.

This error message should be matched by the expectation comment in scratch_lib.dart, but we still get the following outcome:

> cd $SDK_HOME
> python tools/test.py -n analyzer-asserts-strong-linux --verbose language/scratch_test
Test configuration:
    analyzer-asserts-strong-linux(architecture: x64, compiler: dart2analyzer, mode: release, runtime: none, system: linux, nnbd: strong, enable-asserts, use-sdk)
Suites tested: language
Running "dart2analyzer" command: DART_CONFIGURATION=ReleaseX64 out/ReleaseX64/dart-sdk/bin/dartanalyzer --ignore-unrecognized-flags --packages=/usr/local/google/home/eernst/devel/dart/work/sdk/.packages --format=machine --no-hints /usr/local/google/home/eernst/devel/dart/work/sdk/tests/language/scratch_test.dart

FAILED: dart2analyzer-none release_x64 language/scratch_test
Expected: Pass
Actual: CompileTimeError

--- Command "dart2analyzer" (took 02.000283s):
DART_CONFIGURATION=ReleaseX64 out/ReleaseX64/dart-sdk/bin/dartanalyzer --ignore-unrecognized-flags --packages=/usr/local/google/home/eernst/devel/dart/work/sdk/.packages --format=machine --no-hints /usr/local/google/home/eernst/devel/dart/work/sdk/tests/language/scratch_test.dart

unexpected analysis errors:
In tests/language/scratch_lib.dart:
- Line 1, column 7: COMPILE_TIME_ERROR.RECURSIVE_INTERFACE_INHERITANCE_EXTENDS
  'C' can't extend itself.

--- Re-run this test:
python tools/test.py -n analyzer-asserts-strong-linux language/scratch_test
Done dart2analyzer-none release_x64 language/scratch_test: fail

=== 0 tests passed, 1 failed ===

In other words, the test runner does not seem to expect that an error expectation comment can occur in any other library of a test than the entry point.
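As an aside, the `--format=machine` output quoted above is pipe-separated, which makes the mismatch easy to inspect mechanically. A minimal sketch of splitting such a line (the field order is inferred from the example above, not taken from a specification):

```python
# Split one line of dartanalyzer --format=machine output into its fields.
# Field order (severity|kind|code|path|line|column|length|message) is
# inferred from the example line above, not from a documented spec.
line = ("ERROR|COMPILE_TIME_ERROR|RECURSIVE_INTERFACE_INHERITANCE_EXTENDS"
        "|tests/language/scratch_lib.dart|1|7|1|'C' can't extend itself.")

# maxsplit=7 keeps any '|' characters inside the message intact.
severity, kind, code, path, row, col, length, message = line.split("|", 7)
print(severity, code, f"{path}:{row}:{col}", message)
```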

Motivation: This can be a problem in practice. Some tests are intended to handle mixed-version programs (some libraries opt in to null safety, others opt out), and they generally need to have an entry point which is opted out (in order to avoid diagnostic messages about reverse imports, optedIn --> optedOut). Next, some errors occur in opted-in code, so they can't be in the entry point.

It is possible that the error expectation comments are supposed to work also in non-entry-point libraries of a test, in which case this issue should be labeled as type-bug. However, it could also be considered a new and possibly useful behavior, in which case it should be labeled as type-enhancement. So I haven't added any labels about this dimension.
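For context, the expectation-comment format used in the example above can be sketched roughly as follows. This is an illustrative parser, not the test runner's actual implementation (the real comment grammar is more elaborate); it only recognizes a caret line followed by `// [analyzer]` / `// [cfe]` lines:

```python
import re

CARET = re.compile(r"\s*//\s*(\^+)")
TOOL = re.compile(r"\s*//\s*\[(analyzer|cfe)\]")

def find_expectations(source):
    """Collect caret-style static error expectations from one library."""
    lines = source.splitlines()
    expectations = []
    for i, text in enumerate(lines):
        if CARET.match(text):
            tools = []
            j = i + 1
            while j < len(lines) and (m := TOOL.match(lines[j])):
                tools.append(m.group(1))
                j += 1
            # The carets point at the previous source line, so the erroneous
            # code is on 1-based line i; the column is where the carets start.
            expectations.append(
                {"line": i, "column": text.index("^") + 1, "tools": tools})
    return expectations

lib = """class C extends C {}
//    ^
// [analyzer] unspecified
// [cfe] unspecified
"""
print(find_expectations(lib))
# → [{'line': 1, 'column': 7, 'tools': ['analyzer', 'cfe']}]
```

Note that line 1, column 7 matches the location the analyzer actually reports for `scratch_lib.dart`; the issue is only that the runner never looks at that file.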

@eernstg eernstg added the area-test Cross-cutting test issues (use area- labels for specific failures; not used for package:test). label Feb 12, 2021
@eernstg
Member Author

eernstg commented Feb 12, 2021

@munificent, what do you think?

@munificent
Member

Yeah, this is reasonable. The static error test support doesn't currently handle this because we didn't have a need for it in the tests I'd seen, but it makes sense to add support for it.

dart-bot pushed a commit that referenced this issue Mar 17, 2021
This test failed because it had a test outcome expectation comment in
a library which is not the entry point (cf. #44990). This CL changes
the test such that said comment is located in the entry point. The
trade-off is that this test now has a "reverse" import: A library
with null safety enabled imports a legacy library (but, apparently,
this does not cause the test to fail).

PS: This means that we don't have a test that will go green when #44990 is resolved.

Change-Id: Ie94bff22ce75bd662752c5814917e141fafc72ed
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/191365
Reviewed-by: Leaf Petersen <[email protected]>
Commit-Queue: Erik Ernst <[email protected]>
@sgrekhov
Contributor

I hit this as well.

// lib.dart
library Libraries_and_Scripts_A03_t10_lib;
//      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
// [analyzer] unspecified
// [cfe] unspecified

// test.dart
import "lib.dart";

main() {
}

How can I expect a static error here? I want to check that there is a compile-time error if the library name is wrong, but there are no erroneous lines in test.dart, and expectations in lib.dart don't work.

@eernstg
Member Author

eernstg commented Jun 8, 2021

@munificent, I think it's very likely that the tests that need to have expected errors outside the entry point could be written in such a way that they only have those errors in one library. Would it be easier to adjust the test runner such that it could handle this situation, e.g., based on a directive in the entry point?:

// ----- 'my_test.dart'
// ExpectedErrorsIn='my_test_part.dart'

part 'my_test_part.dart';
void main() {}

// ----- my_test_part.dart'

part of 'some_other_library.dart';
//      ^^^^^^^^^^^^^^^^^^^^^^^^^
// [analyzer] Some error message about the wrong part-of URI.

This might also help avoid a performance problem: if the test runner always needs to scan through every reachable library to see whether it contains any error expectations, then this could slow down test runs quite a lot.
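The directive proposed above could be recognized with something as simple as the following. Note that `ExpectedErrorsIn` is a name suggested in this comment, not an implemented test-runner feature:

```python
import re

# Hypothetical directive from this comment, not an implemented feature:
# "// ExpectedErrorsIn='some_file.dart'" in the test entry point names an
# extra file whose error expectations the runner should honor.
DIRECTIVE = re.compile(r"\s*//\s*ExpectedErrorsIn='([^']+)'")

def extra_expectation_files(entry_point_source):
    """Return the extra files whose error expectations should be honored."""
    return [m.group(1)
            for line in entry_point_source.splitlines()
            if (m := DIRECTIVE.match(line))]

entry = """// ExpectedErrorsIn='my_test_part.dart'
part 'my_test_part.dart';
void main() {}
"""
print(extra_expectation_files(entry))  # → ['my_test_part.dart']
```

Because the directive lives in the entry point, the runner only has to read files it was explicitly told about, rather than scanning the whole import closure of every test.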

@scheglov
Contributor

FWIW, this issue accounts for 90 out of 269 tests that fail on analyzer-asserts-strong-linux.

@munificent
Member

Would it be easier to adjust the test runner such that it could handle this situation, e.g., based on a directive in the entry point?:

// ----- 'my_test.dart'
// ExpectedErrorsIn='my_test_part.dart'

part 'my_test_part.dart';
void main() {}

// ----- my_test_part.dart'

part of 'some_other_library.dart';
//      ^^^^^^^^^^^^^^^^^^^^^^^^^
// [analyzer] Some error message about the wrong part-of URI.

Yeah, this isn't a bad idea. I'm a little worried about the failure mode. If you forget the // ExpectedErrorsIn= comment, then the test runner will silently ignore the errors in some_other_library.dart. If the implementation incorrectly fails to emit those errors, the test runner won't catch the bug. You'll think the test is passing, but it's actually doing nothing.

Here's an idea: We could say that imported/part files that contain static error expectations should end in _error.dart. If the test runner sees any references to a path with that suffix in a test, it will implicitly read that file for its static error expectations.

But, also, we can add support to the test runner to report an error if it sees a file that does not contain that suffix (or _test.dart) but does contain any static error expectations. That way, the test runner won't silently ignore unused error expectations. What do you think?
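The naming convention proposed above amounts to a simple path predicate plus a lint. The `_error.dart` suffix is a suggestion from this comment, not an implemented test-runner rule:

```python
# Sketch of the suffix convention proposed above; "_error.dart" is a
# suggestion from this comment, not an implemented test-runner rule.
def may_contain_expectations(path):
    """True if the file is allowed to carry static error expectations."""
    return path.endswith("_test.dart") or path.endswith("_error.dart")

def check(path, has_expectations):
    """Report expectations the runner would otherwise silently ignore."""
    if has_expectations and not may_contain_expectations(path):
        return f"{path}: static error expectations would be silently ignored"
    return None

print(check("scratch_lib.dart", True))
```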

@eernstg
Member Author

eernstg commented Aug 18, 2021

I'm a little worried about the failure mode.

Good point. The requirement that we'd want to enforce here is a property of the test source code alone, so it should be sufficient to check this kind of well-formedness as a presubmit check (no need to do it for every execution of the test runner).

We would then detect that a few test entry point libraries have // ExpectedErrorsIn=, and we'd put the transitive closure of imports from there into a positive list. Then we'd check that the error expectation comment syntax (// ^^^ and so on) only occurs in a library which is (1) outside the SDK test directories (they can do whatever they want), or (2) a test entry point, or (3) on the positive list. This ought to be sufficient to detect the failures you mention.
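The presubmit check described above can be sketched as a small reachability computation. This is a hedged illustration: the `ExpectedErrorsIn` directive is hypothetical, `sources` stands in for reading files from the test directories, and only single-quoted local imports are followed:

```python
import re

# Local (non-"dart:"/"package:") imports; the [^':]+ class skips URIs
# containing a colon.
IMPORT = re.compile(r"import\s+'([^':]+)'")

def allow_list(sources, entry_points):
    """Transitive closure of local imports from directive-carrying entry
    points; `sources` maps path -> source text (a stand-in for the real
    test directories)."""
    allowed = set()
    work = [p for p in entry_points if "ExpectedErrorsIn=" in sources[p]]
    while work:
        path = work.pop()
        if path in allowed:
            continue
        allowed.add(path)
        for m in IMPORT.finditer(sources[path]):
            if m.group(1) in sources:
                work.append(m.group(1))
    return allowed

sources = {
    "my_test.dart": "// ExpectedErrorsIn='my_lib.dart'\n"
                    "import 'my_lib.dart';\nmain() {}\n",
    "my_lib.dart": "class C {}\n",
    "other_test.dart": "main() {}\n",
}
print(allow_list(sources, ["my_test.dart", "other_test.dart"]))
# → {'my_test.dart', 'my_lib.dart'} (set, order may vary)
```

A presubmit would then flag any file containing expectation syntax that is neither an entry point nor on this list.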

@munificent
Member

We would then detect that a few test entry point libraries have // ExpectedErrorsIn=, and we'd put the transitive closure of imports from there into a positive list. Then we'd check that the error expectation comment syntax (// ^^^ and so on) only occurs in a library which is (1) outside the SDK test directories (they can do whatever they want), or (2) a test entry point, or (3) on the positive list. This ought to be sufficient to detect the failures you mention.

SGTM.

@joshualitt
Contributor

joshualitt commented Oct 14, 2022

I just ran into this issue too. I think this is a bigger problem than it appears to be. These tests are very hard to manage, because they show up on the results page as failures. Whenever they get 'tickled', whether by new features, a new config, or a new backend, they become effectively impossible to triage without domain knowledge.

@joshualitt
Contributor

Actually, this issue is even bigger than I had suspected.

Crucially, when failure tests are written with failures outside of the entry library, we do not validate any errors, we only verify that they continue to fail. This results in significantly less test coverage, and causes these tests to bitrot.

I've created an experimental CL to crawl local imports / part files and add any static errors uncovered. This CL does not handle the case where the errors are in another package, but it does manage to address over half of these tests. As might be expected, there are a number of issues with some of these failure tests, for example some have the incorrect error strings, some have missing errors, etc.

I'm not sure we actually want to try to land the above CL, but I think it is important we fix this before these tests rot more.

@mkustermann
Member

It seems like this issue in the test runner causes us to have failures where both the test and the runtime do the right thing (the test expects a compile-time error, and the runtime produces a compile-time error), but the test runner reports a failure because it thinks the test should pass (e.g. co19/Language/Libraries_and_Scripts/definition_syntax_t08).

This is very problematic as backends want to have no approved failures in the results database, but that cannot be achieved if the test runner has this problem.

@munificent as you're assigned as the owner, are you working on a fix for this?

@leafpetersen
Member

cc @kallentu

@munificent
Member

@munificent as you're assigned as the owner, are you working on a fix for this?

Alas, no, I don't have any bandwidth for the test runner these days.

@munificent munificent removed their assignment Dec 7, 2023
@leafpetersen
Member

This is probably going to be pretty important for macros. @davidmorgan any interest in learning more about how the test runner works... ? :)

@davidmorgan
Contributor

I think so yes, I have a TODO from yesterday to start looking at language tests for macros and this does indeed look relevant.

@munificent
Member

I'd be happy to spend time helping you ramp up on the test runner. Feel free to throw something on my calendar and I can give you a braindump.

@davidmorgan
Contributor

SG; knowledge transfer just before vacation does not seem a winning proposition, so sent an invitation for 2024 ;)

@davidmorgan davidmorgan self-assigned this Jan 10, 2024
@davidmorgan
Contributor

I resurrected Joshua's PR from #44990 (comment) and the results seem promising: quite a few tests go from CompileTimeError to Pass because their existing error expectations are already correct:

https://dart-ci.firebaseapp.com/cl/345541/2

I started fixing some not-quite-correct co19 tests then noticed they're not in the SDK depot.

@munificent guessing here, is the path to land this with co19 fixes: 1) get the test runner change ready, 2) merge the co19 fixes, 3) merge the test runner fix along with a DEPS bump so the co19 changes roll at the same time?

@munificent
Member

I'm not sure how co19 changes get rolled in. @eernstg would know better how we should manage this transition.

@eernstg
Member Author

eernstg commented Jan 15, 2024

For the co19 tests I think a good approach would be to create issues on the co19 repo, https://github.com/dart-lang/co19/issues, rather than changing the co19 tests directly. This will allow the usual pipeline to be used (that is: updating the co19 repo and rolling a specific commit into the SDK, thus updating the co19 tests as seen from the SDK).

@sgrekhov, do you agree with this recommendation?

@davidmorgan, would it be a problem for you or anyone you're aware of that this approach introduces a certain delay (from the time where the test is adjusted and until the time where the corresponding commit is rolled into the SDK)?

@sgrekhov
Contributor

@eernstg yes, I agree. @davidmorgan, please file a co19 issue with the list of not-quite-correct tests and explain what should be changed, and I'll do the work. An alternative is a direct PR to the co19 repo.

@davidmorgan
Contributor

Sent dart-lang/co19#2494

@davidmorgan
Contributor

davidmorgan commented Feb 14, 2024

Done in https://dart-review.googlesource.com/c/sdk/+/345541 ... the co19 failure fixes will follow later, so the failures are approved for now. dart-lang/co19#2497

Please reopen if anything seems not as expected :)

@eernstg
Member Author

eernstg commented Mar 7, 2024

I think augmentation libraries are not included when the local source entities are scanned for test expectations.

@eernstg eernstg reopened this Mar 7, 2024
@davidmorgan
Contributor

That sounds likely :)

@sgrekhov do you happen to have an example test with augmentation libraries I can look at, please, so I can see what should work here?

@sgrekhov
Contributor

sgrekhov commented Mar 7, 2024

There is no committed test yet. But please see PR dart-lang/co19#2561

$ python3 tools/test.py -n analyzer-asserts-linux co19/LanguageFeatures/Augmentation-libraries/defining_augmentation_A02_t04
No build targets found.
Test configuration:
    analyzer-asserts-linux(architecture: x64, compiler: dart2analyzer, mode: release, runtime: none, system: linux, enable-asserts, use-sdk)
Suites tested: co19

FAILED: dart2analyzer-none release_x64 co19/LanguageFeatures/Augmentation-libraries/defining_augmentation_A02_t04
Expected: Pass
Actual: CompileTimeError

--- Command "dart2analyzer" (took 03.000851s):
DART_CONFIGURATION=ReleaseX64 out/ReleaseX64/dart-sdk/bin/dart out/ReleaseX64/gen/dartanalyzer.dart.snapshot -Dtest_runner.configuration=analyzer-asserts-linux --enable-experiment=macros --ignore-unrecognized-flags --packages=/home/sgrekhov/Google/dart-sdk/sdk/.dart_tool/package_config.json --format=json /home/sgrekhov/Google/dart-sdk/sdk/tests/co19/src/LanguageFeatures/Augmentation-libraries/defining_augmentation_A02_t04.dart

unexpected analysis errors in defining_augmentation_A02_t04_lib2.dart:
- Line 18, column 9: COMPILE_TIME_ERROR.UNDEFINED_IDENTIFIER
  Undefined name 'AL'.

--- Re-run this test:
python3 tools/test.py -n analyzer-asserts-linux co19/LanguageFeatures/Augmentation-libraries/defining_augmentation_A02_t04
[00:04 | 100% | +    0 | -    1]

=== 0 tests passed, 1 failed ===

But defining_augmentation_A02_t04_lib2.dart does expect an error on line 18

@davidmorgan
Contributor

Thanks! https://dart-review.googlesource.com/c/sdk/+/356401 should do it.

copybara-service bot pushed a commit that referenced this issue Mar 8, 2024
…libraries.

[email protected]

Change-Id: I52fd157be6ba561f571170ce393d80820b2744dc
Bug: #44990
Reviewed-on: https://dart-review.googlesource.com/c/sdk/+/356401
Reviewed-by: Erik Ernst <[email protected]>
Auto-Submit: Morgan :) <[email protected]>
Commit-Queue: Morgan :) <[email protected]>
@sgrekhov
Contributor

It works! Great! Thank you!
