c2rust-analyze: initial implementation of error recovery #876

spernsteiner · 2023-03-29T18:10:05Z

This branch adds a basic implementation of error recovery: if the static analysis panics on some function, it now catches the panic, marks that function as invalid, and continues with analysis of the remaining functions. At the end, after completing as much analysis and rewriting as it could, the tool prints a report of all the panics it encountered, including a stack trace for each one.

Currently, when marking a function as failed/invalid, we immediately mark all of its (transitive) callers invalid as well. In the future we can probably relax this, falling back on something like Trivial or on dynamic analysis results to let the callers reason about the unanalyzed function.

The current approach produces a lot of invalid Rust code. Suppose f calls g, f fails analysis, but g succeeds and gets rewritten. Currently we don't rewrite f at all due to the previous failure, and if g's signature gets rewritten, f's call to g will no longer be valid. In the future we'll need a better scheme for calling back and forth between analyzed and unanalyzed functions.

kkysen

We also need to change the existing feature-specific tests to check that there were no caught panics, rather than just not panicking. That could be in a follow-up PR, but I think it should merge right after this one so we're not left in an intermediate state where tests can silently fail.

kkysen · 2023-03-29T18:41:20Z

In the future we can probably relax this, falling back on something like Trivial

What do you mean by this? I was planning on entirely removing Triviality (#874) as the current version is unsound and we were having trouble determining a sound version, and I was just going to hardcode the libc functions.

spernsteiner · 2023-03-29T20:37:06Z

We also need to change the existing feature-specific tests to check that there were no caught panics

Which tests are these? I'd prefer to fix it in this PR so master isn't put into an inconsistent state.

What do you mean by this? I was planning on entirely removing Trivial

I haven't fully thought this through, but I think we may want a way to approximate the behavior of a function g on its caller f even if g can't be fully analyzed. Otherwise, a failure in g would mean we can't analyze f at all. I'm not sure what soundness guarantees, if any, we can provide in the case where analysis has panicked on some functions; if we can't provide guarantees regardless, then this approximation wouldn't need to be sound.

spernsteiner · 2023-03-29T22:12:47Z

Panic reports now include a source span within the code being processed, if one is known. The output format looks like this:

panic at c2rust-analyze/src/context.rs:546:9: unexpected pointer type in *mut *mut _IO_FILE
related location: <rustc_middle::mir::syntax::Operand as c2rust_analyze::context::TypeOf>::type_of::h618b9ae9e6b3fd09 @ /.../c2rust-analyze/src/context.rs:537:41
source location: /.../lighttpd_rust_amalgamated//src/main.rs:91380:13: 91380:19 (#0)
   0: c2rust_analyze::panic_detail::panic_hook
             at src/panic_detail.rs:72:14
   1: core::ops::function::Fn::call
             at /rustc/d394408fb38c4de61f765a3ed5189d2731a1da91/library/core/src/ops/function.rs:77:5
   2: std::panicking::rust_panic_with_hook
             at /rustc/d394408fb38c4de61f765a3ed5189d2731a1da91/library/std/src/panicking.rs:702:17
   3: ...

This makes it easier to understand the context in which the panic occurs.

c2rust-analyze/src/main.rs

aneksteind · 2023-03-30T18:58:00Z

c2rust-analyze/tests/filecheck/catch_panic.rs

+
+// Analysis of `bad2` should also fail because it has a callee on which analysis failed.
+// CHECK-NOT: final labeling for "bad2"
+unsafe fn bad2(p: NonNull<u8>) {


Thoughts on adding a test asserting the functionality we know to be broken, e.g. that good2 defined before bad3 where bad3 calls good2, will not produce any of the error messages below? That way we can develop against this test case later on

I added a new bad_call_good function, which fails analysis and also calls good, and a check that this does not result in the failure propagating from bad_call_good to its callee good. But I'm not sure if that's what you were suggesting.

c2rust-analyze/src/panic_detail.rs

kkysen · 2023-04-02T07:04:20Z

We also need to change the existing feature-specific tests to check that there were no caught panics

Which tests are these? I'd prefer to fix it in this PR so master isn't put into an inconsistent state.

Basically all of them. They all currently check that c2rust-analyze doesn't panic, but now the meaning of panicking changes, so those tests should be changed to check that c2rust-analyze has 0 caught panics.

c2rust-analyze/src/panic_detail.rs

kkysen · 2023-04-02T07:13:43Z

c2rust-analyze/src/context.rs

+        self.fns_failed.insert(did, detail);
+
+        // This is the first time marking `did` as failed, so also mark all of its callers.
+        let callers = self.fn_callers.get(&did).cloned().unwrap_or(Vec::new());


Suggested change

let callers = self.fn_callers.get(&did).cloned().unwrap_or(Vec::new());

let callers = self.fn_callers.get(&did).cloned().unwrap_or_default();

This only removes a type hint that might be useful for the reader.

This is a clippy::or_fun_call warning. I don't see a good reason not to follow it. The fact that it's Vec isn't important here.

This is a false positive in Clippy. "This is only bad if it allocates or does some non-trivial amount of work," but "The lint also cannot figure out whether the function you call is actually expensive to call or not." In this case the function being called is const, so it doesn't need to do any run-time work at all. The lint in question has also been downgraded to allow-by-default due to the high rate of false positives.

This one is a slightly different warning. When it suggests using a closure that calls a const fn, that is more of a false warning as you were saying, but in this case, .unwrap_or_default() is functionally different (I'm surprised it falls under the same clippy lint, as it's quite different). Using Default::default() for this is more semantically clear (we don't care that it's actually Vec::new(), but that it's the default, empty container). We could conceivably change it to ThinVec, Box<[T]>, IndexSet, etc., in which case the .unwrap_or_default() wouldn't need to be changed since it's the same semantic logic.

That said, I don't really want to spend time debating the merits of clippy lints. I think we should just do what clippy says here. There's no major reason not to, and using Default here is more semantically specific.

I think we should just do what clippy says here. There's no major reason not to

There's no good reason to make this change. The fact that Clippy suggested it isn't a good reason on its own, because Clippy often gives bad advice. (Any clippy lints that do regularly give good advice tend to get uplifted into rustc.)

The fact that it's Vec isn't important here.

I disagree. Knowing the type of xs is useful for knowing what you'll get if you do for x in xs.

As far as I know, the official policy for this repo does not require the code to be clippy-clean. Unless that changes (which I would oppose), I don't see any value in making the code (very slightly) worse just to appease the tool.

c2rust-analyze/src/panic_detail.rs

c2rust-analyze/src/main.rs

kkysen

Could 059628a and 86dfd30 be split out into a separate PR and merged immediately? Those 2 commits look good to me and are independent of the panicking stuff in the rest of this PR.

c2rust-analyze/src/context.rs

kkysen · 2023-04-02T08:17:27Z

c2rust-analyze/src/panic_detail.rs

+/// Detailed information about a panic.
+#[derive(Clone, Debug)]
+pub struct PanicDetail {
+    msg: String,
+    loc: Option<String>,
+    relevant_loc: Option<String>,
+    backtrace: Option<Backtrace>,
+    sp: Span,
+}


Could we separate PanicDetail (an actual panic occurred) and FnFailure (either a panic occurred or a called function failed)?

Also, we could separate the data itself with the formatting/displaying. Something more like this:

struct PanicDetail { msg: String, loc: String, backtrace: Backtrace, span: Span, } impl PanicDetail { pub fn from_info(info: &PanicInfo) -> Self { Self { msg: panic_to_string(info.payload()), loc: info.location().unwrap().to_string(), backtrace: Backtrace::new_unresolved(), span: CURRENT_SPAN.with(|cell| cell.get()), } } }

and then impl Display on it, or wrappers. Right now the data and formatting are quite intertwined.

Could we separate PanicDetail (an actual panic occurred) and FnFailure (either a panic occurred or a called function failed)?

I think it's okay to leave this for now, since the "called function failed" will (hopefully) be going away some time soon.

we could separate the data itself with the formatting/displaying

I'm not entirely clear on what you're suggesting, but this also seems fine to me as-is. Such implementation details of the panic_detail module don't affect much outside the module itself, which is small and seems unlikely to need major changes in the future.

Most of these fields are optional, all except for msg. They are all set when an actual panic occurs, but when it's a function failing due to one its callees failing, only msg is used. I was thinking it could like this:

struct PanicDetail { msg: String, loc: String, backtrace: Backtrace, span: Span, } impl PanicDetail { pub fn from_info(info: &PanicInfo) -> Self { Self { msg: panic_to_string(info.payload()), loc: info.location().unwrap().to_string(), backtrace: Backtrace::new_unresolved(), span: CURRENT_SPAN.with(|cell| cell.get()), } } } enum FnFailure { Panic(PanicDetail), CalleeFailed(String), }

Also, as I commented in #876 (comment), doing info.location().unwrap() simplifies things here more than just that one line, as then uses of PanicDetail::loc don't need to handle the Option.

I'm not sure what you mean by the "called function failed" will be going away soon, though.

I'm not sure what you mean by the "called function failed" will be going away soon, though.

I'm currently working on better error recovery, so that we'll no longer need to propagate analysis failures up to callers.

kkysen · 2023-04-02T08:28:20Z

c2rust-analyze/src/panic_detail.rs

+    let bt = Backtrace::new();
+    let detail = PanicDetail {
+        msg: panic_to_string(info.payload()),
+        loc: info.location().map(|l| l.to_string()),


I think it's fine to do info.location().unwrap(). Currently, it always returns Some, and std .unwrap()s, too. That would simplify a bunch of the logic.

This change only simplifies this single line (loc still needs to be Option since PanicDetail::new doesn't provide a location), and it would make us reliant on an implementation detail of the standard library, so I don't think it's worth it.

kkysen

Should we also set up a snapshot test for this? For the caught panics report. I think that could be pushed to a secondary PR.

c2rust-analyze/src/util.rs

c2rust-analyze/tests/common/mod.rs

c2rust-analyze/src/main.rs

spernsteiner · 2023-05-18T23:50:54Z

Rebased and addressed some final comments about panic-catching defaults in tests

kkysen

There were a few things I thought should be done a bit differently, but they're ultimately minor. The main parts of this PR—catching and correctly handling panics, and making sure our tests still test the same thing—all look good to me now.

Co-authored-by: Khyber Sen <[email protected]>

spernsteiner requested review from kkysen and aneksteind and removed request for kkysen March 29, 2023 18:10

kkysen reviewed Mar 29, 2023

View reviewed changes

aneksteind reviewed Mar 30, 2023

View reviewed changes

c2rust-analyze/src/main.rs Outdated Show resolved Hide resolved

aneksteind reviewed Mar 30, 2023

View reviewed changes

c2rust-analyze/src/main.rs Show resolved Hide resolved

aneksteind reviewed Mar 30, 2023

View reviewed changes

c2rust-analyze/src/panic_detail.rs Show resolved Hide resolved