
A way to force the compiler to accept falsely named "Undefined behavior" #63359


Closed
ghost opened this issue Aug 7, 2019 · 9 comments
Labels
A-const-eval Area: Constant evaluation, covers all const contexts (static, const fn, ...) C-feature-request Category: A feature request, i.e: not implemented / a PR. T-lang Relevant to the language team, which will review and decide on the PR/issue.

Comments

@ghost

ghost commented Aug 7, 2019

I'm trying to execute some data in Rust, however I'm getting this error:

error[E0080]: it is undefined behavior to use this value
  --> src/main.rs:9:1
   |
9  | / const executable_data: extern "C" fn() = unsafe {
10 | |     std::mem::transmute(address)
11 | | };
   | |__^ type validation failed: encountered a pointer, but expected a function pointer
   |
   = note: The rules on what exactly is undefined behavior aren't clear, so this check might be overzealous. Please open an issue on the rust compiler repository if you believe it should not be considered undefined behavior

Here's the part of my code that matters:

const executable_data: extern "C" fn() = unsafe {
    std::mem::transmute(address)
};

In my opinion, there should be a way to force the compiler to accept something like this, because I'm very certain that this is not undefined behavior. Even if it is, I would like to know for certain by learning it the hard way, but sadly the compiler doesn't even let me compile it, so I can't.

@ghost ghost changed the title Force the compiler to accept falsely named "Undefined behavior" A way to force the compiler to accept falsely named "Undefined behavior" Aug 7, 2019
@jonas-schievink jonas-schievink added A-const-eval Area: Constant evaluation, covers all const contexts (static, const fn, ...) C-feature-request Category: A feature request, i.e: not implemented / a PR. T-lang Relevant to the language team, which will review and decide on the PR/issue. labels Aug 7, 2019
@Mark-Simulacrum
Member

It would be good to know the type of address (or data_address, your two snippets don't match).

However, I will caution you that "learning it the hard way" with UB isn't really possible; if your code has UB, the compiler is free to do anything. It might even do what you expect it to do (today), but you cannot rely on this. For now, you can most likely get this to work by using a let variable instead of a const.
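A minimal sketch of the `let` suggestion, under stated assumptions: the issue does not show how `address` is defined, so this uses the address of a real function (`answer`, a hypothetical stand-in) instead of raw bytes. It only illustrates that the compile-time validation from the error above applies to the `const` item; a runtime `let` binding is not validated that way. Whether the resulting call is well-defined still depends on what the pointer actually refers to.

```rust
use std::mem;

// Hypothetical stand-in for the code the issue wants to execute.
extern "C" fn answer() -> u32 {
    42
}

fn call_via_raw_pointer() -> u32 {
    // Erase the function type, giving a `*const u8` like the `address` in the issue.
    let fp: extern "C" fn() -> u32 = answer;
    let address: *const u8 = fp as *const u8;

    // The transmute happens at run time in a `let`, not in a `const` initializer.
    // SAFETY: `address` was derived from a real `extern "C" fn() -> u32` above.
    let f: extern "C" fn() -> u32 = unsafe { mem::transmute(address) };
    f()
}

fn main() {
    println!("{}", call_via_raw_pointer());
}
```

Here the pointer is known to come from a function with the matching signature, which is what makes the `unsafe` block justifiable; that is not established for an arbitrary byte array.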

cc @RalfJung

@Lonami
Contributor

Lonami commented Aug 7, 2019

#63197 has more discussion related to UB from using arbitrary addresses.

@ghost
Author

ghost commented Aug 7, 2019

@Mark-Simulacrum I need to use const in my use case. The type of address is *const u8. Also, maybe it wouldn't be such a good idea to make this change universal, but instead to make it an option in the compiler to simply disable this type of error.

@Mark-Simulacrum
Member

What is your use case? Why can't you use a let variable, or cast at use sites, maybe via a macro?
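One way "cast at use sites, maybe via a macro" could look, as a sketch only. The macro name `as_fn!`, the function `seven`, and the helper `address_of_seven` are all hypothetical; the idea is to keep the stored value a plain raw pointer and convert it to a function pointer only where it is called.

```rust
// Hypothetical stand-in for the code to be executed.
extern "C" fn seven() -> u32 {
    7
}

// Hypothetical helper: converts a `*const u8` to the given function-pointer
// type at the use site. The caller is responsible for the address really
// pointing to a function with that signature.
macro_rules! as_fn {
    ($addr:expr, $sig:ty) => {
        unsafe { ::std::mem::transmute::<*const u8, $sig>($addr) }
    };
}

// The address is kept as a raw pointer; no function-pointer constant is needed.
fn address_of_seven() -> *const u8 {
    let fp: extern "C" fn() -> u32 = seven;
    fp as *const u8
}

fn call_through_macro() -> u32 {
    let f = as_fn!(address_of_seven(), extern "C" fn() -> u32);
    f()
}

fn main() {
    println!("{}", call_through_macro());
}
```

Hiding `unsafe` inside a macro makes the call sites look safer than they are, so in real code the safety requirement would need to be documented on the macro.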

@ghost
Author

ghost commented Aug 7, 2019

@Mark-Simulacrum I'm trying to directly execute hex opcodes in Rust. For that, I need the data to be in the text section of the .o file output, which is why I need const. Anything else would put it in the data section, if I understood some stuff I read lately. There is an array called bytes which contains the hex opcodes, each of which is of type u8, and I need to call its address in memory to execute it. For clarification, I do have the hex opcode for the ret instruction on x86-64 in the bytes array, so there is no need to worry about the CPU running through memory.

@Mark-Simulacrum
Member

The data being in the text section is independent of where the address to said data is stored.

How are you putting the bytes array in your executable? A linker script of some sort? Either way, it sounds like it'd be better to ask on users.rust-lang.org as this error isn't really related to your main problem/task (i.e., it's not the problem). There's also a more diverse audience there who can most likely provide more aid than I.

@ghost
Author

ghost commented Aug 7, 2019

@Mark-Simulacrum Data in the data section cannot be executed; however, data in the text section can be (on Linux, which is my environment). I'll take a look at users.rust-lang.org.

@RalfJung
Member

RalfJung commented Aug 7, 2019

I'm very certain that this is not undefined behavior. Even if it is, I would like to know for certain by learning it the hard way, but sadly the compiler doesn't even let me compile it, so I can't.

UB doesn't imply the compiler will "miscompile" your code. It means your compiler may "miscompile" your code. ("miscompile" in quotes because there is actually no miscompilation going on here -- if the code is UB, the compiler can output anything, including invalid machine code, and it still constitutes a correct compilation.) It also means how your code compiles may change at any time. It is fundamentally impossible to test for that by only considering the generated binary.

You can learn UB by reading the documentation and playing with tools like Miri, but it is impossible to learn UB from what the compiler does.

Also see my recent blog post on the topic:

“What the hardware does” [i.e., what the program does when run on real hardware] is most of the time irrelevant when discussing what a Rust/C/C++ program does, unless you already established that there is no undefined behavior. [...]
Only UB-free programs can be made sense of by looking at their assembly [or running them on real hardware], but whether a program has UB is impossible to tell on that level. For that, you need to think in terms of the abstract machine.

@oli-obk
Contributor

oli-obk commented Aug 10, 2019

Note that only the bytecode of your function needs to be in text. There's no need to have the function pointer in text.


5 participants