Skip to content

Comptime closures must die #5718

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
ghost opened this issue Jun 26, 2020 · 13 comments
Closed

Comptime closures must die #5718

ghost opened this issue Jun 26, 2020 · 13 comments
Labels
proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Milestone

Comments

@ghost
Copy link

ghost commented Jun 26, 2020

(FOLDED INTO #5895)

Comptime is executed lazily and all calls are memoised, which in the ideal case means that the programmer shouldn't have to think about the order in which operations occur. However, there is one artifact of execution order visible from userspace, and that's closure:

// Code lifted from #5578, comments my own, I hope that's ok

const testing = @import("std").testing;

fn GiveTwice(comptime valt: type) type {
    // (Currently) the only way to have comptime state bundled with runtime data
    // -- so, from the user's perspective, a perfectly sensible usage pattern
    comptime var times: usize = 2;
    return struct {
        val: valt,

        const Self = @This();

        pub fn init(_val: valt) Self {
            return Self{ .val = _val };
        }

        pub fn give(self: *Self) valt {
            comptime if (times == 0)
                @compileError("You can only 'give' 2 times");

            times -= 1;

            return self.val;
        }
    };
}

test "Test 1" {
    var a = GiveTwice(usize).init(1);
    _ = a.give();
    _ = a.give();
    _ = a.give(); // <-- fails to compile, as expected

    // GiveTwice is memoised, so this does not work as expected
    var b = GiveTwice(usize).init(1);
    _ = b.give(); // <-- this one also fails
    _ = b.give(); // <-- as does this one
    _ = b.give(); // <-- when really only this one should
}

https://youtu.be/riAs5zXq-Vw?t=60

Not only is it counterintuitive, it's incongruous with runtime semantics, just for the sake of it. I've proposed #5675 and #5873 to provide better defined solutions for the most common applications of this trick, but even if we have no alternative solutions, we shouldn't allow it.

Furthermore, I propose no comptime global variables. This way, comptime functions are strictly transformations on data. If you want comptime state, you need to express it in a linear execution context and pass it between functions as pointers, like the semantics of comptime var.

With this, we have no more guesswork or surprises -- you can see exactly how your code will compile without knowing all the details of the compiler.

@Vexu Vexu added the proposal This issue suggests modifications. If it also has the "accepted" label then it is planned. label Jun 26, 2020
@Vexu Vexu added this to the 0.7.0 milestone Jun 26, 2020
@SpexGuy
Copy link
Contributor

SpexGuy commented Jun 26, 2020

Tracking modifiability of pointers like that could end up being pretty expensive, and lead to strange compile errors in complex comptime code that are difficult to figure out. Would it be better/easier to to directly disallow comptime closures over mutable state? You would also have to disallow closures over const data containing pointers to mutable state, but I think that would be a reasonable compromise.

@pixelherodev
Copy link
Contributor

That would be the case with a blacklist, but as I understand it, this would be a whitelist. If it's explicitly provided, it's modifiable. That should be cheap enough to check, because the whitelist ought to be small.

@SpexGuy
Copy link
Contributor

SpexGuy commented Jun 26, 2020

The problem is that the white list is really easy to circumvent unless the compiler does a lot of internal state tracking throughout execution.

fn addOne(comptime foo: *comptime_int) void {
    // modification allowed, pointer is passed in
    foo.* += 1;
}

const func = comptime blk: {
    var x = 0;
    const xPtr = &x;
    break :blk struct {
        fn func() void {
            @compileLog(xPtr.*); // does not modify state
            comptime addOne(xPtr); // does not modify state at this level
            @compileLog(xPtr.*); // does not modify state
        }
    }.func;
};

@ghost
Copy link
Author

ghost commented Jun 27, 2020

@SpexGuy: That would also work. I'm not experienced enough in the domain to know whether it would be overly restrictive, though.

@ghost
Copy link
Author

ghost commented Jun 27, 2020

Lifted from Soren on Discord:

pub const ColorId = comptime blk: {
    if(number_of_colors > 0) {
        var result = 0;
        var work = 1;
        while(work < number_of_colors) {
            result += 1;
            work *= 2;
        }
        break :blk @Type(.{
            .Int = .{
                .is_signed = false,
                .bits = result
            }
        });
    } else @compileError("number_of_colors must be 1 or greater");
};

We'd want to allow this. I propose: entering a new function scope, or a nonlinear execution context, reifies all external variables to constants, thus Martin's func example would fail to compile as addOne would receive a *const comptime_int. This is not quite the same as disallowing closure over mutable state -- referencing external mutable state is still allowed, but the type checker will catch any mutations.

@SpexGuy
Copy link
Contributor

SpexGuy commented Jun 27, 2020

We're in agreement that the Soren example should compile. It doesn't have any functions or closures though, so I don't see how it would be affected by this proposal at all.

Converting everything to const recursively is an interesting idea, but you would need a way to recursively "constify" a struct and all types it references. Here's a modified example:

const Wrapper = struct { ptr: *comptime_int };
fn addOne(comptime wrap: Wrapper) void {
    // modification allowed, pointer is passed in
    wrap.ptr.* += 1;
}

const func = comptime blk: {
    var x = 0;
    const wrap = Wrapper{ .ptr = &x };
    break :blk struct {
        fn func() void {
            // what is @TypeOf(wrap) here?
            @compileLog(wrap.ptr.*); // does not modify state
            comptime addOne(wrap); // does not modify state at this level
            @compileLog(wrap.ptr.*); // does not modify state
        }
    }.func;
};

You could say that the constness is tracked in the pointer type but isn't formally part of the type system, but that leads to lots of state tracking to handle cases like this:

const BatchIncr = struct {
    numbers: []*usize;
    void incr(self: @This()) {
        for (self.numbers) |ptr| {
            ptr.* += 1;
        }
    }
};

const f = comptime outer: {
    var batch = BatchIncr{ .numbers = [0]*usize{} };
    const func = blk: {
        var x: usize = 0; // var x is in this execution context
        break :blk struct {
            pub fn func(comptime b: *BatchIncr, comptime uniq: var) void {
                var y: usize = 1; // y is in the execution context of the caller of this function
                b.numbers = b.numbers ++ &[_]*usize { &y, &x }; // no modification here, only escaping
            }
        }.func;
    }
    func(&batch, .{}); // creates a var y in this execution context
    batch.incr(); // both vars come from this execution context, so this is fine
    break :outer func;
};

comptime {
    var batch = BatchIncr{ .numbers = [0]*usize{} };
    f(&batch, .{}); // creates y in this context and references x from the other context.  x is pseudo-const in this context
    batch.incr(); // compile error when incrementing numbers[1]
}

And even if all of the tracking is correct and doesn't slow down comptime execution significantly, these errors are difficult to debug. The resulting compile error from the above example would be something like:

foo.zig:5:16: error: cannot modify closed over comptime var
            ptr.* += 1;
                ^
foo.zig:29:14: note: called from here
    batch.incr();
              ^

which is pretty inscrutable. Compared to that, only allowing closures over pointers that are already const gives errors that are always easy to understand, and should be relatively low-effort to fix in most cases (copy to a const pointer, then close over that). This would prevent closing over structs with mutable comptime state, but I think that's ok, since those are already structured and can be passed as parameters instead.

@ghost
Copy link
Author

ghost commented Jun 27, 2020

The problem there is returning a pointer to a local, which also should not be allowed. I didn't remember seeing an issue about that, so I created #5725 (EDIT: Vexu has since informed me that there is already work being done on this). Any examples of encapsulation breakage that don't rely on that?

@SpexGuy
Copy link
Contributor

SpexGuy commented Jun 27, 2020

This error message is indeed misleading, because &x is the cause, not &y. Comptime variables are not stored on a stack, so returning a pointer to a local var at compile time is fine. This has been proposed in the past as a method to construct a comptime allocator. But in both uses above, y is part of the current execution context. It is allowed to be modified. The cross-reference from the second incr() to &x from the first execution context is the error that needs to be addressed. x was a var created in a separate execution context, so it must be pseudo-const in the second execution context.

@ghost
Copy link
Author

ghost commented Jun 27, 2020

Maybe it doesn't cause problems, but it does cause confusion, and in fact it will cause the same problem if func is called again because it will be memoised. Every use case for this kind of thing can be accomplished by just returning a copy of the data, there's no need to allow pointers for it, especially if we apply the same restriction at runtime. (Well, there's closure-over-comptime-var, but I consider that an antipattern anyway.)

@SpexGuy
Copy link
Contributor

SpexGuy commented Jun 27, 2020

The uniq argument exists to prevent memoization, so it won't be cached in this example. But I agree that in general the caching causes strange behavior and it would be nice to have a better solution for that.

Every use case for this kind of thing can be accomplished by just returning a copy of the data, there's no need to allow pointers for it

I agree, good code would not do this. But I'm not talking about good code. I'm trying to present examples of pathological code that the compiler has to be able to deal with. There's no need to use pointers in this example, but that doesn't mean that this pattern can't be typed, and the language rules have to behave the same way when analyzing this example as they do everywhere else. At the end of the day, comptime has pointers and the language can't just ignore that. Here's a modified version that doesn't leak locals in a way that is quite as obvious:

const Wrap = struct { ptr: *usize };
const BatchIncr = struct {
    numbers: []Wrap;
    void incr(self: @This()) {
        for (self.numbers) |wrap| {
            wrap.ptr.* += 1;
        }
    }
};

const f = comptime outer: {
    var batch = BatchIncr{ .numbers = [0]*usize{} };
    var x: usize = 0; // var x is in this execution context
    const wrap = Wrap{ .ptr = &x };

    // This is where we seem to disagree.
    // Wrap contains a pointer to non-const data, so my
    // rule would prohibit closing over it.  If we don't
    // do that, then we need to track what execution context
    // created it.  The execution context of x is that of
    // `outer`.  In the absence of my rule, wrap is const,
    // so we can close over it.
    const func = struct {
        pub fn func(comptime b: *BatchIncr, comptime y: Wrap, comptime uniq: var) void {
            // @TypeOf(&wrap) == *const Wrap, but @TypeOf(wrap.ptr) == *usize.
            // no modification here, only value copies of structs
            b.numbers = b.numbers ++ &[_]Wrap{ y, wrap };
        }
    }.func;

    var y: usize = 0; // creates a var y in this execution context
    func(&batch, &y, .{}); // adds y and the closed-over x to the list
    batch.incr(); // both vars come from this execution context, so this is fine

    // This next line technically leaks a local var.  `func` references the global
    // constant `wrap`, which contains a pointer to var `x` declared in local scope
    // In theory, if we wanted to outlaw leaking references to comptime vars outside
    // of their declared scope, we could make a compile error here.  But that would
    // require traversing the entire reference graph of `func` in order to determine
    // its most constricting scope, which would be expensive.  We would have to do
    // this at all call sites and for all pointers at every scope boundary in order
    // to properly enforce it.  There also may be good reasons to allow these leaks,
    // as long as the leaked var is not modified.  The alternative, which is much
    // less expensive, is to store the owning execution context as compiler metadata
    // alongside each comptime var.  At every modification through a pointer, the
    // compiler must check that the execution context performing the modification is
    // compatible with the one that owns the comptime var.  When there is a pointer
    // to mutable state that is owned by an incompatible context, that state is
    // considered "pseudo-const".  It cannot be actually const because the type
    // system cannot represent execution contexts.  If a pseudo-const var is modified,
    // it raises a compile error.  This is what causes the `batch.incr()` call at the
    // end of this example to fail.
    break :outer func;
};

comptime {
    var batch = BatchIncr{ .numbers = [0]*usize{} };
    var y: usize = 0; // creates var y in this context
    f(&batch, &y, .{}); // references x from the other context.  x is pseudo-const in this context.
    batch.incr(); // compile error when incrementing numbers[1], that var is not owned by this context.
}

@ghost
Copy link
Author

ghost commented Jun 28, 2020

Ok, yes, I agree, disallowing mutable closure is a much simpler solution. Updated the issue with that.

@ghost ghost changed the title Proposal: Prohibit comptime functions from manipulating non-explicit external state Proposal: Comptime closures must die Jul 14, 2020
@ghost
Copy link
Author

ghost commented Jul 14, 2020

New title communicates the point much more clearly.

@ghost ghost changed the title Proposal: Comptime closures must die Comptime closures must die Jul 14, 2020
@ghost
Copy link
Author

ghost commented Jul 17, 2020

#5895 encompasses this. Closing.

@ghost ghost closed this as completed Jul 17, 2020
This issue was closed.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
proposal This issue suggests modifications. If it also has the "accepted" label then it is planned.
Projects
None yet
Development

No branches or pull requests

3 participants