Skip to content

[WIP] Add mutex traits #132

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

Conversation

Rahix
Copy link

@Rahix Rahix commented May 1, 2019

I've finally gotten around to writing a proposal for an embedded_hal mutex trait as discussed in #119.

Idea

I have found the most generic way for a mutex implementation is to use a closure passed to the lock function. A design like the std mutex where the lock function returns a guard is currently not possible because of the way most hardware-crates implement critical sections. This means, with the current proposal, mutex usage would look like this:

m.lock(|v| {
  // Access is available in here ...
}).unwrap();

Mutability

There are two mutex traits available: RoMutex and RwMutex. RoMutex is not like RwLock where multiple read accesses are allowed simultaneously. RoMutex only allows a single context to gain exclusive read access. RwMutex is in line with the std mutex type.

The reason for having this separation is to allow efficient implementations based on both std and bare-metal mutex types. For compatibility, there are two blanket implementations:

Mutex Creation

I think, a lot of driver crates which would benefit from this trait will need to create the mutex somewhere inside their initialization routine. For this to be possible, the trait should have a generic create or new method. I went with create to prevent name conflicts.

Because this is shared between RoMutex and RwMutex, I separated it into its own trait: Mutex. I also made it return a Result, in case an implementation might fail.

Example Implementations

I've written two example implementations for these traits which I included in the documentation. A rendered version can be found here.

Open Questions

Return value of create()

In the current design, I made create() return a result. However, none of the mutex implementations I looked at would actually be able to fail, which makes me unsure if a Result is really necessary here ...

Usefulness of RoMutex

I am unsure if RoMutex is actually useful. I added it for easy compatibility with the bare-metal mutex but I don't know of any real uses. Maybe I misunderstood the reasoning for the design in bare-metal?

Docs

I added cortex-m as a dev-dependency for a doc-test. Is this acceptable or should I rather use a dummy implementation?

cc @eldruin, @therealprof

@rust-highfive
Copy link

Thanks for the pull request, and welcome! The Rust team is excited to review your changes, and you should hear from @therealprof (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

Please see the contribution instructions for more information.

@rust-highfive rust-highfive added S-waiting-on-review Review is incomplete T-hal labels May 1, 2019
Copy link
Contributor

@ryankurte ryankurte left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good to me! i guess we could provide an implementation for std::sync::Mutex in a different crate too?

@Rahix
Copy link
Author

Rahix commented May 9, 2019

Yes! I think that would belong into linux-embedded-hal ... In fact, I already have an example implementation in the docs 😉

@ryankurte
Copy link
Contributor

brilliant, i saw the docs example but hoped i wouldn't need to use it ^_^

still lgtm, @eldruin and @therealprof do either of you have opinions here? if not i'll kick of the merge tomorrow.

@therealprof
Copy link
Contributor

@ryankurte I haven't been able to check it out yet. I was also under the impression this is not meant to go in yet due to the [WIP] in the title...

@therealprof
Copy link
Contributor

@Rahix You mentioned one of the main usecases would be to allow drivers access to a suitable Mutex implementation for a particular environment. This would probably require either aliasing a suitable Mutex implementation in the application or passing one to the crate requiring the functionality, right? It would be great to have one or two examples demonstrating how a driver crate could leverage this functionality and how the user would use a driver in a bare-metal or std application.

@eldruin
Copy link
Member

eldruin commented May 10, 2019

@ryankurte I am looking forward to using this on the pcf857x driver. I will do so soon and post here how it goes. I have been too absorbed into finishing the kxcj9 driver and the ad983x driver I just started.

Interestingly, I recently read this paper which talks about a similar approach about using closures in a cell-like type and was thinking if that would ease some of the code in the pcf857x driver but did not think about applying this to a mutex.

@Rahix
Copy link
Author

Rahix commented May 10, 2019

I was also under the impression this is not meant to go in yet due to the [WIP] in the title...

Ah, well I added WIP because I first wanted some discussion before finalizing the design. Mostly whether it makes sense to have both RwMutex and RoMutex (see Open Questions in the OP). I don't see much use of RoMutex myself, to be honest, I just added it because the
bare-metal Mutex is also read-only ...

Another question is compatibility with frameworks like RTFM. Unfortunately I don't have much experience with RTFM, so I am unsure if a straightforward implementation is possible ...

This would probably require either aliasing a suitable Mutex implementation in the application or passing one to the crate requiring the functionality, right?

Yes, most likely similar to this:

let driver = driver_crate::Driver::<CortexMMutex<_>>::new();

The same was already happening in shared-bus, under the hood:

https://github.com/Rahix/shared-bus/blob/e24defd5c802db247a2e87fd874410eb34ee5548/src/lib.rs#L53-L63

What I expect most people will do is aliasing a mutex in the crate root and then refer to it later, so they can easily swap out implementations:

use cortex_m::interrupt::Mutex;
// or
use some_other_device::OtherDeviceMutex as Mutex;

// later ...
let driver = driver_crate::Driver::<crate::Mutex<_>>::new();

It would be great to have one or two examples demonstrating how a driver crate could leverage this functionality and how the user would use a driver in a bare-metal or std application.

Sure! I'll cook something up and post it here ...

@Rahix
Copy link
Author

Rahix commented Jun 19, 2019

I am unfortunately quite busy with other things right now, so I might still take some time until I can continue working on this. In the meantime, could anyone who is familiar with RTFM give some input whether this API choice is compatible with an RTFM implementation of a mutex type that properly handles priorities?

@korken89
Copy link

korken89 commented Aug 27, 2019

Hi, this has slipped under my radar - but we have been discussing the same thing within RTFM and maybe try to get it in somewhere.
Either in cortex-m or similar crates or here.

I can give feedback from the RTFM PoV, and I'll link in @perlindgren here as well. We have been discussing this.

@Rahix
Copy link
Author

Rahix commented Aug 28, 2019

@korken89, that would be great! I have very limited knowledge of the inner workings of RTFM but from what I have seen so far I am quite worried about whether this is possible at all. Can you give more insight? Did you RTFM guys come up with an alternative design which is compatible if this one fails to be?

@korken89
Copy link

Hi, we are meeting to discuss this tomorrow and I'll give an update here after that! :)

@korken89
Copy link

korken89 commented Sep 3, 2019

Hi @Rahix, thanks for starting this discussion! We had quite a discussion on this and I will recite it here, but after this I recommend that we talk about this at today's Embedded WG meeting.

API

On the API side, we discussed for quite a while and came to the conclusion that the following things should be changed:

  1. The lock API should take a &mut self, instead of &self - this allows it to be compatible with the current Mutex<RefCell< ... dance and RTFM's internal Mutex trait.

  2. The API should not return a Result, and there is a good reason for this - but first lets discuss the use-case.

The Mutex trait is meant to be available so that HAL (and other crates/apps) can have a notion about a resource and unique access to it. For example, this means if I make an UART driver which has shared access I can protect this driver via a .lock(|| ... ) method and we can reason about safety and soundness.

And here comes the problem with a Result based lock - what does it mean to have a lock which can fail? A lock which can fail would be difficult to used to create HALs, as the reason for the lock failing is outside the scope of the HAL - only an infallible lock makes sense here. Another view is what if a lock fails - what does error handling mean at that low level rather than just allowing it to boil up through the API? Or if you have half setup a peripheral and a lock fails, how do you abort?

The next thing we though about is, why should a lock be fallible? We found one implementation of a spin-lock mutex which used a Result return to indicate a dead-lock, which is fine, but what other than a panic is correct if one would get a dead-lock in a driver?

All these questions point to an infallible Mutex trait. The example of a dead-lock in a driver is to simply implement this trait and .unwrap() in the implementation.

It boils down to this philosophy: panic is for program error - Result is for user error

Placement of the trait (which crate)

On the point that this PR is to the embedded-hal crate it is our strong recommendation to not place it here.
If there are breaking changes in the Embedded HAL it can be so that competing implementations of the Mutex trait starts to exist which could bifurcate the ecosystem, plus that the Mutex trait will be tied to HAL releases.

The issue here is that this Mutex trait will most likely become 1.0 very fast as we have a good idea of how to do this part - hence we should make a new crate in the Embedded WG eco-system, lets call it mutex-trait, which all base crates of each architecture will reexport.

Recommended actions

  1. Create a mutex-trait crate under the Embedded WG
  2. Include the traits as described, including the following changes:
    • Change to &mut self
    • Removed the Result return type
  3. Include and reexport it in the foundational crates for each architecture

Thanks again to starting this discussion!

cc @perlindgren @japaric

@Rahix
Copy link
Author

Rahix commented Sep 3, 2019

Hi @korken89,

thanks a lot for the feedback! I'll join the WG meeting so we can talk more though I'd like to make a few points in advance as well:

API

  1. The lock API should take a &mut self, instead of &self - this allows it to be compatible with the current Mutex<RefCell< ... >> dance and RTFM's internal Mutex trait.

As in

fn lock_mut<R>(&self, f: impl FnOnce(&mut T) -> R) -> Result<R, Self::Error>;
// becomes
fn lock_mut<R>(&mut self, f: impl FnOnce(&mut T) -> R) -> Result<R, Self::Error>;

? I see how this would be necessary for implementing the rtfm::Mutex trait, but from my point of view, it also kind of defeats the point of the mutex overall.

Making the mutex take &mut self means that you cannot share it because you'd need multiple mutable references to do so. This is why all existing mutex implementations take &self. The current Mutex<RefCell<...>> approach (ref Example Implementation) and the std::sync::Mutex are two examples.

I don't know much about RTFM but I took some time to read through the docs. From what I gathered, the rtfm::Mutex trait with &mut self works because resources are never 'shared' in a sense. There is only ever one task's context holding a reference to a resource and thus it can be mutable.

On the other hand, my initial use-case for the Mutex trait - shared-bus - requires sharing of the mutex. shared-bus allows multiple drivers to 'own' a proxy to access the bus and for this to work, they all need references to the mutex. This means the references must be immutable.

I feel like we might be talking about two very different concepts of a mutex. And I fear these two aren't easily compatible. Your proposed &mut self-mutex is unusable for resource sharing outside RTFM and I have a feeling that RTFM can't properly deal with shared mutexes like the one I'd need for shared-bus.

Maybe we need to step back here and outline what this proposed Mutex trait is supposed to be used for. From my side, this includes shared-bus and @eldruin's pcf857x driver. Both need multiple references to a mutex and thus immutable locking. What uses did you have in mind for the Mutex trait, especially in scope of RTFM?


  1. The API should not return a Result, and there is a good reason for this - but first lets discuss the use-case.

Ok, makes sense. I'd like to mention that this will likely make panic-never no longer work for most projects, however. For me personally, this is not a big issue but I think there might be people who disagree.


I also have a point which you have not touched yet:

Usefulness of RoMutex

Quoting the OP:

I am unsure if RoMutex is actually useful. I added it for easy compatibility with the bare-metal mutex but I don't know of any real uses.

I have though about this a bit over the past months and I am leaning to the side of not having two different mutex types. I haven't come up with a single usecase for RoMutex and having the split design hurts the API quite a bit. So I'd favor just having a single Mutex trait which behaves like the current RwMutex.


Placement of the trait (which crate)

On the point that this PR is to the embedded-hal crate it is our strong recommendation to not place it here.

Yes, I prefer a separate crate as well! I think this is a thing the embedded WG should look at more generally; For some of the other traits in embedded-hal it might make sense to have them split off, too.

@Rahix
Copy link
Author

Rahix commented Sep 3, 2019

@ilya-epifanov
Copy link
Collaborator

@Rahix I think the original proposal with non-mut fn lock(&self, ...) is much closer to what a real Mutex should be.

@korken89
Copy link

korken89 commented Sep 4, 2019

@ilya-epifanov This does however open up for unsound mutex implementations.
I will talk about this in my RFC, stay tuned. :)

@thejpster
Copy link
Contributor

With respect to the infallible Mutex, what is it supposed to do if someone already holds the Mutex? Panic? That seems undesirable, given that there may be a better approach to resolve the problem. If I call a function on my, say, SD Card block driver, but it can't lock the SPI bus because there's an outstanding transaction in progress on another chip select, I expect the block driver function call to return with a EBUSY error code, telling me I just need to try again later. If the Mutex panics, how does my SD card driver know it's safe to attempt to lock the SPI bus to initiate its next SPI transaction?

@korken89
Copy link

korken89 commented Sep 4, 2019

@thejpster I think you have misunderstood the usage, let me clarify.

If we take the example implementation linked above what you describe can never happen.
If the SPI mutex is locked the program is within a critical section and no other can try to lock it per design. The same holds for the RTFM Mutex.

If you want a Mutex which you can lock somewhere, and later unlock (not bounded by the scope of a closure) - this cannot be implemented with this kind of mutex, and it is not something we want to be able to do either.
A design based on this pattern is bound to fail unfortunately, as you can never check if all locks/unlocks are correct - with the closure bounded mutex not having this issue.
The way around this is to have event/message queues, and a single point where reads/writes happen (usually the peripheral ISR). We are currently doing this within our products with great success.

I will also add this as one example in the RFC to clarify.

@ilya-epifanov
Copy link
Collaborator

@thejpster @korken89 So this Mutex is the kind of a quick lock, which is usually implemented on single-core systems as let old_interrupt_mask = disable_interrupts(); f(); restore_interrupts(old_interrupt_mask), right?

If you need a lock (be it infallible but blocking or non-blocking), you could use this Mutex to implement it on top of your OS/scheduler.

@perlindgren
Copy link

Well the implementation can be whatever, panicing, spin-waiting, yielding, ..., the important thing is that locking should not be falliable.

In the case of RTFM, the behavior is a guaranteed success, and immediate progression without the need of the disable/enable interrupt dance. Instead RTFM filters interrupts using the BASEPRI (on M3 and above), according to the Stack Resource Policy.

While the MutexTrait could potentially provide a try_lock, I do believe that this is the wrong way to approach concurrency in driver design and embedded systems programming in general. In the general case it is very hard to know what action should be done by the driver in case a resource cannot be immediately claimed. Bare in mind that drivers typically operate under implicit real-time constraints stipulated by the underlying hardware (besides potential timing constraints stemming from the application/system level). I don't see how such constraints can be met if allowing for unpredictable behavior (failing locks).

So as a consequence, MutexTrait lock should be infalliable, and the coordination of resources managed at higher level in the driver/application design to ensure progression (and successfully meeting implicit and explicit timing constraints).

@thejpster
Copy link
Contributor

@korken89, gotcha. Thanks!

@jonas-schievink
Copy link
Contributor

The RFC has now been accepted, and the trait is available in https://github.com/rust-embedded/mutex-trait

Should this PR be closed?

@Rahix Rahix closed this Mar 19, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-waiting-on-review Review is incomplete T-hal
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants