Skip to content

a sound {bare_metal,cortex_m,etc}::Mutex #388

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 3 commits into from

Conversation

japaric
Copy link
Member

@japaric japaric commented Oct 23, 2019

bare_metal::Mutex, re-exported in the cortex-m and other crates, is a mutex
based on critical sections that temporarily disable all interrupts. As provided
today this abstraction is unsound in multi-core context because Mutex can be
stored in static variables, which are visible to all cores, but interrupt
masking is not sufficient to synchronize (potentially) parallel access (i.e.
access from different cores) to memory.

This document proposes that we deprecate the existing Mutex abstraction in
favor of a mutex that properly expresses the idea that mutexes based on
interrupt-masking is only "Sync" in single-core context.

Rendered

@japaric japaric requested review from dylanmckay, jcsoo and a team as code owners October 23, 2019 23:10
@jamesmunns
Copy link
Member

Hey @japaric, I need to give this a deeper read, but it makes sense based on previous statements and explorations you've made re: the current assumptions of single core in the current rust-embedded ecosystem.

Would you mind expanding how this proposal does or does not work with #377, particularly as it has recently entered FCP?

If there are any soundness or convenience holes in #377, I'd prefer if we address those now, before merging the RFC.

@japaric
Copy link
Member Author

japaric commented Oct 23, 2019

@jamesmunns This proposal is orthogonal to RFC #377.

RFC #377 specifies a (unifies the) lock API (the signature of the method) but says nothing about whether a Mutex implementor needs to be Sync, SingleCoreSync or none of them. This is intentional and gives more flexibility to driver authors: they could for example (a) build a driver tied to an interrupt handler it that case they'll want a bound T: Mutex + SingleCoreSync to prevent unsynchronized things from being part of the state of the driver; or (b) provide a more general TemperatureSensor<I> where I: Mutex<impl I2C> driver that says nothing about its Sync-ness; an instance of this driver will inherit the Sync-ness of the concrete Mutex implementor it was initialized with, e.g. I could be RefCell<_> (then the driver instance will NOT be Sync), spin::Mutex (Sync driver) or SingleCoreMutex (driver is just SingleCoreSync) .

This proposal is about fixing a particular mutex implementation, the bare_metal::Mutex type (re-exported in cortex_m). This type wrongly implements the Sync trait and that makes it unsound by definition (though the data races can only be observed in multi-core context).

thejpster
thejpster previously approved these changes Oct 24, 2019
Copy link
Contributor

@thejpster thejpster left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me. Thanks for the detailed write-up!

korken89
korken89 previously approved these changes Oct 29, 2019
Copy link
Contributor

@korken89 korken89 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have waited for this day, thanks for starting the push towards single vs. multicore before it comes to bite us.

On the note of fixing or removing, I think this is a difficult question as it would incur extra cost on single core systems while being sound for multicore systems.
I will ponder the implications for this a bit.

Copy link
Contributor

@jonas-schievink jonas-schievink left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely in favor of going down this road, thanks for writing this up!

It would be great to figure out what it would take to stabilize auto trait. The tracking issue rust-lang/rust#13231 is pretty inactive.

andre-richter
andre-richter previously approved these changes Oct 30, 2019
Copy link
Member

@andre-richter andre-richter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for writing this up. Good stuff.

Since you asked for name-bikeshedding:
We could discuss if we want to aim for a more generic term than SingleCore. For example, there are SoCs with multiple SingleCore CPUs on them but still working with shared memory.

In such a case, SingleCore* has a tendency to become ambiguous.
SingleProcessingSync might be the academically correct term (as in being the opposite of MultiProcessingSync).

@therealprof
Copy link
Contributor

We could discuss if we want to aim for a more generic term than SingleCore. For example, there are SoCs with multiple SingleCore CPUs on them but still working with shared memory.

Sorry I don't follow. If you have a SoC with multiple CPUs containing a single core the SoC is by definition multi core. Where exactly is the ambiguity here?

@therealprof
Copy link
Contributor

I'm not too hot on the RacyMutex workaround but I do love the SHALL_WE_DANCE theme. 👍

therealprof
therealprof previously approved these changes Nov 1, 2019
Co-Authored-By: Jonas Schievink <[email protected]>
Co-Authored-By: Jonas Schievink <[email protected]>
@japaric
Copy link
Member Author

japaric commented Nov 4, 2019

@therealprof perhaps @andre-richter is thinking of something along the lines of: a Cortex-M4 microprocessor has one core but one SoC / IC can contain two M4 microprocessors both connected to a memory block, external to the die of each microprocessor but within the die of the SoC / IC. This is not exactly the same an Intel microprocessor with 8 cores all connected to the same internal memory (e.g. L3 cache) -- in this case the cores and the memory are on the same microprocessor die.

I don't think the concept of die / SoC is too relevant. In principle, one could have one M4 SoC / IC and one M3 SoC / IC both connected to external SRAM (another IC) and write an application as a single Rust crate for the whole system. Each SoC is single-core but the overall system is multi-core.


SingleProcessingSync

I think this name may be misleading. For example, one may have to deal with both traits, SingleProcessingSync and "MultiProcessingSync" (i.e. core::marker::Sync), in the same multi-core application. For instance, in a multi-core RTFM application the core::marker::Send bound is required to send a message from a task running on a core to another task running on a different core; but to send a message between tasks running on the same core the payload only needs to satisfy the SingleCoreSend / SingleProcessingSend trait.

It is also possible to write a program that uses a single core for a multi-processing OS like Linux (using thread affinity). That program only needs to satisfy the SingleProcessing* bounds but it, or many instances of it, can run on a multiprocessing system.


I personally think that CrossCoreSend is the most descriptive name for core::Send (ditto for Sync). We can't rename the Send - Sync traits because they are defined in core but we could re-export the traits in a crate under the CrossCore* names and encourage people to use those renamed traits in embedded crates. That still leaves the question of how to name the other two traits. Perhaps WithinACoreSync? "only synchronous within a core". So we could end with something like this:

// crate-name: send_sync
/// trait definitions

pub use core::marker::Send as CrossCoreSend;
pub use core::marker::Sync as CrossCoreSync;

pub unsafe auto trait WithinACoreSend {}
pub unsafe auto trait WithinACoreSync {}
// crate-name: heapless
/// trait user

mod pool {
    use send_sync::CrossCoreSend;

    // owned pointer managed by a pool allocator
    pub struct Box<T> { /* .. */ }

    unsafe impl<T> CrossCoreSend for Box<T> {} // instead of Send
}

@therealprof
Copy link
Contributor

I don't think the concept of die / SoC is too relevant. In principle, one could have one M4 SoC / IC and one M3 SoC / IC both connected to external SRAM (another IC) and write an application as a single Rust crate for the whole system. Each SoC is single-core but the overall system is multi-core.

Indeed, I fully agree. For a long time people have been calling multiple cores in a single package multi-core indendent of their connection/configuration. In fact the first dual-core Intel chip (Presler) consisted of two dies placed in the same package. But it really does not matter.

@andre-richter
Copy link
Member

one SoC / IC can contain two M4 microprocessors

Thats what I meant. But I am okay only counting cores (sounds like a band name: "The Counting Cores" 🤣).

SameCoreSync would read little easier, but might semantically not be as correct as "within a". Just thinking out loud.

I really like CrossCore*.

@eldruin
Copy link
Member

eldruin commented Nov 5, 2019

What about a future situation where multithreading within a single core cortex-m microprocessor becomes possible?
Maybe "core" becomes ambiguous again, depending on your definition.

@korken89
Copy link
Contributor

korken89 commented Nov 5, 2019

While I think many like that a replacement for the re-exported bare_metal mutex comes with this, do we want this or do we want to go towards the Mutex trait together with starting the profilation of mutex crates and have the racy mutex in its own crate? And keep only the traits?

@jamesmunns
Copy link
Member

We decided to decline this RFC in favor of #419, which has similar goals.

Thank you @japaric for your work on this!

@jamesmunns jamesmunns closed this Feb 12, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants