-
Notifications
You must be signed in to change notification settings - Fork 198
Improve gimlet qspi error reporting #2071
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
drv/gimlet-hf-server/src/main.rs
Outdated
self.poll_for_write_complete(None); | ||
self.qspi | ||
.page_program(addr, data) | ||
.map_err(|x| qspi_to_hf(x))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: you can write this without the lambda, e.g. map_err(qspi_to_hf)
(same note applies in a bunch of other places in the PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
another thought (take it or leave it) is that there are a number of places where we call the same method on self.qspi
(in particular, read_status
) and then have to add the map_err
; if we wanted to reduce the repetition of that, we could consider adding wrappers like:
impl ServerImpl {
fn read_status(&self) -> Result<u8, HfError> {
self.qspi.read_status().map_err(qspi_to_hf)
}
// ... perhaps other wrappers as well?
}
to avoid having to write .map_err(qspi_to_hf)
quite as much?
i'm not sure if this is actually worth the effort or not, but i figured it was worth suggesting.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I took the suggestion for read_status
. Maybe it should apply elsehwere?
drv/gimlet-hf-server/src/main.rs
Outdated
// If we can't read the id there's a good chance nothing else is going to | ||
// work. `panic` would probably just be a crash loop. looping forever | ||
// is slightly more polite. | ||
loop { | ||
// We are dead now. | ||
hl::sleep_for(1000); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
elsewhere, we have tasks that die permanently by calling sys_recv_notification
with an empty notification mask, like this:
hubris/drv/gimlet-seq-server/src/main.rs
Lines 174 to 184 in 7a24b88
// All these moments will be lost in time, like tears in rain... | |
// Time to die. | |
loop { | |
// Sleeping with all bits in the notification mask clear means | |
// we should never be notified --- and if one never wakes up, | |
// the difference between sleeping and dying seems kind of | |
// irrelevant. But, `rustc` doesn't realize that this should | |
// never return, we'll stick it in a `loop` anyway so the main | |
// function can return `!` | |
sys_recv_notification(0); | |
} |
i think this is a bit nicer as we don't get periodically woken up by the timer, remember that we are just looping on it, and go back to sleep. not that this case is particularly worth optimizing for, though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also, is there, perhaps, something we ought to do to indicate that we are in a permanently-bused state?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I do think we would benefit from a LIKELY_VERY_VERY_DEAD
state
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIRC Humility shows tasks in sys_recv_notification(0)
as (DEAD)
, so there's precedent for treating it as a canonically dead state.
@@ -61,6 +61,14 @@ struct Config { | |||
pub clock: u8, | |||
} | |||
|
|||
// There isn't a great crate to do `From` implementation so do this manually |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, i suppose the alternative is giving drv-hf-api
a dependency on drv_stm32h7_qspi
? which, yeah, seems sad...
drv/gimlet-hf-server/src/main.rs
Outdated
self.poll_for_write_complete(None); | ||
self.qspi | ||
.page_program(addr, data) | ||
.map_err(|x| qspi_to_hf(x))?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
another thought (take it or leave it) is that there are a number of places where we call the same method on self.qspi
(in particular, read_status
) and then have to add the map_err
; if we wanted to reduce the repetition of that, we could consider adding wrappers like:
impl ServerImpl {
fn read_status(&self) -> Result<u8, HfError> {
self.qspi.read_status().map_err(qspi_to_hf)
}
// ... perhaps other wrappers as well?
}
to avoid having to write .map_err(qspi_to_hf)
quite as much?
i'm not sure if this is actually worth the effort or not, but i figured it was worth suggesting.
drv/stm32h7-qspi/src/lib.rs
Outdated
@@ -258,6 +282,18 @@ impl Qspi { | |||
// perform transfers. | |||
let mut out = out; | |||
while !out.is_empty() { | |||
if self.reg.sr.read().tof().bit_is_set() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about reading self.reg.sr
just once, then checking tof
, tef
, and flevel
(below) on that value?
@@ -61,6 +61,14 @@ struct Config { | |||
pub clock: u8, | |||
} | |||
|
|||
// There isn't a great crate to do `From` implementation so do this manually |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have a slight bias towards making drv-hf-api
depend on drv-stm32h7-qspi
, so that it can define the From
implementation, but I don't feel strongly about it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suppose the downside of that is that it probably makes drv-hf-api
not compile for target MCUs other than STM32H7. But, we don't currently have any reason to implement this API on other MCUs, so 🤷♀️
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We also use drv-hf-api
on cosmo which is still an STM32H7 but does not use drv-stm32h7-qspi
for host flash so it felt a little weird to add that there. I don't love any option here tbh.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The Industrial Strength version would be to add a stm32
feature to drv-hf-api
which then brings in the QSPI crate and adds the From
implementation, but that's Even More Shenanigans 🤷🏻
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
alternatively QspiError
could go in some kinda qspi-types
crate or something that both the drv-stm32h7-qspi
and drv-hf-api
crates depend on. but...that's a Lot...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
drv/stm32h7-qspi/src/lib.rs
Outdated
if sr.tof().bit_is_set() { | ||
self.reg.fcr.modify(|_, w| w.ctof().set_bit()); | ||
self.disable_all_interrupts(); | ||
return Err(QspiError::Timeout); | ||
} | ||
if sr.tef().bit_is_set() { | ||
self.reg.fcr.modify(|_, w| w.ctef().set_bit()); | ||
self.disable_all_interrupts(); | ||
return Err(QspiError::TransferError); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It occurs to me that, rather than having all the error paths have to remember to explicitly disable IRQs, we could maybe wrap this in an inner function that we call and then match the Result
from, so that there's one place where we clear the IRQs regardless of how we exited from the function. Not sure if this is worth the effort unless there's a possibility of adding additional error paths here in the future.
drv/stm32h7-qspi/src/lib.rs
Outdated
self.disable_all_interrupts(); | ||
return Err(QspiError::Timeout); | ||
} | ||
if sr.tef().bit_is_set() { | ||
self.reg.fcr.modify(|_, w| w.ctef().set_bit()); | ||
self.disable_all_interrupts(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similarly, is it worth refactoring this so that the error path disables interrupts in one place instead of multiple?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll give this a shot (and I deeply appreciate you having an eye for things like this!)
We currently ignore any errors when working with the QSPI driver. The biggest place this is noticable is if we happen to give a bad address for reading we will loop forever. Fix this up to actually handle errors.
2eecf2c
to
91e2f6e
Compare
We currently ignore any errors when working with the QSPI driver. The biggest place this is noticable is if we happen to give a bad address for reading we will loop forever. Fix this up to actually handle errors.