Skip to content

Handle may be dropped by firmware unexpectedly, leading to unsoundness #359

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
timrobertsdev opened this issue Feb 14, 2022 · 5 comments
Closed

Comments

@timrobertsdev
Copy link
Contributor

Affected: Firmware targeting EFI < 1.10 and UEFI drivers that do not use BootServices.open_protocol and BootServices.close_protocol.

Prior to EFI 1.10, the EFI_BOOT_SERVICES.UninstallProtocolInterface function lacks the safety checks that were added in 1.10. It simply removes a protocol from a handle, and if the last protocol is removed from a handle, that handle is freed by firmware and no longer valid. It is entirely up to the caller to ensure that there are no remaining references to that protocol and potentially that handle. Post-1.10, the firmware is able to see any consumers that have used open_protocol or close_protocol and attempt to disconnect those consumers before removing the protocol, failing if it is unable to do so.

As far as I can tell, the firmware does not have a direct mechanism to let you know that a handle has been removed and is invalid. I've got a couple ideas about how to go about this:

  1. Ensure the handle is valid at every callsite, potentially using locate_handle, though that requires a protocol to search for. There isn't another UEFI function that would be suitable, as the rest of the protocol-related functions need to be passed a valid handle.
  2. Remove Copy and Clone from Handle, add an unsafe_clone, and note the unsafety, trusting the caller.

This feels like a can of worms; some input before I start working on an implementation would be appreciated. :)

@nicholasbishop
Copy link
Member

Thanks for looking into this.

First question I have is: how common is firmware that is still on EFI <= 1.10? As far as I understand that's the last version of plain old EFI from before UEFI, and deprecated since 2005. So in theory nobody should have produced such firmware in 17 years. I think we could reasonably decide not to support >15 year old hardware, but of course companies sometimes do crazy things, so maybe there's more recent devices that would be impacted by this bug. Do you have any specific hardware/firmware in mind here?

@timrobertsdev
Copy link
Contributor Author

timrobertsdev commented Feb 14, 2022

I haven't found anything in the wild that still uses EFI < 1.10, but theoretically this would also affect any UEFI drivers or applications written without the use of OpenProtocol and CloseProtocol (possibly using HandleProtocol instead), which is unlikely if the authors are following best practices (and/or our deprecation warning), but still possible.

EDIT: If we decide to not support EFI <= 1.10, I suppose the deprecation warning on BootServices::handle_protocol would be sufficient, possibly with extra documentation detailing the safety issues.

@nicholasbishop
Copy link
Member

Some followup thoughts and questions:

Taking another look at EFI_BOOT_SERVICES.UninstallProtocolInterface in the spec, I noticed this piece:

The caller is responsible for ensuring that there are no references to a protocol interface that has been removed. In some cases, outstanding reference information is not available in the protocol, so the protocol, once added, cannot be removed. Examples include Console I/O, Block I/O, Disk I/O, and (in general) handles to device protocols.

It then goes on to describe how this has been fixed by adding OpenProtocol/CloseProtocol.

To me that section makes it sound like "solved" it in pre-1.10 by just not allowing public handles to be dropped, in which case we don't need to worry about it in older firmware. Instead the issue seems to be worse with modern firmware, since now OpenProtocol/CloseProtocol do exist, so the firmware may think it's safe to drop a handle if the internal database doesn't show any users. If code is still using HandleProtocol it could cause unsafety. So I think you are right that we should better document that potential for unsafety, and maybe consider dropping those methods entirely or at least mark them unsafe.


I took another look at the other suggested mitigations and I don't think they would work:

Ensure the handle is valid at every callsite, potentially using locate_handle

If I understand correctly, UEFI switches between tasks with interrupts. So even if you tried to validate a handle every time with locate_handle, you might get interrupted between validation and actual use of the handle, and if that interrupt executes code that makes the handle invalid, you're right back where you started.

Remove Copy and Clone from Handle

I think the same issues exist even if you ban safe copy/clone of a handle, since the original handle being copied from could also become invalid, so having a copy or not seems irrelevant to me.


Interested to hear your thoughts on all the above. This stuff is tricky and I'm not at all sure I've correctly understood everything :)

nicholasbishop added a commit to nicholasbishop/uefi-rs that referenced this issue Jul 5, 2022
This method has the same problems as `handle_protocol`; it does not mark
the handle and protocol as in use. Calls to `locate_protocol` can be
replaced by calling `get_handle_for_protocol` and `open_protocol`.

rust-osdev#359
nicholasbishop added a commit to nicholasbishop/uefi-rs that referenced this issue Jul 5, 2022
This method has the same problems as `handle_protocol`; it does not mark
the handle and protocol as in use. Calls to `locate_protocol` can be
replaced by calling `get_handle_for_protocol` and `open_protocol`.

rust-osdev#359
@timrobertsdev
Copy link
Contributor Author

This stuff is indeed tricky! With UEFI switching between tasks with interrupts, I'm not sure how any kind of meaningful validation could be done. I'm very on-board with your recent PR that marks locate_handle as unsafe and deprecates it. With how the spec is written, I don't think we have any other realistic solution.

GabrielMajeri pushed a commit to nicholasbishop/uefi-rs that referenced this issue Jul 15, 2022
This method has the same problems as `handle_protocol`; it does not mark
the handle and protocol as in use. Calls to `locate_protocol` can be
replaced by calling `get_handle_for_protocol` and `open_protocol`.

rust-osdev#359
GabrielMajeri pushed a commit that referenced this issue Jul 15, 2022
* Add BootServices::get_handle_for_protocol

This is a convenience method to get any arbitrary handle that supports a
particular `Protocol`.

* Use open_protocol in shim-lock test

* Use open_protocol in multiprocessor test

* Use open_protocol in device path test

* Use open_protocol in pointer test

* Use open_protocol in graphics test

* Use open_protocol in file system test

* Use open_protocol in the serial device test

* Simplify event callback with context test

Rather than opening the Output protocol, which isn't important for this
test, just write through the context pointer and assert that the
expected data was written.

* Mark handle_protocol as unsafe

This function is already marked deprecated, mark it unsafe as well and
update the documentation to describe why.

* Deprecate BootServices::locate_protocol and mark it unsafe

This method has the same problems as `handle_protocol`; it does not mark
the handle and protocol as in use. Calls to `locate_protocol` can be
replaced by calling `get_handle_for_protocol` and `open_protocol`.

#359
e820 pushed a commit to e820/uefi-rs that referenced this issue Jul 30, 2022
* Add BootServices::get_handle_for_protocol

This is a convenience method to get any arbitrary handle that supports a
particular `Protocol`.

* Use open_protocol in shim-lock test

* Use open_protocol in multiprocessor test

* Use open_protocol in device path test

* Use open_protocol in pointer test

* Use open_protocol in graphics test

* Use open_protocol in file system test

* Use open_protocol in the serial device test

* Simplify event callback with context test

Rather than opening the Output protocol, which isn't important for this
test, just write through the context pointer and assert that the
expected data was written.

* Mark handle_protocol as unsafe

This function is already marked deprecated, mark it unsafe as well and
update the documentation to describe why.

* Deprecate BootServices::locate_protocol and mark it unsafe

This method has the same problems as `handle_protocol`; it does not mark
the handle and protocol as in use. Calls to `locate_protocol` can be
replaced by calling `get_handle_for_protocol` and `open_protocol`.

rust-osdev#359
@nicholasbishop
Copy link
Member

I think we can call this one done now. We've made some good improvements in the protocol interface that should help keep things sound, although to a certain extent we'll always be at the mercy of UEFI implementations.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants