(Zero cost) stack overflow protection #34
Comments
You can achieve a similar result if you set a fixed size for a stack. You just create a section with that size at the beginning of RAM, and set the stack pointer to the end of that section. Downside is that you waste some memory at the end.
@pftbest Yeah, I thought about that option but decided not to include it here because it seems to be very far from ideal. Reasons: (a) it requires user input (there's no sensible default, I think), (b) it's error-prone (e.g. if the selected stack space is too large you'll get an error about there not being enough space to fit the static variables) and (c) it's not efficient (if you pick a stack size that's too small then you leave space unused, which is what you mentioned).
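To make points (b) and (c) concrete, here is a minimal sketch of the trade-off. It assumes a single RAM region with the fixed-size stack placed first and the static variables right after it; the type and function names are made up for illustration and are not part of any crate.

``` rust
/// Outcome of reserving a fixed-size stack at the start of RAM,
/// with the static variables placed right after it.
#[derive(Debug, PartialEq)]
enum FixedStackLayout {
    /// Everything fits; `unused` bytes at the end of RAM are wasted.
    Fits { unused: u64 },
    /// The statics no longer fit after the stack; linking would fail.
    StaticsDoNotFit { missing: u64 },
}

/// Hypothetical helper: checks a user-chosen stack size against the RAM size.
fn fixed_stack_layout(ram_len: u32, stack_len: u32, statics_len: u32) -> FixedStackLayout {
    // Work in u64 so the sum cannot overflow.
    let needed = u64::from(stack_len) + u64::from(statics_len);
    let available = u64::from(ram_len);
    if needed <= available {
        FixedStackLayout::Fits { unused: available - needed }
    } else {
        FixedStackLayout::StaticsDoNotFit { missing: needed - available }
    }
}
```

With 0x5000 bytes of RAM and 8 bytes of statics, a 0x1000 byte stack leaves 0x3ff8 bytes unused, whereas a 0x5000 byte stack leaves the statics 8 bytes short.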
Stack probes currently exist in Rust only to avoid "jumping over" the guard page of the thread stack and into the heap. The old mechanism, which relied on segmented stacks and TLS slots, is dead and it would be really annoying to resurrect for this issue.
@japaric Anyway, I know how to do this. Place the static sections twice. The first time you place them, they go into |
so that the stack can never collide into them closes #34
This is one possible solution to the stack overflow problem described in #34. This approach uses a linker wrapper, called [swap-ld], to generate the desired memory layout. See #34 for a description of the desired memory layout and #41 for a description of how `swap-ld` works.

The observable effects of this change in cortex-m programs are:

- the `_sbss` symbol is now override-able.
- there is now a `.stack` linker section that denotes the span of the call stack. `.stack` won't be loaded into the program; it just exists for informative purposes (`swap-ld` uses this information).

Given the following program:

``` rust
use core::ptr;

fn main() {
    static mut X: u32 = 0;
    static mut Y: u32 = 1;

    loop {
        unsafe {
            ptr::write_volatile(&mut X, X + 1);
            ptr::write_volatile(&mut Y, Y + 1);
        }
    }
}
```

If you link this program using the `arm-none-eabi-ld` linker, which is the cortex-m-quickstart default, you'll get the following memory layout:

``` console
$ console
section          size         addr
.vector_table   0x130    0x8000000
.text            0x94    0x8000130
.rodata           0x0    0x80001c4
.stack         0x5000   0x20000000
.bss              0x4   0x20000000
.data             0x4   0x20000004
```

Note how the space reserved for the stack (depicted by the `.stack` linker section) overlaps with the space where .bss and .data reside.

If you, instead, link this program using `swap-ld` you'll get the following memory layout:

``` console
$ arm-none-eabi-size -Ax app
section          size         addr
.vector_table   0x130    0x8000000
.text            0x94    0x8000130
.rodata           0x0    0x80001c4
.stack         0x4ff8   0x20000000
.bss              0x4   0x20004ff8
.data             0x4   0x20004ffc
```

Note that no overlap exists in this case and that the call stack size has been reduced to accommodate the .bss and .data sections.

Unlike #41 the addresses of static variables are now correct:

``` console
$ arm-none-eabi-objdump -CD app
Disassembly of section .vector_table:

08000000 <_svector_table>:
 8000000:       20004ff8        strdcs  r4, [r0], -r8 ; initial Stack Pointer

08000004 <cortex_m_rt::RESET_VECTOR>:
 8000004:       08000131        stmdaeq r0, {r0, r4, r5, r8}

08000008 <EXCEPTIONS>:
 8000008:       080001bd        stmdaeq r0, {r0, r2, r3, r4, r5, r7, r8}
(..)

Disassembly of section .stack:

20000000 <.stack>:
        ...

Disassembly of section .bss:

20004ff8 <cortex_m_quickstart::main::X>:
20004ff8:       00000000        andeq   r0, r0, r0

Disassembly of section .data:

20004ffc <_sdata>:
20004ffc:       00000001        andeq   r0, r0, r1
```

closes #34

[swap-ld]: https://github.com/japaric/swap-ld
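The addresses in the second listing follow from simple arithmetic: the statics are pushed to the end of RAM and the stack gets whatever is left. The sketch below only reproduces that arithmetic; it is not how `swap-ld` itself is implemented, it ignores alignment padding, and the names are hypothetical.

``` rust
/// Addresses and sizes of the RAM sections after the swap.
#[derive(Debug, PartialEq)]
struct SwappedLayout {
    stack_start: u32,
    stack_len: u32,
    bss_start: u32,
    data_start: u32,
}

/// Places .bss and .data at the very end of RAM and gives the rest to the
/// stack. Returns `None` if the statics do not fit or an address overflows.
fn swapped_layout(ram_start: u32, ram_len: u32, bss_len: u32, data_len: u32) -> Option<SwappedLayout> {
    let statics_len = bss_len.checked_add(data_len)?;
    // Whatever the statics do not need becomes stack space.
    let stack_len = ram_len.checked_sub(statics_len)?;
    // .bss starts where the stack region ends; this is also the initial SP.
    let bss_start = ram_start.checked_add(stack_len)?;
    let data_start = bss_start.checked_add(bss_len)?;
    Some(SwappedLayout { stack_start: ram_start, stack_len, bss_start, data_start })
}
```

For the program above (RAM at 0x20000000 with length 0x5000, 4 bytes of .bss and 4 bytes of .data) this yields a 0x4ff8 byte stack, .bss at 0x20004ff8 and .data at 0x20004ffc, matching the `arm-none-eabi-size` output and the initial stack pointer in the vector table.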
Hi, It looks like the fix in #43 was reverted as a side-effect of #64 as far as I understand. It looks like no support from
Then initializing the heap with:

``` rust
extern "C" {
    static mut __sheap: u32;
    static mut __eheap: u32;
}
let sheap = unsafe { &mut __sheap } as *mut u32 as usize;
let eheap = unsafe { &mut __eheap } as *mut u32 as usize;
assert!(sheap < eheap);
// Unsafe: Called only once before any allocation.
unsafe { ALLOCATOR.init(sheap, eheap - sheap) }
```

However, I wonder if

Thanks!
The concept from #43 ended up in https://github.com/knurling-rs/flip-link which is probably the better way to do it (for now at least!). The problem with doing it in the linker script is you end up having to know in advance a specific size for the stack, so either you overestimate and things don't fit, or you end up with lots of unused RAM that could have been stack. With a tool like flip-link, the stack is automatically made as big as it can be. Ideally, the linker itself could figure this out for us, but I don't believe that's currently possible.
Hi @adamgreig, I'm aware of flip-link but I've hit knurling-rs/flip-link#43 for one of the boards I want to support (nRF52840-dongle using DFU where I need to reserve space at the beginning of the flash for the bootloader) so I stopped using it.
This is not a problem if you use a heap, because both the stack and the heap are variable-sized. As long as you use only one (you always use a stack, so as long as you don't use a heap), you don't need a linker script. But as soon as you use a heap, you need to decide how to assign the non-data RAM between the stack and the heap, essentially choosing a size for both. So the main issue I see with doing it in the linker script is that it's not convenient. The best scenario would be something like:
Today, the layout of RAM looks like this (assuming no heap and a single RAM region):
With this layout, when a stack overflow occurs `static` variables end up being overwritten / corrupted silently.
This scenario can be avoided by simply changing the memory layout to look like this:
In this new scenario a stack overflow will hit the lower RAM boundary. Trying to write beyond the
RAM boundaries raises a HardFault exception. Thus, in theory, the HardFault exception handler
could be used as a stack overflow handler.
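The difference between the two layouts can be illustrated with a small host-side model. This is only a sketch under simplifying assumptions (a single RAM region, a full-descending stack, statics described by one half-open address range); the names are hypothetical and nothing here touches real hardware.

``` rust
/// What has happened once the stack has grown `depth` bytes below its start.
#[derive(Debug, PartialEq)]
enum Outcome {
    /// The stack is still within its own part of RAM.
    InBounds,
    /// The stack has grown into the region holding `static` variables.
    CorruptsStatics,
    /// The stack pointer left RAM without touching the statics;
    /// on real hardware the offending write would raise a fault.
    Fault,
}

/// `statics` is the half-open address range [start, end) of .bss + .data.
fn grow_stack(ram_start: u32, initial_sp: u32, statics: (u32, u32), depth: u32) -> Outcome {
    // Use i64 so a stack pointer below address 0 is still representable.
    let sp = i64::from(initial_sp) - i64::from(depth);
    // The stack occupies [sp, initial_sp); check whether that overlaps the statics.
    let overlaps = sp < i64::from(statics.1) && statics.0 < initial_sp;
    if overlaps {
        Outcome::CorruptsStatics
    } else if sp < i64::from(ram_start) {
        Outcome::Fault
    } else {
        Outcome::InBounds
    }
}
```

With statics at the start of RAM and the stack starting at the end of RAM, a deep enough stack reports `CorruptsStatics`. With the statics moved to the end of RAM and the stack starting just below them, the same depth reports `Fault` instead, which is the case a HardFault handler could catch.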
In systems where heap memory exists (by default it starts where the `static` region ends and grows upwards), a similar reordering of the regions can be applied.
Implementation
To my knowledge this can't be implemented using only linker scripts. With a linker script you can
instruct the linker where to start a memory region but you can't specify the end address of the
region.
The related bits of the linker script we use are shown below:
A C implementation of this region reordering uses a two-step linking process. See
this StackOverflow answer for details.
Other options for stack overflow protection
Assuming that we can't change the memory layout, we could:

- Use the MPU (Memory Protection Unit) to mark the upper boundary of the `static` region as read-only. In this scenario, when a stack overflow occurs a MemManage exception is raised. This has an initialization cost but no runtime cost. The downside is that not all microcontrollers have an MPU.
- Implement stack probes for the Cortex-M targets. I believe enabling stack probes carries a runtime cost (per function call?) but I'm not sure.
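As a rough sketch of the MPU option, the function below computes the register values for a small read-only guard region, assuming the ARMv7-M MPU register layout (`MPU_RBAR` / `MPU_RASR`). The function name and the choice of a 32-byte guard are assumptions for illustration; the memory attribute bits (TEX/C/B/S) are left at zero for simplicity, and actually writing the registers, enabling the MPU and enabling the MemManage exception are not shown.

``` rust
/// Bit fields of the ARMv7-M MPU_RBAR and MPU_RASR registers.
const RBAR_VALID: u32 = 1 << 4; // use the REGION field of RBAR
const RASR_ENABLE: u32 = 1; // region enable
const RASR_AP_READ_ONLY: u32 = 0b110 << 24; // read-only for all privilege levels
const RASR_XN: u32 = 1 << 28; // execute never

/// Hypothetical helper: returns the `(RBAR, RASR)` values for a read-only
/// guard region, or `None` if the arguments violate the MPU's constraints.
fn guard_region(region: u8, base: u32, size: u32) -> Option<(u32, u32)> {
    // Regions must be a power of two, at least 32 bytes, and aligned to their size.
    if region > 15 || size < 32 || !size.is_power_of_two() || base % size != 0 {
        return None;
    }
    // The SIZE field encodes a region of 2^(SIZE + 1) bytes.
    let size_field = size.trailing_zeros() - 1;
    let rbar = base | RBAR_VALID | u32::from(region);
    let rasr = RASR_XN | RASR_AP_READ_ONLY | (size_field << 1) | RASR_ENABLE;
    Some((rbar, rasr))
}
```

For example, `guard_region(0, 0x2000_0020, 32)` describes a 32-byte guard placed just above 8 bytes of statics at the start of RAM; a stack that descends into it would trigger a MemManage exception on its first write there, before reaching the statics.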