(Zero cost) stack overflow protection #34

japaric · 2017-09-27T22:38:54Z

Today, the layout of RAM looks like this (assuming no heap and a single RAM region):

With this layout, when a stack overflow occurs, static variables end up being silently
overwritten (corrupted).

This scenario can be avoided by simply changing the memory layout to look like this:

In this new scenario a stack overflow will hit the lower RAM boundary. Trying to write beyond the
RAM boundaries raises a HardFault exception. Thus, in theory, the HardFault exception handler
could be used as a stack overflow handler.

In systems that have a heap, which by default starts where the static region ends and grows
upwards, a similar reordering of the regions can be applied.
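
To make the two layouts concrete, here is a small host-side Rust sketch of the address arithmetic. The RAM origin (0x2000_0000), RAM length (0x5000) and the 8 bytes of statics are assumed values chosen for illustration. The names `Layout`, `default_layout`, `flipped_layout` and `overflow_corrupts_statics` are hypothetical helpers, not part of any crate.

```rust
/// Address ranges (start inclusive, end exclusive) of the two RAM regions.
#[derive(Debug, PartialEq)]
struct Layout {
    stack: (u32, u32),
    statics: (u32, u32),
}

/// Current layout: statics at the bottom of RAM, stack above them.
fn default_layout(ram_origin: u32, ram_len: u32, statics_len: u32) -> Layout {
    let ram_end = ram_origin + ram_len;
    Layout {
        statics: (ram_origin, ram_origin + statics_len),
        stack: (ram_origin + statics_len, ram_end),
    }
}

/// Reordered layout: stack at the bottom of RAM, statics at the top.
fn flipped_layout(ram_origin: u32, ram_len: u32, statics_len: u32) -> Layout {
    let ram_end = ram_origin + ram_len;
    let statics_start = ram_end - statics_len;
    Layout {
        stack: (ram_origin, statics_start),
        statics: (statics_start, ram_end),
    }
}

/// The stack grows downwards, so the first overflowing write lands just
/// below the lowest stack address. Returns true if that byte is a static.
fn overflow_corrupts_statics(layout: &Layout) -> bool {
    match layout.stack.0.checked_sub(1) {
        Some(addr) => (layout.statics.0..layout.statics.1).contains(&addr),
        None => false,
    }
}

fn main() {
    // Assumed example values: 20 KiB of RAM and 8 bytes of statics.
    let (origin, len, statics) = (0x2000_0000, 0x5000, 8);
    let before = default_layout(origin, len, statics);
    let after = flipped_layout(origin, len, statics);
    println!("default: {:x?}, corrupts statics: {}", before, overflow_corrupts_statics(&before));
    println!("flipped: {:x?}, corrupts statics: {}", after, overflow_corrupts_statics(&after));
}
```

In the flipped layout the first overflowing write targets an address below the RAM origin, which is the access that is expected to fault.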

Implementation

To my knowledge this can't be implemented using only linker scripts. With a linker script you can
instruct the linker where to start a memory region but you can't specify the end address of the
region.

The related bits of the linker script we use are shown below:

PROVIDE(_stack_start = ORIGIN(RAM) + LENGTH(RAM));

SECTIONS
{
  /* .. */
  .bss : ALIGN(4)
  {
    _sbss = .;
    *(.bss .bss.*);
    . = ALIGN(4);
    _ebss = .;
  } > RAM

  .data : ALIGN(4)
  {
    _sidata = LOADADDR(.data);
    _sdata = .;
    *(.data .data.*);
    . = ALIGN(4);
    _edata = .;
  } > RAM AT > FLASH

  /* The heap starts right after the .bss + .data section ends */
  _sheap = _edata;

  /* .. */
}

A C implementation of this region reordering uses a two-step linking process. See
this StackOverflow answer for details.

Other options for stack overflow protection

Assuming that we can't change the memory layout, we could:

- Use the MPU (Memory Protection Unit) to mark the upper boundary of the static region as
  read-only. In this scenario, when a stack overflow occurs a MemManage exception is raised. This has
  an initialization cost but no runtime cost. The downside is that not all microcontrollers have an
  MPU.
- Implement stack probes for the Cortex-M targets. I believe enabling stack probes carries a runtime
  cost (per function call?) but I'm not sure.
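
For the MPU option, the sketch below computes the register values for a 32-byte read-only guard region placed just above the statics. It runs on the host and only does the bit arithmetic. It assumes the ARMv7-M MPU register layout (`MPU_RBAR` / `MPU_RASR`, as found on Cortex-M3/M4/M7); ARMv8-M parts use a different register scheme. The helper names and the example address are hypothetical, and the memory-attribute bits (TEX, S, C, B) are simply left at zero here.

```rust
/// ARMv7-M AP encoding: read-only for both privileged and unprivileged code.
const AP_READ_ONLY: u32 = 0b110;

/// Round `addr` up to a multiple of `align` (which must be a power of two).
fn align_up(addr: u32, align: u32) -> u32 {
    (addr + align - 1) & !(align - 1)
}

/// MPU_RBAR value: base address, VALID (bit 4), region number (bits 3:0).
/// The base must be aligned to the region size.
fn mpu_rbar(base: u32, size_bytes: u32, region: u32) -> Option<u32> {
    if region > 15 || !size_bytes.is_power_of_two() || size_bytes < 32 {
        return None;
    }
    if base % size_bytes != 0 {
        return None;
    }
    Some(base | (1 << 4) | region)
}

/// MPU_RASR value: XN (bit 28), AP (bits 26:24), SIZE (bits 5:1), ENABLE (bit 0).
/// The region covers 2^(SIZE + 1) bytes, with a minimum of 32 bytes.
fn mpu_rasr(size_bytes: u32, ap: u32, execute_never: bool) -> Option<u32> {
    if !size_bytes.is_power_of_two() || size_bytes < 32 {
        return None;
    }
    let size_field = size_bytes.trailing_zeros() - 1;
    Some((u32::from(execute_never) << 28) | ((ap & 0b111) << 24) | (size_field << 1) | 1)
}

fn main() {
    // Assumed example: the static region ends at 0x2000_0008.
    let statics_end = 0x2000_0008;
    let guard_base = align_up(statics_end, 32);
    if let (Some(rbar), Some(rasr)) = (
        mpu_rbar(guard_base, 32, 0),
        mpu_rasr(32, AP_READ_ONLY, true),
    ) {
        println!("guard at {:#010x}: RBAR = {:#010x}, RASR = {:#010x}", guard_base, rbar, rasr);
    }
}
```

On a real target these two values would be written to the MPU once during initialization, and the MPU would also need to be enabled with a background map for the remaining memory.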


pftbest · 2017-09-27T23:10:53Z

You can achieve a similar result if you set a fixed size for the stack. You just create a section with that size at the beginning of RAM, and set the stack pointer to the end of that section.

Downside is that you waste some memory at the end.

japaric · 2017-10-02T10:10:44Z

@pftbest Yeah, I thought about that option but decided not to include it here because it seems to be very far from ideal. Reasons: (a) it requires user input (there's no sensible default, I think), (b) it's error-prone (e.g. if the selected stack space is too large you'll get an error about there not being enough space to fit the static variables) and (c) it's not efficient (if you pick a stack size that's too small then you leave space unused, which is what you mentioned).
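
Points (b) and (c) come down to simple arithmetic, sketched below as a host-side Rust program. The RAM length (0x5000) and the 8 bytes of statics are assumed example values, and `Budget` / `fixed_stack_budget` are hypothetical names used only for this illustration.

```rust
/// Outcome of reserving a fixed-size stack section at the start of RAM.
#[derive(Debug, PartialEq)]
enum Budget {
    /// Stack and statics fit; `unused` bytes of RAM belong to neither.
    Fits { unused: u64 },
    /// The statics no longer fit; linking would fail by `shortfall` bytes.
    TooLarge { shortfall: u64 },
}

fn fixed_stack_budget(ram_len: u32, stack_size: u32, statics_len: u32) -> Budget {
    // Widen to u64 so the sum cannot overflow.
    let needed = u64::from(stack_size) + u64::from(statics_len);
    let available = u64::from(ram_len);
    if needed <= available {
        Budget::Fits { unused: available - needed }
    } else {
        Budget::TooLarge { shortfall: needed - available }
    }
}

fn main() {
    // Assumed example values: 20 KiB of RAM and 8 bytes of statics.
    let (ram_len, statics_len) = (0x5000, 8);
    for stack_size in [0x1000u32, 0x4ff8, 0x5000] {
        println!(
            "stack {:#x}: {:?}",
            stack_size,
            fixed_stack_budget(ram_len, stack_size, statics_len)
        );
    }
}
```

Only one stack size leaves no RAM unused, and that size changes every time the statics change, which is why picking it by hand is awkward.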

whitequark · 2017-10-15T11:18:04Z

> Implement stack probes for the Cortex-M targets. I believe enabling stack probes carries a runtime
> cost (per function call?) but I'm not sure.

Stack probes only (currently) exist in Rust to avoid "jumping over" the guard page of the thread stack and into the heap. The old mechanism, which relied on segmented stacks and TLS slots, is dead and it would be really annoying to resurrect for this issue.

whitequark · 2017-10-15T11:19:49Z

@japaric Anyway, I know how to do this. Place the static sections twice. The first time you place them, they go into /dev/null (via /DISCARD/ or something) but you can get the size with the SIZEOF operator. The second time, you use that result. So it's two-pass linking in one ld invocation.


This is one possible solution to the stack overflow problem described in #34. This approach uses a linker wrapper, called [swap-ld], to generate the desired memory layout. See #34 for a description of the desired memory layout and #41 for a description of how `swap-ld` works.

The observable effects of this change in cortex-m programs are:

- the `_sbss` symbol is now override-able.
- there is now a `.stack` linker section that denotes the span of the call stack. `.stack` won't be loaded into the program; it just exists for informative purposes (`swap-ld` uses this information).

Given the following program:

``` rust
fn main() {
    static mut X: u32 = 0;
    static mut Y: u32 = 1;

    loop {
        unsafe {
            ptr::write_volatile(&mut X, X + 1);
            ptr::write_volatile(&mut Y, Y + 1);
        }
    }
}
```

If you link this program using the `arm-none-eabi-ld` linker, which is the cortex-m-quickstart default, you'll get the following memory layout:

``` console
$ console
section           size         addr
.vector_table    0x130    0x8000000
.text             0x94    0x8000130
.rodata            0x0    0x80001c4
.stack          0x5000   0x20000000
.bss               0x4   0x20000000
.data              0x4   0x20000004
```

Note how the space reserved for the stack (depicted by the `.stack` linker section) overlaps with the space where .bss and .data reside.

If you, instead, link this program using `swap-ld` you'll get the following memory layout:

``` console
$ arm-none-eabi-size -Ax app
section           size         addr
.vector_table    0x130    0x8000000
.text             0x94    0x8000130
.rodata            0x0    0x80001c4
.stack          0x4ff8   0x20000000
.bss               0x4   0x20004ff8
.data              0x4   0x20004ffc
```

Note that no overlap exists in this case and that the call stack size has been reduced to accommodate the .bss and .data sections.

Unlike #41, the addresses of static variables are now correct:

``` console
$ arm-none-eabi-objdump -CD app
Disassembly of section .vector_table:

08000000 <_svector_table>:
 8000000:       20004ff8        strdcs  r4, [r0], -r8 ; initial Stack Pointer

08000004 <cortex_m_rt::RESET_VECTOR>:
 8000004:       08000131        stmdaeq r0, {r0, r4, r5, r8}

08000008 <EXCEPTIONS>:
 8000008:       080001bd        stmdaeq r0, {r0, r2, r3, r4, r5, r7, r8}
(..)

Disassembly of section .stack:

20000000 <.stack>:
        ...

Disassembly of section .bss:

20004ff8 <cortex_m_quickstart::main::X>:
20004ff8:       00000000        andeq   r0, r0, r0

Disassembly of section .data:

20004ffc <_sdata>:
20004ffc:       00000001        andeq   r0, r0, r1
```

closes #34

[swap-ld]: https://github.com/japaric/swap-ld

ia0 · 2022-01-01T20:38:59Z

Hi,

As far as I understand, the fix in #43 was reverted as a side effect of #64. It looks like no support from cortex-m-rt is actually needed to get stack overflow protection as well as maximum heap usage (related to #5); a memory.x like the following is enough:

__stack_size = 0x10000;

MEMORY
{
  FLASH : ORIGIN = 0x00000000, LENGTH = 0x00100000
  RAM   : ORIGIN = 0x20000000 + __stack_size, LENGTH = 0x00040000 - __stack_size
}

_stack_start = ORIGIN(RAM);
__eheap = ORIGIN(RAM) + LENGTH(RAM);

Then initializing the heap with:

extern "C" {
    static mut __sheap: u32;
    static mut __eheap: u32;
}
let sheap = unsafe { &mut __sheap } as *mut u32 as usize;
let eheap = unsafe { &mut __eheap } as *mut u32 as usize;
assert!(sheap < eheap);
// Unsafe: Called only once before any allocation.
unsafe { ALLOCATOR.init(sheap, eheap - sheap) }
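
As a sanity check of the memory.x arithmetic, the following host-side Rust sketch reproduces the resulting addresses. The physical RAM origin, length and stack size are taken from the memory.x above. The 0x400 bytes of statics, and therefore the `__sheap` value, are an assumption for illustration, since the real value is decided by the linker. `Regions`, `split_ram` and `heap_len` are hypothetical names.

```rust
/// Addresses produced by carving a fixed-size stack out of the start of RAM.
#[derive(Debug, PartialEq)]
struct Regions {
    /// Lowest address of the stack; anything below it is outside RAM.
    stack_limit: u32,
    /// Initial stack pointer, i.e. `_stack_start = ORIGIN(RAM)`.
    stack_start: u32,
    /// End of the heap, i.e. `__eheap = ORIGIN(RAM) + LENGTH(RAM)`.
    eheap: u32,
}

fn split_ram(phys_origin: u32, phys_len: u32, stack_size: u32) -> Option<Regions> {
    if stack_size > phys_len {
        return None;
    }
    // RAM as seen by the linker starts right after the reserved stack.
    let ram_origin = phys_origin.checked_add(stack_size)?;
    let ram_len = phys_len - stack_size;
    Some(Regions {
        stack_limit: phys_origin,
        stack_start: ram_origin,
        eheap: ram_origin.checked_add(ram_len)?,
    })
}

/// Heap size as computed by the initialization code: `eheap - sheap`.
fn heap_len(sheap: u32, eheap: u32) -> Option<u32> {
    if sheap < eheap {
        Some(eheap - sheap)
    } else {
        None
    }
}

fn main() {
    if let Some(regions) = split_ram(0x2000_0000, 0x0004_0000, 0x1_0000) {
        // Assumption: statics occupy 0x400 bytes, so the heap starts right after them.
        let sheap = regions.stack_start + 0x400;
        println!("{:x?}, heap bytes: {:x?}", regions, heap_len(sheap, regions.eheap));
    }
}
```

With this split the stack grows down from 0x2001_0000 towards 0x2000_0000, so an overflow leaves RAM instead of running into the statics or the heap.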

However, I wonder if cortex-m-rt should help the user to achieve this. I'm not sure how, so I'm asking in this issue. I can open a new issue if needed.

Thanks!

adamgreig · 2022-01-02T01:49:11Z

The concept from #43 ended up in https://github.com/knurling-rs/flip-link which is probably the better way to do it (for now at least!). The problem with doing it in the linker script is you end up having to know in advance a specific size for the stack, so either you overestimate and things don't fit, or you end up with lots of unused RAM that could have been stack. With a tool like flip-link, the stack is automatically made as big as it can be. Ideally, the linker itself could figure this out for us, but I don't believe that's currently possible.

ia0 · 2022-01-02T13:10:17Z

Hi @adamgreig ,

I'm aware of flip-link but I've hit knurling-rs/flip-link#43 for one of the boards I want to support (nRF52840-dongle using DFU, where I need to reserve space at the beginning of the flash for the bootloader), so I stopped using it.

> The problem with doing it in the linker script is you end up having to know in advance a specific size for the stack

This is not a problem if you use a heap, because both the stack and the heap are variable-sized. As long as you use only one (you always use a stack, so as long as you don't use a heap), you don't need a linker script. But as soon as you use a heap, you need to decide how to assign the non-data RAM between the stack and the heap, essentially choosing a size for both.

So the main issue I see with doing it in the linker script is that it's not convenient. The best scenario would be something like:

- Specify the heap with some magic, e.g. #[cortex_m_rt::heap] static mut HEAP: [u8; HEAP_SIZE] = [0; HEAP_SIZE]; (the macro will just use the size of this symbol and ignore it from the binary). If the magic is missing, no heap is used.
- Use flip-link (once the memory.x override bug is fixed) to get the stack-overflow protection.

japaric added enhancement help wanted labels Sep 27, 2017

japaric added a commit that referenced this issue Oct 20, 2017

place static vars at the end of RAM

cd692ae

so that the stack can never collide into them closes #34

japaric mentioned this issue Oct 20, 2017

[WIP] place static vars at the end of RAM #41

Closed

japaric mentioned this issue Nov 9, 2017

Zero cost stack overflow protection, take 2 #43

Merged

3 tasks

japaric closed this as completed in #43 Feb 17, 2018

jjyr mentioned this issue Jul 8, 2020

Detect stack overflow nervosnetwork/capsule#10

Closed

jonas-schievink mentioned this issue Jan 23, 2022

Enable stack overflow protection by default rust-embedded/cortex-m#408

Open

ArthurHeymans mentioned this issue Oct 25, 2024

Add stack overflow detection to SW emulator chipsalliance/caliptra-sw#1735

Closed
