[RISC-V] Simplifying the loop generated in genZeroInitFrameUsingBlockInit and jump encoding #114003

Closed
wants to merge 30 commits into from

sirntar
Member

@sirntar sirntar commented Mar 28, 2025

The main goal of this PR is to optimize the loop generated in genZeroInitFrameUsingBlockInit. Partially unrolling the loop when there are 12 or more register slots on the stack reduces the number of loop iterations. This resulted in an average performance gain of about 1.41% when running the coreCLR tests (10 samples before and after the change) and about 0.5% on average for the coreFX tests.

Loop example for 19 slots:
; loopBytes = (uRegSlots & ~3) * REGSIZE_BYTES
  addi           t1, t0, 128  ; rEndAddr = rAddr + loopBytes
loop_4:
  sd             zero, 0(t0)
  sd             zero, 8(t0)
  sd             zero, 16(t0)
  sd             zero, 24(t0)
  addi           t0, t0,  32 ; rAddr += 4 * REGSIZE_BYTES
  blt            t0, t1, loop_4
done_4:
  add            t0, t1, zero ; rAddr = rEndAddr
; uLclBytes -= loopBytes (-128)
; loopBytes = (uRegSlots & ~1) * REGSIZE_BYTES - loopBytes (144 - 128 = 16)
set_2:
  sd             zero, 0(t0)
  sd             zero, 8(t0)
done_2:
  addi           t0, t0,  16 ; rAddr += loopBytes 
; uLclBytes -= loopBytes (-16)
set_odd:
  sd             zero, 0(t0)
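
The split in the example above can be checked with a small model. The following Python sketch is an illustration only, not the JIT code: the names `plan_zero_init` and `simulate` are hypothetical, `REGSIZE_BYTES = 8` assumes RV64, and the loop is modelled as a pre-tested `while` rather than the post-tested `blt` loop in the listing (the two agree whenever the unrolled loop handles at least four slots).

```python
REGSIZE_BYTES = 8  # assumption: RV64, one register slot is 8 bytes


def plan_zero_init(u_reg_slots):
    """Return (loop_bytes, pair_bytes, odd_bytes) for a frame of u_reg_slots slots."""
    # Bytes covered by the 4x-unrolled loop.
    loop_bytes = (u_reg_slots & ~3) * REGSIZE_BYTES
    # Bytes covered by the optional two-store step (0 or 16).
    pair_bytes = (u_reg_slots & ~1) * REGSIZE_BYTES - loop_bytes
    # Bytes covered by the optional single trailing store (0 or 8).
    odd_bytes = (u_reg_slots & 1) * REGSIZE_BYTES
    return loop_bytes, pair_bytes, odd_bytes


def simulate(u_reg_slots):
    """Apply the plan to a fake frame and return it; 0 means the slot was zeroed."""
    frame = [1] * u_reg_slots
    loop_bytes, pair_bytes, odd_bytes = plan_zero_init(u_reg_slots)

    addr = 0                      # models t0 (rAddr)
    end_addr = addr + loop_bytes  # models t1 (rEndAddr)
    while addr < end_addr:        # models: blt t0, t1, loop_4
        for offset in (0, 8, 16, 24):
            frame[(addr + offset) // REGSIZE_BYTES] = 0
        addr += 4 * REGSIZE_BYTES

    if pair_bytes:                # models: set_2
        frame[addr // REGSIZE_BYTES] = 0
        frame[addr // REGSIZE_BYTES + 1] = 0
        addr += pair_bytes

    if odd_bytes:                 # models: set_odd
        frame[addr // REGSIZE_BYTES] = 0

    return frame


if __name__ == "__main__":
    print(plan_zero_init(19))
```

For 19 slots the plan is 128 bytes for the unrolled loop, 16 bytes for the pair, and 8 bytes for the odd slot, which matches the comments in the listing.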

The INS_bnez and INS_beqz pseudo-instructions have been there from the very beginning, but it wasn't possible to emit them using emitIns_R_I. Now it is possible (and recommended).
As for the "C" extension, new instructions will be defined with the prefix c., so, for example, bnez in the "C" extension will be c.bnez.
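
As background for the naming above, here is a minimal Python sketch of how the two pseudo-instructions relate to base instructions, and of the operand limits the compressed forms have in the RISC-V "C" extension. It is not the emitter's code: `expand_branch_pseudo` and `c_branch_allowed` are hypothetical helper names used only for illustration.

```python
def expand_branch_pseudo(mnemonic, rs, target):
    """Expand beqz/bnez into the base instruction that compares against zero."""
    base = {"beqz": "beq", "bnez": "bne"}[mnemonic]
    return f"{base} {rs}, zero, {target}"


def c_branch_allowed(rs_num, offset):
    """Check whether c.beqz/c.bnez could encode this register and byte offset.

    The compressed branches only accept registers x8..x15 and an even,
    9-bit signed byte offset (-256..254).
    """
    return 8 <= rs_num <= 15 and offset % 2 == 0 and -256 <= offset <= 254


if __name__ == "__main__":
    print(expand_branch_pseudo("bnez", "t1", "loop_4"))
    print(c_branch_allowed(6, -24))   # t1 is x6, outside x8..x15
    print(c_branch_allowed(10, -24))  # a0 is x10
```

One consequence worth noting: temporaries such as t0 (x5) and t1 (x6) fall outside the x8..x15 range, so a branch on them cannot use the compressed form.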

As for the changes to the jump encoding, I think the code is now more readable, but removing one shift and a few constants (which the compiler will optimize) does not improve performance in any significant way. So I'm waiting for your feedback.
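
For readers unfamiliar with why jump encoding involves shifts and masks at all, the sketch below packs a `jal` instruction following the J-type layout in the RISC-V base ISA. It is a standalone illustration and an assumption about what "jump encoding" covers here, not the PR's emitter code; `encode_jal` is a hypothetical name.

```python
def encode_jal(rd, offset):
    """Encode `jal rd, offset` as a 32-bit instruction word.

    J-type immediate layout (bits 31..12): imm[20 | 10:1 | 11 | 19:12].
    The offset is in bytes, must be even, and must fit in 21 signed bits.
    """
    if offset % 2 != 0 or not -(1 << 20) <= offset < (1 << 20):
        raise ValueError("offset must be even and fit in a 21-bit signed range")

    imm = offset & 0x1FFFFF  # two's-complement view of the 21-bit immediate
    return (
        (((imm >> 20) & 0x1) << 31)     # imm[20]
        | (((imm >> 1) & 0x3FF) << 21)  # imm[10:1]
        | (((imm >> 11) & 0x1) << 20)   # imm[11]
        | (((imm >> 12) & 0xFF) << 12)  # imm[19:12]
        | ((rd & 0x1F) << 7)            # destination register
        | 0x6F                          # JAL opcode
    )


if __name__ == "__main__":
    print(hex(encode_jal(1, 8)))   # jal ra, 8
    print(hex(encode_jal(0, -4)))  # j -4
```

The scattered immediate fields are the reason such code tends to accumulate magic constants, so readability is a reasonable goal even when the performance effect is negligible.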


part of #84834, cc @dotnet/samsung

@ghost ghost added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Mar 28, 2025
@dotnet-policy-service dotnet-policy-service bot added the community-contribution Indicates that the PR has been added by a community member label Mar 28, 2025
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@risc-vv

risc-vv commented Mar 28, 2025

RISC-V Release-CLR-VF2: 9404 / 9547 (98.50%)
=======================
      passed: 9404
      failed: 125
     skipped: 106
      killed: 18
------------------------
  TOTAL libs: 9653
 TOTAL tests: 9653
   REAL time: 2h 14min 2s 596ms
=======================

Release-CLR-VF2.md, Release-CLR-VF2.xml, testclr_output.tar.gz

Build information and commands

GIT: a4e62285a27b736c04c0c14b83094e66440f7013
CI: a8426a46d8575dfcb3b5fec0d7d0b7a7c118d690
REPO: sirntar/runtime
BRANCH: simplify-jumps
CONFIG: Release
LIB_CONFIG: Release

@risc-vv

risc-vv commented Mar 28, 2025

4893367 is being scheduled for building and testing

GIT: 4893367f0091651ce729a4bb3080357a95b48807
REPO: sirntar/runtime
BRANCH: simplify-jumps

@clamp03 clamp03 added the arch-riscv Related to the RISC-V architecture label Mar 28, 2025
@tomeksowi
Contributor

tomeksowi commented Mar 28, 2025

 sd             zero, 0(t0)
 sd             zero, 8(t0)
 sd             zero, 16(t0)
 sd             zero, 24(t0)
 addi           t1, t1, -1
 addi           t0, t0, 24
 bne            t1, zero, pc-24 (-6 instructions)

You could remove the decrement of the counter t1 by computing the end pointer into t1 and branching with blt t0, t1, loop_start.

EDIT: also, if you're zeroing 4 doublewords at a time, the addi should increment the pointer by 32 not 24 (PR description error?).
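
The effect of the end-pointer suggestion can be estimated with a simple instruction count. This Python sketch is a rough model under stated assumptions, not a measurement: it assumes one setup instruction in both variants (loading the counter, or computing the end pointer), and the function names are hypothetical.

```python
def counter_loop_insns(iterations):
    """Dynamic instruction count for the counter-based loop.

    Per iteration: 4 stores + counter decrement + pointer increment + bne = 7.
    Assumption: one setup instruction to load the iteration count.
    """
    return 1 + 7 * iterations


def end_pointer_loop_insns(iterations):
    """Dynamic instruction count for the end-pointer loop.

    Per iteration: 4 stores + pointer increment + blt = 6.
    Assumption: one setup instruction to compute the end pointer.
    """
    return 1 + 6 * iterations


if __name__ == "__main__":
    for iterations in (3, 4, 8):
        saved = counter_loop_insns(iterations) - end_pointer_loop_insns(iterations)
        print(iterations, saved)
```

Under this model the end-pointer form saves exactly one instruction per loop iteration and also frees the loop from maintaining a separate counter value.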

@sirntar
Member Author

sirntar commented Mar 28, 2025

@tomeksowi Good catch! Will change bnez to blt and test it.

EDIT: also, if you're zeroing 4 doublewords at a time, the addi should increment the pointer by 32 not 24 (PR description error?).

It's a mistake in the PR description; the code is correct: 4 * REGSIZE_BYTES + 3 * padding

@risc-vv

risc-vv commented Mar 28, 2025

e849ffc is being scheduled for building and testing

GIT: e849ffc590b64ba0553683e266611f5f50a782c7
REPO: sirntar/runtime
BRANCH: simplify-jumps

@risc-vv

risc-vv commented Mar 28, 2025

39212af is being scheduled for building and testing

GIT: 39212af07844ed00efb48bbc0cbcc2d7dafd0576
REPO: sirntar/runtime
BRANCH: simplify-jumps

@risc-vv

risc-vv commented Mar 28, 2025

RISC-V Release-FX-VF2: 0 / 258 (0.00%)
=======================
      passed: 0
      failed: 0
     skipped: 0
      killed: 258
------------------------
  TOTAL libs: 258
 TOTAL tests: 258
   REAL time: 13min 2s 817ms
=======================

Release-FX-VF2.md, Release-FX-VF2.xml, testfx_output.tar.gz

Build information and commands

GIT: 9dfc852c1c0e70c9904663baca0bdac9f0adc458
CI: a8426a46d8575dfcb3b5fec0d7d0b7a7c118d690
REPO: sirntar/runtime
BRANCH: simplify-jumps
CONFIG: Release
LIB_CONFIG: Release

RISC-V Release-CLR-VF2: 9480 / 9547 (99.30%)
=======================
      passed: 9480
      failed: 50
     skipped: 106
      killed: 17
------------------------
  TOTAL libs: 9653
 TOTAL tests: 9653
   REAL time: 2h 11min 4s 22ms
=======================

Release-CLR-VF2.md, Release-CLR-VF2.xml, testclr_output.tar.gz

Build information and commands

GIT: 9dfc852c1c0e70c9904663baca0bdac9f0adc458
CI: a8426a46d8575dfcb3b5fec0d7d0b7a7c118d690
REPO: sirntar/runtime
BRANCH: simplify-jumps
CONFIG: Release
LIB_CONFIG: Release

RISC-V Release-CLR-QEMU: 9482 / 9547 (99.32%)
=======================
      passed: 9482
      failed: 48
     skipped: 106
      killed: 17
------------------------
  TOTAL libs: 9653
 TOTAL tests: 9653
   REAL time: 2h 47min 40s 340ms
=======================

Release-CLR-QEMU.md, Release-CLR-QEMU.xml, testclr_output.tar.gz

Build information and commands

GIT: 9dfc852c1c0e70c9904663baca0bdac9f0adc458
CI: a8426a46d8575dfcb3b5fec0d7d0b7a7c118d690
REPO: sirntar/runtime
BRANCH: simplify-jumps
CONFIG: Release
LIB_CONFIG: Release

RISC-V Release-FX-QEMU: 0 / 258 (0.00%)
=======================
      passed: 0
      failed: 0
     skipped: 0
      killed: 258
------------------------
  TOTAL libs: 258
 TOTAL tests: 258
   REAL time: 10min 21s 84ms
=======================

Release-FX-QEMU.md, Release-FX-QEMU.xml, testfx_output.tar.gz

Build information and commands

GIT: 9dfc852c1c0e70c9904663baca0bdac9f0adc458
CI: a8426a46d8575dfcb3b5fec0d7d0b7a7c118d690
REPO: sirntar/runtime
BRANCH: simplify-jumps
CONFIG: Release
LIB_CONFIG: Release

@risc-vv

risc-vv commented Apr 3, 2025

RISC-V Release-CLR-VF2: 9524 / 9548 (99.75%)
=======================
      passed: 9524
      failed: 7
     skipped: 106
      killed: 17
------------------------
  TOTAL libs: 9654
 TOTAL tests: 9654
   REAL time: 2h 14min 23s 87ms
=======================

Release-CLR-VF2.md, Release-CLR-VF2.xml, testclr_output.tar.gz

Build information and commands

GIT: db2cad2547b9592cf4e601cf49eba672867adf90
CI: a8426a46d8575dfcb3b5fec0d7d0b7a7c118d690
REPO: sirntar/runtime
BRANCH: simplify-jumps
CONFIG: Release
LIB_CONFIG: Release

RISC-V Release-FX-VF2: 431638 / 493981 (87.38%)
=======================
      passed: 431638
      failed: 495
     skipped: 1353
      killed: 61848
------------------------
  TOTAL libs: 259
 TOTAL tests: 495334
   REAL time: 1h 52min 51s 913ms
=======================

Release-FX-VF2.md, Release-FX-VF2.xml, testfx_output.tar.gz

Build information and commands

GIT: db2cad2547b9592cf4e601cf49eba672867adf90
CI: a8426a46d8575dfcb3b5fec0d7d0b7a7c118d690
REPO: sirntar/runtime
BRANCH: simplify-jumps
CONFIG: Release
LIB_CONFIG: Release

RISC-V Release-CLR-QEMU: 9524 / 9548 (99.75%)
=======================
      passed: 9524
      failed: 7
     skipped: 106
      killed: 17
------------------------
  TOTAL libs: 9654
 TOTAL tests: 9654
   REAL time: 2h 47min 22s 231ms
=======================

Release-CLR-QEMU.md, Release-CLR-QEMU.xml, testclr_output.tar.gz

Build information and commands

GIT: db2cad2547b9592cf4e601cf49eba672867adf90
CI: a8426a46d8575dfcb3b5fec0d7d0b7a7c118d690
REPO: sirntar/runtime
BRANCH: simplify-jumps
CONFIG: Release
LIB_CONFIG: Release

RISC-V Release-FX-QEMU: 392211 / 466910 (84.00%)
=======================
      passed: 392211
      failed: 694
     skipped: 1279
      killed: 74005
------------------------
  TOTAL libs: 259
 TOTAL tests: 468189
   REAL time: 2h 5min 52s 245ms
=======================

Release-FX-QEMU.md, Release-FX-QEMU.xml, testfx_output.tar.gz

Build information and commands

GIT: db2cad2547b9592cf4e601cf49eba672867adf90
CI: a8426a46d8575dfcb3b5fec0d7d0b7a7c118d690
REPO: sirntar/runtime
BRANCH: simplify-jumps
CONFIG: Release
LIB_CONFIG: Release

@risc-vv

risc-vv commented Apr 4, 2025

RISC-V Release-CLR-VF2: 9478 / 9551 (99.24%)
=======================
      passed: 9478
      failed: 56
     skipped: 106
      killed: 17
------------------------
  TOTAL libs: 9657
 TOTAL tests: 9657
   REAL time: 2h 11min 10s 633ms
=======================

Release-CLR-VF2.md, Release-CLR-VF2.xml, testclr_output.tar.gz

Build information and commands

GIT: 848c521ea963fff364c19f15f100f5184814832a
CI: a8426a46d8575dfcb3b5fec0d7d0b7a7c118d690
REPO: dotnet/runtime
BRANCH: main
CONFIG: Release
LIB_CONFIG: Release

RISC-V Release-CLR-QEMU: 9478 / 9551 (99.24%)
=======================
      passed: 9478
      failed: 56
     skipped: 106
      killed: 17
------------------------
  TOTAL libs: 9657
 TOTAL tests: 9657
   REAL time: 2h 47min 22s 428ms
=======================

Release-CLR-QEMU.md, Release-CLR-QEMU.xml, testclr_output.tar.gz

Build information and commands

GIT: 848c521ea963fff364c19f15f100f5184814832a
CI: a8426a46d8575dfcb3b5fec0d7d0b7a7c118d690
REPO: dotnet/runtime
BRANCH: main
CONFIG: Release
LIB_CONFIG: Release

@risc-vv

risc-vv commented Apr 7, 2025

RISC-V Release-CLR-VF2: 9478 / 9551 (99.24%)
=======================
      passed: 9478
      failed: 56
     skipped: 106
      killed: 17
------------------------
  TOTAL libs: 9657
 TOTAL tests: 9657
   REAL time: 2h 13min 32s 482ms
=======================

Release-CLR-VF2.md, Release-CLR-VF2.xml, testclr_output.tar.gz

Build information and commands

GIT: ddf99012153a9a6e328c942f4796a272a769795c
CI: a8426a46d8575dfcb3b5fec0d7d0b7a7c118d690
REPO: dotnet/runtime
BRANCH: main
CONFIG: Release
LIB_CONFIG: Release

@risc-vv

risc-vv commented Apr 7, 2025

RISC-V Release-CLR-VF2: 9479 / 9551 (99.25%)
=======================
      passed: 9479
      failed: 55
     skipped: 106
      killed: 17
------------------------
  TOTAL libs: 9657
 TOTAL tests: 9657
   REAL time: 2h 13min 44s 609ms
=======================

Release-CLR-VF2.md, Release-CLR-VF2.xml, testclr_output.tar.gz

Build information and commands

GIT: fe1558731eb72f4e7346b697931d2f9406ce7957
CI: a8426a46d8575dfcb3b5fec0d7d0b7a7c118d690
REPO: dotnet/runtime
BRANCH: main
CONFIG: Release
LIB_CONFIG: Release

RISC-V Release-CLR-QEMU: 9480 / 9551 (99.26%)
=======================
      passed: 9480
      failed: 54
     skipped: 106
      killed: 17
------------------------
  TOTAL libs: 9657
 TOTAL tests: 9657
   REAL time: 2h 47min 22s 383ms
=======================

Release-CLR-QEMU.md, Release-CLR-QEMU.xml, testclr_output.tar.gz

Build information and commands

GIT: fe1558731eb72f4e7346b697931d2f9406ce7957
CI: a8426a46d8575dfcb3b5fec0d7d0b7a7c118d690
REPO: dotnet/runtime
BRANCH: main
CONFIG: Release
LIB_CONFIG: Release

@risc-vv

risc-vv commented Apr 7, 2025

RISC-V Release-CLR-VF2: 9476 / 9551 (99.21%)
=======================
      passed: 9476
      failed: 58
     skipped: 106
      killed: 17
------------------------
  TOTAL libs: 9657
 TOTAL tests: 9657
   REAL time: 2h 13min 28s 412ms
=======================

Release-CLR-VF2.md, Release-CLR-VF2.xml, testclr_output.tar.gz

Build information and commands

GIT: a7ac979da53e159b370d50cf4c5354df8ae75a8e
CI: a8426a46d8575dfcb3b5fec0d7d0b7a7c118d690
REPO: dotnet/runtime
BRANCH: main
CONFIG: Release
LIB_CONFIG: Release

RISC-V Release-CLR-QEMU: 9477 / 9551 (99.23%)
=======================
      passed: 9477
      failed: 57
     skipped: 106
      killed: 17
------------------------
  TOTAL libs: 9657
 TOTAL tests: 9657
   REAL time: 2h 47min 3s 285ms
=======================

Release-CLR-QEMU.md, Release-CLR-QEMU.xml, testclr_output.tar.gz

Build information and commands

GIT: a7ac979da53e159b370d50cf4c5354df8ae75a8e
CI: a8426a46d8575dfcb3b5fec0d7d0b7a7c118d690
REPO: dotnet/runtime
BRANCH: main
CONFIG: Release
LIB_CONFIG: Release

RISC-V Release-FX-QEMU: 434206 / 498796 (87.05%)
=======================
      passed: 434206
      failed: 1105
     skipped: 1212
      killed: 63485
------------------------
  TOTAL libs: 259
 TOTAL tests: 500008
   REAL time: 1h 44min 42s 733ms
=======================

Release-FX-QEMU.md, Release-FX-QEMU.xml, testfx_output.tar.gz

Build information and commands

GIT: a7ac979da53e159b370d50cf4c5354df8ae75a8e
CI: a8426a46d8575dfcb3b5fec0d7d0b7a7c118d690
REPO: dotnet/runtime
BRANCH: main
CONFIG: Release
LIB_CONFIG: Release

@risc-vv

risc-vv commented Apr 9, 2025

RISC-V Release-CLR-VF2: 9477 / 9552 (99.21%)
=======================
      passed: 9477
      failed: 58
     skipped: 106
      killed: 17
------------------------
  TOTAL libs: 9658
 TOTAL tests: 9658
   REAL time: 2h 11min 1s 756ms
=======================

Release-CLR-VF2.md, Release-CLR-VF2.xml, testclr_output.tar.gz

Build information and commands

GIT: cf50bc6d738f7dafeba3bb8b3bb3ec94dbbd5ac1
CI: 09909bfe3d23ad26455327811013bcbb48915255
REPO: dotnet/runtime
BRANCH: main
CONFIG: Release
LIB_CONFIG: Release

RISC-V Release-CLR-QEMU: 9478 / 9552 (99.23%)
=======================
      passed: 9478
      failed: 57
     skipped: 106
      killed: 17
------------------------
  TOTAL libs: 9658
 TOTAL tests: 9658
   REAL time: 2h 48min 14s 953ms
=======================

Release-CLR-QEMU.md, Release-CLR-QEMU.xml, testclr_output.tar.gz

Build information and commands

GIT: cf50bc6d738f7dafeba3bb8b3bb3ec94dbbd5ac1
CI: 09909bfe3d23ad26455327811013bcbb48915255
REPO: dotnet/runtime
BRANCH: main
CONFIG: Release
LIB_CONFIG: Release

RISC-V Release-FX-QEMU: 363721 / 434281 (83.75%)
=======================
      passed: 363721
      failed: 572
     skipped: 1254
      killed: 69988
------------------------
  TOTAL libs: 259
 TOTAL tests: 435535
   REAL time: 2h 0min 49s 39ms
=======================

Release-FX-QEMU.md, Release-FX-QEMU.xml, testfx_output.tar.gz

Build information and commands

GIT: cf50bc6d738f7dafeba3bb8b3bb3ec94dbbd5ac1
CI: 09909bfe3d23ad26455327811013bcbb48915255
REPO: dotnet/runtime
BRANCH: main
CONFIG: Release
LIB_CONFIG: Release

@am11
Member

am11 commented Apr 10, 2025

RISC-V Release-CLR-VF2: 9477 / 9552 (99.21%)

In other PRs, it's 99.79%. Is this a regression caused by this change?

@risc-vv

risc-vv commented Apr 10, 2025

RISC-V Release-CLR-VF2: 9518 / 9552 (99.64%)
=======================
      passed: 9518
      failed: 17
     skipped: 107
      killed: 17
------------------------
  TOTAL libs: 9659
 TOTAL tests: 9659
   REAL time: 2h 11min 36s 369ms
=======================

Release-CLR-VF2.md, Release-CLR-VF2.xml, testclr_output.tar.gz

Build information and commands

GIT: 3bbf6556dbad10b9acc69a53799b03a7e1bfa90a
CI: 09909bfe3d23ad26455327811013bcbb48915255
REPO: dotnet/runtime
BRANCH: main
CONFIG: Release
LIB_CONFIG: Release

RISC-V Release-CLR-QEMU: 9521 / 9552 (99.68%)
=======================
      passed: 9521
      failed: 14
     skipped: 107
      killed: 17
------------------------
  TOTAL libs: 9659
 TOTAL tests: 9659
   REAL time: 2h 47min 56s 808ms
=======================

Release-CLR-QEMU.md, Release-CLR-QEMU.xml, testclr_output.tar.gz

Build information and commands

GIT: 3bbf6556dbad10b9acc69a53799b03a7e1bfa90a
CI: 09909bfe3d23ad26455327811013bcbb48915255
REPO: dotnet/runtime
BRANCH: main
CONFIG: Release
LIB_CONFIG: Release

RISC-V Release-FX-QEMU: 134013 / 205549 (65.20%)
=======================
      passed: 134013
      failed: 631
     skipped: 1016
      killed: 70905
------------------------
  TOTAL libs: 259
 TOTAL tests: 206565
   REAL time: 1h 58min 53s 891ms
=======================

Release-FX-QEMU.md, Release-FX-QEMU.xml, testfx_output.tar.gz

Build information and commands

GIT: 3bbf6556dbad10b9acc69a53799b03a7e1bfa90a
CI: 09909bfe3d23ad26455327811013bcbb48915255
REPO: dotnet/runtime
BRANCH: main
CONFIG: Release
LIB_CONFIG: Release

Contributor

Draft Pull Request was automatically closed for 30 days of inactivity. Please let us know if you'd like to reopen it.

Labels
arch-riscv Related to the RISC-V architecture area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI community-contribution Indicates that the PR has been added by a community member
5 participants