Use AVX512 intrinsics for numeric conversion in JIT #93896

martinothamar
Contributor

JIT version of #89330.

For Vector128.ConvertToDouble

Before

G_M000_IG02:                ;; offset=0x0007
       call     [OthmarTesting.Testing:Src():long]
       vpbroadcastq  xmm0, rax
       vpblendd xmm1, xmm0, xmmword ptr [reloc @RWD00], 10
       vpsrlq   xmm0, xmm0, 32
       vpxorq   xmm0, xmm0, qword ptr [reloc @RWD16] {1to2}
       vsubpd   xmm0, xmm0, qword ptr [reloc @RWD24] {1to2}
       vaddpd   xmm0, xmm0, xmm1
       vmovups  xmmword ptr [rbx], xmm0
       mov      rax, rbx

After

G_M000_IG02:                ;; offset=0x0007
       call     [OthmarTesting.Testing:Src():long]
       vpbroadcastq  xmm0, rax
       vcvtqq2pd xmm0, xmm0
       vmovups  xmmword ptr [rbx], xmm0
       mov      rax, rbx
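
For readers unfamiliar with the "Before" sequence: it appears to be the well-known magic-number technique for converting a 64-bit signed integer to double without a dedicated instruction. The scalar C sketch below mirrors the blend / shift / xor / subtract / add steps one lane at a time. The three constants are assumptions, since the listing only shows them as `[reloc @RWD..]` references; they are the values conventionally used for this trick (2^52, 2^84 + 2^63, and 2^84 + 2^63 + 2^52). The function name is hypothetical.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a 64-bit pattern as an IEEE-754 double (no numeric conversion). */
static double bits_to_double(uint64_t bits) {
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}

/* Hypothetical scalar model of the "Before" sequence. Constants are assumed. */
double int64_to_double_magic(int64_t v) {
    uint64_t u = (uint64_t)v;

    /* vpblendd: keep the low 32 bits, overwrite the high 32 bits with the
       exponent pattern of 2^52. Value as a double: 2^52 + low32. */
    uint64_t lo_bits = (u & 0xFFFFFFFFull) | 0x4330000000000000ull;

    /* vpsrlq + vpxorq: move the high 32 bits down, flip the sign bit of that
       half, and set the exponent pattern of 2^84.
       Value as a double: 2^84 + 2^63 + (signed high32) * 2^32. */
    uint64_t hi_bits = (u >> 32) ^ 0x4530000080000000ull;

    /* vsubpd: subtract 2^84 + 2^63 + 2^52. This step is exact. */
    double hi = bits_to_double(hi_bits) - bits_to_double(0x4530000080100000ull);

    /* vaddpd: the 2^52 terms cancel, leaving high32 * 2^32 + low32,
       rounded once to the nearest double. */
    return hi + bits_to_double(lo_bits);
}
```

Under these assumptions the result should agree with a plain `(double)v` cast in the default rounding mode, which is what `vcvtqq2pd` computes per lane in the "After" listing.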

@ghost ghost added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI community-contribution Indicates that the PR has been added by a community member labels Oct 23, 2023
ghost commented Oct 23, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@martinothamar
Contributor Author

Should I implement these hardware intrinsics for Vector<T> (simdashwintrinsic.cpp) as well? I see that NI_VectorT128_ConvertToDouble, for example, is only accelerated on ARM64.

@adamsitnik
Member

@dotnet/avx512-contrib the previous PR (#89330) from @martinothamar was left without feedback for over two months; please don't let this PR go stale too.

@tannergooding
Member

We appreciate the PR, but this is not something we'll be able to take at this time.

There are longer discussions covering the reasons why in #84384 and #61885, but in essence we want the default behavior to be deterministic across platforms.

This entails not simply executing the direct instruction, but also doing the work to ensure that overflow results in saturation rather than some sentinel value, and doing the equivalent work to ensure that scalars, constant folding, and other paths all have the same behavior. This work is significantly more involved and requires additional testing, general integration, etc.

As such, it is work that we will be doing ourselves when the time is appropriate (this will likely be in the next few months).
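
To illustrate the saturation point, here is a minimal C sketch of a saturating floating-point to integer conversion. It is not the runtime's implementation: the function name is hypothetical, the choice of double to 64-bit signed integer is only for illustration, and mapping NaN to 0 is an assumed convention. For contrast, the x86 scalar instruction `cvttsd2si` returns the sentinel `0x8000000000000000` for NaN and for any out-of-range input.

```c
#include <math.h>
#include <stdint.h>

/* Hypothetical helper: convert with saturation instead of a sentinel result. */
int64_t double_to_int64_saturating(double x) {
    if (isnan(x)) {
        return 0;                  /* assumed convention for NaN */
    }
    if (x >= 9223372036854775808.0) {   /* x >= 2^63: too large for int64_t */
        return INT64_MAX;
    }
    if (x <= -9223372036854775808.0) {  /* x <= -2^63: clamp to the minimum */
        return INT64_MIN;
    }
    return (int64_t)x;             /* in range: truncate toward zero */
}
```

The range checks come first because a plain cast of an out-of-range double is undefined behavior in C. Every other path that performs the same conversion (scalar code, constant folding, each vector width) would need to produce identical results for the behavior to be deterministic.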

@martinothamar martinothamar deleted the use-avx512-for-number-conversion-intrinsics-jit branch October 26, 2023 07:26
@martinothamar
Contributor Author

Ah, I see. That makes sense, thanks for the quick reply. I'll go ahead and close the other PR as well.

@ghost ghost locked as resolved and limited conversation to collaborators Nov 25, 2023