Mono problems with double to nuint conversion #64570

uweigand · 2022-01-31T21:46:10Z

After #64234 was merged, I now see failures on s390x (which uses Mono by default):

    System.Runtime.InteropServices.Tests.NFloatTests.NFloatToUIntPtr(value: -4567) [FAIL]
      Assert.Equal() Failure
      Expected: 0
      Actual:   18446744073709547049
      Stack Trace:
        /home/uweigand/runtime/src/libraries/System.Runtime.InteropServices/tests/System.Runtime.InteropServices.UnitTests/System/Runtime/InteropServices/NFloatTests.cs(555,0): at System.Runtime.InteropServices.Tests.NFloatTests.NFloatToUIntPtr(Single value)
        /home/uweigand/runtime/src/mono/System.Private.CoreLib/src/System/Reflection/RuntimeMethodInfo.Mono.cs(386,0): at System.Reflection.RuntimeMethodInfo.InvokeWorker(Object obj, BindingFlags invokeAttr, Span`1 parameters)

The failing test case does the following:

        public void NFloatToUIntPtr(float value)
        {
            nuint result = (nuint)new NFloat(value);
            Assert.Equal((nuint)value, result);
        }

The expected value is computed by directly converting a float to nuint. The actual value is computed by first converting the float to a double, and then converting the double to nuint. The float-to-double conversion is harmless. The problem is that float to nuint and double to nuint use quite different code paths in Mono.

The code in type_from_op in mini/method-to-ir.c does this:

        case MONO_CEE_CONV_U:
[...]
                case STACK_R8:
                        ins->opcode = OP_FCONV_TO_U;
                        break;
                case STACK_R4:
                        if (TARGET_SIZEOF_VOID_P == 8)
                                ins->opcode = OP_RCONV_TO_U8;
                        else
                                ins->opcode = OP_RCONV_TO_U4;
                        break;

So on s390x the float-to-nuint case uses OP_RCONV_TO_U8, which is handled fine in the s390x back end. However, the double-to-nuint case uses OP_FCONV_TO_U, which never even reaches the back end, because for this opcode, Mono unconditionally registers an emulation icall, see register_icalls in mini/mini-runtime.c:

        // FIXME Elsewhere these are registered with no_wrapper = FALSE
#if SIZEOF_REGISTER == 4
        register_opcode_emulation (OP_FCONV_TO_U, __emul_fconv_to_u, mono_icall_sig_uint32_double, mono_fconv_u4, TRUE);
#else
        register_opcode_emulation (OP_FCONV_TO_U, __emul_fconv_to_u, mono_icall_sig_ulong_double, mono_fconv_u8, TRUE);
#endif

All other emulation icalls seem to be guarded by some #ifdef MONO_ARCH_... guard, but not this one. This will be unconditionally emulated even if the back end would handle OP_FCONV_TO_U8 and/or OP_FCONV_TO_U4.

The problem now comes from the fact that the emulation function does not fully model the expected semantics:

guint64
mono_fconv_u8 (double v)
{
#if defined(TARGET_X86) || defined(TARGET_AMD64)
        const double two63 = 2147483648.0 * 4294967296.0;
        if (v < two63) {
                return (gint64)v;
        } else {
                return (gint64)(v - two63) + ((guint64)1 << 63);
        }
#else
        if (mono_isinf (v) || mono_isnan (v))
                return 0;
        return (guint64)v;
#endif
}

Conversion of a negative floating-point number to an unsigned type is expected to return 0. But this C implementation does not guarantee this. The explicit Intel implementation definitely does not do this for negative numbers, and the cross-platform pure C version runs into undefined behavior ("If the value of the integral part cannot be represented by the integer type, the behavior is undefined.").

On s390x, the actual behavior you'll see varies between compilers. By default, the runtime build uses clang, and clang defaults to targeting a very old s390x processor, which didn't yet have float-to-unsigned conversion instructions. So clang will instead emit code that does exactly what that Intel special case open-codes.

This has the effect that a negative floating-point input is converted to a signed integer first, and the result then reinterpreted as unsigned. Therefore we get from (double)4567.0 first (intptr_t)-4567 and then (uintptr_t)18446744073709547049.

The other conversion goes directly from float to nuint, which uses OP_RCONV_TO_U8. This is implemented in the s390x back end by the single instruction that correctly handles all corner cases, so negative inputs convert to 0. This gives the mismatch.

I think on Intel you don't see this problem because the mini-amd64.c back end does not implement OP_RCONV_TO_U8, and therefore this conversion also falls back to an emulated icall, which gives the same (wrong) result, so the test succeeds.

Now I'm wondering what the right fix is. Simply changing type_from_op to this:

        case MONO_CEE_CONV_U:
[...]
                case STACK_R8:
                        if (TARGET_SIZEOF_VOID_P == 8)
                                ins->opcode = OP_FCONV_TO_U8;
                        else
                                ins->opcode = OP_FCONV_TO_U4;
                        break;
                case STACK_R4:
                        if (TARGET_SIZEOF_VOID_P == 8)
                                ins->opcode = OP_RCONV_TO_U8;
                        else
                                ins->opcode = OP_RCONV_TO_U4;
                        break;

fixes the failure for me. However, I'm not sure if this would affect other back ends. Also, this seems to make OP_FCONV_TO_U completely unused so there probably should be a bigger cleanup.

Any thoughts or suggestions?

CC @tannergooding @vargaz @lambdageek

The text was updated successfully, but these errors were encountered:

ghost · 2022-01-31T21:46:13Z

I couldn't figure out the best area label to add to this issue. If you have write-permissions please help me learn by adding exactly one area label.

uweigand · 2022-01-31T21:46:28Z

CC @nealef

tannergooding · 2022-01-31T22:11:47Z

The problem is that float to nuint and double to nuint use quite different code paths in Mono.

This would be a bug in the mono logic. Any float represents the same underlying value as its double counterpart and so float->double->nuint and float->nuint should produce identical results for any input (at least when running on the same platform, as indicated below there are currently xplat differences that we aren't handling today).

On s390x, the actual behavior you'll see varies between compilers. By default, the runtime build uses clang, and clang defaults to targeting a very old s390x processor, which didn't yet have float-to-unsigned conversion instructions. So clang will instead emit code that does exactly what that Intel special case open-codes.

This is actually something I'm looking into resolving more broadly and is being tracked by #61885. I have a prototype that fixes this here: #61761, but Mono is having issues (likely due to similar issues listed above) and so its still a WIP.

This, if we go through with it, will be a decent "breaking change", but overall one that helps remove these types of issues and inconsistencies for most users and so will ultimately reduce bugs.
The current prototype notably is trying to ignore platform specific handling. Each platform would need to add it back where the logic matches the expected behavior. I'm not familiar with s390x and so I don't know if it matches the "saturating" arithmetic that Arm64 uses (and which WASM, Rust, and others are standardizing on).

lambdageek · 2022-02-01T14:19:36Z

/cc @imhameed @vargaz

lambdageek · 2022-02-01T14:25:55Z

Simply changing type_from_op to [do cases on TARGET_SIZE_OF_VOID_P]

That seems like the right fix to me. (leaving aside the longer-term project of giving a consistent semantics to unsigned conversion for .NET).

We run the NFloatToUIntPtr tests on apple platforms right? We should apply the suggested fix and verify that the tests still pass. That seems targeted enough that we could take it to .NET 6 servicing.

uweigand · 2022-02-01T14:57:47Z

Simply changing type_from_op to [do cases on TARGET_SIZE_OF_VOID_P]

That seems like the right fix to me. (leaving aside the longer-term project of giving a consistent semantics to unsigned conversion for .NET).

I've just posted the final version of this fix (including the OP_FCONV_TO_U removal) as #64618. This passes the test suite on s390x for me and fixes the regression.

* Emit OP_FCONV_TO_U8/U4 instead of OP_FCONV_TO_U. * Remove OP_FCONV_TO_U implementations across all back ends. * Remove duplicated mono_fconv_u8/u4 icall wrappers. * Fixes #64570

ghost added the untriaged New issue has not been triaged by the area owner label Jan 31, 2022

lambdageek added the area-Codegen-JIT-mono label Feb 1, 2022

uweigand mentioned this issue Feb 1, 2022

[mono] Remove OP_FCONV_TO_U opcode #64618

Merged

ghost added the in-pr There is an active PR which will close this issue when it is merged label Feb 1, 2022

lambdageek mentioned this issue Feb 1, 2022

[release/6.0] Expand System.Runtime.InteropServices.NFloat to support the APIs required by Xamarin #64556

Merged

akoeplinger closed this as completed in #64618 Feb 7, 2022

ghost removed the in-pr There is an active PR which will close this issue when it is merged label Feb 7, 2022

ghost locked as resolved and limited conversation to collaborators Mar 9, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Mono problems with double to nuint conversion #64570

Mono problems with double to nuint conversion #64570

uweigand commented Jan 31, 2022

ghost commented Jan 31, 2022

Uh oh!

uweigand commented Jan 31, 2022

Uh oh!

tannergooding commented Jan 31, 2022 •

edited

Loading

Uh oh!

lambdageek commented Feb 1, 2022

Uh oh!

lambdageek commented Feb 1, 2022

Uh oh!

uweigand commented Feb 1, 2022

Uh oh!

Mono problems with double to nuint conversion #64570

Mono problems with double to nuint conversion #64570

Comments

uweigand commented Jan 31, 2022

ghost commented Jan 31, 2022

Uh oh!

uweigand commented Jan 31, 2022

Uh oh!

tannergooding commented Jan 31, 2022 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

lambdageek commented Feb 1, 2022

Uh oh!

lambdageek commented Feb 1, 2022

Uh oh!

uweigand commented Feb 1, 2022

Uh oh!

tannergooding commented Jan 31, 2022 •

edited

Loading