LLVM error "ran out of registers during register allocation" on 32-bit x86 with code-coverage #42200

Open
kimikage opened this issue Sep 10, 2021 · 8 comments

@kimikage
Contributor

This comes from the discussion on Discourse.

I found a strange error, "ran out of registers during register allocation", in the CI log of a package.
The error occurs on 32-bit x86, but I couldn't figure out the trigger for a long time. Later, @ChenNingCong found that the code inserted for code coverage has something to do with it.

The following is an MWE. I think the easiest way to reproduce the error is to put it in runtests.jl and run it with pkg> test --coverage.

struct RGB8
    r::UInt8
    g::UInt8
    b::UInt8
end
Base.broadcastable(c::RGB8) = Ref(c)

_mapc(f, x::RGB8, y::RGB8) = RGB8(f(x.r, y.r), f(x.g, y.g), f(x.b, y.b))

function lighten(c1::RGB8, c2::RGB8, opacity::AbstractFloat)
    mixed = _mapc(max, c1, c2)
    w(v1, v2) = round(UInt8, v1 * (1 - opacity) + v2 * opacity)
    _mapc(w, c1, mixed)
end

c = RGB8(255, 0, 0)
cs = (RGB8(255, 0, 0), RGB8(0, 255, 0))
f(c1, c2, opacity) = lighten.(c1, c2, opacity)

using InteractiveUtils
versioninfo()

@code_llvm   f(c, cs, 0.8)
@code_native f(c, cs, 0.8) # error

Output of v1.6.2 on Windows (i686-w64-mingw32):
Julia Version 1.6.2
Commit 1b93d53fc4 (2021-07-14 15:36 UTC)
Platform Info:
  OS: Windows (i686-w64-mingw32)
  CPU: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
  WORD_SIZE: 32
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, tigerlake)
Environment:
  JULIA_LOAD_PATH = @;C:\Users\username\AppData\Local\Temp\jl_iMZPOq
;  @ C:\Users\username\.julia\dev\RanOutOfRegisters\test\runtests.jl:18 within `f'
; Function Attrs: alignstack(16)
define void @julia_f_224([2 x [3 x i8]]* noalias nocapture sret %0, [3 x i8]* nocapture nonnull readonly align 1 dereferenceable(3) %1, [2 x [3 x i8]]* nocapture nonnull readonly align 1 dereferenceable(6) %2, double %3) #0 {
top:
  %4 = alloca [3 x i8], align 1
  %5 = alloca [3 x i8], align 1
  %6 = alloca [1 x double], align 8
  %7 = alloca [3 x i8], align 1
  %8 = alloca [3 x i8], align 1
  %9 = alloca [1 x double], align 8
  %lcnt = load volatile i64, i64* inttoptr (i32 390202632 to i64*), align 8
  %10 = add i64 %lcnt, 1
  store volatile i64 %10, i64* inttoptr (i32 390202632 to i64*), align 8
; ┌ @ broadcast.jl:1312 within `broadcasted'
; │┌ @ C:\Users\username\.julia\dev\RanOutOfRegisters\test\runtests.jl:6 within `broadcastable'
    %lcnt1 = load volatile i64, i64* inttoptr (i32 390202536 to i64*), align 8
    %11 = add i64 %lcnt1, 1
    store volatile i64 %11, i64* inttoptr (i32 390202536 to i64*), align 8
; ││┌ @ refpointer.jl:134 within `Ref'
; │││┌ @ refvalue.jl:10 within `RefValue' @ refvalue.jl:8
      %.elt = getelementptr inbounds [3 x i8], [3 x i8]* %1, i32 0, i32 0
      %.unpack = load i8, i8* %.elt, align 1
      %.elt35 = getelementptr inbounds [3 x i8], [3 x i8]* %1, i32 0, i32 1
      %.unpack36 = load i8, i8* %.elt35, align 1
      %.elt37 = getelementptr inbounds [3 x i8], [3 x i8]* %1, i32 0, i32 2
      %.unpack38 = load i8, i8* %.elt37, align 1
; └└└└
; ┌ @ broadcast.jl:883 within `materialize'
; │┌ @ broadcast.jl:1098 within `copy'
; ││┌ @ ntuple.jl:49 within `ntuple'
; │││┌ @ broadcast.jl:1098 within `#19'
; ││││┌ @ broadcast.jl:620 within `_broadcast_getindex'
; │││││┌ @ broadcast.jl:644 within `_getindex'
; ││││││┌ @ broadcast.jl:595 within `_broadcast_getindex'
; │││││││┌ @ refvalue.jl:56 within `getindex'
; ││││││││┌ @ Base.jl:33 within `getproperty'
           %.fca.0.gep30 = getelementptr inbounds [3 x i8], [3 x i8]* %4, i32 0, i32 0
           store i8 %.unpack, i8* %.fca.0.gep30, align 1
           %.fca.1.gep32 = getelementptr inbounds [3 x i8], [3 x i8]* %4, i32 0, i32 1
           store i8 %.unpack36, i8* %.fca.1.gep32, align 1
           %.fca.2.gep34 = getelementptr inbounds [3 x i8], [3 x i8]* %4, i32 0, i32 2
           store i8 %.unpack38, i8* %.fca.2.gep34, align 1
; ││││└└└└└
; ││││┌ @ broadcast.jl:621 within `_broadcast_getindex'
; │││││┌ @ broadcast.jl:648 within `_broadcast_getindex_evalf'
; ││││││┌ @ C:\Users\username\.julia\dev\RanOutOfRegisters\test\runtests.jl:11 within `lighten'
         %lcnt2 = load volatile i64, i64* inttoptr (i32 390202576 to i64*), align 16
         %12 = add i64 %lcnt2, 1
         store volatile i64 %12, i64* inttoptr (i32 390202576 to i64*), align 16
; │││││││┌ @ C:\Users\username\.julia\dev\RanOutOfRegisters\test\runtests.jl:8 within `_mapc'
          %lcnt3 = load volatile i64, i64* inttoptr (i32 390202552 to i64*), align 8
          %13 = add i64 %lcnt3, 1
          store volatile i64 %13, i64* inttoptr (i32 390202552 to i64*), align 8
; ││││││││┌ @ Base.jl:33 within `getproperty'
           %14 = getelementptr inbounds [2 x [3 x i8]], [2 x [3 x i8]]* %2, i32 0, i32 0, i32 0
; ││││││││└
; ││││││││┌ @ promotion.jl:421 within `max'
; │││││││││┌ @ int.jl:441 within `<'
            %15 = load i8, i8* %14, align 1
            %.not = icmp ult i8 %15, %.unpack
; │││││││││└
           %16 = select i1 %.not, i8 %.unpack, i8 %15
; ││││││││└
; ││││││││┌ @ Base.jl:33 within `getproperty'
           %17 = getelementptr inbounds [2 x [3 x i8]], [2 x [3 x i8]]* %2, i32 0, i32 0, i32 1
; ││││││││└
; ││││││││┌ @ promotion.jl:421 within `max'
; │││││││││┌ @ int.jl:441 within `<'
            %18 = load i8, i8* %17, align 1
            %.not50 = icmp ult i8 %18, %.unpack36
; │││││││││└
           %19 = select i1 %.not50, i8 %.unpack36, i8 %18
; ││││││││└
; ││││││││┌ @ Base.jl:33 within `getproperty'
           %20 = getelementptr inbounds [2 x [3 x i8]], [2 x [3 x i8]]* %2, i32 0, i32 0, i32 2
; ││││││││└
; ││││││││┌ @ promotion.jl:421 within `max'
; │││││││││┌ @ int.jl:441 within `<'
            %21 = load i8, i8* %20, align 1
            %.not51 = icmp ult i8 %21, %.unpack38
; │││││││││└
           %22 = select i1 %.not51, i8 %.unpack38, i8 %21
; ││││││││└
; ││││││││┌ @ C:\Users\username\.julia\dev\RanOutOfRegisters\test\runtests.jl:2 within `RGB8'
           %lcnt4 = load volatile i64, i64* inttoptr (i32 390202504 to i64*), align 8
           %23 = add i64 %lcnt4, 1
           store volatile i64 %23, i64* inttoptr (i32 390202504 to i64*), align 8
           %24 = getelementptr inbounds [3 x i8], [3 x i8]* %5, i32 0, i32 0
           store i8 %16, i8* %24, align 1
           %25 = getelementptr inbounds [3 x i8], [3 x i8]* %5, i32 0, i32 1
           store i8 %19, i8* %25, align 1
           %26 = getelementptr inbounds [3 x i8], [3 x i8]* %5, i32 0, i32 2
           store i8 %22, i8* %26, align 1
; ││││││└└└
; ││││││┌ @ C:\Users\username\.julia\dev\RanOutOfRegisters\test\runtests.jl:12 within `lighten'
         %lcnt5 = load volatile i64, i64* inttoptr (i32 390202584 to i64*), align 8
         %27 = add i64 %lcnt5, 1
         store volatile i64 %27, i64* inttoptr (i32 390202584 to i64*), align 8
         %28 = getelementptr inbounds [1 x double], [1 x double]* %6, i32 0, i32 0
         store double %3, double* %28, align 8
; ││││││└
; ││││││┌ @ C:\Users\username\.julia\dev\RanOutOfRegisters\test\runtests.jl:13 within `lighten'
         %lcnt6 = load volatile i64, i64* inttoptr (i32 390202592 to i64*), align 32
         %29 = add i64 %lcnt6, 1
         store volatile i64 %29, i64* inttoptr (i32 390202592 to i64*), align 32
         %30 = call [3 x i8] @j__mapc_226([1 x double]* nocapture readonly %6, [3 x i8]* nocapture readonly %4, [3 x i8]* nocapture readonly %5) #0
         %.fca.0.extract20 = extractvalue [3 x i8] %30, 0
         %.fca.1.extract22 = extractvalue [3 x i8] %30, 1
         %.fca.2.extract24 = extractvalue [3 x i8] %30, 2
; ││││└└└
; ││││┌ @ broadcast.jl:620 within `_broadcast_getindex'
; │││││┌ @ broadcast.jl:644 within `_getindex'
; ││││││┌ @ broadcast.jl:595 within `_broadcast_getindex'
; │││││││┌ @ refvalue.jl:56 within `getindex'
; ││││││││┌ @ Base.jl:33 within `getproperty'
           %.fca.0.gep = getelementptr inbounds [3 x i8], [3 x i8]* %7, i32 0, i32 0
           store i8 %.unpack, i8* %.fca.0.gep, align 1
           %.fca.1.gep = getelementptr inbounds [3 x i8], [3 x i8]* %7, i32 0, i32 1
           store i8 %.unpack36, i8* %.fca.1.gep, align 1
           %.fca.2.gep = getelementptr inbounds [3 x i8], [3 x i8]* %7, i32 0, i32 2
           store i8 %.unpack38, i8* %.fca.2.gep, align 1
; ││││└└└└└
; ││││┌ @ broadcast.jl:621 within `_broadcast_getindex'
; │││││┌ @ broadcast.jl:648 within `_broadcast_getindex_evalf'
; ││││││┌ @ C:\Users\username\.julia\dev\RanOutOfRegisters\test\runtests.jl:11 within `lighten'
         %lcnt7 = load volatile i64, i64* inttoptr (i32 390202576 to i64*), align 16
         %31 = add i64 %lcnt7, 1
         store volatile i64 %31, i64* inttoptr (i32 390202576 to i64*), align 16
; │││││││┌ @ C:\Users\username\.julia\dev\RanOutOfRegisters\test\runtests.jl:8 within `_mapc'
          %lcnt8 = load volatile i64, i64* inttoptr (i32 390202552 to i64*), align 8
          %32 = add i64 %lcnt8, 1
          store volatile i64 %32, i64* inttoptr (i32 390202552 to i64*), align 8
; ││││││││┌ @ Base.jl:33 within `getproperty'
           %33 = getelementptr inbounds [2 x [3 x i8]], [2 x [3 x i8]]* %2, i32 0, i32 1, i32 0
; ││││││││└
; ││││││││┌ @ promotion.jl:421 within `max'
; │││││││││┌ @ int.jl:441 within `<'
            %34 = load i8, i8* %33, align 1
            %.not58 = icmp ult i8 %34, %.unpack
; │││││││││└
           %35 = select i1 %.not58, i8 %.unpack, i8 %34
; ││││││││└
; ││││││││┌ @ Base.jl:33 within `getproperty'
           %36 = getelementptr inbounds [2 x [3 x i8]], [2 x [3 x i8]]* %2, i32 0, i32 1, i32 1
; ││││││││└
; ││││││││┌ @ promotion.jl:421 within `max'
; │││││││││┌ @ int.jl:441 within `<'
            %37 = load i8, i8* %36, align 1
            %.not59 = icmp ult i8 %37, %.unpack36
; │││││││││└
           %38 = select i1 %.not59, i8 %.unpack36, i8 %37
; ││││││││└
; ││││││││┌ @ Base.jl:33 within `getproperty'
           %39 = getelementptr inbounds [2 x [3 x i8]], [2 x [3 x i8]]* %2, i32 0, i32 1, i32 2
; ││││││││└
; ││││││││┌ @ promotion.jl:421 within `max'
; │││││││││┌ @ int.jl:441 within `<'
            %40 = load i8, i8* %39, align 1
            %.not60 = icmp ult i8 %40, %.unpack38
; │││││││││└
           %41 = select i1 %.not60, i8 %.unpack38, i8 %40
; ││││││││└
; ││││││││┌ @ C:\Users\username\.julia\dev\RanOutOfRegisters\test\runtests.jl:2 within `RGB8'
           %lcnt9 = load volatile i64, i64* inttoptr (i32 390202504 to i64*), align 8
           %42 = add i64 %lcnt9, 1
           store volatile i64 %42, i64* inttoptr (i32 390202504 to i64*), align 8
           %43 = getelementptr inbounds [3 x i8], [3 x i8]* %8, i32 0, i32 0
           store i8 %35, i8* %43, align 1
           %44 = getelementptr inbounds [3 x i8], [3 x i8]* %8, i32 0, i32 1
           store i8 %38, i8* %44, align 1
           %45 = getelementptr inbounds [3 x i8], [3 x i8]* %8, i32 0, i32 2
           store i8 %41, i8* %45, align 1
; ││││││└└└
; ││││││┌ @ C:\Users\username\.julia\dev\RanOutOfRegisters\test\runtests.jl:12 within `lighten'
         %lcnt10 = load volatile i64, i64* inttoptr (i32 390202584 to i64*), align 8
         %46 = add i64 %lcnt10, 1
         store volatile i64 %46, i64* inttoptr (i32 390202584 to i64*), align 8
         %47 = getelementptr inbounds [1 x double], [1 x double]* %9, i32 0, i32 0
         store double %3, double* %47, align 8
; ││││││└
; ││││││┌ @ C:\Users\username\.julia\dev\RanOutOfRegisters\test\runtests.jl:13 within `lighten'
         %lcnt11 = load volatile i64, i64* inttoptr (i32 390202592 to i64*), align 32
         %48 = add i64 %lcnt11, 1
         store volatile i64 %48, i64* inttoptr (i32 390202592 to i64*), align 32
         %49 = call [3 x i8] @j__mapc_227([1 x double]* nocapture readonly %9, [3 x i8]* nocapture readonly %7, [3 x i8]* nocapture readonly %8) #0
         %.fca.0.extract = extractvalue [3 x i8] %49, 0
         %.fca.1.extract = extractvalue [3 x i8] %49, 1
         %.fca.2.extract = extractvalue [3 x i8] %49, 2
; └└└└└└└
  %.sroa.012.sroa.0.0..sroa.012.0..sroa_idx.sroa_idx = getelementptr inbounds [2 x [3 x i8]], [2 x [3 x i8]]* %0, i32 0, i32 0, i32 0
  store i8 %.fca.0.extract20, i8* %.sroa.012.sroa.0.0..sroa.012.0..sroa_idx.sroa_idx, align 1
  %.sroa.012.sroa.2.0..sroa.012.0..sroa_idx.sroa_idx = getelementptr inbounds [2 x [3 x i8]], [2 x [3 x i8]]* %0, i32 0, i32 0, i32 1
  store i8 %.fca.1.extract22, i8* %.sroa.012.sroa.2.0..sroa.012.0..sroa_idx.sroa_idx, align 1
  %.sroa.012.sroa.3.0..sroa.012.0..sroa_idx.sroa_idx = getelementptr inbounds [2 x [3 x i8]], [2 x [3 x i8]]* %0, i32 0, i32 0, i32 2
  store i8 %.fca.2.extract24, i8* %.sroa.012.sroa.3.0..sroa.012.0..sroa_idx.sroa_idx, align 1
  %.sroa.215.0..sroa_idx = getelementptr inbounds [2 x [3 x i8]], [2 x [3 x i8]]* %0, i32 0, i32 1, i32 0
  store i8 %.fca.0.extract, i8* %.sroa.215.0..sroa_idx, align 1
  %.sroa.316.0..sroa_idx = getelementptr inbounds [2 x [3 x i8]], [2 x [3 x i8]]* %0, i32 0, i32 1, i32 1
  store i8 %.fca.1.extract, i8* %.sroa.316.0..sroa_idx, align 1
  %.sroa.4.0..sroa_idx = getelementptr inbounds [2 x [3 x i8]], [2 x [3 x i8]]* %0, i32 0, i32 1, i32 2
  store i8 %.fca.2.extract, i8* %.sroa.4.0..sroa_idx, align 1
  ret void
}
error: ran out of registers during register allocation
ERROR: Package RanOutOfRegisters errored during testing

This problem occurs in the following versions (on both Linux and Windows):

  • v1.5.4 (LLVM 9.0)
  • v1.6.2 (LLVM 11.0)
  • v1.7.0-beta4 (LLVM 12.0)
  • v1.8.0-DEV.498 (LLVM 12.0)

On the other hand, it does not seem to occur on x86-64 or 32-/64-bit ARM.
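
The conditions reported so far (32-bit x86 plus code coverage) can be checked from within a test script. The following is only a sketch: the helper names are hypothetical, and Base.JLOptions() is an internal, undocumented interface that may change between Julia versions.

```julia
# Hypothetical helpers, not part of Julia or of any package.

# True on 32-bit x86 builds of Julia (Sys.ARCH is :i686 there; the other
# symbols are listed defensively).
is_x86_32() = Sys.ARCH in (:i386, :i486, :i586, :i686)

# True when Julia was started with code coverage (e.g. via `pkg> test --coverage`).
# Assumption: relies on the internal `code_coverage` field of Base.JLOptions().
coverage_enabled() = Base.JLOptions().code_coverage != 0

# The combination under which the error has been observed in this issue.
matches_reported_config() = is_x86_32() && coverage_enabled()

println("matches reported configuration: ", matches_reported_config())
```

This only detects the configuration; it says nothing about whether a particular function will actually trigger the error.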

If you have difficulty preparing a 32-bit x86 environment, you can also use GitHub Actions to reproduce the error.
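
For a local run, the pkg> test --coverage step can also be scripted. A minimal sketch, assuming the MWE is saved as test/runtests.jl of a package named RanOutOfRegisters (the name that appears in the paths of the log above) and that this package is available in the active environment; the wrapper function name is hypothetical:

```julia
using Pkg

# Hypothetical wrapper: run the package tests with coverage instrumentation,
# the scripted counterpart of `pkg> test --coverage`.
# Assumption: `pkg` names a package available in the active environment.
run_with_coverage(pkg::AbstractString = "RanOutOfRegisters") = Pkg.test(pkg; coverage = true)

# Uncomment to run the tests with coverage enabled.
# run_with_coverage()
```

Running the tests without coverage (coverage = false) gives a quick way to compare against the uninstrumented code.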

@kimikage
Contributor Author

The tail of the IR dump with JULIA_LLVM_ARGS=-print-after-all was as follows:

# *** IR Dump After Live Register Matrix ***:
# Machine code for function julia_f_5: NoPHIs, TracksLiveness, TiedOpsRewritten
Frame Objects:
  fi#-4: size=8, align=1, fixed, at location [SP+16]
  fi#-3: size=4, align=1, fixed, at location [SP+12]
  fi#-2: size=4, align=1, fixed, at location [SP+8]
  fi#-1: size=4, align=1, fixed, at location [SP+4]
  fi#0: size=3, align=1, at location [SP+4]
  fi#1: size=3, align=1, at location [SP+4]
  fi#2: size=8, align=8, at location [SP+4]
  fi#3: size=3, align=1, at location [SP+4]
  fi#4: size=3, align=1, at location [SP+4]
  fi#5: size=8, align=8, at location [SP+4]

0B	bb.0.top:
32B	  %1:fr64x = VMOVSDZrm_alt %fixed-stack.0, 1, $noreg, 0, $noreg :: (load 8 from %fixed-stack.0)
80B	  %5:gr32 = MOV32rm $noreg, 1, $noreg, 381541016, $noreg :: (volatile load 4 from `i64* inttoptr (i32 381541016 to i64*)`, align 8)
112B	  %5:gr32 = ADD32ri8 %5:gr32(tied-def 0), 1, implicit-def $eflags
128B	  ADC32mi8 $noreg, 1, $noreg, 381541020, $noreg, 0, implicit-def dead $eflags, implicit killed $eflags :: (volatile store 4 into `i64* inttoptr (i32 381541016 to i64*)` + 4, align 8), (volatile load 4 from `i64* inttoptr (i32 381541016 to i64*)` + 4, align 8)
144B	  MOV32mr $noreg, 1, $noreg, 381541016, $noreg, %5:gr32 :: (volatile store 4 into `i64* inttoptr (i32 381541016 to i64*)`, align 8)
160B	  %7:gr32 = MOV32rm $noreg, 1, $noreg, 381540920, $noreg, debug-location !7 :: (volatile load 4 from `i64* inttoptr (i32 381540920 to i64*)`, align 8); runtests.jl:6 @[ broadcast.jl:1312 @[ runtests.jl:18 ] ]
192B	  %7:gr32 = ADD32ri8 %7:gr32(tied-def 0), 1, implicit-def $eflags, debug-location !7; runtests.jl:6 @[ broadcast.jl:1312 @[ runtests.jl:18 ] ]
208B	  ADC32mi8 $noreg, 1, $noreg, 381540924, $noreg, 0, implicit-def dead $eflags, implicit killed $eflags, debug-location !7 :: (volatile store 4 into `i64* inttoptr (i32 381540920 to i64*)` + 4, align 8), (volatile load 4 from `i64* inttoptr (i32 381540920 to i64*)` + 4, align 8); runtests.jl:6 @[ broadcast.jl:1312 @[ runtests.jl:18 ] ]
216B	  %3:gr32 = MOV32rm %fixed-stack.2, 1, $noreg, 0, $noreg :: (load 4 from %fixed-stack.2)
224B	  MOV32mr $noreg, 1, $noreg, 381540920, $noreg, %7:gr32, debug-location !7 :: (volatile store 4 into `i64* inttoptr (i32 381540920 to i64*)`, align 8); runtests.jl:6 @[ broadcast.jl:1312 @[ runtests.jl:18 ] ]
240B	  %9:gr32_abcd = MOVZX32rm8 %3:gr32, 1, $noreg, 0, $noreg, debug-location !53 :: (dereferenceable load 1 from %ir..elt61, !tbaa !20, addrspace 11); promotion.jl:421 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
288B	  %12:gr32_abcd = MOVZX32rm8 %3:gr32, 1, $noreg, 1, $noreg, debug-location !53 :: (dereferenceable load 1 from %ir..elt35, !tbaa !20, addrspace 11); promotion.jl:421 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
336B	  %15:gr32_abcd = MOVZX32rm8 %3:gr32, 1, $noreg, 2, $noreg, debug-location !53 :: (dereferenceable load 1 from %ir..elt37, !tbaa !20, addrspace 11); promotion.jl:421 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
384B	  MOV8mr %stack.0, 1, $noreg, 0, $noreg, %9.sub_8bit:gr32_abcd, debug-location !23 :: (store 1 into %ir..fca.0.gep3062); Base.jl:33 @[ refvalue.jl:56 @[ broadcast.jl:595 @[ broadcast.jl:644 @[ broadcast.jl:620 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
400B	  MOV8mr %stack.0, 1, $noreg, 1, $noreg, %12.sub_8bit:gr32_abcd, debug-location !23 :: (store 1 into %ir..fca.1.gep32); Base.jl:33 @[ refvalue.jl:56 @[ broadcast.jl:595 @[ broadcast.jl:644 @[ broadcast.jl:620 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
416B	  MOV8mr %stack.0, 1, $noreg, 2, $noreg, %15.sub_8bit:gr32_abcd, debug-location !23 :: (store 1 into %ir..fca.2.gep34); Base.jl:33 @[ refvalue.jl:56 @[ broadcast.jl:595 @[ broadcast.jl:644 @[ broadcast.jl:620 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
432B	  %18:gr32 = MOV32rm $noreg, 1, $noreg, 381540960, $noreg, debug-location !42 :: (volatile load 4 from `i64* inttoptr (i32 381540960 to i64*)`, align 32); runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
464B	  %18:gr32 = ADD32ri8 %18:gr32(tied-def 0), 1, implicit-def $eflags, debug-location !42; runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
480B	  ADC32mi8 $noreg, 1, $noreg, 381540964, $noreg, 0, implicit-def dead $eflags, implicit killed $eflags, debug-location !42 :: (volatile store 4 into `i64* inttoptr (i32 381540960 to i64*)` + 4, align 32), (volatile load 4 from `i64* inttoptr (i32 381540960 to i64*)` + 4, align 32); runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
488B	  %2:gr32 = MOV32rm %fixed-stack.1, 1, $noreg, 0, $noreg :: (load 4 from %fixed-stack.1)
496B	  MOV32mr $noreg, 1, $noreg, 381540960, $noreg, %18:gr32, debug-location !42 :: (volatile store 4 into `i64* inttoptr (i32 381540960 to i64*)`, align 32); runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
512B	  %20:gr32 = MOV32rm $noreg, 1, $noreg, 381540936, $noreg, debug-location !47 :: (volatile load 4 from `i64* inttoptr (i32 381540936 to i64*)`, align 8); runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ]
544B	  %20:gr32 = ADD32ri8 %20:gr32(tied-def 0), 1, implicit-def $eflags, debug-location !47; runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ]
560B	  ADC32mi8 $noreg, 1, $noreg, 381540940, $noreg, 0, implicit-def dead $eflags, implicit killed $eflags, debug-location !47 :: (volatile store 4 into `i64* inttoptr (i32 381540936 to i64*)` + 4, align 8), (volatile load 4 from `i64* inttoptr (i32 381540936 to i64*)` + 4, align 8); runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ]
576B	  MOV32mr $noreg, 1, $noreg, 381540936, $noreg, %20:gr32, debug-location !47 :: (volatile store 4 into `i64* inttoptr (i32 381540936 to i64*)`, align 8); runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ]
592B	  %25:gr32_abcd = MOVZX32rm8 %2:gr32, 1, $noreg, 0, $noreg, debug-location !53 :: (dereferenceable load 1 from %ir.14, !tbaa !56, addrspace 11); promotion.jl:421 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
640B	  CMP8rr %25.sub_8bit:gr32_abcd, %9.sub_8bit:gr32_abcd, implicit-def $eflags, debug-location !50; int.jl:441 @[ promotion.jl:421 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ] ]
672B	  %25:gr32_abcd = CMOV32rr %25:gr32_abcd(tied-def 0), %9:gr32_abcd, 2, implicit killed $eflags, debug-location !53; promotion.jl:421 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
720B	  %32:gr32_abcd = MOVZX32rm8 %2:gr32, 1, $noreg, 1, $noreg, debug-location !53 :: (dereferenceable load 1 from %ir.17, !tbaa !56, addrspace 11); promotion.jl:421 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
768B	  CMP8rr %32.sub_8bit:gr32_abcd, %12.sub_8bit:gr32_abcd, implicit-def $eflags, debug-location !50; int.jl:441 @[ promotion.jl:421 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ] ]
800B	  %32:gr32_abcd = CMOV32rr %32:gr32_abcd(tied-def 0), %12:gr32_abcd, 2, implicit killed $eflags, debug-location !53; promotion.jl:421 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
848B	  %39:gr32_abcd = MOVZX32rm8 %2:gr32, 1, $noreg, 2, $noreg, debug-location !53 :: (dereferenceable load 1 from %ir.20, !tbaa !56, addrspace 11); promotion.jl:421 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
896B	  CMP8rr %39.sub_8bit:gr32_abcd, %15.sub_8bit:gr32_abcd, implicit-def $eflags, debug-location !50; int.jl:441 @[ promotion.jl:421 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ] ]
928B	  %39:gr32_abcd = CMOV32rr %39:gr32_abcd(tied-def 0), %15:gr32_abcd, 2, implicit killed $eflags, debug-location !53; promotion.jl:421 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
976B	  %43:gr32 = MOV32rm $noreg, 1, $noreg, 381540888, $noreg, debug-location !58 :: (volatile load 4 from `i64* inttoptr (i32 381540888 to i64*)`, align 8); runtests.jl:2 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
1008B	  %43:gr32 = ADD32ri8 %43:gr32(tied-def 0), 1, implicit-def $eflags, debug-location !58; runtests.jl:2 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
1024B	  ADC32mi8 $noreg, 1, $noreg, 381540892, $noreg, 0, implicit-def dead $eflags, implicit killed $eflags, debug-location !58 :: (volatile store 4 into `i64* inttoptr (i32 381540888 to i64*)` + 4, align 8), (volatile load 4 from `i64* inttoptr (i32 381540888 to i64*)` + 4, align 8); runtests.jl:2 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
1040B	  MOV32mr $noreg, 1, $noreg, 381540888, $noreg, %43:gr32, debug-location !58 :: (volatile store 4 into `i64* inttoptr (i32 381540888 to i64*)`, align 8); runtests.jl:2 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
1056B	  MOV8mr %stack.1, 1, $noreg, 0, $noreg, %25.sub_8bit:gr32_abcd, debug-location !58 :: (store 1 into %ir.24, !tbaa !60); runtests.jl:2 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
1072B	  MOV8mr %stack.1, 1, $noreg, 1, $noreg, %32.sub_8bit:gr32_abcd, debug-location !58 :: (store 1 into %ir.25, !tbaa !60); runtests.jl:2 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
1088B	  MOV8mr %stack.1, 1, $noreg, 2, $noreg, %39.sub_8bit:gr32_abcd, debug-location !58 :: (store 1 into %ir.26, !tbaa !60); runtests.jl:2 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
1104B	  %45:gr32 = MOV32rm $noreg, 1, $noreg, 381540968, $noreg, debug-location !62 :: (volatile load 4 from `i64* inttoptr (i32 381540968 to i64*)`, align 8); runtests.jl:12 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
1136B	  %45:gr32 = ADD32ri8 %45:gr32(tied-def 0), 1, implicit-def $eflags, debug-location !62; runtests.jl:12 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
1152B	  ADC32mi8 $noreg, 1, $noreg, 381540972, $noreg, 0, implicit-def dead $eflags, implicit killed $eflags, debug-location !62 :: (volatile store 4 into `i64* inttoptr (i32 381540968 to i64*)` + 4, align 8), (volatile load 4 from `i64* inttoptr (i32 381540968 to i64*)` + 4, align 8); runtests.jl:12 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
1168B	  MOV32mr $noreg, 1, $noreg, 381540968, $noreg, %45:gr32, debug-location !62 :: (volatile store 4 into `i64* inttoptr (i32 381540968 to i64*)`, align 8); runtests.jl:12 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
1184B	  VMOVSDZmr %stack.2, 1, $noreg, 0, $noreg, %1:fr64x, debug-location !62 :: (store 8 into %ir.28, !tbaa !60); runtests.jl:12 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
1200B	  %47:gr32 = MOV32rm $noreg, 1, $noreg, 381540976, $noreg, debug-location !63 :: (volatile load 4 from `i64* inttoptr (i32 381540976 to i64*)`, align 16); runtests.jl:13 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
1232B	  %47:gr32 = ADD32ri8 %47:gr32(tied-def 0), 1, implicit-def $eflags, debug-location !63; runtests.jl:13 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
1248B	  ADC32mi8 $noreg, 1, $noreg, 381540980, $noreg, 0, implicit-def dead $eflags, implicit killed $eflags, debug-location !63 :: (volatile store 4 into `i64* inttoptr (i32 381540976 to i64*)` + 4, align 16), (volatile load 4 from `i64* inttoptr (i32 381540976 to i64*)` + 4, align 16); runtests.jl:13 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
1264B	  MOV32mr $noreg, 1, $noreg, 381540976, $noreg, %47:gr32, debug-location !63 :: (volatile store 4 into `i64* inttoptr (i32 381540976 to i64*)`, align 16); runtests.jl:13 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
1272B	  %0:gr32 = MOV32rm %fixed-stack.3, 1, $noreg, 0, $noreg :: (load 4 from %fixed-stack.3)
1280B	  ADJCALLSTACKDOWN32 12, 0, 12, implicit-def dead $esp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $esp, implicit $ssp, debug-location !63; runtests.jl:13 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
1296B	  %49:gr32 = LEA32r %stack.1, 1, $noreg, 0, $noreg
1312B	  %50:gr32 = LEA32r %stack.0, 1, $noreg, 0, $noreg
1328B	  %51:gr32 = LEA32r %stack.2, 1, $noreg, 0, $noreg
1344B	  PUSH32r %49:gr32, implicit-def $esp, implicit $esp, debug-location !63 :: (store 4 into stack + 8); runtests.jl:13 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
1360B	  PUSH32r %50:gr32, implicit-def $esp, implicit $esp, debug-location !63 :: (store 4 into stack + 4); runtests.jl:13 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
1376B	  PUSH32r %51:gr32, implicit-def $esp, implicit $esp, debug-location !63 :: (store 4 into stack); runtests.jl:13 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
1392B	  CALLpcrel32 @julia__mapc_9, <regmask $bh $bl $bp $bph $bpl $bx $di $dih $dil $ebp $ebx $edi $esi $hbp $hbx $hdi $hsi $si $sih $sil>, implicit $esp, implicit $ssp, implicit-def $esp, implicit-def $ssp, implicit-def $al, implicit-def $dl, implicit-def $cl, debug-location !63; runtests.jl:13 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
1408B	  ADJCALLSTACKUP32 12, 0, implicit-def dead $esp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $esp, implicit $ssp, debug-location !63; runtests.jl:13 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
1472B	  MOV8mr %stack.3, 1, $noreg, 0, $noreg, %9.sub_8bit:gr32_abcd, debug-location !23 :: (store 1 into %ir..fca.0.gep63); Base.jl:33 @[ refvalue.jl:56 @[ broadcast.jl:595 @[ broadcast.jl:644 @[ broadcast.jl:620 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
1488B	  MOV8mr %stack.3, 1, $noreg, 1, $noreg, %12.sub_8bit:gr32_abcd, debug-location !23 :: (store 1 into %ir..fca.1.gep); Base.jl:33 @[ refvalue.jl:56 @[ broadcast.jl:595 @[ broadcast.jl:644 @[ broadcast.jl:620 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
1504B	  MOV8mr %stack.3, 1, $noreg, 2, $noreg, %15.sub_8bit:gr32_abcd, debug-location !23 :: (store 1 into %ir..fca.2.gep); Base.jl:33 @[ refvalue.jl:56 @[ broadcast.jl:595 @[ broadcast.jl:644 @[ broadcast.jl:620 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
1520B	  %56:gr32 = MOV32rm $noreg, 1, $noreg, 381540960, $noreg, debug-location !42 :: (volatile load 4 from `i64* inttoptr (i32 381540960 to i64*)`, align 32); runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
1552B	  %56:gr32 = ADD32ri8 %56:gr32(tied-def 0), 1, implicit-def $eflags, debug-location !42; runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
1568B	  ADC32mi8 $noreg, 1, $noreg, 381540964, $noreg, 0, implicit-def dead $eflags, implicit killed $eflags, debug-location !42 :: (volatile store 4 into `i64* inttoptr (i32 381540960 to i64*)` + 4, align 32), (volatile load 4 from `i64* inttoptr (i32 381540960 to i64*)` + 4, align 32); runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
1584B	  MOV32mr $noreg, 1, $noreg, 381540960, $noreg, %56:gr32, debug-location !42 :: (volatile store 4 into `i64* inttoptr (i32 381540960 to i64*)`, align 32); runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
1600B	  %58:gr32 = MOV32rm $noreg, 1, $noreg, 381540936, $noreg, debug-location !47 :: (volatile load 4 from `i64* inttoptr (i32 381540936 to i64*)`, align 8); runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ]
1632B	  %58:gr32 = ADD32ri8 %58:gr32(tied-def 0), 1, implicit-def $eflags, debug-location !47; runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ]
1648B	  ADC32mi8 $noreg, 1, $noreg, 381540940, $noreg, 0, implicit-def dead $eflags, implicit killed $eflags, debug-location !47 :: (volatile store 4 into `i64* inttoptr (i32 381540936 to i64*)` + 4, align 8), (volatile load 4 from `i64* inttoptr (i32 381540936 to i64*)` + 4, align 8); runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ]
1664B	  MOV32mr $noreg, 1, $noreg, 381540936, $noreg, %58:gr32, debug-location !47 :: (volatile store 4 into `i64* inttoptr (i32 381540936 to i64*)`, align 8); runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ]
1680B	  %63:gr32_abcd = MOVZX32rm8 %2:gr32, 1, $noreg, 3, $noreg, debug-location !53 :: (dereferenceable load 1 from %ir.36, !tbaa !56, addrspace 11); promotion.jl:421 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
1728B	  CMP8rr %63.sub_8bit:gr32_abcd, %9.sub_8bit:gr32_abcd, implicit-def $eflags, debug-location !50; int.jl:441 @[ promotion.jl:421 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ] ]
1760B	  %63:gr32_abcd = CMOV32rr %63:gr32_abcd(tied-def 0), %9:gr32_abcd, 2, implicit killed $eflags, debug-location !53; promotion.jl:421 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
1808B	  %70:gr32_abcd = MOVZX32rm8 %2:gr32, 1, $noreg, 4, $noreg, debug-location !53 :: (dereferenceable load 1 from %ir.39, !tbaa !56, addrspace 11); promotion.jl:421 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
1856B	  CMP8rr %70.sub_8bit:gr32_abcd, %12.sub_8bit:gr32_abcd, implicit-def $eflags, debug-location !50; int.jl:441 @[ promotion.jl:421 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ] ]
1888B	  %70:gr32_abcd = CMOV32rr %70:gr32_abcd(tied-def 0), %12:gr32_abcd, 2, implicit killed $eflags, debug-location !53; promotion.jl:421 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
1936B	  %77:gr32_abcd = MOVZX32rm8 %2:gr32, 1, $noreg, 5, $noreg, debug-location !53 :: (dereferenceable load 1 from %ir.42, !tbaa !56, addrspace 11); promotion.jl:421 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
1984B	  CMP8rr %77.sub_8bit:gr32_abcd, %15.sub_8bit:gr32_abcd, implicit-def $eflags, debug-location !50; int.jl:441 @[ promotion.jl:421 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ] ]
2016B	  %77:gr32_abcd = CMOV32rr %77:gr32_abcd(tied-def 0), %15:gr32_abcd, 2, implicit killed $eflags, debug-location !53; promotion.jl:421 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
2064B	  %81:gr32 = MOV32rm $noreg, 1, $noreg, 381540888, $noreg, debug-location !58 :: (volatile load 4 from `i64* inttoptr (i32 381540888 to i64*)`, align 8); runtests.jl:2 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
2096B	  %81:gr32 = ADD32ri8 %81:gr32(tied-def 0), 1, implicit-def $eflags, debug-location !58; runtests.jl:2 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
2112B	  ADC32mi8 $noreg, 1, $noreg, 381540892, $noreg, 0, implicit-def dead $eflags, implicit killed $eflags, debug-location !58 :: (volatile store 4 into `i64* inttoptr (i32 381540888 to i64*)` + 4, align 8), (volatile load 4 from `i64* inttoptr (i32 381540888 to i64*)` + 4, align 8); runtests.jl:2 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
2128B	  MOV32mr $noreg, 1, $noreg, 381540888, $noreg, %81:gr32, debug-location !58 :: (volatile store 4 into `i64* inttoptr (i32 381540888 to i64*)`, align 8); runtests.jl:2 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
2144B	  MOV8mr %stack.4, 1, $noreg, 0, $noreg, %63.sub_8bit:gr32_abcd, debug-location !58 :: (store 1 into %ir.46, !tbaa !60); runtests.jl:2 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
2160B	  MOV8mr %stack.4, 1, $noreg, 1, $noreg, %70.sub_8bit:gr32_abcd, debug-location !58 :: (store 1 into %ir.47, !tbaa !60); runtests.jl:2 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
2176B	  MOV8mr %stack.4, 1, $noreg, 2, $noreg, %77.sub_8bit:gr32_abcd, debug-location !58 :: (store 1 into %ir.48, !tbaa !60); runtests.jl:2 @[ runtests.jl:8 @[ runtests.jl:11 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ] ] ]
2192B	  %83:gr32 = MOV32rm $noreg, 1, $noreg, 381540968, $noreg, debug-location !62 :: (volatile load 4 from `i64* inttoptr (i32 381540968 to i64*)`, align 8); runtests.jl:12 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
2224B	  %83:gr32 = ADD32ri8 %83:gr32(tied-def 0), 1, implicit-def $eflags, debug-location !62; runtests.jl:12 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
2240B	  ADC32mi8 $noreg, 1, $noreg, 381540972, $noreg, 0, implicit-def dead $eflags, implicit killed $eflags, debug-location !62 :: (volatile store 4 into `i64* inttoptr (i32 381540968 to i64*)` + 4, align 8), (volatile load 4 from `i64* inttoptr (i32 381540968 to i64*)` + 4, align 8); runtests.jl:12 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
2256B	  MOV32mr $noreg, 1, $noreg, 381540968, $noreg, %83:gr32, debug-location !62 :: (volatile store 4 into `i64* inttoptr (i32 381540968 to i64*)`, align 8); runtests.jl:12 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
2272B	  VMOVSDZmr %stack.5, 1, $noreg, 0, $noreg, %1:fr64x, debug-location !62 :: (store 8 into %ir.50, !tbaa !60); runtests.jl:12 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
2288B	  %85:gr32 = MOV32rm $noreg, 1, $noreg, 381540976, $noreg, debug-location !63 :: (volatile load 4 from `i64* inttoptr (i32 381540976 to i64*)`, align 16); runtests.jl:13 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
2320B	  %85:gr32 = ADD32ri8 %85:gr32(tied-def 0), 1, implicit-def $eflags, debug-location !63; runtests.jl:13 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
2336B	  ADC32mi8 $noreg, 1, $noreg, 381540980, $noreg, 0, implicit-def dead $eflags, implicit killed $eflags, debug-location !63 :: (volatile store 4 into `i64* inttoptr (i32 381540976 to i64*)` + 4, align 16), (volatile load 4 from `i64* inttoptr (i32 381540976 to i64*)` + 4, align 16); runtests.jl:13 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
2344B	  %52:gr8 = COPY $al, debug-location !63; runtests.jl:13 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
2352B	  %53:gr8 = COPY $dl, debug-location !63; runtests.jl:13 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
2360B	  %54:gr8 = COPY $cl, debug-location !63; runtests.jl:13 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
2368B	  MOV32mr $noreg, 1, $noreg, 381540976, $noreg, %85:gr32, debug-location !63 :: (volatile store 4 into `i64* inttoptr (i32 381540976 to i64*)`, align 16); runtests.jl:13 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
2376B	  ADJCALLSTACKDOWN32 12, 0, 12, implicit-def dead $esp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $esp, implicit $ssp, debug-location !63; runtests.jl:13 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
2384B	  %87:gr32 = LEA32r %stack.4, 1, $noreg, 0, $noreg
2400B	  %88:gr32 = LEA32r %stack.3, 1, $noreg, 0, $noreg
2416B	  %89:gr32 = LEA32r %stack.5, 1, $noreg, 0, $noreg
2432B	  PUSH32r %87:gr32, implicit-def $esp, implicit $esp, debug-location !63 :: (store 4 into stack + 8); runtests.jl:13 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
2448B	  PUSH32r %88:gr32, implicit-def $esp, implicit $esp, debug-location !63 :: (store 4 into stack + 4); runtests.jl:13 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
2464B	  PUSH32r %89:gr32, implicit-def $esp, implicit $esp, debug-location !63 :: (store 4 into stack); runtests.jl:13 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
2480B	  CALLpcrel32 @julia__mapc_9, <regmask $bh $bl $bp $bph $bpl $bx $di $dih $dil $ebp $ebx $edi $esi $hbp $hbx $hdi $hsi $si $sih $sil>, implicit $esp, implicit $ssp, implicit-def $esp, implicit-def $ssp, implicit-def $al, implicit-def $dl, implicit-def $cl, debug-location !63; runtests.jl:13 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
2496B	  ADJCALLSTACKUP32 12, 0, implicit-def dead $esp, implicit-def dead $eflags, implicit-def dead $ssp, implicit $esp, implicit $ssp, debug-location !63; runtests.jl:13 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
2512B	  %90:gr8 = COPY $al, debug-location !63; runtests.jl:13 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
2528B	  %91:gr8 = COPY $dl, debug-location !63; runtests.jl:13 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
2544B	  %92:gr8 = COPY $cl, debug-location !63; runtests.jl:13 @[ broadcast.jl:648 @[ broadcast.jl:621 @[ broadcast.jl:1098 @[ ntuple.jl:49 @[ broadcast.jl:1098 @[ broadcast.jl:883 @[ runtests.jl:18 ] ] ] ] ] ] ]
2560B	  MOV8mr %0:gr32, 1, $noreg, 0, $noreg, %52:gr8, debug-location !12 :: (store 1 into %ir..sroa.012.sroa.0.0..sroa.012.0..sroa_idx.sroa_idx64); runtests.jl:18
2576B	  MOV8mr %0:gr32, 1, $noreg, 1, $noreg, %53:gr8, debug-location !12 :: (store 1 into %ir..sroa.012.sroa.2.0..sroa.012.0..sroa_idx.sroa_idx); runtests.jl:18
2592B	  MOV8mr %0:gr32, 1, $noreg, 2, $noreg, %54:gr8, debug-location !12 :: (store 1 into %ir..sroa.012.sroa.3.0..sroa.012.0..sroa_idx.sroa_idx); runtests.jl:18
2608B	  MOV8mr %0:gr32, 1, $noreg, 3, $noreg, %90:gr8, debug-location !12 :: (store 1 into %ir..sroa.215.0..sroa_idx); runtests.jl:18
2624B	  MOV8mr %0:gr32, 1, $noreg, 4, $noreg, %91:gr8, debug-location !12 :: (store 1 into %ir..sroa.316.0..sroa_idx); runtests.jl:18
2640B	  MOV8mr %0:gr32, 1, $noreg, 5, $noreg, %92:gr8, debug-location !12 :: (store 1 into %ir..sroa.4.0..sroa_idx); runtests.jl:18
2656B	  $eax = COPY %0:gr32, debug-location !12; runtests.jl:18
2672B	  RET 0, $eax, debug-location !12; runtests.jl:18

# End machine code for function julia_f_5.

error: ran out of registers during register allocation

Therefore, the error seems to occur at the next step, "Greedy Register Allocator".
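
One possible reading of the dump above, offered as a sketch and not a confirmed diagnosis: the byte results `$al`, `$dl` and `$cl` of the first `julia__mapc_9` call (1392B) are only copied into virtual registers at 2344B to 2360B, while the `CMP8rr` at 1728B, which sits in between, reads two different `gr32_abcd` virtual registers (`%63` and `%9`). On 32-bit x86 the `gr32_abcd` class holds only EAX, EBX, ECX and EDX. The toy count below assumes that the three result registers stay reserved over that whole interval; the variable names are illustrative and are not LLVM data structures.

```julia
# Toy register-pressure count for the interval between the first call (1392B)
# and the copies out of $al/$dl/$cl (2344B-2360B).
# ASSUMPTION: EAX, EDX and ECX stay reserved for that whole interval.

# 32-bit registers in the gr32_abcd class.
gr32_abcd = Set(["EAX", "EBX", "ECX", "EDX"])

# Parent registers of $al, $dl and $cl, which hold the first call's results.
held_by_call_results = Set(["EAX", "EDX", "ECX"])

# CMP8rr %63.sub_8bit, %9.sub_8bit reads two different gr32_abcd values.
needed_at_cmp8rr = 2

free_regs = setdiff(gr32_abcd, held_by_call_results)
shortfall = needed_at_cmp8rr - length(free_regs)

println("free gr32_abcd registers: ", sort!(collect(free_regs)))
println("shortfall at CMP8rr: ", shortfall)
```

Under this assumption only EBX is left, which is one register short. Whether this is what the allocator actually runs into has not been confirmed in this thread.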

@ChenNingCong

ChenNingCong commented Sep 10, 2021

Can you output the result of @code_llvm dump_module=true raw=true f(c, cs, 0.8)? This is more of an LLVM issue, and we only need the LLVM bitcode to reproduce the problem.

Edit: I have used GitHub Actions to reproduce the error with the full LLVM IR; see here: link

Edit 2: I reproduced the error on a nightly build of Julia, to match the newer LLVM version on my computer. Unfortunately, I can't reproduce the error by copying the LLVM IR and compiling it with llc. It successfully produces the machine code...
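
For anyone who wants to save the requested IR to a file, a minimal sketch using the function form `code_llvm` from InteractiveUtils follows. The function `g` and the file name `g.ll` are hypothetical stand-ins; substitute the MWE's `f` and its argument types.

```julia
using InteractiveUtils

# Hypothetical stand-in for the MWE's `f`; replace with the real function.
g(x, y) = x * y + 1

# Write the whole module, with metadata kept (raw=true), to a temporary file.
path = joinpath(mktempdir(), "g.ll")
open(path, "w") do io
    code_llvm(io, g, (Int, Int); raw=true, dump_module=true)
end

# Read the file back to check that something was written.
ir = read(path, String)
println("wrote ", sizeof(ir), " bytes of LLVM IR to ", path)
```

When replaying such a file with llc, the target triple and CPU may need to match what Julia used (llc has `-mtriple` and `-mcpu` options for this). That is an assumption and was not tested in this thread.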

@vtjnash
Member

vtjnash commented May 4, 2024

We have upgraded LLVM 7 times since this issue, so hopefully it is just okay now?

@vtjnash vtjnash closed this as completed May 4, 2024
@kimikage
Contributor Author

kimikage commented May 4, 2024

@vtjnash
Unfortunately, v1.6.7 does not seem to have fixed the problem.
Given that this issue has not been referenced much, this is an edge case and of little importance.
But why not keep it open until the next LTS migration?

error: ran out of registers during register allocation
ERROR: Package RanOutOfRegisters errored during testing

julia> versioninfo()
Julia Version 1.6.7
Commit 3b76b25b64 (2022-07-19 15:11 UTC)
Platform Info:
  OS: Windows (i686-w64-mingw32)
  CPU: 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
  WORD_SIZE: 32
  LIBM: libopenlibm
  LLVM: libLLVM-11.0.1 (ORCJIT, tigerlake)

@vtjnash
Member

vtjnash commented May 4, 2024

The next LTS should be announced imminently. There are unlikely to be more 1.6 releases, as there haven't been any in a while.

@kimikage
Contributor Author

kimikage commented May 4, 2024

I was about to press 👍 but checked just to be sure.

error: ran out of registers during register allocation
ERROR: Package RanOutOfRegisters errored during testing

julia> versioninfo()
Julia Version 1.10.3
Commit 0b4590a550 (2024-04-30 10:59 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Windows (i686-w64-mingw32)
  CPU: 8 × 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
  WORD_SIZE: 32
  LIBM: libopenlibm
  LLVM: libLLVM-15.0.7 (ORCJIT, tigerlake)
Threads: 1 default, 0 interactive, 1 GC (on 8 virtual cores)

error: ran out of registers during register allocation
ERROR: Package RanOutOfRegisters errored during testing

julia> versioninfo()
Julia Version 1.12.0-DEV.462
Commit da6892ffc9 (2024-05-04 02:54 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Windows (i686-w64-mingw32)
  CPU: 8 × 11th Gen Intel(R) Core(TM) i7-1165G7 @ 2.80GHz
  WORD_SIZE: 32
  LLVM: libLLVM-17.0.6 (ORCJIT, tigerlake)
Threads: 1 default, 0 interactive, 1 GC (on 8 virtual cores)

Perhaps this is not a problem on the LLVM side?

@oscardssmith oscardssmith reopened this May 4, 2024
@kimikage
Contributor Author

kimikage commented May 4, 2024

Weirdly, I cannot reproduce the problem with GitHub Actions now. Some context (host CPU?) seems to have changed. 🤔
https://github.com/kimikage/Issue42200.jl/actions/runs/8953530308/job/24592153489

However, as I showed above, it reproduces in my local environment.

@kimikage
Contributor Author

Some context (host CPU?) seems to have changed.

This appears to be a problem that occurs on Intel CPUs.
The GitHub-hosted runners used Intel Xeon CPUs around 2021. (As of 2024, they use AMD EPYC.)
For reference, versioninfo() output from that time: https://discourse.julialang.org/t/os-dependency-of-float16-bigfloat-on-nightly/58414

If a non-Intel architecture is set as the cpu-target on an Intel CPU, no error occurs. E.g.:

% julia +nightly~x86 -C generic

The @code_llvm results are identical.
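
To compare environments when trying to reproduce this, a small sketch that prints the word size, the detected CPU name and the cpu-target of the running session may help. Note that `Base.JLOptions()` is an internal API and may change between Julia versions, so treat that part as an assumption.

```julia
# Report the settings that seem to matter for this issue.
word_size = Sys.WORD_SIZE   # 32 on the affected builds
cpu_name = Sys.CPU_NAME     # e.g. "tigerlake" in the reports above

# Internal API (assumption: the `cpu_target` field exists in this Julia version).
target_ptr = Base.JLOptions().cpu_target
cpu_target = target_ptr == C_NULL ? "(unset)" : unsafe_string(target_ptr)

println("WORD_SIZE:  ", word_size)
println("CPU_NAME:   ", cpu_name)
println("cpu-target: ", cpu_target)
```

Running this once normally and once with `-C generic` shows whether the cpu-target option was picked up by the session.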
