Slow compilation with chained dependent match types #14903

prolativ · 2022-04-11T09:38:04Z

Compiler version

3.1.2

Minimized code

I encountered the problem while trying to migrate a part of akka-http (forked from parboiled2) to Scala 3.

import annotation.unchecked.uncheckedVariance

sealed trait HList
sealed trait HNil extends HList
case object HNil extends HNil
case class ::[+H, +T <: HList](head: H, tail: T) extends HList

type Concat[X <: HList, Y <: HList] <: HList = X match
  case HNil   => Y
  case h :: t => h :: Concat[t, Y]

/**
  * Decompose L into Prefix ++ Suffix if possible
*/
type StripSuffix[L <: HList, Suffix <: HList] <: Option[HList] = L match
  case Suffix => Some[HNil]
  case h :: t => StripSuffix[t, Suffix] match
    case Some[x] => Some[h :: x]
    case _ => None.type
  case _      => None.type

/**
  * type-level implementation of this logic:
  *   Out =
  *     R                      if T has a tail of type L
  *     (L dropRight T) ++ R   if L has a tail of type T
*/
sealed trait TailSwitch[L <: HList, T <: HList, R <: HList]:
  type Out <: HList

object TailSwitch:
  type TS[L <: HList, T <: HList, R <: HList] <: HList =
    StripSuffix[T, L] match
      case Some[_] => R
      case _ => StripSuffix[L, T] match
        case Some[x] => Concat[x, R]

  implicit def tailSwitch[L <: HList, T <: HList, R <: HList]: (TailSwitch[L, T, R] {
    type Out = TS[L, T, R]
  }) = new TailSwitch[L, T, R] { type Out = TS[L, T, R] }

/**
 * Rule popping I from stack and pushing back O
*/
sealed class Rule[-I <: HList, +O <: HList]:
  def ~[I2 <: HList, O2 <: HList](that: Rule[I2, O2])(implicit
      i: TailSwitch[I2, O @uncheckedVariance, I @uncheckedVariance],
      o: TailSwitch[O @uncheckedVariance, I2, O2]
  ): Rule[i.Out, o.Out] = ???

object Test:
  def dot = new Rule[HNil, HNil] {}
  def num = new Rule[HNil, Byte :: HNil] {}
  def pattern = num ~ dot ~ num ~ dot ~ num ~ dot ~ num

Output

This does compile although the compilation is very slow and making patter even slightly more complex makes the compilation time much longer. It turns out that the compilation time strongly depends on how the tailSwitch instance is defined.
If it's just an implicit def or given then the compilation takes about 3.5 minutes. For inline given it's 2 minutes. Luckily there seems to be a workaround for the problem, which is using transparent inline given - then the code compiles in less than a second.

Expectation

The compilation should be as fast as for transparent inline given for all other cases as the types are given very explicitly here and transparent doesn't seem to narrow the type.

The text was updated successfully, but these errors were encountered:

jrudolph · 2022-04-11T14:40:11Z

A simple guess would be that the result of the first step is a Rule[I, O] where both I and O are still the TS type (applying the match type, did prove that it exists, but it wasn't reduced to its final form). Now, we start with Rule[TS[...], TS[...] going into the second step which already has to do double the work, basically redoing the first step + doing the next step. That type result will be something like Rule[TS[TS[...], TS[...], TS[...]], TS[TS[...], TS[...], TS[...]]] explaining the exponential explosion.

The root cause would be how implicits and type matches combine in this case.

Would that make sense?

prolativ · 2022-04-11T14:48:18Z

I have the same intuition. I would say the problem here is that match types are reduced too lazily.

jrudolph · 2022-04-11T15:16:14Z

Here's a simpler example, it's not even about the wrapping type class but it seems to depend just on the number of patterns in the match:

  trait Wrapper[T] {
    type Out
  }

  type Func[T] =
    T match {
      case String => Long
      case Long   => Int
      case Int    => Float
      case Float => Double
      case Double => Unit
      case Unit => String
    }

  implicit def infer[A]: Wrapper[One[A]] { type Out = Func[A] } = ???

  trait One[A] {
    def use(implicit w: Wrapper[One[A]]): One[w.Out]
  }

  val x: One[Long] = null
  x.use.use.use.use.use.use.use

odersky · 2022-04-11T17:39:43Z

@jrudolph I think that makes sense. We do cache match type reduction, but we have to be very conservative in order to be sound, which means that we have to invalidate caches in many situations. It seems that applies here. /cc @OlivierBlanvillain maybe he can shed some more light on the matter.

But it's good that transparent inline solves the problem.

jrudolph · 2022-04-12T08:09:11Z

Thanks, @odersky.

Speaking of caches, WeakHashSet related code turns up with > 7 % in profiling of this case. In particular, the queue.poll call seems to be expensive. I don't know if this is indicative of bad cache usage or just a property of poll. Is it really worth it to even use a ReferenceQueue in this case? Wouldn't it be easier to clean up GC'd entries while accessing the table?

odersky · 2022-04-12T08:44:09Z

The WeakHashSet is the set of all cached types. That's a different cache actually. A high profiling count could point to a problem in the implementation of that cache. Maybe @smarter could take a look at it, he converted our type tables to weak hash sets. It's also an indication that we generate a lot of different types. Would be interesting to find out what they are.

odersky · 2022-04-12T08:52:54Z

~~Indeed, I notice that reduction caches are not effective in this case. I see no re-use at all, and millions of footprint mismatches.~~

In fact it's something else. We generate millions of fresh match types, all of the form:

?1.Out match {
  case String => Long
  case Long => Int
  case Int => Float
  case Float => Double
  case Double => Unit
  case Unit => String
}

Since every match type is fresh, it needs to be reduced from scratch, so the cache is populated but never looked up. The problem is that the skolem type ?1 is not cached and that means every type containing it is not cached either.

odersky · 2022-04-12T09:31:02Z

Luckily there seems to be a workaround for the problem, which is using transparent inline given - then the code compiles in less than a second.

In fact I could not reproduce that. I hangs for me also with transparent inline.

Skolem types were not cached, which means that any type containing a skolem type was not cached either. This meant that the same match type with a skolem type as selector was created many times instead of once, so its reduction was not cached either. We now cache skolem types. It's a bet that in practice few skolem types are created and that therefore the hashtable pollution with skolemtypes is less of a problem than the potential problem of losing identity of types containing skolem types. Fixes scala#14903

odersky · 2022-04-12T09:44:46Z

It seems that caching skolem types solves the problem. But I note that the original code has an error. See the test file in the PR.

smarter · 2022-04-12T09:50:47Z

Is it really worth it to even use a ReferenceQueue in this case?

I don't know, but that design was imported wholesale from Scala 2 where it survived many years of optimization work, and using it in Scala 3 did not lead to slowdowns compared to a regular HashMap, cf #12935

jrudolph · 2022-04-12T09:54:32Z

But I note that the original code has an error

It compiles fine for me on 3.1.1 and 3.1.2 when using

implicit transparent inline def tailSwitch...

odersky · 2022-04-12T10:31:55Z

I get a hanged compiler when I run that code from main. And with the changes in the PR I get:

-- [E007] Type Mismatch Error: i14903.scala:54:16 ------------------------------
54 |  def pattern = num ~ dot ~ num ~ dot ~ num ~ dot ~ num // error
   |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |Found:    Rule[?1.Out, ?2.Out]
   |Required: Rule[HNil, 
   |  TailSwitch[Byte :: Byte :: Byte :: HNil, HNil, Byte :: HNil]{
   |    Out = (Byte :: Byte :: Byte :: Byte :: HNil)
   |  }#Out
   |]
   |
   |where:    ?1 is an unknown value of type TailSwitch[HNil, ?3.Out, ?4.Out]{Out = TailSwitch.TS[HNil, ?3.Out, ?4.Out]}
   |          ?2 is an unknown value of type TailSwitch[?3.Out, HNil, Byte :: HNil]{
   |  Out = TailSwitch.TS[?3.Out, HNil, Byte :: HNil]
   |}
   |
   |
   |Note: a match type could not be fully reduced:
   |
   |  trying to reduce  StripSuffix[?5.Out, HNil]
   |  failed since selector  ?5.Out
   |  does not match  case h :: t => StripSuffix[t, HNil] match {
   |  case Some[x] => Some[h :: x]
   |  case Any => None.type
   |} <: Option[HList]
   |  and cannot be shown to be disjoint from it either.
   |  Therefore, reduction cannot advance to the remaining case
   |
   |    case _ => None.type
   |
   | longer explanation available when compiling with `-explain`

Correction: It did not hang in main (with transparent inline), it just took a minute or so. And I get the same error as above when it terminates.

prolativ · 2022-04-12T10:51:02Z

Indeed. This looks like a regression between main and 3.1.2 then.

odersky · 2022-04-12T11:06:17Z

@prolativ Can we find out what was the commit that regressed? And, is the error legit or wrong?

jrudolph · 2022-04-12T11:48:52Z

3.1.3-RC1-bin-20220404-ad2553d-NIGHTLY is the first nightly that doesn't compile quickly with transparent inline, so it should be in a26a2c3..ad2553d. Maybe #13780?

odersky · 2022-04-12T11:59:13Z

I guess #13780 is a likely candidate. The question still remains whether the error is legit or not.

smarter · 2022-04-12T12:03:29Z

For anyone looking into this, note that the error message doesn't give enough information because it talks about skolems without giving their underlying type, this happened before and in #13491 (comment) I gave a way to workaround this:

Replacing i" by ex" in the errors printed in MatchTypeTrace#explainEntry, we can get more information

(I didn't make a PR for that because it can lead to the same skolem being explained multiple times so something a bit more clever is needed)

smarter · 2022-04-12T15:36:20Z

In d1957e5, the test case for #13491 was changed as follow:

--- tests/pos/13491.scala
+++ tests/pos/13491.scala
@@ -86,7 +86,8 @@ object Rule {
   type RuleN[+L <: HList]   = Rule[HNil, L]

   def rule[I <: HList, O <: HList](r: Rule[I, O]): Rule[I, O] = ???
-  implicit def valueMap[T](m: Map[String, T])(implicit h: HListable[T]): RuleN[h.Out] = ???
+
+  implicit def valueMap[T, Out0 <: HList](m: Map[String, T])(implicit h: HListable[T] { type Out = Out0 }): RuleN[Out0] = ???
 }

 object Test {

If I apply the same kind of change to the test case in this issue, it works too (and it doesn't lead to slow compilation on master anymore since skolems aren't involved):

diff --git tests/neg/i14903.scala tests/neg/i14903.scala
index cd61a12e858..cfd9fd6d546 100644
--- tests/neg/i14903.scala
+++ tests/pos/i14903.scala
@@ -43,10 +43,10 @@ object TailSwitch:
  * Rule popping I from stack and pushing back O
 */
 sealed class Rule[-I <: HList, +O <: HList]:
-  def ~[I2 <: HList, O2 <: HList](that: Rule[I2, O2])(implicit
-      i: TailSwitch[I2, O @uncheckedVariance, I @uncheckedVariance],
-      o: TailSwitch[O @uncheckedVariance, I2, O2]
-  ): Rule[i.Out, o.Out] = ???
+  def ~[I2 <: HList, O2 <: HList, Out0 <: HList, Out1 <: HList](that: Rule[I2, O2])(implicit
+      i: TailSwitch[I2, O @uncheckedVariance, I @uncheckedVariance] { type Out = Out0 },
+      o: TailSwitch[O @uncheckedVariance, I2, O2] { type Out = Out1 }
+  ): Rule[Out0, Out1] = ???

 object Test:
   def dot = new Rule[HNil, HNil] {}

smarter · 2022-04-13T10:32:40Z

@jrudolph does the proposed change to ~ in my previous comment work for parboiled2?

jrudolph · 2022-04-14T07:23:26Z

@jrudolph does the proposed change to ~ in my previous comment work for parboiled2?

I can definitely try. Would it have to stay that way or just to check that it would be a suitable workaround for the issue for the time being?

smarter · 2022-04-14T08:02:57Z

Seems like it would have to be that way, unless there's a chance the compiler could accept this code without breaking soundness /cc @OlivierBlanvillain

jrudolph · 2022-04-14T08:24:17Z

Seems like it would have to be that way, unless there's a chance the compiler could accept this code without breaking soundness /cc @OlivierBlanvillain

Doing it in that one place doesn't seem to be enough. I would imagine that it will break things at other places as well as this kind of pattern is used in many more places. Which kind of usages are fine and which are going to break with 3.1.3?

smarter · 2022-04-20T20:21:34Z

Which kind of usages are fine and which are going to break with 3.1.3?

Unclear so far, I've opened #14987 as an alternative fix which if backported would avoid breaking anything hopefully.

Previously, when reducing `a.T` we checked if the type of `a` was a subtype of `RefinedType(.., T, TypeAlias(...))`, now we extend this check to handle refinements where the `info` is a `TypeBounds` where both bounds are equal. This solves two big issues at once: - We can restore tests/pos/13491.scala to its original form from before scala#13780. The check for abstract types introduced by scala#13780 for soundness reasons is no longer hit because the type selection is reduced before we get to that point. This is important because parboiled2 relies on this and is therefore currently broken on 3.1.3-RC1 and main (sirthias/parboiled2#365). - This fixes scala#14903 (slow compilation issue affecting parboiled2) without caching skolems (as in the alternative fix scala#14909). Again, this is due to the type containing skolem being reducible to a simpler type and therefore cacheable.

Skolem types were not cached, which means that any type containing a skolem type was not cached either. This meant that the same match type with a skolem type as selector was created many times instead of once, so its reduction was not cached either. We now cache skolem types. It's a bet that in practice few skolem types are created and that therefore the hashtable pollution with skolemtypes is less of a problem than the potential problem of losing identity of types containing skolem types. Fixes scala#14903

Previously, when reducing `a.T` we checked if the type of `a` was a subtype of `RefinedType(.., T, TypeAlias(...))`, now we extend this check to handle refinements where the `info` is a `TypeBounds` where both bounds are equal. This solves two big issues at once: - We can restore tests/pos/13491.scala to its original form from before scala#13780. The check for abstract types introduced by scala#13780 for soundness reasons is no longer hit because the type selection is reduced before we get to that point. This is important because parboiled2 relies on this and is therefore currently broken on 3.1.3-RC1 and main (sirthias/parboiled2#365). - This fixes scala#14903 (slow compilation issue affecting parboiled2) without caching skolems (as in the alternative fix scala#14909). Again, this is due to the type containing skolem being reducible to a simpler type and therefore cacheable.

Skolem types were not cached, which means that any type containing a skolem type was not cached either. This meant that the same match type with a skolem type as selector was created many times instead of once, so its reduction was not cached either. We now cache skolem types. It's a bet that in practice few skolem types are created and that therefore the hashtable pollution with skolemtypes is less of a problem than the potential problem of losing identity of types containing skolem types. This was originally motivated by scala#14903 but that ended up being fixed in a different way. It still seems like a good idea to do this since it matches what we do for type variables.

prolativ added itype:performance area:inline area:match-types labels Apr 11, 2022

prolativ assigned mbovel Apr 11, 2022

prolativ mentioned this issue Apr 11, 2022

scala3: compile akka-http-core and pass all its tests akka/akka-http#4097

Merged

jrudolph mentioned this issue Apr 11, 2022

Scala 3: very slow / hanging rule compilation for ~ chaining subrules sirthias/parboiled2#360

Closed

odersky mentioned this issue Apr 12, 2022

Cache skolem types #14909

Merged

jrudolph mentioned this issue Apr 12, 2022

Add parboiled2 to community build #14911

Merged

jrudolph mentioned this issue Apr 14, 2022

Compilation broken against current dotty nightlies and likely for upcoming Scala 3.1.3 sirthias/parboiled2#365

Closed

griggt mentioned this issue Apr 20, 2022

More aggressive reduction of type selection (fixes parboiled2) #14987

Merged

odersky closed this as completed in #14987 Apr 21, 2022

mbovel removed their assignment May 3, 2022

Kordyjan added this to the 3.2.0 milestone Aug 1, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Slow compilation with chained dependent match types #14903

Slow compilation with chained dependent match types #14903

prolativ commented Apr 11, 2022

jrudolph commented Apr 11, 2022

prolativ commented Apr 11, 2022

jrudolph commented Apr 11, 2022 •

edited

Loading

odersky commented Apr 11, 2022

jrudolph commented Apr 12, 2022

odersky commented Apr 12, 2022

odersky commented Apr 12, 2022 •

edited

Loading

odersky commented Apr 12, 2022

odersky commented Apr 12, 2022

smarter commented Apr 12, 2022

jrudolph commented Apr 12, 2022

odersky commented Apr 12, 2022 •

edited

Loading

prolativ commented Apr 12, 2022

odersky commented Apr 12, 2022

jrudolph commented Apr 12, 2022 •

edited

Loading

odersky commented Apr 12, 2022

smarter commented Apr 12, 2022

smarter commented Apr 12, 2022

smarter commented Apr 13, 2022

jrudolph commented Apr 14, 2022

smarter commented Apr 14, 2022

jrudolph commented Apr 14, 2022

smarter commented Apr 20, 2022

Slow compilation with chained dependent match types #14903

Slow compilation with chained dependent match types #14903

Comments

prolativ commented Apr 11, 2022

Compiler version

Minimized code

Output

Expectation

jrudolph commented Apr 11, 2022

prolativ commented Apr 11, 2022

jrudolph commented Apr 11, 2022 • edited Loading

odersky commented Apr 11, 2022

jrudolph commented Apr 12, 2022

odersky commented Apr 12, 2022

odersky commented Apr 12, 2022 • edited Loading

odersky commented Apr 12, 2022

odersky commented Apr 12, 2022

smarter commented Apr 12, 2022

jrudolph commented Apr 12, 2022

odersky commented Apr 12, 2022 • edited Loading

prolativ commented Apr 12, 2022

odersky commented Apr 12, 2022

jrudolph commented Apr 12, 2022 • edited Loading

odersky commented Apr 12, 2022

smarter commented Apr 12, 2022

smarter commented Apr 12, 2022

smarter commented Apr 13, 2022

jrudolph commented Apr 14, 2022

smarter commented Apr 14, 2022

jrudolph commented Apr 14, 2022

smarter commented Apr 20, 2022

jrudolph commented Apr 11, 2022 •

edited

Loading

odersky commented Apr 12, 2022 •

edited

Loading

odersky commented Apr 12, 2022 •

edited

Loading

jrudolph commented Apr 12, 2022 •

edited

Loading