Skip to content

Slow compilation with chained dependent match types #14903

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
prolativ opened this issue Apr 11, 2022 · 23 comments · Fixed by #14909 or #14987
Closed

Slow compilation with chained dependent match types #14903

prolativ opened this issue Apr 11, 2022 · 23 comments · Fixed by #14909 or #14987

Comments

@prolativ
Copy link
Contributor

Compiler version

3.1.2

Minimized code

I encountered the problem while trying to migrate a part of akka-http (forked from parboiled2) to Scala 3.

import annotation.unchecked.uncheckedVariance

sealed trait HList
sealed trait HNil extends HList
case object HNil extends HNil
case class ::[+H, +T <: HList](head: H, tail: T) extends HList

type Concat[X <: HList, Y <: HList] <: HList = X match
  case HNil   => Y
  case h :: t => h :: Concat[t, Y]

/**
  * Decompose L into Prefix ++ Suffix if possible
*/
type StripSuffix[L <: HList, Suffix <: HList] <: Option[HList] = L match
  case Suffix => Some[HNil]
  case h :: t => StripSuffix[t, Suffix] match
    case Some[x] => Some[h :: x]
    case _ => None.type
  case _      => None.type

/**
  * type-level implementation of this logic:
  *   Out =
  *     R                      if T has a tail of type L
  *     (L dropRight T) ++ R   if L has a tail of type T
*/
sealed trait TailSwitch[L <: HList, T <: HList, R <: HList]:
  type Out <: HList

object TailSwitch:
  type TS[L <: HList, T <: HList, R <: HList] <: HList =
    StripSuffix[T, L] match
      case Some[_] => R
      case _ => StripSuffix[L, T] match
        case Some[x] => Concat[x, R]

  implicit def tailSwitch[L <: HList, T <: HList, R <: HList]: (TailSwitch[L, T, R] {
    type Out = TS[L, T, R]
  }) = new TailSwitch[L, T, R] { type Out = TS[L, T, R] }

/**
 * Rule popping I from stack and pushing back O
*/
sealed class Rule[-I <: HList, +O <: HList]:
  def ~[I2 <: HList, O2 <: HList](that: Rule[I2, O2])(implicit
      i: TailSwitch[I2, O @uncheckedVariance, I @uncheckedVariance],
      o: TailSwitch[O @uncheckedVariance, I2, O2]
  ): Rule[i.Out, o.Out] = ???

object Test:
  def dot = new Rule[HNil, HNil] {}
  def num = new Rule[HNil, Byte :: HNil] {}
  def pattern = num ~ dot ~ num ~ dot ~ num ~ dot ~ num
  

Output

This does compile although the compilation is very slow and making patter even slightly more complex makes the compilation time much longer. It turns out that the compilation time strongly depends on how the tailSwitch instance is defined.
If it's just an implicit def or given then the compilation takes about 3.5 minutes. For inline given it's 2 minutes. Luckily there seems to be a workaround for the problem, which is using transparent inline given - then the code compiles in less than a second.

Expectation

The compilation should be as fast as for transparent inline given for all other cases as the types are given very explicitly here and transparent doesn't seem to narrow the type.

@jrudolph
Copy link
Contributor

A simple guess would be that the result of the first step is a Rule[I, O] where both I and O are still the TS type (applying the match type, did prove that it exists, but it wasn't reduced to its final form). Now, we start with Rule[TS[...], TS[...] going into the second step which already has to do double the work, basically redoing the first step + doing the next step. That type result will be something like Rule[TS[TS[...], TS[...], TS[...]], TS[TS[...], TS[...], TS[...]]] explaining the exponential explosion.

The root cause would be how implicits and type matches combine in this case.

Would that make sense?

@prolativ
Copy link
Contributor Author

I have the same intuition. I would say the problem here is that match types are reduced too lazily.

@jrudolph
Copy link
Contributor

jrudolph commented Apr 11, 2022

Here's a simpler example, it's not even about the wrapping type class but it seems to depend just on the number of patterns in the match:

  trait Wrapper[T] {
    type Out
  }

  type Func[T] =
    T match {
      case String => Long
      case Long   => Int
      case Int    => Float
      case Float => Double
      case Double => Unit
      case Unit => String
    }

  implicit def infer[A]: Wrapper[One[A]] { type Out = Func[A] } = ???

  trait One[A] {
    def use(implicit w: Wrapper[One[A]]): One[w.Out]
  }

  val x: One[Long] = null
  x.use.use.use.use.use.use.use

@odersky
Copy link
Contributor

odersky commented Apr 11, 2022

@jrudolph I think that makes sense. We do cache match type reduction, but we have to be very conservative in order to be sound, which means that we have to invalidate caches in many situations. It seems that applies here. /cc @OlivierBlanvillain maybe he can shed some more light on the matter.

But it's good that transparent inline solves the problem.

@jrudolph
Copy link
Contributor

Thanks, @odersky.

Speaking of caches, WeakHashSet related code turns up with > 7 % in profiling of this case. In particular, the queue.poll call seems to be expensive. I don't know if this is indicative of bad cache usage or just a property of poll. Is it really worth it to even use a ReferenceQueue in this case? Wouldn't it be easier to clean up GC'd entries while accessing the table?

@odersky
Copy link
Contributor

odersky commented Apr 12, 2022

The WeakHashSet is the set of all cached types. That's a different cache actually. A high profiling count could point to a problem in the implementation of that cache. Maybe @smarter could take a look at it, he converted our type tables to weak hash sets. It's also an indication that we generate a lot of different types. Would be interesting to find out what they are.

@odersky
Copy link
Contributor

odersky commented Apr 12, 2022

Indeed, I notice that reduction caches are not effective in this case. I see no re-use at all, and millions of footprint mismatches.

In fact it's something else. We generate millions of fresh match types, all of the form:

?1.Out match {
  case String => Long
  case Long => Int
  case Int => Float
  case Float => Double
  case Double => Unit
  case Unit => String
}

Since every match type is fresh, it needs to be reduced from scratch, so the cache is populated but never looked up. The problem is that the skolem type ?1 is not cached and that means every type containing it is not cached either.

@odersky
Copy link
Contributor

odersky commented Apr 12, 2022

Luckily there seems to be a workaround for the problem, which is using transparent inline given - then the code compiles in less than a second.

In fact I could not reproduce that. I hangs for me also with transparent inline.

odersky added a commit to dotty-staging/dotty that referenced this issue Apr 12, 2022
Skolem types were not cached, which means that any type containing a skolem type
was not cached either. This meant that the same match type with a skolem type as
selector was created many times instead of once, so its reduction was not cached
either. We now cache skolem types. It's a bet that in practice few skolem types are
created and that therefore the hashtable pollution with skolemtypes is less of
a problem than the potential problem of losing identity of types containing
skolem types.

Fixes scala#14903
@odersky
Copy link
Contributor

odersky commented Apr 12, 2022

It seems that caching skolem types solves the problem. But I note that the original code has an error. See the test file in the PR.

@smarter
Copy link
Member

smarter commented Apr 12, 2022

Is it really worth it to even use a ReferenceQueue in this case?

I don't know, but that design was imported wholesale from Scala 2 where it survived many years of optimization work, and using it in Scala 3 did not lead to slowdowns compared to a regular HashMap, cf #12935

@jrudolph
Copy link
Contributor

But I note that the original code has an error

It compiles fine for me on 3.1.1 and 3.1.2 when using

implicit transparent inline def tailSwitch...

@odersky
Copy link
Contributor

odersky commented Apr 12, 2022

I get a hanged compiler when I run that code from main. And with the changes in the PR I get:

-- [E007] Type Mismatch Error: i14903.scala:54:16 ------------------------------
54 |  def pattern = num ~ dot ~ num ~ dot ~ num ~ dot ~ num // error
   |                ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
   |Found:    Rule[?1.Out, ?2.Out]
   |Required: Rule[HNil, 
   |  TailSwitch[Byte :: Byte :: Byte :: HNil, HNil, Byte :: HNil]{
   |    Out = (Byte :: Byte :: Byte :: Byte :: HNil)
   |  }#Out
   |]
   |
   |where:    ?1 is an unknown value of type TailSwitch[HNil, ?3.Out, ?4.Out]{Out = TailSwitch.TS[HNil, ?3.Out, ?4.Out]}
   |          ?2 is an unknown value of type TailSwitch[?3.Out, HNil, Byte :: HNil]{
   |  Out = TailSwitch.TS[?3.Out, HNil, Byte :: HNil]
   |}
   |
   |
   |Note: a match type could not be fully reduced:
   |
   |  trying to reduce  StripSuffix[?5.Out, HNil]
   |  failed since selector  ?5.Out
   |  does not match  case h :: t => StripSuffix[t, HNil] match {
   |  case Some[x] => Some[h :: x]
   |  case Any => None.type
   |} <: Option[HList]
   |  and cannot be shown to be disjoint from it either.
   |  Therefore, reduction cannot advance to the remaining case
   |
   |    case _ => None.type
   |
   | longer explanation available when compiling with `-explain`

Correction: It did not hang in main (with transparent inline), it just took a minute or so. And I get the same error as above when it terminates.

@prolativ
Copy link
Contributor Author

Indeed. This looks like a regression between main and 3.1.2 then.

@odersky
Copy link
Contributor

odersky commented Apr 12, 2022

@prolativ Can we find out what was the commit that regressed? And, is the error legit or wrong?

@jrudolph
Copy link
Contributor

jrudolph commented Apr 12, 2022

3.1.3-RC1-bin-20220404-ad2553d-NIGHTLY is the first nightly that doesn't compile quickly with transparent inline, so it should be in a26a2c3..ad2553d. Maybe #13780?

@odersky
Copy link
Contributor

odersky commented Apr 12, 2022

I guess #13780 is a likely candidate. The question still remains whether the error is legit or not.

@smarter
Copy link
Member

smarter commented Apr 12, 2022

For anyone looking into this, note that the error message doesn't give enough information because it talks about skolems without giving their underlying type, this happened before and in #13491 (comment) I gave a way to workaround this:

Replacing i" by ex" in the errors printed in MatchTypeTrace#explainEntry, we can get more information

(I didn't make a PR for that because it can lead to the same skolem being explained multiple times so something a bit more clever is needed)

@smarter
Copy link
Member

smarter commented Apr 12, 2022

In d1957e5, the test case for #13491 was changed as follow:

--- tests/pos/13491.scala
+++ tests/pos/13491.scala
@@ -86,7 +86,8 @@ object Rule {
   type RuleN[+L <: HList]   = Rule[HNil, L]

   def rule[I <: HList, O <: HList](r: Rule[I, O]): Rule[I, O] = ???
-  implicit def valueMap[T](m: Map[String, T])(implicit h: HListable[T]): RuleN[h.Out] = ???
+
+  implicit def valueMap[T, Out0 <: HList](m: Map[String, T])(implicit h: HListable[T] { type Out = Out0 }): RuleN[Out0] = ???
 }

 object Test {

If I apply the same kind of change to the test case in this issue, it works too (and it doesn't lead to slow compilation on master anymore since skolems aren't involved):

diff --git tests/neg/i14903.scala tests/neg/i14903.scala
index cd61a12e858..cfd9fd6d546 100644
--- tests/neg/i14903.scala
+++ tests/pos/i14903.scala
@@ -43,10 +43,10 @@ object TailSwitch:
  * Rule popping I from stack and pushing back O
 */
 sealed class Rule[-I <: HList, +O <: HList]:
-  def ~[I2 <: HList, O2 <: HList](that: Rule[I2, O2])(implicit
-      i: TailSwitch[I2, O @uncheckedVariance, I @uncheckedVariance],
-      o: TailSwitch[O @uncheckedVariance, I2, O2]
-  ): Rule[i.Out, o.Out] = ???
+  def ~[I2 <: HList, O2 <: HList, Out0 <: HList, Out1 <: HList](that: Rule[I2, O2])(implicit
+      i: TailSwitch[I2, O @uncheckedVariance, I @uncheckedVariance] { type Out = Out0 },
+      o: TailSwitch[O @uncheckedVariance, I2, O2] { type Out = Out1 }
+  ): Rule[Out0, Out1] = ???

 object Test:
   def dot = new Rule[HNil, HNil] {}

@smarter
Copy link
Member

smarter commented Apr 13, 2022

@jrudolph does the proposed change to ~ in my previous comment work for parboiled2?

@jrudolph
Copy link
Contributor

@jrudolph does the proposed change to ~ in my previous comment work for parboiled2?

I can definitely try. Would it have to stay that way or just to check that it would be a suitable workaround for the issue for the time being?

@smarter
Copy link
Member

smarter commented Apr 14, 2022

Seems like it would have to be that way, unless there's a chance the compiler could accept this code without breaking soundness /cc @OlivierBlanvillain

@jrudolph
Copy link
Contributor

Seems like it would have to be that way, unless there's a chance the compiler could accept this code without breaking soundness /cc @OlivierBlanvillain

Doing it in that one place doesn't seem to be enough. I would imagine that it will break things at other places as well as this kind of pattern is used in many more places. Which kind of usages are fine and which are going to break with 3.1.3?

@smarter
Copy link
Member

smarter commented Apr 20, 2022

Which kind of usages are fine and which are going to break with 3.1.3?

Unclear so far, I've opened #14987 as an alternative fix which if backported would avoid breaking anything hopefully.

smarter added a commit to dotty-staging/dotty that referenced this issue Apr 20, 2022
Previously, when reducing `a.T` we checked if the type of `a` was a subtype of
`RefinedType(.., T, TypeAlias(...))`, now we extend this check to handle
refinements where the `info` is a `TypeBounds` where both bounds are equal.

This solves two big issues at once:
- We can restore tests/pos/13491.scala to its original form from before scala#13780.
  The check for abstract types introduced by scala#13780 for soundness reasons is no
  longer hit because the type selection is reduced before we get to that point.
  This is important because parboiled2 relies on this and is therefore currently
  broken on 3.1.3-RC1 and main (sirthias/parboiled2#365).
- This fixes scala#14903 (slow compilation issue affecting parboiled2) without
  caching skolems (as in the alternative fix scala#14909). Again, this is due to the
  type containing skolem being reducible to a simpler type and therefore cacheable.
odersky added a commit to dotty-staging/dotty that referenced this issue Apr 21, 2022
Skolem types were not cached, which means that any type containing a skolem type
was not cached either. This meant that the same match type with a skolem type as
selector was created many times instead of once, so its reduction was not cached
either. We now cache skolem types. It's a bet that in practice few skolem types are
created and that therefore the hashtable pollution with skolemtypes is less of
a problem than the potential problem of losing identity of types containing
skolem types.

Fixes scala#14903
michelou pushed a commit to michelou/scala3 that referenced this issue Apr 25, 2022
Previously, when reducing `a.T` we checked if the type of `a` was a subtype of
`RefinedType(.., T, TypeAlias(...))`, now we extend this check to handle
refinements where the `info` is a `TypeBounds` where both bounds are equal.

This solves two big issues at once:
- We can restore tests/pos/13491.scala to its original form from before scala#13780.
  The check for abstract types introduced by scala#13780 for soundness reasons is no
  longer hit because the type selection is reduced before we get to that point.
  This is important because parboiled2 relies on this and is therefore currently
  broken on 3.1.3-RC1 and main (sirthias/parboiled2#365).
- This fixes scala#14903 (slow compilation issue affecting parboiled2) without
  caching skolems (as in the alternative fix scala#14909). Again, this is due to the
  type containing skolem being reducible to a simpler type and therefore cacheable.
smarter pushed a commit to dotty-staging/dotty that referenced this issue Apr 26, 2022
Skolem types were not cached, which means that any type containing a skolem type
was not cached either. This meant that the same match type with a skolem type as
selector was created many times instead of once, so its reduction was not cached
either. We now cache skolem types. It's a bet that in practice few skolem types are
created and that therefore the hashtable pollution with skolemtypes is less of
a problem than the potential problem of losing identity of types containing
skolem types.

This was originally motivated by scala#14903 but that ended up being fixed in a
different way. It still seems like a good idea to do this since it matches what
we do for type variables.
@mbovel mbovel removed their assignment May 3, 2022
bishabosha pushed a commit to dotty-staging/dotty that referenced this issue Oct 18, 2022
Skolem types were not cached, which means that any type containing a skolem type
was not cached either. This meant that the same match type with a skolem type as
selector was created many times instead of once, so its reduction was not cached
either. We now cache skolem types. It's a bet that in practice few skolem types are
created and that therefore the hashtable pollution with skolemtypes is less of
a problem than the potential problem of losing identity of types containing
skolem types.

This was originally motivated by scala#14903 but that ended up being fixed in a
different way. It still seems like a good idea to do this since it matches what
we do for type variables.
@Kordyjan Kordyjan added this to the 3.2.0 milestone Aug 1, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants