Functional Typelevel Programming in Scala #4768
Working draft document. Implementation is done elsewhere.
docs/docs/typelevel.md
Transparent values are effectively final; they may not be overridden. In Scala-2, constant values had to be expressed using `final`, which gave an unfortunate double meaning to the modifier. The `final` syntax is still supported in Scala 3 for a limited time to support cross-building.

Transparent values are more general than the old meaning of `final` since they also work on paths. For instance, the `field` definition above establishes at typing time the knowlegde that `field` is an alias of `outer.field`. The same effect can be achieved with an explicit singleton type ascription:
typo here in knowlegde
:)
docs/docs/typelevel.md
transparent val pi: Double = 3.14159265359
transparent val field = outer.field
```
The right hand side of a `transparent` value definition must have a singleton type. The type of the value is then the singleton type of its right hand side, without any widening. For instance, the type of `label` above is the singleton type `"url"` instead of `String` and the type of `pi` is `3.14159265359` instead of `Double`.
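To illustrate the typing consequence this paragraph describes, here is a small sketch in the proposal's syntax (not compilable with any released compiler; the `pi` definition is taken from the excerpt above, and the `label` definition is my reconstruction of what the prose implies):

```scala
transparent val label = "url"
transparent val pi: Double = 3.14159265359

// The declared types are the un-widened singleton types of the right hand sides:
val a: "url" = label              // ok: label has type "url", not String
val b: 3.14159265359 = pi         // ok: pi keeps the literal type, not Double
val c: String = label             // also ok: "url" is a subtype of String
```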
Can the right side be a call to a transparent def as well?
transparent def natToInt[N <: Nat]: Int = ...
transparent val natOne: Int = natToInt[S[Z]]
Edit: I assume it can, but maybe it's worth mentioning here
I read this yesterday, it looks really good!
By contrast, here are some advantages of PE over DT:

+ It uses the standard type checker and typing rules with no need to go to full dependent types
I think it's a bit unfair to claim "the use of standard type checker and typing rules" for PE. Indeed, PE wouldn't perform any (type) specialization without the rewrite rules, which are neither part of the standard type checker nor of the typing rules. Also, these rewrite rules have a 1-1 correspondence to the new typing rules we would need for "full term-dependencies".
An application of a transparent function then needs to just instantiate that type, whereas the term is not affected at all, and the function is called as usual. This means there is a guarantee that the semantics of a transparent call is the same as a normal call, only the type is better. On the other hand, one cannot play tricks such as code specialization, since there is nothing to specialize.

Also, implicit matches would not fit into this scheme, as they require term rewritings. Another concern is how much additional complexity would be caused by mirroring the necessary term forms in types and tweaking type inference to work with the new types.
The diff of #4671 should give a good idea of how much complexity is needed to encode and infer these types; here is a breakdown of the line changes (pattern matching not included):
+270 -32 typer
+87 -30 pretty printer
+186 tests
There is a bit more if we remove the inheritance between `TypeOf` and `AnnotatedType` (we did that yesterday in the wip branch), but I'm now confident that encoding and inferring these types is the easy part; the complexity will be in the rules to normalize `TypeOf` types.
Using `anyValue`, we can then define `defaultValue` as follows:
```scala
transparent def defaultValue[T]: Option[T] = anyValue[T] match {
It's really nice like this! The equivalent in `TypeOf` terms would use `scala.Predef.valueOf` (described in SIP-23):
transparent def defaultValue[T] = ... // unchanged
val defaultByte: Some[0L] = valueOf[{ defaultValue[Long] }]
I want to play devil's advocate on a few things that feel sketchy to me but overall, this looks amazing!
```
In other words, specialization with transparent functions allows "abstraction without regret". The price to pay for this is the increase of code size through inlining. That price is often worth paying, but inlining decisions need to be considered carefully.
If I understand properly, `*` in this example is bound to an implicit extension from the `Numeric` instance when `MathLib.apply` initially typechecks, but then re-bound to the normal `*` method on `Double` at the use site. If this is correct, then I very much understand the utility, but this feels surprisingly unhygienic to me: the implicit can get swapped for the normal method simply because they share a name. Maybe I'm just not used to it, but this outcome would feel kind of shocking to me if I was programming to a typeclass. It seems like you could accomplish the same goal without requiring that kind of sketchy rebinding if only:

- The `Numeric` instance for `Double` were also transparent
- We kept `inline` around and made all the methods on `Numeric[Double]` be `inline`
- We allowed `inline` methods to implement abstract methods, with the `inline` method body simply not inlining after being upcast to the abstract type (so that `Numeric[Double]` could still be valid). To do this, the bodies would probably have to get "inlined" against opaque identifiers representing the method parameters...
I see how this can look like a lack of hygiene, but it isn't. Hygiene is about free variables being rebound, but `*` is not a free variable: it is a method of its left operand that, in general, depends on the type of the operand. Arguably, it would be a surprise if `*` was not rebound. I still have to write this argument up in detail, but maybe it's better to get more experience with the system first.
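To see why `*` is not a free variable in the hygiene sense, here is a plain-Scala illustration (nothing proposal-specific; the `MathLib` example itself is not reproduced here): the member that `*` denotes is always selected from the static type of its left operand.

```scala
object StarResolution {
  // Generic code: `*` resolves to the operation supplied by the Numeric instance.
  def timesGeneric[T](x: T, y: T)(implicit num: Numeric[T]): T = {
    import num._   // brings the implicit `mkNumericOps` conversion into scope
    x * y          // Numeric's `*`
  }

  // Specialised code: the same-looking `*` is Double's primitive multiplication.
  def timesDouble(x: Double, y: Double): Double =
    x * y

  def main(args: Array[String]): Unit = {
    println(timesGeneric(3.0, 4.0))  // 12.0, via Numeric[Double]
    println(timesDouble(3.0, 4.0))   // 12.0, via the primitive operation
  }
}
```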
`concatImpl` is a regular, non-transparent function that implements `concat` on generic tuples. It is not inlineable, and its result type is always `Tuple`. The actual code for `concat` calls `concatImpl` and casts its result to type `r.Type`, the computed result type of the concatenation. This gives the best of both worlds: compact code and expressive types.

One might criticize that this scheme involves code duplication. In the example above, the recursive `concat` algorithm had to be implemented twice, once as a regular function, the other time as a transparent function. However, in practice it is quite likely that the regular function would use optimized data representations and algorithms that do not lend themselves easily to a typelevel interpretation. In these cases a dual implementation is required anyway.
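The body of `concatImpl` is not quoted in this excerpt; a plausible sketch of such a regular, non-transparent implementation (my assumption, written in the proposal's generic-tuple notation as used elsewhere in this thread, so not compilable today) would be:

```scala
// Ordinary term-level recursion over the proposal's generic tuples;
// the result type is always the unrefined `Tuple`.
def concatImpl(xs: Tuple, ys: Tuple): Tuple = xs match {
  case ()       => ys
  case (x, xs1) => (x, concatImpl(xs1, ys))
}
```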
Note that the need for this duplicative and unsafe-feeling code could be avoided if we kept `inline` around and `transparent` didn't imply `inline`: that is, if `transparent` by itself caused the code to get run both at compile time and at runtime, whereas `inline transparent` caused it to work exactly as this proposal currently describes. Of course, we'd probably have to either rule out implicit matches or have them only consider the definition site (and thus be kind of useless) without `inline`...
How do you do term reduction (necessary for the resulting type refinement) without at least inlining the argument into the body?
This is why I was thinking of inlining the argument into the body, looking at what's left of it after the reductions, and using that as parameters to a specialized version of the transparent function, a specialized version which would be cached and reused by invocations that reduce similarly. But caching may be too tricky.
Hmm, yes yes! If I understand, you make a good point: you could quickly get an explosion of runtime versions of the method! Caching seems like it could help but it would only reduce the problem rather than eliminate it... It might also be really hard to get binary compatibility right... It's almost like this wouldn't be practical unless it were paired with the behavior of not recomputing invocation receivers at the use site that I described in my other comment... slash... Am I somehow missing a way that you would still somehow get runtime-version explosion if it worked that way?
> you could quickly get an explosion of runtime versions of the method
I was talking about caching the code being compiled, at compile time, to reduce the size of the generated bytecode (specifically, to avoid the adverse effects of extensive inlining). I am not sure I understand what runtime caching has to do with the problem at hand. Sorry if I wasn't being clear. Maybe 'caching' is not the right word, and I should say 'factoring' similar implementations together – the opposite of duplicating them all over the place.
> How do you do term reduction (necessary for the resulting type refinement) without at least inlining the argument into the body?
@LPTK By working on types directly instead of manipulating untyped trees, which is what the alternative "TypeOf" thing is about. Let me illustrate with an example.
transparent def unroll(i: Int): String = {
  println(i)
  if (i == 0) "foo"
  else unroll(i - 1)
}
With transparent implemented with inlining of untyped trees, `unroll(2)` would be expanded as follows:
unroll(2)
--> (inlining)
{
  val i = 2
  println(i)
  if (i == 0) "foo"
  else unroll(i - 1)
}
--> (constant foldings)
{ println(2); unroll(1) }
--> (inlining)
{ println(2); {
  val i = 1
  println(i)
  if (i == 0) "foo"
  else unroll(i - 1)
}}
--> (constant foldings)
{ println(2); println(1); unroll(0) }
--> (inlining)
{ println(2); println(1); {
  val i = 0
  println(i)
  if (i == 0) "foo"
  else unroll(i - 1)
}}
--> (constant foldings)
{ println(2); println(1); println(0); "foo" }
The resulting block is then re-typechecked to finally compute the very precise singleton type `"foo"` for `unroll(2)`.
Now in the typed approach, the definition of unroll is typechecked once where it's defined, and terms are never touched again. Here is how the definition of unroll would be typed, working bottom up:
transparent def unroll(i: Int): String = {
  println(i)        // : Unit
  if (
    i == 0          // : { i.type == 0 }
  )
    "foo"           // : "foo"
  else
    unroll(
      i - 1         // : { i.type - 1 }
    )               // : { unroll({ i.type - 1 }) }
}
Where `{ a.f(b) }` is special syntax for the precise type of the call to `f`. Don't be confused by the presence of `{.}`: this is really a type (composed of other types); there are no trees in there. A more verbose syntax would be `CALL-TYPE(a.type, f.symbol, b.type)`.
The `if` would then be typed as follows:
IF-TYPE({ i.type == 0 }, "foo", { unroll({ i.type - 1 }) })
The block would get the same type, which thus becomes the precise return type of `unroll`. The call `unroll(2)` would then be typed as `{ unroll(2) }`, which can be normalized as follows:
{ unroll(2) }
(β-reduction) -->
IF-TYPE({ i.type == 0 }, "foo", { unroll({ i.type - 1 }) }) [2/i.type]
(substitution) -->
IF-TYPE({ 2 == 0 }, "foo", { unroll({ 2 - 1 }) })
(constant folding) -->
IF-TYPE(false, "foo", { unroll(1) })
(if) -->
{ unroll(1) }
...
--> "foo"
In conclusion: inlining is not mandatory to do computations on types as described in the proposal. Of course working with types is less powerful as it doesn't do any code specialization. The `TypeOf` approach is, in a sense, simpler and less ambitious.
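Either way, the intended use-site typing is the same; a minimal sketch (proposal syntax, assuming the `unroll` definition above):

```scala
val s: "foo" = unroll(2)   // precise singleton result type under both approaches
// At runtime the call still prints 2, 1 and 0; only under the inlining
// approach is that printing code itself expanded at the call site.
```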
transparent def toNat(n: Int): Nat = n match {
  case 0 => Z
  case n if n > 0 => S(toNat(n - 1))
}
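The `Nat` encoding this excerpt presupposes is not shown; a conventional Peano-style sketch (my assumption, matching the `Z` and `S` names used above) would be:

```scala
// Assumed Peano-style encoding of natural numbers (not quoted from the document).
sealed trait Nat
case object Z extends Nat
case class S[N <: Nat](n: N) extends Nat
```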
It is absolutely a tradeoff but I want to register that to me at least, the ability to give an explicit return type that gets overridden has never felt right on whitebox macros and it still doesn't feel right to me here. Of course, I'm just one person and I can definitely get over it but I know that I, for one, would feel much more comfortable if you were required to leave off the return type on `transparent` methods and values.
Or maybe something like `_ <: Nat` to make it clear that `Nat` is actually an upper type bound?
The return type is needed to do the initial type checking of the function (before inlining).
Also, if we consider `_` to have its usual semantics in types, which is that of an existential (and that's how I personally read it), then `_ <: Nat`, otherwise written `t forSome { type t <: Nat }`, is exactly equivalent to `Nat` when used in positive positions (as here).
Hrm.... It feels kind of weak for us to "need" an initial type for typechecking given the general theme of what `transparent` does. That said, I am not currently able to argue that it isn't needed, especially when we are dealing with recursion, which normally defeats method type inference.
I am taking us dangerously close to unproductive bikeshedding by suggesting this but what about `transparent def toNat(n: Int) <: Nat`? That should still parse unambiguously and shows that something quite different from normal method type ascription is going on here?
@clhodapp your version does read a little better, except for the flagrant conceptual violation that `<:` is for comparing types, and is really different from type ascription (in that sense, it would make Scala similar to those IMHO syntactically-misled languages which use a function-type arrow to ascribe a result type to a method, such as Rust with `fn foo(x: T) -> R { ... }`).
I think it's important that beginners understand that `<:` and `:` are fundamentally distinct concepts, and your proposed syntax would just muddy the waters for them.
Fair. I throw my hands up on a syntax for now!
Thought experiment -- what if it required a non-explicit (inferred) type? (Not suggesting it as an actual requirement, for practical reasons, but I think it would be helpful to do the thought experiment.)
Here's another thought: require a type variable. Something like
transparent def toNat[R <: Nat](n: Int): R = ???
The "meaning" wouldn't have to change -- even in scala 2, what this means is, "compiler: please solve a type for R that has an upper bound of Nat," except that until now it could only use (unspecified) type inference rules (based on the constraints in the type signature). One way to look at this feature is, it's another case where you are giving the compiler a rough type but you want the compiler to drill it down more.
On the other hand, it's not actually a parameter. It isn't up to the caller to choose.
Hmm, almost like the difference between type parameters and type members. Except, methods can't have type members.
Oh well. Maybe food for thought anyway...
I considered that but then decided against it.
docs/docs/typelevel.md
transparent def nthDynamic(xs: Tuple, n: Int): Any = xs match {
  case (x, _) if n == 0 => x
  case (_, xs1) if n > 0 => nthDynamic(xs1, n - 1)
  case () => throw new IndexOutOfBoundsException
Typo? Shouldn't the `if n > 0` be removed from the second case? I don't see how the expansion below would work otherwise (the `if`s prevent the first two cases from applying all along).
Alternatively, the third case could be changed to `case _ => throw …` (and `nthDynamic(as, -1)` would be expanded straightaway into `throw …`, IIUC).
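For concreteness, the first alternative suggested above would make the excerpt read as follows (a sketch in the proposal's syntax; which fix was actually applied is not shown here):

```scala
transparent def nthDynamic(xs: Tuple, n: Int): Any = xs match {
  case (x, _) if n == 0 => x
  case (_, xs1)         => nthDynamic(xs1, n - 1)   // guard `if n > 0` removed
  case ()               => throw new IndexOutOfBoundsException
}
```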
Fixed, thanks!
transparent def concat(xs: Tuple, ys: Tuple): Tuple = {
  erased val r = concatTyped(xs, ys)
  concatImpl(xs, ys).asInstanceOf[r.Type]
Instead of using `Typed`, wouldn't it be possible to add a mechanism to obtain the type inferred for a value? Something like
val n = 2
val m: typeOf(n) = 4
`typeOf(…)` would not provide the singleton type, but the "widened" type. So here `typeOf(n)` is just `Int`.
That way, `Typed` would be unnecessary, and the example above could become
transparent def concatErased(xs: Tuple, ys: Tuple): Tuple = xs match {
  case () => ys
  case (x, xs1) => (x, concatErased(xs1, ys))
}
transparent def concat(xs: Tuple, ys: Tuple): Tuple = {
  erased val r = concatErased(xs, ys)
  concatImpl(xs, ys).asInstanceOf[typeOf(r)]
}
That looks like it would be a useful addition. There's a very subtle distinction here. If we allow `typeOf(t)` where `t` is an arbitrary term, we get #4671, which means full dependent typing. One consequence is that all term rewriting rules then have to be closed under substitution, which makes them harder to implement and restricts their applicability. But if we restrict `typeOf(v)` to singletons, it seems we avoid that complexity, but still get some part of the power of #4671.
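To make the distinction concrete, a small sketch using the hypothetical `typeOf` syntax from the comment above (not implemented anywhere):

```scala
val xs: (Int, Int) = (1, 2)

// Restricted form: typeOf applied to a stable value just reads off the type
// the value already has, so typeOf(xs) here is (Int, Int).
val ys: typeOf(xs) = (3, 4)

// General form: typeOf applied to an arbitrary term would require typing the
// term itself inside the type, i.e. the full dependent-typing route of #4671:
// val zs: typeOf(concatImpl(xs, ys)) = ...
```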
Just one question while reading this very interesting proposal:
@mandubian The proposal here is not yet implemented. In particular the implementation of match is missing.
@odersky Yes, I've compiled the branch & tested a few cases and reached limitations quite fast & that's why I was asking ;) I'm seriously thinking about experimenting with those ideas as soon as possible on Matrix Calculus (& naturally later a typesafe & staged Machine/Deep Learning DSL)... I was wondering about
thks
I think that should be covered by GADT matching, but it might be that we have to extend GADTs for it. I have to check.
put in backticks. This is doable, but feels a bit unnatural. To improve on the syntax, we allow an alternative syntax that prefixes type variables by `type`:
```scala
case Convertible[T, type U], Printable[U] => ...
Will an analogous ability be provided for matching on values where the same value is bound in two places? (e.g. `tuple2 match { case (x, x) => }`)
Also, matching two patterns is something that is useful in value pattern matches too. It would be nice to have support for that. Currently one has to use something like
object & {
  def unapply[A](a: A): Option[(A, A)] = Some((a, a))
}
thing match { case FirstPattern(a) & SecondPattern(b, c) => }
It would be nice if (1) there would be something like that out of the box, and (2) there would be symmetry between however it is done with type patterns and with value patterns.
That's definitely worth some thought!
@odersky Why not generalize `@` so it can take a pattern on the left-hand side, and not just an identifier? Then we can use it both for values:
thing match { case FirstPattern(a) @ SecondPattern(b, c) => }
And for types:
case Convertible[T, type U] @ Printable[U] => ...
Adding to the pattern-matching wish-list, it would be super useful and powerful if we could have repeated occurrences of pattern variables, with these properties (see the sketch after this list):

- a pattern variable occurring several times in an 'AND' pattern gets the type of the first occurrence, and all extracted occurrences are checked for equality on extraction. For example `case (Foo(x), Bar(x))` is equivalent to `case (Foo(x), Bar(x0)) if x == x0`.
- a pattern variable occurring several times in an 'OR' pattern gets the least upper bound type of the occurrences. For example `case Foo(x) | Bar(x)` extracts `x: A | B` if `Foo.unapply` extracts an `A` and `Bar.unapply` extracts a `B`.
- if a pattern variable typed `T` does not occur in all branches of an 'OR' pattern, it gets wrapped in an option as `Option[T]` --- or alternatively, to make it more consistent, it gets type `T | Unit` or even `T | null` if/when Scala supports `null`-checking.
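For comparison, a small runnable sketch of how the first point has to be encoded in current Scala, and why the 'OR' points need new language support (illustration only, not part of the proposal):

```scala
case class Foo(s: String)
case class Bar(s: String)

def sameLabel(p: (Foo, Bar)): Boolean = p match {
  // Wished-for form:  case (Foo(x), Bar(x)) => true
  // Today the repeated variable must be renamed and compared in a guard:
  case (Foo(x), Bar(x0)) if x == x0 => true
  case _                            => false
}

// A wished-for OR-pattern binding such as `case Foo(x) | Bar(x) => x`
// is currently rejected outright: pattern alternatives may not bind variables.
```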
By contrast, the new `implicit match` construct makes implicit search available in a functional context. To solve the problem of creating the right set, one would use it as follows:
```scala
transparent def setFor[T]: Set[T] = implicit match {
This seems very ad hoc. It seems like there's a lot of potential for generalization.
By the way, would `setFor` typically be marked `implicit` itself?
Anyway, couldn't this somehow be written without `implicit match` if we had some (`transparent`) way to check whether an implicit exists? I would start by thinking of how to compose that with existing code.
Something like
transparent def maybeImplicit[T](implicit t: T = null): Option[T] = Option(t)
transparent def setFor[T] =
  maybeImplicit[Ordering[T]] match {
    case None => new HashSet[T]
    case Some(ordering) =>
      implicit val o = ordering
      new TreeSet[T]
  }
So, it seems to me that such a construct is possible without introducing `implicit match`. Perhaps it would force `transparent` unrolling to work a little harder. Also, it would be nice if we would allow the `implicit` modifier on pattern bindings -- then you could do `case Some(implicit ordering) =>` (or even `case Some(implicit _) =>`!) But that would be useful generally too. Also, I don't know if `transparent` can unroll HOFs, but if it could then the above could be written with map/getOrElse or with fold.
That said, pattern matching on types seems like it could be a really nice feature, but again, I would hope it could be done in a general way, not tied to implicit search. At least, implicit search shouldn't "squat" on syntax that could be left open to the potential of such a feature.
For this to work, `maybeImplicit` has to be "magic". In the end we need some way to try to summon an implicit without causing a compile-time error if none is found. So, given that some new compiler-supported machinery is needed, `implicit match` is more pleasant to use than `maybeImplicit`. But I agree `maybeImplicit` is simpler.
@odersky in which ways does `maybeImplicit` need to be magic?
FWIW, the following currently works:
scala> def maybeImplicit[T](implicit t: T = null): Option[T] = Option(t)
maybeImplicit: [T](implicit t: T)Option[T]
scala> maybeImplicit[String]
res0: Option[String] = None
scala> implicit val str = "ok"
str: String = ok
scala> maybeImplicit[String]
res4: Option[String] = Some(ok)
Not sure how wide-ranging the discussion on this PR is supposed to be -- the title ("Functional Typelevel Programming in Scala") is pretty broad...
Maybe conditional types? TypeScript has a lot of interesting machinery for sure.
@nafg perhaps you're thinking of $Call in Flow: https://flow.org/en/docs/types/utilities/#toc-call
@berrytj I wasn't (haven't looked much at Flow), but that's very interesting. Seems pretty similar to
$Call looks more like regular macro invocation to me. You run arbitrary, untrusted user code. By contrast, For
In order to work with PMP, we need stronger restrictions on transparent parameters than were outlined previously. Specifically, arguments to transparent parameters must be constant expressions. To stay uniform between value definitions and parameters we now impose the same restriction on value definitions. The use case of `transparent` definitions with paths as right hand side has to be dropped for consistency. Also, we allow transparent modifiers only in transparent methods. Anywhere else they just contribute to puzzlers, I fear.
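A small sketch of what that restriction would mean at call sites (hypothetical example in the proposal's syntax, using a `transparent` parameter inside a transparent method as described above):

```scala
transparent def power(x: Double, transparent n: Int): Double =
  if (n == 0) 1.0 else x * power(x, n - 1)

def ok(x: Double): Double = power(x, 3)      // fine: 3 is a constant expression
// def bad(x: Double, n: Int) = power(x, n)  // rejected: n is not a constant expression
```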
The latest iteration of the doc is in #4616, which now implements all of the proposal.
I broke this out from #4616, so that language design and implementation can be discussed separately.