scala · vincenzobaz · Nov 30, 2020
diff --git a/_posts/2020-11-30-flexible-and-safe-tuples-in-scala3.md b/_posts/2020-11-30-flexible-and-safe-tuples-in-scala3.md
@@ -0,0 +1,383 @@
+---
+layout: blog-detail
+post-type: blog
+by: Vincenzo Bazzucchi, Scala Center
+title: Flexible and safe tuples in Scala 3
+---
+
+# Flexible and safe tuples in Scala 3
+
+Tuples have been revisited and completely rethought in Scala 3.
+They are more **flexible**, more dynamic and support a **wider range of operations**.
+This is enabled by new and powerful language features.
+
+In this post we will explore the new capabilities of tuples before
+looking under the hood to learn how the improvements in the Scala 3 type system,
+in particular *dependent types* and *match types*, enable implementing type safe
+operations on tuples.
+
+# The basics: what are tuples?
+
+In the Python programming language, tuples are a simple concept:
+they are immutable collections of objects. As such, they are opposed
+to lists, which are mutable.
+
+In Scala both `List`s and tuples are immutable, so why do we care
+about tuples?
+
+Because Scala is a statically typed programming language, the difference between
+lists and tuples lies in the type. Lists are *homogeneous* collections while
+tuples are *heterogeneous*. In simpler terms, a tuple keeps
+the type of each element, while a list retains a single common type
+for all the elements.
+
+This is better explained with an example:
+```scala
+scala> List(1, "2", 3.0, List(4))
+val res0: List[Any] = List(1, 2, 3.0, List(4))
+```
+We see that the compiler tries to infer a common supertype for the elements of the list,
+in this case `Any`.
+
+If we do the same with tuples, the elements maintain their individual and specific type:
+```scala
+scala> (1, "2", 3.0, List(4))
+val res0: (Int, String, Double, List[Int]) = (1, 2, 3.0, List(4))
+```
+This behavior is desirable in many cases, for example when
+we want a function to return two or more values having different types.
+
+# How are tuples better in Scala 3?
+
+## Size limit
+
+Probably the best-known limitation of tuples in Scala 2 was the
+limit of 22 elements.
+
+```scala
+scala> (1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23)
+    error: tuples may not have more than 22 elements, but 23 given
+```
+
+In Scala 3 the previous tuple is perfectly legal.
+
+## Element accessor
+
+The usual way to retrieve an element of a tuple in Scala 2 was to
+use the (1-based) `._i` accessor. For example:
+
+```scala
+("First", "Second")._2 // "Second"
+```
+
+In Scala 3, we can use the `apply` method with a 0-based argument:
+
+```scala
+("First", "Second")(1) // "Second"
+```
+
+As most indexes are 0-based in Scala, this brings more consistency
+to codebases. It also provides more flexibility. We can, for example,
+*iterate* over any tuple to print each element on a line:
+
+```scala
+val someStuff = (1, "2", 3.0, List(4))
+for (i <- 0 until someStuff.size)
+  println(someStuff(i))
+```
+
+When the argument provided to `apply` is a literal, it is checked at compile time. This means that
+**`someStuff(-1)` or `someStuff(4)` will result in a compilation error**.
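+
+As a minimal sketch of what this check provides (the names `second` and `last` are
+ours, chosen for illustration), a literal index also gives each element its precise type:
+
+```scala
+val someStuff = (1, "2", 3.0, List(4))
+
+// With a literal index the element type is known statically,
+// so no cast or pattern match is needed.
+val second: String = someStuff(1)
+val last: List[Int] = someStuff(3)
+
+// someStuff(4) // rejected at compile time: there is no element at index 4
+```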
+
+Iterating over a tuple was possible in Scala 2 with `productIterator`, although this
+produced a value of type `Iterator[Any]`, which means that we had to pattern
+match on, or cast, the elements.
+
+This brings us to the conceptual change that we will explore in the
+next section: tuples become a collection of data that we can manipulate
+and program against.
+
+## New operations
+
+A lot of operations are now available on tuples out of the box!
+
+In Scala 2, many of these were possible only with third-party libraries such
+as Shapeless, which could be complicated for new
+Scala developers.
+
+These operations are now available in the standard library. They
+are safe and preserve the individual types of each element.
+
+The first one was already introduced: `.size` retrieves the number
+of elements in the tuple.
+
+### Adding elements to a tuple
+
+We can add an element to a tuple using the `*:` operator,
+which is very similar to the `::` operator available on `List`.
+
+```scala
+val fourWeirdElements = (1, "2", 3.0, List(4))
+val evenWeirder = 1 *: "2" *: 3.0 *: List(4) *: Tuple()
+
+val thisIsTrue = fourWeirdElements == evenWeirder // true
+
+val fiveWeirdElements = Set(0) *: evenWeirder // (Set(0),1,2,3.0,List(4))
+```
+
+When we use a tuple as the left operand of `*:`, it is prepended as a single element:
+```scala
+val notGood: ((Int, Int), Int, Int) = (1, 2) *: (3, 4) // ((1, 2), 3, 4)
+```
+So how can we concatenate two tuples?
+The `++` operator is there exactly for this purpose:
+```scala
+val better: (Int, Int, Int, Int) = (1, 2) ++ (3, 4) // (1, 2, 3, 4)
+```
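+
+The `*:` operator also exists at the type level and as a pattern. The sketch below
+assumes the standard library definitions `*:` and `EmptyTuple`; the value names are illustrative:
+
+```scala
+// (Int, String) is shorthand for Int *: String *: EmptyTuple
+val pair: Int *: String *: EmptyTuple = (1, "2")
+val same: (Int, String) = 1 *: "2" *: EmptyTuple
+
+// A tuple can be split into its head and tail with the `*:` pattern
+val (first, rest) = pair match
+  case head *: tail => (head, tail)
+```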
+
+### Removing elements from a tuple
+
+Similarly to operators available on lists, we can retrieve a subset of
+a tuple. Here is a quick overview:
+
+ - `drop` discards the first *n* elements of the tuple, returning
+   an empty tuple when the tuple has fewer than *n* elements:
+```scala
+(1, "2", 3.0, List(4)).drop(2) // (3.0, List(4))
+(1, "2", 3.0, List(4)).drop(10) // ()
+```
+ - `take` retrieves the first *n* elements of the tuple, returning the original
+   tuple when it has fewer than *n* elements:
+```scala
+(1, "2", 3.0, List(4)).take(2) // (1, "2")
+(1, "2", 3.0, List(4)).take(10) // (1, "2", 3.0, List(4))
+```
+ - `splitAt` creates two tuples: the first contains the first *n* elements
+   of the original tuple and the second contains the remaining elements:
+```scala
+(Set(0), 1, "2", 3.0, List(4)).splitAt(3) // ((Set(0), 1, "2"), (3.0, List(4)))
+```
+
+### Transforming tuples
+
+Again, similarly to conversion methods on collections, it is possible to
+transform a tuple into a collection.
+
+We have to pay attention to the type of the resulting collection.
+Let's start with the simple case: as its name suggests,
+`toArray` produces an array. The type of its elements will always be
+`AnyRef`. This makes the method easy to reason about, although it
+forgets the type of the elements.
+It is also possible to use `.toIArray`, which has exactly the same behavior
+but produces an `IArray`, where the `I` stands for immutable.
+```scala
+scala> (1, "2").toArray
+val res0: Array[AnyRef] = Array(1, 2)
+```
+
+I believe however that the most interesting conversion is `toList`
+which produces a `List[U]` where `U` is the [union type](https://dotty.epfl.ch/docs/reference/new-types/union-types.html)
+of the types of the elements of the tuple.
+That is:
+
+```scala
+val ls: List[Int | String | Double] = (1, "2", 3.0).toList
+```
+This is interesting because the type information is partially preserved.
+We can iterate over `ls` and use pattern matching to apply the
+correct transformation, knowing exactly how many and which cases to
+handle:
+
+```scala
+// The compiler tells us it cannot help with checking:
+// Non-exhaustive match
+(1, "2").toArray.map {
+  case i: Int => (i * 2).toString
+  case j: String => j
+}
+
+// The code compiles without errors or warnings:
+// the compiler verified that we handled all possible cases
+(1, "2").toList.map {
+  case i: Int => (i + 2).toString
+  case j: String => j
+}
+```
+
+We can also transform a tuple by applying a function to each element.
+The method, similarly to what we are used to with collections, is called
+`map`. The difference is that `map` on collections (or functors)
+expects an `f: A => B` where `A` is the type of the elements
+of the collection.
+With tuples each element has a different type!
+How can we generalize the concept of a function whose argument type is
+not fixed?
+We can use a **`PolyFunction`**. This is a more advanced syntax:
+
+```scala
+val options: (Option[Int], Option[Char], Option[String], Option[Double]) =
+  (1, 'a', "dog", 3.0).map[[X] =>> Option[X]]([T] => (t: T) => Some(t))
+```
+You can read more about `PolyFunction`s in the Scala 3 reference documentation on polymorphic function types.
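+
+As a rough sketch of the syntax (the name `wrap` is ours), a polymorphic function
+can be stored in a value, applied on its own and then passed to `map`:
+
+```scala
+// A function value that works for every type T, not for one fixed type
+val wrap: [T] => T => Option[T] = [T] => (t: T) => Some(t)
+
+val one: Option[Int] = wrap(1)
+val dog: Option[String] = wrap("dog")
+
+// The same value can be applied to each element of a tuple
+val options = (1, 'a', "dog").map[[X] =>> Option[X]](wrap)
+```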
+
+### Zipping tuples
+
+The last operation lets us pair the elements of two tuples.
+As you might have guessed, it is called `zip`. If the two tuples have
+different lengths, the extra elements of the longer one will be
+ignored:
+
+```scala
+val numbers = (1, 2, 3, 4, 5)
+val letters = ('a', 'b', 'c')
+
+numbers.zip(letters) // ((1, 'a'), (2, 'b'), (3, 'c'))
+```
+
+# Under the hood: new type operators of Scala 3
+
+I believe that the core new feature that allows such a flexible
+implementation of tuples is **match types**.
+I invite you to read more about them [here](http://dotty.epfl.ch/docs/reference/new-types/match-types.html).
+
+Let's see how we can implement the `++` operator using this powerful
+construct. We will naively call our version `concat`.
+
+DISCLAIMER: This section is a bit more advanced!
+
+## Defining tuples
+
+First let's define our own tuple:
+
+```scala
+enum Tup:
+  case EmpT
+  case TCons[H, T <: Tup](head: H, tail: T)
+```
+
+That is, a tuple is either empty, or an element `head` which precedes
+another tuple. Using this recursive definition we can create
+a tuple as follows:
+
+```scala
+import Tup._
+
+val myTup = TCons(1, TCons(2, EmpT))
+```
+It is not very pretty, but it can be easily adapted to provide
+the same ease of use as the previous examples.
+To do so we can use another Scala 3 feature, [extension methods](http://dotty.epfl.ch/docs/reference/contextual/extension-methods.html):
+
+```scala
+import Tup._
+
+extension [A, T <: Tup] (a: A) def *: (t: T): TCons[A, T] =
+  TCons(a, t)
+```
+So that we can write:
+
+```scala
+1 *: "2" *: EmpT
+```
+
+## Concatenating tuples
+
+Now let's focus on `concat`, which could look like this:
+```scala
+import Tup._
+
+def concat[L <: Tup, R <: Tup](left: L, right: R): Tup =
+  left match
+    case EmpT => right
+    case TCons(head, tail) => TCons(head, concat(tail, right))
+```
+
+Let's analyze the algorithm line by line.
+`L` and `R` are the types of the left and right tuples. We require
+them to be subtypes of `Tup` because we want to concatenate tuples.
+Why not use `Tup` directly? Because this way we receive more specific
+information about the two arguments.
+Then we proceed recursively by case: if the left tuple is empty,
+the result of the concatenation is just the right tuple.
+Otherwise the result is the current head followed by the result of
+concatenating the tail with the other tuple.
+
+If we test the function, it seems to work:
+```scala
+val left = 1 *: 2 *: EmpT
+val right = 3 *: 4 *: EmpT
+
+concat(left, right) // TCons(1,TCons(2,TCons(3,TCons(4,EmpT))))
+```
+
+So everything seems good. However, we can have more safety.
+For instance, the following code also compiles without complaint:
+```scala
+def concat[L <: Tup, R <: Tup](left: L, right: R): Tup = left
+```
+Because the declared return type is just `Tup`, nothing else is checked.
+This means that the function can return an arbitrary tuple:
+the compiler cannot check that the returned value consists of the concatenation
+of the two tuples. In other words, we need a return type stating that
+the result contains all the element types of `left` followed
+by all the element types of `right`.
+
+Can we make it so that the compiler verifies that we are indeed
+returning a tuple consisting of the correct elements?
+
+In Scala 3 it is now possible, without requiring external libraries!
+
+## A new type for the result of `concat`
+
+We know that we need to focus on the return type. We can define this return
+type exactly as we have just described it.
+Let's call this type `Concat` to mirror the name of the function.
+
+```scala
+type Concat[L <: Tup, R <: Tup] <: Tup = L match
+  case EmpT.type => R
+  case TCons[h, t] => TCons[h, Concat[t, R]]
+```
+
+You can see that the implementation closely follows the one
+above for the method.
+To use it we need to adapt the method implementation slightly and
+change its return type:
+
+```scala
+def concat[L <: Tup, R <: Tup](left: L, right: R): Concat[L, R] =
+  left match
+    case _: EmpT.type => right
+    case cons: TCons[head, tail] => TCons(cons.head, concat(cons.tail, right))
+```
+
+Here we use a combination of match types and a form of dependent types called
+*dependent match types*. There are some quirks, as you might have noticed:
+lower-case type names in patterns are type variables, and we have to use typed patterns
+instead of destructuring the object. I think however that this implementation is extremely concise and readable.
+
+Now the compiler will prevent us from making mistakes:
+
+```scala
+def malicious[L <: Tup, R <: Tup](left: L, right: R): Concat[L, R] = left
+// This does not compile!
+```
+
+We can use an extension method to allow users to write `(1, 2) ++ (3, 4)` instead
+of `concat((1, 2), (3, 4))`. I believe that you now know how to do this too.
+
+We can use the same approach for other functions on tuples. I invite you to have
+a look at the source code of the standard library to see how the other operators are
+implemented.
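+
+For instance, the standard library defines a match type `Tuple.Concat` that plays the
+role of our `Concat`. A small sketch (the aliases `Front` and `Back` are illustrative)
+shows the compiler reducing it:
+
+```scala
+type Front = (Int, String)
+type Back = (Double, Boolean)
+
+// Compiles only if the match type reduces to the expected tuple type
+val evidence = summon[Tuple.Concat[Front, Back] =:= (Int, String, Double, Boolean)]
+
+// The result of `++` conforms to that type
+val joined: Tuple.Concat[Front, Back] = (1, "2") ++ (3.0, true)
+```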
+
+# Conclusion
+
+We had a look at the new operations that are available on tuples in Scala 3 and at
+how a more flexible type system provides the fundamental tools to implement safer
+and more readable code.
+
+This shows how advanced type combinators in Scala 3 make it possible to create
+APIs that benefit developers no matter their level of proficiency in the language:
+an expert-oriented feature such as dependent match types allows us to build a safe
+and simple operation such as tuple concatenation.
+