Add MapView and SeqView #436

julienrf · 2018-02-08T17:00:46Z

My goal with this PR was to fix #160. In short, the problem was that, currently, the view of a Map or a Seq has type View, and doesn’t support common Map and Seq operations (such as get or reverse, respectively). This is an incompatibility compared to the old collections.

This PR adds a SeqView type and a MapView type, supporting Seq operations and Map operations, respectively. However, their usefulness is quite limited: if you call, say, filter on a SeqView, you end up with a View, which no longer supports Seq operations (and the same goes for MapView). More generally, most transformation operations give back a View, except for a few.
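
For illustration, the split between SeqView and View can be sketched against the Scala 2.13 standard library, which grew out of this strawman. This uses `scala.collection` rather than the `strawman.collection` package touched by this PR, and the value names are made up for the example.

```scala
import scala.collection.{SeqView, View}

// A view of a Seq is a SeqView, so Seq operations such as `reverse` are available.
val seqView: SeqView[Int] = List(1, 2, 3, 4).view
val reversed: List[Int] = seqView.reverse.toList

// `filter` only promises a plain View: Seq operations are gone from the static type.
val odds: View[Int] = seqView.filter(_ % 2 == 1)
val oddsList: List[Int] = odds.toList
```

Calling `odds.reverse` would not compile here, because the static type is View and not SeqView.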

This behaviour differs from the old collections, where transformation operations always return the same collection type (even when applied to views). We could replicate that behaviour for the sake of compatibility, but I don’t think that would be a good idea: to do so we would (sometimes) have to force view elements, which would be surprising because, as a user, I expect transformations applied to views to always be lazy.
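
The laziness expectation can be made concrete with a small sketch. It assumes the Scala 2.13 standard library (not the strawman package of this PR), and `lazinessDemo` and `calls` are hypothetical names used only to count how often the mapping function runs.

```scala
// Returns (calls before forcing, forced result, calls after forcing).
def lazinessDemo(): (Int, List[Int], Int) = {
  var calls = 0
  // Building the view does not apply the function to any element.
  val doubled = List(1, 2, 3, 4).view.map { x => calls += 1; x * 2 }
  val callsBeforeForcing = calls
  // Forcing two elements applies the function only to those two elements.
  val firstTwo = doubled.take(2).toList
  (callsBeforeForcing, firstTwo, calls)
}
```

The counter stays at zero until `toList` forces the view, which is the behaviour a user would expect every view transformation to preserve.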

That being said, even the limited SeqView that I introduced in this PR can be surprising because it supports reverse, which returns a View although its implementation necessarily evaluates all the underlying collection’s elements. Actually, in our current View, we also have operations that return views although they fully compute an intermediate collection: takeRight, dropRight, grouped, sliding, groupBy, scanRight, tails, inits.
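
As a rough sketch of that surprise, again assuming the Scala 2.13 standard library and using the hypothetical names `takeRightDemo` and `calls`: takeRight on a view still returns a view, yet producing even one element means walking the whole underlying collection.

```scala
// Returns (forced result, number of times the mapping function ran).
def takeRightDemo(): (List[Int], Int) = {
  var calls = 0
  val doubled = List(1, 2, 3, 4).view.map { x => calls += 1; x * 2 }
  // Still a view, but traversing it has to consume every source element
  // to find out which one is last.
  val last = doubled.takeRight(1).toList
  (last, calls)
}
```

Here the mapping function runs for all four elements even though only the last one ends up in the result.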

I wish we could make it really clear in the types that some transformation operations won’t create intermediate collections and some others will. But that would complicate the Ops traits a lot more. Instead, I suggest that we try to limit the operations provided by views so that users have less chance of accidentally creating intermediate collections when transforming views. Therefore, I would not try to replicate the old behaviour that makes SeqView#filter return a SeqView (for instance).

While working on that I also made the following changes:
- add a test checking that calling .view on a view is effectively a no-op (that wasn’t the case!),
- turn all view implementations into classes instead of case classes (the motivation is that it should reduce the bytecode size, since we don’t use any feature of case classes),
- optimize knownSize for small Map implementations.

The code I pushed doesn’t compile with Dotty. Actually my understanding is that 9f4f6f6 should be rejected by scalac but is not (it is rejected by dotc, though). I tried to fix the issue but my fix makes dotc crash:

java.lang.AssertionError: assertion failed: Cyclic reference while unpickling definition at address 110 in unit collection-strawman/collections/src/main/scala/strawman/collection/MapView.scala
        at dotty.DottyPredef$.assertFail(DottyPredef.scala:36)
        at dotty.tools.dotc.core.tasty.TreeUnpickler$TreeReader.readIndexedDef(TreeUnpickler.scala:637)
        at dotty.tools.dotc.core.tasty.TreeUnpickler$Completer.complete(TreeUnpickler.scala:91)
        at dotty.tools.dotc.core.SymDenotations$SymDenotation.completeFrom(SymDenotations.scala:246)
        at dotty.tools.dotc.core.SymDenotations$SymDenotation.completeInfo$1(SymDenotations.scala:209)
        at dotty.tools.dotc.core.SymDenotations$SymDenotation.info(SymDenotations.scala:211)

[update] I’ve just tried with the latest dotty nightly and it seems that the code compiles. However there is another error due to scala/scala3#3965

lrytz · 2018-02-09T15:48:39Z

collections/src/main/scala/strawman/collection/SeqView.scala

+
+import scala.{Int, IndexOutOfBoundsException}
+
+trait SeqView[+A] extends SeqOps[A, View, View[A]] with View[A] {


Could it extend SeqOps[A, SeqView, SeqView[A]]?

julienrf · 2018-02-09

The problem is that, if we do that, transformation operations always have to return a SeqView, and sometimes (e.g. filter) this is not possible without computing the entire result of the transformation operation, which would defeat the purpose of using views.

lrytz · 2018-02-09T15:51:11Z

I'll have to think about it more, it's a big topic. We'd have to do more to support the example here http://docs.scala-lang.org/overviews/collections/views.html

scala> def negate(xs: collection.mutable.Seq[Int]) = for (i <- 0 until xs.length) xs(i) = -xs(i)
negate: (xs: scala.collection.mutable.Seq[Int])Unit

scala> val a = Array(1,2,3)
a: Array[Int] = Array(1, 2, 3)

scala> negate(a.view.slice(2,3))

scala> a
res13: Array[Int] = Array(1, 2, -3)

Could SeqView extend Seq (not just SeqOps)?

julienrf · 2018-02-09T16:03:11Z

If SeqView extends Seq then it has to be comparable with other Seq collections. Maybe we should always return false? We couldn’t find a sensible behaviour for comparing views.

lrytz · 2018-02-09T19:33:49Z

Another data point: in 2.12, filterKeys and mapValues on maps have return type Map; in SortedMap they return SortedMap.

julienrf · 2018-02-12T09:22:19Z

@lrytz good point. Currently neither SortedSet nor SortedMap has a “sorted” view.

lrytz · 2018-02-12T12:08:09Z

There are conflicting goals between laziness and providing "view" functionality.

I think in the past, views were often mentioned as a way to fuse operations. But 2.12 views have bugs, and you need to know which operations are lazy and which are forcing; that even depends on the collection/view type. Your example of filter is very good:

// SeqView
scala> (1 to 10).toList.view.filter(x => {println(s"filter $x"); x%2==0}).head
filter 1
filter 2
res0: Int = 2

// IndexedSeqView
scala> (1 to 10).toArray.view.filter(x => {println(s"filter $x"); x%2==0}).head
filter 1
filter 2
...
filter 9
filter 10
res1: Int = 2

As far as I know, iterators are nowadays recommended for fusing (?), as they provide a restricted interface that makes accidental forcing less likely.

I think this is by far the most common use case: TargetColl.from(sourceColl.iterator.[..].[..].[..]), and iterators work well.
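
A minimal sketch of that iterator-based pattern, assuming the Scala 2.13 standard library, with Vector standing in for TargetColl and made-up value names:

```scala
// filter and map are fused into a single pass over the source;
// only the final Vector is materialised.
val source: List[Int] = List(1, 2, 3, 4, 5, 6)
val target: Vector[Int] = Vector.from(source.iterator.filter(_ % 2 == 0).map(_ * 10))
```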

The less common use case is actually providing a transformed view onto some data. Maybe for this use case, providing a rich interface is more important than trying to ensure laziness? Not sure.

I wonder how much code we break if we go for a simpler, but safer interface. I just realized another problem in 2.12:

scala> val a = (0 to 10).toArray
a: Array[Int] = Array(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

scala> val av = a.view.filter(x => {println(s"filter $x"); x > 5})
av: scala.collection.mutable.IndexedSeqView[Int,Array[Int]] = SeqViewF(...)

scala> av.head // builds the index
filter 0
...
filter 10
res0: Int = 6

scala> a(6) = 1

scala> av.head // old index
res2: Int = 1

In reality, with mutable data, people would not use views but something like scala.rx.

In any case, there is a need for clear user recommendations.

lrytz · 2018-02-12T13:36:26Z

Historic thread with lots of good points: https://groups.google.com/forum/#!topic/scala-debate/M8s8FmASL8Y.

@julienrf can you summarize / reference discussions about views that took place during the strawman design period?

julienrf · 2018-02-12T15:51:01Z

Actually, there hasn’t been much discussion about views. There was that issue, #13, which digressed in several directions, but whose main takeaway is that mutable views are probably too confusing to be useful. We also had meeting discussions about views and equality. We couldn’t find a sensible definition of equality for views, especially because equality between collections is not defined at the level of Iterable but only in Seq, Set and Map. A practical consequence of this is that views cannot extend Seq, Set or Map.
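
To illustrate what that means in practice, here is a small sketch against the Scala 2.13 standard library (an assumption; the strawman code in this PR may differ in detail). The value names are hypothetical.

```scala
val xs = List(1, 2, 3)

// Views do not define structural equality, so two views over the same data
// are compared by reference and are not `==`.
val viewsEqual: Boolean = xs.view == xs.view

// Element-wise comparison has to be requested explicitly ...
val sameContents: Boolean = xs.view.sameElements(xs.view)

// ... or done after converting the view back to a strict collection.
val strictEqual: Boolean = xs.view.toList == xs
```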

julienrf · 2018-02-16T09:10:27Z

To make progress on this I propose that we don’t enrich views with transformation operations that will force the evaluation of their elements (so, view.filter(p) will only return a View, it won’t be able to return a SeqView). The main goal here is to always hold the promise that transformation operations applied to views should not evaluate the underlying collection nor build an intermediate collection.

This will be a source of backward incompatibility. However, my guess is that so far views weren’t heavily relied on, so hopefully we won’t break many things. Then, we will see when we build the community build whether this guess is confirmed, and we will consider whether to fix these incompatibilities.

What do you think?

SethTisue · 2018-02-16T20:35:57Z

However, my guess is that so far views weren’t heavily relied on, so hopefully we won’t break many things. Then, we will see when we build the community build whether this guess is confirmed, and we will consider whether to fix these incompatibilities.

agree

lrytz · 2018-02-16T20:54:58Z

If views are their own hierarchy and not subtypes of collection types, I wonder what they add to iterators. I understand that views are multiple-use/traversal. But what are actual use cases that they enable (in the current design / with this PR)?

If I write some code that acts on a collection type, I cannot pass in a view; that seems to be the main selling point in http://docs.scala-lang.org/overviews/collections/views.html. So I have to write code specifically for views, which then doesn't work for ordinary collection types. Maybe that tradeoff is acceptable? At least one can still use the view-based style, which simplifies implementing certain algorithms.

lrytz · 2018-02-19T17:08:42Z

Discussion summary

- Views are Iterable, they feel more like real collections; people might not like to use iterators because of their iterative nature.
- Views provide more / different operations (TODO: review this. Enforce consistency between the available operations of IterableOnce and Iterable #465).
- Views can be practical as they are multiple-use (they update if the underlying collection is mutable). The cost of building the view is paid only once.
- For views, the return types should make it clear if an operation traverses / forces or not. Operations that return views should never force.
- We need better documentation (Document better the differences between views, iterators and lazylists #466).
- Refined subtypes (SeqView, MapView) are useful as they provide more operations. For example, someMap.filterKeys(p).values. If filterKeys returns just a View[(K, V)], more hops are required for users.
- There's scope for fusing operations in views, and a little of this is done already (https://github.com/scala/collection-strawman/blob/v0.9.0/collections/src/main/scala/strawman/collection/View.scala#L125). This would not be possible with iterators, as you'd have to take into account their state (is it still un-evaluated?).

So our conclusion was that we continue with this PR and keep views in their separate hierarchy.
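
The someMap.filterKeys(p).values point from the summary can be sketched as follows, assuming the MapView that ships in the Scala 2.13 standard library (the strawman version in this PR may differ in detail). The map contents and value names are invented for the example.

```scala
import scala.collection.MapView

val ages: Map[String, Int] = Map("ann" -> 31, "bob" -> 17, "cid" -> 45)

// filterKeys on a MapView stays a MapView, so Map operations such as
// `values` and `get` remain available without extra conversions.
val withoutBob: MapView[String, Int] = ages.view.filterKeys(_ != "bob")
val remainingAges: List[Int] = withoutBob.values.toList.sorted
val bobLookup: Option[Int] = withoutBob.get("bob")
```

With a plain View[(K, V)] as the result type, the `values` and `get` calls would first require converting back to a Map.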

julienrf · 2018-02-20T10:29:26Z

collections/src/main/scala/strawman/collection/IndexedSeq.scala


-  /** A collection containing the last `n` elements of this collection. */
-  override def takeRight(n: Int): C = fromSpecificIterable(view.takeRight(n))
+  override def reverse: C = fromSpecificIterable(new IndexedView.Reverse(this))


The difference between view.takeRight and IndexedView.TakeRight (line 39) is that the former builds a view of a view, whereas the latter builds just one view. The idea is to remove one level of indirection. If the JIT would have optimized that anyway, then we should switch to view.takeRight instead.

julienrf · 2018-02-20T10:34:24Z

collections/src/main/scala/strawman/collection/immutable/Vector.scala

-  override def slice(from: Int, until: Int): Vector[A] =
-    take(until).drop(from)
-
-  override def splitAt(n: Int): (Vector[A], Vector[A]) = (take(n), drop(n))


Removed because this is exactly the default implementation.

julienrf · 2018-02-20T10:35:05Z

collections/src/main/scala/strawman/collection/immutable/Vector.scala

@@ -171,11 +171,6 @@ final class Vector[+A] private[immutable] (private[collection] val startIndex: I
    dropRight(1)
  }

-  override def slice(from: Int, until: Int): Vector[A] =
-    take(until).drop(from)


Removed because IndexedView.Slice does the same job without creating an intermediate Vector (like take(until) does)

julienrf · 2018-02-20T14:02:49Z

@szeiger @lrytz do you want to review it again or can I merge it?

lrytz · 2018-02-20T15:19:32Z

Give me some time to review it still, I haven't yet actually; so far I just handwaved around views in general :-)

lrytz

looks good!

lrytz · 2018-02-21T10:20:18Z

collections/src/main/scala/strawman/collection/IndexedView.scala

+import scala.Predef.{<:<, intWrapper}
+
+/** View defined in terms of indexing a range */
+trait IndexedView[+A] extends IndexedSeqOps[A, View, View[A]] with SeqView[A] { self =>


Should we call it IndexedSeqView?

julienrf · 2018-02-21

I agree that it would make more sense.

Ichoran · 2018-02-21

I am not sure that you can have sensible indexing without a Seq, so to me IndexedView (and, indeed, IndexedOps) makes just as much sense.

SethTisue · 2020-04-19T23:20:13Z

further discussion at scala/bug#10919

julienrf force-pushed the views branch 2 times, most recently from b52b6bb to 236fa97 Compare February 9, 2018 10:10

julienrf requested review from szeiger, lrytz, odersky and Ichoran February 9, 2018 10:14

lrytz reviewed Feb 9, 2018

julienrf force-pushed the views branch from 6195e9c to 3955b6a Compare February 20, 2018 10:22

julienrf commented Feb 20, 2018

julienrf force-pushed the views branch from 450fdb1 to f5bce57 Compare February 20, 2018 14:25

julienrf added 5 commits February 21, 2018 10:42

Move IndexedView in its own file

ddf00b0

Relax type bounds

9da61c4

Remove some upper bound constraints on Ops traits. Remove unnecessary ArrayLike trait.

Add SeqView. Make sure getting a View’s View is a no-op.

52c93c9

Add MapView and make View.MapValue a MapView.

e9f391c

Optimize knownSize of small Maps

02ed893

julienrf added 5 commits February 21, 2018 10:42

Failed attempt to fix Dotty compilation error

1c82059

Implement some IndexedViews in terms of SeqViews

9d38db0

Take advantage of IndexedViews in IndexedSeqOps

429de14

Add IndexedView.Slice Reduce indirection levels by generalizing views to accept XxxOps parameters instead of collection types Add MapView.FilterKeys

Remove temporary hack

06327b0

Fix compilation error after rebase

4ddfbcb

julienrf force-pushed the views branch from f5bce57 to 4ddfbcb Compare February 21, 2018 09:46

julienrf merged commit 3d836d9 into scala:master Feb 21, 2018

julienrf deleted the views branch February 21, 2018 10:03

lrytz reviewed Feb 21, 2018

lrytz mentioned this pull request Jul 16, 2020

simplify Map.equals, plus small other cleanups scala/scala#9117

Merged
