Skip to content
This repository was archived by the owner on Dec 22, 2021. It is now read-only.

Add MapView and SeqView #436

Merged
merged 10 commits into from
Feb 21, 2018
Merged

Add MapView and SeqView #436

merged 10 commits into from
Feb 21, 2018

Conversation

julienrf
Copy link
Contributor

@julienrf julienrf commented Feb 8, 2018

My goal with this PR was to fix #160. In short, the problem was that, currently, the view of a Map or a Seq has type View, and doesn’t support common Map and Seq operations (such as get or reverse, respectively). This is an incompatibility compared to the old collections.

This PR adds a SeqView type and a MapView type, supporting Seq operations and Map operations, respectively. However, their usefulness is quite limited: if you call, say, filter on a SeqView, you end up with a View, which doesn’t support anymore Seq operations… (same for MapView). More generally, most transformation operations give back a View, excepted a few.

This behaviour differs from the old collections, where transformation operations always return the same collection type (even when applied to views). We could replicate that behaviour for the sake of compatibility, but I don’t think that would be a good idea because to do so we would have to (sometimes) force view elements, which would be surprising because as a user I expect transformations applied to views to always be lazy.

That being said, even the limited SeqView that I introduced in this PR can be surprising because it supports reverse, which returns a View although its implementation necessarily evaluates all the underlying collection’s elements. Actually, in our current View, we also have operations that return views although they fully compute an intermediate collection: takeRight, dropRight, grouped, sliding, groupBy, scanRight, tails, inits.

I wish we could make it really clear in the types that some transformation operations won’t create intermediate collections and some others will. But that would complicate a lot more the Ops traits. Instead, I suggest that we try to limit the operations provided by views so that users have less chance to accidentally create intermediate collections when transforming views. Therefore, I would not try to replicate the old behaviour that makes SeqView#filter return a SeqView (for instance).

While working on that I also did the following changes:
— add a test checking that calling .view on a view is effectively a no-op (that wasn’t the case!),
— turn all view implementations into classes instead of case classes (the motivation is that it should reduce the bytecode size, since we don’t use any feature of case classes),
— optimized knownSize for small Map implementations.

The code I pushed doesn’t compile with Dotty. Actually my understanding is that 9f4f6f6 should be rejected by scalac but is not (it is rejected by dotc, though). I tried to fix the issue but my fix makes dotc crash:

java.lang.AssertionError: assertion failed: Cyclic reference while unpickling definition at address 110 in unit collection-strawman/collections/src/main/scala/strawman/collection/MapView.scala
        at dotty.DottyPredef$.assertFail(DottyPredef.scala:36)
        at dotty.tools.dotc.core.tasty.TreeUnpickler$TreeReader.readIndexedDef(TreeUnpickler.scala:637)
        at dotty.tools.dotc.core.tasty.TreeUnpickler$Completer.complete(TreeUnpickler.scala:91)
        at dotty.tools.dotc.core.SymDenotations$SymDenotation.completeFrom(SymDenotations.scala:246)
        at dotty.tools.dotc.core.SymDenotations$SymDenotation.completeInfo$1(SymDenotations.scala:209)
        at dotty.tools.dotc.core.SymDenotations$SymDenotation.info(SymDenotations.scala:211)

[update] I’ve just tried with the latest dotty nightly and it seems that the code compiles. However there is another error due to scala/scala3#3965

@julienrf julienrf force-pushed the views branch 2 times, most recently from b52b6bb to 236fa97 Compare February 9, 2018 10:10

import scala.{Int, IndexOutOfBoundsException}

trait SeqView[+A] extends SeqOps[A, View, View[A]] with View[A] {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could it extends SeqOps[A, SeqView, SeqView[A]]?

Copy link
Contributor Author

@julienrf julienrf Feb 9, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem is that if we do that transformation operations have to always return a SeqView, and sometimes (e.g. filter) this is not possible without computing the entire result of the transformation operation, which would defeat the purpose of using views.

@lrytz
Copy link
Member

lrytz commented Feb 9, 2018

I'll have to think about it more, it's a big topic. We'd have to do more to support the example here http://docs.scala-lang.org/overviews/collections/views.html

scala> def negate(xs: collection.mutable.Seq[Int]) = for (i <- 0 until xs.length) xs(i) = -xs(i)
negate: (xs: scala.collection.mutable.Seq[Int])Unit

scala> val a = Array(1,2,3)
a: Array[Int] = Array(1, 2, 3)

scala> negate(a.view.slice(2,3))

scala> a
res13: Array[Int] = Array(1, 2, -3)

Could SeqView extend Seq (not just SeqOps)?

@julienrf
Copy link
Contributor Author

julienrf commented Feb 9, 2018

If SeqView extends Seq then it has to be comparable with other Seq collections. Maybe we should always return false? We couldn’t find a sensible behaviour for comparing views.

@lrytz
Copy link
Member

lrytz commented Feb 9, 2018

Another data point: in 2.12, filterKeys and mapValues on maps have return type Map; in SortedMap they return SortedMap.

@julienrf
Copy link
Contributor Author

@lrytz good point. Currently neither SortedSet nor SortedMap have a “sorted” view.

@lrytz
Copy link
Member

lrytz commented Feb 12, 2018

There are conflicting goals between lazyness and providing "view" functionality.

I think in the past, views were often mentioned as a way to fuse operations. But 2.12 views have bugs, and you need to know which operations are lazy, which are forcing; that even depends on the collection/view type, your example of filter is very good:

// SeqView
scala> (1 to 10).toList.view.filter(x => {println(s"filter $x"); x%2==0}).head
filter 1
filter 2
res0: Int = 2

// IndexedSeqView
scala> (1 to 10).toArray.view.filter(x => {println(s"filter $x"); x%2==0}).head
filter 1
filter 2
...
filter 9
filter 10
res1: Int = 2

As far as I know, iterators are nowadays recommended for fusing (?), as they provide a restricted interface that makes accidental forcing less likely.

I think this is by far the most common use case: TargetColl.from(sourceColl.iterator.[..].[..].[..]), and iterators work well.

The less common use case is actually providing a transformed view onto some data. Maybe for this use case, providing a rich interface is more important than trying to ensure lazyness? Not sure.

I wonder how much code we break if we go for a simpler, but safer interface. I just realized another problem in 2.12:

scala> val a = (0 to 10).toArray
a: Array[Int] = Array(0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

scala> val av = a.view.filter(x => {println(s"filter $x"); x > 5})
av: scala.collection.mutable.IndexedSeqView[Int,Array[Int]] = SeqViewF(...)

scala> av.head // builds the index
filter 0
...
filter 10
res0: Int = 6

scala> a(6) = 1

scala> av.head // old index
res2: Int = 1

In reality, with mutable data, people would not use views but something like scala.rx.

In any case, there is a need for clear user recommendataions.

@lrytz
Copy link
Member

lrytz commented Feb 12, 2018

Historic thread with lots of good points: https://groups.google.com/forum/#!topic/scala-debate/M8s8FmASL8Y.

@julienrf can you summarize / reference discussions about views that took place during the strawman design period?

@julienrf
Copy link
Contributor Author

Actually, there hasn’t been much discussion about views. There was that issue, #13, which digressed in several directions, but whose main takeaway point is that mutable views are probably too confusing to be useful. We also had meeting discussions about views and equality. We couldn’t find a sensible definition of equality for views. Especially because equality between collections is not defined at the level of iterable but only in Seq, Set and Map. A practical consequence of this is that views can not extend Seq, Set or Map.

@julienrf
Copy link
Contributor Author

To make progress on this I propose that we don’t enrich views with transformation operations that will force the evaluation of their elements (so, view.filter(p) will only return a View, it won’t be able to return a SeqView). The main goal here is to always hold the promise that transformation operations applied to views should not evaluate the underlying collection nor build an intermediate collection.

This will be a source of backward incompatibility. However, my guess is that so far views weren’t heavily relied on, so hopefully we won’t break much things. Then, we will see when we will build the community build if this guess is confirmed or not, and we will consider or not fixing these incompatibilities.

What do you think?

@SethTisue
Copy link
Member

However, my guess is that so far views weren’t heavily relied on, so hopefully we won’t break much things. Then, we will see when we will build the community build if this guess is confirmed or not, and we will consider or not fixing these incompatibilities

agree

@lrytz
Copy link
Member

lrytz commented Feb 16, 2018

If views are their own hierarchy and not subtypes of collection types, I wonder what they add to iterators. I understand that views are multiple-use/traversal. But what are acutal use cases that they enable (in the current design / with this PR)?

If I write some code that acts on a collection type, I cannot pass in a view; that seems to be the main selling point in http://docs.scala-lang.org/overviews/collections/views.html. So I have to write code specifically for views, which then doesn't work for ordinary collection types. Maybe that tradeoff is acceptable? At least one can still use the view-based style which simplifies implementing certian algorithms.

@lrytz
Copy link
Member

lrytz commented Feb 19, 2018

Discussion summary

So our conclusion was that we continue with this PR and keep views in their separate hierarchy.


/** A collection containing the last `n` elements of this collection. */
override def takeRight(n: Int): C = fromSpecificIterable(view.takeRight(n))
override def reverse: C = fromSpecificIterable(new IndexedView.Reverse(this))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The difference between view.takeRight and IndexedView.TakeRight (line 39) is that the former builds a view of a view, whereas the latter builds just one view. The idea is to remove one level of indirection. If that would have been optimized anyway by the JIT then we should switch to view.takeRight instead.

override def slice(from: Int, until: Int): Vector[A] =
take(until).drop(from)

override def splitAt(n: Int): (Vector[A], Vector[A]) = (take(n), drop(n))
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed because this is exactly the default implementation.

@@ -171,11 +171,6 @@ final class Vector[+A] private[immutable] (private[collection] val startIndex: I
dropRight(1)
}

override def slice(from: Int, until: Int): Vector[A] =
take(until).drop(from)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed because IndexedView.Slice does the same job without creating an intermediate Vector (like take(until) does)

@julienrf
Copy link
Contributor Author

@szeiger @lrytz do you want to review it again or can I merge it?

@lrytz
Copy link
Member

lrytz commented Feb 20, 2018

Give me some time to review it still, I haven't yet acutally; so far I just handwaved around views in general :-)

@julienrf julienrf merged commit 3d836d9 into scala:master Feb 21, 2018
@julienrf julienrf deleted the views branch February 21, 2018 10:03
Copy link
Member

@lrytz lrytz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks good!

import scala.Predef.{<:<, intWrapper}

/** View defined in terms of indexing a range */
trait IndexedView[+A] extends IndexedSeqOps[A, View, View[A]] with SeqView[A] { self =>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we call it IndexedSeqView?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that it would make more sense.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure that you can have a sensible indexing without a seq, so to me IndexedView (and, indeed, IndexedOps) make just as much sense to me.

@SethTisue
Copy link
Member

further discussion at scala/bug#10919

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add a MapView type providing lazy Map operations
4 participants