This file infers the variance of type and lifetime parameters. The
algorithm is taken from Section 4 of the paper "Taming the Wildcards:
Combining Definition- and Use-Site Variance" published in PLDI'11 and
written by Altidor et al., and hereafter referred to as The Paper.

This inference is explicitly designed *not* to consider the uses of
types within code. To determine the variance of type parameters
defined on type `X`, we only consider the definition of the type `X`
and the definitions of any types it references.

We only infer variance for type parameters found on *data types*
like structs and enums. In these cases, there is a fairly
straightforward explanation for what variance means. The variance of
the type or lifetime parameters defines whether `T<A>` is a subtype
of `T<B>` (resp. `T<'a>` and `T<'b>`) based on the relationship of
`A` and `B` (resp. `'a` and `'b`).
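
To make this concrete, here is a minimal sketch (the type and function
names are invented for illustration) of the kind of coercion that a
covariant lifetime parameter permits:

```
struct Slice<'a> {
    data: &'a [u8],
}

// `Slice<'a>` is covariant in `'a`, so a `Slice<'static>` can be used
// wherever a `Slice<'a>` with a shorter lifetime is expected.
fn shorten<'a>(s: Slice<'static>) -> Slice<'a> {
    s
}
```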

We do not infer variance for type parameters found on traits, fns,
or impls. Variance on trait parameters can indeed make sense
(and we used to compute it) but it is actually rather subtle in
meaning and not that useful in practice, so we removed it. See the
addendum for some details. Variances on fn/impl parameters, on the
other hand, don't make sense because these parameters are
instantiated and then forgotten; they don't persist in types or
compiled byproducts.

### The algorithm

The basic idea is quite straightforward. We iterate over the types
defined and, for each use of a type parameter X, accumulate a
constraint indicating that the variance of X must be valid for the
variance of that use site. We then iteratively refine the variance of
X until all constraints are met. There is *always* a solution, because
at the limit we can declare all type parameters to be invariant and
all constraints will be satisfied.

As a simple example, consider:

    enum Option<A> { Some(A), None }
    enum OptionalFn<B> { Some(fn(B)), None }
    enum OptionalMap<C> { Some(fn(C) -> C), None }

Here, we will generate the constraints:

    1. V(A) <= +
    2. V(B) <= -
    3. V(C) <= +
    4. V(C) <= -

These indicate that (1) the variance of A must be at most covariant;
(2) the variance of B must be at most contravariant; and (3, 4) the
variance of C must be at most covariant *and* contravariant. All of
these results are based on a variance lattice defined as follows:

        * Top (bivariant)
     -     +
        o Bottom (invariant)

Based on this lattice, the solution V(A)=+, V(B)=-, V(C)=o is the
optimal solution. Note that there is always a naive solution which
just declares all variables to be invariant.
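
The following is a minimal, self-contained sketch (not the actual
compiler code) of the lattice and of the solving step: every variable
starts at the top of the lattice (bivariant) and is pushed down by
taking the greatest lower bound with each constraint's right-hand side
until nothing changes.

```
#[derive(Copy, Clone, PartialEq, Debug)]
enum Variance {
    Bivariant,     // * (top)
    Covariant,     // +
    Contravariant, // -
    Invariant,     // o (bottom)
}
use Variance::*;

// Greatest lower bound on the lattice drawn above.
fn glb(a: Variance, b: Variance) -> Variance {
    match (a, b) {
        (Bivariant, v) | (v, Bivariant) => v,
        (a, b) if a == b => a,
        _ => Invariant, // + meets -, or anything meets o
    }
}

fn main() {
    // Constraints (1)-(4) from the example above, as (variable, bound)
    // pairs where variables 0, 1, 2 stand for A, B, C.
    let constraints = [
        (0, Covariant),
        (1, Contravariant),
        (2, Covariant),
        (2, Contravariant),
    ];
    let mut solution = [Bivariant; 3];

    // Iterate to a fixed point. Here a single pass suffices because the
    // right-hand sides are constants; in general they may mention the
    // (current) variance of other parameters.
    let mut changed = true;
    while changed {
        changed = false;
        for &(var, bound) in &constraints {
            let new = glb(solution[var], bound);
            if new != solution[var] {
                solution[var] = new;
                changed = true;
            }
        }
    }

    // The optimal solution: V(A)=+, V(B)=-, V(C)=o.
    assert_eq!(solution, [Covariant, Contravariant, Invariant]);
}
```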

You may be wondering why fixed-point iteration is required. The reason
is that the variance of a use site may itself be a function of the
variance of other type parameters. In full generality, our constraints
take the form:

    V(X) <= Term
    Term := + | - | * | o | V(X) | Term x Term

Here the notation V(X) indicates the variance of a type/region
parameter `X` with respect to its defining class. `Term x Term`
represents the "variance transform" as defined in the paper:

    If the variance of a type variable `X` in type expression `E` is `V2`
    and the definition-site variance of the [corresponding] type parameter
    of a class `C` is `V1`, then the variance of `X` in the type expression
    `C<E>` is `V3 = V1.xform(V2)`.
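
Reusing the `Variance` enum from the sketch above, the transform can
be written as a small table. This is a reconstruction of the table in
The Paper, not a copy of the compiler's own code:

```
// xform(v1, v2): the variance of `X` in `C<E>`, where `v1` is the
// definition-site variance of `C`'s parameter and `v2` is the variance
// of `X` within `E`.
fn xform(v1: Variance, v2: Variance) -> Variance {
    match (v1, v2) {
        // An invariant context forces everything inside to be invariant.
        (Invariant, _) => Invariant,
        // A bivariant context ignores whatever happens inside.
        (Bivariant, _) => Bivariant,
        // A covariant context leaves the inner variance unchanged.
        (Covariant, v) => v,
        // A contravariant context flips + and -, and leaves o and * alone.
        (Contravariant, Covariant) => Contravariant,
        (Contravariant, Contravariant) => Covariant,
        (Contravariant, v) => v,
    }
}
```

For example, in `OptionalMap<C>` the `C` in the argument position of
`fn(C) -> C` contributes `xform(Contravariant, Covariant)`, i.e. `-`
(the enclosing field position is covariant and does not change the
result), which is exactly constraint (4) above.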

### Constraints

If I have a struct or enum with where clauses:

    struct Foo<T: Bar> { ... }

you might wonder whether the variance of `T` with respect to `Bar`
affects the variance of `T` with respect to `Foo`. I claim no. The
reason: assume that `T` is invariant w/r/t `Bar` but covariant w/r/t
`Foo`. And then we have a `Foo<X>` that is upcast to `Foo<Y>`, where
`X <: Y`. However, while `X : Bar`, `Y : Bar` does not hold. In that
case, the upcast will be illegal, not because of a variance failure,
but rather because the target type `Foo<Y>` is itself just not
well-formed. Basically we get to assume well-formedness of all types
involved before considering variance.
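
Here is a concrete (if contrived) instance of that argument, using
lifetimes as the source of subtyping and an invented `Bar` trait:

```
trait Bar {}

// Only the `'static` reference type implements `Bar`.
impl Bar for &'static u8 {}

struct Foo<T: Bar> {
    value: T,
}

// `Foo<&'static u8>` is well-formed, but `Foo<&'a u8>` for a shorter
// `'a` is not, because `&'a u8: Bar` does not hold. So an upcast from
// `Foo<&'static u8>` to `Foo<&'a u8>` is rejected on well-formedness
// grounds before variance ever enters the picture.
```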

#### Dependency graph management

Because variance works in two phases, if we are not careful, we wind
up with a muddled mess of a dep-graph. Basically, when gathering up
the constraints, things are fairly well-structured, but then we do a
fixed-point iteration and write the results back where they
belong. You can't give this fixed-point iteration a single task
because it reads from (and writes to) the variance of all types in the
crate. In principle, we *could* switch the "current task" in a very
fine-grained way while propagating constraints in the fixed-point
iteration and everything would be automatically tracked, but that
would add some overhead and isn't really necessary anyway.

Instead what we do is to add edges into the dependency graph as we
construct the constraint set: so, if computing the constraints for
node `X` requires loading the inference variables from node `Y`, then
we can add an edge `Y -> X`, since the variance we ultimately infer
for `Y` will affect the variance we ultimately infer for `X`.
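
As a toy model (this is not the compiler's actual dep-graph API, just
an illustration of the bookkeeping), the idea amounts to recording an
edge every time constraint gathering for one item consults another:

```
// Hypothetical miniature dep-graph: edges are (read_from, current_task).
#[derive(Default)]
struct ToyDepGraph {
    edges: Vec<(String, String)>,
}

impl ToyDepGraph {
    fn record_read(&mut self, read_from: &str, current_task: &str) {
        self.edges.push((read_from.to_string(), current_task.to_string()));
    }
}

fn gather_constraints_for_foo(graph: &mut ToyDepGraph) {
    // Suppose `Foo` embeds a `Vec<T>`: gathering `Foo`'s constraints
    // consults the (eventual) variances of `Vec`, so we add `Vec -> Foo`.
    graph.record_read("Vec", "Foo");
}
```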

At this point, we've basically mirrored the inference graph in the
dependency graph. This means we can just completely ignore the
fixed-point iteration, since it is just shuffling values along this
graph. In other words, if we added the fine-grained switching of tasks
I described earlier, all it would show is that we repeatedly read the
values described by the constraints, but those edges were already
added when building the constraints in the first place.

Here is how this is implemented (at least as of the time of this
writing). The associated `DepNode` for the variance map is (at least
presently) `Signature(DefId)`. This means that, in `constraints.rs`,
when we visit an item to load up its constraints, we set
`Signature(DefId)` as the current task (the "memoization" pattern
described in the `dep-graph` README). Then whenever we find an
embedded type or trait, we add a synthetic read of `Signature(DefId)`,
which covers the variances we will compute for all of its
parameters. This read is synthetic (i.e., we call
`variance_map.read()`) because, in fact, the final variance is not yet
computed -- the read *will* occur (repeatedly) during the fixed-point
iteration phase.

In fact, we don't really *need* this synthetic read. That's because we
do wind up looking up the `TypeScheme` or `TraitDef` for all
referenced types/traits, and those reads add an edge from
`Signature(DefId)` (that is, they share the same dep node as
variance). However, I've kept the synthetic reads in place anyway,
just for future-proofing (in case we change the dep-nodes in the
future), and because I think it makes the intention a bit clearer.

### Addendum: Variance on traits

As mentioned above, we used to permit variance on traits. This was
computed based on the appearance of trait type parameters in
method signatures and was used to represent the compatibility of
vtables in trait objects (and also "virtual" vtables or dictionaries
in trait bounds). One complication was that variance for
associated types is less obvious, since they can be projected out
and put to myriad uses, so it's not clear when it is safe to allow
`X<A>::Bar` to vary (or indeed just what that means). Moreover (as
covered below) all inputs on any trait with an associated type had
to be invariant, limiting the applicability. Finally, the
annotations (`MarkerTrait`, `PhantomFn`) needed to ensure that all
trait type parameters had a variance were confusing and annoying
for little benefit.

Just for historical reference, I am going to preserve some text
indicating how one could interpret variance and trait matching.

#### Variance and object types

Just as with structs and enums, we can decide the subtyping
relationship between two object types `&Trait<A>` and `&Trait<B>`
based on the relationship of `A` and `B`. Note that for object
types we ignore the `Self` type parameter -- it is unknown, and
the nature of dynamic dispatch ensures that we will always call a
function that is expecting the appropriate `Self` type. However, we
must be careful with the other type parameters, or else we could
end up calling a function that is expecting one type but provided
another.

To see what I mean, consider a trait like so:

    trait ConvertTo<A> {
        fn convertTo(&self) -> A;
    }

Intuitively, if we had one object `O=&ConvertTo<Object>` and another
`S=&ConvertTo<String>`, then `S <: O` because `String <: Object`
(presuming Java-like "string" and "object" types, my go-to examples
for subtyping). The actual algorithm would be to compare the
(explicit) type parameters pairwise respecting their variance: here,
the type parameter A is covariant (it appears only in a return
position), and hence we require that `String <: Object`.

You'll note though that we did not consider the binding for the
(implicit) `Self` type parameter: in fact, it is unknown, so that's
good. The reason we can ignore that parameter is precisely because we
don't need to know its value until a call occurs, and at that time (as
just discussed) the dynamic nature of virtual dispatch means the code
we run will be correct for whatever value `Self` happens to be bound
to for the particular object whose method we called. `Self` is thus
different from `A`, because the caller requires that `A` be known in
order to know the return type of the method `convertTo()`. (As an
aside, we have rules preventing methods where `Self` appears outside
of the receiver position from being called via an object.)
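
That last rule is what is nowadays usually called "object safety" (or
dyn compatibility). As a hedged illustration, with an invented trait:

```
trait Duplicate {
    // Because the return type mentions `Self`, this method can only be
    // called where `Self` is a concrete, sized type; the `where` clause
    // keeps the trait usable as an object at all.
    fn duplicate(&self) -> Self
    where
        Self: Sized;
}

impl Duplicate for String {
    fn duplicate(&self) -> Self {
        self.clone()
    }
}

fn use_object(_obj: &dyn Duplicate) {
    // _obj.duplicate(); // ERROR: cannot call a `Self`-returning method
    //                   // through a trait object.
}
```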

#### Trait variance and vtable resolution

But traits aren't only used with objects. They're also used when
deciding whether a given impl satisfies a given trait bound. To set the
scene here, imagine I had a function:

    fn convertAll<A, T: ConvertTo<A>>(v: &[T]) {
        ...
    }

Now imagine that I have an implementation of `ConvertTo` for `Object`:

    impl ConvertTo<i32> for Object { ... }

And I want to call `convertAll` on an array of strings. Suppose
further that for whatever reason I specifically supply the value of
`String` for the type parameter `T`:

    let mut vector = vec!["string", ...];
    convertAll::<i32, String>(vector);

Is this legal? To put it another way, can we apply the `impl` for
`Object` to the type `String`? The answer is yes, but to see why
we have to expand out what will happen:

- `convertAll` will create a pointer to one of the entries in the
  vector, which will have type `&String`
- It will then call the impl of `convertTo()` that is intended
  for use with objects. This has the type:

      fn(self: &Object) -> i32

  It is ok to provide a value for `self` of type `&String` because
  `&String <: &Object`.

OK, so intuitively we want this to be legal, so let's bring this back
to variance and see whether we are computing the correct result. We
must first figure out how to phrase the question "is an impl for
`Object,i32` usable where an impl for `String,i32` is expected?"

Maybe it's helpful to think of a dictionary-passing implementation of
type classes. In that case, `convertAll()` takes an implicit parameter
representing the impl. In short, we *have* an impl of type:

    V_O = ConvertTo<i32> for Object

and the function prototype expects an impl of type:

    V_S = ConvertTo<i32> for String

As with any argument, this is legal if the type of the value given
(`V_O`) is a subtype of the type expected (`V_S`). So is `V_O <: V_S`?
The answer will depend on the variance of the various parameters. In
this case, because the `Self` parameter is contravariant and `A` is
covariant, it means that:

    V_O <: V_S iff
        i32 <: i32
        String <: Object

These conditions are satisfied and so we are happy.

#### Variance and associated types

Traits with associated types -- or at minimum projection
expressions -- must be invariant with respect to all of their
inputs. To see why this makes sense, consider what subtyping for a
trait reference means:

    <T as Trait> <: <U as Trait>

means that if I know that `T: Trait`, I also know that `U: Trait`.
Moreover, if you think of it as dictionary-passing style, it means
that a dictionary for `<T as Trait>` is safe to use where a
dictionary for `<U as Trait>` is expected.

The problem is that when you can project types out from `<T as
Trait>`, the relationship to types projected out of `<U as Trait>`
is completely unknown unless `T==U` (see #21726 for more
details). Making `Trait` invariant ensures that this is true.

Another related reason is that if we didn't make traits with
associated types invariant, then projection is no longer a
function with a single result. Consider:

```
trait Identity { type Out; fn foo(&self); }
impl<T> Identity for T { type Out = T; ... }
```

Now if I have `<&'static () as Identity>::Out`, this can be
validly derived as `&'a ()` for any `'a`:

    <&'a () as Identity> <: <&'static () as Identity>
    if &'static () <: &'a ()   -- Identity is contravariant in Self
    if 'static : 'a            -- Subtyping rules for relations

This change, on the other hand, means that `<&'static () as
Identity>::Out` is always `&'static ()` (which might then be upcast
to `&'a ()`, separately). This was helpful in solving #21750.