Commit fefd741

Optional indexes
1 parent 3f490a3 commit fefd741

24 files changed, with 925 additions and 613 deletions

doc/api.rst

Lines changed: 2 additions & 0 deletions
@@ -44,6 +44,7 @@ Attributes
    Dataset.coords
    Dataset.attrs
    Dataset.indexes
+   Dataset.get_index

 Dictionary interface
 --------------------
@@ -193,6 +194,7 @@ Attributes
    DataArray.attrs
    DataArray.encoding
    DataArray.indexes
+   DataArray.get_index

 **ndarray attributes**:
 :py:attr:`~DataArray.ndim`

doc/whats-new.rst

Lines changed: 27 additions & 0 deletions
@@ -21,6 +21,26 @@ v0.9.0 (unreleased)
 Breaking changes
 ~~~~~~~~~~~~~~~~

+- Index coordinates for each dimension are now optional, and no longer created
+  by default. This has a number of implications:
+
+  - :py:func:`~align` and :py:meth:`~Dataset.reindex` can now raise an error
+    if dimension labels are missing and dimensions have different sizes.
+  - Because pandas does not support missing indexes, methods such as
+    ``to_dataframe``/``from_dataframe`` and ``stack``/``unstack`` no longer
+    roundtrip faithfully on all inputs. Use :py:meth:`~Dataset.reset_index` to
+    remove undesired indexes.
+  - ``Dataset.__delitem__`` and :py:meth:`~Dataset.drop` no longer delete/drop
+    variables that have dimensions matching a deleted/dropped variable.
+  - ``DataArray.coords.__delitem__`` is now allowed on variables matching
+    dimension names.
+  - ``.sel`` and ``.loc`` now handle indexing along a dimension without
+    coordinate labels by doing integer-based indexing.
+  - :py:attr:`~Dataset.indexes` is no longer guaranteed to include all
+    dimension names as keys. The new method :py:meth:`~Dataset.get_index`
+    always returns an index for a dimension, falling back to a default
+    ``RangeIndex`` if necessary.
+
 - The default behavior of ``merge`` is now ``compat='no_conflicts'``, so some
   merges will now succeed in cases that previously raised
   ``xarray.MergeError``. Set ``compat='broadcast_equals'`` to restore the
@@ -113,6 +133,13 @@ Bug fixes
 - ``Dataset.concat()`` now preserves variables order (:issue:`1027`).
   By `Fabien Maussion <https://github.com/fmaussion>`_.

+- Grouping over a dimension with non-unique values with ``groupby`` now gives
+  correct groups.
+
+- Fixed accessing coordinate variables with non-string names from ``.coords``
+  (:issue:`TBD`).
+  By `Stephan Hoyer <https://github.com/shoyer>`_.
+
 .. _whats-new.0.8.2:

 v0.8.2 (18 August 2016)
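
Reviewer sketch of the ``get_index`` fallback described in the release note above. This is not xarray's implementation: plain dicts and ``range`` stand in for the index mapping and for ``pandas.RangeIndex``, and the helper's signature and the names ``indexes`` and ``sizes`` are assumptions made only for illustration.

```python
# Illustrative stand-in, NOT xarray's code: a dict of labels plays the role of
# the (now optional) indexes, and range() plays the role of pandas.RangeIndex.

def get_index(indexes, sizes, dim):
    """Return the stored index for ``dim``, or a default 0..n-1 range."""
    if dim not in sizes:
        raise KeyError("%r is not a dimension" % (dim,))
    if dim in indexes:
        return indexes[dim]
    # No labels were stored for this dimension: fall back to positions.
    return range(sizes[dim])


sizes = {"x": 3, "y": 2}            # dimension name -> length
indexes = {"x": ["a", "b", "c"]}    # only "x" carries labels

x_index = get_index(indexes, sizes, "x")
y_index = get_index(indexes, sizes, "y")
print(list(x_index))  # ['a', 'b', 'c']
print(list(y_index))  # [0, 1]
```

The point of the sketch is that code which previously relied on every dimension appearing in ``indexes`` can call the fallback accessor instead and keep working when labels are absent.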

xarray/backends/common.py

Lines changed: 0 additions & 25 deletions
@@ -30,25 +30,6 @@ def _decode_variable_name(name):
     return name


-def is_trivial_index(var):
-    """
-    Determines if in index is 'trivial' meaning that it is
-    equivalent to np.arange(). This is determined by
-    checking if there are any attributes or encodings,
-    if ndims is one, dtype is int and finally by comparing
-    the actual values to np.arange()
-    """
-    # if either attributes or encodings are defined
-    # the index is not trivial.
-    if len(var.attrs) or len(var.encoding):
-        return False
-    # if the index is not a 1d integer array
-    if var.ndim > 1 or not var.dtype.kind == 'i':
-        return False
-    arange = np.arange(var.size, dtype=var.dtype)
-    return np.all(var.values == arange)
-
-
 def robust_getitem(array, key, catch=Exception, max_retries=6,
                    initial_delay=500):
     """
@@ -200,12 +181,6 @@ def store_dataset(self, dataset):

     def store(self, variables, attributes, check_encoding_set=frozenset()):
         self.set_attributes(attributes)
-        neccesary_dims = [v.dims for v in variables.values()]
-        neccesary_dims = set(itertools.chain(*neccesary_dims))
-        # set all non-indexes and any index which is not trivial.
-        variables = OrderedDict((k, v) for k, v in iteritems(variables)
-                                if not (k in neccesary_dims and
-                                        is_trivial_index(v)))
         self.set_variables(variables, check_encoding_set)

     def set_attributes(self, attributes):
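
Reviewer sketch of what the removed filter in ``store`` did: before writing, it skipped any variable that was named after a dimension and whose values were just ``0..n-1`` with no attributes or encoding. The version below is a simplified, standard-library-only approximation; the ``Var`` tuple and the helper names ``is_trivial`` and ``filter_trivial_indexes`` are hypothetical, and lists replace NumPy arrays.

```python
from collections import OrderedDict, namedtuple
from itertools import chain

# Hypothetical stand-in for an xarray Variable (illustration only).
Var = namedtuple("Var", ["dims", "values", "attrs", "encoding"])


def is_trivial(var):
    """Approximate the removed check: no metadata, 1-d, ints equal to 0..n-1."""
    if var.attrs or var.encoding:
        return False
    if len(var.dims) != 1:
        return False
    values = list(var.values)
    return (all(type(v) is int for v in values)
            and values == list(range(len(values))))


def filter_trivial_indexes(variables):
    """Drop variables that are both a dimension name and a trivial index."""
    needed_dims = set(chain.from_iterable(v.dims for v in variables.values()))
    return OrderedDict((k, v) for k, v in variables.items()
                       if not (k in needed_dims and is_trivial(v)))


variables = OrderedDict([
    ("x", Var(("x",), [0, 1, 2], {}, {})),      # trivial index: was skipped
    ("y", Var(("y",), [10, 20], {}, {})),       # real labels: kept
    ("data", Var(("x", "y"), [[1, 2], [3, 4], [5, 6]], {}, {})),
])
kept = filter_trivial_indexes(variables)
print(list(kept))  # ['y', 'data']
```

With the filter gone, ``store`` passes every variable through to ``set_variables`` unchanged, which fits the commit's premise that default indexes are no longer created in the first place.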

xarray/conventions.py

Lines changed: 2 additions & 2 deletions
@@ -910,7 +910,7 @@ def decode_cf(obj, concat_characters=True, mask_and_scale=True,
         identify coordinates.
     drop_variables: string or iterable, optional
         A variable or list of variables to exclude from being parsed from the
-        dataset.This may be useful to drop variables with problems or
+        dataset. This may be useful to drop variables with problems or
         inconsistent values.

     Returns
@@ -936,7 +936,7 @@ def decode_cf(obj, concat_characters=True, mask_and_scale=True,
         vars, attrs, concat_characters, mask_and_scale, decode_times,
         decode_coords, drop_variables=drop_variables)
     ds = Dataset(vars, attrs=attrs)
-    ds = ds.set_coords(coord_names.union(extra_coords))
+    ds = ds.set_coords(coord_names.union(extra_coords).intersection(vars))
     ds._file_obj = file_obj
     return ds

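
Reviewer sketch of the set logic in the ``set_coords`` change above. The commit does not state its motivation; one plausible reading is that coordinate names with no matching decoded variable are filtered out before ``set_coords`` sees them. The variable names and values below are made up for illustration.

```python
# Hypothetical decoded variables: "lat" exists, "lon" does not.
decoded_vars = {"temperature": [1.0, 2.0], "lat": [10.0, 20.0]}
coord_names = {"lat", "lon"}   # e.g. names listed in a ``coordinates`` attribute
extra_coords = set()

# Old expression: every referenced name, whether or not it was decoded.
before = coord_names.union(extra_coords)

# New expression: set.intersection accepts any iterable, and iterating a dict
# yields its keys, so only names that are actual variables survive.
after = coord_names.union(extra_coords).intersection(decoded_vars)

print(sorted(before))  # ['lat', 'lon']
print(sorted(after))   # ['lat']
```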
