display_width doesn't apply to dask-backed arrays #2528

Closed
rabernat opened this issue Oct 30, 2018 · 3 comments

@rabernat
Contributor

The representation of dask-backed arrays in xarray's __repr__ methods results in very long lines which often overflow the desired line width. Unfortunately, this can't be controlled or overridden with xr.set_options(display_width=...).

Code Sample, a copy-pastable example if possible

import xarray as xr
xr.set_options(display_width=20)
ds = (xr.DataArray(range(100))
      .chunk({'dim_0': 10})
      .to_dataset(name='really_long_long_name'))
ds
<xarray.Dataset>
Dimensions:                (dim_0: 100)
Dimensions without coordinates: dim_0
Data variables:
    really_long_long_name  (dim_0) int64 dask.array<shape=(100,), chunksize=(10,)>
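To make the overflow concrete, here is a small sketch that measures the repr shown above against the requested width. It works on the pasted text only, so it needs neither xarray nor dask; `overflowing_lines` is a hypothetical helper name, not part of xarray.

```python
# Width passed to xr.set_options(display_width=...) in the example above.
DISPLAY_WIDTH = 20

# The repr output from the example, pasted verbatim.
repr_text = """<xarray.Dataset>
Dimensions:                (dim_0: 100)
Dimensions without coordinates: dim_0
Data variables:
    really_long_long_name  (dim_0) int64 dask.array<shape=(100,), chunksize=(10,)>"""


def overflowing_lines(text, width):
    """Return (length, line) pairs for every line longer than `width`."""
    return [(len(line), line) for line in text.splitlines() if len(line) > width]


for length, line in overflowing_lines(repr_text, DISPLAY_WIDTH):
    print(length, line)
```

The data-variable line is the longest of these, and most of its length comes from the dask summary at the end.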

Expected Output

We need to decide how to abbreviate dask arrays with something more concise. I'm not sure of the best way to do this. Maybe:

   really_long_long_name  (dim_0) int64 dask chunks=(10,)
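One way that abbreviation might be produced, as a rough sketch only. It assumes the wrapped array exposes a dask-style `chunks` attribute (a tuple of per-dimension chunk-size tuples). `concise_dask_repr` and `fit_to_width` are hypothetical names, not existing xarray functions, and the stand-in object avoids needing dask installed.

```python
from types import SimpleNamespace


def concise_dask_repr(array):
    """Hypothetical formatter: show only the first chunk size per dimension."""
    chunksize = tuple(c[0] for c in array.chunks)
    return 'dask chunks=%s' % (chunksize,)


def fit_to_width(text, max_width):
    """Hypothetical helper: truncate with '...' to fit (assumes max_width >= 3)."""
    if len(text) <= max_width:
        return text
    return text[:max_width - 3] + '...'


# Stand-in for a dask array: one dimension split into ten chunks of length 10.
fake = SimpleNamespace(chunks=((10,) * 10,))

print(concise_dask_repr(fake))
print(fit_to_width(concise_dask_repr(fake), 12))
```

A truncation step like `fit_to_width` would be what lets a width setting actually bound the line, at the cost of hiding part of the chunk information.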
@shoyer
Member

shoyer commented Oct 30, 2018

short_dask_repr controls this in xarray:

def short_dask_repr(array, show_dtype=True):
    """Similar to dask.array.DataArray.__repr__, but without
    redundant information that's already printed by the repr
    function of the xarray wrapper.
    """
    chunksize = tuple(c[0] for c in array.chunks)
    if show_dtype:
        return 'dask.array<shape=%s, dtype=%s, chunksize=%s>' % (
            array.shape, array.dtype, chunksize)
    else:
        return 'dask.array<shape=%s, chunksize=%s>' % (array.shape, chunksize)

A good place to start might be dropping shape -- that information is largely redundant with dimension sizes.
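Dropping shape from the function above could look roughly like this. This is a sketch, not the change that was eventually merged: `short_dask_repr_no_shape` is a hypothetical name, and the `SimpleNamespace` object only mimics the `chunks` and `dtype` attributes of a dask array so the example runs without dask.

```python
from types import SimpleNamespace


def short_dask_repr_no_shape(array, show_dtype=True):
    """Hypothetical variant of short_dask_repr that omits the shape."""
    # First chunk size along each dimension, as in the original function.
    chunksize = tuple(c[0] for c in array.chunks)
    if show_dtype:
        return 'dask.array<dtype=%s, chunksize=%s>' % (array.dtype, chunksize)
    return 'dask.array<chunksize=%s>' % (chunksize,)


# Stand-in for the 100-element array chunked by 10 from the original report.
fake = SimpleNamespace(chunks=((10,) * 10,), dtype='int64')

print(short_dask_repr_no_shape(fake))
print(short_dask_repr_no_shape(fake, show_dtype=False))
```

For the array in the report, this shortens the inline summary from 41 characters to 27 when the dtype is already printed by the wrapper.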

@stale

stale bot commented Sep 30, 2020

In order to maintain a list of currently relevant issues, we mark issues as stale after a period of inactivity.

If this issue remains relevant, please comment here or remove the stale label; otherwise it will be marked as closed automatically.

@stale stale bot added the stale label Sep 30, 2020
@shoyer
Member

shoyer commented Sep 30, 2020

As of #3211, we no longer include the redundant shape in short reprs for dask arrays.

So I'm going to close this, though of course further improvements to the short repr would be welcome!

@shoyer shoyer closed this as completed Sep 30, 2020