BUG: Regression creating DataFrame from nested dict #22227

Closed
wesm opened this issue Aug 7, 2018 · 8 comments · Fixed by #22232
Labels: Bug, Regression (functionality that used to work in a prior pandas version)

@wesm
Member

wesm commented Aug 7, 2018

Code Sample, a copy-pastable example if possible

import pandas as pd

pop = {'Nevada': {2001: 2.4, 2002: 2.9},
       'Ohio': {2000: 1.5, 2001: 1.7, 2002: 3.6}}
pd.DataFrame(pop, index=[2001, 2002, 2003])

Problem description

Raises exception:

In [6]: pd.DataFrame(pop, index=[2001, 2002, 2003])
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-6-84df27ae30f4> in <module>()
----> 1 pd.DataFrame(pop, index=[2001, 2002, 2003])

~/miniconda/envs/arrow-dev/lib/python3.6/site-packages/pandas/core/frame.py in __init__(self, data, index, columns, dtype, copy)
    346                                  dtype=dtype, copy=copy)
    347         elif isinstance(data, dict):
--> 348             mgr = self._init_dict(data, index, columns, dtype=dtype)
    349         elif isinstance(data, ma.MaskedArray):
    350             import numpy.ma.mrecords as mrecords

~/miniconda/envs/arrow-dev/lib/python3.6/site-packages/pandas/core/frame.py in _init_dict(self, data, index, columns, dtype)
    457             arrays = [data[k] for k in keys]
    458 
--> 459         return _arrays_to_mgr(arrays, data_names, index, columns, dtype=dtype)
    460 
    461     def _init_ndarray(self, values, index, columns, dtype=None, copy=False):

~/miniconda/envs/arrow-dev/lib/python3.6/site-packages/pandas/core/frame.py in _arrays_to_mgr(arrays, arr_names, index, columns, dtype)
   7357 
   7358     # don't force copy because getting jammed in an ndarray anyway
-> 7359     arrays = _homogenize(arrays, index, dtype)
   7360 
   7361     # from BlockManager perspective

~/miniconda/envs/arrow-dev/lib/python3.6/site-packages/pandas/core/frame.py in _homogenize(data, index, dtype)
   7659             if isinstance(v, dict):
   7660                 if oindex is None:
-> 7661                     oindex = index.astype('O')
   7662 
   7663                 if isinstance(index, (DatetimeIndex, TimedeltaIndex)):

AttributeError: 'list' object has no attribute 'astype'

This code has worked for about 10 years; is this a deliberate change?
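
Editorial sketch, not part of the original report: the traceback fails at `index.astype('O')` because a plain list has no `astype` method. Assuming that is the only failing step, two workarounds should sidestep it on an affected version: wrapping the list in `pd.Index`, or building the frame first and calling `reindex`. The variable names `years`, `via_index`, and `via_reindex` are made up for illustration, and this has not been checked against 0.23.x itself.

```python
import pandas as pd

pop = {'Nevada': {2001: 2.4, 2002: 2.9},
       'Ohio': {2000: 1.5, 2001: 1.7, 2002: 3.6}}
years = [2001, 2002, 2003]

# A plain list has no .astype method, which is what the traceback trips over;
# a pandas Index does have one.
assert not hasattr(years, 'astype')
assert pd.Index(years).astype('O').dtype == object

# Workaround A (assumption): pass an Index instead of a list.
via_index = pd.DataFrame(pop, index=pd.Index(years))

# Workaround B (assumption): build the frame first, then align the rows.
via_reindex = pd.DataFrame(pop).reindex(years)

print(via_index)
```

Both frames keep only the requested years, with a row of missing values for 2003, which appears in neither inner dict.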

@wesm
Member Author

wesm commented Aug 7, 2018

This change happened between 0.22.0 and 0.23.0

@chris-b1
Contributor

chris-b1 commented Aug 7, 2018

maybe side effect of #19884? cc @topper-123

@cpcloud
Member

cpcloud commented Aug 7, 2018

I just bisected this. The first bad commit is 4efb39f

@cpcloud
Member

cpcloud commented Aug 7, 2018

Pretty sure it's this change: 4efb39f#diff-1e79abbbdd150d4771b91ea60a4e1cc7L7231

@cpcloud
Member

cpcloud commented Aug 7, 2018

cc @toobaz

gfyoung added the Bug and Regression labels Aug 7, 2018
jreback added this to the 0.23.5 milestone Aug 7, 2018
@wesm
Member Author

wesm commented Aug 7, 2018

While I'm surprised that there were no unit tests to catch this regression, it only occurred in the odd (?) case where the index passed is not a NumPy array.
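
Editorial sketch, not the test that was actually added in the fix: a regression check along these lines would exercise the nested-dict constructor with several index types, including the plain list that triggered the error. The helper name `check_nested_dict_with_index` and the `expected` frame are hypothetical and exist only for this illustration.

```python
import numpy as np
import pandas as pd

pop = {'Nevada': {2001: 2.4, 2002: 2.9},
       'Ohio': {2000: 1.5, 2001: 1.7, 2002: 3.6}}
years = [2001, 2002, 2003]

# Expected result: rows aligned to the requested years, NaN where a year
# is missing from an inner dict.
expected = pd.DataFrame(
    {'Nevada': [2.4, 2.9, np.nan], 'Ohio': [1.7, 3.6, np.nan]},
    index=years,
)


def check_nested_dict_with_index(index):
    """Hypothetical helper: build the frame and compare with `expected`."""
    result = pd.DataFrame(pop, index=index)
    pd.testing.assert_frame_equal(result, expected)


# Cover the list case from the report alongside array-like index inputs.
for index in (years, np.array(years, dtype='int64'), pd.Index(years)):
    check_nested_dict_with_index(index)
```

On a pandas version with the regression, the list case would be expected to raise the `AttributeError` shown in the report, while the other two inputs exercise the branch that already worked.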

@wesm
Member Author

wesm commented Aug 7, 2018

FYI, the reason I ran into this was that I'm updating Python for Data Analysis to fix errata and the book build broke. Sorry for not reporting the issue sooner.

@toobaz
Member

toobaz commented Aug 8, 2018

Thanks @wesm for noticing this, I had completely overlooked one code branch.
