Sparse structures appear to handle missing values (NaN) and fill_value confusingly. Based on the docs, I understand fill_value to be a user-specified value that is omitted from the sparse internal representation. fill_value may differ from the missing value (NaN).
Code Sample, a copy-pastable example if possible
import numpy as np
import pandas as pd

# NG: the 2nd and last elements must be NaN
pd.SparseArray([1, np.nan, 0, 3, np.nan], fill_value=0).to_dense()
# array([ 1., 0., 0., 3., 0.])
# NG, 2nd element must be NaN
orig = pd.Series([1, np.nan, 0, 3, np.nan], index=list('ABCDE'))
sparse = orig.to_sparse(fill_value=0)
sparse.reindex(['A', 'B', 'C'])
# A 1.0
# B 0.0
# C 0.0
# dtype: float64
# BlockIndex
# Block locations: array([0], dtype=int32)
# Block lengths: array([1], dtype=int32)
Hmm, I think it's using np.nan as the missing-value indicator, which is right. Then you fill those locations using the fill_value, not the other way around.
Expected Output
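Going by the comments in the code sample, the NaN entries should survive the round trip and only the fill_value entries should be omitted from storage. The following is a minimal plain-Python sketch of that behaviour. It is a toy encoding for illustration only, not how pandas implements sparse storage, and the names is_fill, to_sparse, and to_dense are hypothetical.

```python
import math


def is_fill(value, fill_value):
    """Return True if value should be omitted from the sparse storage."""
    # Hypothetical helper: NaN never compares equal to itself,
    # so a NaN fill_value needs an explicit isnan check.
    if isinstance(fill_value, float) and math.isnan(fill_value):
        return isinstance(value, float) and math.isnan(value)
    return value == fill_value


def to_sparse(values, fill_value):
    """Toy encoding: store only the entries that are not fill_value."""
    indices = [i for i, v in enumerate(values) if not is_fill(v, fill_value)]
    stored = [values[i] for i in indices]
    return indices, stored, len(values)


def to_dense(indices, stored, length, fill_value):
    """Rebuild the dense list: omitted slots get fill_value back."""
    dense = [fill_value] * length
    for i, v in zip(indices, stored):
        dense[i] = v
    return dense


values = [1.0, math.nan, 0.0, 3.0, math.nan]

# With fill_value=0.0, only the literal 0.0 at position 2 is omitted;
# the NaN entries are stored explicitly and come back as NaN.
indices, stored, length = to_sparse(values, fill_value=0.0)
dense = to_dense(indices, stored, length, fill_value=0.0)
print(indices)
print(dense)
```

In this sketch NaN is only treated as the omitted value when fill_value itself is NaN, which keeps "missing" and "fill_value" as two separate concepts.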
output of pd.show_versions()
Current master.
The fix itself looks straightforward, but it breaks some tests that use dubious comparisons.