Commit 42a8c38

merged master
2 parents cb3c8d6 + 8c7efd1 commit 42a8c38

109 files changed, +1972 -1013 lines changed


doc/source/whatsnew/v1.1.2.rst

Lines changed: 11 additions & 3 deletions
@@ -1,7 +1,7 @@
 .. _whatsnew_112:

-What's new in 1.1.2 (??)
-------------------------
+What's new in 1.1.2 (September 8, 2020)
+---------------------------------------

 These are the changes in pandas 1.1.2. See :ref:`release` for a full changelog
 including other versions of pandas.
@@ -16,12 +16,15 @@ Fixed regressions
 ~~~~~~~~~~~~~~~~~
 - Regression in :meth:`DatetimeIndex.intersection` incorrectly raising ``AssertionError`` when intersecting against a list (:issue:`35876`)
 - Fix regression in updating a column inplace (e.g. using ``df['col'].fillna(.., inplace=True)``) (:issue:`35731`)
+- Fix regression in :meth:`DataFrame.append` mixing tz-aware and tz-naive datetime columns (:issue:`35460`)
 - Performance regression for :meth:`RangeIndex.format` (:issue:`35712`)
+- Regression where :meth:`MultiIndex.get_loc` would return a slice spanning the full index when passed an empty list (:issue:`35878`)
 - Fix regression in invalid cache after an indexing operation; this can manifest when setting which does not update the data (:issue:`35521`)
 - Regression in :meth:`DataFrame.replace` where a ``TypeError`` would be raised when attempting to replace elements of type :class:`Interval` (:issue:`35931`)
 - Fix regression in pickle roundtrip of the ``closed`` attribute of :class:`IntervalIndex` (:issue:`35658`)
 - Fixed regression in :meth:`DataFrameGroupBy.agg` where a ``ValueError: buffer source array is read-only`` would be raised when the underlying array is read-only (:issue:`36014`)
--
+- Fixed regression in :meth:`Series.groupby.rolling` number of levels of :class:`MultiIndex` in input was compressed to one (:issue:`36018`)
+- Fixed regression in :class:`DataFrameGroupBy` on an empty :class:`DataFrame` (:issue:`36197`)

 .. ---------------------------------------------------------------------------

@@ -32,11 +35,15 @@ Bug fixes
 - Bug in :meth:`DataFrame.eval` with ``object`` dtype column binary operations (:issue:`35794`)
 - Bug in :class:`Series` constructor raising a ``TypeError`` when constructing sparse datetime64 dtypes (:issue:`35762`)
 - Bug in :meth:`DataFrame.apply` with ``result_type="reduce"`` returning with incorrect index (:issue:`35683`)
+- Bug in :meth:`Series.astype` and :meth:`DataFrame.astype` not respecting the ``errors`` argument when set to ``"ignore"`` for extension dtypes (:issue:`35471`)
 - Bug in :meth:`DateTimeIndex.format` and :meth:`PeriodIndex.format` with ``name=True`` setting the first item to ``"None"`` where it should be ``""`` (:issue:`35712`)
 - Bug in :meth:`Float64Index.__contains__` incorrectly raising ``TypeError`` instead of returning ``False`` (:issue:`35788`)
+- Bug in :class:`Series` constructor incorrectly raising a ``TypeError`` when passed an ordered set (:issue:`36044`)
 - Bug in :meth:`Series.dt.isocalendar` and :meth:`DatetimeIndex.isocalendar` that returned incorrect year for certain dates (:issue:`36032`)
 - Bug in :class:`DataFrame` indexing returning an incorrect :class:`Series` in some cases when the series has been altered and a cache not invalidated (:issue:`33675`)
 - Bug in :meth:`DataFrame.corr` causing subsequent indexing lookups to be incorrect (:issue:`35882`)
+- Bug in :meth:`import_optional_dependency` returning incorrect package names in cases where package name is different from import name (:issue:`35948`)
+- Bug when setting empty :class:`DataFrame` column to a :class:`Series` in preserving name of index in frame (:issue:`31368`)

 .. ---------------------------------------------------------------------------

@@ -45,6 +52,7 @@ Bug fixes
 Other
 ~~~~~
 - :meth:`factorize` now supports ``na_sentinel=None`` to include NaN in the uniques of the values and remove ``dropna`` keyword which was unintentionally exposed to public facing API in 1.1 version from :meth:`factorize` (:issue:`35667`)
+- :meth:`DataFrame.plot` and :meth:`Series.plot` raise ``UserWarning`` about usage of ``FixedFormatter`` and ``FixedLocator`` (:issue:`35684` and :issue:`35945`)

 .. ---------------------------------------------------------------------------

doc/source/whatsnew/v1.2.0.rst

Lines changed: 8 additions & 7 deletions
@@ -103,7 +103,7 @@ Other enhancements

 - Added :meth:`~DataFrame.set_flags` for setting table-wide flags on a ``Series`` or ``DataFrame`` (:issue:`28394`)
 - :class:`Index` with object dtype supports division and multiplication (:issue:`34160`)
--
+- :meth:`DataFrame.explode` and :meth:`Series.explode` now support exploding of sets (:issue:`35614`)
 -

 .. _whatsnew_120.api_breaking.python:
@@ -214,8 +214,6 @@ Performance improvements

 Bug fixes
 ~~~~~~~~~
-- Bug in :meth:`DataFrameGroupBy.apply` raising error with ``np.nan`` group(s) when ``dropna=False`` (:issue:`35889`)
--

 Categorical
 ^^^^^^^^^^^
@@ -230,6 +228,7 @@ Datetimelike
 - Bug in :class:`DateOffset` where attributes reconstructed from pickle files differ from original objects when input values exceed normal ranges (e.g months=12) (:issue:`34511`)
 - Bug in :meth:`DatetimeIndex.get_slice_bound` where ``datetime.date`` objects were not accepted or naive :class:`Timestamp` with a tz-aware :class:`DatetimeIndex` (:issue:`35690`)
 - Bug in :meth:`DatetimeIndex.slice_locs` where ``datetime.date`` objects were not accepted (:issue:`34077`)
+- Bug in :meth:`DatetimeIndex.searchsorted`, :meth:`TimedeltaIndex.searchsorted`, and :meth:`Series.searchsorted` with ``datetime64`` or ``timedelta64`` dtype placement of ``NaT`` values being inconsistent with ``NumPy`` (:issue:`36176`)

 Timedelta
 ^^^^^^^^^
@@ -246,7 +245,8 @@ Timezones

 Numeric
 ^^^^^^^
-- Bug in :meth: `Series.equals` where a ``ValueError`` was raised when iterables were compared to non-iterables (:issue:`35267`)
+- Bug in :func:`to_numeric` where float precision was incorrect (:issue:`31364`)
+- Bug in :meth:`Series.equals` where a ``ValueError`` was raised when iterables were compared to non-iterables (:issue:`35267`)
 -

 Conversion
@@ -257,7 +257,7 @@ Conversion

 Strings
 ^^^^^^^
-
+- Bug in :meth:`Series.to_string`, :meth:`DataFrame.to_string`, and :meth:`DataFrame.to_latex` adding a leading space when ``index=False`` (:issue:`24980`)
 -
 -

@@ -270,8 +270,9 @@ Interval

 Indexing
 ^^^^^^^^
+
 - Bug in :meth:`PeriodIndex.get_loc` incorrectly raising ``ValueError`` on non-datelike strings instead of ``KeyError``, causing similar errors in :meth:`Series.__geitem__`, :meth:`Series.__contains__`, and :meth:`Series.loc.__getitem__` (:issue:`34240`)
--
+- Bug in :meth:`Index.sort_values` where, when empty values were passed, the method would break by trying to compare missing values instead of pushing them to the end of the sort order. (:issue:`35584`)
 -

 Missing
@@ -301,7 +302,6 @@ Plotting
 ^^^^^^^^

 - Bug in :meth:`DataFrame.plot` where a marker letter in the ``style`` keyword sometimes causes a ``ValueError`` (:issue:`21003`)
-- meth:`DataFrame.plot` and meth:`Series.plot` raise ``UserWarning`` about usage of FixedFormatter and FixedLocator (:issue:`35684` and :issue:`35945`)

 Groupby/resample/rolling
 ^^^^^^^^^^^^^^^^^^^^^^^^
@@ -314,6 +314,7 @@ Groupby/resample/rolling
 - Bug when subsetting columns on a :class:`~pandas.core.groupby.DataFrameGroupBy` (e.g. ``df.groupby('a')[['b']])``) would reset the attributes ``axis``, ``dropna``, ``group_keys``, ``level``, ``mutated``, ``sort``, and ``squeeze`` to their default values. (:issue:`9959`)
 - Bug in :meth:`DataFrameGroupby.tshift` failing to raise ``ValueError`` when a frequency cannot be inferred for the index of a group (:issue:`35937`)
 - Bug in :meth:`DataFrame.groupby` does not always maintain column index name for ``any``, ``all``, ``bfill``, ``ffill``, ``shift`` (:issue:`29764`)
+- Bug in :meth:`DataFrameGroupBy.apply` raising error with ``np.nan`` group(s) when ``dropna=False`` (:issue:`35889`)
 -

 Reshaping

pandas/_libs/hashtable.pyx

Lines changed: 2 additions & 2 deletions
@@ -56,7 +56,7 @@ from pandas._libs.missing cimport checknull


 cdef int64_t NPY_NAT = util.get_nat()
-_SIZE_HINT_LIMIT = (1 << 20) + 7
+SIZE_HINT_LIMIT = (1 << 20) + 7


 cdef Py_ssize_t _INIT_VEC_CAP = 128
@@ -176,7 +176,7 @@ def unique_label_indices(const int64_t[:] labels):
         ndarray[int64_t, ndim=1] arr
         Int64VectorData *ud = idx.data

-    kh_resize_int64(table, min(n, _SIZE_HINT_LIMIT))
+    kh_resize_int64(table, min(n, SIZE_HINT_LIMIT))

     with nogil:
         for i in range(n):

pandas/_libs/hashtable_class_helper.pxi.in

Lines changed: 3 additions & 3 deletions
@@ -268,7 +268,7 @@ cdef class {{name}}HashTable(HashTable):
     def __cinit__(self, int64_t size_hint=1):
         self.table = kh_init_{{dtype}}()
         if size_hint is not None:
-            size_hint = min(size_hint, _SIZE_HINT_LIMIT)
+            size_hint = min(size_hint, SIZE_HINT_LIMIT)
             kh_resize_{{dtype}}(self.table, size_hint)

     def __len__(self) -> int:
@@ -603,7 +603,7 @@ cdef class StringHashTable(HashTable):
     def __init__(self, int64_t size_hint=1):
         self.table = kh_init_str()
         if size_hint is not None:
-            size_hint = min(size_hint, _SIZE_HINT_LIMIT)
+            size_hint = min(size_hint, SIZE_HINT_LIMIT)
             kh_resize_str(self.table, size_hint)

     def __dealloc__(self):
@@ -916,7 +916,7 @@ cdef class PyObjectHashTable(HashTable):
     def __init__(self, int64_t size_hint=1):
         self.table = kh_init_pymap()
         if size_hint is not None:
-            size_hint = min(size_hint, _SIZE_HINT_LIMIT)
+            size_hint = min(size_hint, SIZE_HINT_LIMIT)
             kh_resize_pymap(self.table, size_hint)

     def __dealloc__(self):

pandas/_libs/hashtable_func_helper.pxi.in

Lines changed: 1 addition & 1 deletion
@@ -138,7 +138,7 @@ def duplicated_{{dtype}}(const {{c_type}}[:] values, object keep='first'):
         kh_{{ttype}}_t *table = kh_init_{{ttype}}()
         ndarray[uint8_t, ndim=1, cast=True] out = np.empty(n, dtype='bool')

-    kh_resize_{{ttype}}(table, min(n, _SIZE_HINT_LIMIT))
+    kh_resize_{{ttype}}(table, min(n, SIZE_HINT_LIMIT))

     if keep not in ('last', 'first', False):
         raise ValueError('keep must be either "first", "last" or False')
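
The hashtable hunks above all cap the requested table size at the renamed `SIZE_HINT_LIMIT` before resizing. The capping itself can be sketched in plain Python as follows; `clamp_size_hint` is a hypothetical helper for illustration and is not part of pandas.

```python
# Same value as the constant renamed in pandas/_libs/hashtable.pyx.
SIZE_HINT_LIMIT = (1 << 20) + 7


def clamp_size_hint(n: int) -> int:
    """Hypothetical helper: cap a requested table size at SIZE_HINT_LIMIT."""
    return min(n, SIZE_HINT_LIMIT)


# Small requests pass through unchanged; very large ones are capped.
small = clamp_size_hint(1_000)
large = clamp_size_hint(50_000_000)
```

The cap bounds the up-front allocation made from the hint, no matter how large the input array is.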

pandas/_libs/parsers.pyx

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -67,7 +67,7 @@ from pandas._libs.khash cimport (
6767
khiter_t,
6868
)
6969

70-
from pandas.compat import _get_lzma_file, _import_lzma
70+
from pandas.compat import get_lzma_file, import_lzma
7171
from pandas.errors import DtypeWarning, EmptyDataError, ParserError, ParserWarning
7272

7373
from pandas.core.dtypes.common import (
@@ -82,7 +82,7 @@ from pandas.core.dtypes.common import (
8282
)
8383
from pandas.core.dtypes.concat import union_categoricals
8484

85-
lzma = _import_lzma()
85+
lzma = import_lzma()
8686

8787
cdef:
8888
float64_t INF = <float64_t>np.inf
@@ -638,9 +638,9 @@ cdef class TextReader:
638638
f'zip file {zip_names}')
639639
elif self.compression == 'xz':
640640
if isinstance(source, str):
641-
source = _get_lzma_file(lzma)(source, 'rb')
641+
source = get_lzma_file(lzma)(source, 'rb')
642642
else:
643-
source = _get_lzma_file(lzma)(filename=source)
643+
source = get_lzma_file(lzma)(filename=source)
644644
else:
645645
raise ValueError(f'Unrecognized compression type: '
646646
f'{self.compression}')

pandas/_libs/reshape.pyx

Lines changed: 4 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -124,7 +124,8 @@ def explode(ndarray[object] values):
124124
counts = np.zeros(n, dtype='int64')
125125
for i in range(n):
126126
v = values[i]
127-
if c_is_list_like(v, False):
127+
128+
if c_is_list_like(v, True):
128129
if len(v):
129130
counts[i] += len(v)
130131
else:
@@ -138,8 +139,9 @@ def explode(ndarray[object] values):
138139
for i in range(n):
139140
v = values[i]
140141

141-
if c_is_list_like(v, False):
142+
if c_is_list_like(v, True):
142143
if len(v):
144+
v = list(v)
143145
for j in range(len(v)):
144146
result[count] = v[j]
145147
count += 1
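
The added `v = list(v)` matters because a set cannot be indexed with `v[j]`. A rough pure-Python sketch of the same idea is below. `is_list_like` is a hypothetical stand-in for `c_is_list_like`, and because the diff elides the `else` branch, the handling of scalars and empty list-likes here is an assumption, not the pandas behavior.

```python
def is_list_like(obj, allow_sets: bool) -> bool:
    """Hypothetical stand-in for c_is_list_like, limited to builtin containers."""
    if isinstance(obj, (set, frozenset)):
        return allow_sets
    return isinstance(obj, (list, tuple))


def explode(values):
    """Flatten non-empty list-likes, sets included; return (items, counts)."""
    result, counts = [], []
    for v in values:
        if is_list_like(v, True) and len(v):
            v = list(v)  # sets are not subscriptable, so materialize first
            for j in range(len(v)):
                result.append(v[j])
            counts.append(len(v))
        elif is_list_like(v, True):
            # Assumption: an empty list-like takes one placeholder row.
            result.append(None)
            counts.append(1)
        else:
            # Scalars (strings included) pass through as a single row.
            result.append(v)
            counts.append(1)
    return result, counts


items, counts = explode([[1, 2], {3}, "ab", []])
```

Element order within an exploded set follows the set's iteration order, so it should not be relied on.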

pandas/_libs/src/parse_helper.h

Lines changed: 3 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -18,7 +18,9 @@ int to_double(char *item, double *p_value, char sci, char decimal,
1818
char *p_end = NULL;
1919
int error = 0;
2020

21-
*p_value = xstrtod(item, &p_end, decimal, sci, '\0', 1, &error, maybe_int);
21+
/* Switch to precise xstrtod GH 31364 */
22+
*p_value = precise_xstrtod(item, &p_end, decimal, sci, '\0', 1,
23+
&error, maybe_int);
2224

2325
return (error == 0) && (!*p_end);
2426
}

pandas/_testing.py

Lines changed: 4 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -25,7 +25,7 @@
2525
from pandas._libs.lib import no_default
2626
import pandas._libs.testing as _testing
2727
from pandas._typing import Dtype, FilePathOrBuffer, FrameOrSeries
28-
from pandas.compat import _get_lzma_file, _import_lzma
28+
from pandas.compat import get_lzma_file, import_lzma
2929

3030
from pandas.core.dtypes.common import (
3131
is_bool,
@@ -70,7 +70,7 @@
7070
from pandas.io.common import urlopen
7171
from pandas.io.formats.printing import pprint_thing
7272

73-
lzma = _import_lzma()
73+
lzma = import_lzma()
7474

7575
_N = 30
7676
_K = 4
@@ -243,7 +243,7 @@ def decompress_file(path, compression):
243243
elif compression == "bz2":
244244
f = bz2.BZ2File(path, "rb")
245245
elif compression == "xz":
246-
f = _get_lzma_file(lzma)(path, "rb")
246+
f = get_lzma_file(lzma)(path, "rb")
247247
elif compression == "zip":
248248
zip_file = zipfile.ZipFile(path)
249249
zip_names = zip_file.namelist()
@@ -288,7 +288,7 @@ def write_to_compressed(compression, path, data, dest="test"):
288288
elif compression == "bz2":
289289
compress_method = bz2.BZ2File
290290
elif compression == "xz":
291-
compress_method = _get_lzma_file(lzma)
291+
compress_method = get_lzma_file(lzma)
292292
else:
293293
raise ValueError(f"Unrecognized compression type: {compression}")
294294

pandas/compat/__init__.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -77,7 +77,7 @@ def is_platform_mac() -> bool:
7777
return sys.platform == "darwin"
7878

7979

80-
def _import_lzma():
80+
def import_lzma():
8181
"""
8282
Importing the `lzma` module.
8383
@@ -97,7 +97,7 @@ def _import_lzma():
9797
warnings.warn(msg)
9898

9999

100-
def _get_lzma_file(lzma):
100+
def get_lzma_file(lzma):
101101
"""
102102
Importing the `LZMAFile` class from the `lzma` module.
103103

pandas/compat/_optional.py

Lines changed: 19 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -33,6 +33,19 @@
3333
"numba": "0.46.0",
3434
}
3535

36+
# A mapping from import name to package name (on PyPI) for packages where
37+
# these two names are different.
38+
39+
INSTALL_MAPPING = {
40+
"bs4": "beautifulsoup4",
41+
"bottleneck": "Bottleneck",
42+
"lxml.etree": "lxml",
43+
"odf": "odfpy",
44+
"pandas_gbq": "pandas-gbq",
45+
"sqlalchemy": "SQLAlchemy",
46+
"jinja2": "Jinja2",
47+
}
48+
3649

3750
def _get_version(module: types.ModuleType) -> str:
3851
version = getattr(module, "__version__", None)
@@ -82,9 +95,13 @@ def import_optional_dependency(
8295
is False, or when the package's version is too old and `on_version`
8396
is ``'warn'``.
8497
"""
98+
99+
package_name = INSTALL_MAPPING.get(name)
100+
install_name = package_name if package_name is not None else name
101+
85102
msg = (
86-
f"Missing optional dependency '{name}'. {extra} "
87-
f"Use pip or conda to install {name}."
103+
f"Missing optional dependency '{install_name}'. {extra} "
104+
f"Use pip or conda to install {install_name}."
88105
)
89106
try:
90107
module = importlib.import_module(name)
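
The effect of `INSTALL_MAPPING` on the error message can be sketched on its own. The snippet below copies a subset of the mapping and the message format from the hunk above. `missing_dependency_message` and `import_or_explain` are hypothetical names used only for illustration and are not pandas functions.

```python
import importlib

# Subset of the mapping added in the diff: import name -> PyPI package name.
INSTALL_MAPPING = {
    "bs4": "beautifulsoup4",
    "odf": "odfpy",
    "sqlalchemy": "SQLAlchemy",
}


def missing_dependency_message(name: str, extra: str = "") -> str:
    """Build the message using the installable name when it differs."""
    install_name = INSTALL_MAPPING.get(name, name)
    return (
        f"Missing optional dependency '{install_name}'. {extra} "
        f"Use pip or conda to install {install_name}."
    )


def import_or_explain(name: str):
    """Hypothetical wrapper: import a module or raise with the mapped name."""
    try:
        return importlib.import_module(name)
    except ImportError:
        raise ImportError(missing_dependency_message(name)) from None
```

The import still uses the import name (`bs4`); only the message switches to the name a user would pass to pip or conda.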

pandas/conftest.py

Lines changed: 23 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -437,6 +437,29 @@ def index(request):
437437
index_fixture2 = index
438438

439439

440+
@pytest.fixture(params=indices_dict.keys())
441+
def index_with_missing(request):
442+
"""
443+
Fixture for indices with missing values
444+
"""
445+
if request.param in ["int", "uint", "range", "empty", "repeats"]:
446+
pytest.xfail("missing values not supported")
447+
# GH 35538. Use deep copy to avoid illusive bug on np-dev
448+
# Azure pipeline that writes into indices_dict despite copy
449+
ind = indices_dict[request.param].copy(deep=True)
450+
vals = ind.values
451+
if request.param in ["tuples", "mi-with-dt64tz-level", "multi"]:
452+
# For setting missing values in the top level of MultiIndex
453+
vals = ind.tolist()
454+
vals[0] = tuple([None]) + vals[0][1:]
455+
vals[-1] = tuple([None]) + vals[-1][1:]
456+
return MultiIndex.from_tuples(vals)
457+
else:
458+
vals[0] = None
459+
vals[-1] = None
460+
return type(ind)(vals)
461+
462+
440463
# ----------------------------------------------------------------
441464
# Series'
442465
# ----------------------------------------------------------------
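
In its MultiIndex branch, the new fixture replaces the first element of the first and last tuples with `None`. That tuple manipulation can be sketched without pandas; `with_missing_ends` is a hypothetical helper name for illustration only.

```python
def with_missing_ends(tuples):
    """Return a copy where the top-level entry of the first and last tuple is None."""
    vals = list(tuples)
    vals[0] = (None,) + vals[0][1:]
    vals[-1] = (None,) + vals[-1][1:]
    return vals


patched = with_missing_ends([("a", 1), ("b", 2), ("c", 3)])
```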

pandas/core/algorithms.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -462,7 +462,7 @@ def isin(comps: AnyArrayLike, values: AnyArrayLike) -> np.ndarray:
462462
return f(comps, values)
463463

464464

465-
def _factorize_array(
465+
def factorize_array(
466466
values, na_sentinel: int = -1, size_hint=None, na_value=None, mask=None
467467
) -> Tuple[np.ndarray, np.ndarray]:
468468
"""
@@ -671,7 +671,7 @@ def factorize(
671671
else:
672672
na_value = None
673673

674-
codes, uniques = _factorize_array(
674+
codes, uniques = factorize_array(
675675
values, na_sentinel=na_sentinel, size_hint=size_hint, na_value=na_value
676676
)
677677
