
Commit 2cf7ee0

Merge remote-tracking branch 'upstream/main' into np-doc-ts-asunit

2 parents 799e6db + 882b228
38 files changed: +301 −1176 lines

ci/code_checks.sh (−43)

Large diffs are not rendered by default.

doc/source/user_guide/indexing.rst (+3 −3)

@@ -403,9 +403,9 @@ are returned:
     s = pd.Series(list('abcde'), index=[0, 3, 2, 5, 4])
     s.loc[3:5]
-If at least one of the two is absent, but the index is sorted, and can be
-compared against start and stop labels, then slicing will still work as
-expected, by selecting labels which *rank* between the two:
+If the index is sorted, and can be compared against start and stop labels,
+then slicing will still work as expected, by selecting labels which *rank*
+between the two:

 .. ipython:: python
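The reworded paragraph can be illustrated with a sorted index whose slice endpoints are absent from the index (a minimal sketch, not part of the diff):

```python
import pandas as pd

# Sorted integer index; the labels 2 and 6 do not exist in the index.
s = pd.Series(list("abcde"), index=[1, 3, 5, 7, 9])

# Label slicing still works: pandas selects the labels that *rank*
# between the start and stop values, here the rows labelled 3 and 5.
result = s.loc[2:6]
```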

doc/source/user_guide/io.rst (−20)

@@ -276,28 +276,9 @@ parse_dates : boolean or list of ints or names or list of lists or dict, default

   .. note::
      A fast-path exists for iso8601-formatted dates.
-infer_datetime_format : boolean, default ``False``
-  If ``True`` and parse_dates is enabled for a column, attempt to infer the
-  datetime format to speed up the processing.
-
-  .. deprecated:: 2.0.0
-     A strict version of this argument is now the default, passing it has no effect.
 keep_date_col : boolean, default ``False``
   If ``True`` and parse_dates specifies combining multiple columns then keep the
   original columns.
-date_parser : function, default ``None``
-  Function to use for converting a sequence of string columns to an array of
-  datetime instances. The default uses ``dateutil.parser.parser`` to do the
-  conversion. pandas will try to call date_parser in three different ways,
-  advancing to the next if an exception occurs: 1) Pass one or more arrays (as
-  defined by parse_dates) as arguments; 2) concatenate (row-wise) the string
-  values from the columns defined by parse_dates into a single array and pass
-  that; and 3) call date_parser once for each row using one or more strings
-  (corresponding to the columns defined by parse_dates) as arguments.
-
-  .. deprecated:: 2.0.0
-     Use ``date_format`` instead, or read in as ``object`` and then apply
-     :func:`to_datetime` as-needed.
 date_format : str or dict of column -> format, default ``None``
   If used in conjunction with ``parse_dates``, will parse dates according to this
   format. For anything more complex,

@@ -1639,7 +1620,6 @@ Options that are unsupported by the pyarrow engine which are not covered by the
 * ``decimal``
 * ``iterator``
 * ``dayfirst``
-* ``infer_datetime_format``
 * ``verbose``
 * ``skipinitialspace``
 * ``low_memory``
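The ``date_format`` keyword that survives this removal covers the common ``date_parser`` use-case directly; a short sketch of the replacement style:

```python
import io
import pandas as pd

# Instead of the removed `date_parser` callable, `date_format` applies a
# single strict format to the columns named in `parse_dates`.
csv = io.StringIO("when,val\n31/12/2023,1\n01/01/2024,2\n")
df = pd.read_csv(csv, parse_dates=["when"], date_format="%d/%m/%Y")
```

Anything the strict format cannot express can be read in as ``object`` and converted afterwards with ``pd.to_datetime``.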

doc/source/whatsnew/v3.0.0.rst (+5)

@@ -243,6 +243,7 @@ Removal of prior version deprecations/changes
 - Removed the "closed" and "unit" keywords in :meth:`TimedeltaIndex.__new__` (:issue:`52628`, :issue:`55499`)
 - All arguments in :meth:`Index.sort_values` are now keyword only (:issue:`56493`)
 - All arguments in :meth:`Series.to_dict` are now keyword only (:issue:`56493`)
+- Changed the default value of ``na_action`` in :meth:`Categorical.map` to ``None`` (:issue:`51645`)
 - Changed the default value of ``observed`` in :meth:`DataFrame.groupby` and :meth:`Series.groupby` to ``True`` (:issue:`51811`)
 - Enforce deprecation in :func:`testing.assert_series_equal` and :func:`testing.assert_frame_equal` with object dtype and mismatched null-like values, which are now considered not-equal (:issue:`18463`)
 - Enforced deprecation ``all`` and ``any`` reductions with ``datetime64``, :class:`DatetimeTZDtype`, and :class:`PeriodDtype` dtypes (:issue:`58029`)

@@ -254,6 +255,9 @@ Removal of prior version deprecations/changes
 - Enforced deprecation of :meth:`offsets.Tick.delta`, use ``pd.Timedelta(obj)`` instead (:issue:`55498`)
 - Enforced deprecation of ``axis=None`` acting the same as ``axis=0`` in the DataFrame reductions ``sum``, ``prod``, ``std``, ``var``, and ``sem``, passing ``axis=None`` will now reduce over both axes; this is particularly the case when doing e.g. ``numpy.sum(df)`` (:issue:`21597`)
 - Enforced deprecation of ``core.internals`` members ``Block``, ``ExtensionBlock``, and ``DatetimeTZBlock`` (:issue:`58467`)
+- Enforced deprecation of ``date_parser`` in :func:`read_csv`, :func:`read_table`, :func:`read_fwf`, and :func:`read_excel` in favour of ``date_format`` (:issue:`50601`)
+- Enforced deprecation of ``quantile`` keyword in :meth:`.Rolling.quantile` and :meth:`.Expanding.quantile`, renamed to ``q`` instead. (:issue:`52550`)
+- Enforced deprecation of argument ``infer_datetime_format`` in :func:`read_csv`, as a strict version of it is now the default (:issue:`48621`)
 - Enforced deprecation of non-standard (``np.ndarray``, :class:`ExtensionArray`, :class:`Index`, or :class:`Series`) argument to :func:`api.extensions.take` (:issue:`52981`)
 - Enforced deprecation of parsing system timezone strings to ``tzlocal``, which depended on system timezone, pass the 'tz' keyword instead (:issue:`50791`)
 - Enforced deprecation of passing a dictionary to :meth:`SeriesGroupBy.agg` (:issue:`52268`)

@@ -465,6 +469,7 @@ Sparse

 ExtensionArray
 ^^^^^^^^^^^^^^
+- Bug in :meth:`.arrays.ArrowExtensionArray.__setitem__` which caused wrong behavior when using an integer array with repeated values as a key (:issue:`58530`)
 - Bug in :meth:`api.types.is_datetime64_any_dtype` where a custom :class:`ExtensionDtype` would return ``False`` for array-likes (:issue:`57055`)

 Styler

pandas/_libs/tslibs/offsets.pyx (+7)

@@ -500,6 +500,13 @@ cdef class BaseOffset:
         """
         Return a copy of the frequency.

+        See Also
+        --------
+        tseries.offsets.Week.copy : Return a copy of Week offset.
+        tseries.offsets.DateOffset.copy : Return a copy of date offset.
+        tseries.offsets.MonthEnd.copy : Return a copy of MonthEnd offset.
+        tseries.offsets.YearBegin.copy : Return a copy of YearBegin offset.
+
         Examples
         --------
         >>> freq = pd.DateOffset(1)
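The offsets named in the new See Also entries all share this ``copy`` behaviour; a minimal sketch with one of them:

```python
import pandas as pd

# copy() returns an offset equal to the original frequency object.
freq = pd.offsets.MonthEnd(2)
copied = freq.copy()
```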

pandas/conftest.py (+26 −26)

@@ -672,47 +672,47 @@ def _create_mi_with_dt64tz_level():


 indices_dict = {
-    "string": Index([f"pandas_{i}" for i in range(100)]),
-    "datetime": date_range("2020-01-01", periods=100),
-    "datetime-tz": date_range("2020-01-01", periods=100, tz="US/Pacific"),
-    "period": period_range("2020-01-01", periods=100, freq="D"),
-    "timedelta": timedelta_range(start="1 day", periods=100, freq="D"),
-    "range": RangeIndex(100),
-    "int8": Index(np.arange(100), dtype="int8"),
-    "int16": Index(np.arange(100), dtype="int16"),
-    "int32": Index(np.arange(100), dtype="int32"),
-    "int64": Index(np.arange(100), dtype="int64"),
-    "uint8": Index(np.arange(100), dtype="uint8"),
-    "uint16": Index(np.arange(100), dtype="uint16"),
-    "uint32": Index(np.arange(100), dtype="uint32"),
-    "uint64": Index(np.arange(100), dtype="uint64"),
-    "float32": Index(np.arange(100), dtype="float32"),
-    "float64": Index(np.arange(100), dtype="float64"),
+    "string": Index([f"pandas_{i}" for i in range(10)]),
+    "datetime": date_range("2020-01-01", periods=10),
+    "datetime-tz": date_range("2020-01-01", periods=10, tz="US/Pacific"),
+    "period": period_range("2020-01-01", periods=10, freq="D"),
+    "timedelta": timedelta_range(start="1 day", periods=10, freq="D"),
+    "range": RangeIndex(10),
+    "int8": Index(np.arange(10), dtype="int8"),
+    "int16": Index(np.arange(10), dtype="int16"),
+    "int32": Index(np.arange(10), dtype="int32"),
+    "int64": Index(np.arange(10), dtype="int64"),
+    "uint8": Index(np.arange(10), dtype="uint8"),
+    "uint16": Index(np.arange(10), dtype="uint16"),
+    "uint32": Index(np.arange(10), dtype="uint32"),
+    "uint64": Index(np.arange(10), dtype="uint64"),
+    "float32": Index(np.arange(10), dtype="float32"),
+    "float64": Index(np.arange(10), dtype="float64"),
     "bool-object": Index([True, False] * 5, dtype=object),
     "bool-dtype": Index([True, False] * 5, dtype=bool),
     "complex64": Index(
-        np.arange(100, dtype="complex64") + 1.0j * np.arange(100, dtype="complex64")
+        np.arange(10, dtype="complex64") + 1.0j * np.arange(10, dtype="complex64")
     ),
     "complex128": Index(
-        np.arange(100, dtype="complex128") + 1.0j * np.arange(100, dtype="complex128")
+        np.arange(10, dtype="complex128") + 1.0j * np.arange(10, dtype="complex128")
     ),
-    "categorical": CategoricalIndex(list("abcd") * 25),
-    "interval": IntervalIndex.from_breaks(np.linspace(0, 100, num=101)),
+    "categorical": CategoricalIndex(list("abcd") * 2),
+    "interval": IntervalIndex.from_breaks(np.linspace(0, 100, num=11)),
     "empty": Index([]),
     "tuples": MultiIndex.from_tuples(zip(["foo", "bar", "baz"], [1, 2, 3])),
     "mi-with-dt64tz-level": _create_mi_with_dt64tz_level(),
     "multi": _create_multiindex(),
     "repeats": Index([0, 0, 1, 1, 2, 2]),
-    "nullable_int": Index(np.arange(100), dtype="Int64"),
-    "nullable_uint": Index(np.arange(100), dtype="UInt16"),
-    "nullable_float": Index(np.arange(100), dtype="Float32"),
-    "nullable_bool": Index(np.arange(100).astype(bool), dtype="boolean"),
+    "nullable_int": Index(np.arange(10), dtype="Int64"),
+    "nullable_uint": Index(np.arange(10), dtype="UInt16"),
+    "nullable_float": Index(np.arange(10), dtype="Float32"),
+    "nullable_bool": Index(np.arange(10).astype(bool), dtype="boolean"),
     "string-python": Index(
-        pd.array([f"pandas_{i}" for i in range(100)], dtype="string[python]")
+        pd.array([f"pandas_{i}" for i in range(10)], dtype="string[python]")
     ),
 }
 if has_pyarrow:
-    idx = Index(pd.array([f"pandas_{i}" for i in range(100)], dtype="string[pyarrow]"))
+    idx = Index(pd.array([f"pandas_{i}" for i in range(10)], dtype="string[pyarrow]"))
     indices_dict["string-pyarrow"] = idx

pandas/core/arrays/arrow/array.py (+3 −2)

@@ -1425,7 +1425,7 @@ def to_numpy(
             result[~mask] = data[~mask]._pa_array.to_numpy()
         return result

-    def map(self, mapper, na_action=None):
+    def map(self, mapper, na_action: Literal["ignore"] | None = None):
         if is_numeric_dtype(self.dtype):
             return map_array(self.to_numpy(), mapper, na_action=na_action)
         else:

@@ -1880,7 +1880,8 @@ def __setitem__(self, key, value) -> None:
             raise ValueError("Length of indexer and values mismatch")
         if len(indices) == 0:
             return
-        argsort = np.argsort(indices)
+        # GH#58530 wrong item assignment by repeated key
+        _, argsort = np.unique(indices, return_index=True)
         indices = indices[argsort]
         value = value.take(argsort)
         mask = np.zeros(len(self), dtype=np.bool_)
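The swap from ``np.argsort`` to ``np.unique(..., return_index=True)`` can be seen in isolation: for a key with repeated positions, ``np.unique`` yields the sorted unique positions plus the index of each position's first occurrence, so each target slot is written at most once. A NumPy-only sketch of just the deduplication step (not the full ``__setitem__``):

```python
import numpy as np

indices = np.array([2, 0, 2, 1])  # repeated key: position 2 appears twice
uniq, argsort = np.unique(indices, return_index=True)
# uniq    -> [0, 1, 2]  (each target position exactly once, sorted)
# argsort -> [1, 3, 0]  (where each position first appears in the key)

# As in the patched code, indices[argsort] dedupes the targets, and
# value.take(argsort) keeps the value paired with each first occurrence.
deduped = indices[argsort]
```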

pandas/core/arrays/base.py (+1 −1)

@@ -2270,7 +2270,7 @@ def __array_ufunc__(self, ufunc: np.ufunc, method: str, *inputs, **kwargs):

         return arraylike.default_array_ufunc(self, ufunc, method, *inputs, **kwargs)

-    def map(self, mapper, na_action=None):
+    def map(self, mapper, na_action: Literal["ignore"] | None = None):
         """
         Map values using an input mapping or function.

pandas/core/arrays/categorical.py (+2 −18)

@@ -1483,7 +1483,7 @@ def remove_unused_categories(self) -> Self:
     def map(
         self,
         mapper,
-        na_action: Literal["ignore"] | None | lib.NoDefault = lib.no_default,
+        na_action: Literal["ignore"] | None = None,
     ):
         """
         Map categories using an input mapping or function.

@@ -1501,15 +1501,10 @@ def map(
         ----------
         mapper : function, dict, or Series
             Mapping correspondence.
-        na_action : {None, 'ignore'}, default 'ignore'
+        na_action : {None, 'ignore'}, default None
             If 'ignore', propagate NaN values, without passing them to the
             mapping correspondence.
-
-            .. deprecated:: 2.1.0
-
-               The default value of 'ignore' has been deprecated and will be changed to
-               None in the future.

         Returns
         -------
         pandas.Categorical or pandas.Index

@@ -1561,17 +1556,6 @@ def map(
         >>> cat.map({"a": "first", "b": "second"}, na_action=None)
         Index(['first', 'second', nan], dtype='object')
         """
-        if na_action is lib.no_default:
-            warnings.warn(
-                "The default value of 'ignore' for the `na_action` parameter in "
-                "pandas.Categorical.map is deprecated and will be "
-                "changed to 'None' in a future version. Please set na_action to the "
-                "desired value to avoid seeing this warning",
-                FutureWarning,
-                stacklevel=find_stack_level(),
-            )
-            na_action = "ignore"
-
         assert callable(mapper) or is_dict_like(mapper)

         new_categories = self.categories.map(mapper)

pandas/core/arrays/datetimelike.py (+1 −1)

@@ -728,7 +728,7 @@ def _unbox(self, other) -> np.int64 | np.datetime64 | np.timedelta64 | np.ndarra
     # pandas assumes they're there.

     @ravel_compat
-    def map(self, mapper, na_action=None):
+    def map(self, mapper, na_action: Literal["ignore"] | None = None):
         from pandas import Index

         result = map_array(self, mapper, na_action=na_action)

pandas/core/arrays/interval.py (+34 −10)

@@ -1389,6 +1389,12 @@ def closed(self) -> IntervalClosedType:

         Either ``left``, ``right``, ``both`` or ``neither``.

+        See Also
+        --------
+        IntervalArray.closed : Returns inclusive side of the IntervalArray.
+        Interval.closed : Returns inclusive side of the Interval.
+        IntervalIndex.closed : Returns inclusive side of the IntervalIndex.
+
         Examples
         --------

@@ -1747,22 +1753,40 @@ def repeat(
         """
     )

-    @Appender(
-        _interval_shared_docs["contains"]
-        % {
-            "klass": "IntervalArray",
-            "examples": textwrap.dedent(
-                """\
+    def contains(self, other):
+        """
+        Check elementwise if the Intervals contain the value.
+
+        Return a boolean mask whether the value is contained in the Intervals
+        of the IntervalArray.
+
+        Parameters
+        ----------
+        other : scalar
+            The value to check whether it is contained in the Intervals.
+
+        Returns
+        -------
+        boolean array
+            A boolean mask whether the value is contained in the Intervals.
+
+        See Also
+        --------
+        Interval.contains : Check whether Interval object contains value.
+        IntervalArray.overlaps : Check if an Interval overlaps the values in the
+            IntervalArray.
+
+        Examples
+        --------
         >>> intervals = pd.arrays.IntervalArray.from_tuples([(0, 1), (1, 3), (2, 4)])
         >>> intervals
         <IntervalArray>
         [(0, 1], (1, 3], (2, 4]]
         Length: 3, dtype: interval[int64, right]
+
+        >>> intervals.contains(0.5)
+        array([ True, False, False])
         """
-        ),
-        }
-    )
-    def contains(self, other):
         if isinstance(other, Interval):
             raise NotImplementedError("contains not implemented for two intervals")
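The inlined docstring's example runs as written; 0.5 falls only in the first interval because ``right``-closed intervals exclude their left endpoint:

```python
import pandas as pd

intervals = pd.arrays.IntervalArray.from_tuples([(0, 1), (1, 3), (2, 4)])

# Elementwise membership test against a scalar.
mask = intervals.contains(0.5)  # only (0, 1] contains 0.5
```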

pandas/core/arrays/masked.py (+1 −1)

@@ -1318,7 +1318,7 @@ def max(self, *, skipna: bool = True, axis: AxisInt | None = 0, **kwargs):
         )
         return self._wrap_reduction_result("max", result, skipna=skipna, axis=axis)

-    def map(self, mapper, na_action=None):
+    def map(self, mapper, na_action: Literal["ignore"] | None = None):
         return map_array(self.to_numpy(), mapper, na_action=na_action)

     @overload

pandas/core/arrays/sparse/array.py (+1 −1)

@@ -1253,7 +1253,7 @@ def astype(self, dtype: AstypeArg | None = None, copy: bool = True):

         return self._simple_new(sp_values, self.sp_index, dtype)

-    def map(self, mapper, na_action=None) -> Self:
+    def map(self, mapper, na_action: Literal["ignore"] | None = None) -> Self:
         """
         Map categories using an input mapping or function.

pandas/core/dtypes/base.py (+8)

@@ -486,6 +486,14 @@ def register_extension_dtype(cls: type_t[ExtensionDtypeT]) -> type_t[ExtensionDt
     callable
         A class decorator.

+    See Also
+    --------
+    api.extensions.ExtensionDtype : The base class for creating custom pandas
+        data types.
+    Series : One-dimensional array with axis labels.
+    DataFrame : Two-dimensional, size-mutable, potentially heterogeneous
+        tabular data.
+
    Examples
    --------
    >>> from pandas.api.extensions import register_extension_dtype, ExtensionDtype
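A minimal registration sketch, assuming the base class's default ``construct_from_string`` matches on the class's ``name``; the ``ToyDtype`` class and its ``"toy"`` name are hypothetical, and reusing ``IntegerArray`` is only to keep the sketch short:

```python
import numpy as np
from pandas.api.extensions import ExtensionDtype, register_extension_dtype
from pandas.api.types import pandas_dtype

@register_extension_dtype
class ToyDtype(ExtensionDtype):
    # Hypothetical dtype for illustration; `name` is the string that
    # pandas_dtype() / astype() will resolve to this class.
    name = "toy"
    type = np.int64

    @classmethod
    def construct_array_type(cls):
        # Borrow an existing extension array type rather than write one.
        from pandas.core.arrays import IntegerArray
        return IntegerArray

# After registration, the dtype is discoverable by its string name.
resolved = pandas_dtype("toy")
```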

pandas/core/frame.py (+1 −1)

@@ -10369,7 +10369,7 @@ def apply(
         return op.apply().__finalize__(self, method="apply")

     def map(
-        self, func: PythonFuncType, na_action: str | None = None, **kwargs
+        self, func: PythonFuncType, na_action: Literal["ignore"] | None = None, **kwargs
     ) -> DataFrame:
         """
         Apply a function to a Dataframe elementwise.

pandas/core/indexers/objects.py (+19)

@@ -48,6 +48,25 @@ class BaseIndexer:
     """
     Base class for window bounds calculations.

+    Parameters
+    ----------
+    index_array : np.ndarray, default None
+        Array-like structure representing the indices for the data points.
+        If None, the default indices are assumed. This can be useful for
+        handling non-uniform indices in data, such as in time series
+        with irregular timestamps.
+    window_size : int, default 0
+        Size of the moving window. This is the number of observations used
+        for calculating the statistic. The default is to consider all
+        observations within the window.
+    **kwargs
+        Additional keyword arguments passed to the subclass's methods.
+
+    See Also
+    --------
+    DataFrame.rolling : Provides rolling window calculations on dataframe.
+    Series.rolling : Provides rolling window calculations on series.
+
     Examples
     --------
     >>> from pandas.api.indexers import BaseIndexer
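The newly documented ``window_size`` parameter is consumed by the subclass's ``get_window_bounds`` hook; a toy sketch (the ``ForwardIndexer`` name is hypothetical, and pandas already ships ``FixedForwardWindowIndexer`` for this exact shape):

```python
import numpy as np
import pandas as pd
from pandas.api.indexers import BaseIndexer

class ForwardIndexer(BaseIndexer):
    # Each window spans the current row plus the next window_size - 1 rows.
    def get_window_bounds(self, num_values=0, min_periods=None,
                          center=None, closed=None, step=None):
        start = np.arange(num_values, dtype=np.int64)
        end = np.minimum(start + self.window_size, num_values)
        return start, end

s = pd.Series([1, 2, 3, 4, 5])
out = s.rolling(ForwardIndexer(window_size=2), min_periods=1).sum()
```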
