Commit 73a6fef

Merge branch 'main' into fix/type_coercion_for_unobserved_categories
2 parents 30013ee + de1131f commit 73a6fef

48 files changed: 637 additions and 483 deletions

.github/ISSUE_TEMPLATE/pdep_vote.yaml

Lines changed: 74 additions & 0 deletions
@@ -0,0 +1,74 @@
+name: PDEP Vote
+description: Call for a vote on a PDEP
+title: "VOTE: "
+labels: [Vote]
+
+body:
+  - type: markdown
+    attributes:
+      value: >
+        As per [PDEP-1](https://pandas.pydata.org/pdeps/0001-purpose-and-guidelines.html), the following issue template should be used when a
+        maintainer has opened a PDEP discussion and is ready to call for a vote.
+  - type: checkboxes
+    attributes:
+      label: Locked issue
+      options:
+        - label: >
+            I locked this voting issue so that only voting members are able to cast their votes or
+            comment on this issue.
+          required: true
+  - type: input
+    id: PDEP-name
+    attributes:
+      label: PDEP number and title
+      placeholder: >
+        PDEP-1: Purpose and guidelines
+    validations:
+      required: true
+  - type: input
+    id: PDEP-link
+    attributes:
+      label: Pull request with discussion
+      description: e.g. https://github.com/pandas-dev/pandas/pull/47444
+    validations:
+      required: true
+  - type: input
+    id: PDEP-rendered-link
+    attributes:
+      label: Rendered PDEP for easy reading
+      description: e.g. https://github.com/pandas-dev/pandas/pull/47444/files?short_path=7c449e6#diff-7c449e698132205b235c501f7e47ebba38da4d2b7f9492c98f16745dba787041
+    validations:
+      required: true
+  - type: input
+    id: PDEP-number-of-discussion-participants
+    attributes:
+      label: Discussion participants
+      description: >
+        You may find it useful to list or total the number of participating members in the
+        PDEP discussion PR. This would be the maximum possible disapprove votes.
+      placeholder: >
+        14 voting members participated in the PR discussion thus far.
+  - type: input
+    id: PDEP-vote-end
+    attributes:
+      label: Voting will close in 15 days.
+      description: The voting period end date. ('Voting will close in 15 days.' will be automatically written)
+  - type: markdown
+    attributes:
+      value: ---
+  - type: textarea
+    id: Vote
+    attributes:
+      label: Vote
+      value: |
+        Cast your vote in a comment below.
+        * +1: approve.
+        * 0: abstain.
+          * Reason: A one sentence reason is required.
+        * -1: disapprove
+          * Reason: A one sentence reason is required.
+        A disapprove vote requires prior participation in the linked discussion PR.
+
+        @pandas-dev/pandas-core
+    validations:
+      required: true
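
The voting rules spelled out in the template's `Vote` text area can be sketched as a small check. This is only an illustration of the rules as written there (a reason for abstain and disapprove votes, prior participation for disapprove votes); the `Vote` class and `validate_vote` function are hypothetical names and are not part of pandas or of this commit.

```python
from dataclasses import dataclass


@dataclass
class Vote:
    """Hypothetical record of one vote comment."""
    voter: str
    value: int        # +1 approve, 0 abstain, -1 disapprove
    reason: str = ""  # one-sentence reason, if given


def validate_vote(vote, participants):
    """Return a list of rule violations for one vote (empty if valid).

    `participants` is the set of members who took part in the linked
    discussion PR.
    """
    problems = []
    if vote.value not in (1, 0, -1):
        problems.append("vote must be +1, 0, or -1")
    # The template asks for a reason on abstain and disapprove votes.
    if vote.value in (0, -1) and not vote.reason.strip():
        problems.append("a one-sentence reason is required")
    # A disapprove vote requires prior participation in the discussion PR.
    if vote.value == -1 and vote.voter not in participants:
        problems.append("disapprove requires prior participation in the discussion PR")
    return problems


participants = {"alice"}
print(validate_vote(Vote("alice", 1), participants))
print(validate_vote(Vote("bob", -1, "Breaks the API."), participants))
```
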

ci/code_checks.sh

Lines changed: 3 additions & 32 deletions
@@ -81,22 +81,13 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
         -i "pandas.CategoricalIndex.ordered SA01" \
         -i "pandas.DataFrame.__dataframe__ SA01" \
         -i "pandas.DataFrame.__iter__ SA01" \
-        -i "pandas.DataFrame.assign SA01" \
         -i "pandas.DataFrame.at_time PR01" \
-        -i "pandas.DataFrame.bfill SA01" \
         -i "pandas.DataFrame.columns SA01" \
-        -i "pandas.DataFrame.copy SA01" \
         -i "pandas.DataFrame.droplevel SA01" \
-        -i "pandas.DataFrame.dtypes SA01" \
-        -i "pandas.DataFrame.ffill SA01" \
-        -i "pandas.DataFrame.first_valid_index SA01" \
-        -i "pandas.DataFrame.get SA01" \
         -i "pandas.DataFrame.hist RT03" \
         -i "pandas.DataFrame.infer_objects RT03" \
-        -i "pandas.DataFrame.keys SA01" \
         -i "pandas.DataFrame.kurt RT03,SA01" \
         -i "pandas.DataFrame.kurtosis RT03,SA01" \
-        -i "pandas.DataFrame.last_valid_index SA01" \
         -i "pandas.DataFrame.max RT03" \
         -i "pandas.DataFrame.mean RT03,SA01" \
         -i "pandas.DataFrame.median RT03,SA01" \
@@ -123,24 +114,18 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
         -i "pandas.DatetimeIndex.ceil SA01" \
         -i "pandas.DatetimeIndex.date SA01" \
         -i "pandas.DatetimeIndex.day SA01" \
-        -i "pandas.DatetimeIndex.day_name SA01" \
         -i "pandas.DatetimeIndex.day_of_year SA01" \
         -i "pandas.DatetimeIndex.dayofyear SA01" \
         -i "pandas.DatetimeIndex.floor SA01" \
         -i "pandas.DatetimeIndex.freqstr SA01" \
-        -i "pandas.DatetimeIndex.hour SA01" \
         -i "pandas.DatetimeIndex.indexer_at_time PR01,RT03" \
         -i "pandas.DatetimeIndex.indexer_between_time RT03" \
         -i "pandas.DatetimeIndex.inferred_freq SA01" \
         -i "pandas.DatetimeIndex.is_leap_year SA01" \
         -i "pandas.DatetimeIndex.microsecond SA01" \
-        -i "pandas.DatetimeIndex.minute SA01" \
-        -i "pandas.DatetimeIndex.month SA01" \
-        -i "pandas.DatetimeIndex.month_name SA01" \
         -i "pandas.DatetimeIndex.nanosecond SA01" \
         -i "pandas.DatetimeIndex.quarter SA01" \
         -i "pandas.DatetimeIndex.round SA01" \
-        -i "pandas.DatetimeIndex.second SA01" \
         -i "pandas.DatetimeIndex.snap PR01,RT03,SA01" \
         -i "pandas.DatetimeIndex.std PR01,RT03" \
         -i "pandas.DatetimeIndex.time SA01" \
@@ -149,11 +134,10 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
         -i "pandas.DatetimeIndex.to_pydatetime RT03,SA01" \
         -i "pandas.DatetimeIndex.tz SA01" \
         -i "pandas.DatetimeIndex.tz_convert RT03" \
-        -i "pandas.DatetimeIndex.year SA01" \
         -i "pandas.DatetimeTZDtype SA01" \
         -i "pandas.DatetimeTZDtype.tz SA01" \
         -i "pandas.DatetimeTZDtype.unit SA01" \
-        -i "pandas.Grouper PR02,SA01" \
+        -i "pandas.Grouper PR02" \
         -i "pandas.HDFStore.append PR01,SA01" \
         -i "pandas.HDFStore.get SA01" \
         -i "pandas.HDFStore.groups SA01" \
@@ -303,7 +287,6 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
         -i "pandas.Series.add PR07" \
         -i "pandas.Series.at_time PR01" \
         -i "pandas.Series.backfill PR01,SA01" \
-        -i "pandas.Series.bfill SA01" \
         -i "pandas.Series.case_when RT03" \
         -i "pandas.Series.cat PR07,SA01" \
         -i "pandas.Series.cat.add_categories PR01,PR02" \
@@ -316,36 +299,31 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
         -i "pandas.Series.cat.rename_categories PR01,PR02" \
         -i "pandas.Series.cat.reorder_categories PR01,PR02" \
         -i "pandas.Series.cat.set_categories PR01,PR02" \
-        -i "pandas.Series.copy SA01" \
         -i "pandas.Series.div PR07" \
         -i "pandas.Series.droplevel SA01" \
         -i "pandas.Series.dt.as_unit PR01,PR02" \
         -i "pandas.Series.dt.ceil PR01,PR02,SA01" \
         -i "pandas.Series.dt.components SA01" \
         -i "pandas.Series.dt.date SA01" \
         -i "pandas.Series.dt.day SA01" \
-        -i "pandas.Series.dt.day_name PR01,PR02,SA01" \
+        -i "pandas.Series.dt.day_name PR01,PR02" \
         -i "pandas.Series.dt.day_of_year SA01" \
         -i "pandas.Series.dt.dayofyear SA01" \
         -i "pandas.Series.dt.days SA01" \
         -i "pandas.Series.dt.days_in_month SA01" \
         -i "pandas.Series.dt.daysinmonth SA01" \
         -i "pandas.Series.dt.floor PR01,PR02,SA01" \
         -i "pandas.Series.dt.freq GL08" \
-        -i "pandas.Series.dt.hour SA01" \
         -i "pandas.Series.dt.is_leap_year SA01" \
         -i "pandas.Series.dt.microsecond SA01" \
         -i "pandas.Series.dt.microseconds SA01" \
-        -i "pandas.Series.dt.minute SA01" \
-        -i "pandas.Series.dt.month SA01" \
-        -i "pandas.Series.dt.month_name PR01,PR02,SA01" \
+        -i "pandas.Series.dt.month_name PR01,PR02" \
         -i "pandas.Series.dt.nanosecond SA01" \
         -i "pandas.Series.dt.nanoseconds SA01" \
         -i "pandas.Series.dt.normalize PR01" \
         -i "pandas.Series.dt.quarter SA01" \
         -i "pandas.Series.dt.qyear GL08" \
         -i "pandas.Series.dt.round PR01,PR02,SA01" \
-        -i "pandas.Series.dt.second SA01" \
         -i "pandas.Series.dt.seconds SA01" \
         -i "pandas.Series.dt.strftime PR01,PR02" \
         -i "pandas.Series.dt.time SA01" \
@@ -356,27 +334,20 @@ if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then
         -i "pandas.Series.dt.tz_convert PR01,PR02,RT03" \
         -i "pandas.Series.dt.tz_localize PR01,PR02" \
         -i "pandas.Series.dt.unit GL08" \
-        -i "pandas.Series.dt.year SA01" \
         -i "pandas.Series.dtype SA01" \
-        -i "pandas.Series.dtypes SA01" \
         -i "pandas.Series.empty GL08" \
         -i "pandas.Series.eq PR07,SA01" \
-        -i "pandas.Series.ffill SA01" \
-        -i "pandas.Series.first_valid_index SA01" \
         -i "pandas.Series.floordiv PR07" \
         -i "pandas.Series.ge PR07,SA01" \
-        -i "pandas.Series.get SA01" \
         -i "pandas.Series.gt PR07,SA01" \
         -i "pandas.Series.hasnans SA01" \
         -i "pandas.Series.infer_objects RT03" \
         -i "pandas.Series.is_monotonic_decreasing SA01" \
         -i "pandas.Series.is_monotonic_increasing SA01" \
         -i "pandas.Series.is_unique SA01" \
         -i "pandas.Series.item SA01" \
-        -i "pandas.Series.keys SA01" \
         -i "pandas.Series.kurt RT03,SA01" \
         -i "pandas.Series.kurtosis RT03,SA01" \
-        -i "pandas.Series.last_valid_index SA01" \
         -i "pandas.Series.le PR07,SA01" \
         -i "pandas.Series.list.__getitem__ SA01" \
         -i "pandas.Series.list.flatten SA01" \
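
Most of the entries removed above carried only the `SA01` code, which in numpydoc's docstring validation flags a docstring that lacks a "See Also" section. Dropping an entry from the ignore list therefore suggests the docstring now has that section. A minimal sketch of a docstring in that shape follows; `first_valid_position` and `last_valid_position` are hypothetical helpers written for illustration and are not pandas code.

```python
def first_valid_position(values):
    """
    Return the position of the first value that is not None.

    Parameters
    ----------
    values : list
        Values to scan from the start.

    Returns
    -------
    int or None
        Position of the first non-None value, or None if there is none.

    See Also
    --------
    last_valid_position : Return the position of the last non-None value.
    """
    for position, value in enumerate(values):
        if value is not None:
            return position
    return None


def last_valid_position(values):
    """
    Return the position of the last value that is not None.

    Parameters
    ----------
    values : list
        Values to scan from the end.

    Returns
    -------
    int or None
        Position of the last non-None value, or None if there is none.

    See Also
    --------
    first_valid_position : Return the position of the first non-None value.
    """
    for position in range(len(values) - 1, -1, -1):
        if values[position] is not None:
            return position
    return None
```
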

doc/source/user_guide/io.rst

Lines changed: 2 additions & 9 deletions
@@ -1949,13 +1949,6 @@ Writing in ISO date format, with microseconds:
    json = dfd.to_json(date_format="iso", date_unit="us")
    json
 
-Epoch timestamps, in seconds:
-
-.. ipython:: python
-
-   json = dfd.to_json(date_format="epoch", date_unit="s")
-   json
-
 Writing to a file, with a date index and a date column:
 
 .. ipython:: python
@@ -1965,7 +1958,7 @@ Writing to a file, with a date index and a date column:
    dfj2["ints"] = list(range(5))
    dfj2["bools"] = True
    dfj2.index = pd.date_range("20130101", periods=5)
-   dfj2.to_json("test.json")
+   dfj2.to_json("test.json", date_format="iso")
 
    with open("test.json") as fh:
        print(fh.read())
@@ -2140,7 +2133,7 @@ Dates written in nanoseconds need to be read back in nanoseconds:
 .. ipython:: python
 
    from io import StringIO
-   json = dfj2.to_json(date_unit="ns")
+   json = dfj2.to_json(date_format="iso", date_unit="ns")
 
    # Try to parse timestamps as milliseconds -> Won't Work
    dfju = pd.read_json(StringIO(json), date_unit="ms")
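
The documentation changes above replace implicit epoch output with an explicit `date_format="iso"`. A minimal sketch of that call pattern, assuming a small frame shaped like the `dfj2` example (the variable names here are illustrative, not taken from the docs):

```python
from io import StringIO

import pandas as pd

# A small frame with a date index, similar in shape to the docs' dfj2.
df = pd.DataFrame(
    {"ints": list(range(5)), "bools": True},
    index=pd.date_range("20130101", periods=5),
)

# Ask for ISO 8601 strings explicitly rather than relying on the
# epoch default for date_format.
iso_json = df.to_json(date_format="iso")
print(iso_json)

# read_json expects a file-like object, so wrap the string in StringIO.
restored = pd.read_json(StringIO(iso_json))
print(restored["ints"].tolist())
```
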

doc/source/whatsnew/v3.0.0.rst

Lines changed: 8 additions & 3 deletions
@@ -37,6 +37,7 @@ Other enhancements
 - Support reading value labels from Stata 108-format (Stata 6) and earlier files (:issue:`58154`)
 - Users can globally disable any ``PerformanceWarning`` by setting the option ``mode.performance_warnings`` to ``False`` (:issue:`56920`)
 - :meth:`Styler.format_index_names` can now be used to format the index and column names (:issue:`48936` and :issue:`47489`)
+- :class:`.errors.DtypeWarning` improved to include column names when mixed data types are detected (:issue:`58174`)
 - :meth:`DataFrame.cummin`, :meth:`DataFrame.cummax`, :meth:`DataFrame.cumprod` and :meth:`DataFrame.cumsum` methods now have a ``numeric_only`` parameter (:issue:`53072`)
 - :meth:`DataFrame.fillna` and :meth:`Series.fillna` can now accept ``value=None``; for non-object dtype the corresponding NA value will be used (:issue:`57723`)
 
@@ -195,6 +196,7 @@ Other Deprecations
 - Deprecated allowing non-keyword arguments in :meth:`DataFrame.all`, :meth:`DataFrame.min`, :meth:`DataFrame.max`, :meth:`DataFrame.sum`, :meth:`DataFrame.prod`, :meth:`DataFrame.mean`, :meth:`DataFrame.median`, :meth:`DataFrame.sem`, :meth:`DataFrame.var`, :meth:`DataFrame.std`, :meth:`DataFrame.skew`, :meth:`DataFrame.kurt`, :meth:`Series.all`, :meth:`Series.min`, :meth:`Series.max`, :meth:`Series.sum`, :meth:`Series.prod`, :meth:`Series.mean`, :meth:`Series.median`, :meth:`Series.sem`, :meth:`Series.var`, :meth:`Series.std`, :meth:`Series.skew`, and :meth:`Series.kurt`. (:issue:`57087`)
 - Deprecated allowing non-keyword arguments in :meth:`Series.to_markdown` except ``buf``. (:issue:`57280`)
 - Deprecated allowing non-keyword arguments in :meth:`Series.to_string` except ``buf``. (:issue:`57280`)
+- Deprecated using ``epoch`` date format in :meth:`DataFrame.to_json` and :meth:`Series.to_json`, use ``iso`` instead. (:issue:`57063`)
 -
 
 .. ---------------------------------------------------------------------------
@@ -204,6 +206,7 @@ Removal of prior version deprecations/changes
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 - :class:`.DataFrameGroupBy.idxmin`, :class:`.DataFrameGroupBy.idxmax`, :class:`.SeriesGroupBy.idxmin`, and :class:`.SeriesGroupBy.idxmax` will now raise a ``ValueError`` when used with ``skipna=False`` and an NA value is encountered (:issue:`10694`)
 - :func:`concat` no longer ignores empty objects when determining output dtypes (:issue:`39122`)
+- :func:`concat` with all-NA entries no longer ignores the dtype of those entries when determining the result dtype (:issue:`40893`)
 - :func:`read_excel`, :func:`read_json`, :func:`read_html`, and :func:`read_xml` no longer accept raw string or byte representation of the data. That type of data must be wrapped in a :py:class:`StringIO` or :py:class:`BytesIO` (:issue:`53767`)
 - :meth:`DataFrame.groupby` with ``as_index=False`` and aggregation methods will no longer exclude from the result the groupings that do not arise from the input (:issue:`49519`)
 - :meth:`Series.dt.to_pydatetime` now returns a :class:`Series` of :py:class:`datetime.datetime` objects (:issue:`52459`)
@@ -219,6 +222,7 @@ Removal of prior version deprecations/changes
 - Disallow units other than "s", "ms", "us", "ns" for datetime64 and timedelta64 dtypes in :func:`array` (:issue:`53817`)
 - Removed "freq" keyword from :class:`PeriodArray` constructor, use "dtype" instead (:issue:`52462`)
 - Removed 'fastpath' keyword in :class:`Categorical` constructor (:issue:`20110`)
+- Removed 'kind' keyword in :meth:`Series.resample` and :meth:`DataFrame.resample` (:issue:`58125`)
 - Removed alias :class:`arrays.PandasArray` for :class:`arrays.NumpyExtensionArray` (:issue:`53694`)
 - Removed deprecated "method" and "limit" keywords from :meth:`Series.replace` and :meth:`DataFrame.replace` (:issue:`53492`)
 - Removed extension test classes ``BaseNoReduceTests``, ``BaseNumericReduceTests``, ``BaseBooleanReduceTests`` (:issue:`54663`)
@@ -331,6 +335,7 @@ Performance improvements
 - Performance improvement in :meth:`RangeIndex.reindex` returning a :class:`RangeIndex` instead of a :class:`Index` when possible. (:issue:`57647`, :issue:`57752`)
 - Performance improvement in :meth:`RangeIndex.take` returning a :class:`RangeIndex` instead of a :class:`Index` when possible. (:issue:`57445`, :issue:`57752`)
 - Performance improvement in :func:`merge` if hash-join can be used (:issue:`57970`)
+- Performance improvement in :meth:`to_hdf` avoid unnecessary reopenings of the HDF5 file to speedup data addition to files with a very large number of groups . (:issue:`58248`)
 - Performance improvement in ``DataFrameGroupBy.__len__`` and ``SeriesGroupBy.__len__`` (:issue:`57595`)
 - Performance improvement in indexing operations for string dtypes (:issue:`56997`)
 - Performance improvement in unary methods on a :class:`RangeIndex` returning a :class:`RangeIndex` instead of a :class:`Index` when possible. (:issue:`57825`)
@@ -386,7 +391,7 @@ Interval
 
 Indexing
 ^^^^^^^^
--
+- Bug in :meth:`DataFrame.__getitem__` returning modified columns when called with ``slice`` in Python 3.12 (:issue:`57500`)
 -
 
 Missing
@@ -396,7 +401,7 @@ Missing
 
 MultiIndex
 ^^^^^^^^^^
--
+- :func:`DataFrame.loc` with ``axis=0`` and :class:`MultiIndex` when setting a value adds extra columns (:issue:`58116`)
 -
 
 I/O
@@ -406,7 +411,6 @@ I/O
 - Bug in :meth:`DataFrame.to_string` that raised ``StopIteration`` with nested DataFrames. (:issue:`16098`)
 - Bug in :meth:`read_csv` raising ``TypeError`` when ``index_col`` is specified and ``na_values`` is a dict containing the key ``None``. (:issue:`57547`)
 
-
 Period
 ^^^^^^
 -
@@ -415,6 +419,7 @@ Period
 Plotting
 ^^^^^^^^
 - Bug in :meth:`.DataFrameGroupBy.boxplot` failed when there were multiple groupings (:issue:`14701`)
+- Bug in :meth:`DataFrame.plot` that causes a shift to the right when the frequency multiplier is greater than one. (:issue:`57587`)
 -
 
 Groupby/resample/rolling
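
One of the release notes touched by this file says that :func:`read_json` and the related readers no longer accept a raw string of data, and that such data must be wrapped in `StringIO` or `BytesIO`. A minimal sketch of the wrapped call, with a made-up two-row payload and an explicit `orient="records"`:

```python
from io import StringIO

import pandas as pd

# Illustrative JSON payload; the contents are made up for this example.
raw = '[{"a": 1, "b": "x"}, {"a": 2, "b": "y"}]'

# Wrap the in-memory string in a file-like object before passing it on.
df = pd.read_json(StringIO(raw), orient="records")
print(df)
```
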

environment.yml

Lines changed: 1 addition & 0 deletions
@@ -89,6 +89,7 @@ dependencies:
   - numpydoc
   - pydata-sphinx-theme=0.14
   - pytest-cython  # doctest
+  - docutils < 0.21  # https://github.com/sphinx-doc/sphinx/issues/12302
   - sphinx
   - sphinx-design
   - sphinx-copybutton
