Commit 98d0f55

Merge branch 'master' into feature/series-info
2 parents d41ecf1 + 5a74e97 commit 98d0f55

141 files changed: +3274 -2065 lines changed


.pre-commit-config.yaml

Lines changed: 6 additions & 0 deletions
@@ -119,6 +119,12 @@ repos:
   entry: python scripts/validate_unwanted_patterns.py --validation-type="private_function_across_module"
   types: [python]
   exclude: ^(asv_bench|pandas/tests|doc)/
+- id: inconsistent-namespace-usage
+  name: 'Check for inconsistent use of pandas namespace in tests'
+  entry: python scripts/check_for_inconsistent_pandas_namespace.py
+  language: python
+  types: [python]
+  files: ^pandas/tests/
 - id: FrameOrSeriesUnion
   name: Check for use of Union[Series, DataFrame] instead of FrameOrSeriesUnion alias
   entry: Union\[.*(Series.*DataFrame|DataFrame.*Series).*\]
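
The `entry` of the `FrameOrSeriesUnion` hook is a regular expression rather than a script. The sketch below shows which lines that pattern would flag, using Python's `re` module. The helper name `is_flagged` and the sample strings are made up for illustration, and matching line by line is an assumption: the hook's `language` key falls outside this hunk.

```python
import re

# Pattern copied verbatim from the hook's ``entry`` key.
PATTERN = re.compile(r"Union\[.*(Series.*DataFrame|DataFrame.*Series).*\]")


def is_flagged(line: str) -> bool:
    """Return True if the line spells out a Series/DataFrame Union by hand."""
    return PATTERN.search(line) is not None


if __name__ == "__main__":
    # Hypothetical annotations, not taken from the pandas codebase.
    for sample in (
        "def f(obj: Union[Series, DataFrame]) -> None: ...",
        "def g(obj: Union[DataFrame, Series, None]) -> None: ...",
        "def h(obj: FrameOrSeriesUnion) -> None: ...",
        "def k(obj: Union[int, str]) -> None: ...",
    ):
        print(is_flagged(sample), sample)
```

Only the first two samples match: the pattern needs both class names inside a single `Union[...]`, in either order.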

ci/code_checks.sh

Lines changed: 2 additions & 15 deletions
@@ -37,12 +37,6 @@ function invgrep {
     return $((! $EXIT_STATUS))
 }

-function check_namespace {
-    local -r CLASS=${1}
-    grep -R -l --include "*.py" " ${CLASS}(" pandas/tests | xargs grep -n "pd\.${CLASS}[(\.]"
-    test $? -gt 0
-}
-
 if [[ "$GITHUB_ACTIONS" == "true" ]]; then
     FLAKE8_FORMAT="##[error]%(path)s:%(row)s:%(col)s:%(code)s:%(text)s"
     INVGREP_PREPEND="##[error]"
@@ -120,13 +114,6 @@ if [[ -z "$CHECK" || "$CHECK" == "patterns" ]]; then
     MSG='Check for use of {foo!r} instead of {repr(foo)}' ; echo $MSG
     invgrep -R --include=*.{py,pyx} '!r}' pandas
     RET=$(($RET + $?)) ; echo $MSG "DONE"
-
-    # -------------------------------------------------------------------------
-    MSG='Check for inconsistent use of pandas namespace in tests' ; echo $MSG
-    for class in "Series" "DataFrame" "Index" "MultiIndex" "Timestamp" "Timedelta" "TimedeltaIndex" "DatetimeIndex" "Categorical"; do
-        check_namespace ${class}
-        RET=$(($RET + $?))
-    done
     echo $MSG "DONE"
 fi

@@ -238,8 +225,8 @@ fi
 ### DOCSTRINGS ###
 if [[ -z "$CHECK" || "$CHECK" == "docstrings" ]]; then

-    MSG='Validate docstrings (GL03, GL04, GL05, GL06, GL07, GL09, GL10, SS04, SS05, PR03, PR04, PR05, PR10, EX04, RT01, RT04, RT05, SA02, SA03)' ; echo $MSG
-    $BASE_DIR/scripts/validate_docstrings.py --format=actions --errors=GL03,GL04,GL05,GL06,GL07,GL09,GL10,SS04,SS05,PR03,PR04,PR05,PR10,EX04,RT01,RT04,RT05,SA02,SA03
+    MSG='Validate docstrings (GL03, GL04, GL05, GL06, GL07, GL09, GL10, SS02, SS04, SS05, PR03, PR04, PR05, PR10, EX04, RT01, RT04, RT05, SA02, SA03)' ; echo $MSG
+    $BASE_DIR/scripts/validate_docstrings.py --format=actions --errors=GL03,GL04,GL05,GL06,GL07,GL09,GL10,SS02,SS04,SS05,PR03,PR04,PR05,PR10,EX04,RT01,RT04,RT05,SA02,SA03
     RET=$(($RET + $?)) ; echo $MSG "DONE"

     MSG='Validate correct capitalization among titles in documentation' ; echo $MSG
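
The `check_namespace` shell function removed from this file flagged test files that call a class both bare (`Series(`) and through the namespace (`pd.Series(` or `pd.Series.`). The replacement script, `scripts/check_for_inconsistent_pandas_namespace.py`, is not part of the visible diff. What follows is therefore only a rough Python sketch of the removed grep logic, not the script's actual implementation. The function name `find_inconsistent_usage` is hypothetical, and the character class is slightly simplified relative to the grep pattern.

```python
import re
from typing import List, Tuple

# Class names copied from the removed ``for class in ...`` loop.
CLASSES = (
    "Series", "DataFrame", "Index", "MultiIndex", "Timestamp",
    "Timedelta", "TimedeltaIndex", "DatetimeIndex", "Categorical",
)


def find_inconsistent_usage(source: str, class_name: str) -> List[Tuple[int, str]]:
    """Return (line number, line) pairs that use ``pd.<class_name>`` in a
    source that also calls ``<class_name>(`` without the ``pd.`` prefix."""
    # First grep: does the file call the class without the namespace?
    if f" {class_name}(" not in source:
        return []
    # Second grep: report every namespaced use in that same file.
    prefixed = re.compile(rf"pd\.{re.escape(class_name)}[(.]")
    return [
        (number, line)
        for number, line in enumerate(source.splitlines(), start=1)
        if prefixed.search(line)
    ]


if __name__ == "__main__":
    # Hypothetical test-file content mixing both spellings.
    mixed = (
        "import pandas as pd\n"
        "from pandas import Series\n"
        "\n"
        "s = Series([1])\n"
        "t = pd.Series([2])\n"
    )
    for name in CLASSES:
        for number, line in find_inconsistent_usage(mixed, name):
            print(f"{number}: {line}")
```

A file that uses only one of the two spellings yields no results, matching the shell version, which searched for `pd.` uses only in files that also contained a bare call.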

doc/source/development/code_style.rst

Lines changed: 3 additions & 2 deletions
@@ -12,8 +12,9 @@ pandas code style guide
 pandas follows the `PEP8 <https://www.python.org/dev/peps/pep-0008/>`_
 standard and uses `Black <https://black.readthedocs.io/en/stable/>`_
 and `Flake8 <https://flake8.pycqa.org/en/latest/>`_ to ensure a
-consistent code format throughout the project. For details see the
-:ref:`contributing guide to pandas<contributing.code-formatting>`.
+consistent code format throughout the project. We encourage you to use
+:ref:`pre-commit <contributing.pre-commit>` to automatically run ``black``,
+``flake8``, ``isort``, and related code checks when you make a git commit.

 Patterns
 ========

doc/source/development/contributing.rst

Lines changed: 41 additions & 44 deletions
@@ -638,7 +638,46 @@ In addition to ``./ci/code_checks.sh``, some extra checks are run by
 ``pre-commit`` - see :ref:`here <contributing.pre-commit>` for how to
 run them.

-Additional standards are outlined on the :ref:`pandas code style guide <code_style>`
+Additional standards are outlined on the :ref:`pandas code style guide <code_style>`.
+
+.. _contributing.pre-commit:
+
+Pre-commit
+----------
+
+You can run many of these styling checks manually as we have described above. However,
+we encourage you to use `pre-commit hooks <https://pre-commit.com/>`_ instead
+to automatically run ``black``, ``flake8``, ``isort`` when you make a git commit. This
+can be done by installing ``pre-commit``::
+
+    pip install pre-commit
+
+and then running::
+
+    pre-commit install
+
+from the root of the pandas repository. Now all of the styling checks will be
+run each time you commit changes without your needing to run each one manually.
+In addition, using ``pre-commit`` will also allow you to more easily
+remain up-to-date with our code checks as they change.
+
+Note that if needed, you can skip these checks with ``git commit --no-verify``.
+
+If you don't want to use ``pre-commit`` as part of your workflow, you can still use it
+to run its checks with::
+
+    pre-commit run --files <files you have modified>
+
+without needing to have done ``pre-commit install`` beforehand.
+
+.. note::
+
+   If you have conflicting installations of ``virtualenv``, then you may get an
+   error - see `here <https://github.com/pypa/virtualenv/issues/1875>`_.
+
+   Also, due to a `bug in virtualenv <https://github.com/pypa/virtualenv/issues/1986>`_,
+   you may run into issues if you're using conda. To solve this, you can downgrade
+   ``virtualenv`` to version ``20.0.33``.

 Optional dependencies
 ---------------------
@@ -712,7 +751,7 @@ Python (PEP8 / black)
 pandas follows the `PEP8 <https://www.python.org/dev/peps/pep-0008/>`_ standard
 and uses `Black <https://black.readthedocs.io/en/stable/>`_ and
 `Flake8 <http://flake8.pycqa.org/en/latest/>`_ to ensure a consistent code
-format throughout the project.
+format throughout the project. We encourage you to use :ref:`pre-commit <contributing.pre-commit>`.

 :ref:`Continuous Integration <contributing.ci>` will run those tools and
 report any stylistic errors in your code. Therefore, it is helpful before
@@ -727,9 +766,6 @@ apply ``black`` as you edit files.
 You should use a ``black`` version 20.8b1 as previous versions are not compatible
 with the pandas codebase.

-If you wish to run these checks automatically, we encourage you to use
-:ref:`pre-commits <contributing.pre-commit>` instead.
-
 One caveat about ``git diff upstream/master -u -- "*.py" | flake8 --diff``: this
 command will catch any stylistic errors in your changes specifically, but
 be beware it may not catch all of them. For example, if you delete the only
@@ -807,45 +843,6 @@ Where similar caveats apply if you are on OSX or Windows.

 You can then verify the changes look ok, then git :ref:`commit <contributing.commit-code>` and :ref:`push <contributing.push-code>`.

-.. _contributing.pre-commit:
-
-Pre-commit
-~~~~~~~~~~
-
-You can run many of these styling checks manually as we have described above. However,
-we encourage you to use `pre-commit hooks <https://pre-commit.com/>`_ instead
-to automatically run ``black``, ``flake8``, ``isort`` when you make a git commit. This
-can be done by installing ``pre-commit``::
-
-    pip install pre-commit
-
-and then running::
-
-    pre-commit install
-
-from the root of the pandas repository. Now all of the styling checks will be
-run each time you commit changes without your needing to run each one manually.
-In addition, using this pre-commit hook will also allow you to more easily
-remain up-to-date with our code checks as they change.
-
-Note that if needed, you can skip these checks with ``git commit --no-verify``.
-
-If you don't want to use ``pre-commit`` as part of your workflow, you can still use it
-to run its checks by running::
-
-    pre-commit run --files <files you have modified>
-
-without having to have done ``pre-commit install`` beforehand.
-
-.. note::
-
-   If you have conflicting installations of ``virtualenv``, then you may get an
-   error - see `here <https://github.com/pypa/virtualenv/issues/1875>`_.
-
-   Also, due to a `bug in virtualenv <https://github.com/pypa/virtualenv/issues/1986>`_,
-   you may run into issues if you're using conda. To solve this, you can downgrade
-   ``virtualenv`` to version ``20.0.33``.
-
 Backwards compatibility
 ~~~~~~~~~~~~~~~~~~~~~~~

doc/source/whatsnew/v1.1.5.rst

Lines changed: 2 additions & 0 deletions
@@ -24,6 +24,8 @@ Fixed regressions
 Bug fixes
 ~~~~~~~~~
 - Bug in metadata propagation for ``groupby`` iterator (:issue:`37343`)
+- Bug in indexing on a :class:`Series` with ``CategoricalDtype`` after unpickling (:issue:`37631`)
+- Bug in :class:`RollingGroupby` with the resulting :class:`MultiIndex` when grouping by a label that is in the index (:issue:`37641`)
 -

 .. ---------------------------------------------------------------------------

doc/source/whatsnew/v1.2.0.rst

Lines changed: 11 additions & 2 deletions
@@ -232,6 +232,9 @@ Other enhancements
 - :meth:`testing.assert_index_equal` now has a ``check_order`` parameter that allows indexes to be checked in an order-insensitive manner (:issue:`37478`)
 - :meth:`Series.info` has been added, for compatibility with :meth:`DataFrame.info` (:issue:`5167`)
 - :func:`read_csv` supports memory-mapping for compressed files (:issue:`37621`)
+- Improve error reporting for :meth:`DataFrame.merge()` when invalid merge column definitions were given (:issue:`16228`)
+- Improve numerical stability for :meth:`Rolling.skew()`, :meth:`Rolling.kurt()`, :meth:`Expanding.skew()` and :meth:`Expanding.kurt()` through implementation of Kahan summation (:issue:`6929`)
+- Improved error reporting for subsetting columns of a :class:`DataFrameGroupBy` with ``axis=1`` (:issue:`37725`)

 .. _whatsnew_120.api_breaking.python:
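
The enhancement entry above credits Kahan summation for the improved numerical stability of the rolling and expanding `skew`/`kurt` methods. The pandas implementation is not part of the visible diff, so the following is only a textbook sketch of the technique in plain Python. The names `naive_sum` and `kahan_sum` are illustrative and do not come from pandas.

```python
from typing import Iterable


def naive_sum(values: Iterable[float]) -> float:
    """Plain left-to-right accumulation, for comparison."""
    total = 0.0
    for value in values:
        total += value
    return total


def kahan_sum(values: Iterable[float]) -> float:
    """Compensated (Kahan) summation: carry the rounding error of each
    addition forward and subtract it from the next term."""
    total = 0.0
    compensation = 0.0  # running estimate of the low-order bits lost so far
    for value in values:
        corrected = value - compensation
        new_total = total + corrected
        # (new_total - total) is what was actually added; the difference
        # from ``corrected`` is the part that rounding discarded.
        compensation = (new_total - total) - corrected
        total = new_total
    return total


if __name__ == "__main__":
    # Each 1e-16 term is below half the spacing of floats near 1.0, so
    # the naive loop never moves off 1.0, while the compensated loop
    # accumulates the small terms.
    data = [1.0] + [1e-16] * 10_000
    print(naive_sum(data))
    print(kahan_sum(data))
```

The exact sum of `data` is `1 + 1e-12`. The compensated loop lands within rounding error of that value, whereas the naive loop returns exactly `1.0`.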

@@ -381,7 +384,7 @@ Categorical
 ^^^^^^^^^^^
 - :meth:`Categorical.fillna` will always return a copy, will validate a passed fill value regardless of whether there are any NAs to fill, and will disallow a ``NaT`` as a fill value for numeric categories (:issue:`36530`)
 - Bug in :meth:`Categorical.__setitem__` that incorrectly raised when trying to set a tuple value (:issue:`20439`)
--
+- Bug in :meth:`CategoricalIndex.equals` incorrectly casting non-category entries to ``np.nan`` (:issue:`37667`)

 Datetimelike
 ^^^^^^^^^^^^
@@ -413,7 +416,7 @@ Timezones
 ^^^^^^^^^

 - Bug in :func:`date_range` was raising AmbiguousTimeError for valid input with ``ambiguous=False`` (:issue:`35297`)
--
+- Bug in :meth:`Timestamp.replace` was losing fold information (:issue:`37610`)


 Numeric
@@ -466,6 +469,8 @@ Indexing
 - Bug in indexing on a :class:`Series` or :class:`DataFrame` with a :class:`MultiIndex` with a level named "0" (:issue:`37194`)
 - Bug in :meth:`Series.__getitem__` when using an unsigned integer array as an indexer giving incorrect results or segfaulting instead of raising ``KeyError`` (:issue:`37218`)
 - Bug in :meth:`Index.where` incorrectly casting numeric values to strings (:issue:`37591`)
+- Bug in :meth:`Series.loc` and :meth:`DataFrame.loc` raises when numeric label was given for object :class:`Index` although label was in :class:`Index` (:issue:`26491`)
+- Bug in :meth:`DataFrame.loc` returned requested key plus missing values when ``loc`` was applied to single level from :class:`MultiIndex` (:issue:`27104`)

 Missing
 ^^^^^^^
@@ -504,13 +509,16 @@ I/O
 - Bug in :class:`HDFStore` was dropping timezone information when exporting :class:`Series` with ``datetime64[ns, tz]`` dtypes with a fixed HDF5 store (:issue:`20594`)
 - :func:`read_csv` was closing user-provided binary file handles when ``engine="c"`` and an ``encoding`` was requested (:issue:`36980`)
 - Bug in :meth:`DataFrame.to_hdf` was not dropping missing rows with ``dropna=True`` (:issue:`35719`)
+- Bug in :func:`read_html` was raising a ``TypeError`` when supplying a ``pathlib.Path`` argument to the ``io`` parameter (:issue:`37705`)

 Plotting
 ^^^^^^^^

 - Bug in :meth:`DataFrame.plot` was rotating xticklabels when ``subplots=True``, even if the x-axis wasn't an irregular time series (:issue:`29460`)
 - Bug in :meth:`DataFrame.plot` where a marker letter in the ``style`` keyword sometimes causes a ``ValueError`` (:issue:`21003`)
 - Twinned axes were losing their tick labels which should only happen to all but the last row or column of 'externally' shared axes (:issue:`33819`)
+- Bug in :meth:`DataFrameGroupBy.boxplot` when ``subplots=False``, a KeyError would raise (:issue:`16748`)
+

 Groupby/resample/rolling
 ^^^^^^^^^^^^^^^^^^^^^^^^
@@ -576,6 +584,7 @@ Other
 - Bug in :meth:`Index.union` behaving differently depending on whether operand is a :class:`Index` or other list-like (:issue:`36384`)
 - Passing an array with 2 or more dimensions to the :class:`Series` constructor now raises the more specific ``ValueError``, from a bare ``Exception`` previously (:issue:`35744`)
 - Bug in ``accessor.DirNamesMixin``, where ``dir(obj)`` wouldn't show attributes defined on the instance (:issue:`37173`).
+- Bug in :meth:`Series.nunique` with ``dropna=True`` was returning incorrect results when both ``NA`` and ``None`` missing values were present (:issue:`37566`)

 .. ---------------------------------------------------------------------------

pandas/_libs/tslibs/nattype.pyx

Lines changed: 23 additions & 7 deletions
@@ -418,7 +418,6 @@ class NaTType(_NaT):
     utctimetuple = _make_error_func("utctimetuple", datetime)
     timetz = _make_error_func("timetz", datetime)
     timetuple = _make_error_func("timetuple", datetime)
-    strftime = _make_error_func("strftime", datetime)
     isocalendar = _make_error_func("isocalendar", datetime)
     dst = _make_error_func("dst", datetime)
     ctime = _make_error_func("ctime", datetime)
@@ -435,6 +434,23 @@ class NaTType(_NaT):
     # The remaining methods have docstrings copy/pasted from the analogous
     # Timestamp methods.

+    strftime = _make_error_func(
+        "strftime",
+        """
+        Timestamp.strftime(format)
+
+        Return a string representing the given POSIX timestamp
+        controlled by an explicit format string.
+
+        Parameters
+        ----------
+        format : str
+            Format string to convert Timestamp to string.
+            See strftime documentation for more information on the format string:
+            https://docs.python.org/3/library/datetime.html#strftime-and-strptime-behavior.
+        """,
+    )
+
     strptime = _make_error_func(
         "strptime",
         """
@@ -457,15 +473,15 @@ class NaTType(_NaT):
         """
         Timestamp.fromtimestamp(ts)

-        timestamp[, tz] -> tz's local time from POSIX timestamp.
+        Transform timestamp[, tz] to tz's local time from POSIX timestamp.
         """,
     )
     combine = _make_error_func(
         "combine",
         """
         Timestamp.combine(date, time)

-        date, time -> datetime with same date and time fields.
+        Combine date, time into datetime with same date and time fields.
         """,
     )
     utcnow = _make_error_func(
@@ -606,7 +622,7 @@ timedelta}, default 'raise'
     floor = _make_nat_func(
         "floor",
         """
-        return a new Timestamp floored to this resolution.
+        Return a new Timestamp floored to this resolution.

         Parameters
         ----------
@@ -645,7 +661,7 @@ timedelta}, default 'raise'
     ceil = _make_nat_func(
         "ceil",
         """
-        return a new Timestamp ceiled to this resolution.
+        Return a new Timestamp ceiled to this resolution.

         Parameters
         ----------
@@ -761,7 +777,7 @@ default 'raise'
     replace = _make_nat_func(
         "replace",
         """
-        implements datetime.replace, handles nanoseconds.
+        Implements datetime.replace, handles nanoseconds.

         Parameters
         ----------
@@ -774,7 +790,7 @@ default 'raise'
         microsecond : int, optional
         nanosecond : int, optional
         tzinfo : tz-convertible, optional
-        fold : int, optional, default is 0
+        fold : int, optional

         Returns
         -------

pandas/_libs/tslibs/period.pyx

Lines changed: 1 addition & 1 deletion
@@ -2316,7 +2316,7 @@ class Period(_Period):
     freq : str, default None
         One of pandas period strings or corresponding objects.
     ordinal : int, default None
-        The period offset from the gregorian proleptic epoch.
+        The period offset from the proleptic Gregorian epoch.
     year : int, default None
         Year value of the period.
     month : int, default 1
