Upstream sync #1


Merged · 72 commits · Jan 1, 2020
Commits
e817fff
CI: troubleshoot Web_and_Docs failing (#30534)
simonjayhawkins Dec 29, 2019
2ec41bd
CLN: Update old .format to f-string (#30547)
Dec 30, 2019
af84f45
TYP: check_untyped_defs astype_nansafe (#30548)
simonjayhawkins Dec 30, 2019
f08437e
CLN: Remove dead IntervalIndex code (#30545)
jschendel Dec 30, 2019
59a1485
CI: Travis default to 3.7 (#30540)
ShaharNaveh Dec 30, 2019
c76f810
CLN: update check_untyped_defs setup.cfg (#30535)
simonjayhawkins Dec 30, 2019
cf99831
CLN: ops cleanups (#30530)
jbrockmendel Dec 30, 2019
60d5200
REF: collect Index setops tests (#30529)
jbrockmendel Dec 30, 2019
0f86ddb
remove kwargs from (DataFrame|Series).fillna (#30528)
topper-123 Dec 30, 2019
6822e43
TST: tests for preserving views (#30523)
jbrockmendel Dec 30, 2019
012b0bd
POC/TST: dynamic xfail (#30521)
jbrockmendel Dec 30, 2019
85d9e54
TST: fix maybe_promote dt64tz case, 323 xfails (#30506)
jbrockmendel Dec 30, 2019
a6a8440
TYP: check_untyped_defs core.computation.eval (#30551)
simonjayhawkins Dec 30, 2019
db062da
REF: implement cumulative ops block-wise (#29872)
jbrockmendel Dec 30, 2019
11284f5
:white_check_mark: Test DataFrame.append with other dtypes (#30558)
MarcoGorelli Dec 30, 2019
9c40e06
REF: Refactor window/test_moments.py (#30542)
charlesdong1991 Dec 30, 2019
844dc4a
API: Uses pd.NA in IntegerArray (#29964)
TomAugspurger Dec 30, 2019
b82b0c9
CI: Building docs in GitHub actions (#29874)
datapythonista Dec 30, 2019
733eb77
CI: Checks job aborted if a step fails (#30303)
suzutomato Dec 30, 2019
fd7db98
DOC: Document behaviour of head(n), tail(n) for negative values of n …
bharatr21 Dec 30, 2019
cfffad9
fstring in io.sql (#30026)
lexy-lixinyu Dec 31, 2019
17d19c4
CLN: assorted cleanups (#30575)
jbrockmendel Dec 31, 2019
95be077
TYP: Add return types to some top-level func (#30565)
topper-123 Dec 31, 2019
c068313
BUG: Change IntervalDtype.kind from None to "O" (#30569)
jschendel Dec 31, 2019
4b0f75d
TYP: check_untyped_defs io.json._normalize (#30573)
simonjayhawkins Dec 31, 2019
392c7b8
TYP: check_untyped_defs various (#30572)
simonjayhawkins Dec 31, 2019
d1f82f7
TYP: check_untyped_defs pandas.core.computation.align (#30550)
simonjayhawkins Dec 31, 2019
19578e3
BUG: TypeError in groupby pct_change when fill_method is None (#30532)
fujiaxiang Dec 31, 2019
f8a0989
BUG: hash_pandas_object fails on array containing tuple #28969 (#30508)
jbrockmendel Dec 31, 2019
7670262
STY: Concat string (#30579)
ShaharNaveh Dec 31, 2019
c0b2a7f
pandas\core\common.py:273: error: Implicit generic "Any". Use "typing…
simonjayhawkins Dec 29, 2019
5511ea4
pandas\core\arrays\categorical.py:514: error: Implicit generic "Any".…
simonjayhawkins Dec 29, 2019
c6ba046
pandas\core\indexing.py:2227: error: Implicit generic "Any". Use "typ…
simonjayhawkins Dec 29, 2019
8aba25d
pandas\core\groupby\grouper.py:422: error: Implicit generic "Any". Us…
simonjayhawkins Dec 29, 2019
d76a8fb
pandas\tests\frame\methods\test_replace.py:15: error: Implicit generi…
simonjayhawkins Dec 29, 2019
be30edc
pandas\tests\frame\methods\test_replace.py:20: error: Implicit generi…
simonjayhawkins Dec 29, 2019
fbea6a6
pandas\io\pytables.py:1462: error: Implicit generic "Any". Use "typin…
simonjayhawkins Dec 29, 2019
4b61db6
np.nan is float
simonjayhawkins Dec 29, 2019
d6f01b8
Iterable -> Collection
simonjayhawkins Dec 29, 2019
b805290
update per comments
simonjayhawkins Dec 30, 2019
b760466
fix typo
simonjayhawkins Dec 30, 2019
052ac7b
replace typvar in union
simonjayhawkins Dec 30, 2019
aab7d7c
TST: Regression testing for fixed issues (#30554)
mroeschke Dec 31, 2019
ee42275
PERF: Fixed performance regression in Series init (#30571)
TomAugspurger Dec 31, 2019
6322b8f
TST: XFAIL Travis read_html tests (#30544)
alimcmaster1 Dec 31, 2019
35ba6b0
ENH: Add StataWriter 118 for unicode support (#30285)
bashtage Dec 31, 2019
a29cee3
BUG: Disable parallel cythonize on Windows (GH 30214) (#30585)
Dr-Irv Jan 1, 2020
d788234
CLN: datetimelike EA and Index cleanups (#30591)
jbrockmendel Jan 1, 2020
11664c3
REF: share code between DatetimeIndex and TimedeltaIndex (#30587)
jbrockmendel Jan 1, 2020
bd78b32
CLN: Clean _test_moments_consistency in common.py (#30577)
charlesdong1991 Jan 1, 2020
6349c68
REF: separate casting out of Index.__new__ (#30586)
jbrockmendel Jan 1, 2020
ac3715b
CLN: remove no-longer-reachable addsub_int_array (#30592)
jbrockmendel Jan 1, 2020
7ecd9af
CLN: Clean test moments for expanding (#30566)
charlesdong1991 Jan 1, 2020
cd357c6
BLD: Fix IntervalTree build warnings (#30560)
jschendel Jan 1, 2020
ff26171
DOC: Make pyplot import explicit in the 10 minutes to pandas page (#3…
yuseitahara Jan 1, 2020
3c8030f
DOC/CLN: move NDFrame.groupby to (DataFrame|Series).groupby (#30314)
topper-123 Jan 1, 2020
bac9a1b
ENH: Allow scatter plot to plot objects and datetime type data (#30434)
charlesdong1991 Jan 1, 2020
f3642d2
BUG: DTA/TDA/PA add/sub object-dtype (#30594)
jbrockmendel Jan 1, 2020
8e892d4
REF: share join methods for DTI/TDI (#30595)
jbrockmendel Jan 1, 2020
e77a28a
REF: move inference/casting out of Index.__new__ (#30596)
jbrockmendel Jan 1, 2020
f9fb02e
Fix TypeError when pulling field that is None (#30145)
bolkedebruin Jan 1, 2020
2b0d1a1
share _wrap_joined_index (#30599)
jbrockmendel Jan 1, 2020
07ee00e
CLN: Use fstring instead of .format in io/excel and test/generic (#30…
bharatr21 Jan 1, 2020
7c9042a
BUG: Fix groupby.apply (#28662)
dsaxton Jan 1, 2020
a9423e3
CLN: Clean up of locale testing (#29883)
datapythonista Jan 1, 2020
efaadd5
TST: Add test for TypeError when using datetime.time in scatter plot …
charlesdong1991 Jan 1, 2020
765d8db
BUG: pass 2D ndarray and EA-dtype to DataFrame, closes #12513 (#30507)
jbrockmendel Jan 1, 2020
27b713b
BUG: Raise when casting NaT to int (#28492)
dsaxton Jan 1, 2020
0aa48f7
PERF: perform reductions block-wise (#29847)
jbrockmendel Jan 1, 2020
56b6561
DOC: .get_slice_bound in MultiIndex needs documentation. (#29967) (#3…
proost Jan 1, 2020
f146632
BUG: Fix pd.NA `na_rep` truncated in to_csv (#30146)
jbman223 Jan 1, 2020
0be573e
CLN: Remove int32 and float32 dtypes from IntervalTree (#30598)
jschendel Jan 1, 2020
82 changes: 72 additions & 10 deletions .github/workflows/ci.yml
@@ -23,53 +23,53 @@ jobs:

- name: Looking for unwanted patterns
run: ci/code_checks.sh patterns
if: true
if: always()

- name: Setup environment and build pandas
run: ci/setup_env.sh
if: true
if: always()

- name: Linting
run: |
source activate pandas-dev
ci/code_checks.sh lint
if: true
if: always()

- name: Dependencies consistency
run: |
source activate pandas-dev
ci/code_checks.sh dependencies
if: true
if: always()

- name: Checks on imported code
run: |
source activate pandas-dev
ci/code_checks.sh code
if: true
if: always()

- name: Running doctests
run: |
source activate pandas-dev
ci/code_checks.sh doctests
if: true
if: always()

- name: Docstring validation
run: |
source activate pandas-dev
ci/code_checks.sh docstrings
if: true
if: always()

- name: Typing validation
run: |
source activate pandas-dev
ci/code_checks.sh typing
if: true
if: always()

- name: Testing docstring validation script
run: |
source activate pandas-dev
pytest --capture=no --strict scripts
if: true
if: always()

- name: Running benchmarks
run: |
@@ -87,11 +87,73 @@ jobs:
else
echo "Benchmarks did not run, no changes detected"
fi
if: true
if: always()

- name: Publish benchmarks artifact
uses: actions/upload-artifact@master
with:
name: Benchmarks log
path: asv_bench/benchmarks.log
if: failure()

web_and_docs:
name: Web and docs
runs-on: ubuntu-latest
steps:

- name: Setting conda path
run: echo "::set-env name=PATH::${HOME}/miniconda3/bin:${PATH}"

- name: Checkout
uses: actions/checkout@v1

- name: Setup environment and build pandas
run: ci/setup_env.sh

- name: Build website
run: |
source activate pandas-dev
python web/pandas_web.py web/pandas --target-path=web/build

- name: Build documentation
run: |
source activate pandas-dev
doc/make.py --warnings-are-errors | tee sphinx.log ; exit ${PIPESTATUS[0]}

# This step can be removed once the ipython directive fails the build on errors,
# along with the `tee sphinx.log` in the previous step (https://github.com/ipython/ipython/issues/11547)
- name: Check ipython directive errors
run: "! grep -B1 \"^<<<-------------------------------------------------------------------------$\" sphinx.log"

- name: Merge website and docs
run: |
mkdir -p pandas_web/docs
cp -r web/build/* pandas_web/
cp -r doc/build/html/* pandas_web/docs/
if: github.event_name == 'push'

- name: Install Rclone
run: sudo apt install rclone -y
if: github.event_name == 'push'

- name: Set up Rclone
run: |
RCLONE_CONFIG_PATH=$HOME/.config/rclone/rclone.conf
mkdir -p `dirname $RCLONE_CONFIG_PATH`
echo "[ovh_cloud_pandas_web]" > $RCLONE_CONFIG_PATH
echo "type = swift" >> $RCLONE_CONFIG_PATH
echo "env_auth = false" >> $RCLONE_CONFIG_PATH
echo "auth_version = 3" >> $RCLONE_CONFIG_PATH
echo "auth = https://auth.cloud.ovh.net/v3/" >> $RCLONE_CONFIG_PATH
echo "endpoint_type = public" >> $RCLONE_CONFIG_PATH
echo "tenant_domain = default" >> $RCLONE_CONFIG_PATH
echo "tenant = 2977553886518025" >> $RCLONE_CONFIG_PATH
echo "domain = default" >> $RCLONE_CONFIG_PATH
echo "user = w4KGs3pmDxpd" >> $RCLONE_CONFIG_PATH
echo "key = ${{ secrets.ovh_object_store_key }}" >> $RCLONE_CONFIG_PATH
echo "region = BHS" >> $RCLONE_CONFIG_PATH
if: github.event_name == 'push'

- name: Sync web
run: rclone sync pandas_web ovh_cloud_pandas_web:dev
if: github.event_name == 'push'
7 changes: 1 addition & 6 deletions .travis.yml
@@ -1,5 +1,5 @@
language: python
python: 3.5
python: 3.7

# To turn off cached cython files and compiler cache
# set NOCACHE-true
@@ -48,17 +48,12 @@ matrix:
- mysql
- postgresql

# In allow_failures
- env:
- JOB="3.6, slow" ENV_FILE="ci/deps/travis-36-slow.yaml" PATTERN="slow" SQL="1"
services:
- mysql
- postgresql

allow_failures:
- env:
- JOB="3.6, slow" ENV_FILE="ci/deps/travis-36-slow.yaml" PATTERN="slow" SQL="1"

before_install:
- echo "before_install"
# set non-blocking IO on travis
28 changes: 21 additions & 7 deletions ci/azure/posix.yml
@@ -19,18 +19,24 @@ jobs:
ENV_FILE: ci/deps/azure-36-minimum_versions.yaml
CONDA_PY: "36"
PATTERN: "not slow and not network"

py36_locale_slow_old_np:
ENV_FILE: ci/deps/azure-36-locale_slow.yaml
CONDA_PY: "36"
PATTERN: "slow"
LOCALE_OVERRIDE: "zh_CN.UTF-8"
# pandas does not use the language (zh_CN), but should support different encodings (utf8)
# we should test with encodings other than utf8, but Ubuntu does not seem to support any
LANG: "zh_CN.utf8"
LC_ALL: "zh_CN.utf8"
EXTRA_APT: "language-pack-zh-hans"

py36_locale:
ENV_FILE: ci/deps/azure-36-locale.yaml
CONDA_PY: "36"
PATTERN: "not slow and not network"
LOCALE_OVERRIDE: "it_IT.UTF-8"
LANG: "it_IT.utf8"
LC_ALL: "it_IT.utf8"
EXTRA_APT: "language-pack-it"

py36_32bit:
ENV_FILE: ci/deps/azure-36-32bit.yaml
@@ -42,7 +48,9 @@ jobs:
ENV_FILE: ci/deps/azure-37-locale.yaml
CONDA_PY: "37"
PATTERN: "not slow and not network"
LOCALE_OVERRIDE: "zh_CN.UTF-8"
LANG: "zh_CN.utf8"
LC_ALL: "zh_CN.utf8"
EXTRA_APT: "language-pack-zh-hans"

py37_np_dev:
ENV_FILE: ci/deps/azure-37-numpydev.yaml
@@ -54,10 +62,16 @@

steps:
- script: |
if [ "$(uname)" == "Linux" ]; then sudo apt-get install -y libc6-dev-i386 $EXTRA_APT; fi
echo '##vso[task.prependpath]$(HOME)/miniconda3/bin'
echo "Creating Environment"
ci/setup_env.sh
if [ "$(uname)" == "Linux" ]; then
sudo apt-get update
sudo apt-get install -y libc6-dev-i386 $EXTRA_APT
fi
displayName: 'Install extra packages'

- script: echo '##vso[task.prependpath]$(HOME)/miniconda3/bin'
displayName: 'Set conda path'

- script: ci/setup_env.sh
displayName: 'Setup environment and build pandas'

- script: |
2 changes: 1 addition & 1 deletion ci/azure/windows.yml
@@ -34,7 +34,7 @@ jobs:
- bash: |
source activate pandas-dev
conda list
python setup.py build_ext -q -i
python setup.py build_ext -q -i -j 4
python -m pip install --no-build-isolation -e .
displayName: 'Build'

2 changes: 1 addition & 1 deletion ci/deps/azure-36-locale_slow.yaml
@@ -13,7 +13,7 @@ dependencies:
- pytest-azurepipelines

# pandas dependencies
- beautifulsoup4==4.6.0
- beautifulsoup4=4.6.0
- bottleneck=1.2.*
- lxml
- matplotlib=2.2.2
11 changes: 0 additions & 11 deletions ci/run_tests.sh
@@ -5,17 +5,6 @@
# https://github.com/pytest-dev/pytest/issues/1075
export PYTHONHASHSEED=$(python -c 'import random; print(random.randint(1, 4294967295))')

if [ -n "$LOCALE_OVERRIDE" ]; then
export LC_ALL="$LOCALE_OVERRIDE"
export LANG="$LOCALE_OVERRIDE"
PANDAS_LOCALE=`python -c 'import pandas; pandas.get_option("display.encoding")'`
if [[ "$LOCALE_OVERRIDE" != "$PANDAS_LOCALE" ]]; then
echo "pandas could not detect the locale. System locale: $LOCALE_OVERRIDE, pandas detected: $PANDAS_LOCALE"
# TODO Not really aborting the tests until https://github.com/pandas-dev/pandas/issues/23923 is fixed
# exit 1
fi
fi

if [[ "not network" == *"$PATTERN"* ]]; then
export http_proxy=http://1.2.3.4 https_proxy=http://1.2.3.4;
fi
6 changes: 3 additions & 3 deletions ci/setup_env.sh
@@ -1,15 +1,15 @@
#!/bin/bash -e

# edit the locale file if needed
if [ -n "$LOCALE_OVERRIDE" ]; then
if [[ "$(uname)" == "Linux" && -n "$LC_ALL" ]]; then
echo "Adding locale to the first line of pandas/__init__.py"
rm -f pandas/__init__.pyc
SEDC="3iimport locale\nlocale.setlocale(locale.LC_ALL, '$LOCALE_OVERRIDE')\n"
SEDC="3iimport locale\nlocale.setlocale(locale.LC_ALL, '$LC_ALL')\n"
sed -i "$SEDC" pandas/__init__.py

echo "[head -4 pandas/__init__.py]"
head -4 pandas/__init__.py
echo
sudo locale-gen "$LOCALE_OVERRIDE"
fi

MINICONDA_DIR="$HOME/miniconda3"
3 changes: 2 additions & 1 deletion doc/source/getting_started/10min.rst
@@ -697,8 +697,9 @@ Plotting

See the :ref:`Plotting <visualization>` docs.

We use the standard convention for referencing the matplotlib API:

.. ipython:: python
:suppress:

import matplotlib.pyplot as plt
plt.close('all')
28 changes: 28 additions & 0 deletions doc/source/user_guide/integer_na.rst
@@ -15,6 +15,10 @@ Nullable integer data type
IntegerArray is currently experimental. Its API or implementation may
change without warning.

.. versionchanged:: 1.0.0

Now uses :attr:`pandas.NA` as the missing value rather
than :attr:`numpy.nan`.

In :ref:`missing_data`, we saw that pandas primarily uses ``NaN`` to represent
missing data. Because ``NaN`` is a float, this forces an array of integers with
@@ -23,6 +27,9 @@ much. But if your integer column is, say, an identifier, casting to float can
be problematic. Some integers cannot even be represented as floating point
numbers.
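
As a quick illustration (the specific value here is chosen arbitrarily, and the usual ``import pandas as pd`` is assumed), an integer just above ``2**53`` has no exact 64-bit float representation, while the nullable ``Int64`` dtype keeps it exact alongside missing values:

.. ipython:: python

    import pandas as pd

    # 2**53 + 1 has no exact float64 representation, so the cast silently rounds it
    float(2**53 + 1) == float(2**53)

    # the nullable integer dtype stores the exact value next to a missing entry
    pd.array([2**53 + 1, None], dtype="Int64")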

Construction
------------

Pandas can represent integer data with possibly missing values using
:class:`arrays.IntegerArray`. This is an :ref:`extension type <extending.extension-types>`
implemented within pandas.
@@ -39,6 +46,12 @@ NumPy's ``'int64'`` dtype:

pd.array([1, 2, np.nan], dtype="Int64")

All NA-like values are replaced with :attr:`pandas.NA`.

.. ipython:: python

pd.array([1, 2, np.nan, None, pd.NA], dtype="Int64")

This array can be stored in a :class:`DataFrame` or :class:`Series` like any
NumPy array.
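
For instance, a minimal sketch (assuming ``pd`` is imported as above):

.. ipython:: python

    # wrapping the extension array in a Series preserves the Int64 dtype
    s = pd.Series(pd.array([1, 2, None], dtype="Int64"))
    s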

Expand Down Expand Up @@ -78,6 +91,9 @@ with the dtype.
In the future, we may provide an option for :class:`Series` to infer a
nullable-integer dtype.

Operations
----------

Operations involving an integer array will behave similarly to NumPy arrays.
Missing values will be propagated, and the data will be coerced to another
dtype if needed.
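
A small illustrative sketch of the propagation (array values chosen arbitrarily):

.. ipython:: python

    arr = pd.array([1, 2, None], dtype="Int64")

    # the missing entry propagates; the result is still Int64 with <NA> in the last slot
    arr + 1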
@@ -123,3 +139,15 @@ Reduction and groupby operations such as 'sum' work as well.

df.sum()
df.groupby('B').A.sum()

Scalar NA Value
---------------

:class:`arrays.IntegerArray` uses :attr:`pandas.NA` as its scalar
missing value. Slicing a single element that's missing will return
:attr:`pandas.NA`.

.. ipython:: python

a = pd.array([1, None], dtype="Int64")
a[1]