Add remove_from_default_na options to read_csv, read_excel... #55280

oda · 2023-09-25T12:36:43Z

closes #xxxx (Replace xxxx with the GitHub issue number)
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added type annotations to new arguments/methods/functions.
Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

Related to #52493.
As noted there, pd.read_csv(io.StringIO("a\nNone")).a[0] returns the string 'None' on pandas 1 but NaN on pandas 2.

pandas uses NaN to represent missing values, but in real data the string "None" often carries meaning, for example 'no option satisfies the condition'.
Although this was an unintentional breaking change, it is now considered too late to revert.

I ran into this problem while upgrading the libraries we use.
So I propose adding a parameter, 'remove_from_default_na':

read_excel(path, remove_from_default_na="None")

This would also be a straightforward solution to #19156.

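Without this parameter, the workaround looks something like this:

import pandas as pd
from pandas._libs.parsers import STR_NA_VALUES  # private module; may change between releases
# Use every default NA string except "None", and switch off the built-in defaults.
pd.read_excel(path, na_values=STR_NA_VALUES - {"None"}, keep_default_na=False)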
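For anyone who wants to try the workaround without an Excel file, here is a minimal sketch using read_csv on an in-memory string. The helper name default_na_without is hypothetical and not part of pandas; it only approximates what the proposed parameter would do. It relies on the same private import as above, so treat it as an assumption that STR_NA_VALUES stays importable from that location.

```python
import io

import pandas as pd
# Private module, as in the workaround above; its location may change between releases.
from pandas._libs.parsers import STR_NA_VALUES


def default_na_without(*keep):
    """Return pandas' default NA strings minus those in `keep` (hypothetical helper)."""
    return sorted(STR_NA_VALUES - set(keep))


# Two data rows: "None" should stay a string, "NA" should still be parsed as missing.
csv_text = "a\nNone\nNA\n"
df = pd.read_csv(
    io.StringIO(csv_text),
    na_values=default_na_without("None"),
    keep_default_na=False,  # use only the list passed in na_values
)
print(df)
```

With keep_default_na=False, only the strings passed in na_values are treated as missing, so "None" is kept as text while "NA" still becomes NaN.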

Thank you for the great library.

mroeschke · 2023-09-25T18:19:40Z

Thanks for the pull request, but it looks like this new feature breaks a lot of checks, and we require discussion and buy-in from core devs before moving forward with a pull request. I would suggest providing more context in one of your linked issues before moving forward here, so closing for now.

oda added 4 commits September 25, 2023 20:52

Add remove_from_default_na options to read_csv, read_excel...

669adb9

Nit: Fix typo and lint.

8fe299c

Fix docstring.

1733f4b

Nit: Fix lint.

94a9a18

mroeschke closed this Sep 25, 2023

oda deleted the remove_from_default_na branch September 26, 2023 00:32
