Add remove_from_default_na options to read_csv, read_excel... #55280

Closed
wants to merge 4 commits into from

@oda commented Sep 25, 2023

  • closes #xxxx (Replace xxxx with the GitHub issue number)
  • Tests added and passed if fixing a bug or adding a new feature
  • All code checks passed.
  • Added type annotations to new arguments/methods/functions.
  • Added an entry in the latest doc/source/whatsnew/vX.X.X.rst file if fixing a bug or adding a new feature.

Related: #52493
As noted there, pd.read_csv(io.StringIO("a\nNone")).a[0] is 'None' on pandas 1 but NaN on pandas 2.
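A minimal sketch of how to check this on an installed pandas version. The sample CSV text and variable names are illustrative assumptions; only read_csv and its keep_default_na option are taken from pandas itself.

```python
import io

import pandas as pd

# Illustrative data: one column holding the literal strings "None", "N/A" and "x".
csv_text = "a\nNone\nN/A\nx"

# Default parsing: "N/A" is always treated as missing; whether "None" is
# treated as missing depends on the installed pandas version.
default = pd.read_csv(io.StringIO(csv_text))

# With keep_default_na=False and no na_values, no string is treated as missing.
literal = pd.read_csv(io.StringIO(csv_text), keep_default_na=False)

print(pd.__version__)
print(default["a"].tolist())
print(literal["a"].tolist())
```

Turning off the defaults entirely keeps "None" as text, but it also keeps "N/A" and every other default marker as text, which is why a narrower option is being proposed.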

pandas uses NaN to represent missing values, but in real data "None" is often a meaningful value, for example 'no option satisfies the condition'.
Although this was an unintentional breaking change, it is considered too late to fix now.

I ran into this problem while updating the libraries we use.
So I suggest adding a parameter, 'remove_from_default_na':

read_excel(path, remove_from_default_na="None")

This would also be a straightforward solution to the issue below.
#19156

Without this parameter, the workaround would be something like:

import pandas as pd
from pandas._libs.parsers import STR_NA_VALUES  # private module
# Keep every default NA string except "None"
pd.read_excel(path, na_values=STR_NA_VALUES - {"None"}, keep_default_na=False)
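The workaround above can be sketched as a runnable example. It uses read_csv with an in-memory string instead of read_excel so that it needs no file or Excel engine. The helper name na_values_without is hypothetical and not part of pandas, and STR_NA_VALUES comes from a private module that may change between releases.

```python
import io

import pandas as pd
from pandas._libs.parsers import STR_NA_VALUES  # private; location may change


def na_values_without(*keep_as_text):
    """Hypothetical helper: default NA strings minus those to keep as text."""
    return sorted(STR_NA_VALUES - set(keep_as_text))


# Illustrative data: "None" should stay a string, "NA" should still be missing.
csv_text = "a\nNone\nNA\nx"

df = pd.read_csv(
    io.StringIO(csv_text),
    na_values=na_values_without("None"),
    keep_default_na=False,  # use only the list passed in na_values
)

print(df["a"].tolist())
```

Passing keep_default_na=False matters here: without it, pandas would add its defaults back on top of na_values and "None" could be treated as missing again.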

Thank you for the great library.

@mroeschke
Member

Thanks for the pull request, but it looks like this new feature breaks a lot of checks, and we require discussion and buy-in from core devs before moving forward with a pull request. I would suggest providing more context in one of your linked issues before moving forward here, so closing for now.

@mroeschke closed this Sep 25, 2023
@oda deleted the remove_from_default_na branch September 26, 2023 00:32