DOC: clarify purpose of DataFrame.from_csv (GH4191) #10163

Merged

jorisvandenbossche
Member

Closes #9556

xref #9568, #4916, #4191

However, while writing this up, I started to doubt a bit if this is necessary.
DataFrame.from_csv is implemented as a round-trip method together with to_csv. If you use a plain df.to_csv(path), you cannot read it back with pd.read_csv(path) and get exactly the same DataFrame. You need at least pd.read_csv(path, index_col=0), plus the parse_dates keyword if you have datetimes.
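
A minimal sketch of the round-trip point above, assuming a recent pandas and using an in-memory buffer in place of a file path. The frame contents and the names `plain` and `restored` are illustrative, not taken from this PR:

```python
import io

import pandas as pd

# Illustrative frame with a named datetime index (an assumption for the example).
df = pd.DataFrame(
    {"value": [1.5, 2.5, 3.5]},
    index=pd.date_range("2015-01-01", periods=3, name="date"),
)

# Plain to_csv: the index is written out as the first column.
buf = io.StringIO()
df.to_csv(buf)

# Plain read_csv: the former index comes back as an ordinary column.
buf.seek(0)
plain = pd.read_csv(buf)

# Telling read_csv which column is the index, and to parse it as dates,
# recovers the original layout.
buf.seek(0)
restored = pd.read_csv(buf, index_col=0, parse_dates=True)

print(list(plain.columns))
print(type(restored.index).__name__, restored.index.name)
```

With the plain call, `date` is just another column next to `value`; with `index_col=0` and `parse_dates=True` it is a datetime index again.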

Secondly, there is also the question of Series.from_csv. It would be logical to deprecate this as well, but there is no direct alternative for it (although pd.read_csv(path, index_col=0)[0] will work). There is also, for example, this SO answer by Wes: http://stackoverflow.com/questions/13557559/how-to-write-read-pandas-series-to-from-csv (and it is used in the Python for Data Analysis book).

@jorisvandenbossche added the IO Data, IO CSV, and Deprecate labels and removed the IO Data label on May 18, 2015
@jorisvandenbossche added this to the 0.17.0 milestone on May 18, 2015
@jreback
Contributor

jreback commented May 18, 2015

pd.read_csv(path, index_col=0, squeeze=True) should work for Series.from_csv.
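
A rough sketch of that Series round trip, under some assumptions: it uses the `DataFrame.squeeze("columns")` method rather than the `squeeze=True` keyword suggested above, because whether `read_csv` accepts that keyword depends on the pandas version. The data and names are illustrative only:

```python
import io

import pandas as pd

# Illustrative Series with a named index (an assumption for the example).
s = pd.Series(
    [10, 20, 30],
    index=pd.Index(["a", "b", "c"], name="key"),
    name="count",
)

# Write with an explicit header so the result does not depend on the
# version-specific default of Series.to_csv.
buf = io.StringIO()
s.to_csv(buf, header=True)

# read_csv returns a one-column DataFrame; squeezing the column axis
# turns it back into a Series.
buf.seek(0)
restored = pd.read_csv(buf, index_col=0).squeeze("columns")

print(type(restored).__name__, restored.name)
```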

I am also not strongly in favor of actually deprecating this particular method (as you say, it's the inverse of to_csv).

@jreback
Contributor

jreback commented Jun 26, 2015

let's push this off / close. Not sure this is necessary.

@jreback
Contributor

jreback commented Aug 5, 2015

let's just update the docs a tiny bit to note that DataFrame.to_csv <-> DataFrame.from_csv (and put it on pd.read_csv somewhere?). I don't think we can actually deprecate this.

@jorisvandenbossche
Member Author

Yes, it was on my to-do list to update this PR with only the docs part (and not the actual deprecation).
I think we can still encourage users in the docstring to use read_csv, but note that from_csv is useful if you want an exact opposite of to_csv.

@jorisvandenbossche
Member Author

which is actually the part from the docstring that I now removed ... :-)

@jorisvandenbossche changed the title from "DEPR: deprecate DataFrame.from_csv (GH4191)" to "DOC: clarify purpose of DataFrame.from_csv (GH4191)" on Aug 6, 2015
@jorisvandenbossche
Member Author

@jreback updated it. Thoughts?

I still have to do the same for Series.from_csv

@jreback
Contributor

jreback commented Aug 6, 2015

lgtm. DISCOURAGED should be an official term!

@jreback
Contributor

jreback commented Aug 20, 2015

lgtm

@jorisvandenbossche
Member Author

updated with similar text for Series.from_csv

jorisvandenbossche added a commit that referenced this pull request Aug 21, 2015
DOC: clarify purpose of DataFrame.from_csv (GH4191)
@jorisvandenbossche merged commit bd804aa into pandas-dev:master on Aug 21, 2015
Labels
Deprecate, IO CSV
Development

Successfully merging this pull request may close these issues.

DataFrame.from_csv undocumented behavior of index_col
2 participants