to_hdf(mode='a') overwrites HDF table, does not append #14625

Closed
MaxPowerWasTaken opened this issue Nov 9, 2016 · 2 comments
Labels
IO HDF5 read_hdf, HDFStore Usage Question

Comments

@MaxPowerWasTaken

This seems to be the same issue as #4584, and it still seems to persist.

I'm using pandas 0.19.1. The docs state that for to_hdf(), mode='a' means "Append; an existing file is opened for reading and writing":
http://pandas.pydata.org/pandas-docs/version/0.19.1/generated/pandas.DataFrame.to_hdf.html

But the following example shows that to_hdf(..., mode='a') overwrites the table instead of appending to it:

import numpy as np
import pandas as pd

# Four frames of 100 rows each, with identical columns
df1 = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=list('ABCD'))
df2 = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=list('ABCD'))
df3 = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=list('ABCD'))
df4 = pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=list('ABCD'))

# Write all four frames to the same key in the same file
for df in (df1, df2, df3, df4):
    df.to_hdf('example.h5', 'my_hdf_table', mode='a', format='table')

store = pd.HDFStore('example.h5')
print(store.items)
# prints "/my_hdf_table             frame_table  (typ->appendable,nrows->100,..."
# i.e. 100 rows, not the 400 I expected
store.close()

However, df.to_hdf('example.h5', 'my_hdf_table2', format='table', append=True) does append rather than overwrite.

Maybe this is a documentation issue rather than a code issue? Why is mode='a' described as "append" if a separate, non-default option append=True is still required to actually append?

As always, thanks for all the work on a fantastic library.

@jreback
Contributor

jreback commented Nov 9, 2016

you misunderstand

mode is about the file itself

a node can be stored in different formats (e.g. a table can be appended to, while fixed cannot)

if you think that the documentation is unclear please submit a PR for a change
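
To illustrate the point that mode concerns the file rather than the node, here is a minimal sketch. It assumes a recent pandas with PyTables installed; the file name and the keys "first", "second" and "third" are hypothetical. It only lists which keys exist after each write.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical scratch file in a fresh temporary directory
path = os.path.join(tempfile.mkdtemp(), "mode_demo.h5")
df = pd.DataFrame(np.arange(8).reshape(4, 2), columns=list("AB"))

# mode='a' opens the existing file, so the second key is stored next to the first
df.to_hdf(path, key="first", mode="w", format="table")
df.to_hdf(path, key="second", mode="a", format="table")
with pd.HDFStore(path, mode="r") as store:
    keys_after_a = sorted(store.keys())

# mode='w' recreates the file, so only the key written last remains
df.to_hdf(path, key="third", mode="w", format="table")
with pd.HDFStore(path, mode="r") as store:
    keys_after_w = sorted(store.keys())

print(keys_after_a)
print(keys_after_w)
```

In both cases the mode decides what happens to the file as a whole; whether rows are added to an existing node is a separate question.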

@jreback jreback closed this as completed Nov 9, 2016
@jreback jreback added IO HDF5 read_hdf, HDFStore Usage Question labels Nov 9, 2016
@jreback jreback added this to the No action milestone Nov 9, 2016
@AlyShmahell

Kudos for noticing, @MaxPowerWasTaken.
What @jreback said is true, and at the moment the doc seems to be clear:

mode : {‘a’, ‘w’, ‘r+’}, default ‘a’
    Mode to open file:
         ‘w’: write, a new file is created (an existing file with the same name would be deleted).
         ‘a’: append, an existing file is opened for reading and writing, and if the file does not exist it is created.
         ‘r+’: similar to ‘a’, but the file must already exist.
append : bool, default False
    For Table formats, append the input data to the existing.
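
As a rough sketch of how the two parameters interact, the loop from the original report can be run once with the default append=False and once with append=True. This assumes a recent pandas with PyTables installed; the file name and the keys "overwritten" and "appended" are made up for illustration.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical scratch file in a fresh temporary directory
path = os.path.join(tempfile.mkdtemp(), "append_demo.h5")

# Four frames of 100 rows each, as in the original report
frames = [
    pd.DataFrame(np.random.randint(0, 100, size=(100, 4)), columns=list("ABCD"))
    for _ in range(4)
]

# Default append=False: each call replaces the node, even though mode='a'
for df in frames:
    df.to_hdf(path, key="overwritten", mode="a", format="table")

# append=True with format='table': each call adds rows to the existing node
for df in frames:
    df.to_hdf(path, key="appended", mode="a", format="table", append=True)

rows_overwritten = len(pd.read_hdf(path, key="overwritten"))
rows_appended = len(pd.read_hdf(path, key="appended"))
print(rows_overwritten, rows_appended)
```

Both loops use mode='a', so the file is kept between calls; only the append flag changes whether the rows of the node accumulate.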
