CI prints Closing remaining open files:...test_hdf_empty_dataframe0/data.h5 #10204

Closed
graingert opened this issue Apr 19, 2023 · 4 comments · Fixed by #10205
Labels
needs triage: Needs a response from a contributor

Comments

@graingert
Member

Example: https://github.com/dask/dask/actions/runs/4727453904/jobs/8388080950#step:8:24759

This prints: Closing remaining open files:/tmp/pytest-of-runner/pytest-0/popen-gw0/test_hdf_empty_dataframe0/data.h5...done

We're using `with pd.HDFStore(...):` correctly, so this could be a pandas or PyTables bug.

github-actions bot added the "needs triage" (Needs a response from a contributor) label on Apr 19, 2023
@graingert
Member Author

I can reproduce this with pure pandas:

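import pandas as pd
import pytest


def test_hdf_empty_dataframe(tmp_path):
    pytest.importorskip("tables")
    # https://github.com/dask/dask/issues/8707

    # Write an empty DataFrame in "fixed" format, then read it back with stop=0.
    df = pd.DataFrame({"A": [], "B": []}, index=[])
    df.to_hdf(tmp_path / "data.h5", format="fixed", key="df", mode="w")
    try:
        pd.read_hdf(tmp_path / "data.h5", "/df", mode="r", stop=0)
    except IndexError:
        # The error is ignored here; data.h5 is then listed under
        # "Closing remaining open files" in the test output.
        pass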
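
One plausible reading of the reproducer is that an exception escapes the read before the file handle is closed. The sketch below illustrates that general pattern with a stand-in class. It is an assumption about the failure mode, not a description of pandas or PyTables internals, and `FakeStore`, `leaky_read`, `safe_read`, and `handles_left_open` are hypothetical names that appear in neither library.

```python
class FakeStore:
    """Stand-in for a file-backed store; NOT the pandas HDFStore class."""

    open_count = 0  # number of handles currently open

    def __init__(self):
        FakeStore.open_count += 1
        self.is_open = True

    def select(self, stop=None):
        # Mimic a read that fails on an empty dataset.
        raise IndexError("empty dataset")

    def close(self):
        if self.is_open:
            self.is_open = False
            FakeStore.open_count -= 1

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False  # do not swallow the exception


def leaky_read():
    # close() runs only on the success path, so an exception skips it.
    store = FakeStore()
    result = store.select(stop=0)
    store.close()
    return result


def safe_read():
    # The context manager closes the handle on every exit path.
    with FakeStore() as store:
        return store.select(stop=0)


def handles_left_open(read):
    """Run `read`, ignore IndexError as the reproducer does, and report leaks."""
    before = FakeStore.open_count
    try:
        read()
    except IndexError:
        pass
    return FakeStore.open_count - before
```

Calling `handles_left_open(leaky_read)` reports one handle still open, while `handles_left_open(safe_read)` reports none. This is why wrapping the read in a `with` block on the caller's side is usually enough, and why a leak despite that points at code further down the stack.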

@graingert
Member Author

Raised upstream as pandas-dev/pandas#52781.

@jrbourbeau
Member

Thanks @graingert! Should we close this issue in favor of the upstream pandas issue then?

graingert added a commit to graingert/dask that referenced this issue Apr 19, 2023
closes dask#10204

this is a performance optimization that also bypasses the
pandas bug pandas-dev/pandas#52781
@graingert
Member Author

@jrbourbeau I have a performance enhancement that side-steps the pandas bug: #10205
