DOC: Clarify ExcelFile's available engine compatibility with file types in... #34261

Merged 5 commits on Jun 1, 2020
29 changes: 21 additions & 8 deletions pandas/io/excel/_base.py
@@ -34,7 +34,7 @@
"""
Read an Excel file into a pandas DataFrame.

-Supports `xls`, `xlsx`, `xlsm`, `xlsb`, and `odf` file extensions
+Supports `xls`, `xlsx`, `xlsm`, `xlsb`, `odf`, `ods` and `odt` file extensions
read from a local filesystem or URL. Supports an option to read
a single sheet or a list of sheets.

@@ -103,7 +103,12 @@
of dtype conversion.
engine : str, default None
If io is not a buffer or path, this must be set to identify io.
-Acceptable values are None, "xlrd", "openpyxl" or "odf".
+Supported engines: "xlrd", "openpyxl", "odf", "pyxlsb", default "xlrd".
+Engine compatibility :
+- "xlrd" supports most old/new Excel file formats.
+- "openpyxl" supports newer Excel file formats.
+- "odf" supports OpenDocument file formats (.odf, .ods, .odt).
+- "pyxlsb" supports Binary Excel files.
converters : dict, default None
Dict of functions for converting values in certain columns. Keys can
either be integers or column labels, values are functions that take one
@@ -785,17 +790,24 @@ def close(self):
class ExcelFile:
"""
Class for parsing tabular excel sheets into DataFrame objects.
-Uses xlrd. See read_excel for more documentation
+
+Uses xlrd engine by default. See read_excel for more documentation

Parameters
----------
io : str, path object (pathlib.Path or py._path.local.LocalPath),
- a file-like object, xlrd workbook or openpypl workbook.
-If a string or path object, expected to be a path to xls, xlsx or odf file.
+ a file-like object, xlrd workbook or openpypl workbook.
+ If a string or path object, expected to be a path to a
+ .xls, .xlsx, .xlsb, .xlsm, .odf, .ods, or .odt file.
engine : str, default None
If io is not a buffer or path, this must be set to identify io.
-Acceptable values are None, ``xlrd``, ``openpyxl``, ``odf``, or ``pyxlsb``.
-Note that ``odf`` reads tables out of OpenDocument formatted files.
+Supported engines: ``xlrd``, ``openpyxl``, ``odf``, ``pyxlsb``,
+default ``xlrd``.
+Engine compatibility :
+- ``xlrd`` supports most old/new Excel file formats.
+- ``openpyxl`` supports newer Excel file formats.
+- ``odf`` supports OpenDocument file formats (.odf, .ods, .odt).
+- ``pyxlsb`` supports Binary Excel files.
"""

from pandas.io.excel._odfreader import _ODFReader
@@ -817,7 +829,8 @@ def __init__(self, io, engine=None):
raise ValueError(f"Unknown engine: {engine}")

self.engine = engine
-# could be a str, ExcelFile, Book, etc.
+
+# Could be a str, ExcelFile, Book, etc.
self.io = io
Contributor

can we type this? (not sure it's easy)

Contributor

Out of scope for this PR, I think.

Contributor

sure

# Always a string
self._io = stringify_path(io)
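
The engine compatibility list in the docstrings above can be illustrated with a minimal sketch. The helper below is hypothetical and not part of pandas: `pick_engine`, `SUPPORTED_ENGINES` and `ENGINE_BY_SUFFIX` are names invented for this example. It maps a file extension to one of the four engine names from this diff and, when an engine is passed explicitly, mirrors the `Unknown engine` check in `ExcelFile.__init__`. The mapping itself is an assumption read off the docstring; in particular, routing `.xlsx` and `.xlsm` to `openpyxl` is a choice made here, whereas the documented default engine is `xlrd`.

```python
from pathlib import Path

# Engine names as listed in the docstrings in this diff.
SUPPORTED_ENGINES = {"xlrd", "openpyxl", "odf", "pyxlsb"}

# Hypothetical extension-to-engine mapping, derived from the
# "Engine compatibility" list; pandas does not expose this table.
ENGINE_BY_SUFFIX = {
    ".xls": "xlrd",
    ".xlsx": "openpyxl",
    ".xlsm": "openpyxl",
    ".xlsb": "pyxlsb",
    ".odf": "odf",
    ".ods": "odf",
    ".odt": "odf",
}


def pick_engine(path, engine=None):
    """Return an engine name for ``path``.

    An explicit ``engine`` is validated against the supported names;
    otherwise the engine is chosen from the file extension.
    """
    if engine is not None:
        if engine not in SUPPORTED_ENGINES:
            # Same message shape as the check in ExcelFile.__init__.
            raise ValueError(f"Unknown engine: {engine}")
        return engine

    suffix = Path(path).suffix.lower()
    if suffix not in ENGINE_BY_SUFFIX:
        raise ValueError(f"No engine known for extension {suffix!r}")
    return ENGINE_BY_SUFFIX[suffix]
```

The result could then be passed on, for example as `pd.ExcelFile(path, engine=pick_engine(path))`, assuming the corresponding reader package is installed.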