Commit 96eb4c9

BUG: parse_dates=False while passing date_parser tries to use date parser (#44599)

* BUG: Providing fix for GH#44366
* BUG: fix pre-commit run
* BUG: adding other API changes for change of default argument
* Rewriting if statement and adding test
* BUG: changing pd.Series to Series to validate pre-commit hook
* Removing the second whatsnew statement only to keep the bug fix one
* BUG: simplifying the test with parser and parse_dates False #44366
* BUG: fixing pre-commit #44366

1 parent f0d4d8d commit 96eb4c9

3 files changed, 44 additions and 4 deletions

doc/source/whatsnew/v1.4.0.rst

Lines changed: 1 addition & 0 deletions
@@ -663,6 +663,7 @@ I/O
 - Bug in :func:`read_csv` raising ``ValueError`` when ``parse_dates`` was used with ``MultiIndex`` columns (:issue:`8991`)
 - Bug in :func:`read_csv` raising ``AttributeError`` when attempting to read a .csv file and infer index column dtype from an nullable integer type (:issue:`44079`)
 - :meth:`DataFrame.to_csv` and :meth:`Series.to_csv` with ``compression`` set to ``'zip'`` no longer create a zip file containing a file ending with ".zip". Instead, they try to infer the inner file name more smartly. (:issue:`39465`)
+- Bug in :func:`read_csv` when passing simultaneously a parser in ``date_parser`` and ``parse_dates=False``, the parsing was still called (:issue:`44366`)
 
 Period
 ^^^^^^

pandas/io/parsers/readers.py

Lines changed: 10 additions & 4 deletions
@@ -510,9 +510,15 @@ def _read(
     filepath_or_buffer: FilePath | ReadCsvBuffer[bytes] | ReadCsvBuffer[str], kwds
 ):
     """Generic reader of line files."""
-    if kwds.get("date_parser", None) is not None:
-        if isinstance(kwds["parse_dates"], bool):
-            kwds["parse_dates"] = True
+    # if we pass a date_parser and parse_dates=False, we should not parse the
+    # dates GH#44366
+    if (
+        kwds.get("date_parser", None) is not None
+        and kwds.get("parse_dates", None) is None
+    ):
+        kwds["parse_dates"] = True
+    elif kwds.get("parse_dates", None) is None:
+        kwds["parse_dates"] = False
 
     # Extract some of the arguments (pass chunksize on).
     iterator = kwds.get("iterator", False)
@@ -585,7 +591,7 @@ def read_csv(
     verbose=False,
     skip_blank_lines=True,
     # Datetime Handling
-    parse_dates=False,
+    parse_dates=None,
     infer_datetime_format=False,
     keep_date_col=False,
     date_parser=None,
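
The defaulting rule introduced in `_read` can be sketched in isolation. The snippet below is a minimal illustration, not pandas code: `resolve_parse_dates` is a hypothetical helper name, and it only mirrors the branch logic of the hunk above. Changing the default from `False` to `None` is what lets the reader tell "not passed" apart from an explicit `parse_dates=False`.

```python
from typing import Any, Callable, Optional


def resolve_parse_dates(
    parse_dates: Any = None, date_parser: Optional[Callable] = None
) -> Any:
    """Hypothetical helper mirroring the defaulting logic shown in the diff."""
    if parse_dates is not None:
        # An explicit value (False, True, a list of columns, ...) is respected,
        # even when a date_parser is also supplied.
        return parse_dates
    # parse_dates was left at its None default: parse dates only if the
    # caller supplied a date_parser.
    return date_parser is not None


# Explicit False wins over the presence of a parser (the GH#44366 case).
print(resolve_parse_dates(parse_dates=False, date_parser=str))
# Default None plus a parser turns parsing on.
print(resolve_parse_dates(date_parser=str))
# Default None and no parser leaves parsing off.
print(resolve_parse_dates())
```

Under the old `parse_dates=False` default, the first and second calls were indistinguishable inside `_read`, which is why any boolean value was overwritten with `True` whenever a parser was present.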

pandas/tests/io/parser/test_parse_dates.py

Lines changed: 33 additions & 0 deletions
@@ -97,6 +97,39 @@ def __custom_date_parser(time):
     tm.assert_frame_equal(result, expected)
 
 
+@xfail_pyarrow
+def test_read_csv_with_custom_date_parser_parse_dates_false(all_parsers):
+    # GH44366
+    def __custom_date_parser(time):
+        time = time.astype(np.float_)
+        time = time.astype(np.int_)  # convert float seconds to int type
+        return pd.to_timedelta(time, unit="s")
+
+    testdata = StringIO(
+        """time e
+        41047.00 -93.77
+        41048.00 -95.79
+        41049.00 -98.73
+        41050.00 -93.99
+        41051.00 -97.72
+        """
+    )
+    result = all_parsers.read_csv(
+        testdata,
+        delim_whitespace=True,
+        parse_dates=False,
+        date_parser=__custom_date_parser,
+        index_col="time",
+    )
+    time = Series([41047.00, 41048.00, 41049.00, 41050.00, 41051.00], name="time")
+    expected = DataFrame(
+        {"e": [-93.77, -95.79, -98.73, -93.99, -97.72]},
+        index=time,
+    )
+
+    tm.assert_frame_equal(result, expected)
+
+
 @xfail_pyarrow
 def test_separator_date_conflict(all_parsers):
     # Regression test for gh-4678
