Skip to content

BUG: Avoid casting to double type unnecessarily when setting values i… #23462

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Nov 5, 2018

Conversation

icexelloss
Copy link
Contributor

@icexelloss icexelloss commented Nov 2, 2018

…n time delta column

When setting a timedelta column in a dataframe using pd.Series or array as index, there is some rounding error because the underlying TimeDeltaBlock is casted to double and back to int. This patch fixes the issue.

@pep8speaks
Copy link

Hello @icexelloss! Thanks for submitting the PR.

@icexelloss icexelloss force-pushed the timedelta-casting-bug branch 2 times, most recently from b43b803 to ca6bbd3 Compare November 2, 2018 18:29
@codecov
Copy link

codecov bot commented Nov 2, 2018

Codecov Report

Merging #23462 into master will not change coverage.
The diff coverage is 100%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master   #23462   +/-   ##
=======================================
  Coverage   92.23%   92.23%           
=======================================
  Files         161      161           
  Lines       51197    51197           
=======================================
  Hits        47220    47220           
  Misses       3977     3977
Flag Coverage Δ
#multiple 90.61% <100%> (ø) ⬆️
#single 42.27% <0%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/internals/blocks.py 94.02% <100%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0bb24b7...62ab354. Read the comment docs.

@jreback jreback added Bug Indexing Related to indexing on series/frames, not to indexes themselves labels Nov 3, 2018
@jreback jreback added this to the 0.24.0 milestone Nov 3, 2018
Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

small comments. pls add a whatsnew note in bug fixes / indexing. ping on green.

@@ -80,3 +80,16 @@ def test_numpy_timedelta_scalar_indexing(self, start, stop,
result = s.loc[slice(start, stop)]
expected = s.iloc[expected_slice]
tm.assert_series_equal(result, expected)

def test_set_dataframe_column_by_index(self):

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add this PR number here as a comment

@@ -80,3 +80,16 @@ def test_numpy_timedelta_scalar_indexing(self, start, stop,
result = s.loc[slice(start, stop)]
expected = s.iloc[expected_slice]
tm.assert_series_equal(result, expected)

def test_set_dataframe_column_by_index(self):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can rename to

test_roundtrip_thru_setitem


df = pd.DataFrame({'dt': pd.Series([dt1, dt2])})
s = pd.Series([dt1])
value_before = df['dt'].iloc[1].value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name this expected

s = pd.Series([dt1])
value_before = df['dt'].iloc[1].value
df.loc[[True, False]] = s
value_after = df['dt'].iloc[1].value
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

name this result

value_before = df['dt'].iloc[1].value
df.loc[[True, False]] = s
value_after = df['dt'].iloc[1].value

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you also test

tm.assert_frame_equal(df, df_copy)

(and copy df on line 89)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added

@icexelloss icexelloss force-pushed the timedelta-casting-bug branch from ca6bbd3 to 62ab354 Compare November 5, 2018 15:48
@gfyoung gfyoung added Numeric Operations Arithmetic, Comparison, and Logical operations Timedelta Timedelta data type labels Nov 5, 2018
@jreback
Copy link
Contributor

jreback commented Nov 5, 2018

lgtm. ping on green.

@jreback jreback merged commit f0877ec into pandas-dev:master Nov 5, 2018
@jreback
Copy link
Contributor

jreback commented Nov 5, 2018

thanks @icexelloss

@icexelloss
Copy link
Contributor Author

Thanks @jreback !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Indexing Related to indexing on series/frames, not to indexes themselves Numeric Operations Arithmetic, Comparison, and Logical operations Timedelta Timedelta data type
Projects
None yet
Development

Successfully merging this pull request may close these issues.

BUG: Setting timedelta column by index causes it to loss precision
4 participants