Skip to content

REGR: fix eval with inplace=True to correctly update column values inplace #47550

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 20 commits into from
Aug 19, 2022

Conversation

CloseChoice
Copy link
Member

@CloseChoice CloseChoice commented Jun 29, 2022

The bad commit that causes this issue (see #43406) is the rootcause for multiple others, so am not sure if we want to tackle the overall problem bit by bit or if there is a generic solution to all of them.

@pep8speaks
Copy link

pep8speaks commented Jun 29, 2022

Hello @CloseChoice! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2022-08-19 08:09:31 UTC

@CloseChoice CloseChoice requested a review from jreback June 29, 2022 16:32
@CloseChoice CloseChoice requested a review from jbrockmendel June 29, 2022 16:36
and assigner in target.columns
):
loc = target.columns.get_loc(assigner)
target._iset_item_mgr(loc, np.array(ret), inplace=inplace)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should not use this here. Please use loc for this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that helps a lot.

@phofl
Copy link
Member

phofl commented Jun 29, 2022

Could you add a whatsnew? Is there a suitable class for this test?

The root cause was intended, so we have to fix them where they occur.

@CloseChoice
Copy link
Member Author

Could you add a whatsnew? Is there a suitable class for this test?

The root cause was intended, so we have to fix them where they occur.

Thanks for the answer. I added a whatsnew entry.

@@ -384,7 +384,10 @@ def eval(
try:
with warnings.catch_warnings(record=True):
# TODO: Filter the warnings we actually care about here.
target[assigner] = ret
if hasattr(target, "loc"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is not ideal. What could target be here? Dict, series, DataFrame, something else?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nope, basically these 3. Used ´isinstance(target, (DataFrame, Series))` instead

@simonjayhawkins simonjayhawkins added the expressions pd.eval, query label Jul 2, 2022
@simonjayhawkins simonjayhawkins added this to the 1.4.4 milestone Jul 2, 2022
@jorisvandenbossche
Copy link
Member

@CloseChoice I started converting the suggestion at #47449 (comment) into a PR, missing that you already opened one with a very similar fix! I hope you don't mind, but so I merged some of the changes I started doing in your PR:

  • Also checking for inplace in eval.py (when not doing inplace, I think the target[assigner] = ret without loc is faster), as suggested in REGR: DataFrame.eval not respecting inplace argument in pandas 1.4 #47449 (comment)). In addition I also replaced (Series, DataFrame) with NDFrame
  • I slightly updated the test (used some simpler (less) values), and also added a check for a viewing Series to be updated (in addition to viewing DataFrame)

@jorisvandenbossche jorisvandenbossche added the Regression Functionality that used to work in a prior pandas version label Aug 12, 2022
@jorisvandenbossche jorisvandenbossche changed the title REGR: Ignore eval inplace REGR: fix eval with inplace=True to correctly update column values inplace Aug 12, 2022
@phofl
Copy link
Member

phofl commented Aug 12, 2022

Is it possible to get there with a MultiIndex? Similar to #47774 where the dimension reduction bothers us

@jorisvandenbossche
Copy link
Member

I am not sure if eval supports referring (or assigning) to MultiIndex columns (I also don't directly see a test in test_eval.py that involved MultiIndex). I tried a few things that all fail:

# those two give syntax error when parsing the ast
df.eval("('A', 'a') = ('A', 'b') + ('B', 'a')")
df.eval("\"('A', 'a')\" = \"('A', 'b')\" + \"('B', 'a')\"")
# those two each give another error
df.eval("D = ('A', 'b') + ('B', 'a')") 
df.eval("D = \"('A', 'b')\" + \"('B', 'a')\"")

If not using existing column names, but just assigning a new column, that still seems to work with this PR:

In [1]: df = DataFrame({("A", "a"): [1, 2, 3], ("A", "b"): [4, 5, 6], ("B", "a"): [7, 8, 9]})

In [2]: df
Out[2]: 
   A     B
   a  b  a
0  1  4  7
1  2  5  8
2  3  6  9

In [3]: df.eval("D = 1")
Out[3]: 
   A     B  D
   a  b  a   
0  1  4  7  1
1  2  5  8  1
2  3  6  9  1

In [4]: df.eval("E = 1", inplace=True)

In [5]: df
Out[5]: 
   A     B  E
   a  b  a   
0  1  4  7  1
1  2  5  8  1
2  3  6  9  1

@phofl
Copy link
Member

phofl commented Aug 12, 2022

Thanks for checking

@CloseChoice
Copy link
Member Author

@jorisvandenbossche I don't mind at all. Thanks for picking this up and improving upon it.

@phofl
Copy link
Member

phofl commented Aug 19, 2022

Greenish, merging

@phofl phofl merged commit 838b04f into pandas-dev:main Aug 19, 2022
@phofl
Copy link
Member

phofl commented Aug 19, 2022

thx @CloseChoice

@lumberbot-app

This comment was marked as resolved.

phofl pushed a commit to phofl/pandas that referenced this pull request Aug 19, 2022
…place (pandas-dev#47550)

* fix pre-commit issues

* fix linting errors

* add inplace argument to isetitem and use in eval

* changes due to PR discussions

* fix issues

* update eval

* update whatsnew

* add PR suggestions

* update imports in eval.py

* check inplace and use NDFrame + small update to test

* update test to use using_copy_on_write

* skip test for array manager

Co-authored-by: Joris Van den Bossche <[email protected]>
phofl added a commit that referenced this pull request Aug 19, 2022
…to correctly update column values inplace) (#48151)

* REGR: fix eval with inplace=True to correctly update column values inplace (#47550)

* fix pre-commit issues

* fix linting errors

* add inplace argument to isetitem and use in eval

* changes due to PR discussions

* fix issues

* update eval

* update whatsnew

* add PR suggestions

* update imports in eval.py

* check inplace and use NDFrame + small update to test

* update test to use using_copy_on_write

* skip test for array manager

Co-authored-by: Joris Van den Bossche <[email protected]>

* Remove copy on write fixture

Co-authored-by: Tobias Pitters <[email protected]>
Co-authored-by: Joris Van den Bossche <[email protected]>
noatamir pushed a commit to noatamir/pandas that referenced this pull request Nov 9, 2022
…place (pandas-dev#47550)

* fix pre-commit issues

* fix linting errors

* add inplace argument to isetitem and use in eval

* changes due to PR discussions

* fix issues

* update eval

* update whatsnew

* add PR suggestions

* update imports in eval.py

* check inplace and use NDFrame + small update to test

* update test to use using_copy_on_write

* skip test for array manager

Co-authored-by: Joris Van den Bossche <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
expressions pd.eval, query Regression Functionality that used to work in a prior pandas version
Projects
None yet
Development

Successfully merging this pull request may close these issues.

REGR: DataFrame.eval not respecting inplace argument in pandas 1.4
5 participants