BUG: left join between df with a single index and df with a multiindex produces an inner join #34292

Open
3 tasks done
CuylenE opened this issue May 21, 2020 · 0 comments
Labels
Bug MultiIndex Reshaping Concat, Merge/Join, Stack/Unstack, Explode


CuylenE commented May 21, 2020

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • (optional) I have confirmed this bug exists on the master branch of pandas.


Code Sample

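import pandas as pd

# Left frame: single-level index "i" with values 1 and 2
A = pd.DataFrame([1, 2], columns=["i"]).set_index(["i"])
# Right frame: two-level MultiIndex ("i", "ii")
B = pd.DataFrame([(1, 4), (3, 0), (1, 5)], columns=["i", "ii"]).set_index(["i", "ii"])
A.join(B, how="left")
# Empty DataFrame
# Columns: []
# Index: [(1, 4), (1, 5)]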

Problem description

Joining a df with a single-level index and a df with a MultiIndex always produces an inner join, no matter the value of "how". In this case index value 2 from the left df is missing, while a left join should keep it.

The same happens when just joining the indexes. This problem does not happen when joining 2 multi-indexes.
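A minimal sketch of the index-only case, assuming the same values as in the code sample. The variable names (left, right, dropped) are only for illustration. It calls Index.join directly and reports which left keys are absent from the result; no output is shown because it depends on the pandas version.

```python
import pandas as pd

# Same indexes as in the code sample, built directly.
left = pd.Index([1, 2], name="i")
right = pd.MultiIndex.from_tuples([(1, 4), (3, 0), (1, 5)], names=["i", "ii"])

# return_indexers=True also returns the positional indexers into each input.
joined, left_indexer, right_indexer = left.join(
    right, how="left", return_indexers=True
)

# Left keys that do not appear in the first level of the result.
# For a true left join this set should be empty.
dropped = set(left) - set(joined.get_level_values(0))

print(joined)
print("left keys dropped by the join:", dropped)
```

If dropped is not empty, the join behaved like an inner join on the shared level.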

It seems something goes wrong in the method _join_level of class Index, in file pandas\core\indexes\base.py. The newly generated codes are either calculated wrong or don't take the join method into account. But that's as far as I've gotten.

Expected Output

Empty DataFrame
Columns: []
Index: [(1, 4), (1, 5), (2, nan)]
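Until this is fixed, one possible workaround (a sketch, not part of the original report) is to move the index levels back into columns, do a regular left merge on the shared key, and rebuild the index afterwards. The names merged and result are only for illustration.

```python
import pandas as pd

A = pd.DataFrame([1, 2], columns=["i"]).set_index(["i"])
B = pd.DataFrame([(1, 4), (3, 0), (1, 5)], columns=["i", "ii"]).set_index(["i", "ii"])

# Move the index levels back into columns and left-merge on the shared key "i".
merged = A.reset_index().merge(B.reset_index(), on="i", how="left")

# Rebuild the two-level index. "ii" becomes float because the unmatched
# left key (i == 2) gets NaN.
result = merged.set_index(["i", "ii"])
print(result.index.tolist())
```

This keeps the row for i == 2, at the cost of an extra reset_index/set_index round trip.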

Output of pd.show_versions()

[paste the output of pd.show_versions() here leaving a blank line after the details tag]

@CuylenE added the Bug and Needs Triage labels May 21, 2020
@CuylenE changed the title from "BUG: join between df with a single index and df with a multiindex are always inner joins" to "BUG: left join between df with a single index and df with a multiindex produces an inner join" May 21, 2020
@TomAugspurger added the MultiIndex and Reshaping labels and removed the Needs Triage label Sep 4, 2020