[ArrayManager] Implement concat with axis=1 (merge/join) #39841
Merged
jorisvandenbossche
merged 12 commits into
pandas-dev:master
from
jorisvandenbossche:am-concat-axis-0
Feb 23, 2021
072ed39
[ArrayManager] ENH: implement concat with axis=1 (merge/join)
jorisvandenbossche dd14d2d
Merge remote-tracking branch 'upstream/master' into am-concat-axis-0
jorisvandenbossche 4da4433
Merge remote-tracking branch 'upstream/master' into am-concat-axis-0
jorisvandenbossche 31cfb0d
address feedback
jorisvandenbossche 480b32d
Merge remote-tracking branch 'upstream/master' into am-concat-axis-0
jorisvandenbossche 3488d63
add array property
jorisvandenbossche 437c1d2
simplify do_integrity_check
jorisvandenbossche 2597cd5
clean-up unused using_array_manager
jorisvandenbossche 31d1305
remove keyword-only indicator
jorisvandenbossche b6a6b54
Merge remote-tracking branch 'upstream/master' into am-concat-axis-0
jorisvandenbossche 3f5fc38
concatenate_block_managers -> concatenate_managers
jorisvandenbossche 54692ca
add comment
jorisvandenbossche
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -282,6 +282,18 @@ def get_dtypes(self): | |
dtypes = np.array([blk.dtype for blk in self.blocks]) | ||
return algos.take_nd(dtypes, self.blknos, allow_fill=False) | ||
|
||
@property | ||
def arrays(self): | ||
Review comment: annotate |
||
""" | ||
Quick access to the backing arrays of the Blocks. | ||
|
||
Only for compatibility with ArrayManager for testing convenience. | ||
Not to be used in actual code, and return value is not the same as the | ||
ArrayManager method (list of 1D arrays vs iterator of 2D ndarrays / 1D EAs). | ||
""" | ||
for blk in self.blocks: | ||
yield blk.values | ||
|
||
def __getstate__(self): | ||
block_values = [b.values for b in self.blocks] | ||
block_items = [self.items[b.mgr_locs.indexer] for b in self.blocks] | ||
|
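The docstring above contrasts the two return shapes: the block-based property lazily yields one backing array per block, while the ArrayManager attribute is a list with one 1D array per column. A minimal sketch of that difference follows, using only the standard library. `ToyBlock`, `ToyBlockManager` and `ToyArrayManager` are hypothetical stand-ins invented for illustration, not the pandas classes.

```python
# Toy illustration only: these classes are hypothetical stand-ins,
# not the pandas BlockManager / ArrayManager implementations.


class ToyBlock:
    """Holds several same-typed columns together as a 2D nested list."""

    def __init__(self, values):
        self.values = values  # e.g. [[1, 2, 3], [4, 5, 6]] is two columns


class ToyBlockManager:
    def __init__(self, blocks):
        self.blocks = blocks

    @property
    def arrays(self):
        # Lazily yield each block's backing values, as in the diff above.
        for blk in self.blocks:
            yield blk.values


class ToyArrayManager:
    def __init__(self, arrays):
        # One 1D array per column, kept in a plain list.
        self.arrays = arrays


int_block = ToyBlock([[1, 2, 3], [4, 5, 6]])
str_block = ToyBlock([["x", "y", "z"]])
bm = ToyBlockManager([int_block, str_block])
am = ToyArrayManager([[1, 2, 3], [4, 5, 6], ["x", "y", "z"]])

# A generator cannot be indexed, so it is materialised first.
bm_arrays = list(bm.arrays)
print(len(bm_arrays))   # 2 (one entry per block)
print(len(am.arrays))   # 3 (one entry per column)
```

In this sketch, indexing such as `arrays[0]` works directly only on the list-based variant, which is one reason the two return values are not interchangeable.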
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -287,17 +287,27 @@ def test_merge_copy(self): | |
merged["d"] = "peekaboo" | ||
assert (right["d"] == "bar").all() | ||
|
||
-    def test_merge_nocopy(self): | ||
+    def test_merge_nocopy(self, using_array_manager): | ||
         left = DataFrame({"a": 0, "b": 1}, index=range(10)) | ||
         right = DataFrame({"c": "foo", "d": "bar"}, index=range(10)) | ||
|
||
         merged = merge(left, right, left_index=True, right_index=True, copy=False) | ||
|
||
-        merged["a"] = 6 | ||
-        assert (left["a"] == 6).all() | ||
+        if using_array_manager: | ||
+            # With ArrayManager, setting a column doesn't change the values inplace | ||
+            # and thus does not propagate the changes to the original left/right | ||
+            # dataframes -> need to check that no copy was made in a different way | ||
+            # TODO(ArrayManager) we should be able to simplify this with a .loc | ||
+            # setitem test: merged.loc[0, "a"] = 10; assert left.loc[0, "a"] == 10 | ||
+            # but this currently replaces the array (_setitem_with_indexer_split_path) | ||
+            assert merged._mgr.arrays[0] is left._mgr.arrays[0] | ||
+            assert merged._mgr.arrays[2] is right._mgr.arrays[0] | ||
+        else: | ||
+            merged["a"] = 6 | ||
+            assert (left["a"] == 6).all() | ||
|
||
-        merged["d"] = "peekaboo" | ||
-        assert (right["d"] == "peekaboo").all() | ||
+            merged["d"] = "peekaboo" | ||
+            assert (right["d"] == "peekaboo").all() | ||
|
||
def test_intelligently_handle_join_key(self): | ||
# #733, be a bit more 1337 about not returning unconsolidated DataFrame | ||
|
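The comment in `test_merge_nocopy` explains why the ArrayManager branch checks object identity instead of mutating a column: when setting a column replaces the whole array, the change never reaches the original frames, so only an `is` check can show that no copy was made. The sketch below reproduces that reasoning with a hypothetical `ToyFrame` column store and a hypothetical `toy_merge_nocopy` helper. Neither is pandas code, and the replace-on-set behaviour is an assumption taken from the comment.

```python
# Hypothetical toy column store, not pandas internals.


class ToyFrame:
    def __init__(self, names, arrays):
        self.names = list(names)
        self.arrays = list(arrays)  # one list object per column

    def set_column(self, name, value):
        # Replaces the whole column with a new list (no in-place write),
        # the behaviour the test comment attributes to ArrayManager.
        n = len(self.arrays[0])
        self.arrays[self.names.index(name)] = [value] * n


def toy_merge_nocopy(left, right):
    # Reuse the input column objects rather than copying them.
    return ToyFrame(left.names + right.names, left.arrays + right.arrays)


left = ToyFrame(["a", "b"], [[0] * 3, [1] * 3])
right = ToyFrame(["c", "d"], [["foo"] * 3, ["bar"] * 3])
merged = toy_merge_nocopy(left, right)

# Identity check: the merged frame holds the very same column objects.
shares_left = merged.arrays[0] is left.arrays[0]
shares_right = merged.arrays[2] is right.arrays[0]

# Replacing a column rebinds it in `merged` only, so `left` is unaffected
# and a value-based assertion on `left` would wrongly suggest a copy.
merged.set_column("a", 6)
print(left.arrays[0])    # [0, 0, 0]
print(merged.arrays[0])  # [6, 6, 6]
```

The index `2` for the first column of `right` mirrors the test: the merged result lists the two `left` columns first.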
@@ -1381,7 +1391,10 @@ def test_merge_readonly(self): | |
np.arange(20).reshape((5, 4)) + 1, columns=["a", "b", "x", "y"] | ||
) | ||
|
||
-        data1._mgr.blocks[0].values.flags.writeable = False | ||
+        # make each underlying block array / column array read-only | ||
+        for arr in data1._mgr.arrays: | ||
Review comment: can you comment on this? (I know what you are doing from seeing what you added, but it is non-obvious here, I think.) |
||
+            arr.flags.writeable = False | ||
|
||
data1.merge(data2) # no error | ||
|
||
|
||
|
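The idea behind `test_merge_readonly` is that a merge should only read from its inputs, so marking every backing array read-only must not cause an error. The following is a rough standard-library analogue of that property, assuming read-only `memoryview` objects as a stand-in for NumPy arrays with `flags.writeable = False`. The `toy_inner_join` helper is hypothetical and is not how pandas implements `merge`.

```python
from array import array


def toy_inner_join(left_keys, left_vals, right_keys, right_vals):
    """Join rows on equal keys, only reading from the inputs (hypothetical helper)."""
    right_pos = {k: i for i, k in enumerate(right_keys)}
    out = []
    for i, k in enumerate(left_keys):
        j = right_pos.get(k)
        if j is not None:
            # Results go into a fresh list; the inputs are never written to.
            out.append((k, left_vals[i], right_vals[j]))
    return out


# Make each underlying column buffer read-only.
keys1 = memoryview(array("q", [1, 2, 3])).toreadonly()
vals1 = memoryview(array("q", [10, 20, 30])).toreadonly()
keys2 = memoryview(array("q", [2, 3, 4])).toreadonly()
vals2 = memoryview(array("q", [200, 300, 400])).toreadonly()

result = toy_inner_join(keys1, vals1, keys2, vals2)  # no error
print(result)  # [(2, 20, 200), (3, 30, 300)]

# Sanity check that the buffers really reject writes.
try:
    keys1[0] = 99
    write_rejected = False
except TypeError:
    write_rejected = True
print(write_rejected)  # True
```

Iterating over all arrays, as the updated test does, covers both storage layouts, whereas the old `blocks[0]` access only touched the first block.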
Only reshape/merge and not reshape/concat, as there are too many tests still failing in the concat tests (because axis=0 is not yet handled properly).