diff --git a/doc/source/merging.rst b/doc/source/merging.rst
index c6541a26c72b4..f95987afd4c77 100644
--- a/doc/source/merging.rst
+++ b/doc/source/merging.rst
@@ -692,6 +692,29 @@ either the left or right tables, the values in the joined table will be
    p.plot([left, right], result,
           labels=['left', 'right'], vertical=False);
    plt.close('all');
+
+Here is another example with duplicate join keys in DataFrames:
+
+.. ipython:: python
+
+   left = pd.DataFrame({'A': [1, 2], 'B': [2, 2]})
+
+   right = pd.DataFrame({'A': [4, 5, 6], 'B': [2, 2, 2]})
+
+   result = pd.merge(left, right, on='B', how='outer')
+
+.. ipython:: python
+   :suppress:
+
+   @savefig merging_merge_on_key_dup.png
+   p.plot([left, right], result,
+          labels=['left', 'right'], vertical=False);
+   plt.close('all');
+
+.. warning::
+
+   Joining / merging on duplicate keys can return a frame with as many rows per key as the product of that key's
+   row counts in the two inputs, which may exhaust memory. It is the user's responsibility to manage duplicate values in keys before joining large DataFrames.
 
 .. _merging.indicator:
 
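Note on the added warning: the row blow-up it describes can be checked directly. The following is a minimal sketch, not part of the patch, that reuses the `left` and `right` frames from the example above. The names `n_rows`, `left_has_dups`, `right_has_dups`, and `expected_rows` are illustrative assumptions.

```python
import pandas as pd

left = pd.DataFrame({'A': [1, 2], 'B': [2, 2]})
right = pd.DataFrame({'A': [4, 5, 6], 'B': [2, 2, 2]})

# Every left row with B == 2 pairs with every right row with B == 2,
# so the merged frame has 2 * 3 rows.
result = pd.merge(left, right, on='B', how='outer')
n_rows = len(result)

# Detect duplicate join keys before merging.
left_has_dups = left['B'].duplicated().any()
right_has_dups = right['B'].duplicated().any()

# Estimate the number of matched rows from per-key counts:
# for each key, (count in left) * (count in right), summed over keys.
# Keys present on only one side contribute 0 to this estimate.
left_counts = left['B'].value_counts()
right_counts = right['B'].value_counts()
expected_rows = int(left_counts.mul(right_counts, fill_value=0).sum())

print(n_rows, expected_rows, left_has_dups, right_has_dups)
```

Computing `expected_rows` from `value_counts` before merging gives a cheap estimate of the matched-row count without materializing the joined frame. For an outer join, unmatched keys would add further rows on top of this estimate. Later pandas releases also accept a `validate` argument to `merge` that raises an error when the keys are not unique as declared; whether it is available depends on the installed version.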