Commit 30cbb02

Merge pull request #10658 from ajcr/GH9428
BUG: GH9428 promote string dtype to object dtype for empty DataFrame
2 parents 5cb70d9 + c3effa6 commit 30cbb02

3 files changed: 17 additions, 0 deletions

doc/source/whatsnew/v0.17.0.txt

Lines changed: 1 addition & 0 deletions
@@ -395,4 +395,5 @@ Bug Fixes
 - Bug in `read_msgpack` where DataFrame to decode has duplicate column names (:issue:`9618`)
 - Bug in ``io.common.get_filepath_or_buffer`` which caused reading of valid S3 files to fail if the bucket also contained keys for which the user does not have read permission (:issue:`10604`)
 - Bug in vectorised setting of timestamp columns with python ``datetime.date`` and numpy ``datetime64`` (:issue:`10408`, :issue:`10412`)
+- Bug in ``pd.DataFrame`` when constructing an empty DataFrame with a string dtype (:issue:`9428`)

pandas/core/frame.py

Lines changed: 2 additions & 0 deletions
@@ -322,6 +322,8 @@ def _init_dict(self, data, index, columns, dtype=None):
 if dtype is None:
     # 1783
     v = np.empty(len(index), dtype=object)
+elif np.issubdtype(dtype, np.flexible):
+    v = np.empty(len(index), dtype=object)
 else:
     v = np.empty(len(index), dtype=dtype)
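
The new branch in the hunk above can be tried outside pandas. The following is a minimal sketch, not the pandas implementation: `empty_column` is a hypothetical helper name, and it only mirrors the three-way dtype check on a plain NumPy array. It relies on `np.issubdtype` and `np.flexible`, which cover the string, bytes, and void scalar types.

```python
import numpy as np


def empty_column(length, dtype=None):
    """Hypothetical helper mirroring the dtype branch shown in the diff."""
    if dtype is None:
        # No dtype requested: fall back to object.
        return np.empty(length, dtype=object)
    elif np.issubdtype(dtype, np.flexible):
        # String-like dtypes (str, np.str_, 'U5', np.bytes_) are promoted
        # to object instead of allocating a fixed-width string array.
        return np.empty(length, dtype=object)
    else:
        # Any other dtype is used as given.
        return np.empty(length, dtype=dtype)


for requested in (None, str, np.str_, "U5", np.float64):
    print(repr(requested), "->", empty_column(2, requested).dtype)
```

Under this sketch, every string-like request ends up as an object array, while a numeric dtype such as `np.float64` passes through unchanged.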

pandas/tests/test_frame.py

Lines changed: 14 additions & 0 deletions
@@ -3617,6 +3617,20 @@ def test_constructor_column_duplicates(self):
             [('a', [8]), ('a', [5]), ('b', [6])],
             columns=['b', 'a', 'a'])
 
+    def test_constructor_empty_with_string_dtype(self):
+        # GH 9428
+        expected = DataFrame(index=[0, 1], columns=[0, 1], dtype=object)
+
+        df = DataFrame(index=[0, 1], columns=[0, 1], dtype=str)
+        assert_frame_equal(df, expected)
+        df = DataFrame(index=[0, 1], columns=[0, 1], dtype=np.str_)
+        assert_frame_equal(df, expected)
+        df = DataFrame(index=[0, 1], columns=[0, 1], dtype=np.unicode_)
+        assert_frame_equal(df, expected)
+        df = DataFrame(index=[0, 1], columns=[0, 1], dtype='U5')
+        assert_frame_equal(df, expected)
+
+
     def test_column_dups_operations(self):
 
         def check(result, expected=None):
