Skip to content

[ArrayManager] TST: get tests running for /tests/frame #39700

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Mar 5, 2021

Conversation

jorisvandenbossche
Copy link
Member

@jorisvandenbossche jorisvandenbossche commented Feb 9, 2021

xref #39146

This is on top of #39612, so only the last commit is relevant for this PR: ec83091

@jorisvandenbossche jorisvandenbossche added the Internals Related to non-user accessible pandas implementation label Feb 9, 2021
@jorisvandenbossche jorisvandenbossche added this to the 1.3 milestone Feb 9, 2021
@lithomas1 lithomas1 added the Testing pandas testing functions or related to the test suite label Feb 10, 2021
@jorisvandenbossche
Copy link
Member Author

Rebased this now #39841 is merged

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -165,7 +166,10 @@ def test_constructor_cast_failure(self):
df["foo"] = np.ones((4, 2)).tolist()

# this is not ok
msg = "Wrong number of items passed 2, placement implies 1"
msg = (
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you use the "|".join pattern

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but I already did that in #39991, which I would like to get merged first, and then will fix the conflics with this PR to resolve this

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks

# GH#12857
lets = list("ACDEFGHIJKLMNOP")
slen = 50
nseqs = 1000
words = [[np.random.choice(lets) for x in range(slen)] for _ in range(nseqs)]
df = DataFrame(words).astype("U1")
assert (df.dtypes == object).all()
# TODO(Arraymanager) astype("U1") actually gives this dtype instead of object
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

im pretty sure we dont want this behavior? so should xfail for now?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is not actually testing astype, but rather repr, so I would prefer to run the rest of the test (which actually passes) without having this line error.

Now, I assume we should have an astype-specific test about this as well, that could be xfailed.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't directly see a test about it in frame/methods/test_astype.py, so I will add a test for that there. It's not very clear, though, what the expected behaviour should be.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could move the repr calls up and do the dtype assertion at the end

@@ -896,8 +902,13 @@ def verify(df):
)

right = DataFrame(vals, columns=cols, index=idx)
if using_array_manager:
# INFO(ArrayManager) with ArrayManager preserve dtype where possible
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is INFO(ArrayManager) a pattern i should know?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it was requested by Jeff on one of the previous PRs. It's an explicit comment about behaviour that changed with ArrayManager, but which is not a TODO (since it's not wrong behaviour that still needs to be fixed)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks

@jbrockmendel
Copy link
Member

couple comments, otherwise LGTM

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. @jbrockmendel comments and pls merge master

@@ -816,7 +816,13 @@ def insert(self, loc: int, item: Hashable, value, allow_duplicates: bool = False

value = extract_array(value, extract_numpy=True)
if value.ndim == 2:
value = value[0, :]
if value.shape[0] == 1:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe make a helper function for this type of operation / check (followon ok)

@jreback jreback merged commit c5695c8 into pandas-dev:master Mar 5, 2021
@jreback
Copy link
Contributor

jreback commented Mar 5, 2021

thanks, hopefully nothing breaks from prior merges :-<

@jorisvandenbossche jorisvandenbossche deleted the am-tests-frame branch March 5, 2021 07:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Internals Related to non-user accessible pandas implementation Testing pandas testing functions or related to the test suite
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants