Skip to content

COMPAT: Consider Python 2.x tarfiles file-like #16533

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Jun 1, 2017

Conversation

gfyoung
Copy link
Member

@gfyoung gfyoung commented May 29, 2017

Tarfile.ExFileObject has no next method in Python 2.x, making it an invalid file-like object in read_csv. However, they can be read in just fine, meaning our check is too strict for file-like. This commit relaxes
the check to just look for the __iter__ attribute.

Closes #16530.

@gfyoung gfyoung force-pushed the is-file-like-relax branch 7 times, most recently from f17a20c to a9c9365 Compare May 29, 2017 21:38
@codecov
Copy link

codecov bot commented May 29, 2017

Codecov Report

Merging #16533 into master will decrease coverage by <.01%.
The diff coverage is 60%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #16533      +/-   ##
==========================================
- Coverage   90.79%   90.79%   -0.01%     
==========================================
  Files         161      161              
  Lines       51063    51075      +12     
==========================================
+ Hits        46365    46371       +6     
- Misses       4698     4704       +6
Flag Coverage Δ
#multiple 88.63% <60%> (-0.01%) ⬇️
#single 40.15% <60%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/dtypes/inference.py 98.33% <100%> (ø) ⬆️
pandas/io/parsers.py 95.3% <57.14%> (-0.36%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e60dc4c...a9c9365. Read the comment docs.

@codecov
Copy link

codecov bot commented May 29, 2017

Codecov Report

Merging #16533 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #16533      +/-   ##
==========================================
+ Coverage   90.75%   90.75%   +<.01%     
==========================================
  Files         161      161              
  Lines       51073    51081       +8     
==========================================
+ Hits        46350    46360      +10     
+ Misses       4723     4721       -2
Flag Coverage Δ
#multiple 88.59% <100%> (ø) ⬆️
#single 40.16% <81.81%> (ø) ⬆️
Impacted Files Coverage Δ
pandas/core/dtypes/inference.py 98.33% <100%> (ø) ⬆️
pandas/io/parsers.py 95.68% <100%> (+0.02%) ⬆️
pandas/core/internals.py 93.4% <0%> (+0.07%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ee8346d...7c59fc9. Read the comment docs.

if is_file_like(f):
next_attr = "__next__" if PY3 else "next"

if engine != "c" and not hasattr(f, next_attr):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add a comment of why you are doing this

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@@ -702,6 +702,7 @@ def pxd(name):
'parser/data/*.gz',
'parser/data/*.bz2',
'parser/data/*.txt',
'parser/data/*.tar',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you add with .tar.gz as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant test with .tar.gz as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I see. Done.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. minor comments.

@jreback jreback added Compat pandas objects compatability with Numpy or Python functions IO CSV read_csv, to_csv labels May 30, 2017
@jreback jreback added this to the 0.20.2 milestone May 30, 2017
@gfyoung gfyoung force-pushed the is-file-like-relax branch from a9c9365 to 6dd7837 Compare May 30, 2017 13:03
# explicitly calls "next(...)" when iterating through
# such an object, meaning it needs to have that attribute.
if engine != "c" and not hasattr(f, next_attr):
msg = ("The 'python' engine can not iterate "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this block seems untested. Are you able to add some, or is it a hassle to hit it?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was meaning to do that eventually. Done.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm. couple of comments.

@@ -702,6 +702,7 @@ def pxd(name):
'parser/data/*.gz',
'parser/data/*.bz2',
'parser/data/*.txt',
'parser/data/*.tar',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I meant test with .tar.gz as well.

@gfyoung gfyoung force-pushed the is-file-like-relax branch 3 times, most recently from 0df7b2c to e236ba5 Compare May 30, 2017 23:34
@jreback
Copy link
Contributor

jreback commented May 31, 2017

lgtm. ping on green.

@gfyoung gfyoung force-pushed the is-file-like-relax branch from e236ba5 to 8611b79 Compare May 31, 2017 00:40
@gfyoung
Copy link
Member Author

gfyoung commented May 31, 2017

@jreback : Green all around and ready to merge.

else:
engine = "c"
msg += " Falling back to the 'c' engine."
warnings.warn(msg, ParserWarning, stacklevel=5)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Appears this isn't being hit by test coverage. Should the warning be checked? Can this actually be hit?

Copy link
Member Author

@gfyoung gfyoung May 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You're right. This actually can't be hit (engine = python means the engine was specified). Removing this block.

@gfyoung gfyoung force-pushed the is-file-like-relax branch 2 times, most recently from 0526fd6 to 7a5fcd3 Compare May 31, 2017 16:29
@@ -801,6 +803,24 @@ def _get_options_with_defaults(self, engine):

return options

@staticmethod
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this a staticmethod?

Copy link
Member Author

@gfyoung gfyoung May 31, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because self. is not used. I pass in the engine and f attributes as parameters.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right but no other methods are like this. you can make it a free function if you want, but prob just make it a method.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alright. Done.

@gfyoung gfyoung force-pushed the is-file-like-relax branch from 7a5fcd3 to 08efe9c Compare May 31, 2017 23:59
Tarfile.ExFileObject has no "next" method in
Python 2.x, making it an invalid file-like
object in read_csv. However, they can be
read in just fine, meaning our check is too
strict for file-like. This commit relaxes
the check to just look for "__iter__".

Closes pandas-devgh-16530.
@gfyoung gfyoung force-pushed the is-file-like-relax branch from 08efe9c to 7c59fc9 Compare June 1, 2017 03:10
@jreback jreback merged commit e0a127a into pandas-dev:master Jun 1, 2017
@jreback
Copy link
Contributor

jreback commented Jun 1, 2017

thanks!

@gfyoung gfyoung deleted the is-file-like-relax branch June 1, 2017 10:40
TomAugspurger pushed a commit to TomAugspurger/pandas that referenced this pull request Jun 1, 2017
TomAugspurger pushed a commit that referenced this pull request Jun 4, 2017
Kiv pushed a commit to Kiv/pandas that referenced this pull request Jun 11, 2017
stangirala pushed a commit to stangirala/pandas that referenced this pull request Jun 11, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Compat pandas objects compatability with Numpy or Python functions IO CSV read_csv, to_csv
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants