BUG: Fix float formatting when a string is passed as float_format arg #22308

Merged
merged 8 commits into from
Nov 26, 2018
Conversation

tomneep
Contributor

@tomneep tomneep commented Aug 13, 2018

closes #21625
closes #22270.

If the user passes a string float_format argument to to_string, to_latex, to_html, etc., fixed-width formatting is now disabled, so that, e.g.,

df.to_string(float_format='%.3f')

will give the same result as

df.to_string(float_format=lambda x: '%.3f' % x)
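
As a rough illustration of the equivalence described above, the sketch below normalises a float_format argument that may be either a %-style string or a callable. The helper name as_formatter is hypothetical and not part of pandas; it only shows why the two calls are expected to produce the same text.

```python
def as_formatter(float_format):
    """Return a one-argument callable for a %-style string or a callable.

    Hypothetical helper for illustration only; not a pandas function.
    """
    if callable(float_format):
        return float_format
    # A %-style template such as '%.3f' is applied with the % operator.
    return lambda x: float_format % x


values = [0.19999, 100.0]
from_string = [as_formatter('%.3f')(v) for v in values]
from_lambda = [as_formatter(lambda x: '%.3f' % x)(v) for v in values]

print(from_string)
print(from_string == from_lambda)
```

Both lists hold the same strings, which is the behaviour this PR aims for once fixed width no longer interferes with a string float_format.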

Contributor

@jreback jreback left a comment

would need a whatsnew entry as well

@@ -1328,6 +1328,12 @@ def test_to_string_float_formatting(self):
'1 2.512000e-01')
assert df_s == expected

df = DataFrame({'x': [0.19999, 100.0]})
Contributor

can you make a new test, pls add the issue number as a comment

@@ -974,6 +974,7 @@ def __init__(self, *args, **kwargs):
        # float_format is expected to be a string
        # formatter should be used to pass a function
        if self.float_format is not None and self.formatter is None:
            self.fixed_width = False
Contributor

can you see if you can change fixed_width to a cached property instead of setting it (it would need to be removed from the signature as well)

Contributor

is this possible?

Contributor Author

There is a good chance I'm not understanding the comment, but I'm not sure we can set this as a cached property (using the cache_readonly decorator?) since it is set in the base class. As I say I might be missing something...

Contributor

you can set it in each of the subclasses. I just don't think we actually need this. try doing this as a property first to see if it works.
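
A minimal sketch of the property-based approach discussed in this thread, assuming fixed_width is dropped from the base class signature and derived in the subclass instead. The class names are hypothetical, and functools.cached_property stands in for pandas' cache_readonly decorator; this is not the code that was merged.

```python
from functools import cached_property


class BaseArrayFormatterSketch:
    """Hypothetical base class: fixed_width is no longer an __init__ argument."""

    def __init__(self, values, float_format=None, formatter=None):
        self.values = values
        self.float_format = float_format
        self.formatter = formatter


class FloatArrayFormatterSketch(BaseArrayFormatterSketch):
    """Hypothetical subclass that derives fixed_width instead of storing it."""

    @cached_property
    def fixed_width(self):
        # Mirrors the condition in the diff: a float_format without a
        # custom formatter switches fixed-width output off.
        return not (self.float_format is not None and self.formatter is None)


with_format = FloatArrayFormatterSketch([0.19999, 100.0], float_format='%.3f')
without_format = FloatArrayFormatterSketch([0.19999, 100.0])
print(with_format.fixed_width, without_format.fixed_width)
```

One consequence of this design is that callers can no longer pass fixed_width explicitly, so any call site that currently does so would need to change.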

@jreback jreback added the Output-Formatting __repr__ of pandas objects, to_string label Aug 13, 2018
@jreback
Contributor

jreback commented Aug 13, 2018

can you add the test from #21625 as well

@codecov

codecov bot commented Aug 13, 2018

Codecov Report

Merging #22308 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #22308      +/-   ##
==========================================
+ Coverage   92.29%   92.29%   +<.01%     
==========================================
  Files         161      161              
  Lines       51497    51501       +4     
==========================================
+ Hits        47530    47534       +4     
  Misses       3967     3967
Flag        Coverage Δ
#multiple   90.69% <100%> (ø) ⬆️
#single     42.42% <0%> (-0.01%) ⬇️

Impacted Files                     Coverage Δ
pandas/io/formats/format.py        97.76% <100%> (ø) ⬆️
pandas/io/pytables.py              92.3% <0%> (-0.05%) ⬇️
pandas/core/reshape/pivot.py       96.55% <0%> (ø) ⬆️
pandas/io/json/normalize.py        96.87% <0%> (ø) ⬆️
pandas/io/packers.py               88.08% <0%> (ø) ⬆️
pandas/core/util/hashing.py        98.4% <0%> (ø) ⬆️
pandas/core/algorithms.py          95.11% <0%> (ø) ⬆️
pandas/util/_decorators.py         91.34% <0%> (ø) ⬆️
pandas/tseries/offsets.py          96.98% <0%> (ø) ⬆️
pandas/core/indexes/datetimes.py   96.2% <0%> (ø) ⬆️
... and 20 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update e5c90e5...a9666bd. Read the comment docs.

-
- :func:`DataFrame.to_string()`, :func:`DataFrame.to_html()` and
:func:`DataFrame.to_latex()` will correctly format output when a string is
passed as the ``float_format`` argument (:issue:`21625` and :issue:`22270`)
Contributor

use a ',' rather than 'and'

@jreback
Contributor

jreback commented Sep 15, 2018

can you rebase

@jreback
Contributor

jreback commented Sep 18, 2018

can you rebase and see if you can address comments above

@tomneep
Contributor Author

tomneep commented Sep 18, 2018

Sorry I am away at the moment. I can have a look at this next week.

@jreback
Contributor

jreback commented Nov 23, 2018

can you merge master and update to comments

@pep8speaks

pep8speaks commented Nov 23, 2018

Hello @tomneep! Thanks for updating the PR.

Comment last updated on November 26, 2018 at 12:27 Hours UTC

@@ -1359,6 +1359,18 @@ def test_to_string_float_formatting(self):
'1 2.512000e-01')
assert df_s == expected

def test_to_string_float_format_no_fixed_width(self):
Contributor

you would need to add tests for to_latex and to_html in the appropriate files

@tomneep
Contributor Author

tomneep commented Nov 23, 2018

Changes above are just merging with the current master.
Sorry it's taken me so long to get back to this, I have some time now.

With fresh eyes I agree this probably isn't the best way to solve this, but I'm a bit stuck on how best to proceed.

Grepping the code, I see that fixed_width isn't used outside of FloatArrayFormatter (other than appearing in the base class __init__), but a couple of FloatArrayFormatter instances are created in pandas/core/indexes/numeric.py and pandas/core/internals/blocks.py with fixed_width=False.

A slightly different way of doing essentially the same thing (but without touching the *ArrayFormatters) would be to change format_array (https://github.com/pandas-dev/pandas/blob/master/pandas/io/formats/format.py#L846). For example, adding fixed_width=float_format is None to the fmt_klass instance has the same effect, with all tests passing. This is maybe slightly nicer, as you could still directly create a FloatArrayFormatter with a value of float_format and fixed_width=True.
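
The format_array alternative can be sketched roughly as follows. Everything here is a toy stand-in written under stated assumptions: ToyFloatFormatter and trim_common_trailing_zeros are hypothetical names, the default '%.6f' template is arbitrary, and the trimming loop is only one plausible way a fixed-width clean-up pass could turn '100' into '1', not a copy of the pandas implementation.

```python
def trim_common_trailing_zeros(strings):
    """Drop one trailing '0' at a time while every string still ends in '0'.

    Hypothetical clean-up pass for illustration only.
    """
    while strings and all(s.endswith('0') for s in strings):
        strings = [s[:-1] for s in strings]
    return strings


class ToyFloatFormatter:
    """Toy float formatter; not the pandas FloatArrayFormatter."""

    def __init__(self, values, float_format=None, fixed_width=True):
        self.values = values
        self.float_format = float_format
        self.fixed_width = fixed_width

    def get_result(self):
        fmt = self.float_format or '%.6f'  # assumed default template
        strings = [fmt % v for v in self.values]
        if self.fixed_width:
            strings = trim_common_trailing_zeros(strings)
        return strings


def format_array(values, float_format=None):
    # The idea from the comment: keep fixed-width behaviour only when the
    # caller did not supply an explicit float_format.
    formatter = ToyFloatFormatter(
        values, float_format=float_format, fixed_width=float_format is None
    )
    return formatter.get_result()


print(format_array([100.0], float_format='%.0f'))
print(ToyFloatFormatter([100.0], float_format='%.0f', fixed_width=True).get_result())
```

In this toy version the dispatch function avoids the trimming step for a user-supplied format, while constructing the formatter directly with fixed_width=True still opts back in.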

Any suggestions would be welcome.

@jreback
Contributor

jreback commented Nov 23, 2018

the way the formatters are done now is not super flexible, so your change is ok. if you can add tests for latex & html, that would be good.

@@ -0,0 +1,14 @@
<table border="1" class="dataframe">
Member

Don't have a very strong opinion atm, but wondering if we want dedicated files for tests this small. Could maybe include the HTML as part of the test, or alternately just test for the existence of the appropriate format in the result or the expression.

This doesn't need to be addressed in this PR but just bringing up as a general discussion point

Member

@WillAyd WillAyd left a comment

lgtm

@jreback jreback added this to the 0.24.0 milestone Nov 26, 2018
@jreback jreback merged commit b7294dd into pandas-dev:master Nov 26, 2018
@jreback
Contributor

jreback commented Nov 26, 2018

thanks!

@tomneep tomneep deleted the fl_format_fix branch November 26, 2018 14:21
Labels
Output-Formatting __repr__ of pandas objects, to_string
Projects
None yet
Development

Successfully merging this pull request may close these issues.

to_html w/ float_format="%.0f" renders 100.0 as 1
to_latex with float_format doesn't work like expected
4 participants