
Huge html with df.style.render to due css duplications #20695


Closed
wiso opened this issue Apr 14, 2018 · 3 comments · Fixed by #23019
Labels
Enhancement IO HTML read_html, to_html, Styler.apply, Styler.applymap Performance Memory or execution speed performance

Comments

@wiso

wiso commented Apr 14, 2018

When creating HTML with df.style, e.g.

df.style.apply(color_f, axis=1).render()

pandas assigns a unique CSS class to each HTML cell of the table. This produces huge HTML output, whereas cells with the same style could share a single class.
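The issue does not show color_f. For anyone who wants to reproduce the call, a hypothetical row-wise function might look like the sketch below. The name matches the snippet above, but the body and the "negative values in red" rule are assumptions made for illustration. With axis=1, Styler.apply passes one row at a time and expects one CSS string per cell in return.

```python
def color_f(row):
    """Hypothetical style function: return one CSS declaration string per cell."""
    # Assumption: negative values are drawn in red, everything else stays unstyled.
    return ['color: red' if value < 0 else '' for value in row]


# Intended use with pandas (not executed here):
#     df.style.apply(color_f, axis=1).render()
```

Because the function only iterates over the row, it can be checked with a plain list before handing it to the Styler.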


commit: None
python: 2.7.14.final.0
python-bits: 64
OS: Linux
OS-release: 4.15.14-300.fc27.x86_64
machine: x86_64
processor: x86_64
byteorder: little
LC_ALL: None
LANG: it_IT.UTF-8
LOCALE: None.None

pandas: 0.22.0
pytest: 3.1.3
pip: 9.0.1
setuptools: 36.0.1
Cython: 0.27.3
numpy: 1.14.2
scipy: 1.0.0
pyarrow: None
xarray: None
IPython: 5.4.1
sphinx: None
patsy: 0.4.1
dateutil: 2.7.2
pytz: 2018.4
blosc: None
bottleneck: 1.2.1
tables: 3.4.2
numexpr: 2.6.2
feather: None
matplotlib: 2.0.2
openpyxl: None
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: None
lxml: None
bs4: 4.6.0
html5lib: 0.9999999
sqlalchemy: 1.1.11
pymysql: None
psycopg2: None
jinja2: 2.9.6
s3fs: None
fastparquet: None
pandas_gbq: None
pandas_datareader: None

@TomAugspurger
Contributor

Can you make a small copy-pastable example, and note which CSS classes you would want to exclude?

We could offer a couple of optimizations:

  1. Disable the ID per cell
  2. Disable row / col classes for cells that aren't referenced by a style

This will be a bit of work though.
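To make the two options concrete, here is a toy renderer, not pandas' actual code. The function names, the cell_ids and sparse_classes flags, and the T_demo table id are all hypothetical. It assumes styles would target classes rather than ids once the per-cell id is dropped.

```python
def render_cell(table_id, row, col, value, cell_ids=True, row_col_classes=True):
    """Render one <td>, optionally omitting the id and the rowN/colN classes."""
    attrs = []
    if cell_ids:
        attrs.append('id="{}row{}_col{}"'.format(table_id, row, col))
    classes = "data row{} col{}".format(row, col) if row_col_classes else "data"
    attrs.append('class="{}"'.format(classes))
    return "<td {}>{}</td>".format(" ".join(attrs), value)


def render_body(values, styled, table_id="T_demo", cell_ids=True, sparse_classes=False):
    """Render all cells; `styled` is the set of (row, col) pairs a style refers to."""
    cells = []
    for r, row in enumerate(values):
        for c, value in enumerate(row):
            # With sparse_classes, only cells referenced by a style keep rowN/colN.
            keep = (not sparse_classes) or (r, c) in styled
            cells.append(render_cell(table_id, r, c, value, cell_ids, keep))
    return "\n".join(cells)


values = [[0.1, 0.2], [0.3, 0.4]]
styled = {(0, 0)}  # only one cell is targeted by a style
full = render_body(values, styled)
slim = render_body(values, styled, cell_ids=False, sparse_classes=True)
print(len(full), len(slim))
```

The saving per cell is small, but it is paid once for every cell, so it scales with the size of the table.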

@TomAugspurger TomAugspurger added IO HTML read_html, to_html, Styler.apply, Styler.applymap Enhancement Performance Memory or execution speed performance Difficulty Intermediate labels Apr 14, 2018
@TomAugspurger TomAugspurger added this to the Next Major Release milestone Apr 14, 2018
@wiso
Author

wiso commented Apr 15, 2018

Sure, here is a very minimal example:

import pandas as pd
import numpy as np
df = pd.DataFrame(np.random.random(size=(2, 2)), columns=list('AB'))
print(df.style.render())  # parenthesized form works on Python 2 and 3

As you can see, I am not applying any special style. Even in this trivial case I get one CSS class and id for each cell.

<style  type="text/css" >
</style>  
<table id="T_af65670a_40b3_11e8_9422_ac220bbe67c8" > 
<thead>    <tr> 
        <th class="blank level0" ></th> 
        <th class="col_heading level0 col0" >A</th> 
        <th class="col_heading level0 col1" >B</th> 
    </tr></thead> 
<tbody>    <tr> 
        <th id="T_af65670a_40b3_11e8_9422_ac220bbe67c8level0_row0" class="row_heading level0 row0" >0</th> 
        <td id="T_af65670a_40b3_11e8_9422_ac220bbe67c8row0_col0" class="data row0 col0" >0.628483</td> 
        <td id="T_af65670a_40b3_11e8_9422_ac220bbe67c8row0_col1" class="data row0 col1" >0.961722</td> 
    </tr>    <tr> 
        <th id="T_af65670a_40b3_11e8_9422_ac220bbe67c8level0_row1" class="row_heading level0 row1" >1</th> 
        <td id="T_af65670a_40b3_11e8_9422_ac220bbe67c8row1_col0" class="data row1 col0" >0.0814626</td> 
        <td id="T_af65670a_40b3_11e8_9422_ac220bbe67c8row1_col1" class="data row1 col1" >0.723978</td> 
    </tr></tbody> 
</table> 

In my real case I have thousands of rows and hundreds of lines, so I end up with an HTML file of ~10 MB.

What you propose seems OK (I don't know how the id is used), but as a final solution I guess you have to group cells so that those with the same style share the same CSS class.
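A minimal sketch of that grouping idea, in plain Python and independent of pandas internals: every distinct declaration string gets one shared class, and each styled cell only records which class it uses. The group_styles helper and the s0, s1, ... class names are hypothetical.

```python
from collections import OrderedDict


def group_styles(cell_styles):
    """Give cells with identical CSS declarations the same class name.

    cell_styles maps (row, col) to a CSS declaration string ('' means unstyled).
    Returns (css_text, cell_classes), where cell_classes maps (row, col) to a class.
    """
    class_for_style = OrderedDict()
    cell_classes = {}
    for position in sorted(cell_styles):
        style = cell_styles[position]
        if not style:
            continue  # unstyled cells need no class at all
        if style not in class_for_style:
            class_for_style[style] = "s{}".format(len(class_for_style))
        cell_classes[position] = class_for_style[style]
    css = "\n".join(
        ".{} {{ {} }}".format(name, style) for style, name in class_for_style.items()
    )
    return css, cell_classes


cell_styles = {
    (0, 0): "color: red",
    (0, 1): "",
    (1, 0): "color: red",
    (1, 1): "font-weight: bold",
}
css, classes = group_styles(cell_styles)
print(css)
```

With this scheme the stylesheet grows with the number of distinct styles rather than with the number of cells.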

@TomAugspurger
Contributor

TomAugspurger commented Apr 15, 2018

You're welcome to take a look at

def render(self, **kwargs):
and
def _translate(self):
to explore how things can be (optionally) optimized.


2 participants