Commit 6cccd67

DOC: add examples for importing and pivoting data in Excel
1 parent e79f1ff commit 6cccd67

2 files changed

+36
-5
lines changed

doc/source/_static/excel_pivot.png

156 KB

doc/source/getting_started/comparison/comparison_with_excel.rst

Lines changed: 36 additions & 5 deletions
@@ -64,6 +64,26 @@ effectively.
 Commonly used Excel functionalities
 -----------------------------------

+Importing data
+~~~~~~~~~~~~~~
+
+Both `Excel <https://support.microsoft.com/en-us/office/import-data-from-external-data-sources-power-query-be4330b3-5356-486c-a168-b68e9e616f5a>`_
+and :ref:`pandas <10min_tut_02_read_write>` can import data from various sources in various
+formats. Let's load and display the `tips <https://github.com/pandas-dev/pandas/blob/master/pandas/tests/io/data/csv/tips.csv>`_
+dataset from the pandas tests, which is a CSV file.
+
+In Excel, you would download and then `open the CSV <https://support.microsoft.com/en-us/office/import-or-export-text-txt-or-csv-files-5250ac4c-663c-47ce-937b-339e391393ba>`_.
+In pandas, you pass the URL or local path of the CSV file to :func:`~pandas.read_csv`:
+
+.. ipython:: python
+
+    url = (
+        "https://raw.github.com/pandas-dev"
+        "/pandas/master/pandas/tests/io/data/csv/tips.csv"
+    )
+    tips = pd.read_csv(url)
+    tips
+
 Fill Handle
 ~~~~~~~~~~~

@@ -121,12 +141,23 @@ This is supported in pandas via :meth:`~DataFrame.drop_duplicates`.
     df.drop_duplicates(["class", "student_count"])


-Pivot Table
-~~~~~~~~~~~
+Pivot Tables
+~~~~~~~~~~~~
+
+`PivotTables <https://support.microsoft.com/en-us/office/create-a-pivottable-to-analyze-worksheet-data-a9a84538-bfe9-40a9-a8e9-f99134456576>`_
+from Excel can be replicated in pandas through :ref:`reshaping`. Using the ``tips`` dataset again,
+let's find the average gratuity by size of the party and sex of the server.
+
+In Excel, we use the following configuration for the PivotTable:

-This can be achieved by using ``pandas.pivot_table`` for examples and reference,
-please see `pandas.pivot_table <http://pandas.pydata.org/pandas-docs/stable/generated/pandas.pivot_table.html>`__
+.. image:: ../../_static/excel_pivot.png
+    :align: center
+
+The equivalent in pandas:
+
+.. ipython:: python

+    pd.pivot_table(tips, values='tip', index=['size'], columns=['sex'], aggfunc=np.average)


 Formulas
 ~~~~~~~~
@@ -197,7 +228,7 @@ VLOOKUP
 Adding a row
 ~~~~~~~~~~~~

-To appended a row, we can just assign values to an index using ``loc``.
+To append a row, we can just assign values to an index using :meth:`~DataFrame.loc`.

 NOTE: If the index already exists, the values in that index will be overwritten.
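
The two behaviours this diff documents (pivoting the ``tips`` data and adding a row with ``loc``) can be tried offline with a minimal sketch. The small DataFrame below is a hypothetical stand-in for the real tips CSV, with invented values, so no download is needed. It uses ``aggfunc="mean"`` rather than the diff's ``np.average`` only to avoid a NumPy import.

```python
import pandas as pd

# Hypothetical stand-in for a few rows of the tips dataset
# (values invented for illustration, not taken from the real CSV).
tips = pd.DataFrame(
    {
        "tip": [1.0, 3.0, 2.0, 4.0, 5.0],
        "sex": ["Female", "Male", "Female", "Male", "Male"],
        "size": [2, 2, 2, 3, 3],
    }
)

# Mean tip for each party size (rows) and sex (columns),
# mirroring the PivotTable layout described in the diff.
pivot = pd.pivot_table(
    tips, values="tip", index=["size"], columns=["sex"], aggfunc="mean"
)
print(pivot)

# Assigning to a label that is not yet in the index appends a row ...
appended = tips.copy()
appended.loc[len(appended)] = [2.5, "Female", 4]

# ... while assigning to a label that already exists overwrites that row.
appended.loc[0] = [9.0, "Male", 1]
print(appended)
```

Combinations that do not occur in the data (here, a party of 3 with ``Female``) show up as ``NaN`` in the pivot, much like empty cells in an Excel PivotTable.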
