wr.s3.to_parquet fails when writing a DataFrame with an ExtensionDtype index, resulting in the error
AttributeError: 'Index' object has no attribute 'head'
Specifically, the failure happens when using pandas==1.4.* and awswrangler==2.15.*. This behavior is consistent on both pyarrow==5.0.0 and pyarrow==7.0.0. It appears to be a regression in the 2.15.0 release, as it works fine with awswrangler==2.14.*.
It looks like it continues to fail when the index has no name.
```python
import pandas as pd
import awswrangler as wr

# Build a DataFrame whose index has an extension dtype (Int64), then drop the index name
df = pd.DataFrame(
    {"col1": [1, 2, 3], "col2": [1, 2, 3]}, dtype=pd.Int64Dtype()
).set_index("col1")
df.index.name = None

wr.s3.to_parquet(
    df,
    path="s3://bucket/tmp.parquet",
    index=True,
    dataset=False,
)
# throws error
```
This is a common case when dealing with DataFrames generated by user functions.
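Until a fix lands, one possible workaround is to move the index into a regular column before writing, so that the writer never has to handle an extension-dtype index. The sketch below is an assumption on my part and has not been verified against awswrangler 2.15.*; `move_index_to_columns` and its `default_name` parameter are hypothetical names, not part of pandas or awswrangler, and the `wr.s3.to_parquet` call is left commented out.

```python
import pandas as pd


def move_index_to_columns(df: pd.DataFrame, default_name: str = "index") -> pd.DataFrame:
    """Hypothetical helper: return a copy of df with its index stored as a column."""
    out = df.copy()
    # Give an unnamed index an explicit name so the resulting column is predictable.
    if out.index.name is None:
        out.index.name = default_name
    # reset_index moves the index into a column and installs a default RangeIndex.
    return out.reset_index()


# Same frame as in the report: extension-dtype index with no name.
df = pd.DataFrame(
    {"col1": [1, 2, 3], "col2": [1, 2, 3]}, dtype=pd.Int64Dtype()
).set_index("col1")
df.index.name = None

flat = move_index_to_columns(df)

# Assumed usage (not executed here): write without the index.
# import awswrangler as wr
# wr.s3.to_parquet(flat, path="s3://bucket/tmp.parquet", index=False, dataset=False)
```

The trade-off is that the index is stored as an ordinary column, so readers have to call `set_index` again after loading the file.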
Describe the bug
Regression of #1188 in awswrangler 2.15.*
`wr.s3.to_parquet` fails when writing a `DataFrame` with an ExtensionDtype index, resulting in the error `AttributeError: 'Index' object has no attribute 'head'`.
Specifically, the failure happens when using `pandas==1.4.*` and `awswrangler==2.15.*`. This behavior is consistent on both `pyarrow==5.0.0` and `pyarrow==7.0.0`. It appears to be a regression in the 2.15.0 release, as it works fine with `awswrangler==2.14.*`.
How to Reproduce
Expected behavior
Successful write of parquet file to S3 location
Your project
No response
Screenshots
No response
OS
Linux / Debian 10.12
Python version
3.9
AWS DataWrangler version
2.15.0, 2.15.1
Additional context
Pandas 1.4.2