100000 assignments of .__sizeof__ cause a segfault on del #87053

Open
xxm mannequin opened this issue Jan 11, 2021 · 10 comments
Labels
3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump

Comments

@xxm
Mannequin

xxm mannequin commented Jan 11, 2021

BPO 42887
Nosy @terryjreedy, @ronaldoussoren, @vstinner, @tiran, @markshannon, @serhiy-storchaka, @WildCard65

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2021-01-11.07:38:04.066>
labels = ['interpreter-core', '3.10', 'type-crash']
title = '100000 assignments of .__sizeof__  cause a segfault on del'
updated_at = <Date 2021-01-19.17:07:10.964>
user = 'https://bugs.python.org/xxm'

bugs.python.org fields:

activity = <Date 2021-01-19.17:07:10.964>
actor = 'vstinner'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Interpreter Core']
creation = <Date 2021-01-11.07:38:04.066>
creator = 'xxm'
dependencies = []
files = []
hgrepos = []
issue_num = 42887
keywords = []
message_count = 10.0
messages = ['384797', '384801', '384802', '385128', '385132', '385134', '385177', '385187', '385265', '385267']
nosy_count = 8.0
nosy_names = ['terry.reedy', 'ronaldoussoren', 'vstinner', 'christian.heimes', 'Mark.Shannon', 'serhiy.storchaka', 'WildCard65', 'xxm']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'crash'
url = 'https://bugs.python.org/issue42887'
versions = ['Python 3.10']

@xxm
Mannequin Author

xxm mannequin commented Jan 11, 2021

In program 1 below, the method "__sizeof__()" is called and its result is assigned back to the same name many times. The program works well on Python 3.10. However, if I change "__sizeof__()" to "__sizeof__", a segmentation fault is reported. I think something is wrong in the parser when it deals with built-in attribute assignment.

program 1:
=========================

mystr  = "hello123"
for x in range(1000000):
    mystr = mystr.__sizeof__()
    print(mystr)

=========================
56
28
28
.......
28
28

Output: works as expected.

program 2:
==========================

mystr = "hello123"
for x in range(1000000):
        mystr = mystr.__sizeof__
        print(mystr)

==========================
<built-in method __sizeof__ of builtin_function_or_method object at 0x7f04d3e0c220>
......
<built-in method __sizeof__ of builtin_function_or_method object at 0x7f04d3e0c4f0>
<built-in method __sizeof__ of builtin_function_or_method object at 0x7f04d3e0c540>
Segmentation fault (core dumped)

Expected output: no segfault.

@xxm xxm mannequin added 3.10 only security fixes interpreter-core (Objects, Python, Grammar, and Parser dirs) type-crash A hard crash of the interpreter, possibly with a core dump labels Jan 11, 2021
@tiran
Member

tiran commented Jan 11, 2021

I can reproduce the issue. The stack trace is several hundred thousand (!) levels deep.

#0 _Py_DECREF (op=<built-in method __sizeof__ of builtin_function_or_method object at remote 0x7fffe60703b0>, lineno=514, filename=0x6570af "./Include/object.h")
at ./Include/object.h:448
#1 _Py_XDECREF (op=<built-in method __sizeof__ of builtin_function_or_method object at remote 0x7fffe60703b0>) at ./Include/object.h:514
#2 meth_dealloc (m=0x7fffe6070470) at Objects/methodobject.c:170
#3 0x0000000000466a99 in _Py_Dealloc (op=<optimized out>) at Objects/object.c:2209
#4 0x00000000005da2fa in _Py_DECREF (op=<optimized out>, lineno=514, filename=0x6570af "./Include/object.h") at ./Include/object.h:448
#5 _Py_XDECREF (op=<optimized out>) at ./Include/object.h:514
#6 meth_dealloc (m=0x7fffe60704d0) at Objects/methodobject.c:170
#7 0x0000000000466a99 in _Py_Dealloc (op=<optimized out>) at Objects/object.c:2209
#8 0x00000000005da2fa in _Py_DECREF (op=<optimized out>, lineno=514, filename=0x6570af "./Include/object.h") at ./Include/object.h:448
#9 _Py_XDECREF (op=<optimized out>) at ./Include/object.h:514
#10 meth_dealloc (m=0x7fffe6070530) at Objects/methodobject.c:170
#11 0x0000000000466a99 in _Py_Dealloc (op=<optimized out>) at Objects/object.c:2209
#12 0x00000000005da2fa in _Py_DECREF (op=<optimized out>, lineno=514, filename=0x6570af "./Include/object.h") at ./Include/object.h:448
#13 _Py_XDECREF (op=<optimized out>) at ./Include/object.h:514
#14 meth_dealloc (m=0x7fffe6070590) at Objects/methodobject.c:170
#15 0x0000000000466a99 in _Py_Dealloc (op=<optimized out>) at Objects/object.c:2209
#16 0x00000000005da2fa in _Py_DECREF (op=<optimized out>, lineno=514, filename=0x6570af "./Include/object.h") at ./Include/object.h:448
#17 _Py_XDECREF (op=<optimized out>) at ./Include/object.h:514
#18 meth_dealloc (m=0x7fffe60705f0) at Objects/methodobject.c:170
#19 0x0000000000466a99 in _Py_Dealloc (op=<optimized out>) at Objects/object.c:2209
#20 0x00000000005da2fa in _Py_DECREF (op=<optimized out>, lineno=514, filename=0x6570af "./Include/object.h") at ./Include/object.h:448
#21 _Py_XDECREF (op=<optimized out>) at ./Include/object.h:514
#22 meth_dealloc (m=0x7fffe6070650) at Objects/methodobject.c:170
...
#509737 _Py_XDECREF (op=<optimized out>) at ./Include/object.h:514
#509738 meth_dealloc (m=0x7fffe54ca6b0) at Objects/methodobject.c:170
#509739 0x0000000000466a99 in _Py_Dealloc (op=<optimized out>) at Objects/object.c:2209
#509740 0x00000000005da2fa in _Py_DECREF (op=<optimized out>, lineno=514, filename=0x6570af "./Include/object.h") at ./Include/object.h:448
#509741 _Py_XDECREF (op=<optimized out>) at ./Include/object.h:514
#509742 meth_dealloc (m=0x7fffe54ca710) at Objects/methodobject.c:170
#509743 0x0000000000466a99 in _Py_Dealloc (op=<optimized out>) at Objects/object.c:2209
#509744 0x00000000005da2fa in _Py_DECREF (op=<optimized out>, lineno=514, filename=0x6570af "./Include/object.h") at ./Include/object.h:448
#509745 _Py_XDECREF (op=<optimized out>) at ./Include/object.h:514
#509746 meth_dealloc (m=0x7fffe54ca770) at Objects/methodobject.c:170

...

@ronaldoussoren
Contributor

This is a recursion problem, "mystr" will be equivalent to 'hello123'.__sizeof__.__sizeof__. ...(100K repetition)... .__sizeof__. The dealloc of "mystr" will cause recursive calls to tp_dealloc along the entire chain and that can exhaust the C stack.
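
The chain described above can be inspected at a small, safe depth. This is only a sketch: `build_chain` and `chain_depth` are hypothetical helper names, and it assumes CPython, where a built-in method object exposes the object it is bound to as `__self__`.

```python
import types


def build_chain(obj, depth):
    """Rebind obj to obj.__sizeof__ `depth` times, as program 2 does."""
    for _ in range(depth):
        obj = obj.__sizeof__
    return obj


def chain_depth(method):
    """Count the built-in method objects wrapped around the innermost object."""
    depth = 0
    while isinstance(method, types.BuiltinMethodType):
        method = method.__self__  # step one link inward
        depth += 1
    return depth, method


# Depth kept small on purpose so that deallocation cannot exhaust the C stack.
chain = build_chain("hello123", 1000)
depth, innermost = chain_depth(chain)
print(depth, repr(innermost))
```

Each link holds the only reference to the next one, which is why dropping the outermost link triggers one nested dealloc call per link.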

@terryjreedy
Member

Xinmeng, to verify Ronald's explanation, run this instead

mystr  = "hello123"
for x in range(1000000):
    mystr = mystr.__sizeof__
input('>')  # Hit Enter to continue.
del mystr   # Expect crash here.
input('<')  # And never get here.

@terryjreedy terryjreedy changed the title from "Multiple assignments of attribute "__sizeof__" will cause a segfault" to "100000 assignments of .__sizeof__ cause a segfault on del" Jan 15, 2021
@xxm
Mannequin Author

xxm mannequin commented Jan 16, 2021

Thank you. But I am not sure this is a recursion problem. Please see the following example, where I replace "__sizeof__" with "__class__". There is no segmentation fault and everything goes well.

========================

mystr  = "hello123"
print(dir(mystr))
for x in range(1000000):
    mystr = mystr.__class__
    print(mystr)

=========================
and

=========================

mystr  = "hello123"
for x in range(1000000):
    mystr = mystr.__class__
input('>')  # Hit Enter to continue.
del mystr   # Expect crash here.
input('<')  # And never get here

=========================
No segmentation fault

@WildCard65
Mannequin

WildCard65 mannequin commented Jan 16, 2021

Jumping in here to explain why '__class__' doesn't crash when '__sizeof__' does:

When '__class__' is fetched, it returns a new reference to the object's type.

When '__sizeof__' is fetched, on the other hand, a new object ('types.BuiltinMethodType') is allocated on the heap and returned to the caller.

This object also has a '__sizeof__' that does the same (as it's implemented on 'object').

So yes, you are exhausting the C runtime stack by de-allocating over a THOUSAND objects.

You can see this happen by watching the memory usage of Python steadily climb.
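
The difference between the two attributes can be checked directly. The snippet below is a minimal sketch, assuming CPython; the variable names are illustrative only.

```python
import types

s = "hello123"

# __class__ returns an existing object, and repeated lookups reach a fixed
# point (type.__class__ is type), so no chain of new objects builds up.
assert s.__class__ is str
assert s.__class__.__class__ is type
assert type.__class__ is type

# __sizeof__ returns a freshly created built-in method object on each access.
m1 = s.__sizeof__
m2 = s.__sizeof__
assert isinstance(m1, types.BuiltinMethodType)
assert m1 is not m2          # two live objects, created by two lookups
assert m1.__self__ is s      # each one keeps its target alive

# Fetching __sizeof__ on the method object wraps it in yet another object.
wrapped = m1.__sizeof__
assert wrapped.__self__ is m1
print("all checks passed")
```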

@ronaldoussoren
Contributor

Note that there is a way to avoid this crash using the trashcan API (see the use of Py_TRASHCAN_BEGIN in various implementations). This API is generally only used for recursive data structures because it has a performance cost (based on what I've read in other issues).

@serhiy-storchaka
Member

Yes, there is an overhead to using the trashcan mechanism. This is why it is only used in data collections, because it is expected that your data can contain arbitrarily long chains of links. There are many ways to create arbitrarily long chains with other objects, but it does not happen in common code. For methods the cost would be especially high, because method objects are usually short-lived and the performance of creating/destroying them is critical.

AFAIK the same issue (maybe not with __sizeof__, but with another method of the basic object class, like __reduce__) was already reported earlier. I propose to close this issue as "won't fix".

@terryjreedy
Member

Mark, would your proposal in PEP-651 fix this case?

@markshannon
Member

It won't solve the problem.
Maybe it would make it easier to avoid the segfault, but some sort of recursion/overflow check is needed.

It might make the use of the trashcan cheaper, as it only need be used when stack space is running low.

Ultimately, the cycle GC needs to be iterative, rather than recursive. That will take a *lot* of work though.
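
The contrast between recursive and iterative teardown can be illustrated from Python code. This is not the trashcan API and not a proposed fix, only a sketch of the idea under the assumption of CPython reference counting; `release_iteratively` is a hypothetical helper, and the caller must hand over its only reference through a one-element list.

```python
import types


def release_iteratively(holder):
    """Free the chain stored in holder[0] one link at a time."""
    obj = holder.pop()  # now the only reference to the outermost link
    steps = 0
    while isinstance(obj, types.BuiltinMethodType):
        # obj.__self__ takes a reference to the inner link first; rebinding
        # obj then frees only the outer link, since the inner one is still
        # referenced, so no nested dealloc calls pile up on the C stack.
        obj = obj.__self__
        steps += 1
    return steps


chain = "hello123"
for _ in range(5000):  # depth kept modest so the snippet stays safe
    chain = chain.__sizeof__

holder = [chain]
del chain  # the list now owns the chain
steps = release_iteratively(holder)
print(steps)
```

The design point is that each iteration drops exactly one object while its successor is still alive, so the depth of the C call stack stays constant regardless of chain length.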

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022