-
Notifications
You must be signed in to change notification settings - Fork 2.2k
Add tests for accessing uninitialized holders #5654
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
f697672
to
465d4fb
Compare
c1f81bf
to
81c324a
Compare
81c324a
to
6c6db0e
Compare
Hi @XuehaiPan, to set expectations: Unless there is a surprise, any fix will have to come from you.
|
0715201
to
1b219be
Compare
1b219be
to
2c73d2b
Compare
tests/test_invalid_holder_access.py
Outdated
# During the unpickling process, Python firstly allocates the object with | ||
# the `__new__` method and then calls the `__setstate__`` method to set the | ||
# state of the object. | ||
# | ||
# obj = cls.__new__(cls) | ||
# obj.__setstate__(state) | ||
# | ||
# The `__init__` method is not called during unpickling. | ||
# So, if the `__setstate__` method raises an error, the object will be in | ||
# an uninitialized state. The GC will traverse the uninitialized C++ STL | ||
# container and cause a segmentation fault. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please explain (like you would in documentation) why this is useful, and what exactly cannot be achieved with a normal init.
@rwgk I added a test and comments for this in the latest comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, this goes into a different direction?
Is the primary goal to fix this hole in the pickle support?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see we touched on this before:
Unfortunately,
__getinitargs__
does not fit my use case because I do not bind the C++ ctor to Python__init__
.
Well, that's a choice.
That, or we need to find a mechanism to gracefully handle exceptions in __setstate__
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding a public API to check whether the holder is constructed can resolve my issue.
// Maybe need to check if the object is an instance of pybind11 type
bool is_holder_constructed(PyObject *object) {
auto *const instance = reinterpret_cast<py::detail::instance *>(self_base);
return instance->get_value_and_holder().holder_constructed();
}
I also updated the documentation about this but accessing the pybind11::detail
namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The general idea sounds good, but I have doubts about adding it as pybind11:: is_holder_constructed()
.
To be safe enough for a public API, we'd need to guard with a check: "is this actually a pybind11 class_-wrapped type"?
Something like PyTypeObject *srctype = Py_TYPE(src.ptr());
and all_type_info(srctype)
probably (see pybind11/detail/type_caster_base.h).
But in the context of custom_type_setup
, tp_traverse
, tp_clear
, is that actually needed?
Another idea:
I see this in the tp_traverse
and tp_clear
implementations:
auto &self = py::cast<OwnsPythonObjects &>(py::handle(self_base));
That could do the holder_constructed
check internally, my guess is cheaply, although I could be wrong.
We might have to change the recipe to:
auto *self = py::cast<OwnsPythonObjects *>(py::handle(self_base));
And if the holder isn't constructed, this could return nullptr
.
Would that still work for your purposes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see this in the
tp_traverse
andtp_clear
implementations:auto &self = py::cast<OwnsPythonObjects &>(py::handle(self_base));
That could do the
holder_constructed
check internally, my guess is cheaply, although I could be wrong.
If the instance is not initialized, what behavior would be expected? Should it raise an error? I don't know if it is allowed to raise exceptions in tp_traverse
and tp_clear
. These two functions should always succeed, I think.
We might have to change the recipe to:
auto *self = py::cast<OwnsPythonObjects *>(py::handle(self_base));
And if the holder isn't constructed, this could return
nullptr
.
This works for me! I think this would be the best solution to have minimal impact on the downstream and be easy to migrate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This works for me! I think this would be the best solution to have minimal impact on the downstream and be easy to migrate.
Do you think you could try? — I can only offer to assist, by answering question if you run into roadblocks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It crossed my mind: Possibly the suggested change could get very involved ("hairy"). In case you start to feel it's too much, I'd be OK to add pybind11::detail:: is_holder_constructed()
as a semi-private API, without runtime guards, but a carefully written comment to explain the critical precondition: the caller is responsible for ensuring that the reinterpret_cast
is valid.
BTW: You can help me following along by not force pushing. I recommend never force pushing, just keep merging in from master (i.e. do not rebase). We'll squash-merge in the end anyway. |
I strongly recommend rebasing to keep following master, and never, ever create a merge commit. There's an even an automated rebase button in the dropdown below. If you create merge commits, eventually the pull request becomes unmergable and it is horrible to try to pick out the actual changes. It also becomes really hard to see what you've changed and what change upstream; eventually GitHub can't display it either. I've had a had an important long-standing PR in pypa/build die because of this. I spend hours trying to pick out the actual changes before giving up. Linear history is really important in git! Just don't rebase to change previous commits, only to keep the PR fresh. (PS: if I'm the primary reviewer, then you can change commits too, I don't mind, I prefer the PR to tell a story. But this is the balance we've agreed to for general PRs here.) PS: Feel free to follow @rwgk's wishes for this PR to get his help, this is general advice since he gave general advice. |
docs/advanced/classes.rst
Outdated
#if PY_VERSION_HEX >= 0x03090000 // Python 3.9 | ||
Py_VISIT(Py_TYPE(self_base)); | ||
#endif | ||
if (!py::detail::is_holder_constructed(self_base)) [[unlikely]] { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will look a lot more straightforward if we simply guard the existing code with
if (py::detail::is_holder_constructed(self_base)) {
Could you please update the code in tests/test_custom_type_setup.cpp accordingly?
Do we still need the new test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we still need the new test?
I think it can serve as a backlog if we implement #5654 (comment) eventually.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think what'll be most practical and best at this stage:
-
Pull out the changes to docs/advanced/classes.rst and include/pybind11/detail/value_and_holder.h into a separate PR.
-
Update tests/test_custom_type_setup.cpp to be 1:1 with the updated documentation in classes.rst
Merge that PR.
Then you can merge the new master here and work on this PR in the future.
The tests will need more work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Opened #5669.
Description
Add tests for #4549.
__new__
does not initialize STL containers and resulting in undefined behavior #4549Suggested changelog entry: