Skip to content

Add tests for accessing uninitialized holders #5654

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Draft
wants to merge 15 commits into
base: master
Choose a base branch
from

Conversation

XuehaiPan
Copy link
Contributor

@XuehaiPan XuehaiPan requested a review from henryiii as a code owner May 12, 2025 03:41
@XuehaiPan XuehaiPan force-pushed the invalid-holder-access branch 8 times, most recently from f697672 to 465d4fb Compare May 12, 2025 06:09
@XuehaiPan XuehaiPan force-pushed the invalid-holder-access branch from 81c324a to 6c6db0e Compare May 12, 2025 13:02
@XuehaiPan
Copy link
Contributor Author

cc @rwgk @henryiii for code review

@rwgk
Copy link
Collaborator

rwgk commented May 13, 2025

Hi @XuehaiPan, to set expectations: Unless there is a surprise, any fix will have to come from you.

  • I recommend that you reduce this PR as much as possible ("minimal reproducer"). Concretely, reduce VectorOwns4PythonObjects to just append and size, and maybe change the name to something shorter (e.g. VecObj). Also remove all things related to the faulthandler/multiprocessing, those are just distracting you from working on a fix. (Please add [skip ci] to your commits, to not trigger the CI. The main purpose of the PR is that we can all inspect and clone it as needed.)

  • Please explain (like you would in documentation) why this is useful, and what exactly cannot be achieved with a normal init:

m.VectorOwns4PythonObjects.__new__(m.VectorOwns4PythonObjects)
  • Next, maybe, try a minimal change: preempt the apparently non-deterministic (?) undefined behavior by null-initializing the holder (?), so that you reliably get a segfault when it is dereferenced.

  • Usually, from work like that I get ideas for real fixes.

@XuehaiPan XuehaiPan force-pushed the invalid-holder-access branch from 0715201 to 1b219be Compare May 13, 2025 17:40
@XuehaiPan XuehaiPan force-pushed the invalid-holder-access branch from 1b219be to 2c73d2b Compare May 13, 2025 17:43
Comment on lines 136 to 146
# During the unpickling process, Python firstly allocates the object with
# the `__new__` method and then calls the `__setstate__`` method to set the
# state of the object.
#
# obj = cls.__new__(cls)
# obj.__setstate__(state)
#
# The `__init__` method is not called during unpickling.
# So, if the `__setstate__` method raises an error, the object will be in
# an uninitialized state. The GC will traverse the uninitialized C++ STL
# container and cause a segmentation fault.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please explain (like you would in documentation) why this is useful, and what exactly cannot be achieved with a normal init.

@rwgk I added a test and comments for this in the latest comment.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, this goes into a different direction?

Is the primary goal to fix this hole in the pickle support?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see we touched on this before:

Unfortunately, __getinitargs__ does not fit my use case because I do not bind the C++ ctor to Python __init__.

Well, that's a choice.

That, or we need to find a mechanism to gracefully handle exceptions in __setstate__.

Copy link
Contributor Author

@XuehaiPan XuehaiPan May 13, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Adding a public API to check whether the holder is constructed can resolve my issue.

// Maybe need to check if the object is an instance of pybind11 type
bool is_holder_constructed(PyObject *object) {
    auto *const instance = reinterpret_cast<py::detail::instance *>(self_base);
    return instance->get_value_and_holder().holder_constructed();
}

I also updated the documentation about this but accessing the pybind11::detail namespace.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The general idea sounds good, but I have doubts about adding it as pybind11:: is_holder_constructed().

To be safe enough for a public API, we'd need to guard with a check: "is this actually a pybind11 class_-wrapped type"?

Something like PyTypeObject *srctype = Py_TYPE(src.ptr()); and all_type_info(srctype) probably (see pybind11/detail/type_caster_base.h).

But in the context of custom_type_setup, tp_traverse, tp_clear, is that actually needed?

Another idea:

I see this in the tp_traverse and tp_clear implementations:

                auto &self = py::cast<OwnsPythonObjects &>(py::handle(self_base));

That could do the holder_constructed check internally, my guess is cheaply, although I could be wrong.

We might have to change the recipe to:

                auto *self = py::cast<OwnsPythonObjects *>(py::handle(self_base));

And if the holder isn't constructed, this could return nullptr.

Would that still work for your purposes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see this in the tp_traverse and tp_clear implementations:

                auto &self = py::cast<OwnsPythonObjects &>(py::handle(self_base));

That could do the holder_constructed check internally, my guess is cheaply, although I could be wrong.

If the instance is not initialized, what behavior would be expected? Should it raise an error? I don't know if it is allowed to raise exceptions in tp_traverse and tp_clear. These two functions should always succeed, I think.

We might have to change the recipe to:

                auto *self = py::cast<OwnsPythonObjects *>(py::handle(self_base));

And if the holder isn't constructed, this could return nullptr.

This works for me! I think this would be the best solution to have minimal impact on the downstream and be easy to migrate.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This works for me! I think this would be the best solution to have minimal impact on the downstream and be easy to migrate.

Do you think you could try? — I can only offer to assist, by answering question if you run into roadblocks.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It crossed my mind: Possibly the suggested change could get very involved ("hairy"). In case you start to feel it's too much, I'd be OK to add pybind11::detail:: is_holder_constructed() as a semi-private API, without runtime guards, but a carefully written comment to explain the critical precondition: the caller is responsible for ensuring that the reinterpret_cast is valid.

@rwgk
Copy link
Collaborator

rwgk commented May 13, 2025

BTW: You can help me following along by not force pushing.

I recommend never force pushing, just keep merging in from master (i.e. do not rebase). We'll squash-merge in the end anyway.

@henryiii
Copy link
Collaborator

henryiii commented May 13, 2025

I strongly recommend rebasing to keep following master, and never, ever create a merge commit. There's an even an automated rebase button in the dropdown below. If you create merge commits, eventually the pull request becomes unmergable and it is horrible to try to pick out the actual changes. It also becomes really hard to see what you've changed and what change upstream; eventually GitHub can't display it either. I've had a had an important long-standing PR in pypa/build die because of this. I spend hours trying to pick out the actual changes before giving up. Linear history is really important in git!

Just don't rebase to change previous commits, only to keep the PR fresh. (PS: if I'm the primary reviewer, then you can change commits too, I don't mind, I prefer the PR to tell a story. But this is the balance we've agreed to for general PRs here.)

PS: Feel free to follow @rwgk's wishes for this PR to get his help, this is general advice since he gave general advice.

#if PY_VERSION_HEX >= 0x03090000 // Python 3.9
Py_VISIT(Py_TYPE(self_base));
#endif
if (!py::detail::is_holder_constructed(self_base)) [[unlikely]] {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will look a lot more straightforward if we simply guard the existing code with

if (py::detail::is_holder_constructed(self_base)) {

Could you please update the code in tests/test_custom_type_setup.cpp accordingly?

Do we still need the new test?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we still need the new test?

I think it can serve as a backlog if we implement #5654 (comment) eventually.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think what'll be most practical and best at this stage:

  • Pull out the changes to docs/advanced/classes.rst and include/pybind11/detail/value_and_holder.h into a separate PR.

  • Update tests/test_custom_type_setup.cpp to be 1:1 with the updated documentation in classes.rst

Merge that PR.

Then you can merge the new master here and work on this PR in the future.

The tests will need more work.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Opened #5669.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants