umbrella bug: asyncio interoperability #171

Closed
njsmith opened this issue May 24, 2017 · 8 comments

@njsmith
Member

njsmith commented May 24, 2017

There are several discussions here that are somewhat logically independent, but linked:

  • Should we use the asyncio (or some other third party) event loop internally to implement our lowest-level IO primitives? This wouldn't necessarily change anything user visible; it would just outsource the dirty business of calling epoll and friends to someone else. Let's call this feature io-via-asyncio.

  • Should it be possible to run trio on top of an asyncio event loop? e.g., loop.run_until_complete(trio.run_in_asyncio, ...), to allow asyncio applications to call into trio code. Let's call this trio-libs-on-asyncio.

  • How can we best allow asyncio libraries to be used from trio? Let's call this asyncio-libs-on-trio.

Some initial thoughts:

io-via-asyncio

The major challenges here would be coming up with a shim layer that implements trio's semantics in terms of the asyncio APIs, and extending those APIs where necessary. @1st1 has offered to add whatever APIs we need, which is great, but it isn't immediately obvious where to start.

In practice, it's very unlikely we could actually use the stdlib asyncio default event loop, because we'll definitely need at least some enhancements and bug fixes and the stdlib doesn't really get those on any kind of useful schedule. (Curio's experience with the selectors module has also made me wary of depending on the stdlib for this kind of thing.) So the assumption should be that we'd be using a third-party implementation like uvloop, or a hot-off-the-presses unstable version of asyncio ripped out of cpython master. (And this makes it faster to get bug fixes, but I think enhancements would still need to go on the PEP / CPython release timescale?)

uvloop doesn't play well with pypy (because of the Cython), and inherits a number of limitations from libuv, e.g. no pluggable clock support, no cancellation support for most operations on windows (libuv's source code does not contain the string CancelIoEx), and I don't see how to make wait_all_tasks_blocked work without some pretty extreme workarounds. If the pitch is "but this way you don't have to write your own I/O code!" then the part where I end up having to write a bunch of I/O code in C makes it somewhat less compelling :-(

And obviously the stdlib asyncio loops have similar limitations, or else uvloop couldn't be a drop-in replacement, in addition to the part where they aren't currently shipped in a usable form. And e.g. AFAICT from a quick look asyncio's IOCP cancellation is just broken (_winapi.Overlapped.cancel is synchronous, but CancelIoEx is asynchronous – see e.g. the "Wait for the I/O subsystem to acknowledge our cancellation" dance that correct code has to do). Plus – and this is perhaps the most important issue – the current API is rather limited and using it would require overcoming some extreme abstraction skew. Implementing a fake socket object on top of a protocol/transport pair is going to involve a lot of complicated and relatively inefficient code, and then how do we implement sendmsg? Or raw or seqpacket or AF_BLUETOOTH sockets? trio supports all this stuff right now.

That's assuming we use the "protocol" APIs, which are the main ones and the only fully portable ones. We could also potentially restrict ourselves to just using add_reader and add_writer, and sticking to the select reactor on Windows (since the iocp reactor doesn't support these); that's actually enough to implement all the things we really support right now. But then we can't properly implement things like subprocess support (requires access to the kqueue object on kqueue platforms), or ever support iocp (see #52).
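For a sense of what that restricted add_reader/add_writer subset buys, here's a sketch of a trio-style wait_readable built on nothing but add_reader. This is illustrative only – trio's real primitive is not implemented this way, and the helper names are made up:

```python
import asyncio
import socket

async def wait_readable(sock):
    # Block the calling task until sock is readable, using only the
    # portable add_reader/remove_reader callback API.
    loop = asyncio.get_running_loop()
    fut = loop.create_future()
    loop.add_reader(sock.fileno(), fut.set_result, None)
    try:
        await fut
    finally:
        loop.remove_reader(sock.fileno())

async def main():
    a, b = socket.socketpair()
    a.setblocking(False)
    b.send(b"x")          # make `a` readable
    await wait_readable(a)
    return a.recv(1)

print(asyncio.run(main()))  # b'x'
```

Note that this already doesn't work on Windows under the proactor (IOCP) loop, which is exactly the limitation described above.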

With sufficient effort, many of these limitations can be overcome or lived with. But it does look like some substantial effort. And it'll likely make things slower and more brittle architecturally.

Against this, the primary advantage would be that we don't have to maintain our own IO code. Given that IO code tends to be extremely tricky and have many obscure corner cases, this would be good. If we were using asyncio, then we could automatically take advantage of their bug fixes, and any work we put into testing and fixing bugs would automatically benefit all of asyncio's users.

This advantage isn't urgent, though, in the sense that what we have right now works, and (assuming the issues above are somehow fixed/worked around) we could switch what we do internally at any time.

And fundamentally, this isn't on trio's critical path: trio is an experiment, and the question we're trying to discover the answer to is whether trio's developer experience is so overwhelmingly better than traditional callback-based libraries that it can overcome their head start on ecosystems / maturity / familiarity. A 50% increase or reduction in the number of rare and obscure I/O bugs is not going to change the answer to this question.

So in the short/medium term this suggests that we should stick with our code, see how much trouble it causes, and continue to weigh that against the costs of switching. If we find ourselves wasting weeks trying to figure out why our tcp stack is flaking out then that'd be a pretty good sign that we're on the wrong path. I honestly find it hard to predict; trio's code is written extremely carefully, and taking full advantage of every existing source I can get my hands on (by which I mean: blatantly stealing twisted's hard-won knowledge at every opportunity), but IO is hard.

The other advantage that this doesn't consider is that io-via-asyncio might help with the trio-libs-on-asyncio or asyncio-libs-on-trio features, so let's consider those.

trio-libs-on-asyncio

This pretty much has io-via-asyncio as a minimal prerequisite, so see above. (We wouldn't necessarily have to switch to using only asyncio, but we would at least need to implement an asyncio backend, which is basically all the work.) In addition it would require some rearrangement of the run interface, which is not a big deal. And... there currently aren't any trio libs that people want to run on asyncio right now, so it doesn't seem super urgent :-). Perhaps it would attract people to trio if they thought that it was a good way to make libraries that work everywhere? But if that's your main motivation then even if we implemented this you would probably still be better off using asyncio (or twisted or gevent) directly instead of trio.

The thing is, even if we made trio.run work as an asyncio coroutine, there still wouldn't be any sensible way for the asyncio and trio worlds to talk to each other. I guess it's fine if the only thing your library needs to expose is one-and-done functions that can be executed via await trio.run_under_asyncio(...) calls, but that leaves out a lot of use cases.

I'm not sure what a sensible communications channel would look like. Some sort of cross-world Queue object?

In general, it seems likely that using asyncio libraries on trio is never going to feel very natural, and likewise trio libraries on asyncio, because the two have such different idiomatic ways of structuring code.

asyncio-libs-on-trio

At least in the short term, this seems like the most interesting option. (And it might be very helpful in getting us over the early adoption hump!) What concerns me is how to let asyncio into the trio world in a controlled fashion that doesn't end up breaking all of trio's carefully created invariants. Maybe this is silly and we'd be better off just YOLOing it, worse-is-better style, but it makes me nervous. For example: if there's a global event loop full of callback spaghetti coexisting with trio's tidy task tree, then how do we do things like figure out when to exit? (Maybe 3.7's asyncio will be better in this regard; I know Yury is planning to propose a curio/trio-inspired loop.run in his asyncio updates PEP.)

One possible way to do this would be to not have a global loop object, but instead treat loops as something like nurseries: a specific place bound to a specific task. with open_asyncio_loop() as loop: await loop.run_until_complete(...).

I have no idea how silly this would be. Technically what it would require is essentially implementing a custom asyncio event loop on top of trio's public API (so at least it has the advantage that it's something that doesn't require rearchitecting the entire trio core, and in fact could live in a separate library). I don't think we'd know how difficult this is until we try it – there's a bunch of stuff in the event loop interface, but most of it seems like it should map pretty straightforwardly onto relatively short trio code? And asyncio is designed to support the creation of new event loops by subclassing (sigh).

This seems like the most interesting place to experiment in the short term, though.

note

These are very much "initial thoughts" as mentioned above; let's use this thread for further discussion.

@njsmith
Member Author

njsmith commented Jun 15, 2017

More notes on asyncio-libs-on-trio

Entering and exiting asyncio-land

Let's say trasyncio is the name of this hypothetical library that provides an asyncio event loop implemented using trio APIs. I think a nice API might be await trasyncio.run(async_fn, *args), which is a trio-flavored async function that takes an asyncio-flavored async function and sets things up for it to run under a newly allocated virtual asyncio loop (see also: python/asyncio#465). And I guess we'd want an await trasyncio.wait_stop(), which would be an asyncio-flavored function that waits until loop.stop() is called, e.g.:

def asyncio_main():
    loop = asyncio.get_event_loop()
    loop.run_until_complete(set_stuff_up())
    loop.run_forever()   # runs until loop.stop() is called

becomes

async def trasyncio_main():
    await set_stuff_up()
    await trasyncio.wait_stop()

Nitpicky detail to figure out: trio's convention says the signature should be trasyncio.run(async_fn, *args). asyncio's convention says it should be trasyncio.run(async_fn(*args)) (and I guess in 3.7 there will probably be an asyncio.run with this signature). So what's the trasyncio convention? I think the main argument for going with the trio convention is that if there are users who really only learned trio, but that want to use some existing asyncio library without learning anything about asyncio, then sticking with the trio convention lets them do that. At least in simple cases. OTOH consistency with asyncio.run has some value too, so I dunno.
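One way to dodge the convention question entirely would be to accept both shapes. A toy sketch – trasyncio doesn't exist, so this stand-in just dispatches on the argument type and runs the result on a throwaway asyncio loop:

```python
import asyncio
import inspect

def run(async_fn, *args):
    # trio convention:    run(fn, *args)
    # asyncio convention: run(fn(*args))
    if inspect.iscoroutine(async_fn):
        if args:
            raise TypeError("pass a coroutine object *or* fn + args, not both")
        coro = async_fn
    else:
        coro = async_fn(*args)
    return asyncio.run(coro)

async def add(a, b):
    return a + b

print(run(add, 1, 2))   # 3 (trio style)
print(run(add(1, 2)))   # 3 (asyncio style)
```

Whether accepting both is actually a good idea is another question – it makes the error cases murkier.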

Finding the loop

There's a tricky detail about how the asyncio code finds the loop. There are two mechanisms: asyncio.get_event_loop(), and asyncio._get_running_loop(). (The latter looks like an internal API, but Yury says that people should feel free to depend on it.) Fortunately, the first thing that asyncio.get_event_loop() does is try to delegate to asyncio._get_running_loop(), so in practice we can ignore asyncio.get_event_loop() and only have to worry about asyncio._get_running_loop(). Basically the way this works is that there's a thread-local which stores the currently running loop (settable by calling asyncio._set_running_loop(loop)), and asyncio._get_running_loop() returns it. Unfortunately, what we want is a task-local loop, not a thread-local loop, so we can't quite make use of this directly.

I think the thing to do is to define two classes that implement the asyncio loop interface: one that's the actual task-local thing that implements all the logic, and then a second which is a simple facade where all its methods look up the real loop in task-local storage and then delegate to it. So trasyncio.run would take care of creating a real loop and stashing it in task-local storage (as well as all the other little bits of nonsense that are required to set up and tear down a loop), and then somehow we'd make sure that the trio thread as a whole has called asyncio._set_running_loop() with this facade object.
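A bare-bones sketch of the facade half of this idea – contextvars stands in for trio's task-local storage, the "real loop" is a stub, and the actual loop interface is of course much larger:

```python
import contextvars

# Stand-in for trio task-local storage holding the current task's real loop.
_current_loop = contextvars.ContextVar("trasyncio_loop")

class LoopFacade:
    """One shared object that delegates every attribute access to
    whichever real loop is current for the calling task."""
    def __getattr__(self, name):
        return getattr(_current_loop.get(), name)

class RealLoopStub:
    # Stands in for the task-local loop that implements the actual logic.
    def call_soon(self, callback, *args):
        return ("scheduled", callback, args)

facade = LoopFacade()           # what _set_running_loop() would be given
_current_loop.set(RealLoopStub())
print(facade.call_soon(print, "hi")[0])  # scheduled
```

The facade itself is stateless, so sharing one instance across the whole thread is safe; the confusion described below only arises when code stashes it and calls it from a different task.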

A possible downside here is code that gets confused because it ends up holding a reference to the facade object and carrying it across different contexts, so that different calls on "the same" loop object effectively end up going to different places. Maybe it would be better to monkeypatch _get_running_loop() to check task-local storage before checking thread-local storage.

....Or, maybe I'm thinking about this wrong, and actually the thing to do is to ignore the _{get,set}_running_loop machinery, and instead install a custom event loop policy that just checks our task-local storage. Maybe everyone actually calls get_event_loop, and it doesn't matter if _get_running_loop doesn't work.
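That policy variant might look roughly like this – TaskLocalPolicy and _task_loop are made-up names, and contextvars again stands in for task-local storage:

```python
import asyncio
import contextvars

_task_loop = contextvars.ContextVar("task_loop", default=None)

class TaskLocalPolicy(asyncio.DefaultEventLoopPolicy):
    # Consult "task-local" storage first, fall back to the stock behavior.
    def get_event_loop(self):
        loop = _task_loop.get()
        if loop is not None:
            return loop
        return super().get_event_loop()

asyncio.set_event_loop_policy(TaskLocalPolicy())
sentinel = object()            # any object works for the sketch
_task_loop.set(sentinel)
print(asyncio.get_event_loop() is sentinel)  # True
```

This only helps code that goes through get_event_loop(); anything relying on _get_running_loop() directly would bypass it.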

Cancellation

In this setup, asyncio-land is embedded inside a blocking trio call, and that trio call might receive a trio cancellation. Then what? I guess the really lazy thing would be to allow trio.Cancelled to be raised inside asyncio, and see what happens. Or potentially we could convert trio.Cancelled into asyncio.CancelledError. The tricky bit would be converting it back if it propagates back into trio-land, since we'd need to restore the magic metadata that lets the cancel scope recognize which exception it raised. Maybe we can smuggle that through asyncio by stashing it on the asyncio.CancelledError somewhere.
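The smuggling idea in miniature – everything here is hypothetical: real trio.Cancelled carries cancel-scope metadata rather than the single attribute shown, and the attribute name is made up:

```python
import asyncio

class Cancelled(BaseException):
    # Stand-in for trio.Cancelled; `scope` stands in for the metadata a
    # cancel scope uses to recognize the exception it raised.
    def __init__(self, scope):
        self.scope = scope

def to_asyncio(exc):
    # Entering asyncio-land: convert, but smuggle the original along.
    aio_exc = asyncio.CancelledError()
    aio_exc._trio_original = exc
    return aio_exc

def to_trio(aio_exc):
    # Leaving asyncio-land: restore the original if one was smuggled,
    # otherwise pass the asyncio exception through unchanged.
    return getattr(aio_exc, "_trio_original", aio_exc)

orig = Cancelled(scope="outer-cancel-scope")
print(to_trio(to_asyncio(orig)) is orig)  # True
```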

Finer-grained integration

The above works for cases where you want to run one self-contained operation inside asyncio, like a single HTTP GET or starting up a self-contained server that will run for a while. And if we had to then it would probably be easy enough to add some explicit communication channels between asyncio-land and trio-land, like say a Queue that works in both places. But there's another use case one can imagine. Like, suppose someone wants to use asyncpg from trio. Here's some example asyncpg code -- notice how it involves setting up a connection, and then making a bunch of calls on that, intermixed with other logic. If you want to use asyncpg in your trio web app (and assuming asyncpg doesn't grow native trio support), then you probably want to be able to mix calls to asyncpg async functions and calls to native trio async functions.

One way to do this would be to have some way to set up an asyncio loop context in the current trio task, and then have an explicit adapter function you have to call each time:

with trasyncio.open_loop():
    conn = await trasyncio.run_in_open_loop(asyncpg.connect, 'postgresql://postgres@localhost/test')
    # Execute a statement to create a new table.
    await trasyncio.run_in_open_loop(conn.execute, '''
        CREATE TABLE users(
            id serial PRIMARY KEY,
            name text,
            dob date
        )
    ''')
    # and so forth

The advantage of this is that it's explicit, and it gives a nice place to manage the boundary between asyncio-land and trio-land. For example, it could take care of converting asyncio.CancelledError back into trio.Cancelled. There's some precedent for this; e.g. if you're using twisted and asyncio together then you'll sometimes need to explicitly juggle between Futures and Deferreds. Probably that's not quite as repetitive as this, though. The repetitiveness could potentially be hidden by a wrapper library, like trasyncpg or something, that just wraps asyncpg's API in trasyncio.run_in_open_loop.
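Such a wrapper library could be mostly mechanical. A sketch, with a stub standing in for the hypothetical trasyncio.run_in_open_loop so it's self-contained:

```python
import functools

def run_in_open_loop(async_fn, *args):
    # Stub for the hypothetical trasyncio.run_in_open_loop; it just
    # records the call so this sketch runs without trio or asyncpg.
    return ("ran-in-loop", async_fn.__name__, args)

def wrap(async_fn):
    """Turn an asyncio-flavored async function into one that routes
    through run_in_open_loop automatically."""
    @functools.wraps(async_fn)
    def wrapper(*args):
        return run_in_open_loop(async_fn, *args)
    return wrapper

async def connect(dsn):        # pretend this is asyncpg.connect
    ...

connect_wrapped = wrap(connect)
print(connect_wrapped("postgresql://localhost/test"))
# ('ran-in-loop', 'connect', ('postgresql://localhost/test',))
```

A trasyncpg-style library would basically apply wrap() to every public async function and method.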

The alternative would be to have some sort of magic inside trio's run loop, so that await asyncio_fn(...) Just Works. This would mean speaking asyncio's yield protocol, which I believe just involves yielding asyncio.Future objects (and there's some provision for third-party duck-Future classes); so the core run loop would need to recognize these objects and then somehow hand them to the current loop for execution. I guess this wouldn't necessarily be too bad – we already have code to efficiently and reliably recognize "weird" yielded objects and print a nice error message; we could add a step where we ask some task-local handler if it knows what to do with them. This would let us support asyncio (and also potentially twisted etc.) without having to hard code them into the run loop, or add much complexity to the run loop in general.

BUT, if doing it this way, then there'd be no place to translate back from asyncio-land exceptions to trio-land exceptions. E.g.:

with trasyncio.open_magic_loop():
    with trio.move_on_after(10):
        await asyncio.sleep(20)

if we translate trio.Cancelled → asyncio.CancelledError when raising errors in asyncio-land, then asyncio.sleep will have an asyncio.CancelledError injected, and this will then propagate out (no way to stop it), trio.move_on_after won't recognize it, and it'll potentially keep propagating indefinitely until it crashes our whole program. Not so nice.

Maybe it would help to have a class TrasyncioCancelled(trio.Cancelled, asyncio.CancelledError): pass, which asyncio code will understand as CancelledError and trio code will understand as Cancelled, without having to convert back and forth?
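With stand-in base classes (so this runs anywhere, regardless of trio or Python version), the hybrid and its isinstance behavior look like:

```python
# Stand-ins for the real types, purely to test the multiple-inheritance idea.
class Cancelled(BaseException):     # trio.Cancelled derives from BaseException
    pass

class CancelledError(Exception):    # asyncio.CancelledError was an Exception
    pass                            # subclass until Python 3.8

class TrasyncioCancelled(Cancelled, CancelledError):
    pass

exc = TrasyncioCancelled()
print(isinstance(exc, Cancelled), isinstance(exc, CancelledError))  # True True
print(isinstance(exc, Exception))  # True -- and that turns out to be a problem
```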

Another thing we'd probably want if going down this road is the ability to run await trio.whatever from inside asyncio.Tasks. ...in fact, this is probably more important than the other way around, because if this worked then it would be sufficient to handle all the mixed cases, using the original trasyncio.run API described above.

I think this is possible without horrible hacks, because modern asyncio has a loop.create_task API that can be overridden to force the use of a custom Task object. But it might involve copy-pasting a bunch of code from asyncio to implement our custom Task type.

There's also the issue that once we go into asyncio-land, the mapping between our code and trio tasks gets pretty confused pretty fast. Callbacks will be running in separate tasks from the task that registered them, etc. Trying to use trio's task-local storage from inside asyncio-land, for example, will be an exercise in frustration. (Though in some sense this is orthogonal to whether we allow mixing asyncio-flavored async functions and trio-flavored async functions, because accessing task-local storage is a synchronous operation. But my worry is that if we say "hey you can freely intermix trio and asyncio code" then people will expect everything to work and be confused when it doesn't. I don't want people to have to have a deep understanding of how trio and asyncio are implemented to use this. Though OTOH maybe that's just par for the course with asyncio...) I'm not entirely sure how the with trio.move_on_after example above would actually work – can we propagate Cancelled into Future objects that we're awaiting, like asyncio does with task cancellation? Should we? It's possible that finer-grained integration is just intrinsically too confusing to support.

@smurfix
Contributor

smurfix commented Sep 1, 2017

The main problem I see with a hybrid class TrasyncioCancelled is that it's now an Exception, not just a BaseException, so any except Exception: in trio code will catch it. Not good. I propose to monkey-patch asyncio.CancelledError instead: simply replace it with trio.Cancelled.

Another approach might be to replace asyncio.* entirely, i.e. sys.modules['asyncio'] = importlib.import_module("trio.asyncio"). Ditto its submodules, if necessary. I wonder if that might not actually be less work overall.

@njsmith
Member Author

njsmith commented Sep 1, 2017

Ugh, that's a good point about Cancelled and Exception.

In general I've tried to avoid messing with global state -- so e.g. if you want to have two calls to trio.run inside two different threads, or trio in one thread and asyncio in another, that all works. Monkeypatching asyncio would break that. But if it were the only way to make it work, well...

@smurfix
Contributor

smurfix commented Sep 1, 2017

You do have a point about threads.
That triggers an idea: the easiest way to get trio and asyncio to interoperate is probably quite simple: use separate threads and write a slim translation layer. So instead of await asyncio_code() you write await aio_(asyncio_code()) and vice versa.
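A minimal version of that thread-plus-translation-layer idea, stdlib only. Here aio_ blocks the calling thread; a real trio-flavored version would await the future in a thread-friendly way instead of blocking:

```python
import asyncio
import threading

# Run a private asyncio loop in a daemon thread (dies with the process).
loop = asyncio.new_event_loop()
threading.Thread(target=loop.run_forever, daemon=True).start()

def aio_(coro):
    # The translation layer: ship an asyncio coroutine to the loop's
    # thread and wait for its result.
    return asyncio.run_coroutine_threadsafe(coro, loop).result()

async def asyncio_code():
    await asyncio.sleep(0)
    return "hello from asyncio"

result = aio_(asyncio_code())
print(result)  # hello from asyncio
```

run_coroutine_threadsafe is safe to call even before run_forever has started spinning, since it schedules via call_soon_threadsafe.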

@cjerdonek
Contributor

Could you replace except Exception: with different boilerplate? And if the boilerplate were a helper, you could update or change it centrally if necessary.

@njsmith
Member Author

njsmith commented Sep 1, 2017

@smurfix running asyncio in a thread is an option, yeah, and does make simple cases easy, but as soon as you have any kind of non-trivial interaction between the trio and asyncio worlds then you're doing real thread programming. (In particular, if there's any state that the asyncio and trio code both touch, then you have all the mind-bending problems that come with that.) It's another interesting point in the space of interoperability options, but I think a custom loop is potentially a much nicer solution. (It would also mean that you can use all of Trio's regular niceties, like MockClock and virtual networks for testing, etc.)

@cjerdonek actually, that's a good point -- in trio for catch-all exceptions you already kind of have to use MultiError.catch instead of except Exception (I'm hoping to add some discussion of this to the tutorial soon). But even there, the idiom would be something like:

def catch_all(exc):
    if isinstance(exc, Exception):
        log_error(exc)
        return None
    else:
        return exc

with MultiError.catch(catch_all):
    ...

So you still end up having to somehow tell everyone who might have asyncio code running under them to special case this by making it if isinstance(exc, Exception) and not isinstance(exc, trio.Cancelled): .... This is particularly awkward given that this kind of catch-all handler will show up in places like HTTP servers running arbitrary user-defined handlers, so the server author actually has no idea whether the users will use asyncio or not.

This may be borrowing trouble though. This particular problem only arises in the fancy version of the API that allows asyncio and trio functions to be freely intermixed. That's certainly not necessary as part of a minimum viable product, and possibly not a good idea at all for other reasons. If we wanted to experiment with this, the first thing to do would be to implement a simple await trasyncio.run(...) with trio-land on one side and asyncio-land on the other and no direct mixing.

@smurfix
Contributor

smurfix commented Sep 1, 2017

@njsmith Right. The idea to write a TrioEventLoop for asyncio to run "under" trio should be a better option all around.

@cjerdonek The point of this exercise (or, more precisely, my take on solving this problem) is to not require any replacement of boilerplate. Suppose you have a library like my qbroker which accepts async callbacks. It needs to do something sensible when that code sleeps, raises an exception, and/or gets cancelled, no matter whether it's aio- or trio-flavored. I doubt that handling all the corner cases reasonably correctly can work without modifying the guts of either the library, the caller (which may well be yet another library), or asyncio; I somehow doubt that asyncio's create_task/future() hooks are sufficient.

In summary I don't think freely intermixing asyncio and trio is a good idea. The two have too-different semantics. You can't just pass a trio-flavored async function to asyncio.ensure_future() and expect that to work without major asyncio surgery.

@njsmith
Member Author

njsmith commented Jan 22, 2018

Going to close this in favor of the https://github.com/python-trio/trio-asyncio/ bug tracker, since that seems to be the only form of trio/asyncio interoperability that makes sense right now.
