Skip to content

Use anonymous statement when needed #80

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
AmatanHead opened this issue Sep 27, 2017 · 5 comments
Closed

Use anonymous statement when needed #80

AmatanHead opened this issue Sep 27, 2017 · 5 comments
Assignees
Labels
Milestone

Comments

@AmatanHead
Copy link
Contributor

I'm using pgbouncer with transaction pooling enabled which means no prepared statements.
I'd like to see an option to disable prepared statements.
Actually, I think we should disable them at all because asyncpg has its own statements cache.

@fantix
Copy link
Member

fantix commented Sep 28, 2017

Hmm I wasn't aware of that side effect, apologies. The reason to enable prepared statement for all execution was to feed the "row description" to SQLAlchemy result proxy for typing purposes, which is offered only through prepared statement in asyncpg. I'll investigate a bit about how psycopg2 did this and try to find a workaround.

@fantix fantix self-assigned this Sep 28, 2017
@fantix fantix added this to the v0.5.x milestone Sep 28, 2017
@fantix fantix added the task label Sep 28, 2017
@AmatanHead
Copy link
Contributor Author

Apart from query introspection, using prepared statements would only make sense for executemany.

Also, asyncpg itself has some issues with pgbouncer — see MagicStack/asyncpg#198. So this is not a critical issue in any way =)

@fantix
Copy link
Member

fantix commented Sep 28, 2017

Apart from query introspection, using prepared statements would only make sense for executemany.

Yeah exactly 😈

Also, asyncpg itself has some issues with pgbouncer — see MagicStack/asyncpg#198. So this is not a critical issue in any way =)

Ah good to know! I'll track in and see how I can be helpful.

@fantix
Copy link
Member

fantix commented Oct 4, 2017

Looks like MagicStack/asyncpg#198 is fixed, I'll check how this can proceed.

@fantix
Copy link
Member

fantix commented Oct 17, 2017

I'm recently reading docs and source code of libpq, psycopg2, SQLAlchemy and asyncpg. Here're some updates (a bit long).

First of all, to clarify about prepared statements: according to PostgreSQL docs 52.1.2, other than execute many and introspection, prepared statements are also an essential step "parse" in extended query, which is the default method asyncpg used to execute queries with parameters. Therefore, any SQL (with parameters) sent to asyncpg is internally prepared first in the "parse" step, through a Parse command on protocol level. It is not an option to disable prepared statements in asyncpg.

Meanwhile, simple queries without parameters are executed through the simple query interface, which is the only method used by psycopg2 for now. Simple query uses Query command, which by default returns the RowDescription together with the result set, that is how psycopg2 gained access to the introspection information. But for extended query, RowDescription is retrieved only through the Describe command, and this result is not (yet) accessible in asyncpg for normal queries (normal query means fetch for example, all asyncpg query API but connection.prepare and connection.cursor), for now. So when SQLAlchemy asked for the introspection information, GINO had to find another way.

asyncpg.prepared_stmt exposed RowDescription. Under the hood it is the same thing as the "parse" step of normal extended query, just that extended query uses unnamed prepared statements (not always, explained later), while asyncpg.prepared_stmt uses named ones. Within one PostgreSQL session, there can be many named prepared statements, but at most only one unnamed prepared statement. The unnamed prepared statement always store the result of the most recent prepare action without name. Other than that, named and unnamed works identically. That is to say, GINO only replaced some unnamed prepared statements into named ones for normal queries (as well as simple queries, which should follow asyncpg to use simple query API later).

Prepared statements are stored on PostgreSQL server, and cached by asyncpg. Obviously unnamed prepared statements cannot be cached, because the next prepare action without name would just flush the previously cached prepared statement on the server side, thus making the cache useless. In order to cache prepared statements for normal queries, asyncpg turned unnamed prepared statements into named ones for extended query if cache is enabled. Meanwhile, asyncpg.prepared_stmt always use named prepared statements, and cached in the same way. So when cache is enabled (which is default), current GINO is identical to asyncpg on PostgreSQL protocol level.

However, there is PgBouncer in transaction pooling mode. Officially it states not supporting prepared statements, but it does support extended query. So I tested it, and found it true that PgBouncer doesn't "support" prepared statements. But it doesn't forbid to do so, it's only that prepared statements may be messed up badly between transactions. However, if prepared statements stay inside transactions, it is still safe to use them, both named and unnamed. Especially unnamed ones - because you don't need to worry about closing them, it is always the newest and previous ones are just closed. When this comes to asyncpg/GINO who has no information about which exact backend PostgreSQL session it is using (thanks to PgBouncer), it is meaningless to cache prepared statements even if naming them globally unique between connections (OK to create, but still big issue in using, because they are scoped per backend PostgreSQL session). Therefore, it is the right way under PgBouncer transaction pooling mode to disable prepared statement cache in asyncpg, forcing asyncpg to use unnamed prepared statements for extended queries. But of course connection.prepare still explicitly uses named prepared statements thus cannot be used safely (asyncpg automatically name prepared statements per connection, names may conflict between connections).

Coming to this issue itself, MagicStack/asyncpg#198 fixed an internal usage of named prepared statement, making it possible to use normal queries with PgBoucner. Summing this up, GINO should also use unnamed prepared statement for normal queries, if statement cache is disabled. This would be a simple fix (a bit hacky using asyncpg internal API).

For future, it may be reasonable for asyncpg to offer introspection information for all APIs - simple queries, normal extended queries, connection.prepare and connection.cursor - instead of only using internally. I'll try to patch up for this.

fantix added a commit that referenced this issue Oct 17, 2017
fantix added a commit that referenced this issue Oct 18, 2017
fantix added a commit that referenced this issue Oct 18, 2017
@fantix fantix changed the title Option to disable prepared statements Use anonymous statement when needed Oct 18, 2017
@fantix fantix mentioned this issue Oct 18, 2017
fantix added a commit that referenced this issue Nov 15, 2017
fantix added a commit to decentfox/gino that referenced this issue Jan 29, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants