Use anonymous statement when needed #80

AmatanHead · 2017-09-27T17:43:12Z

I'm using pgbouncer with transaction pooling enabled which means no prepared statements.
I'd like to see an option to disable prepared statements.
Actually, I think we should disable them at all because asyncpg has its own statements cache.

fantix · 2017-09-28T01:22:07Z

Hmm I wasn't aware of that side effect, apologies. The reason to enable prepared statement for all execution was to feed the "row description" to SQLAlchemy result proxy for typing purposes, which is offered only through prepared statement in asyncpg. I'll investigate a bit about how psycopg2 did this and try to find a workaround.

AmatanHead · 2017-09-28T01:39:29Z

Apart from query introspection, using prepared statements would only make sense for executemany.

Also, asyncpg itself has some issues with pgbouncer — see MagicStack/asyncpg#198. So this is not a critical issue in any way =)

fantix · 2017-09-28T02:07:47Z

Apart from query introspection, using prepared statements would only make sense for executemany.

Yeah exactly 😈

Also, asyncpg itself has some issues with pgbouncer — see MagicStack/asyncpg#198. So this is not a critical issue in any way =)

Ah good to know! I'll track in and see how I can be helpful.

fantix · 2017-10-04T03:25:13Z

Looks like MagicStack/asyncpg#198 is fixed, I'll check how this can proceed.

fantix · 2017-10-17T08:55:24Z

I'm recently reading docs and source code of libpq, psycopg2, SQLAlchemy and asyncpg. Here're some updates (a bit long).

First of all, to clarify about prepared statements: according to PostgreSQL docs 52.1.2, other than execute many and introspection, prepared statements are also an essential step "parse" in extended query, which is the default method asyncpg used to execute queries with parameters. Therefore, any SQL (with parameters) sent to asyncpg is internally prepared first in the "parse" step, through a Parse command on protocol level. It is not an option to disable prepared statements in asyncpg.

Meanwhile, simple queries without parameters are executed through the simple query interface, which is the only method used by psycopg2 for now. Simple query uses Query command, which by default returns the RowDescription together with the result set, that is how psycopg2 gained access to the introspection information. But for extended query, RowDescription is retrieved only through the Describe command, and this result is not (yet) accessible in asyncpg for normal queries (normal query means fetch for example, all asyncpg query API but connection.prepare and connection.cursor), for now. So when SQLAlchemy asked for the introspection information, GINO had to find another way.

asyncpg.prepared_stmt exposed RowDescription. Under the hood it is the same thing as the "parse" step of normal extended query, just that extended query uses unnamed prepared statements (not always, explained later), while asyncpg.prepared_stmt uses named ones. Within one PostgreSQL session, there can be many named prepared statements, but at most only one unnamed prepared statement. The unnamed prepared statement always store the result of the most recent prepare action without name. Other than that, named and unnamed works identically. That is to say, GINO only replaced some unnamed prepared statements into named ones for normal queries (as well as simple queries, which should follow asyncpg to use simple query API later).

Prepared statements are stored on PostgreSQL server, and cached by asyncpg. Obviously unnamed prepared statements cannot be cached, because the next prepare action without name would just flush the previously cached prepared statement on the server side, thus making the cache useless. In order to cache prepared statements for normal queries, asyncpg turned unnamed prepared statements into named ones for extended query if cache is enabled. Meanwhile, asyncpg.prepared_stmt always use named prepared statements, and cached in the same way. So when cache is enabled (which is default), current GINO is identical to asyncpg on PostgreSQL protocol level.

However, there is PgBouncer in transaction pooling mode. Officially it states not supporting prepared statements, but it does support extended query. So I tested it, and found it true that PgBouncer doesn't "support" prepared statements. But it doesn't forbid to do so, it's only that prepared statements may be messed up badly between transactions. However, if prepared statements stay inside transactions, it is still safe to use them, both named and unnamed. Especially unnamed ones - because you don't need to worry about closing them, it is always the newest and previous ones are just closed. When this comes to asyncpg/GINO who has no information about which exact backend PostgreSQL session it is using (thanks to PgBouncer), it is meaningless to cache prepared statements even if naming them globally unique between connections (OK to create, but still big issue in using, because they are scoped per backend PostgreSQL session). Therefore, it is the right way under PgBouncer transaction pooling mode to disable prepared statement cache in asyncpg, forcing asyncpg to use unnamed prepared statements for extended queries. But of course connection.prepare still explicitly uses named prepared statements thus cannot be used safely (asyncpg automatically name prepared statements per connection, names may conflict between connections).

Coming to this issue itself, MagicStack/asyncpg#198 fixed an internal usage of named prepared statement, making it possible to use normal queries with PgBoucner. Summing this up, GINO should also use unnamed prepared statement for normal queries, if statement cache is disabled. This would be a simple fix (a bit hacky using asyncpg internal API).

For future, it may be reasonable for asyncpg to offer introspection information for all APIs - simple queries, normal extended queries, connection.prepare and connection.cursor - instead of only using internally. I'll try to patch up for this.

… when cache disabled

Fixed #80

…ended query when cache disabled

fantix self-assigned this Sep 28, 2017

fantix added this to the v0.5.x milestone Sep 28, 2017

fantix added the task label Sep 28, 2017

fantix added a commit that referenced this issue Oct 17, 2017

Fixed #80, follow asyncpg to use unnamed statement for extended query…

c9eb88f

… when cache disabled

fantix added a commit that referenced this issue Oct 18, 2017

Fixed #80, follow asyncpg to use unnamed statement for extended query…

c4f904f

… when cache disabled

fantix added a commit that referenced this issue Oct 18, 2017

Fixed #80, follow asyncpg to use unnamed statement for extended query…

77b7cf0

… when cache disabled

fantix changed the title ~~Option to disable prepared statements~~ Use anonymous statement when needed Oct 18, 2017

fantix mentioned this issue Oct 18, 2017

Fixed #80 #90

Merged

fantix closed this as completed in #90 Nov 15, 2017

fantix added a commit that referenced this issue Nov 15, 2017

Merge pull request #90 from fantix/t80

4ad8526

Fixed #80

fantix added a commit to decentfox/gino that referenced this issue Jan 29, 2018

Fixed python-gino#80, follow asyncpg to use unnamed statement for ext…

d91872b

…ended query when cache disabled

fantix mentioned this issue Jan 31, 2018

Problem when use enum type and create model #126

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use anonymous statement when needed #80

Use anonymous statement when needed #80

AmatanHead commented Sep 27, 2017

fantix commented Sep 28, 2017

AmatanHead commented Sep 28, 2017

fantix commented Sep 28, 2017

fantix commented Oct 4, 2017

fantix commented Oct 17, 2017 •

edited

Loading

Use anonymous statement when needed #80

Use anonymous statement when needed #80

Comments

AmatanHead commented Sep 27, 2017

fantix commented Sep 28, 2017

AmatanHead commented Sep 28, 2017

fantix commented Sep 28, 2017

fantix commented Oct 4, 2017

fantix commented Oct 17, 2017 • edited Loading

fantix commented Oct 17, 2017 •

edited

Loading