Skip to content

Send raw SQL to database backend #213

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
michaelasapp opened this issue Mar 2, 2019 · 9 comments
Closed

Send raw SQL to database backend #213

michaelasapp opened this issue Mar 2, 2019 · 9 comments

Comments

@michaelasapp
Copy link

Background: I'm trying to initialize a database. I have a ddl file that contains a set of SQL statements, e.g.

DROP TABLE IF EXISTS table1;
CREATE TABLE table1 (id bigint, x float, y float);
CREATE INDEX table1_id ON table1 (id);

DROP TABLE IF EXISTS table2;
CREATE TABLE table2 (id bigint, label varchar(50));
CREATE INDEX table2_id ON table2 (id);

I'd like to read the file and send the query as-is to the PostgreSQL backend, e.g.

db <- DBI::dbConnect(RPostgres::Postgres(), ... )
create_db_query <- readr::read_file("db.ddl")
DBI::dbSendStatement(db, create_db_query)

Alas, I receive an error --

Error in result_create(conn@ptr, statement) : 
  Failed to prepare query: ERROR:  cannot insert multiple commands into a prepared statement

I also tried the other send approach (dbExecute, dbSendQuery, etc) available in the RPostgres package, and could not locate any other method that would let me do that.

I don't have the skills to navigate the C++ code; my crude read is that this error seems to arise when invoking the underlying Postgres function PQPrepare. It appears that a limitation of PostgreSQL is that there can only be one command per prepared statement (reference).

Desired Solution: Provide a mechanism to send raw SQL to the backend system without the SQL being be a prepared query.

I'm aware that an alternative is to write the table construction logic in R; I'd prefer to avoid that for interoperability reasons (i.e. I can paste into a SQL prompt; I can invoke via Python, etc) as well as clarity (I work with more people than can read SQL than can read R).

@krlmlr
Copy link
Member

krlmlr commented Mar 3, 2019

Thanks. Unfortunately this is a limitation of many database backends, and DBI supports only the lowest common denominator -- submitting individual queries. Perhaps squr or other related projects can help?

@michaelasapp
Copy link
Author

michaelasapp commented Mar 3, 2019

Thanks. I actually checked squr after you mentioned it on DBI #273; however, it looks like it functions by passing it a function that can send raw SQL. Taken from the manual of squr, you need to define a function like:

# on database, drivers, etc, but needs only setup once.
rodbc <- function(query)
{
  ch <- RODBC::odbcDriverConnect("<connectionstring>")
  on.exit(RODBC::odbcClose(ch))
  RODBC::sqlQuery(ch, query)
}

Once you have that function, you can use it to submit squr managed queries to the database with sq_send(.with = rodbc).

I believe that still gets me back to the first problem: having a function to send raw SQL to the database backend in the first place.

You also pointed out dbr in the same issue thread, however, AFAICT that's built on top of DBI.

@krlmlr
Copy link
Member

krlmlr commented Mar 3, 2019

Thanks for the heads up. There are other situations where using a "regular", non-prepared query is useful, e.g. RMariaDB doesn't support certain query types with its "prepared" API (r-dbi/DBI#268). We might be able to support execution of multiple queries if we support the non-prepared API here too.

For now, seems like your best bet is to look for semicolons at EOL, e.g. with:

strsplit(paste(readLines(...), collapse = "\n"), ";\n")[[1]]

and sending these one by one with dbExecute() . This requires that you have control over the code you're parsing, to avoid surprises.

@michaelasapp
Copy link
Author

Thanks! Yes, I'd love to see the non-prepared API; until then, that's a good workaround.

@JasperSch
Copy link

JasperSch commented Jan 3, 2020

Thank you both for the valuable information provided here.

I was wondering if there is a better way to parse the SQL statements apart from splitting by the semicolon. Is there an R package you would advise for this?

I recently stumbled upon problems with some more advanced SQL, where parsing on the semi column did not do the trick. I attached an example with solution below in case anyone is having similar problems.

Support for the non-prepared API would of course still be the better solution I believe.

example SQL code with semi columns:

CREATE OR REPLACE FUNCTION myfunction()
RETURNS trigger AS
$BODY$
  DECLARE
     cnt integer;
     id integer;
  BEGIN
...
$BODY$
LANGUAGE plpgsql; 

solution = introduce the semi columns again for code defined between two "$BODY$"'s:

  scriptStatements <- strsplit(scriptContent, split = ";")[[1]]

  bodyIndices <- grep("\\$BODY\\$", scriptStatements) 
  nBodies <- length(bodyIndices)
  for(i in seq(1, nBodies, by=2)){
    start <- bodyIndices[i]
    stop <- bodyIndices[i+1]
    statement <- paste0(scriptStatements[start:stop], collapse = ";")
    scriptStatements[start] = statement
    scriptStatements[(start+1):stop] = ""  
  }
  

@krlmlr
Copy link
Member

krlmlr commented Jan 3, 2020

I'm not sure it's possible to reliably split SQL -- it's something that the database backend needs to do.

According to the docs, if we use PQexec() we get this for free. We should use PQexec() if immediate = TRUE in dbSendQuery() et al..

@JasperSch
Copy link

Thank you for your feedback, I agree.
I'm looking forward to having this functionality and in the meantime thanks for the clarification and workaround.

@krlmlr
Copy link
Member

krlmlr commented Dec 26, 2020

Now in #272.

@krlmlr krlmlr closed this as completed Dec 26, 2020
@github-actions
Copy link
Contributor

This old thread has been automatically locked. If you think you have found something related to this, please open a new issue and link to this old issue if necessary.

@github-actions github-actions bot locked and limited conversation to collaborators Dec 27, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants