Skip to content

feat: support any number/order of merge operations #1938

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged

Conversation

davidjgoss
Copy link
Contributor

@davidjgoss davidjgoss commented Jan 2, 2024

This PR changes the parser to account for any number of WHEN MATCHED and/or WHEN NOT MATCHED clauses in a MERGE statement. This is most directly applicable to Snowflake. From their docs:

A single MERGE statement can include multiple matching and not-matching clauses (i.e. WHEN MATCHED ... and WHEN NOT MATCHED ...).

MERGE INTO t1 USING t2 ON t1.t1Key = t2.t2Key
    WHEN MATCHED AND t2.marked = 1 THEN DELETE
    WHEN MATCHED AND t2.isNewStatus = 1 THEN UPDATE SET val = t2.newVal, status = t2.newStatus
    WHEN MATCHED THEN UPDATE SET val = t2.newVal
    WHEN NOT MATCHED THEN INSERT (val, status) VALUES (t2.newVal, t2.newStatus);

To achieve this, the parser now collects a list of MergeOperations which can be a WHEN MATCHED THEN DELETE (newly added also), WHEN MATCHED THEN UPDATE or WHEN NOT MATCHED THEN INSERT.

The pre-existing getters that presume 0 or 1 of MergeUpdate and MergeInsert are preserved, but deprecated and suggest using the newly added MergeOperationVisitor instead, which is itself used for the deparser and validation logic internally.

@davidjgoss davidjgoss force-pushed the snowflake-multiple-when branch from 5be8e87 to 6570d44 Compare January 2, 2024 12:32
@manticore-projects
Copy link
Contributor

Greetings!

First of all, thank you for your good work and the contribution.
From a SnowFlake point of view this looks clearly valuable. However, I don't like it (pardon me for being so blunt) because its far away from the Standard and I do love and believe in standards, especially when there is no good reason to deviate from.

So what could be the way forward:

  1. we should have a kind of a vote about this feature. Maybe I am the only one with objections?!

  2. we definitely shall NOT deprecate the Standard Compliant getters/setters because those present the correct functionality

  3. I would rather love to see a way, where the MERGE statement is and stays first of all Standard compliant and then only adds a SnowFlake layer. The optional ON brackets raised my concern already, but I waived it because it was kind of reasonable (I really can't explain why the standard wants brackets there) and also non-invasive.

So just maybe, we should keep the Standard Merge and then derive an Enhanced Merge statement from it, probably even as a separate Class and Production.

@wumpz @tomershay @AnEmortalKid @jxnu-liguobin @zaza @gitmotte @tvar Your have contributed the most to the project, so please let us know your opinion on this.

@manticore-projects
Copy link
Contributor

Technically this PR is exceptionally well crafted! No concerns from this side. It's really all about the Snowflake specific syntax and how far we want to go under a RDBMS agnostic approach.

@davidjgoss
Copy link
Contributor Author

Thanks so much for the fast feedback @manticore-projects - all very fair, and I'm happy to make whatever changes we feel are best overall 👍

@manticore-projects
Copy link
Contributor

manticore-projects commented Jan 2, 2024

I want to add one thought: I am using MERGE intensively on Oracle and H2, Postgres only added it lately.
And those Merge statements are quite large and complex already, so it becomes handy when the Parser tells me during the development of the SQL, if the syntax is correct on Oracle and H2.

Washing up the syntax may allow for statements appearing valid when not valid on the big four RDBMS.
Again, Snowflage people of course will see this exactly the other way around, but this Project always has had an Oracle first sentiment.

Lets wait a week or so for opinions to hit and then we will make a joint decision. Thank you for understanding.

@manticore-projects
Copy link
Contributor

I have created a Github Poll: #1939

@gitmotte
Copy link
Contributor

gitmotte commented Jan 2, 2024

Washing up the syntax may allow for statements appearing valid when not valid on the big four RDBMS.

I don't see it quite so strictly. A validator (FeatureSetValidation/DatabaseType and FeatureSetValidation/Version) can be built/extended to validate which syntax is correct for which RDMS.

Again, Snowflage people of course will see this exactly the other way around, but this Project always has had an Oracle first sentiment.

I do not use Oracle, I have defined several validators for several widely used RDBMS. Besides the DatabaseType Validation classes ORACLE, MYSQL, SQLSERVER, MARIADB, POSTGRESQL, H2, there is a SQLVersion class that is supposed to define the ANSI SQL standard syntax. However, not all databases are fully ANSI-compatible for all statements.

  • we definitely shall NOT deprecate the Standard Compliant getters/setters because those present the correct functionality

Deprecation should only take place if the methods are completely outdated or incorrect. I agree that we should keep them.

  • I would rather love to see a way, where the MERGE statement is and stays first of all Standard compliant and then only adds a SnowFlake layer. The optional ON brackets raised my concern already, but I waived it because it was kind of reasonable (I really can't explain why the standard wants brackets there) and also non-invasive.

The class-structure will never fulfill all the requirements of all RDBMSs, so successful parsing will never guarantee a valid statement for an RDBMS. We have this problem in many cases in this library.

@tomershay
Copy link
Contributor

My 2 cents:

  • From my point of view JSQLParser is a parser, and not a validator. Examining this library from a validator point of view is doomed to fail, as many of its capabilities were not built for that purpose, but rather just to parse SQL.
  • So if the library's goal is just to parse (and not validate), I think we can accept changes that enhance the scope of accepted use cases, as long as those enhancements do not break anything related to other popular databases.
  • Even though the library might have been built with Oracle in mind originally, I will take a guess that many of its users are now using it for other popular databases (for example PostgreSQL), and over time I believe those use cases will just get more common. Therefore, I think we should not limit ourselves to anything related to a specific database like Oracle.

I didn't get a chance to look at the specific technical details of this change, so I'm mainly sharing thoughts based on the discussion above and initial understanding of the suggested change.

@jxnu-liguobin
Copy link
Contributor

I think adding one is acceptable if we have standalone support for that database (oracle, mysql,h2,pg). When there is no standalone support (a new database category), it may be more in line with the standard to not accept big changes.

I have a similar problem, we are heavily using Doris which uses MySQL driver but still has some special operations and since Doris is not a legacy database and not currently popular, there is no such database category in jsqlparser so I didn't create PR for Doris.

@manticore-projects
Copy link
Contributor

Thank you all for your feedback. I read it s supportive as long as the support for the Standard Merge statement stays intact.

@davidjgoss Can you please amend your PR in order to make it as compatible as possible to the Standard Merge. E. g. don't deprecate the existing methods and assume that standard count of merge operations = 1 ( 1or zero INSERT and 1 or zero DELETE) -- unless more operations are either APPENDED or SET with an index.

Then I would promptly merge the PR.
Thank you again for contributing and feedback!

@davidjgoss davidjgoss force-pushed the snowflake-multiple-when branch from e2017b7 to 49115da Compare January 5, 2024 10:17
@davidjgoss
Copy link
Contributor Author

@manticore-projects I've pushed a commit that I think addresses this. The original three standard fields in Merge are reinstated including accessors. The new single list remains too, and the setters will cross-update, meaning even with just the strictly standard behaviour we can still use the single list for deparsing and toString(), and consumers can make use of the new visitor methods too. Let me know what you think!

@manticore-projects manticore-projects merged commit f1c525a into JSQLParser:master Jan 5, 2024
@manticore-projects
Copy link
Contributor

Great! Thank you for your contribution and support, much appreciated!

@manticore-projects
Copy link
Contributor

Congrats, your new Snowflake syntax are online: https://manticore-projects.com/JSQLParser/syntax_snapshot.html#merge

@davidjgoss davidjgoss deleted the snowflake-multiple-when branch January 5, 2024 12:23
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants