-
Notifications
You must be signed in to change notification settings - Fork 3.9k
sql: fix semantic analysis and type checking of window functions #37251
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This currently breaks optimizer with an internal error on the following query (from our logic tests):
So there appears to be other changes to the optimizer code required, but I don't know what those are. cc @justinj please help :) |
d1dc622
to
904e03a
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 7 of 7 files at r1.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @justinj, and @yuzefovich)
pkg/sql/opt/optbuilder/scope.go, line 1039 at r1 (raw file):
// TODO(yuzefovich): is this the right way to do what planner.analyzeExpr // does in HP? typedExpr := s.resolveAndRequireType(e, types.Any)
This just calls TypeCheck
again, which is redundant with the call above. I think the problem is that you deleted the TypeCheck code for window frame (see my comment there)
pkg/sql/sem/tree/type_check.go, line 946 at r1 (raw file):
// TypeCheck checks that offsets of the window frame (if present) are of the // appropriate type. func (f *WindowFrame) TypeCheck(ctx *SemaContext, windowDef *WindowDef) error {
The problem with deleting this function is now the optimizer has no way to access this logic. I think you probably want analyzeWindowFrame
to call this function, rather than moving all the logic there.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @justinj, @rytaft, and @yuzefovich)
pkg/sql/sem/tree/type_check.go, line 946 at r1 (raw file):
Previously, rytaft wrote…
The problem with deleting this function is now the optimizer has no way to access this logic. I think you probably want
analyzeWindowFrame
to call this function, rather than moving all the logic there.
Thanks for taking a look! Unfortunately, I don't think that your suggestion can be applied here - the whole purpose of this PR is to move the logic (that is now in analyzeWindowFrame
) out of TypeCheck. The reason is that in some cases we don't have enough information to do the type checking of window frames yet when FuncExpr is getting type-checked.
For example, with a query like this SELECT rank() OVER (w RANGE 1 PRECEDING) FROM t WINDOW w AS (ORDER BY a)
, to correctly type check that constant 1
we need to know the type of column a
that comes from WINDOW
clause. And (at least in the HP at the moment) we only have that information after we've constructed window definitions in sql/window.go
. The reason for why the type of a
is necessary to know is that if it is a numerical column (int, float, decimal), then 1
must be of the same type; if a
is of date-time related type, then 1
must be an interval.
pkg/sql/sem/tree/type_check.go, line 946 at r1 (raw file): Previously, yuzefovich wrote…
Ok, but couldn't you just take out the code that calls this function from |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @justinj, @rytaft, and @yuzefovich)
pkg/sql/sem/tree/type_check.go, line 946 at r1 (raw file):
Previously, rytaft wrote…
Ok, but couldn't you just take out the code that calls this function from
FuncExpr.TypeCheck
(as you've already done above)? Then you could still callWindowFrame.TypeCheck
fromsql.window.go
(as the previous code was doing). That would allow you to easily reuse this logic in the optimizer. At the moment, the new code you've written is not getting called by the optimizer.
Sorry for the long delay. I think during our investigation with Jordan, we came to a conclusion that doing any type checking on window definition within FuncExpr.TypeCheck
didn't make sense. I guess I need to somehow expose what's happening within constructWindowDefinitions
and analyzeWindowFrame
in sql/window.go
to the optimizer. Any ideas? cc @jordanlewis
7a5bc8f
to
718f9ae
Compare
Hooray! All tests now pass. It required some code copy/pasting from HP to CBO, but everything appears to be working. @justinj @rytaft @jordanlewis RFAL. What is the "fate" of the code in HP now that CBO fully supports window functions? I guess we'll be removing window functions related code from the former shortly, so it seems ok to have duplicated code in two places for now. |
Yeah, the plan is to remove the HP code for everything that's needed for DML statements in 19.2 (but we still have some work to get there, because of various internal paths that still work with just the HP). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Looking much better now....
Reviewed 5 of 7 files at r2, 4 of 4 files at r3.
Reviewable status:complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @justinj, and @yuzefovich)
pkg/sql/opt/optbuilder/scope.go, line 1091 at r3 (raw file):
f = typedFunc.(*tree.FuncExpr) if err := s.builder.semaCtx.CheckFunctionUsage(f, def); err != nil {
This call already happens inside the TypeCheck
call above.
pkg/sql/opt/optbuilder/scope.go, line 1100 at r3 (raw file):
s.builder.semaCtx.Properties.Derived.InWindowFunc, ) s.builder.semaCtx.Properties.Derived.InWindowFunc = true
This is also already set in TypeCheck
above.
pkg/sql/opt/optbuilder/scope.go, line 1103 at r3 (raw file):
for i, e := range f.WindowDef.Partitions { typedExpr := s.resolveAndRequireType(e, types.Any)
since you're passing types.Any
, just call resolveType
pkg/sql/opt/optbuilder/scope.go, line 1104 at r3 (raw file):
for i, e := range f.WindowDef.Partitions { typedExpr := s.resolveAndRequireType(e, types.Any) f.WindowDef.Partitions[i] = typedExpr.(tree.TypedExpr)
this cast is unnecessary
pkg/sql/opt/optbuilder/scope.go, line 1110 at r3 (raw file):
panic(builderError{errOrderByIndexInWindow}) } typedExpr := s.resolveAndRequireType(e.Expr, types.Any)
ditto
pkg/sql/opt/optbuilder/scope.go, line 1111 at r3 (raw file):
} typedExpr := s.resolveAndRequireType(e.Expr, types.Any) f.WindowDef.OrderBy[i].Expr = typedExpr.(tree.TypedExpr)
ditto
pkg/sql/opt/optbuilder/scope.go, line 1146 at r3 (raw file):
var ( errOrderByIndexInWindow = pgerror.New(pgcode.FeatureNotSupported, "ORDER BY INDEX in window definition is not supported") errVarOffsetGroups = pgerror.New(pgcode.Syntax, fmt.Sprintf("GROUPS offset cannot contain variables"))
unnecessary fmt.Sprintf
pkg/sql/sem/tree/type_check.go, line 186 at r3 (raw file):
// parameters of a window function in order to reject nested window // functions. InWindowFunc bool
shouldn't need to export this
pkg/sql/sem/tree/type_check.go, line 734 at r3 (raw file):
// CheckFunctionUsage checks whether a given built-in function is // allowed in the current context. func (sc *SemaContext) CheckFunctionUsage(expr *FuncExpr, def *FunctionDefinition) error {
shouldn't need to export this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (waiting on @jordanlewis, @justinj, @rytaft, and @yuzefovich)
pkg/sql/opt/optbuilder/scope.go, line 1091 at r3 (raw file):
Previously, rytaft wrote…
This call already happens inside the
TypeCheck
call above.
Indeed, this seems unnecessary.
pkg/sql/opt/optbuilder/scope.go, line 1100 at r3 (raw file):
Previously, rytaft wrote…
This is also already set in
TypeCheck
above.
True, it is set there, but then it is reset upon exiting FuncExpr.TypeCheck
call. And because I'm removing type checking of PARTITION BY and ORDER BY clauses from FuncExpr.TypeCheck
, when we're actually doing the type checking on those clauses a few lines below, semaCtx
will think that we're not in a window function, so the call to s.resolveType
on partitions and order by will not error out.
As a concrete example, on the query SELECT sum(a) OVER (PARTITION BY count(a) OVER ()) FROM x
from our logic test without changing inWindowFunc
we get:
expected "window function calls cannot be nested", but no error occurred
.
pkg/sql/opt/optbuilder/scope.go, line 1103 at r3 (raw file):
Previously, rytaft wrote…
since you're passing
types.Any
, just callresolveType
Nice, thanks!
pkg/sql/opt/optbuilder/scope.go, line 1104 at r3 (raw file):
Previously, rytaft wrote…
this cast is unnecessary
Indeed, thanks!
pkg/sql/opt/optbuilder/scope.go, line 1110 at r3 (raw file):
Previously, rytaft wrote…
ditto
Done.
pkg/sql/opt/optbuilder/scope.go, line 1111 at r3 (raw file):
Previously, rytaft wrote…
ditto
Done.
pkg/sql/opt/optbuilder/scope.go, line 1146 at r3 (raw file):
Previously, rytaft wrote…
unnecessary
fmt.Sprintf
Done.
pkg/sql/sem/tree/type_check.go, line 734 at r3 (raw file):
Previously, rytaft wrote…
shouldn't need to export this
Resolved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 2 of 2 files at r4.
Reviewable status:complete! 1 of 0 LGTMs obtained (waiting on @jordanlewis, @justinj, and @yuzefovich)
pkg/sql/window.go, line 177 at r4 (raw file):
var ( errOrderByIndexInWindow = pgerror.New(pgcode.FeatureNotSupported, "ORDER BY INDEX in window definition is not supported") errVarOffsetGroups = pgerror.New(pgcode.Syntax, fmt.Sprintf("GROUPS offset cannot contain variables"))
another unnecessary fmt.Sprintf
pkg/sql/opt/optbuilder/scope.go, line 1100 at r3 (raw file):
Previously, yuzefovich wrote…
True, it is set there, but then it is reset upon exiting
FuncExpr.TypeCheck
call. And because I'm removing type checking of PARTITION BY and ORDER BY clauses fromFuncExpr.TypeCheck
, when we're actually doing the type checking on those clauses a few lines below,semaCtx
will think that we're not in a window function, so the call tos.resolveType
on partitions and order by will not error out.As a concrete example, on the query
SELECT sum(a) OVER (PARTITION BY count(a) OVER ()) FROM x
from our logic test without changinginWindowFunc
we get:
expected "window function calls cannot be nested", but no error occurred
.
Cool, thanks for the explanation!
Previously, performing semantic analysis and type checking of related to window functions expressions was performed as a part of analysis of FuncExpr. However, this is not sufficient in all cases: for example, on the query like SELECT rank() OVER (w RANGE 1 PRECEDING) FROM t WINDOW w AS (ORDER BY a), in order to correctly analyze 1, we need to have information about ORDER BY clause coming from window definition w. This information is only available after we constructed window definitions in window.go. Additionally, this commit now requires an ORDER BY clause with GROUPS mode of framing (which is exactly what PostgreSQL does). Actually, without the ordering clause, GROUPS mode doesn't make sense since all rows will be a part of the same peer group which probably indicates a user error, so we will return an error in such cases. Release note: None
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis, @justinj, and @rytaft)
pkg/sql/window.go, line 177 at r4 (raw file):
Previously, rytaft wrote…
another unnecessary
fmt.Sprintf
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewed 1 of 1 files at r5.
Reviewable status:complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis and @justinj)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status:
complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis and @justinj)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the reviews!
bors r+
Reviewable status:
complete! 1 of 0 LGTMs obtained (and 1 stale) (waiting on @jordanlewis)
37251: sql: fix semantic analysis and type checking of window functions r=yuzefovich a=yuzefovich Previously, performing semantic analysis and type checking of related to window functions expressions was performed as a part of analysis of FuncExpr. However, this is not sufficient in all cases: for example, on the query like SELECT rank() OVER (w RANGE 1 PRECEDING) FROM t WINDOW w AS (ORDER BY a), in order to correctly analyze 1, we need to have information about ORDER BY clause coming from window definition w. This information is only available after we constructed window definitions in window.go. Additionally, this commit now requires an ORDER BY clause with GROUPS mode of framing (which is exactly what PostgreSQL does). Actually, without the ordering clause, GROUPS mode doesn't make sense since all rows will be a part of the same peer group which probably indicates a user error, so we will return an error in such cases. Release note: None Co-authored-by: Yahor Yuzefovich <[email protected]>
Build succeeded |
Previously, performing semantic analysis and type checking of
related to window functions expressions was performed as a part
of analysis of FuncExpr. However, this is not sufficient in all
cases: for example, on the query like
SELECT rank() OVER (w RANGE 1 PRECEDING) FROM t WINDOW w
AS (ORDER BY a),
in order to correctly analyze 1, we need to have information about
ORDER BY clause coming from window definition w. This information
is only available after we constructed window definitions in
window.go.
Additionally, this commit now requires an ORDER BY clause with
GROUPS mode of framing (which is exactly what PostgreSQL does).
Actually, without the ordering clause, GROUPS mode doesn't make
sense since all rows will be a part of the same peer group which
probably indicates a user error, so we will return an error in
such cases.
Release note: None