Cannot convert Expr to InList #781

Closed
Tracked by #776
HuashiSCNU0303 opened this issue Jul 27, 2024 · 3 comments · Fixed by #793
Labels
bug Something isn't working

Comments

@HuashiSCNU0303

Describe the bug
When a SQL query contains an InList Expr, I can't get the InList object through Expr.to_variant(); the call raises a RuntimeError instead.

To Reproduce

from datafusion import SessionContext
from datafusion.expr import Filter


def traverse_logical_plan(plan):
    cur_node = plan.to_variant()
    if isinstance(cur_node, Filter):
        predicate = cur_node.predicate().to_variant()
    if hasattr(plan, 'inputs'):
        for input_plan in plan.inputs():
            traverse_logical_plan(input_plan)


if __name__ == "__main__":
    ctx = SessionContext()
    data = {'id': [1, 2, 3], 'name': ['Alice', 'Bob', 'Charlie']}
    ctx.from_pydict(data, name='table1')
    query = "SELECT * FROM table1 t1 WHERE t1.name IN ('dfa', 'ad', 'dfre', 'vsa')"
    logical_plan = ctx.sql(query).optimized_logical_plan()
    traverse_logical_plan(logical_plan)

It produces:

  File "minimal_example.py", line 8, in traverse_logical_plan
    predicate = cur_node.predicate().to_variant()
RuntimeError: "Cannot convert this Expr to a Python object: InList(InList { expr: Column(Column { relation: Some(Bare { table: \"table1\" }), name: \"name\" }), list: [Literal(Utf8(\"dfa\")), Literal(Utf8(\"ad\")), Literal(Utf8(\"dfre\")), Literal(Utf8(\"vsa\"))], negated: false })"

Is there any other method to get the InList object?

@HuashiSCNU0303 HuashiSCNU0303 added the bug Something isn't working label Jul 27, 2024
@HuashiSCNU0303
Author

I'm using datafusion 39.0.0 with Python 3.10.11.

@timsaucer
Contributor

Just for my own education, what are the use cases for invoking .to_variant()? I came across these objects when I was working on the Python wrapper functions, and I don't know how people are using them. Can you help me understand why we would want to do this traversal? I'm going to tackle exposing all of those variants in #767, and the better I understand the user needs, the better I can make sure they are covered. Thank you!

@HuashiSCNU0303
Author

Thanks for your support. I would like to implement a custom query optimizer based on the LogicalPlan generated by DataFusion. It needs to obtain the expressions within the LogicalPlan (such as filter and join conditions) and then process them differently according to the type of expression (such as In, Like, Cast, etc.). Implementing this with string matching on the printed plan is very difficult because many edge cases need to be considered. Therefore, I hope to use to_variant() to conveniently perform type checks and extract the specific fields of each type of expression.
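
The kind of type-based processing I have in mind looks roughly like the sketch below. The `InList` and `Like` dataclasses here are hypothetical stand-ins with made-up fields, not the datafusion classes. They only illustrate dispatching on the variant's type rather than parsing strings.

```python
from dataclasses import dataclass


@dataclass
class InList:
    """Hypothetical stand-in for an IN-list variant (fields are assumptions)."""
    column: str
    values: list
    negated: bool = False


@dataclass
class Like:
    """Hypothetical stand-in for a LIKE variant (fields are assumptions)."""
    column: str
    pattern: str


def handle_in_list(variant):
    # Typed access to the fields; no string parsing of the plan is needed.
    op = "NOT IN" if variant.negated else "IN"
    return f"{variant.column} {op} {len(variant.values)} values"


def handle_like(variant):
    return f"{variant.column} LIKE {variant.pattern}"


# Map each variant type to the function that processes it.
HANDLERS = {InList: handle_in_list, Like: handle_like}


def process(variant):
    """Dispatch on the variant's exact type; return None for unhandled types."""
    handler = HANDLERS.get(type(variant))
    return handler(variant) if handler is not None else None
```

With real variants, the keys of `HANDLERS` would be whatever classes to_variant() returns, and each handler would read the fields those classes expose.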
