feat: suggest close matches using Levenshtein distance [POC] #836
This pull request is automatically built and testable in CodeSandbox. To see build info of the built libraries, click here or the icon next to each commit SHA. Latest deployment of this branch, based on commit 8ee261f:
Codecov Report
@@            Coverage Diff            @@
##            master     #836    +/-   ##
=========================================
  Coverage   100.00%   100.00%
=========================================
  Files           26        27     +1
  Lines          934       965    +31
  Branches       286       298    +12
=========================================
+ Hits           934       965    +31
Continue to review full report at Codecov.
Thanks for this! I'm thinking this would be pretty useful. Good progress so far.
const closeMatches =
  !computeCloseMatches || typeof id !== 'string'
    ? []
    : getCloseMatchesByAttribute(getTestIdAttribute(), c, id, options)
I'm concerned about this increasing the performance issues we already have with find* queries, which are expected to fail at least once. Any chance we could lazily calculate this value so it's only run when the error is actually displayed? I don't know whether this is possible.
But perhaps my concern is unwarranted? Maybe this is faster than I think?
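One way the lazy calculation asked about above could look is an error whose message is only built when something reads it. This is a minimal sketch, not the library's implementation: `createLazyError` and the message text are hypothetical names chosen for illustration.

```javascript
// Hypothetical helper: returns an Error whose `message` is computed on first
// read and cached afterwards, so the expensive work is skipped when the
// error is swallowed by a retry loop.
function createLazyError(buildMessage) {
  const error = new Error()
  let cached
  Object.defineProperty(error, 'message', {
    configurable: true,
    enumerable: false,
    get() {
      if (cached === undefined) cached = buildMessage()
      return cached
    },
  })
  return error
}

// Usage: count how often the expensive part actually runs.
let computeCount = 0
const lazyError = createLazyError(() => {
  computeCount += 1 // stands in for the close-match computation
  return 'Unable to find an element by: [data-testid="search"]'
})
```

A caveat with this approach: the `stack` string is typically captured when the error is created, so in some engines it may not include the lazily built message.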
> I'm concerned about this increasing the performance issues we already have with find* queries which are expected to fail at least once.
Good point. I tested a few times locally, running getByTestId('search', {computeCloseMatches: true}) vs getByTestId('search', {computeCloseMatches: false}).
Not a reliable benchmark, but with false it consistently finished in around 20ms, and with true in around 35ms. The cost increases with the number of elements found, but not by much. It might become slightly quicker with the recommended lib.
> Any chance we could lazily calculate this value so it's only run when the error is actually displayed?
An alternative would be to throw functions for find* queries, and then check if typeof lastError === 'function' when waitFor times out.
That might mean decoupling find* and get* queries a bit, or perhaps making get queries depend on find queries instead of the other way around.
i.e. a get* query could be a find* query with {timeout: 0, interval: Infinity}? (Not sure if that would work.)
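The "throw functions" idea can be sketched as follows. This is a simplified, synchronous stand-in for the polling loop, not the real waitFor: `queryOrThrowLazy`, `resolveLastError`, and `pollSync` are hypothetical names, and the real implementation is asynchronous and timer-based.

```javascript
// Hypothetical: a find*-style callback throws a function instead of an Error,
// deferring the cost of building the error (and any close matches).
function queryOrThrowLazy(found, buildError) {
  if (!found) throw buildError
  return found
}

// Hypothetical: what a waitFor-like loop could do once it gives up.
function resolveLastError(lastError) {
  return typeof lastError === 'function' ? lastError() : lastError
}

// Simplified, synchronous stand-in for the retry loop: try the callback a
// fixed number of times and only build the final error after the last failure.
function pollSync(callback, attempts) {
  let lastError
  for (let i = 0; i < attempts; i++) {
    try {
      return callback()
    } catch (error) {
      lastError = error
    }
  }
  throw resolveLastError(lastError)
}
```

With this shape, a callback that fails five times builds its error once, after the final attempt, rather than on every retry.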
Since that seems a bit involved, we could start with a computeCloseMatches config option that defaults to false and give that some testing?
I'm ok with giving it a trial run and seeing what real-world experience with it will be like, so defaulting to disabled makes sense to me. Let's try it out using leven.
@kentcdodds do you think we should start with the testId query or add this feature to all 10 queries? Wondering if I can break down the work somehow...
Re. the performance concern: if we realise there is a huge performance impact, we can skip the computation of close matches on find* queries (similar to what was done for the role query here: #590).
I'm not sure. What does everyone else think?
IMO if the default is false, we can give it a try in more queries. That said, this seems to me like a configuration that shouldn't be set to true by default at all, especially since it has a performance impact. I see this as something that will not be helpful in CI, for example, and will only take more time, so developers who want it should have the option to opt in.
Putting aside what I wrote above, I really like this PR and do think it can have a valuable impact so thanks for this :)
Can you please try a benchmark again using the new leven implementation?
What:
Inspired by #582
This provides a way to suggest close matches to users when the query cannot find any elements.
This is a POC, looking to gather feedback to see if it's worth pursuing.
Why:
Might make it easier to debug, especially when there are typos in the queries or when a certain element name has changed slightly.
How:
query by attribute:
calculate close matches:
M = element text length and N = search string length
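For readers unfamiliar with the technique, here is a minimal sketch of how close matches can be computed with Levenshtein distance, where M and N are the lengths of the two strings being compared. It is a plain dynamic-programming version written for illustration, not the PR's code or the leven package; `getCloseMatches` and its `maxDistance` default of 2 are assumptions.

```javascript
// Classic dynamic-programming edit distance between strings a and b.
// Time is O(M * N); memory is O(N) because only two rows are kept.
function levenshtein(a, b) {
  let previous = Array.from({length: b.length + 1}, (_, j) => j)
  for (let i = 1; i <= a.length; i++) {
    const current = [i]
    for (let j = 1; j <= b.length; j++) {
      const substitutionCost = a[i - 1] === b[j - 1] ? 0 : 1
      current[j] = Math.min(
        previous[j] + 1, // deletion
        current[j - 1] + 1, // insertion
        previous[j - 1] + substitutionCost, // substitution
      )
    }
    previous = current
  }
  return previous[b.length]
}

// Hypothetical helper: keeps candidates within maxDistance edits of the
// search string, closest first. The threshold is an assumed default.
function getCloseMatches(candidates, search, maxDistance = 2) {
  return candidates
    .map(text => ({text, distance: levenshtein(text, search)}))
    .filter(({distance}) => distance <= maxDistance)
    .sort((x, y) => x.distance - y.distance)
    .map(({text}) => text)
}
```

For example, searching for 'search' among the candidates 'serach', 'submit', and 'searc' would keep 'searc' (one edit) and 'serach' (two edits) and drop 'submit'.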
Note: this was implemented only on the byTestId query for now and behind a computeCloseMatches flag.
Checklist:
docs site