Skip to content

Tolerance #4467

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: main
Choose a base branch
from
Open

Tolerance #4467

wants to merge 4 commits into from

Conversation

ghislainp
Copy link
Contributor

@pep8speaks
Copy link

pep8speaks commented Sep 27, 2020

Hello @ghislainp! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

Line 847:1: E302 expected 2 blank lines, found 1

Comment last updated at 2020-10-03 20:06:26 UTC

Comment on lines 847 to 851
# fail if tolerance is not properly implemented
combine_by_coords([ds1, ds2], tolerance=1e-6)

# fail if tolerance is not properly implemented
combine_by_coords([ds1, ds2], tolerance={"x": 1e-6})
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a good start, but these tests should also build up the "expected" output to compare against. Not raising an error does not necessarily mean that things worked properly :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point, and in fact my PR does not work. It actually detect that x is not to be concatenated, but then combine_by_coords calls align (indirectly through a chain of function call) to deal with the other dims. Align also tests the equality between the index of the same name and use join to decide what to do. It results that the index that are close, are joined. This is not the expected behavior.

To solve this issue, the first option I see is to add a tolerance argument to align (and to concat which calls align). This is profound changes, I need to opinion on that.

The alternative option is to copy the datasets in _infer_concat_order_from_coords and for the dims that are found to be allclose in all the datasets, set the same index in all the datasets (i.e. taking the index of the first dataset, and set it to all the others). This will trick the next step when concat and align are called and compares the indexes. I don't really like this option but it has the advantage to minimize the changes, and avoid touching concat and align. Adding a tolerance to concat and align is certainly useful, as it is for combine_by_coords, but it must be done well...

What do you think ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think both of these options would be OK.

With regards to the first solution (adding tolerance to align), that's exactly what we agreed on in #2217. But indeed, that would be a larger change (probably best handled separately).

With regards to the second solution, I suspect you could call align with join='override' in the right place to set the indexes properly. The challenge would be tracking down where that needs to be set, and ensuring that it doesn't effect other unrelated code paths.

Comment on lines +77 to +84
all(index.equals(indexes[0]) for index in indexes[1:])
or (
atol > 0
and all(index.is_numeric() for index in indexes)
and all(
np.allclose(index, indexes[0], atol=atol, rtol=0)
for index in indexes[1:]
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a lot of logic to put into a single expression :)

I would suggest writing a helper function with a signature of the form _equivalent(first_index, second_index, atol) that could be called on each candidate index.

@rabernat rabernat requested a review from benbovy March 31, 2021 15:58
@TomNicholas TomNicholas self-assigned this Jul 9, 2021
@TomNicholas TomNicholas added the topic-combine combine/concat/merge label Jul 9, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic-combine combine/concat/merge
Projects
None yet
Development

Successfully merging this pull request may close these issues.

combine_by_coords could use allclose instead of equal to compare coordinates
4 participants