Skip to content

hashset::contains() returns FALSE, although the item is in the set #218

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
SvetlinZarev opened this issue Dec 22, 2020 · 9 comments
Closed

Comments

@SvetlinZarev
Copy link

SvetlinZarev commented Dec 22, 2020

While playing AoC 2020 I've encountered a very strange issue with HashSet::contains() - it reports that an item is NOT present in the set, although it is. Here is a MCVE:

use std::collections::VecDeque;
//use std::collections::HashSet;

use hashbrown::HashSet;

pub fn main() {
    let mut deque = (0..10).collect::<VecDeque<usize>>();

    let mut seen_a = HashSet::new();
    let mut seen_b = HashSet::new();

    for i in 0..100 {
        if !seen_a.insert(deque.clone()) {
            if !seen_b.contains(&deque) {
                println!("{:3}: {:?}", i, deque);
            }
        }

        // already inserted
        // seen_a.insert(deque.clone());
        seen_b.insert(deque.clone());

        deque.rotate_left(1);
    }
}

Expected behavior:
The application does not print anything

Actual behavior
The application prints:

 10: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
 11: [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
 12: [2, 3, 4, 5, 6, 7, 8, 9, 0, 1]
 13: [3, 4, 5, 6, 7, 8, 9, 0, 1, 2]
 14: [4, 5, 6, 7, 8, 9, 0, 1, 2, 3]
 15: [5, 6, 7, 8, 9, 0, 1, 2, 3, 4]
 23: [3, 4, 5, 6, 7, 8, 9, 0, 1, 2]
 24: [4, 5, 6, 7, 8, 9, 0, 1, 2, 3]
 25: [5, 6, 7, 8, 9, 0, 1, 2, 3, 4]
 26: [6, 7, 8, 9, 0, 1, 2, 3, 4, 5]
 27: [7, 8, 9, 0, 1, 2, 3, 4, 5, 6]
 28: [8, 9, 0, 1, 2, 3, 4, 5, 6, 7]
 29: [9, 0, 1, 2, 3, 4, 5, 6, 7, 8]
 30: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
 31: [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
 39: [9, 0, 1, 2, 3, 4, 5, 6, 7, 8]
 40: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
 41: [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
 42: [2, 3, 4, 5, 6, 7, 8, 9, 0, 1]
 43: [3, 4, 5, 6, 7, 8, 9, 0, 1, 2]
 44: [4, 5, 6, 7, 8, 9, 0, 1, 2, 3]
 45: [5, 6, 7, 8, 9, 0, 1, 2, 3, 4]
 46: [6, 7, 8, 9, 0, 1, 2, 3, 4, 5]
 47: [7, 8, 9, 0, 1, 2, 3, 4, 5, 6]
 55: [5, 6, 7, 8, 9, 0, 1, 2, 3, 4]
 56: [6, 7, 8, 9, 0, 1, 2, 3, 4, 5]
 57: [7, 8, 9, 0, 1, 2, 3, 4, 5, 6]
 58: [8, 9, 0, 1, 2, 3, 4, 5, 6, 7]
 59: [9, 0, 1, 2, 3, 4, 5, 6, 7, 8]
 60: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
 61: [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
 62: [2, 3, 4, 5, 6, 7, 8, 9, 0, 1]
 63: [3, 4, 5, 6, 7, 8, 9, 0, 1, 2]
 71: [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
 72: [2, 3, 4, 5, 6, 7, 8, 9, 0, 1]
 73: [3, 4, 5, 6, 7, 8, 9, 0, 1, 2]
 74: [4, 5, 6, 7, 8, 9, 0, 1, 2, 3]
 75: [5, 6, 7, 8, 9, 0, 1, 2, 3, 4]
 76: [6, 7, 8, 9, 0, 1, 2, 3, 4, 5]
 77: [7, 8, 9, 0, 1, 2, 3, 4, 5, 6]
 78: [8, 9, 0, 1, 2, 3, 4, 5, 6, 7]
 79: [9, 0, 1, 2, 3, 4, 5, 6, 7, 8]
 87: [7, 8, 9, 0, 1, 2, 3, 4, 5, 6]
 88: [8, 9, 0, 1, 2, 3, 4, 5, 6, 7]
 89: [9, 0, 1, 2, 3, 4, 5, 6, 7, 8]
 90: [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
 91: [1, 2, 3, 4, 5, 6, 7, 8, 9, 0]
 92: [2, 3, 4, 5, 6, 7, 8, 9, 0, 1]
 93: [3, 4, 5, 6, 7, 8, 9, 0, 1, 2]
 94: [4, 5, 6, 7, 8, 9, 0, 1, 2, 3]
 95: [5, 6, 7, 8, 9, 0, 1, 2, 3, 4]

The issue is NOT reproducible with std::collections::HashSet. Just comment out the hashbrown import and uncomment the std one.

Hashbrown version: 0.9.1
rustc 1.48.0 (7eac88abb 2020-11-16)

@SvetlinZarev
Copy link
Author

Another, cleaner example:

use std::collections::VecDeque;

pub fn main() {
    let mut deque = (0..10).collect::<VecDeque<usize>>();

    let mut std_set = std::collections::HashSet::new();
    let mut hbr_set = hashbrown::HashSet::new();

    for _ in 0..100 {
        println!("Key: {:?}", deque);

        let std_x = std_set.contains(&deque);
        let hbr_x = hbr_set.contains(&deque);
        if std_x != hbr_x {
            println!("[C] Std: {}; Hbr: {};", std_x, hbr_x);
        }

        let std_x = std_set.get(&deque);
        let hbr_x = hbr_set.get(&deque);
        if std_x != hbr_x {
            println!("[G] Std: {:?}; Hbr: {:?};", std_x, hbr_x);
        }

        let std_x = std_set.insert(deque.clone());
        let hbr_x = hbr_set.insert(deque.clone());
        if std_x != hbr_x {
            println!("[I] Std: {}; Hbr: {};", std_x, hbr_x);
        }

        println!();

        deque.rotate_left(1);
    }
}

it seems that the issue occurs only with contains() and get(), but never with insert()

@cuviper
Copy link
Member

cuviper commented Dec 22, 2020

The issue is NOT reproducible with std::collections::HashSet.

This is odd since std is based on hashbrown -- but std::collections::HashSet<T, ahash::RandomState> also fails! And going the other way, hashbrown::HashSet<T, std::collections::hash_map::RandomState> is fine.

@cuviper
Copy link
Member

cuviper commented Dec 22, 2020

I tried to bisect this, but it just took me to 182d3e3, "Replace FxHash with AHash as the default hasher."

@cuviper
Copy link
Member

cuviper commented Dec 22, 2020

It seems ahash gets different values for a rotated (non-contiguous) VecDeque: (playground)

use ahash::AHasher;
use std::collections::VecDeque;
use std::hash::{Hash, Hasher};

fn hash(x: &impl Hash) -> u64 {
    let mut hasher = AHasher::default();
    Hash::hash(x, &mut hasher);
    hasher.finish()
}

fn main() {
    let deque: VecDeque<i32> = (0..10).collect();
    
    let mut deque2 = deque.clone();
    deque2.rotate_left(5);
    deque2.rotate_left(5);
    
    assert_eq!(deque, deque2);
    assert_eq!(hash(&deque), hash(&deque2)); // fails!
}

@Amanieu
Copy link
Member

Amanieu commented Dec 22, 2020

This feels like a bug in VecDequeue's Hash impl: it simply passes two slices to the hasher. This is incorrect since the slice length is allowed to affect the result, which means that hash(([1, 2], [3, 4])) is allowed to return a different result than hash(([1], [2, 3, 4])).

@SvetlinZarev
Copy link
Author

You are right:

fn main() {
    let mut deque: VecDeque<i32> = (0..10).collect();

    deque.rotate_left(5);
    deque.rotate_left(5);
    println!("{:?}", deque.as_slices());

    let hashcode = hash(&deque);

    deque.rotate_left(5);
    deque.rotate_left(5);
    println!("{:?}", deque.as_slices());


    assert_eq!(hashcode, hash(&deque)); // fails!
}

After rotation, the deque is split in a different way:

([0, 1, 2, 3, 4, 5], [6, 7, 8, 9])
([0, 1, 2, 3, 4, 5, 6, 7, 8, 9], [])

@Amanieu
Copy link
Member

Amanieu commented Dec 22, 2020

I'm going to close this since it isn't a bug in hashbrown or ahash. Can you open an issue in rust-lang/rust?

@Amanieu Amanieu closed this as completed Dec 22, 2020
@cuviper
Copy link
Member

cuviper commented Dec 22, 2020

This feels like a bug in VecDequeue's Hash impl: it simply passes two slices to the hasher. This is incorrect since the slice length is allowed to affect the result, which means that hash(([1, 2], [3, 4])) is allowed to return a different result than hash(([1], [2, 3, 4])).

It doesn't directly hash the slices, but calls Hash::hash_slice -- it's not clear to me if that is supposed to be sensitive to length or just act like each item was fed in sequence. But the Hash::hash_slice for primitive integers casts to a &[u8] and calls Hasher::write, and ahash does add the length there.

@Amanieu
Copy link
Member

Amanieu commented Dec 22, 2020

Well it's either a bug in VecDeque or in ahash, depending on how we specify Hash::hash_slice. In any case that method needs clearer documentation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants