Fixed some dmu block clone reflink ASSERTs. #14995
The first one: `list_head(&db->db_dirty_records)` can return elements from a previous transaction group. For example, this can happen when a file is opened with the O_TRUNC flag and then reflinked. The free_long_range transaction group is not synced to disk yet, but the clone is already running.

An ASSERT like this would work:
```
dr = list_head(&db->db_dirty_records);
ASSERT(dr == NULL || (dr->dr_txg < tx->tx_txg));
```

The second one: it looks like the ASSERT is false if you are fast enough to read the data. The db->db_state will be DB_READ, but dr->dt.dl.dr_brtwrite is not synced to disk yet. So if (db->db_state == DB_READ && dr->dt.dl.dr_brtwrite == B_TRUE) is B_TRUE, it is fine to continue (in debug mode).

It can be reproduced with:
```
while true; do /usr/bin/cp -fv /tank/test/test.img /tank/test/test.img2 && sleep 1 && sha256sum /tank/test/test.img2; done
```

Tested on Linux 5.14

Signed-off-by: Kay Pedersen <[email protected]>
Plumbing it into (just) You may find this and this informative. Also I'd bet he'll find it anyway, but to speed that up, @pjd . |
Oh nice @rincebrain, this one is really nice. I'm fairly new to the codebase, and the chance to compare other code on the same problem with my own code is really helpful. Thanks a lot @rincebrain EDIT:
|
I didn't say it didn't have the ASSERTs, I was just pointing to it as another experiment in wiring it up that wasn't just wiring up remap. |
Oh sorry, I didn't want to point out the errors; I only tried to figure out if your |
PR #15050 has been opened which adds the Linux integration. @oromenahar it'd be great if you could test out the new PR. |
@behlendorf already working on a review for that pull request. It would be nice if the PR can be merged and can get off the ground. |
It looks like this is all real. There's more analysis in #15050. |
I'm working on reflink support for Linux and found some ASSERTs which are not working as expected.
I'm using `cp` to reflink the file and then using `sha256sum` right after it to read it. That results in some strange behaviour. My `cp` reflinks without `--reflink` being added explicitly as an option.
Motivation and Context
I don't know if this fixes an open issue. I guess there are some issues about reflink support on Linux, but not this one exactly.
Right now I'm working on reflink support for Linux, and I found this while working on that. If necessary, here is a link to the branch: reflink on linux still WIP
Description
The first one: `list_head(&db->db_dirty_records)` can return elements from a previous transaction group.
For example, this can happen when a file is opened with the `O_TRUNC` flag and then reflinked. The free_long_range transaction group is not synced to disk yet, but the clone is already running.
An ASSERT like this would work: `dr = list_head(&db->db_dirty_records); ASSERT(dr == NULL || (dr->dr_txg < tx->tx_txg));`
The second one:
It looks like the ASSERT is false if you are fast enough to read the data. The db->db_state will be DB_READ, but dr->dt.dl.dr_brtwrite is not synced to disk yet. So if `(db->db_state == DB_READ && dr->dt.dl.dr_brtwrite == B_TRUE)` is `B_TRUE`, it is fine to continue (in debug mode).
It can be reproduced with: `while true; do /usr/bin/cp -fv /tank/test/test.img /tank/test/test.img2 && sleep 1 && sha256sum /tank/test/test.img2; done`
Tested on Linux 5.14
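To make the two conditions above easier to reason about in isolation, here is a minimal sketch in plain C. It is not OpenZFS code: every `model_` type and function is a hypothetical stand-in invented for illustration, and only the field names `dr_txg`, `tx_txg` and `dr_brtwrite` are borrowed from the description. The real structures and the surrounding locking are far more involved.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for illustration only, NOT the real OpenZFS types. */
typedef enum {
	MODEL_DB_UNCACHED,
	MODEL_DB_READ,
	MODEL_DB_CACHED
} model_db_state_t;

typedef struct model_dirty_record {
	uint64_t dr_txg;	/* txg in which this record was dirtied */
	bool dr_brtwrite;	/* record was created by a block clone */
} model_dirty_record_t;

typedef struct model_tx {
	uint64_t tx_txg;	/* txg the current transaction is assigned to */
} model_tx_t;

/*
 * First condition: the head of the dirty list is either absent or
 * belongs to an earlier transaction group than the cloning tx.
 */
static bool
model_head_record_ok(const model_dirty_record_t *dr, const model_tx_t *tx)
{
	return (dr == NULL || dr->dr_txg < tx->tx_txg);
}

/*
 * Second condition: a buffer that is still in the READ state while a
 * block-clone dirty record is pending is treated as acceptable.
 */
static bool
model_read_during_clone_ok(model_db_state_t state,
    const model_dirty_record_t *dr)
{
	return (state == MODEL_DB_READ && dr != NULL && dr->dr_brtwrite);
}
```

In this model, a record dirtied in txg 9 passes the first check against a transaction in txg 10, while a record dirtied in txg 10 itself does not. The second predicate only holds for the combination described above: the READ state together with a pending clone record.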
How Has This Been Tested?
I have tested on Linux 5.14 (Rocky Linux 9) with a custom-built ZFS and zsh. I used the commits from this branch (reflink on linux still WIP) to expose the interface to the Linux kernel, and then used `cp` and `sha256sum` to read and write through the `VFS`. I tested on a virtual machine with one disk.
Types of changes