simple reproducer for mmap zfs_flush hang #17303
Comments
@markwiederspahn If you don't see those things then it's probably a different issue from #17159
No inode_wait_for_writeback, ever. Typically there is a flush-zfs-N kworker in D wait, and 16 or 32 python tasks also in D wait. No looping ever. The pool stops functioning since no txgs succeed (I assume). Other pools are not affected and the system as a whole continues to function normally.
The even simpler failure reproducer given by @dweezil in #8751 using fio also fails for me if the task count is the same as the number of virtual CPUs: n=8 works fine with 16 HT CPUs but fails with n=16, with at least 2.3.2-1 and recent git-master on Alma8. n=8 fails with 8 CPUs with 2.3.2-1 on Ubuntu 24.04, kernel 6.8.0-59-generic. The latter's /var/log/syslog shows:
2025-05-12T09:08:17.985482-05:00 IG-D03574 kernel: INFO: task kworker/u16:0:28850 blocked for more than 122 seconds.
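(The exact fio job from #8751 is not quoted in this thread; the invocation below is only a hypothetical sketch of the mmap-engine workload being described, with illustrative sizes and the job count matched to the vCPU count. The directory /tank/fiotest is a placeholder for a ZFS-backed path.)

```sh
# hypothetical sketch, not the original #8751 job: mmap-backed writes
# with one fio job per virtual CPU against a ZFS-backed directory
fio --name=mmap-hang --directory=/tank/fiotest --ioengine=mmap \
    --rw=write --bs=1M --size=2G --numjobs=$(nproc) --group_reporting
```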
oh, a reproducer, nice, I'll take a look as time permits
ok this is with an older kernel, can't reproduce with a new one, guessing some folio migration related impedance mismatch |
System information
Describe the problem you're observing
A production RHEL8 system using Python parallel workers with mmap hangs when certain jobs are run. The issue was traced to a "flush" which does an msync() and leads to ZFS sync writes. Originally seen on a 2x12-disk raidz2 array (2.1.16, 36 cores, 644G RAM); it also occurred with 2.2.7. Debugging moved to an Alma8 system with an 8-disk raidz1, 8 cores, and 48G RAM. The hang is generally reproducible within a few minutes at n=16; sometimes two concurrent copies of the n=16 run are needed.
Describe how to reproduce the problem
On stock RHEL8/Alma8 with ZFS built --enable-debug from the tarballs: dnf install python39 python39-numpy
/usr/bin/python3.9 ./memmap_flush_test.py -j 16 --flush will sometimes hang the system; running two copies at once hangs it faster.
echo 8191 > zfs_flags
No ASSERTs are ever seen.
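The memmap_flush_test.py script itself is not attached to this issue; the following is a minimal sketch of the kind of workload it describes, assuming one numpy memmap per worker and an explicit flush() (which issues msync() on the mapping) when --flush is passed. File names, sizes, and iteration counts are illustrative, not taken from the original script.

```python
#!/usr/bin/env python3.9
# Hypothetical sketch of a memmap+flush stress test (not the original
# memmap_flush_test.py): each worker dirties pages of its own numpy
# memmap and calls flush(), which issues msync() and pushes sync
# writes through the ZFS mmap/writeback path.
import argparse
import multiprocessing

import numpy as np

SIZE = 256 * 1024 * 1024   # bytes per worker mapping (illustrative)
ROUNDS = 1000

def worker(idx, do_flush):
    arr = np.memmap(f"memmap_worker_{idx}.dat", dtype=np.uint8,
                    mode="w+", shape=(SIZE,))
    for r in range(ROUNDS):
        arr[::4096] = r % 256      # touch one byte per 4K page
        if do_flush:
            arr.flush()            # msync() -> zpl_writepages on ZFS

if __name__ == "__main__":
    p = argparse.ArgumentParser()
    p.add_argument("-j", "--jobs", type=int, default=16)
    p.add_argument("--flush", action="store_true")
    args = p.parse_args()
    procs = [multiprocessing.Process(target=worker, args=(i, args.flush))
             for i in range(args.jobs)]
    for pr in procs:
        pr.start()
    for pr in procs:
        pr.join()
```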
This seems to be related to various similar issues: #7964, #8751 and #17159, although I attempted to install the small changes from the latter in my copy of master and I continue to get hangs. Probably I'm doing something wrong and not actually getting the changes.
Include any warning/errors/backtraces from the system logs
...
/proc/spl/kstat/zfs/dbgmsg:
1746464023 ffff8f994d87d000 zap_leaf.c:501:zap_entry_read_name(): error 75
1746464023 ffff8f994d87d000 zap_leaf.c:501:zap_entry_read_name(): error 75
1746464023 ffff8f994d87d000 metaslab.c:4021:metaslab_flush(): flushing: txg 17473623, spa xx1, vdev_id 0, ms_id 5231, unflushed_allocs 647168, unflushed_frees 507904, appended 280 bytes
1746464023 ffff8f994d87d000 zap_leaf.c:501:zap_entry_read_name(): error 75
1746464023 ffff8f994d87d000 dbuf.c:3358:dbuf_findbp(): error 2
1746464023 ffff8f994d87d000 zap_leaf.c:501:zap_entry_read_name(): error 75
messages:
May 5 11:57:09 utig1 kernel: INFO: task kworker/u96:1:139110 blocked for more than 120 seconds.
May 5 11:57:09 utig1 kernel: Tainted: P OE -------- - - 4.18.0-553.51.1.el8_10.x86_64 #1
May 5 11:57:09 utig1 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
May 5 11:57:09 utig1 kernel: task:kworker/u96:1 state:D stack:0 pid:139110 ppid:2 flags:0x80004080
May 5 11:57:09 utig1 kernel: Workqueue: writeback wb_workfn (flush-zfs-6)
May 5 11:57:09 utig1 kernel: Call Trace:
May 5 11:57:09 utig1 kernel: __schedule+0x2d1/0x870
May 5 11:57:09 utig1 kernel: schedule+0x55/0xf0
May 5 11:57:09 utig1 kernel: io_schedule+0x12/0x40
May 5 11:57:09 utig1 kernel: __lock_page+0x136/0x270
May 5 11:57:09 utig1 kernel: ? filemap_fdatawait_keep_errors+0x50/0x50
May 5 11:57:09 utig1 kernel: write_cache_pages+0x1f2/0x420
May 5 11:57:09 utig1 kernel: ? zpl_readpages+0x20/0x20 [zfs]
May 5 11:57:09 utig1 kernel: ? __wb_update_bandwidth+0x35/0x1e0
May 5 11:57:09 utig1 kernel: ? _cond_resched+0x15/0x30
May 5 11:57:09 utig1 kernel: zpl_writepages+0xaa/0x200 [zfs]
May 5 11:57:09 utig1 kernel: do_writepages+0xc6/0x1d0
May 5 11:57:09 utig1 kernel: ? ip_finish_output2+0x1a6/0x430
May 5 11:57:09 utig1 kernel: __writeback_single_inode+0x39/0x300
May 5 11:57:09 utig1 kernel: writeback_sb_inodes+0x1ea/0x450
May 5 11:57:09 utig1 kernel: __writeback_inodes_wb+0x5f/0xd0
May 5 11:57:09 utig1 kernel: wb_writeback+0x24c/0x2e0
May 5 11:57:09 utig1 kernel: ? get_nr_inodes+0x35/0x60
May 5 11:57:09 utig1 kernel: wb_workfn+0x392/0x4e0
May 5 11:57:09 utig1 kernel: process_one_work+0x1d3/0x390
May 5 11:57:09 utig1 kernel: worker_thread+0x30/0x390
May 5 11:57:09 utig1 kernel: ? process_one_work+0x390/0x390
May 5 11:57:09 utig1 kernel: kthread+0x134/0x150
May 5 11:57:09 utig1 kernel: ? set_kthread_struct+0x50/0x50
May 5 11:57:09 utig1 kernel: ret_from_fork+0x35/0x40
ps axl | grep D:
5 0 139110 2 20 0 0 0 - D ? 0:16 [kworker/u96:1+flush-zfs-6]
1 0 253108 252676 20 0 6226276 101836 - D pts/0 0:01 /usr/bin/python3.9 ../memmap_flush_test.py -j 16 --flush
1 0 253109 252676 20 0 6226276 101836 - D pts/0 0:01 /usr/bin/python3.9 ../memmap_flush_test.py -j 16 --flush
1 0 253113 252676 20 0 6226276 101836 - D pts/0 0:01 /usr/bin/python3.9 ../memmap_flush_test.py -j 16 --flush
1 0 253118 252676 20 0 6226276 101840 - D pts/0 0:01 /usr/bin/python3.9 ../memmap_flush_test.py -j 16 --flush
...