CIFS throws EINTR and ENOENT errors #483

Closed
gerroon opened this issue May 23, 2020 · 22 comments

@gerroon

gerroon commented May 23, 2020

Hi

Maybe I am pushing the limits here, but this is what I am doing. I am using symlinks in a Samba folder where Samba is configured to follow symlinks, so the mounting side sees them as regular files and folders.

I am doing it this way because I can rsync the reverse-mode folder over SSH much faster than I can by mounting remote folders.

  • Create a folder (/media/SharePlain) on PC-A

  • Add some symlinks to this folder

  • Enable symlink following in the Samba share settings

  • On another networked PC, PC-B, mount this share (/mnt/SharePlain)

  • On PC-B, use reverse mode to get the encrypted version of the Samba share using gocryptfs -reverse /mnt/SharePlain /mnt/ShareCipher

This seems to work, but I get the error below and I am not sure whether this indicates a seriously unstable situation. I definitely would not want to use this if it can cause file corruption.

rsync: readdir("/media/CIPHER/xxxdO1_v0L7okPnUxxx_-A/xxxJ7Tymzt6vYP7PlQxxx/xxxw9BpChqTo2mYGpLaKxxx_xxxPm4eb0NyZbyGkxxx/xxxrlICrprrA2Vhi-Lbxxx/xxxPxxx_xxxOT8pOcZdxxx"): Interrupted system call (4)

I also get another error

WARNING: xxxdO1_v0L7okPnUxxx_-A/x_xxxJ7Tymzt6vYP7PlQxxx/xxxs45X9Jvxxx__LBF0xxx failed verification -- update discarded (will try again)

another rsync complaint


rsync: read errors mapping

@gerroon gerroon changed the title Interrupted system call (4) Rsync: Interrupted system call (4) May 23, 2020
@rfjakob
Owner

rfjakob commented May 23, 2020

Interesting. I can reproduce this.

smb server ------ network --------> cifs mount -> gocryptfs reverse mount

$ ls -R > /dev/null
ls: reading directory './exaZ9-h5hT_CuSEkijIV9g/3_uwsn7uKxiF-ppuGjslmg/z-zD5SNR2-B8XNWvWpalaykve0P_j-HfMmpW2XN22aQ': Interrupted system call
ls: reading directory './exaZ9-h5hT_CuSEkijIV9g/4f5e6MxSAHsThknKO5cEYw': Interrupted system call
ls: reading directory './exaZ9-h5hT_CuSEkijIV9g/QubRkUUV-e-_VorHidwKYg': Interrupted system call
[...]

@rfjakob
Owner

rfjakob commented May 23, 2020

Looks like Go bug golang/go#38836

@gerroon
Author

gerroon commented May 23, 2020

Do you think the bug fix will make it into gocryptfs soon? I would really like to try this method myself.

@rfjakob
Owner

rfjakob commented May 24, 2020

Yes, I merged the fix yesterday as 25f1727.

You can test by following the (short) instructions at https://github.com/rfjakob/gocryptfs#compile

@gerroon
Author

gerroon commented May 24, 2020

Hi

Thanks for the fix. It seems to fix one issue, but I am now having another problem with this setup.

Please note how the file counts for the same folder vary randomly from run to run; it is hit and miss. If you want to try it, use a folder with a large number of files.

The folder originally has 1197 files. I get that number consistently on the Samba server itself, and also on the mounting side if I query the actual share, but not through the reverse-mode view.

/mnt/.temp is mounted with gocryptfs -reverse SAMBASHARE /mnt/.temp


USER@PC:/mnt/.temp/xxxxxxxxxxLaRKkDKKdTYx$ find -type f|wc -l
692
USER@PC:/mnt/.temp/xxxxxxxxxxLaRKkDKKdTYx$ find -type f|wc -l
1202
USER@PC:/mnt/.temp/xxxxxxxxxxLaRKkDKKdTYx$ find -type f|wc -l
7
USER@PC:/mnt/.temp/xxxxxxxxxxLaRKkDKKdTYx$ find -type f|wc -l
701
USER@PC:/mnt/.temp/xxxxxxxxxxLaRKkDKKdTYx$ find -type f|wc -l
1202
USER@PC:/mnt/.temp/xxxxxxxxxxLaRKkDKKdTYx$ 


Not sure if this is the same bug. I can open another one if you want.

@rfjakob
Owner

rfjakob commented May 24, 2020

Well, that's no good.

@rfjakob rfjakob reopened this May 24, 2020
@rfjakob
Owner

rfjakob commented May 24, 2020

But I cannot reproduce this here

cifs.reverse/u6tT_dcDrMxxuIYe7X0XbQ/CS_zRxKJxBOoyibUU6V3xw$ find -type f|wc -l
1006
cifs.reverse/u6tT_dcDrMxxuIYe7X0XbQ/CS_zRxKJxBOoyibUU6V3xw$ find -type f|wc -l
1006
cifs.reverse/u6tT_dcDrMxxuIYe7X0XbQ/CS_zRxKJxBOoyibUU6V3xw$ find -type f|wc -l
1006
cifs.reverse/u6tT_dcDrMxxuIYe7X0XbQ/CS_zRxKJxBOoyibUU6V3xw$ find -type f|wc -l
1006
cifs.reverse/u6tT_dcDrMxxuIYe7X0XbQ/CS_zRxKJxBOoyibUU6V3xw$ find -type f|wc -l
1006
cifs.reverse/u6tT_dcDrMxxuIYe7X0XbQ/CS_zRxKJxBOoyibUU6V3xw$ find -type f|wc -l
1006
cifs.reverse/u6tT_dcDrMxxuIYe7X0XbQ/CS_zRxKJxBOoyibUU6V3xw$ find -type f|wc -l

@rfjakob
Owner

rfjakob commented May 24, 2020

Got one:

cifs.reverse/u6tT_dcDrMxxuIYe7X0XbQ/CS_zRxKJxBOoyibUU6V3xw$ find -type f|wc -l
0

rfjakob added a commit that referenced this issue May 24, 2020
Small tool to try to debug unix.Getdents problems on CIFS mounts
#483
@rfjakob
Owner

rfjakob commented May 24, 2020

Wow, looks like a kernel bug. Related: golang/go#24015

rfjakob added a commit that referenced this issue May 24, 2020
On CIFS mounts, unix.Getdents can return sudden ENOENT
in the middle of data. This will not be reported as an error
by user space tools, so return EIO instead.

Also log it as a warning.

#483
@gerroon
Author

gerroon commented May 24, 2020

Thanks for investigating, sounds really messed up.

rfjakob added a commit that referenced this issue May 24, 2020
Another way to repro the problem in
#483
@rfjakob rfjakob changed the title Rsync: Interrupted system call (4) CIFS throws EINTR and ENOENT errors Jun 9, 2020
rfjakob added a commit that referenced this issue Jun 21, 2020
This was an attempt to make the C code more
similar to Go (which also reads from multiple threads).

However, I still could not repro the ENOENT problems.

#483
@NerdyProjects

I might be getting similar errors :-(

CIFS, 5.8.11-1-MANJARO, gocryptfs updated to the Git version because the Manjaro-packaged version was still built with Go 1.14: gocryptfs v1.8.0-39-g3b61244; go-fuse v2.0.3; 2020-10-13 go1.15.2 linux/amd64.

I get "Interrupted system call" when trying to rsync or copy things onto the gocryptfs filesystem mounted on a cifs share:

//kanthaus-server/homes on XXX type cifs (rw,relatime,vers=3.1.1,cache=strict,username=matthias,uid=1000,forceuid,gid=1000,forcegid,addr=192.168.178.31,file_mode=0755,dir_mode=0755,soft,nounix,serverino,mapposix,rsize=4194304,wsize=4194304,bsize=1048576,echo_interval=60,actimeo=1)
XXX/storage/privat on /home/bla/mnt type fuse.gocryptfs (rw,nosuid,nodev,relatime,user_id=1000,group_id=1000,max_read=131072)

I also get errors on rm:
rm: cannot remove 'XXX/mnt/video/bulgarien/menu/fertig/chapp4.bmp': Interrupted system call

Sometimes get errors in journalctl, but not always:

Okt 13 10:13:12 purefruit kernel: CIFS: VFS: \\kanthaus-server\homes Close interrupted close
Okt 13 10:13:12 purefruit kernel: CIFS: VFS: Send error in read = -4
Okt 13 10:13:13 purefruit kernel: CIFS: VFS: Send error in read = -4
Okt 13 10:13:13 purefruit kernel: CIFS: VFS: Send error in read = -4
Okt 13 10:14:01 purefruit kernel: CIFS: VFS: Send error in read = -4
Okt 13 10:14:01 purefruit gocryptfs[32063]: OpenDir "2M2gYlEyE_HFg7LQ88irLnvWH--Dq5HtKv731gi9LD4": could not read gocryptfs.diriv: interrupted system call

Doing it repeatedly sometimes fails at the same files, but then succeeds.

Copying/rsync from cifs -> cifs without gocryptfs works without any errors.

@rfjakob
Owner

rfjakob commented Oct 13, 2020

Can confirm with latest gocryptfs / Go version:

$ uname -a
Linux brikett 5.8.13-200.fc32.x86_64 #1 SMP Thu Oct 1 21:49:42 UTC 2020 x86_64 x86_64 x86_64 GNU/Linux

$ gocryptfs -version
gocryptfs v1.8.0-137-g803fdf4-dirty.gofuse_v2api; go-fuse v2.0.4-0.20200908172753-0b6cbc515082 => github.com/rfjakob/go-fuse/v2 v2.0.4-0.20201010230434-31e96afd74d6; 2020-10-13 go1.15.2 linux/amd64
$ tar xf /tmp/linux-3.0.tar.gz
tar: linux-3.0: Cannot mkdir: Interrupted system call
tar: linux-3.0/Documentation/ABI/stable/sysfs-driver-usb-usbtmc: Cannot open: Interrupted system call
tar: linux-3.0/Documentation/DocBook/v4l/biblio.xml: Cannot open: Interrupted system call
tar: linux-3.0/Documentation/DocBook/v4l/vidioc-g-sliced-vbi-cap.xml: Cannot open: Interrupted system call
tar: linux-3.0/Documentation/applying-patches.txt: Cannot open: Interrupted system call
tar: linux-3.0/Documentation/arm/VFP: Cannot utime: Interrupted system call
tar: linux-3.0/Documentation/blockdev/cciss.txt: Cannot open: Interrupted system call
tar: linux-3.0/Documentation/cgroups/cpuacct.txt: Cannot open: Interrupted system call
tar: linux-3.0/Documentation/cpuidle/sysfs.txt: Cannot open: Interrupted system call
tar: linux-3.0/Documentation/devicetree/bindings/gpio/led.txt: Cannot open: Interrupted system call
tar: linux-3.0/Documentation/devicetree/bindings/hwmon: Cannot mkdir: Interrupted system call
tar: linux-3.0/Documentation/devicetree/bindings/powerpc/fsl/cpm_qe: Cannot mkdir: Interrupted system call
...
$ rm -Rf linux-3.0
rm: cannot remove 'linux-3.0/.mailmap': Interrupted system call
rm: cannot remove 'linux-3.0/Documentation/cgroups': Interrupted system call
rm: cannot remove 'linux-3.0/Documentation/networking/arcnet.txt': Interrupted system call
rm: cannot remove 'linux-3.0/Documentation/networking/.gitignore': Interrupted system call
rm: cannot remove 'linux-3.0/Documentation/hwmon/wm8350': Interrupted system call
rm: cannot remove 'linux-3.0/Documentation/cdrom/00-INDEX': Interrupted system call
rm: cannot remove 'linux-3.0/Documentation/md.txt': Interrupted system call
rm: cannot remove 'linux-3.0/Documentation/RCU/torture.txt': Interrupted system call
rm: cannot remove 'linux-3.0/Documentation/RCU/lockdep.txt': Interrupted system call
rm: cannot remove 'linux-3.0/Documentation/isdn/README.hfc-pci': Interrupted system call
rm: cannot remove 'linux-3.0/Documentation/isdn/README.hysdn': Interrupted system call
rm: cannot remove 'linux-3.0/Documentation/isdn/00-INDEX': Interrupted system call
rm: cannot remove 'linux-3.0/Documentation/m68k/00-INDEX': Interrupted system call
rm: cannot remove 'linux-3.0/Documentation/i2c/busses/i2c-sis5595': Interrupted system call
...

@NerdyProjects

As a workaround, I just use it on sshfs, which works perfectly fine.

@gerroon
Author

gerroon commented Oct 13, 2020

SSHfs has been my solution as well.

@rfjakob
Owner

rfjakob commented Oct 13, 2020

For future reference:

rfjakob added a commit that referenced this issue Oct 13, 2020
Retry operations that have been shown to throw EINTR
errors on CIFS.

Todo: Solution for this pain in the back:

	warning: unix.Getdents returned errno 2 in the middle of data
	rm: cannot remove 'linux-3.0.old3/Documentation/ABI/removed': Input/output error

Progress towards fixing #483 .
@rfjakob
Owner

rfjakob commented Oct 13, 2020

@NerdyProjects I just merged a fix into the gofuse_v2api branch (the current development branch). If you can try it, you just need to run

git pull
git checkout gofuse_v2api
./build.bash

in the git repo.

@NerdyProjects

The fix seems to work, but I still see a lot of cifs errors in the kernel log that I don't see without gocryptfs...

Okt 14 12:57:36 purefruit kernel: CIFS: VFS: \\kanthaus-server\homes Close interrupted close
Okt 14 12:57:41 purefruit kernel: SMB2_read: 33 callbacks suppressed
Okt 14 12:57:41 purefruit kernel: CIFS: VFS: Send error in read = -4
Okt 14 12:57:41 purefruit kernel: CIFS: VFS: Send error in read = -4
Okt 14 12:57:41 purefruit kernel: CIFS: VFS: \\kanthaus-server\homes Close interrupted close
Okt 14 12:57:48 purefruit kernel: CIFS: VFS: Send error in read = -4
Okt 14 12:57:48 purefruit kernel: CIFS: VFS: \\kanthaus-server\homes Close interrupted close
Okt 14 12:57:48 purefruit kernel: CIFS: VFS: Send error in read = -4

@rfjakob
Owner

rfjakob commented Oct 14, 2020

Yes, I also see those. I'm not completely sure what is going on, but it seems like the kernel cifs driver and Go's multithreading don't like each other.

But,

Error -4 = EINTR = Interrupted system call

and those errors are retried.

@rfjakob
Owner

rfjakob commented Oct 14, 2020

One more thing I noticed when testing: When the mountpoint is on cifs, gocryptfs sometimes gets unmounted suddenly. For gocryptfs it looks like a regular unmount request.

I'm not sure what is going on here. I made an audit rule for the umount2 syscall, and nobody seems to call it. Maybe when the kernel has problems accessing the cifs share that contains the mountpoint, it unmounts the filesystems below it?

Anyway, the workaround is to mount on /tmp or /run/user , which stops the sudden unmounts.

@lechner
Contributor

lechner commented Oct 14, 2020

Hi, is this still related to SIGWINCH? I have had strange problems with a Perl program in Debian recently (the lintian test suite) under xmonad. It aborted due to SIGWINCH even though that signal is normally ignored and the program used the standard handler in Perl.

@rfjakob
Owner

rfjakob commented Oct 14, 2020

Maybe, in the sense that the Go runtime sends SIGURG to itself for multithreading interrupts. The theory is that cifs does not like it when the process that is accessing the share is interrupted.

SIGWINCH may have the same effect.

rfjakob added a commit that referenced this issue Mar 14, 2021
Give a user receiving the Getdents warning some background info.
@rfjakob
Owner

rfjakob commented Dec 21, 2022

The mentioned gofuse_v2api branch including the fix has been released as gocryptfs v2.0 in 2021 - https://github.com/rfjakob/gocryptfs#v20-2021-06-05

@rfjakob rfjakob closed this as completed Dec 21, 2022