Discussion:
Client says "Stale NFS file handle" but server does not return NFS3ERR_STALE
DENIEL Philippe
2012-03-27 15:54:03 UTC
Permalink
Hi,

I have the following issue:
The client does a classic "mount -o vers=3,lock server:/path /mnt". The
server is my nfs-ganesha user-space server.
Then a long-running "dd if=/dev/zero of=./foo..." is started inside a
directory under the mount point. Whatever the other dd parameters
(like bs= or count=) are, I kill the daemon and restart it a couple of
seconds later. Then I kill the dd (CTRL-C from the console). The dd
command returns an error (which is logical; it is an I/O error or Bad
File Descriptor), but I see something else that is quite strange:
- if I run ls from the current directory (where I ran 'dd'), I get the
message "ls: cannot open directory .: Stale NFS file handle"
- In wireshark, I see no NFS3ERR_STALE
The wireshark capture shows that the "server shutdown" happened between
a WRITE reply and the related COMMIT call (I received the COMMIT call as
the server restarted).
Apparently, the client decided on its own to return "Stale NFS file
handle" to the application: the server returns no error, and all replies
are NFS3_OK.
What should I be looking for to fix this bug? (It is probably on my
side.)

Regards

Philippe


--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
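To confirm what the client actually hands back to the application, independent of what appears on the wire, a small probe can list the directory and report the errno name. This is only a minimal sketch: `probe_directory` is a hypothetical helper name, and the path passed to it is whatever directory the dd was run from.

```python
import errno
import os


def probe_directory(path):
    """Try to list `path`; return "OK" or the symbolic errno name on failure."""
    try:
        os.listdir(path)
    except OSError as exc:
        # On Linux, "Stale NFS file handle" corresponds to errno.ESTALE.
        # Unknown errno values are returned as their decimal string.
        return errno.errorcode.get(exc.errno, str(exc.errno))
    return "OK"
```

If this returns "ESTALE" while the capture shows only NFS3_OK replies, the error is being generated on the client side, for instance during revalidation of cached state.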
DENIEL Philippe
2012-03-27 16:28:10 UTC
Permalink
More information:
if I do "echo 32767 > /proc/sys/sunrpc/nfs_debug", I can see this in
syslog:
Mar 27 18:11:15 aury63 kernel: [32430.065930] NFS: nfs_lookup_revalidate(/a) is invalid

Any idea?

Philippe
Myklebust, Trond
2012-03-27 16:45:09 UTC
Permalink
Are you perhaps returning blatantly wrong attributes? I would expect
that returning the wrong fileid, or an incorrect file type in either a
GETATTR or a READDIR might cause a situation such as what you describe.

Cheers
Trond
--
Trond Myklebust
Linux NFS client maintainer

NetApp
***@netapp.com
www.netapp.com
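One way to check the suggestion above from the client side is to record the fileid and file type of each directory entry before the server restart and compare them afterwards. The sketch below assumes that, on a Linux NFS mount, st_ino reflects the fileid returned by the server; `snapshot` and `changed_entries` are hypothetical helper names, not part of nfs-ganesha or the kernel client.

```python
import os
import stat


def snapshot(directory):
    """Map each entry name to (st_ino, file-type bits) as seen by the client."""
    result = {}
    for name in os.listdir(directory):
        st = os.lstat(os.path.join(directory, name))
        # Assumption: st_ino mirrors the NFS fileid on this mount.
        result[name] = (st.st_ino, stat.S_IFMT(st.st_mode))
    return result


def changed_entries(before, after):
    """Return names present in both snapshots whose fileid or type differs."""
    return sorted(
        name for name in before.keys() & after.keys()
        if before[name] != after[name]
    )
```

Taking one snapshot before killing the daemon and another from a fresh mount after the restart, any name reported by changed_entries would point at attributes that did not survive the restart. The same comparison can be made by hand on the fileid and type fields of the GETATTR and READDIR replies in the capture.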
