Kerberized NFS (v3 and v4) does not work if Kerberos token is >= 2048 bytes (if PAGE_SIZE == 4096)

Jonathan Manton

2010-08-04 14:38:57 UTC

Kerberized NFS does not work for users who have a Kerberos token that
is 2048 bytes or larger (assuming PAGE_SIZE == 4096). This can
(easily) happen in enterprise environments that use Active Directory
as their Kerberos source, due to the additional Privilege Attribute
Certificate (PAC) added by AD to the token.

When a user with a Kerberos token of 2048 bytes or larger attempts to
access a filesystem mounted using Kerberized NFS, the NFS server locks
up for 30 seconds, and ultimately the call fails.

The root cause of this problem is the interface between the sunrpc
layer and the rpc.svcgssd daemon. When an NFS NULL request comes in
from a client to establish the initial security context, information
is passed via the rpc cache mechanism through a named pipe
(sunrpc_cache_pipe_upcall() in net/sunrpc/cache.c), to be consumed by
the rpc.svcgssd daemon. This results in the upcall data being
formatted via rsi_request() in net/sunrpc/auth_gss/svcauth_gss.c.
rsi_request() uses the qword_addhex() routine (implemented in net/
sunrpc/cache.c) to encode the upcall data as ASCII for the named pipe.

The issue is that the upcall data is limited to PAGE_SIZE bytes (this
buffer is allocated in sunrpc_cache_pipe_upcall). On my kernel at
least, this is 4096 bytes. The upcall data is encoded as ASCII
characters. It takes two ASCII characters to encode each byte of
upcall data, meaning that any token over 2047 bytes will fill the
buffer and result in an error condition.
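
The arithmetic above can be checked with a small userspace sketch. It is not the kernel's qword_addhex(); hex_encode() and SKETCH_PAGE_SIZE are hypothetical names, and the sketch assumes the token is the only thing written to the buffer and that one character is kept for a terminator. Any framing or other fields sharing the buffer could only lower the threshold further.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096 /* assumption: PAGE_SIZE == 4096, as in the post */

/*
 * Hypothetical helper, not the kernel's qword_addhex(): hex-encode `len`
 * bytes of `src` into `dst`, which has room for `cap` characters.
 * Returns the number of characters written, or -1 if the encoded form
 * does not fit with one character left over for the terminator.
 */
static long hex_encode(char *dst, size_t cap, const unsigned char *src, size_t len)
{
    static const char digits[] = "0123456789abcdef";
    size_t i;

    assert(dst != NULL && src != NULL);
    if (cap == 0 || len > (cap - 1) / 2)
        return -1; /* two output characters per input byte do not fit */
    for (i = 0; i < len; i++) {
        dst[2 * i] = digits[src[i] >> 4];
        dst[2 * i + 1] = digits[src[i] & 0x0f];
    }
    dst[2 * len] = '\0';
    return (long)(2 * len);
}
```

With a 4096-character buffer, a 2047-byte token encodes to 4094 characters and fits, while a 2048-byte token is rejected.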

When that happens, sunrpc_cache_pipe_upcall returns -EAGAIN, which
implies (according to Documentation/filesystems/nfs/rpc-cache.txt)
that the upcall is pending, even though in fact
sunrpc_cache_pipe_upcall has actually freed the buffer and never added
the call to the cache request queue.

The result is that all nfsd kernel processes continue to try to
process the request and check back on the request, continuously, for
30 seconds, trying to enqueue the upcall.
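
A toy model of the retry behaviour described above, for illustration only. This is not kernel code: sketch_queue, sketch_upcall() and sketch_retry() are hypothetical names, the success return value of 0 is an assumption, and -EMSGSIZE is just one possible choice of distinct error code. The point is that a caller that sees -EAGAIN cannot tell "queued, still pending" from "could never be queued".

```c
#include <errno.h>
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096 /* assumption: PAGE_SIZE == 4096 */

/* Hypothetical stand-in for the request queue read by the daemon. */
struct sketch_queue {
    int pending;
};

/*
 * Toy upcall. With distinct_error == 0 it mimics the reported behaviour:
 * an oversized token yields -EAGAIN although nothing was queued. With
 * distinct_error != 0 it returns -EMSGSIZE so the caller can give up.
 */
static int sketch_upcall(struct sketch_queue *q, size_t token_len, int distinct_error)
{
    if (2 * token_len >= SKETCH_PAGE_SIZE)
        return distinct_error ? -EMSGSIZE : -EAGAIN;
    q->pending++;
    return 0; /* assumption: 0 means the request was queued */
}

/* Retry while the upcall says "try again", for at most max_tries attempts. */
static int sketch_retry(struct sketch_queue *q, size_t token_len,
                        int distinct_error, int max_tries, int *tries)
{
    int rc = -EAGAIN;

    for (*tries = 0; *tries < max_tries && rc == -EAGAIN; (*tries)++)
        rc = sketch_upcall(q, token_len, distinct_error);
    return rc;
}
```

In this model a 2048-byte token burns through every allowed attempt without ever reaching the queue, whereas the variant with a distinct error code stops after the first attempt.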
--
To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
the body of a message to majordomo-***@public.gmane.org
More majordomo info at http://vger.kernel.org/majordomo-info.html

Andy Adamson

2010-08-04 15:28:35 UTC

Yes, this limitation has been known for a long time. We ran into this same issue using X.509 certs and spkm3. I imagine PKINIT will also hit this limitation.

-->Andy


Jim Rees

2010-08-04 15:45:33 UTC

Post by Jonathan Manton
When a user with a Kerberos token of 2048 bytes or larger attempts to
access a filesystem mounted using Kerberized NFS, the NFS server locks up
for 30 seconds, and ultimately the call fails.

Post by Andy Adamson
Yes, this limitation has been known for a long time. We ran into this same
issue using X.509 certs and spkm3. I imagine PKINIT will also hit this
limitation.

But shouldn't it fail right away instead of locking up for 30 seconds?

Does the entire server lock up, or just that one rpc?

Can a malicious client use this as a DOS? Does it require a valid ticket,
or will any ticket >= 2048 do?

Jonathan Manton

2010-08-04 16:22:55 UTC

Post by Jim Rees
But shouldn't it fail right away instead of locking up for 30 seconds?

It seems to me that it should error out with a log message, rather
than simply trying over and over again.

Post by Jim Rees
Does the entire server lock up, or just that one rpc?

The concrete manifestation of this is that all of the NFS kernel
processes run continuously. So on a single-processor system, it takes
100% of the CPU for those 30 seconds. On a multiprocessor system (at
least my RHEL system), the NFS kernel processes keep affinity with a
CPU, so it just consumes one processor. I have not tested if other
NFS requests can be processed during that window on a multiprocessor
system. It does not really "lock up", but rather monopolizes the CPU
with high-priority kernel threads.

Related to this, it was a real pain for me to debug, since setting any
of the rpcdebug flags in rpc simply overloaded the logging subsystem.
I had to put an ssleep() in svcauth_gss_handle_init() in order to get
debugging output I could use from rpcdebug.

Post by Jim Rees
Can a malicious client use this as a DOS?

Yes.

Post by Jim Rees
Does it require a valid ticket,
or will any ticket >= 2048 do?

I believe that all of the validity checking of the token is done by
rpc.svcgssd during the upcall, not in the sunrpc kernel code. I am a
kernel newbie though, so I am not sure.