crypto: testmgr - Validate output length in (de)compression tests
When self-testing (de)compression algorithms, make sure the actual size of
the (de)compressed output data matches the expected output size.
Otherwise, in case the actual output size would be smaller than the expected
output size, the subsequent buffer compare test would still succeed, and no
error would be reported.
Signed-off-by: Geert Uytterhoeven <Geert.Uytterhoeven@sonycom.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Ingo Molnar [Tue, 25 Nov 2008 15:19:24 +0000 (23:19 +0800)]
crypto: testmgr - Fix error flow of test_comp
This warning:
crypto/testmgr.c: In function ‘test_comp’:
crypto/testmgr.c:829: warning: ‘ret’ may be used uninitialized in this function
triggers because GCC correctly notices that in the ctcount == 0 &&
dtcount != 0 input condition case this function can return an undefined
value, if the second loop fails.
Remove the shadowed 'ret' variable from the second loop that was probably
unintended.
Signed-off-by: Ingo Molnar <mingo@elte.hu> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Evgeniy Polyakov [Mon, 24 Nov 2008 14:04:39 +0000 (22:04 +0800)]
crypto: hifn_795x - Fix queue management
Fix queue management. Change ring size and perform its check not
one after another descriptor, but using stored pointers to the last
checked descriptors.
Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
HIFN uses the transform context to store per-request data, which breaks
when more than one request is outstanding. Move per request members from
struct hifn_context to a new struct hifn_request_context and convert
the code to use this.
Signed-off-by: Patrick McHardy <kaber@trash.net> Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
The ANSI X9.31 PRNG docs aren't particularly clear on how to increment DT,
but empirical testing shows we're incrementing from the wrong end. A 10,000
iteration Monte Carlo RNG test currently winds up not getting the expected
result.
From http://csrc.nist.gov/groups/STM/cavp/documents/rng/RNGVS.pdf :
If I load up ansi_cprng w/dbg=1 though, it was fairly obvious what was
going wrong:
----8<----
getting 16 random bytes for context ffff810033fb2b10
Calling _get_more_prng_bytes for context ffff810033fb2b10
Input DT: 00000000: e6 b3 be 78 2a 23 fa 62 d7 1d 4a fb b0 e9 22 fc
Input I: 00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
Input V: 00000000: f0 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
tmp stage 0: 00000000: e6 b3 be 78 2a 23 fa 62 d7 1d 4a fb b0 e9 22 fc
tmp stage 1: 00000000: f4 8e cb 25 94 3e 8c 31 d6 14 cd 8a 23 f1 3f 84
tmp stage 2: 00000000: 8c 53 6f 73 a4 1a af d4 20 89 68 f4 58 64 f8 be
Returning new block for context ffff810033fb2b10
Output DT: 00000000: e7 b3 be 78 2a 23 fa 62 d7 1d 4a fb b0 e9 22 fc
Output I: 00000000: 04 8e cb 25 94 3e 8c 31 d6 14 cd 8a 23 f1 3f 84
Output V: 00000000: 48 89 3b 71 bc e4 00 b6 5e 21 ba 37 8a 0a d5 70
New Random Data: 00000000: 88 dd a4 56 30 24 23 e5 f6 9d a5 7e 7b 95 c7 3a
Calling _get_more_prng_bytes for context ffff810033fb2b10
Input DT: 00000000: e7 b3 be 78 2a 23 fa 62 d7 1d 4a fb b0 e9 22 fc
Input I: 00000000: 04 8e cb 25 94 3e 8c 31 d6 14 cd 8a 23 f1 3f 84
Input V: 00000000: 48 89 3b 71 bc e4 00 b6 5e 21 ba 37 8a 0a d5 70
tmp stage 0: 00000000: e7 b3 be 78 2a 23 fa 62 d7 1d 4a fb b0 e9 22 fc
tmp stage 1: 00000000: 80 6b 3a 8c 23 ae 8f 53 be 71 4c 16 fc 13 b2 ea
tmp stage 2: 00000000: 2a 4d e1 2a 0b 58 8e e6 36 b8 9c 0a 26 22 b8 30
Returning new block for context ffff810033fb2b10
Output DT: 00000000: e8 b3 be 78 2a 23 fa 62 d7 1d 4a fb b0 e9 22 fc
Output I: 00000000: c8 e2 01 fd 9f 4a 8f e5 e0 50 f6 21 76 19 67 9a
Output V: 00000000: ba 98 e3 75 c0 1b 81 8d 03 d6 f8 e2 0c c6 54 4b
New Random Data: 00000000: e2 af e0 d7 94 12 01 03 d6 e8 6a 2b 50 3b df aa
returning 16 from get_prng_bytes in context ffff810033fb2b10
----8<----
The expected result is there, in the first "New Random Data", but we're
incorrectly making a second call to _get_more_prng_bytes, due to some checks
that are slightly off, which resulted in our original bytes never being
returned anywhere.
One approach to fixing this would be to alter some byte_count checks in
get_prng_bytes, but it would mean the last DEFAULT_BLK_SZ bytes would be
copied a byte at a time, rather than in a single memcpy, so a slightly more
involved, equally functional, and ultimately more efficient way of fixing this
was suggested to me by Neil, which I'm submitting here. All of the RNGVS ANSI
X9.31 AES128 VST test vectors I've passed through ansi_cprng are now returning
the expected results with this change.
Signed-off-by: Jarod Wilson <jarod@redhat.com> Acked-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Herbert Xu [Fri, 7 Nov 2008 07:11:47 +0000 (15:11 +0800)]
libcrc32c: Move implementation to crypto crc32c
This patch swaps the role of libcrc32c and crc32c. Previously
the implementation was in libcrc32c and crc32c was a wrapper.
Now the code is in crc32c and libcrc32c just calls the crypto
layer.
The reason for the change is to tap into the algorithm selection
capability of the crypto API so that optimised implementations
such as the one utilising Intel's CRC32C instruction can be
used where available.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Herbert Xu [Fri, 7 Nov 2008 06:58:52 +0000 (14:58 +0800)]
crypto: crc32c - Test descriptor context format
This patch adds a test for the requirement that all crc32c algorithms
shall store the partial result in the first four bytes of the descriptor
context.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Herbert Xu [Sun, 31 Aug 2008 12:21:09 +0000 (22:21 +1000)]
crypto: hash - Export shash through hash
This patch allows shash algorithms to be used through the old hash
interface. This is a transitional measure so we can convert the
underlying algorithms to shash before converting the users across.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Herbert Xu [Thu, 6 Nov 2008 06:39:16 +0000 (14:39 +0800)]
crypto: api - Call type show function before legacy for proc
This patch makes /proc/crypto call the type-specific show function
if one is present before calling the legacy show functions for
cipher/digest/compress. This allows us to reuse the type values
for those legacy types. In particular, hash and digest will share
one type value while shash is phased in as the default hash type.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Herbert Xu [Sun, 2 Nov 2008 13:38:11 +0000 (21:38 +0800)]
crypto: hash - Add import/export interface
It is often useful to save the partial state of a hash function
so that it can be used as a base for two or more computations.
The most prominent example is HMAC where all hashes start from
a base determined by the key. Having an import/export interface
means that we only have to compute that base once rather than
for each message.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Herbert Xu [Sun, 31 Aug 2008 05:47:27 +0000 (15:47 +1000)]
crypto: hash - Add shash interface
The shash interface replaces the current synchronous hash interface.
It improves over hash in two ways. Firstly shash is reentrant,
meaning that the same tfm may be used by two threads simultaneously
as all hashing state is stored in a local descriptor.
The other enhancement is that shash no longer takes scatter list
entries. This is because shash is specifically designed for
synchronous algorithms and as such scatter lists are unnecessary.
All existing hash users will be converted to shash once the
algorithms have been completely converted.
There is also a new finup function that combines update with final.
This will be extended to ahash once the algorithm conversion is
done.
This is also the first time that an algorithm type has their own
registration function. Existing algorithm types will be converted
to this way in due course.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Herbert Xu [Sat, 20 Sep 2008 21:52:53 +0000 (06:52 +0900)]
crypto: api - Rebirth of crypto_alloc_tfm
This patch reintroduces a completely revamped crypto_alloc_tfm.
The biggest change is that we now take two crypto_type objects
when allocating a tfm, a frontend and a backend. In fact this
simply formalises what we've been doing behind the API's back.
For example, as it stands crypto_alloc_ahash may use an
actual ahash algorithm or a crypto_hash algorithm. Putting
this in the API allows us to do this much more cleanly.
The existing types will be converted across gradually.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Herbert Xu [Sun, 14 Sep 2008 01:19:03 +0000 (18:19 -0700)]
crypto: api - Move type exit function into crypto_tfm
The type exit function needs to undo any allocations done by the type
init function. However, the type init function may differ depending
on the upper-level type of the transform (e.g., a crypto_blkcipher
instantiated as a crypto_ablkcipher).
So we need to move the exit function out of the lower-level
structure and into crypto_tfm itself.
As it stands this is a no-op since nobody uses exit functions at
all. However, all cases where a lower-level type is instantiated
as a different upper-level type (such as blkcipher as ablkcipher)
will be converted such that they allocate the underlying transform
and use that instead of casting (e.g., crypto_ablkcipher casted
into crypto_blkcipher). That will need to use a different exit
function depending on the upper-level type.
This patch also allows the type init/exit functions to call (or not)
cra_init/cra_exit instead of always calling them from the top level.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Neil Horman [Wed, 5 Nov 2008 04:13:14 +0000 (12:13 +0800)]
crypto: ansi_cprng - Allow resetting of DT value
This is a patch that was sent to me by Jarod Wilson, marking off my
outstanding todo to allow the ansi cprng to set/reset the DT counter value in a
cprng instance. Currently crytpo_rng_reset accepts a seed byte array which is
interpreted by the ansi_cprng as a {V key} tuple. This patch extends that tuple
to now be {V key DT}, with DT an optional value during reset. This patch also
fixes a bug we noticed in which the offset of the key area of the seed is
started at DEFAULT_PRNG_KSZ rather than DEFAULT_BLK_SZ as it should be.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Jarod Wilson <jarod@redhat.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Herbert Xu [Sun, 31 Aug 2008 05:58:45 +0000 (15:58 +1000)]
crypto: padlock - Avoid resetting cword on successive operations
Resetting the control word is quite expensive. Fortunately this
isn't an issue for the common operations such as CBC and ECB as
the whole operation is done through a single call. However, modes
such as LRW and XTS have to call padlock over and over again for
one operation which really hurts if each call resets the control
word.
This patch uses an idea by Sebastian Siewior to store the last
control word used on a CPU and only reset the control word if
that changes.
Note that any task switch automatically resets the control word
so we only need to be accurate with regard to the stored control
word when no task switches occur.
Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In commit ec6644d6325b5a38525f1d5b20fd4bf7db05cf2a "crypto: talitos - Preempt
overflow interrupts", the test in atomic_inc_not_zero was interpreted by the
author to be applied after the increment operation (not before). This off-by-one
fix prevents overflow error interrupts from occurring when requests are frequent
and large enough to do so.
Signed-off-by: Vishnu Suresh <Vishnu@freescale.com> Signed-off-by: Kim Phillips <kim.phillips@freescale.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Harvey Harrison [Sun, 12 Oct 2008 12:40:12 +0000 (20:40 +0800)]
crypto: camellia - use kernel-provided bitops, unaligned access
Remove the private implementation of 32-bit rotation and unaligned
access with byteswapping.
As a bonus, fixes sparse warnings:
crypto/camellia.c:602:2: warning: cast to restricted __be32
crypto/camellia.c:603:2: warning: cast to restricted __be32
crypto/camellia.c:604:2: warning: cast to restricted __be32
crypto/camellia.c:605:2: warning: cast to restricted __be32
crypto/camellia.c:710:2: warning: cast to restricted __be32
crypto/camellia.c:711:2: warning: cast to restricted __be32
crypto/camellia.c:712:2: warning: cast to restricted __be32
crypto/camellia.c:713:2: warning: cast to restricted __be32
crypto/camellia.c:714:2: warning: cast to restricted __be32
crypto/camellia.c:715:2: warning: cast to restricted __be32
crypto/camellia.c:716:2: warning: cast to restricted __be32
crypto/camellia.c:717:2: warning: cast to restricted __be32
[Thanks to Tomoyuki Okazaki for spotting the typo] Tested-by: Carlo E. Prelz <fluido@fluido.as> Signed-off-by: Harvey Harrison <harvey.harrison@gmail.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Neil Horman [Sun, 12 Oct 2008 12:36:51 +0000 (20:36 +0800)]
crypto: testmgr - Trigger a panic when self test fails in FIPS mode
The FIPS specification requires that should self test for any supported
crypto algorithm fail during operation in fips mode, we need to prevent
the use of any crypto functionality until such time as the system can
be re-initialized. Seems like the best way to handle that would be
to panic the system if we were in fips mode and failed a self test.
This patch implements that functionality. I've built and run it
successfully.
Signed-off-by: Neil Horman <nhorman@tuxdriver.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Kim Phillips [Sun, 12 Oct 2008 12:33:14 +0000 (20:33 +0800)]
crypto: talitos - Perform auth check in h/w if on sec 2.1 and above
SEC version 2.1 and above adds the capability to do the IPSec ICV
memcmp in h/w. Results of the cmp are written back in the descriptor
header, along with the done status. A new callback is added that
checks these ICCR bits instead of performing the memcmp on the core,
and is enabled by h/w capability.
Signed-off-by: Kim Phillips <kim.phillips@freescale.com>
After testing on different parts, another condition was added
before using h/w auth check because different
SEC revisions require different handling.
The SEC 3.0 allows a more flexible link table where
the auth data can span separate link table entries.
The SEC 2.4/2.1 does not support this case.
So a test was added in the decrypt routine
for a fragmented case; the h/w auth check is disallowed for
revisions not having the extent in the link table;
in this case the hw auth check is done by software.
A portion of a previous change for SEC 3.0 link table handling
was removed since it became dead code with the hw auth check supported.
This seems to be the best compromise for using hw auth check
on supporting SEC revisions; it keeps the link table logic
simpler for the fragmented cases.
Signed-off-by: Lee Nipper <lee.nipper@freescale.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
In talitos_interrupt, upon one done interrupt, mask further done interrupts,
and ack only any error interrupt.
In talitos_done, unmask done interrupts after completing processing.
In flush_channel, ack each done channel processed.
Keep done overflow interrupts masked because even though each pkt
is ack'ed, a few done overflows still occur.
Signed-off-by: Lee Nipper <lee.nipper@freescale.com> Signed-off-by: Kim Phillips <kim.phillips@freescale.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Kim Phillips [Sun, 12 Oct 2008 12:19:35 +0000 (20:19 +0800)]
crypto: talitos - Pass correct interrupt status to error handler
Since we ack early, the re-read interrupt status in talitos_error
may be already updated with a new value. Pass the error ISR value
directly in order to report and handle the error based on the correct
error status.
Also remove unused error tasklet.
Signed-off-by: Kim Phillips <kim.phillips@freescale.com> Signed-off-by: Lee Nipper <lee.nipper@freescale.com> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
On Tue, Sep 23, 2008 at 08:06:32PM +0200, Dimitri Puzin (max@psycast.de) wrote:
> With this patch applied it still doesn't work as expected. The overflow
> messages are gone however syslog shows
> [ 120.924266] hifn0: abort: c: 0, s: 1, d: 0, r: 0.
> when doing cryptsetup luksFormat as in original e-mail. At this point
> cryptsetup hangs and can't be killed with -SIGKILL. I've attached
> SysRq-t dump of this condition.
Yes, I was wrong with the patch: HIFN does not support 64-bit addresses
afaics.
Attached patch should not allow HIFN to be registered on 64-bit arch, so
crypto layer will fallback to the software algorithms.
Signed-off-by: Evgeniy Polyakov <johnpol@2ka.mipt.ru> Signed-off-by: Herbert Xu <herbert@gondor.apana.org.au>
Aleksey Senin [Wed, 24 Dec 2008 18:16:37 +0000 (10:16 -0800)]
RDMA/addr: Add support for translating IPv6 addresses
Add support for translating AF_INET6 addresses to the IB address
translation service. This requires using struct sockaddr_storage
instead of struct sockaddr wherever an IPv6 address might be stored,
and adding cases to handle IPv6 in addition to IPv4 to the various
translation functions.
Signed-off-by: Aleksey Senin <aleksey@alst60.(none)> Signed-off-by: Roland Dreier <rolandd@cisco.com>
and are queued up for v2.6.29. This shows that the facility is still not
tested well enough to release into a stable kernel - disable it for now and
reactivate in .29. In .29 the hardware-branch-tracer will use the DS/BTS
facilities too - hopefully resulting in better code.
H. Peter Anvin [Tue, 23 Dec 2008 18:10:40 +0000 (10:10 -0800)]
x86: PAT: fix address types in track_pfn_vma_new()
Impact: cleanup, fix warning
This warning:
arch/x86/mm/pat.c: In function track_pfn_vma_copy:
arch/x86/mm/pat.c:701: warning: passing argument 5 of follow_phys from incompatible pointer type
Triggers because physical addresses are resource_size_t, not u64.
This really matters when calling an interface like follow_phys() which
takes a pointer to a physical address -- although on x86, being
littleendian, it would generally work anyway as long as the memory region
wasn't completely uninitialized.
Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
Lachlan McIlroy [Wed, 24 Dec 2008 03:07:32 +0000 (14:07 +1100)]
[XFS] Fix race in xfs_write() between direct and buffered I/O with DMAPI
The iolock is dropped and re-acquired around the call to XFS_SEND_NAMESP().
While the iolock is released the file can become cached. We then
'goto retry' and - if we are doing direct I/O - mapping->nrpages may now be
non zero but need_i_mutex will be zero and we will hit the WARN_ON().
Since we have dropped the I/O lock then the file size may have also changed
so what we need to do here is 'goto start' like we do for the XFS_SEND_DATA()
DMAPI event.
We also need to update the filesize before releasing the iolock so that
needs to be done before the XFS_SEND_NAMESP event. If we drop the iolock
before setting the filesize we could race with a truncate.
Reviewed-by: Christoph Hellwig <hch@infradead.org> Signed-off-by: Lachlan McIlroy <lachlan@sgi.com>
Kyle McMartin [Tue, 23 Dec 2008 13:44:30 +0000 (08:44 -0500)]
parisc: disable UP-optimized flush_tlb_mm
flush_tlb_mm's "optimized" uniprocessor case of allocating a new
context for userspace is exposing a race where we can suddely return
to a syscall with the protection id and space id out of sync, trapping
on the next userspace access.
Harry Ciao [Tue, 23 Dec 2008 21:57:16 +0000 (13:57 -0800)]
edac: fix edac core deadlock when removing a device
When deleting an edac device, we have to wait for its edac_dev.work to be
completed before deleting the whole edac_dev structure. Since we have no
idea which work in current edac_poller's workqueue is the work we are
conerned about, we wait for all work in the edac_poller's workqueue to be
proceseed. This is done via flush_cpu_workqueue() which inserts a
wq_barrier into the tail of the workqueue and then sleeping on the
completion of this wq_barrier. The edac_poller will wake up sleepers when
it is found.
EDAC core creates only one kernel worker thread, edac_poller, to run the
works of all current edac devices. They share the same callback function
of edac_device_workq_function(), which would grab the mutex of
device_ctls_mutex first before it checks the device. This is exactly
where edac_poller and rmmod would have a great chance to deadlock.
In below call trace of rmmod > ... >
edac_device_del_device >
edac_device_workq_teardown > flush_workqueue > flush_cpu_workqueue,
device_ctls_mutex would have already been grabbed by
edac_device_del_device(). So, on one hand rmmod would sleep on the
completion of a wq_barrier, holding device_ctls_mutex; on the other hand
edac_poller would be blocked on the same mutex when it's running any one
of works of existing edac evices(Note, this edac_dev.work is likely to be
totally irrelevant to the one that is being removed right now)and never
would have a chance to run the work of above wq_barrier to wake rmmod up.
edac_device_workq_teardown() should not be called within the critical
region of device_ctls_mutex. Just like is done in edac_pci_del_device()
and edac_mc_del_mc(), where edac_pci_workq_teardown() and
edac_mc_workq_teardown() are called after related mutex are released.
Moreover, an edac_dev.work should check first if it is being removed. If
this is the case, then it should bail out immediately. Since not all of
existing edac devices are to be removed, this "shutting flag" should be
contained to edac device being removed. The current edac_dev.op_state can
be used to serve this purpose.
The original deadlock problem and the solution have been witnessed and
tested on actual hardware. Without the solution, rmmod an edac driver
would result in below deadlock:
Evgeniy Polyakov [Tue, 23 Dec 2008 21:57:12 +0000 (13:57 -0800)]
w1: fix slave selection on big-endian systems
During test of the w1-gpio driver i found that in "w1.c:679
w1_slave_found()" the device id is converted to little-endian with
"cpu_to_le64()", but its not converted back to cpu format in "w1_io.c:293
w1_reset_select_slave()".
Based on a patch created by Andreas Hummel.
[akpm@linux-foundation.org: remove unneeded cast] Reported-by: Andreas Hummel <andi_hummel@gmx.de> Signed-off-by: Evgeniy Polyakov <zbr@ioremap.net> Signed-off-by: Andrew Morton <akpm@linux-foundation.org> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
x86: update add-cpu_mask_to_apicid_and to use struct cpumask*
broke interrupt delivery on x2apic platforms. As x2apic cluster mode uses
logical delivery mode, we need to use logical apicid instead of physical apicid
in x2apic_cpu_mask_to_apicid_and()
Impact: fixes the broken interrupt delivery issue on generic x2apic platforms.
Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com> Acked-by: Mike Travis <travis@sgi.com> Signed-off-by: H. Peter Anvin <hpa@zytor.com> Signed-off-by: Ingo Molnar <mingo@elte.hu>
> kernel/sched.c: In function 'schedule':
> kernel/sched.c:3679: warning: 'active_balance' may be used uninitialized in this function
>
> This warning is correct - the code is buggy.
In sched.c load_balance_newidle, there's real potential use of
uninitialised variable - fix it.
Yinghai Lu [Fri, 19 Dec 2008 23:23:44 +0000 (15:23 -0800)]
x86: fix lguest used_vectors breakage, -v2
Impact: fix lguest, clean up
32-bit lguest used used_vectors to record vectors, but that model of
allocating vectors changed and got broken, after we changed vector
allocation to a per_cpu array.
Try enable that for 64bit, and the array is used for all vectors that
are not managed by vector_irq per_cpu array.
Also kill system_vectors[], that is now a duplication of the
used_vectors bitmap.
[ merged in cpus4096 due to io_apic.c cpumask changes. ]
[ -v2, fix build failure ]
This patch extends the new upcall with a "service" field that currently
can have 2 values: "*" or "nfs". These values specify matching rules for
principals in the keytab file. The "*" means that gssd is allowed to use
"root", "nfs", or "host" keytab entries while the other option requires
"nfs".
Restricting gssd to use the "nfs" principal is needed for when the
server performs a callback to the client. The server in this case has
to authenticate itself as an "nfs" principal.
We also need "service" field to distiguish between two client-side cases
both currently using a uid of 0: the case of regular file access by the
root user, and the case of state-management calls (such as setclientid)
which should use a keytab for authentication. (And the upcall should
fail if an appropriate principal can't be found.)
Signed-off: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch extends the new upcall by adding a "target" field
communicating who we want to authenticate to (equivalently, the service
principal that we want to acquire a ticket for).
Signed-off: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch adds server-side support for callbacks other than AUTH_SYS.
Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
This patch adds client-side support to allow for callbacks other than
AUTH_SYS.
Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
rpc: pass target name down to rpc level on callbacks
The rpc client needs to know the principal that the setclientid was done
as, so it can tell gssd who to authenticate to.
Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Two principals are involved in krb5 authentication: the target, who we
authenticate *to* (normally the name of the server, like
nfs/server.citi.umich.edu@CITI.UMICH.EDU), and the source, we we
authenticate *as* (normally a user, like bfields@UMICH.EDU)
In the case of NFSv4 callbacks, the target of the callback should be the
source of the client's setclientid call, and the source should be the
nfs server's own principal.
Therefore we allow svcgssd to pass down the name of the principal that
just authenticated, so that on setclientid we can store that principal
name with the new client, to be used later on callbacks.
Signed-off-by: Olga Kornievskaia <aglo@citi.umich.edu> Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Implement the new upcall. We decide which version of the upcall gssd
will use (new or old), by creating both pipes (the new one named "gssd",
the old one named after the mechanism (e.g., "krb5")), and then waiting
to see which version gssd actually opens.
We don't permit pipes of the two different types to be opened at once.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
rpc: store pointer to pipe inode in gss upcall message
Keep a pointer to the inode that the message is queued on in the struct
gss_upcall_msg. This will be convenient, especially after we have a
choice of two pipes that an upcall could be queued on.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
rpc: use count of pipe openers to wait for first open
Introduce a global variable pipe_version which will eventually be used
to keep track of which version of the upcall gssd is using.
For now, though, it only keeps track of whether any pipe is open or not;
it is negative if not, zero if one is opened. We use this to wait for
the first gssd to open a pipe.
(Minor digression: note this waits only for the very first open of any
pipe, not for the first open of a pipe for a given auth; thus we still
need the RPC_PIPE_WAIT_FOR_OPEN behavior to wait for gssd to open new
pipes that pop up on subsequent mounts.)
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
I can't see any reason we need to call this until either the kernel or
the last gssd closes the pipe.
Also, this allows to guarantee that open_pipe and release_pipe are
called strictly in pairs; open_pipe on gssd's first open, release_pipe
on gssd's last close (or on the close of the kernel side of the pipe, if
that comes first).
That will make it very easy for the gss code to keep track of which
pipes gssd is using.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
We want to transition to a new gssd upcall which is text-based and more
easily extensible.
To simplify upgrades, as well as testing and debugging, it will help if
we can upgrade gssd (to a version which understands the new upcall)
without having to choose at boot (or module-load) time whether we want
the new or the old upcall.
We will do this by providing two different pipes: one named, as
currently, after the mechanism (normally "krb5"), and supporting the
old upcall. One named "gssd" and supporting the new upcall version.
We allow gssd to indicate which version it supports by its choice of
which pipe to open.
As we have no interest in supporting *simultaneous* use of both
versions, we'll forbid opening both pipes at the same time.
So, add a new pipe_open callback to the rpc_pipefs api, which the gss
code can use to track which pipes have been open, and to refuse opens of
incompatible pipes.
We only need this to be called on the first open of a given pipe.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
I want to add a little more code here, so it'll be convenient to have
this flatter.
Also, I'll want to add another error condition, so it'll be more
convenient to return -ENOMEM than NULL in the error case. The only
caller is already converting NULL to -ENOMEM anyway.
Signed-off-by: J. Bruce Fields <bfields@citi.umich.edu> Signed-off-by: Trond Myklebust <Trond.Myklebust@netapp.com>
Benny Halevy [Tue, 23 Dec 2008 21:06:13 +0000 (16:06 -0500)]
nfs: return compound hdr.status when there are no op replies
When there are no op replies encoded in the compound reply
hdr.status still contains the overall status of the compound
rpc. This can happen, e.g., when the server returns a
NFS4ERR_MINOR_VERS_MISMATCH error.