Which Linux security capabilities are your applications using? I recently developed a new tool, capable, to print out capability checks live:
# capable
TIME      UID    PID    COMM             CAP  NAME                 AUDIT
22:11:23  114    2676   snmpd            12   CAP_NET_ADMIN        1
22:11:23  0      6990   run              24   CAP_SYS_RESOURCE     1
22:11:23  0      7003   chmod            3    CAP_FOWNER           1
22:11:23  0      7003   chmod            4    CAP_FSETID           1
22:11:23  0      7005   chmod            4    CAP_FSETID           1
22:11:23  0      7005   chmod            4    CAP_FSETID           1
22:11:23  0      7006   chown            4    CAP_FSETID           1
22:11:23  0      7006   chown            4    CAP_FSETID           1
22:11:23  0      6990   setuidgid        6    CAP_SETGID           1
22:11:23  0      6990   setuidgid        6    CAP_SETGID           1
22:11:23  0      6990   setuidgid        7    CAP_SETUID           1
22:11:24  0      7013   run              24   CAP_SYS_RESOURCE     1
22:11:24  0      7026   chmod            3    CAP_FOWNER           1
[...]
capable uses bcc, a front-end and a collection of tools that use new Linux enhanced BPF tracing capabilities. capable works by using BPF with kprobes to dynamically trace the kernel cap_capable() function, and then uses a table to map the capability index to the name seen in the output. Here's the source code: it's pretty straightforward.
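The index-to-name step can be sketched in a few lines of Python. This is a minimal illustration rather than the tool's actual source: the table follows the capability numbering in the kernel's include/uapi/linux/capability.h, and cap_name() is a hypothetical helper name.

```python
# Capability numbers as defined in include/uapi/linux/capability.h.
CAP_NAMES = {
    0: "CAP_CHOWN",
    1: "CAP_DAC_OVERRIDE",
    2: "CAP_DAC_READ_SEARCH",
    3: "CAP_FOWNER",
    4: "CAP_FSETID",
    5: "CAP_KILL",
    6: "CAP_SETGID",
    7: "CAP_SETUID",
    8: "CAP_SETPCAP",
    9: "CAP_LINUX_IMMUTABLE",
    10: "CAP_NET_BIND_SERVICE",
    11: "CAP_NET_BROADCAST",
    12: "CAP_NET_ADMIN",
    13: "CAP_NET_RAW",
    14: "CAP_IPC_LOCK",
    15: "CAP_IPC_OWNER",
    16: "CAP_SYS_MODULE",
    17: "CAP_SYS_RAWIO",
    18: "CAP_SYS_CHROOT",
    19: "CAP_SYS_PTRACE",
    20: "CAP_SYS_PACCT",
    21: "CAP_SYS_ADMIN",
    22: "CAP_SYS_BOOT",
    23: "CAP_SYS_NICE",
    24: "CAP_SYS_RESOURCE",
    25: "CAP_SYS_TIME",
    26: "CAP_SYS_TTY_CONFIG",
    27: "CAP_MKNOD",
    28: "CAP_LEASE",
    29: "CAP_AUDIT_WRITE",
    30: "CAP_AUDIT_CONTROL",
    31: "CAP_SETFCAP",
    32: "CAP_MAC_OVERRIDE",
    33: "CAP_MAC_ADMIN",
    34: "CAP_SYSLOG",
    35: "CAP_WAKE_ALARM",
    36: "CAP_BLOCK_SUSPEND",
    37: "CAP_AUDIT_READ",
}


def cap_name(cap):
    """Return the symbolic name for a capability index, or '?' if unknown."""
    return CAP_NAMES.get(cap, "?")


if __name__ == "__main__":
    # A few of the indexes that appear in the output above.
    for cap in (12, 24, 3, 4):
        print("%-3d %s" % (cap, cap_name(cap)))
```

Falling back to "?" for unknown indexes means a kernel with newer capabilities than the table still produces a readable line.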
I wrote it because a colleague, Michael Wardrop, asked which security capabilities our applications were actually using. Given that list, we could use setcap(8) (or other software) to improve the security of applications by allowing only the capabilities they need.
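One way to get from the live output to such a list is to aggregate it per command. The sketch below assumes the seven-column layout shown above (TIME UID PID COMM CAP NAME AUDIT) and command names without spaces. summarize() and setcap_spec() are hypothetical helpers, and the "+ep" flags in the generated string are only one possible choice to pass to setcap(8).

```python
from collections import defaultdict


def summarize(lines):
    """Map each command name to the set of capability names it was checked for."""
    caps = defaultdict(set)
    for line in lines:
        fields = line.split()
        # Expect: TIME UID PID COMM CAP NAME AUDIT. Skip headers and other noise.
        if len(fields) != 7 or not fields[4].isdigit():
            continue
        comm, name = fields[3], fields[5]
        caps[comm].add(name)
    return dict(caps)


def setcap_spec(names):
    """Build a capability string in the textual form that setcap(8) accepts."""
    return ",".join(sorted(n.lower() for n in names)) + "+ep"


if __name__ == "__main__":
    sample = [
        "TIME      UID    PID    COMM             CAP  NAME                 AUDIT",
        "22:11:23  114    2676   snmpd            12   CAP_NET_ADMIN        1",
        "22:11:23  0      7003   chmod            3    CAP_FOWNER           1",
        "22:11:23  0      7003   chmod            4    CAP_FSETID           1",
    ]
    for comm, names in sorted(summarize(sample).items()):
        print("%-10s %s" % (comm, setcap_spec(names)))
```

Treat the result as a starting point: a short trace can miss rarely used code paths, so the list may be incomplete.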
Non-audit Checks
The cap_capable() function has an audit argument, which directs whether the capability check should write an audit message or not, if that's configured. By default, I only print capability checks where this is true, but capable can also trace all checks with the -v option:
# capable -h
usage: capable [-h] [-v] [-p PID]

Trace security capability checks

optional arguments:
  -h, --help         show this help message and exit
  -v, --verbose      include non-audit checks
  -p PID, --pid PID  trace this PID only

examples:
    ./capable             # trace capability checks
    ./capable -v          # verbose: include non-audit checks
    ./capable -p 181      # only trace PID 181
Here are some non-audit events:
# capable -v
TIME      UID    PID    COMM             CAP  NAME                 AUDIT
20:53:45  60004  22061  lsb_release      21   CAP_SYS_ADMIN        0
20:53:45  60004  22061  lsb_release      21   CAP_SYS_ADMIN        0
20:53:45  60004  22061  lsb_release      21   CAP_SYS_ADMIN        0
20:53:45  60004  22061  lsb_release      21   CAP_SYS_ADMIN        0
20:53:45  60004  22061  lsb_release      21   CAP_SYS_ADMIN        0
20:53:45  60004  22061  lsb_release      21   CAP_SYS_ADMIN        0
[...]
What are all those?
I'll start by showing the cap_capable() function prototype, from security/commoncap.c:
int cap_capable(const struct cred *cred, struct user_namespace *targ_ns,
int cap, int audit)
Now I can use bcc's trace program to inspect these calls (bear with me), given that cap will be arg3, and audit arg4 (from the above prototype):
# trace 'cap_capable "cap: %d, audit: %d", arg3, arg4'
TIME PID COMM FUNC -
20:56:18 25535 lsb_release cap_capable cap: 21, audit: 0
20:56:18 25535 lsb_release cap_capable cap: 21, audit: 0
20:56:18 25535 lsb_release cap_capable cap: 21, audit: 0
20:56:18 25535 lsb_release cap_capable cap: 21, audit: 0
20:56:18 25535 lsb_release cap_capable cap: 21, audit: 0
[...]
That one-liner is pretty similar to my capable program, except it lacks the "NAME" column with human-readable translations.
I'm really doing this so I can use trace's (newly added) -K and -U options, which print kernel and user-level stack traces. I'll just use -K:
# trace -K 'cap_capable "cap: %d, audit: %d", arg3, arg4'
TIME PID COMM FUNC -
[...]
20:59:58 30607 lsb_release cap_capable cap: 21, audit: 0
Kernel Stack Trace:
ffffffff813659d1 cap_capable
ffffffff813684bb security_vm_enough_memory_mm
ffffffff811deda6 expand_downwards
ffffffff811def64 expand_stack
ffffffff81234321 setup_arg_pages
ffffffff8128c10b load_elf_binary
ffffffff81234cee search_binary_handler
ffffffff8128b7ff load_script
ffffffff81234cee search_binary_handler
ffffffff8123635a do_execveat_common.isra.35
ffffffff812367da sys_execve
ffffffff81003bae do_syscall_64
ffffffff81861ca5 return_from_SYSCALL_64
20:59:58 30607 lsb_release cap_capable cap: 21, audit: 0
Kernel Stack Trace:
ffffffff813659d1 cap_capable
ffffffff813684bb security_vm_enough_memory_mm
ffffffff811df623 mmap_region
ffffffff811dff4b do_mmap
ffffffff811c122a vm_mmap_pgoff
ffffffff811c1295 vm_mmap
ffffffff8128bb93 elf_map
ffffffff8128c271 load_elf_binary
ffffffff81234cee search_binary_handler
ffffffff8128b7ff load_script
ffffffff81234cee search_binary_handler
ffffffff8123635a do_execveat_common.isra.35
ffffffff812367da sys_execve
ffffffff81003bae do_syscall_64
ffffffff81861ca5 return_from_SYSCALL_64
[...]
Awesome. So these are coming from security_vm_enough_memory_mm(). By reading the source, I see it's used to reserve some memory for root. It's not a hard failure if the capability is missing, and it's not really a security check, which is why the caller disables audit.
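When there are many of these stacks, it can help to count which function calls cap_capable() most often. The sketch below assumes the text layout of the trace -K output shown above (frames printed as "address symbol", one per line); count_callers() and is_hex() are hypothetical helpers.

```python
from collections import Counter


def is_hex(s):
    """Return True if s parses as a hexadecimal number."""
    try:
        int(s, 16)
        return True
    except ValueError:
        return False


def count_callers(lines, target="cap_capable"):
    """Count the frame printed directly below `target` in each stack."""
    callers = Counter()
    prev_symbol = None
    for line in lines:
        fields = line.split()
        # Stack frames look like: "<hex address> <symbol>".
        if len(fields) != 2 or not is_hex(fields[0]):
            prev_symbol = None  # not a frame, so any stack has ended
            continue
        if prev_symbol == target:
            callers[fields[1]] += 1
        prev_symbol = fields[1]
    return callers


if __name__ == "__main__":
    sample = [
        "20:59:58 30607 lsb_release cap_capable cap: 21, audit: 0",
        "Kernel Stack Trace:",
        "ffffffff813659d1 cap_capable",
        "ffffffff813684bb security_vm_enough_memory_mm",
        "ffffffff811deda6 expand_downwards",
    ]
    for symbol, count in count_callers(sample).most_common():
        print("%-6d %s" % (count, symbol))
```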
I should add a -K option to the capable tool, so it can print stack traces too.
Older Kernels
To use capable, you'll need a Linux 4.4 or newer kernel. To use the -K option, you'll need 4.6 or newer.
Here's a version using my older perf-tools collection, which uses ftrace and should work on much older kernels including the 3.x series:
# ./perf-tools/bin/kprobe -s 'p:cap_capable cap=%dx audit=%cx' 'audit != 0'
Tracing kprobe cap_capable. Ctrl-C to end.
run-4440 [003] d... 6417394.367486: cap_capable: (cap_capable+0x0/0x70) cap=0x18 audit=0x1
run-4440 [003] d... 6417394.367492:
=> ns_capable_common
=> capable
=> do_prlimit
=> SyS_setrlimit
=> entry_SYSCALL_64_fastpath
chmod-4453 [006] d... 6417394.399020: cap_capable: (cap_capable+0x0/0x70) cap=0x3 audit=0x1
chmod-4453 [006] d... 6417394.399027:
=> ns_capable_common
=> ns_capable
=> inode_owner_or_capable
=> inode_change_ok
=> xfs_setattr_nonsize
=> xfs_vn_setattr
=> notify_change
=> chmod_common
=> SyS_fchmodat
=> entry_SYSCALL_64_fastpath
chmod-4453 [006] d... 6417394.399035: cap_capable: (cap_capable+0x0/0x70) cap=0x4 audit=0x1
chmod-4453 [006] d... 6417394.399037:
=> ns_capable_common
=> capable_wrt_inode_uidgid
=> inode_change_ok
=> xfs_setattr_nonsize
=> xfs_vn_setattr
=> notify_change
=> chmod_common
=> SyS_fchmodat
=> entry_SYSCALL_64_fastpath
chmod-4455 [007] d... 6417394.402524: cap_capable: (cap_capable+0x0/0x70) cap=0x4 audit=0x1
chmod-4455 [007] d... 6417394.402529:
=> ns_capable_common
=> capable_wrt_inode_uidgid
=> inode_change_ok
=> xfs_setattr_nonsize
=> xfs_vn_setattr
=> notify_change
=> chmod_common
=> SyS_fchmodat
=> entry_SYSCALL_64_fastpath
[...]
It's a one-liner using my kprobe tool. It's also (currently) a bit harder to use, because I need to know which registers hold those arguments; the example above is for x86_64 only.
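The register lookup can be written down once. The sketch below assumes the x86_64 System V calling convention, where the first six integer arguments arrive in rdi, rsi, rdx, rcx, r8 and r9, and it uses the short register names seen in the one-liner above (%dx, %cx). kprobe_def() is a hypothetical helper, not part of perf-tools.

```python
# Registers for integer arguments 1..6 on x86_64 (System V ABI),
# in the short form used by the kprobe one-liner above.
X86_64_ARG_REGS = ["%di", "%si", "%dx", "%cx", "%r8", "%r9"]


def kprobe_def(func, args):
    """Build a kprobe definition string.

    `args` is a list of (label, argument number) pairs, with argument
    numbers starting at 1.
    """
    parts = ["p:%s" % func]
    for label, argnum in args:
        if not 1 <= argnum <= len(X86_64_ARG_REGS):
            raise ValueError("argument %d is not passed in a register" % argnum)
        parts.append("%s=%s" % (label, X86_64_ARG_REGS[argnum - 1]))
    return " ".join(parts)


if __name__ == "__main__":
    # cap is the third argument of cap_capable(), audit is the fourth.
    print(kprobe_def("cap_capable", [("cap", 3), ("audit", 4)]))
```

For cap_capable() this reproduces the definition used in the one-liner. Other architectures would need a different register table.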
That's all for now. Happy hacking.

