author | mperes <mperes@web> | 2013-09-23 14:04:42 -0700 |
---|---|---|
committer | xorg <iki-xorg@freedesktop.org> | 2013-09-23 14:04:42 -0700 |
commit | f409de2a0de2c164bca2de795f39dde2f958efc0 (patch) | |
tree | d90a73e53554ba848345768e0337839e98a71a1a /Events/XDC2013/XDC2013DavidHerrmannDRMSecurity | |
parent | 9d8a8a152951def04b547f4f0c846fb865be575d (diff) |
attachment upload
Diffstat (limited to 'Events/XDC2013/XDC2013DavidHerrmannDRMSecurity')
-rw-r--r-- | Events/XDC2013/XDC2013DavidHerrmannDRMSecurity/DRM_SECURITY | 437 |
1 files changed, 437 insertions, 0 deletions
diff --git a/Events/XDC2013/XDC2013DavidHerrmannDRMSecurity/DRM_SECURITY b/Events/XDC2013/XDC2013DavidHerrmannDRMSecurity/DRM_SECURITY
new file mode 100644
index 00000000..a411f3c0
--- /dev/null
+++ b/Events/XDC2013/XDC2013DavidHerrmannDRMSecurity/DRM_SECURITY
@@ -0,0 +1,437 @@
+                                 DRM Security
+                                ==============
+
+During the last few years, the number of users of the DRM API has increased
+significantly. Aside from the X-Server, different parts of the linux desktop
+stack use the DRM API directly. This includes Plymouth, Weston, Mir, kmscon and
+more.
+While the DRM and KMS APIs could mostly withstand the strain, the lack of a sole
+user-space DRM user showed several shortcomings in the design. We can no longer
+rely on X-Server or DDX fixes to work around kernel API deficiencies. We
+have to carefully take all the different DRM applications into account while
+changing or improving the DRM API.
+By opening /dev/dri/ to more applications than the X-Server, we also open it for
+spoofing attacks. In this talk I want to build on the results of last year's
+DRM2 talk (XDC-2012) and address the GEM-Flink, DRM-mmap() and DRM-Master
+related spoofing attacks. I developed several examples that reveal how easy it
+is to misuse these and will discuss the fixes that were introduced to DRM during
+the last year.
+
+0) Prerequisites
+================
+
+Name: David Herrmann
+Email: dh.herrmann@gmail.com
+Date: 2013/07/02
+
+The reader is expected to be familiar with the DRM API and its major concepts,
+including the following:
+  DRM-Master, GEM + TTM, Flink, dma-buf, DRM mmap, DRI1 and DRI2
+These concepts are used throughout the article and will not be explained in
+detail.
+
+Last year's talk on DRM2 is available at:
+  http://www.youtube.com/watch?v=4fRXNHAjMIY
+
+1) Current Situation
+====================
+
+Before we can discuss fixes for DRM deficiencies we must outline the current
+situation and supported use-cases.
+The kernel API must be backwards-compatible, so introducing new setups to fix
+old bugs is not acceptable. Instead, we must understand the current situation
+in its entirety and always preserve backwards compatibility.
+
+Not all bugs can be fixed retroactively, but large user-space modifications
+should be avoided so existing systems can benefit from these fixes.
+
+1.1) Setup
+----------
+
+In a typical DRM setup we have many different DRM users. A central role is
+taken by the graphics-server which can have multiple authenticated
+render-clients. This setup may exist many times in parallel, on a single seat or
+on independent seats. Apart from a server-client layout, we might also have
+independent offscreen DRM users.
+
+ - Graphics Servers: X-Server, Weston or other compositors provide a central
+   place for clients to display their window contents and take care of any
+   modesetting or compositing.
+   Multiple servers can be run on different seats in parallel. On a single seat,
+   only one server is active at a time; the others run in the background.
+ - Render-Clients: Graphics servers can allow clients to use the GPU to render
+   window contents. Clients have limited DRM access and cannot alter global GPU
+   state. They can share state with the server, but must retain control over
+   what is shared with whom.
+ - Offscreen-Clients: Offscreen clients are like Render-Clients but are not
+   associated with a graphics server. They require the GPU for offscreen use
+   like GPGPU or offscreen-rendering.
+
+1.2) Security
+-------------
+
+With many different applications accessing the GPU in parallel, we must provide
+definite DRM namespaces for each of them. While graphics servers are granted
+global DRM access, all DRM users must retain control over private objects. A
+graphics server should not be allowed to access a GPGPU client's buffers. And
+different render clients should not be able to see what each other is doing.
+But locking down object namespaces is not the ultimate solution as buffer
+sharing is one of the fundamental concepts of DRM.
+
+The DRM-Master, GEM-Flink, DRM-mmap() and dma-buf APIs are currently used to
+allow context separation and shared state. But they have several flaws that pose
+a security risk to current linux desktop systems.
+
+The known problems (in no particular order) are:
+ - gem-flink doesn't provide any private namespaces to applications and servers.
+   Instead, only one global namespace is provided per DRM node. Malicious
+   authenticated applications can attack other clients via brute-force
+   "name-guessing" of gem buffers.
+ - DRM mmap() does not provide any private namespaces to applications. Once a
+   buffer has a fake-offset available for mmap()-use, it will be global. A
+   malicious application can guess the offset and alter the buffer arbitrarily.
+ - drmModeGetFB() returns a gem-handle to the framebuffer's backing gem object.
+   This can be used by malicious applications to get access to the currently
+   active framebuffer and alter it arbitrarily.
+ - DRM-Master is limited to CAP_SYS_ADMIN. This requires applications to run as
+   root or use hackish workarounds. The complex design of compositors makes it
+   unlikely that they are bug-free so we should do our best to avoid running
+   them with root-privileges.
+ - DRM-Master management is left to the active graphics server. This allows
+   malicious applications to continuously ask for DRM-Master and intercept it
+   during VT-switches. This doesn't even require root-privileges!
+ - DRM-Master context separation cannot be controlled entirely from user-space.
+
+2) Attacks
+==========
+
+I looked for an attack scenario for each API deficiency and developed example
+programs to exploit it. While I limited the examples to a specific
+implementation (mostly Xorg), one must take into account that they are
+applicable to others as well.
+
+2.1) GEM-Flink
+--------------
+
+The GEM-Flink attack is very simple.
+We need a running X-Server and two clients that render on the GPU. Clients
+must be authenticated on the DRM node via the DRI API, which mostly means
+being in the "video" group.
+
+Client A (the target) renders window contents via the GPU, creates a GEM-Flink
+name for the buffer and passes it to the X-Server. This allows the X-Server to
+open the buffer and display it.
+Client B (the attacker) can guess the Flink name (brute force) and use the
+GEM_OPEN ioctl to open the same buffer, even though it wasn't supposed to get
+access. The buffer may thus leak private information or allow the attacker to
+alter the visual appearance of the target.
+
+The following pseudo-code shows how easy it is for Client B to get a GEM handle
+to the buffer of Client A:
+
+  Client A (target)                  | Client B (attacker)
+  -----------------------------------+-------------------------------
+  int fd;                              int fd, err;
+  uint32_t handle, name;               uint32_t name, handle;
+  struct drm_gem_flink pl;             struct drm_gem_open pl;
+
+  fd = open("/dev/dri/card0");         fd = open("/dev/dri/card0");
+
+  .. handle = GEM_OPEN_* ..
+  .. card specific ..
+
+  pl.handle = handle;
+  ioctl(fd, DRM_IOCTL_GEM_FLINK,
+        &pl);
+  name = pl.name;
+
+                                       for (name = 0; name < INT_MAX; ++name) {
+                                         pl.name = name;
+                                         err = ioctl(fd, DRM_IOCTL_GEM_OPEN,
+                                                     &pl);
+                                         if (!err)
+                                           break;
+                                       }
+
+                                       handle = pl.handle;
+
+With the quite low number of global Flink names in freshly booted systems, the
+brute-force attack has a very high success rate. The kernel uses the "IDR"
+system for name allocation and thus the flink-names are highly predictable.
+
+The attacker cannot tell what buffer they opened; however, they can easily open
+all buffers until they find what they need.
+
+  2.1.1) GEM-Flink Alternatives
+  -----------------------------
+
+  While limiting the lifetime of flink-names or requiring DRM-Master for
+  GEM_OPEN would reduce the attack surface, they break DRM API semantics.
+  No final fix for the GEM-Flink attack is known, but with dma-buf we have a
+  replacement which allows fine-grained access management via file-descriptors.
+
+  The flink API was designed around global names and it is very unlikely that it
+  will ever change. Use dma-buf!
+
+2.2) DRM-mmap()
+---------------
+
+The mmap() attack on DRM devices is based on fake DRM offsets. If a client
+wants to map a GPU buffer for CPU access, it requests an mmap() offset on the
+DRM node and uses this offset as argument to mmap() to map the buffer. The same
+scenario as for GEM-Flink (2.1) applies here. An attacker could easily guess the
+offset and map a buffer that they have no access to.
+
+  Client A (target)                  | Client B (attacker)
+  -----------------------------------+-------------------------------
+  int fd;                              int fd;
+  uint32_t dumb_handle, offset;        uint32_t off;
+  struct drm_mode_map_dumb pl;         void *mem;
+
+  fd = open("/dev/dri/card0");         fd = open("/dev/dri/card0");
+
+  ...
+  dumb_handle = ioctl(fd,
+        DRM_IOCTL_MODE_CREATE_DUMB,
+        ..args..);
+  ...
+
+  pl.handle = dumb_handle;
+  ioctl(fd, DRM_IOCTL_MODE_MAP_DUMB,
+        &pl);
+  offset = pl.offset;
+
+  mmap(0, ..len.., ..prot..,
+       ..flags.., fd, offset);
+
+                                       for (off = 0; off < INT_MAX; ++off) {
+                                         mem = mmap(0, ..len.., ..prot..,
+                                                    ..flags.., fd, off);
+                                         if (mem != MAP_FAILED)
+                                           break;
+                                       }
+
+In this example, Client B will end up with some buffer (not guaranteed to be
+the buffer of Client A) mapped at @mem. It can read from or write to it. An
+attacker can easily map all available buffers, which guarantees that the buffer
+of Client A is mapped.
+
+Internally, drm_mm is used for offset allocations. The algorithm is simple and
+can be mirrored by the attacker. Similar to GEM-Flink (2.1) a brute-force attack
+is likely to succeed.
+
+  2.2.1) DRM-mmap() Namespaces
+  ----------------------------
+
+  While the global mmap() offset namespace is part of the DRM API, no
+  application made use of it.
+  Hence, a simple fix is to bind mmap() access to the GEM-name. A patch-series
+  is pending on dri-devel which thus reduces the DRM-mmap() attack to a GEM
+  buffer attack (e.g., see GEM-Flink 2.1 and "Unified VMA Offset Manager" on
+  dri-devel).
+    VMA Offset Manager:
+      http://lists.freedesktop.org/archives/dri-devel/2013-July/042141.html
+    mmap() Access Management:
+      http://lists.freedesktop.org/archives/dri-devel/2013-July/041222.html
+
+  The idea is to restrict mmap() access to applications which own a handle to
+  the target buffer. Such an application is always allowed to create mmap
+  offsets itself, so it effectively owns the buffer and should be allowed mmap()
+  access.
+  If a client does not own a handle to the buffer, it must not get any access.
+
+  The VMA offset manager and access-management patches are likely to be included
+  in linux-3.12 and thus fix this security problem.
+
+2.3) drmModeGetFB()
+-------------------
+
+drmModeGetFB() is part of the DRM-KMS API and allows any DRM client to retrieve
+a GEM-handle for the currently active framebuffer on any CRTC. A simple attack
+requires a running X-Server with one or more allocated framebuffers. Any client
+with access to /dev/dri/ can now open the DRM node, retrieve a GEM handle for
+any CRTC via drmModeGetFB() and read/write it arbitrarily.
+
+  Client A (attacker)
+  ------------------------------------
+  int fd;
+  drmModeFB *fb;
+  uint32_t handle, id;
+
+  fd = open("/dev/dri/card0");
+
+  for (id = 0; id < INT_MAX; ++id) {
+    fb = drmModeGetFB(fd, id);
+    if (fb)
+      break;
+  }
+
+  handle = fb->handle;
+
+In this example, the attacker will own a handle for some existing framebuffer.
+If this is done for all IDs, an attacker can get access to the currently
+displayed framebuffer.
+Note that owning a handle implies owning the buffer, so arbitrary mmap() access
+is possible. No root rights or CAP_SYS_ADMIN is needed. No DRM authentication is
+needed.
+This can be used by any client who has access to /dev/dri/card0.
+
+While modesetting commands are limited to DRM-Master, drmModeGetFB() is supposed
+to be passive and thus globally accessible. Other globally accessible ioctls
+like CREATE_DUMB additionally allow denial-of-service attacks if clients consume
+all of GPU memory.
+
+  2.3.1) drmModeGetFB() Fix
+  -------------------------
+
+  Requiring DRM-Master for all these commands limits the attack surface to the
+  running graphics server (which would mostly mean the attack is useless).
+  However, this prevents background graphics servers from managing their
+  buffers. Especially during server shutdown, DRM-Master shouldn't be required
+  to free allocated buffers. So other fixes are preferred.
+
+  The most serious bug, the drmModeGetFB buffer leak, can, however, be fixed by
+  returning an invalid gem handle for clients without DRM-Master access. Patches
+  are pending on dri-devel, but require more thorough investigation.
+
+  The more appropriate fix is the introduction of DRM render nodes. This splits
+  DRM nodes between graphics servers and render clients and thus provides
+  fine-grained access management for KMS ioctls.
+    Render Nodes proposal:
+      http://lists.freedesktop.org/archives/dri-devel/2013-July/041222.html
+  An attacker would need access to /dev/dri/card0 (instead of /dev/dri/renderD0)
+  to perform this attack. However, this access will be restricted to
+  privileged compositors once render-nodes are established.
+
+2.4) DRM-Master and CAP_SYS_ADMIN
+---------------------------------
+
+Graphics servers are required to have DRM-Master privileges to perform
+modesetting or modify global GPU state. However, DRM-Master can only be acquired
+with CAP_SYS_ADMIN capabilities, which is roughly equivalent to root rights.
+
+No specific attack scenario is known, but any bug in a graphics server will
+essentially be an attack surface to gain root rights on a desktop system.
+The huge size of common graphics servers makes it likely that exploitable bugs
+exist. We should thus reduce the required capabilities for graphics servers.
+
+  2.4.1) Outsourcing DRM-Master
+  -----------------------------
+
+  CAP_SYS_ADMIN is only required to acquire DRM-Master. So Weston runs a small
+  helper process which is connected to the compositor via a pipe. During
+  VT-Switches, the helper acquires and drops DRM-Master accordingly and the
+  compositor no longer needs CAP_SYS_ADMIN to manage DRM devices. The attack
+  surface is thus reduced to a small helper.
+
+  If we extend this idea, we could easily move it into a central daemon which
+  takes care of DRM-Master management for all graphics servers. There is ongoing
+  work to reuse systemd-logind to manage DRM-Master and input devices: it drops
+  DRM-Master and mutes the devices while a graphics server is inactive and
+  re-enables them during wakeup.
+
+  A description of this proposal can be found at:
+    http://dvdhrm.wordpress.com/2013/07/08/thoughts-on-linux-system-compositors/
+  Development is still ongoing and a first prototype is expected for GUADEC 2013
+  in Brno.
+
+2.5) DRM-Master Management
+--------------------------
+
+DRM-Master is a concept to separate multiple graphics-servers from each other. A
+different DRM-Master context is assigned to each graphics server and for all
+contexts, only a single DRM user can be "DRM-Master". So while multiple contexts
+might exist, only a single context is considered active: the context the current
+DRM-Master is assigned to.
+
+A graphics server can call drmDropMaster() to drop DRM-Master and drmSetMaster()
+to gain DRM-Master. drmSetMaster() fails if the current user is not a
+DRM-Master. A fundamental flaw is that both calls do not take any context as
+argument. Moreover, user-space has no chance to find out which context a user is
+assigned to.
+
+During open() on a DRM node, DRM core will assign the new user to the currently
+active context (i.e., the context of the current DRM-Master). If no context is
+active, a new context is created. While this allows minor control over
+context-creation and assignment, it does not allow assigning clients or servers
+to a specific context. So if a new session X-Server is started while the current
+X-Server is still active, both will get assigned to the same context. On the
+other hand, if a client is started while the corresponding graphics-server is
+inactive, the client will get assigned to a different context than the server
+(which breaks DRI among other things).
+
+While the security implications might be subtle, this concept allows major
+denial of service attacks if clients get assigned to wrong contexts. A main
+problem is that a DRM user cannot detect this so it has no way to verify that it
+is assigned to the correct context. It may allocate buffers on a context which
+is actually the context of an attacker's graphics server, not the context of the
+target server.
+
+While this provides a huge surface for attackers, one might argue that it
+requires CAP_SYS_ADMIN so we can mostly ignore it. However, that is not true!
+During open(), if no context is active, a new context is created and
+is automatically assigned DRM-Master. This allows any user with access to
+/dev/dri/ to become DRM-Master! Of course, one cannot drop and re-acquire it as
+drmSetMaster() is protected. Nevertheless, unprivileged applications can use
+this for a denial-of-service attack by hijacking DRM-Master during VT-switches.
+Moreover, it can be used to display arbitrary content on the screen and simulate
+login-screens or more.
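The open() behaviour described above can be sketched as a small toy model. This is a plain C simulation under the assumptions stated in the text (open() joins the active context, or creates a new one and receives DRM-Master if none is active); it is not kernel or libdrm code, and all names in it (toy_device, toy_file, toy_open, toy_drop_master) are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model only: hypothetical names, not the kernel's data structures. */
struct toy_device {
    int active_ctx;  /* context of the current DRM-Master, 0 if none */
    int next_ctx;    /* next unused context id, expected to start at 1 */
};

struct toy_file {
    int ctx;         /* context this opener was assigned to */
    bool is_master;  /* true if open() handed out DRM-Master */
};

/* Model of open(): join the active context or create a new one. */
struct toy_file toy_open(struct toy_device *dev)
{
    struct toy_file f;

    assert(dev != NULL);
    if (dev->active_ctx != 0) {
        /* A master is active: the new user lands in its context. */
        f.ctx = dev->active_ctx;
        f.is_master = false;
    } else {
        /* No active context: create one and grant DRM-Master. */
        f.ctx = dev->next_ctx++;
        f.is_master = true;
        dev->active_ctx = f.ctx;
    }
    return f;
}

/* Model of dropping DRM-Master: afterwards no context is active. */
void toy_drop_master(struct toy_device *dev, struct toy_file *f)
{
    assert(dev != NULL && f != NULL);
    if (f->is_master) {
        f->is_master = false;
        dev->active_ctx = 0;
    }
}
```

In this model, with a device initialised as { 0, 1 }, the first toy_open() makes the server master of context 1 and a second toy_open() joins context 1 without master rights. Once the server calls toy_drop_master() (as it would during a VT switch), the next toy_open() by any process creates context 2 and is master of it, which is the window the attack relies on.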
+
+  Client A (attacker)
+  ------------------------------------
+  int fd;
+
+  fd = -1;
+  do {
+    close(fd);
+    fd = open("/dev/dri/card0");
+  } while (drmAuthMagic(fd, 0) == -EACCES);
+
+In this example an attacker opens a DRI device and uses a dummy drmAuthMagic()
+call to test whether it is DRM-Master. drmAuthMagic() returns -EACCES if the
+caller is not DRM-Master, otherwise -EINVAL is returned as 0 is an invalid
+DRM-Magic number.
+If this attacker runs during a VT switch, chances are high that it becomes
+DRM-Master without having the CAP_SYS_ADMIN capability. Arbitrary modesetting
+commands can be issued afterwards.
+
+  2.5.1) Static DRM-Master Contexts
+  ---------------------------------
+
+  The current API design should make it pretty clear that multiple DRM-Master
+  contexts cannot be used properly. In fact, there is no application known to me
+  which profits from or makes use of multiple contexts. Instead, DRM contexts
+  were reduced to a minimum and today manage no more than DRM-Master assignment.
+  So we can easily create a single static context during DRM device creation and
+  assign each user to it.
+  This prevents any situation where clients are assigned to wrong contexts. All
+  users will now share the same context. This indirectly fixes the DRM-Master
+  hijacking problem as new users will never be able to become DRM-Master by
+  opening /dev/dri/card0 as the static context will always be active.
+
+  With render-nodes we allow offscreen clients, anyway. Hence, we don't have to
+  limit DRM authentication to the currently active master but can additionally
+  allow background clients to be authenticated and make use of the DRM device.
+
+  2.5.2) Centralized DRM-Master Management
+  ----------------------------------------
+
+  By moving drmSetMaster() and drmDropMaster() calls to a central daemon (like
+  systemd-logind), we can provide a central place for DRM-Master management.
+  Hijacking will be limited to CAP_SYS_ADMIN and can be detected via error codes
+  on drmSetMaster(). At the same time, clients can use Render-Nodes instead of
+  authenticating via drmAuthMagic(). The concept of DRM-Master is thus reduced
+  to managing exclusive hardware access.
+
+3) Final Notes
+==============
+
+Most of the spoofing attacks are based on the fact that all DRM users share the
+same DRM node. Linux provides many advanced access-management facilities that we
+could make use of. However, they can require huge changes to user-space. The big
+DRM legacy makes it almost impossible to guarantee that no old UMS driver
+breaks, so we cannot drop backwards-compatibility. But at the same time, this
+shouldn't hold us back. New concepts that fix all known issues are already
+available and wait for wider adoption. By keeping /dev/dri/card<num> as it is,
+we can always guarantee backwards-compatibility but limit new users to a sane
+and safe API.
+
+During last year's talk, most of these issues were still unfixed. However, a lot
+has happened and nearly all fixes and new facilities are either already upstream
+or pending on dri-devel and waiting for adoption. The emergence of so many
+different DRM applications should motivate us to finally move forward.