path: root/Events/XDC2013/XDC2013DavidHerrmannDRMSecurity
authormperes <mperes@web>2013-09-23 14:04:42 -0700
committerxorg <iki-xorg@freedesktop.org>2013-09-23 14:04:42 -0700
commitf409de2a0de2c164bca2de795f39dde2f958efc0 (patch)
treed90a73e53554ba848345768e0337839e98a71a1a /Events/XDC2013/XDC2013DavidHerrmannDRMSecurity
parent9d8a8a152951def04b547f4f0c846fb865be575d (diff)
attachment upload
Diffstat (limited to 'Events/XDC2013/XDC2013DavidHerrmannDRMSecurity')
-rw-r--r--Events/XDC2013/XDC2013DavidHerrmannDRMSecurity/DRM_SECURITY437
1 files changed, 437 insertions, 0 deletions
diff --git a/Events/XDC2013/XDC2013DavidHerrmannDRMSecurity/DRM_SECURITY b/Events/XDC2013/XDC2013DavidHerrmannDRMSecurity/DRM_SECURITY
new file mode 100644
index 00000000..a411f3c0
--- /dev/null
+++ b/Events/XDC2013/XDC2013DavidHerrmannDRMSecurity/DRM_SECURITY
@@ -0,0 +1,437 @@
+ DRM Security
+ ==============
+
+During the last few years, the number of DRM API users has increased
+significantly. Aside from the X-Server, different parts of the Linux desktop
+stack use the DRM API directly. This includes Plymouth, Weston, Mir, kmscon and
+more.
+While the DRM and KMS APIs could mostly withstand the strain, no longer having
+a single user-space DRM user exposed several shortcomings in the design. We can
+no longer rely on X-Server or DDX fixes to work around kernel API deficiencies.
+We have to carefully take all the different DRM applications into account while
+changing or improving the DRM API.
+By opening /dev/dri/ to more applications than the X-Server, we also open it to
+spoofing attacks. In this talk I want to build on the results of last year's
+DRM2 talk (XDC-2012) and address the GEM-Flink, DRM-mmap() and DRM-Master
+related spoofing attacks. I developed several examples that reveal how easy it
+is to misuse these and will discuss the fixes that were introduced to DRM during
+the last year.
+
+0) Prerequisites
+================
+
+Name: David Herrmann
+Email: dh.herrmann@gmail.com
+Date: 2013/07/02
+
+The reader is expected to be familiar with the DRM API and its major concepts,
+including the following:
+ DRM-Master, GEM + TTM, Flink, dma-buf, DRM mmap, DRI1 and DRI2
+These concepts are used throughout the article and will not be explained in
+detail.
+
+Last year's talk on DRM2 is available at:
+ http://www.youtube.com/watch?v=4fRXNHAjMIY
+
+1) Current Situation
+====================
+
+Before we can discuss fixes for DRM deficiencies we must outline the current
+situation and supported use-cases. The kernel API must be backwards-compatible,
+so introducing new setups to fix old bugs is not acceptable. Instead, we must
+understand the current situation in its entirety and always preserve backwards
+compatibility.
+
+Not all bugs can be fixed retroactively, but large user-space modifications
+should be avoided so existing systems can benefit from these fixes.
+
+1.1) Setup
+----------
+
+In a typical DRM setup we have many different DRM users. A central role is
+taken by the graphics-server which can have multiple authenticated
+render-clients. This setup may exist many times in parallel, on a single seat or
+on independent seats. Apart from a server-client layout, we might also have
+independent offscreen DRM users.
+
+ - Graphics Servers: X-Server, Weston or other compositors provide a central
+ place for clients to display their window contents and take care of any
+ modesetting or compositing.
+ Multiple servers can be run on different seats in parallel. On a single seat,
+ only one server is active at a time; the others run in the background.
+ - Render-Clients: Graphics servers can allow clients to use the GPU to render
+ window contents. Clients have limited DRM access and cannot alter global GPU
+ state. They can share state with the server, but must retain control over
+ what is shared with whom.
+ - Offscreen-Clients: Offscreen clients are like Render-Clients but are not
+ associated with a graphics server. They require the GPU for offscreen use
+ like GPGPU or offscreen-rendering.
+
+1.2) Security
+-------------
+
+With many different applications accessing the GPU in parallel, we must provide
+definite DRM namespaces for each of them. While graphics servers are granted
+global DRM access, all DRM users must retain control over private objects. A
+graphics server should not be allowed to access a GPGPU client's buffers. And
+different render clients should not be able to see what the others are doing. But
+locking down object namespaces is not the ultimate solution as buffer sharing is
+one of the fundamental concepts of DRM.
+
+The DRM-Master, GEM-Flink, DRM-mmap() and dma-buf APIs are currently used to
+allow context separation and shared state. But they have several flaws that pose
+a security risk to current Linux desktop systems.
+
+The known problems (in no particular order) are:
+ - gem-flink doesn't provide any private namespaces to applications and servers.
+ Instead, only one global namespace is provided per DRM node. Malicious
+ authenticated applications can attack other clients via brute-force
+ "name-guessing" of gem buffers.
+ - DRM mmap() does not provide any private namespaces to applications. Once a
+ buffer has a fake-offset available for mmap()-use, it will be global. A
+ malicious application can guess the offset and alter the buffer arbitrarily.
+ - drmModeGetFB() returns a gem-handle to the framebuffer's backing gem object.
+ This can be used by malicious applications to get access to the currently
+ active framebuffer and alter it arbitrarily.
+ - DRM-Master is limited to CAP_SYS_ADMIN. This requires applications to run as
+ root or use hackish workarounds. The complex design of compositors makes it
+ unlikely that they are bug-free so we should do our best to avoid running
+ them with root-privileges.
+ - DRM-Master management is left to the active graphics server. This allows
+ malicious applications to continuously ask for DRM-Master and intercept it
+ during VT-switches. This doesn't even require root-privileges!
+ - DRM-Master context separation cannot be controlled entirely from user-space.
+
+2) Attacks
+==========
+
+I looked for an attack scenario for each API deficiency and developed example
+programs to exploit it. While I limited the examples to a specific
+implementation (mostly Xorg), one must take into account that they are
+applicable to others as well.
+
+2.1) GEM-Flink
+--------------
+
+The GEM-Flink attack is very simple. We need a running X-Server and two clients
+that render on the GPU. Clients must be authenticated on the DRM node via the
+DRI API, which mostly means being in the "video" group.
+
+Client A (the target) renders window contents via the GPU, creates a GEM-Flink
+name for the buffer and passes it to the X-Server. This allows the X-Server to
+open the buffer and display it.
+Client B (the attacker) can guess the Flink name (brute force) and use the
+GEM_OPEN ioctl to open the same buffer, even though it wasn't supposed to get
+access. The buffer may thus leak private information or allow the attacker to
+alter the visual appearance of the target.
+
+The following pseudo-code shows how easy it is for Client B to get a GEM handle
+to the buffer of Client A:
+
+    Client A (target)                     | Client B (attacker)
+    --------------------------------------+-------------------------------
+    int fd;                                 int fd, err;
+    uint32_t handle, name;                  uint32_t name, handle;
+    struct drm_gem_flink pl;                struct drm_gem_open pl;
+
+    fd = open("/dev/dri/card0", O_RDWR);    fd = open("/dev/dri/card0", O_RDWR);
+
+    .. handle = GEM_OPEN_* ..
+    .. card specific ..
+
+    pl.handle = handle;
+    ioctl(fd, DRM_IOCTL_GEM_FLINK,
+          &pl);
+    name = pl.name;
+
+                                            for (name = 0; name < INT_MAX; ++name) {
+                                                pl.name = name;
+                                                err = ioctl(fd, DRM_IOCTL_GEM_OPEN,
+                                                            &pl);
+                                                if (!err)
+                                                    break;
+                                            }
+
+                                            handle = pl.handle;
+
+With the quite low number of global Flink names in freshly booted systems, the
+brute-force attack has a very high success rate. The kernel uses the "IDR" system
+for name allocation and thus the flink-names are highly predictable.
+
+The attacker cannot tell which buffer they opened; however, they can easily open
+all buffers until they find what they need.
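+
+The predictability of a lowest-free-ID allocator can be illustrated with a small
+user-space model. This is only a sketch, not kernel code: toy_names, toy_flink(),
+toy_open() and toy_guess() are hypothetical names, and the "lowest free name,
+starting at 1" policy is an assumption standing in for the real IDR allocator.
+
```c
#include <assert.h>
#include <stdint.h>

#define TOY_MAX_NAMES 64

/* Hypothetical stand-in for the single global flink namespace of one
 * DRM node: owner[name] is the client that published it, 0 means free. */
struct toy_names {
    int owner[TOY_MAX_NAMES + 1];
};

/* Model of GEM_FLINK: hand out the lowest free name, starting at 1
 * (assumed policy). Returns 0 if the table is full. */
static uint32_t toy_flink(struct toy_names *ns, int client)
{
    for (uint32_t name = 1; name <= TOY_MAX_NAMES; ++name) {
        if (ns->owner[name] == 0) {
            ns->owner[name] = client;
            return name;
        }
    }
    return 0;
}

/* Model of GEM_OPEN: succeeds for ANY caller as long as the name exists.
 * Returns the owning client, or 0 if the name is unused or out of range. */
static int toy_open(const struct toy_names *ns, uint32_t name)
{
    if (name == 0 || name > TOY_MAX_NAMES)
        return 0;
    return ns->owner[name];
}

/* Attacker: walk the names upward until a buffer of `victim` opens.
 * Stores the number of attempts in *tries; returns the name or 0. */
static uint32_t toy_guess(const struct toy_names *ns, int victim,
                          unsigned *tries)
{
    *tries = 0;
    for (uint32_t name = 1; name <= TOY_MAX_NAMES; ++name) {
        ++*tries;
        if (toy_open(ns, name) == victim)
            return name;
    }
    return 0;
}
```
+
+In this model the names come out as 1, 2, 3, ..., so the attacker never needs
+more attempts than the highest name currently in use.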
+
+ 2.1.1) GEM-Flink Alternatives
+ -----------------------------
+
+ While limiting the lifetime of flink-names or requiring DRM-Master for
+ GEM_OPEN would reduce the attack surface, they break DRM API semantics. No
+ final fix for the GEM-Flink attack is known, but with dma-buf we have a
+ replacement which allows fine-grained access management via file-descriptors.
+
+ The flink API was designed around global names and it is very unlikely that it
+ will ever change. Use dma-buf!
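+
+ A dma-buf is represented by a file descriptor, so a buffer is shared only with
+ the peer the descriptor is explicitly sent to, typically over a Unix domain
+ socket with SCM_RIGHTS. The following is a minimal sketch of that generic POSIX
+ mechanism only. send_fd() and recv_fd() are hypothetical helper names, and the
+ descriptor being passed is assumed to have been exported by the driver
+ beforehand (for example via libdrm's drmPrimeHandleToFD()).
+
```c
#include <assert.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Control-message buffer sized and aligned for exactly one descriptor. */
union fd_cmsg {
    struct cmsghdr align;
    char buf[CMSG_SPACE(sizeof(int))];
};

/* Send one file descriptor over a connected AF_UNIX socket.
 * Returns 0 on success, -1 on error. */
static int send_fd(int sock, int fd)
{
    char byte = 'F';    /* one byte of ordinary data carries the cmsg */
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_cmsg u;
    struct msghdr msg;
    struct cmsghdr *cmsg;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    cmsg = CMSG_FIRSTHDR(&msg);
    cmsg->cmsg_level = SOL_SOCKET;
    cmsg->cmsg_type = SCM_RIGHTS;
    cmsg->cmsg_len = CMSG_LEN(sizeof(int));
    memcpy(CMSG_DATA(cmsg), &fd, sizeof(int));

    return sendmsg(sock, &msg, 0) == 1 ? 0 : -1;
}

/* Receive one file descriptor. Returns the new descriptor or -1 on error. */
static int recv_fd(int sock)
{
    char byte;
    struct iovec iov = { .iov_base = &byte, .iov_len = 1 };
    union fd_cmsg u;
    struct msghdr msg;
    struct cmsghdr *cmsg;
    int fd;

    memset(&u, 0, sizeof(u));
    memset(&msg, 0, sizeof(msg));
    msg.msg_iov = &iov;
    msg.msg_iovlen = 1;
    msg.msg_control = u.buf;
    msg.msg_controllen = sizeof(u.buf);

    if (recvmsg(sock, &msg, 0) != 1)
        return -1;

    cmsg = CMSG_FIRSTHDR(&msg);
    if (!cmsg || cmsg->cmsg_level != SOL_SOCKET ||
        cmsg->cmsg_type != SCM_RIGHTS ||
        cmsg->cmsg_len != CMSG_LEN(sizeof(int)))
        return -1;

    memcpy(&fd, CMSG_DATA(cmsg), sizeof(int));
    return fd;
}
```
+
+ Unlike a flink name, the received descriptor cannot be guessed by a third
+ party: only a process that was handed the descriptor can import the buffer.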
+
+2.2) DRM-mmap()
+---------------
+
+The mmap() attack on DRM devices is based on fake DRM offsets. If a client
+wants to map a GPU buffer for CPU access, it requests an mmap() offset on the
+DRM node and uses this offset as argument to mmap() to map the buffer. The same
+scenario as for GEM-Flink (2.1) applies here. An attacker could easily guess the
+offset and map a buffer that they have no access to.
+
+    Client A (target)                     | Client B (attacker)
+    --------------------------------------+-------------------------------
+    int fd;                                 int fd;
+    uint32_t dumb_handle;                   uint64_t off;
+    uint64_t offset;                        void *mem;
+    struct drm_mode_create_dumb args;
+    struct drm_mode_map_dumb pl;
+
+    fd = open("/dev/dri/card0", O_RDWR);    fd = open("/dev/dri/card0", O_RDWR);
+
+    ...
+    ioctl(fd, DRM_IOCTL_MODE_CREATE_DUMB,
+          &args);
+    dumb_handle = args.handle;
+    ...
+
+    pl.handle = dumb_handle;
+    ioctl(fd, DRM_IOCTL_MODE_MAP_DUMB,
+          &pl);
+    offset = pl.offset;
+
+    mmap(0, ..len.., ..prot..,
+         ..flags.., fd, offset);
+
+                                            /* mmap() offsets must be page-aligned */
+                                            for (off = 0; off < INT64_MAX; off += PAGE_SIZE) {
+                                                mem = mmap(0, ..len.., ..prot..,
+                                                           ..flags.., fd, off);
+                                                if (mem != MAP_FAILED)
+                                                    break;
+                                            }
+
+In this example, Client B will end up with some buffer (not guaranteed to be
+the buffer of Client A) mapped at @mem. It can read from or write to it. An
+attacker can easily map all available buffers, which guarantees that the buffer
+of Client A is mapped.
+
+Internally, drm_mm is used for offset allocations. The algorithm is simple and
+can be mirrored by the attacker. Similar to GEM-Flink (2.1) a brute-force attack
+is likely to succeed.
+
+ 2.2.1) DRM-mmap() Namespaces
+ ----------------------------
+
+ While the global mmap() offset namespace is part of the DRM API, no
+ application made use of this. Hence, a simple fix is to bind mmap() access
+ to the GEM-name. A patch-series is pending on dri-devel which thus reduces the
+ DRM-mmap() attack to a GEM buffer attack (e.g., see GEM-Flink 2.1 and "Unified
+ VMA Offset Manager" on dri-devel).
+ VMA Offset Manager:
+ http://lists.freedesktop.org/archives/dri-devel/2013-July/042141.html
+ mmap() Access Management:
+ http://lists.freedesktop.org/archives/dri-devel/2013-July/041222.html
+
+ The idea is to restrict mmap() access to applications which own a handle to
+ the target buffer. An application that owns a handle is always allowed to
+ create mmap offsets itself, so it effectively owns the buffer and should be
+ allowed mmap() access.
+ If a client does not own a handle to the buffer, it must not get any access.
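+
+ The access rule can be sketched as a small user-space model. It is not the
+ code of the pending patches: toy_obj, toy_grant(), toy_revoke() and
+ toy_may_mmap() are hypothetical names, and identifying open files by small
+ integers is an assumption made purely for illustration.
+
```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_MAX_FILES 8

/* Hypothetical buffer object: its fake mmap offset plus the set of open
 * files (small integer ids) that currently hold a handle to it. */
struct toy_obj {
    uint64_t offset;
    bool has_handle[TOY_MAX_FILES];
};

static bool toy_valid_file(int file)
{
    return file >= 0 && file < TOY_MAX_FILES;
}

/* Called whenever `file` gains a handle (creation, open by name, import). */
static void toy_grant(struct toy_obj *obj, int file)
{
    if (toy_valid_file(file))
        obj->has_handle[file] = true;
}

/* Called when `file` closes its last handle to the object. */
static void toy_revoke(struct toy_obj *obj, int file)
{
    if (toy_valid_file(file))
        obj->has_handle[file] = false;
}

/* mmap() check: the offset must match AND the caller must hold a handle.
 * Knowing (or guessing) the offset alone is no longer sufficient. */
static bool toy_may_mmap(const struct toy_obj *obj, int file, uint64_t offset)
{
    if (!toy_valid_file(file))
        return false;
    return offset == obj->offset && obj->has_handle[file];
}
```
+
+ In this model an attacker that guesses the right offset is still rejected
+ until it legitimately obtains a handle to the buffer.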
+
+ The VMA offset manager and access-management patches are likely to be included
+ in linux-3.12 and thus fix this security problem.
+
+2.3) drmModeGetFB()
+-------------------
+
+drmModeGetFB() is part of the DRM-KMS API and allows any DRM client to retrieve
+a GEM-handle for the currently active framebuffer on any CRTC. A simple attack
+requires a running X-Server with one or more allocated framebuffers. Any client
+with access to /dev/dri/ can now open the DRM node, retrieve a GEM handle for
+any CRTC via drmModeGetFB() and read/write it arbitrarily.
+
+    Client A (attacker)
+    ------------------------------------
+    int fd;
+    drmModeFB *fb;
+    uint32_t handle, id;
+
+    fd = open("/dev/dri/card0", O_RDWR);
+
+    for (id = 0; id < INT_MAX; ++id) {
+        fb = drmModeGetFB(fd, id);
+        if (fb)
+            break;
+    }
+
+    handle = fb->handle;
+
+In this example, the attacker will own a handle for some existing framebuffer.
+If this is done for all IDs, an attacker can get access to the currently
+displayed framebuffer.
+Note that owning a handle implies owning the buffer, so arbitrary mmap() access
+is possible. No root rights or CAP_SYS_ADMIN is needed. No DRM authentication is
+needed. This can be used by any client who has access to /dev/dri/card0.
+
+While modesetting commands are limited to DRM-Master, drmModeGetFB() is supposed
+to be passive and thus globally accessible. Other ioctls like CREATE_DUMB also
+allow similar denial-of-service attacks if clients consume all of GPU memory.
+
+ 2.3.1) drmModeGetFB() Fix
+ -------------------------
+
+ Requiring DRM-Master for all these commands limits the attack surface to the
+ running graphics server (which would mostly mean the attack is useless).
+ However, this prevents background graphics servers from managing their
+ buffers. Especially during server shutdown, DRM-Master shouldn't be required
+ to free allocated buffers. So other fixes are preferred.
+
+ The most serious bug, the drmModeGetFB buffer leak, can, however, be fixed by
+ returning an invalid gem handle for clients without DRM-Master access. Patches
+ are pending on dri-devel, but require more thorough investigation.
+
+ The more appropriate fix is the introduction of DRM render nodes. This splits
+ DRM nodes between graphics servers and render clients and thus provides
+ fine-grained access management for KMS ioctls.
+ Render Nodes proposal:
+ http://lists.freedesktop.org/archives/dri-devel/2013-July/041222.html
+ An attacker would need access to /dev/dri/card0 (instead of /dev/dri/renderD0)
+ to perform this attack. However, this access will be restricted to
+ privileged compositors once render-nodes are established.
+
+2.4) DRM-Master and CAP_SYS_ADMIN
+---------------------------------
+
+Graphics servers are required to have DRM-Master privileges to perform
+modesetting or modify global GPU state. However, DRM-Master can only be acquired
+with CAP_SYS_ADMIN capabilities, which is roughly equivalent to root rights.
+
+No specific attack scenario is known, but any bug in a graphics server will
+essentially be an attack surface to gain root rights on a desktop system. The
+huge size of common graphics servers makes it likely that exploitable bugs
+exist. We should thus reduce the required capabilities for graphics servers.
+
+ 2.4.1) Outsourcing DRM-Master
+ -----------------------------
+
+ CAP_SYS_ADMIN is only required to acquire DRM-Master. Weston therefore runs a
+ small helper process which is connected to the compositor via a pipe. During
+ VT-switches, the helper acquires and drops DRM-Master accordingly and the
+ compositor no longer needs CAP_SYS_ADMIN to manage DRM devices. The attack
+ surface is thus reduced to a small helper.
+
+ If we extend this idea, we could easily move it into a central daemon which
+ takes care of DRM-Master management for all graphics servers. There is ongoing
+ work to reuse systemd-logind to manage DRM-Master and input devices: it drops
+ DRM-Master and mutes the devices while a graphics server is inactive and
+ re-enables them on wakeup.
+
+ A description of this proposal can be found at:
+ http://dvdhrm.wordpress.com/2013/07/08/thoughts-on-linux-system-compositors/
+ Development is still ongoing and a first prototype is expected for GUADEC 2013
+ in Brno.
+
+2.5) DRM-Master Management
+--------------------------
+
+DRM-Master is a concept to separate multiple graphics-servers from each other. A
+different DRM-Master context is assigned to each graphics server and for all
+contexts, only a single DRM user can be "DRM-Master". So while multiple contexts
+might exist, only a single context is considered active, the context the current
+DRM-Master is assigned to.
+
+A graphics server can call drmDropMaster() to drop DRM-Master and drmSetMaster()
+to gain DRM-Master. drmSetMaster() fails if the current user is not a
+DRM-Master. A fundamental flaw is that both calls do not take any context as
+argument. Moreover, user-space has no chance to find out which context a user is
+assigned to.
+
+During open() on a DRM node, DRM core will assign the new user to the currently
+active context (ie, the context of the current DRM-Master). If no context is
+active, a new context is created. While this allows minor control over
+context-creation and assignment, it does not allow assigning clients or servers
+to a specific context. So if a new session X-Server is started while the current
+X-Server is still active, both will get assigned to the same context. On the
+other hand, if a client is started while the corresponding graphics-server is
+inactive, the client will get assigned to a different context than the server
+(which breaks DRI among other things).
+
+While the security implications might be subtle, this concept allows major
+denial of service attacks if clients get assigned to wrong contexts. A main
+problem is that a DRM user cannot detect this so it has no way to verify that it
+is assigned to the correct context. It may allocate buffers on a context which
+is actually the context of an attacker's graphics server, not the context of the
+target server.
+
+While this provides a huge surface for attackers, one might argue that it
+requires CAP_SYS_ADMIN so we can mostly ignore it. However, that is not true!
+During open(), if no context is active, a new context is created and is
+automatically assigned DRM-Master. This allows any user with access to
+/dev/dri/ to become DRM-Master! Of course, one cannot drop and re-acquire it as
+drmSetMaster() is protected. Nevertheless, unprivileged applications can use
+this for a denial-of-service attack by hijacking DRM-Master during VT-switches.
+Moreover, it can be used to display arbitrary content on the screen, for example
+to simulate login screens.
+
+    Client A (attacker)
+    ------------------------------------
+    int fd;
+
+    fd = -1;
+    do {
+        if (fd >= 0)
+            close(fd);
+        fd = open("/dev/dri/card0", O_RDWR);
+    } while (drmAuthMagic(fd, 0) == -EACCES);
+
+In this example an attacker opens a DRI device and uses a dummy drmAuthMagic()
+call to test whether it is DRM-Master. drmAuthMagic() returns -EACCES if the
+caller is not DRM-Master; otherwise -EINVAL is returned, as 0 is an invalid
+DRM-Magic number.
+If this attacker runs during a VT switch, chances are high that it becomes
+DRM-Master without having the CAP_SYS_ADMIN capability. Arbitrary modesetting
+commands can be issued afterwards.
+
+ 2.5.1) Static DRM-Master Contexts
+ ---------------------------------
+
+ The current API design should make it pretty clear that multiple DRM-Master
+ contexts cannot be used properly. In fact, there is no application known to me
+ which profits from or makes use of multiple contexts. Instead, DRM contexts were
+ reduced to a minimum and today manage no more than DRM-Master assignment. So
+ we can easily create a single static context during DRM device creation and
+ assign each user to it.
+ This prevents any situation where clients are assigned to wrong contexts. All
+ users will now share the same context. This indirectly fixes the DRM-Master
+ hijacking problem as new users will never be able to become DRM-Master by
+ opening /dev/dri/card0 as the static context will always be active.
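+
+ The difference between the two open() behaviours can be sketched with a toy
+ state machine. It is a simplified user-space model, not DRM core code:
+ toy_dev, toy_dev_open(), toy_set_master() and toy_drop_master() are
+ hypothetical names, and contexts are reduced to a single "is a master active"
+ flag as an assumption.
+
```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one DRM device. */
struct toy_dev {
    bool static_context;   /* true: single static context (proposed fix) */
    bool master_active;    /* is some open file currently DRM-Master?    */
};

/* Model of open(). Returns true if the new file implicitly becomes
 * DRM-Master. Legacy behaviour: if no master (and hence no context) is
 * active, a new context is created and the opener becomes its master.
 * Static context: the context always exists, so this never happens. */
static bool toy_dev_open(struct toy_dev *dev)
{
    if (dev->static_context || dev->master_active)
        return false;
    dev->master_active = true;
    return true;
}

/* Model of drmSetMaster(): privileged callers only, and only while no
 * other master is active. */
static bool toy_set_master(struct toy_dev *dev, bool cap_sys_admin)
{
    if (!cap_sys_admin || dev->master_active)
        return false;
    dev->master_active = true;
    return true;
}

/* Model of drmDropMaster() by the current master, e.g. on a VT switch. */
static void toy_drop_master(struct toy_dev *dev)
{
    dev->master_active = false;
}
```
+
+ In the legacy mode an open() that lands between a drop and the next
+ drmSetMaster() wins DRM-Master; with the static context only the privileged
+ toy_set_master() path remains.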
+
+ With render-nodes we allow offscreen clients, anyway. Hence, we don't have to
+ limit DRM authentication to the currently active master but can additionally
+ allow background clients to be authenticated and make use of the DRM device.
+
+ 2.5.2) Centralized DRM-Master Management
+ ----------------------------------------
+
+ By moving drmSetMaster() and drmDropMaster() calls to a central daemon (like
+ systemd-logind), we can provide a central place for DRM-Master management.
+ Hijacking will be limited to CAP_SYS_ADMIN and can be detected via error codes
+ on drmSetMaster(). At the same time, clients can use Render-Nodes instead of
+ authenticating via drmAuth(). The concept of DRM-Master is thus reduced to
+ managing exclusive hardware access.
+
+3) Final Notes
+==============
+
+Most of the spoofing attacks are based on the fact that all DRM users share the
+same DRM node. Linux provides many advanced access-management facilities that we
+could make use of. However, they can require huge changes to user-space. The big
+DRM legacy makes it almost impossible to guarantee no old UMS driver might
+break, so we cannot drop backwards-compatibility. But at the same time, this
+shouldn't hold us back. New concepts that fix all known issues are already
+available and wait for wider adoption. By keeping /dev/dri/card<num> as it is,
+we can always guarantee backwards-compatibility but limit new users to a sane
+and safe API.
+
+During last year's talk, most of these issues were still unfixed. However, a lot
+has happened and nearly all fixes and new facilities are either already upstream
+or pending on dri-devel and waiting for adoption. The emergence of so many
+different DRM applications should motivate us to finally move forward.