Research By: Slava Makkaveev
Since 2007, Amazon has sold tens of millions of Kindles, which is impressive. But it also means that tens of millions of people could have been hacked through a software bug in those same Kindles. Their devices could have been turned into bots, their private local networks could have been compromised, and perhaps even the information in their billing accounts could have been stolen.
The easiest way to remotely reach a user’s Kindle is through an e-book. A malicious book can be published and made available for free access in any virtual library, including the Kindle Store, via the “self-publishing” service, or sent directly to the end-user device via the Amazon “send to kindle” service.
While you might not be happy with the writing in a particular book, nobody expects to download one that is malicious. No such scenarios have been publicized. Antiviruses do not have signatures for e-books.
But… we succeeded in making a malicious book. Opening this book on a Kindle device could have caused a hidden piece of code to be executed with root rights. From that moment on, you could assume that you had lost control of your e-reader.
The issues we found were reported to Amazon in February 2021 and fixed in the 5.13.5 version of Kindle’s firmware in April 2021. The patched firmware will be installed automatically on devices connected to the Internet.
Kindle Touch architecture
Basically, the Kindle OS is a Linux kernel with a set of native programs mainly provided by busybox, the LIPC subsystem for inter-process communication, and the Java and Webkit subsystems for user interface (UI) and services.
Figure 1: Kindle Touch architecture.
Most of the UI is written in Java. The Java subsystem (the framework) provides LIPC handlers for both services and the UI (so-called Booklets). For example, the Kindle home UI window is the com.lab126.booklet.home booklet managed by the framework.
Who parses e-books?
The latest version (5.13.4) of the Kindle e-reader firmware is publicly available for download on the official Amazon website. The source code is also partially available there. But the source code did not help in our research because it mainly consists of third-party open-source projects, including the Linux kernel, with small Amazon tweaks. There is no source code for the components responsible for parsing and rendering e-books.
Our first goal was to discover a vulnerability in the e-book parsing framework. For this, the files from the firmware are sufficient, and there is no need for a real Kindle device.
Let’s look at the components responsible for handling e-books.
/mnt/us/documents is the directory where e-books are stored when you download a new book to your Kindle device. Which component handles the file first?
The /usr/bin/scanner service periodically scans the documents directory for new files and, depending on the file extension, uses one of the “extractor” libraries to extract metadata from the e-book. All extractors are listed in the /var/local/appreg.db sqlite database. There is a handler for each of the supported Kindle e-book formats:
|azw, mbp, mobi, prc
If the scanner does not match the file extension or a parsing error occurs, the e-book is not shown to the user.
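The extension-based dispatch described above can be illustrated with a minimal sketch. The extractor name is a placeholder, the in-code dictionary stands in for the registry kept in /var/local/appreg.db, and the case-insensitive match is an assumption; none of this is the scanner's actual code.

```python
from pathlib import Path
from typing import Optional

# Hypothetical registry: file extension -> extractor name.
# On a device this mapping lives in /var/local/appreg.db; the
# extractor name below is a placeholder, not a real library name.
EXTRACTORS = {
    "azw": "mobi_extractor",
    "mbp": "mobi_extractor",
    "mobi": "mobi_extractor",
    "prc": "mobi_extractor",
}

def pick_extractor(filename: str) -> Optional[str]:
    """Return the extractor registered for the extension, or None."""
    ext = Path(filename).suffix.lstrip(".").lower()  # lowercase match is assumed
    return EXTRACTORS.get(ext)

def scan(filenames):
    """Split files into those that would be shown and those that are skipped."""
    shown, hidden = [], []
    for name in filenames:
        (shown if pick_extractor(name) else hidden).append(name)
    return shown, hidden
```

A file whose extension has no registered extractor ends up in the hidden list, which mirrors the behavior of a book that is never shown to the user.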
We did not go deep into the scanning process because extracting metadata is too simple an operation to suggest parsing errors.
After the scanner does its job, a thumbnail of the new book is displayed on the home screen. From this moment on, the Java framework is responsible for opening the book when you click on it. Java archive (JAR) files that implement the logic for opening and rendering e-books can be found in the /opt/amazon/ebook/lib firmware directory. Primarily, these are
For further research, we decided to focus our attention on the PDF file format, as it’s one of the most common, and yet at the same time, complex formats.
Let’s take a look at the implementation of the PDF book opening function in the
As you can see, this function is only a wrapper over the nativeOpenPDFDocument native function with the body in the
The nativeOpenPDFDocument function forks the process to start the PDF server /usr/bin/pdfreader and synchronously sends it an “openBook” message via the open-source HTTP client/server library /usr/lib/libsoup-2.4.so. In fact, it sends a GET request to
The pdfreader server is the main target of our research. Eventually, we will run our payload in the context of this process.
At startup, the pdfreader server drops its privileges to those of the “framework” user (uid 9000) with a setuid call. Then it launches a soup server listening on port 7667, defining dozens of handlers for high-level PDF operations, including the “openBook” and “startRendering” ones that we are interested in.
The /usr/lib/libFoxitWrapper.so library, written by Amazon, provides an API for working with PDF files. The pdfreader uses this library in its soup handlers. For example, the “openBook” handler looks like this:
Note the following significant functions of the
openPDFDocumentFromLibrary(char *file, char* password, uint32_t* handle) – Opens the PDF document.
getCurrentPage(uint32_t handle, uint32_t page, uint32_t flag) – Parses the PDF page to internal structures.
renderPageFromLibrary(uint32_t handle, uint32_t page, uint32_t width, uint32_t height, float scale, uint8_t landscape, uint8_t* out) – Renders the PDF page converting it to an image. When called, the stream filters begin to be parsed.
These functions are good entry points for fuzzing a PDF tree structure.
As the name implies, libFoxitWrapper.so is a wrapper for the popular Foxit PDF SDK, which is represented on Kindle devices by the /usr/lib/libfpdfemb.so library. The libfpdfemb.so library is closed-source and proprietary to Foxit Software Inc. The Foxit Embedded PDF SDK manual can be found on the Internet.
Fuzzing PDF filters
We tried to fuzz the functions mentioned above from the libFoxitWrapper.so library, but this approach did not produce any results, except for a set of null pointer dereferences. A more promising approach to the PDF format is to choose one specific object or stream filter as the target for the test. So, we decided to fuzz the
But first, let’s take a look at the classic fuzzing model.
The easiest way to fuzz any closed-source library is to write an executable that loads the library into memory and calls the target functions. This loader takes a file with mutated data as a command-line parameter, reads it, and passes the data to the function under test. The loader is then instrumented or run on an emulator to collect code coverage for each test case, and a third-party fuzzer/mutator uses that coverage to generate new test cases.
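The loader side of this model can be sketched in a few lines. This is only an illustration of the pattern: target_parse is a hypothetical stand-in for the library function under test, and the input format it expects (a 4-byte big-endian length followed by a payload) is invented for the example.

```python
import sys

def target_parse(data: bytes) -> int:
    """Hypothetical stand-in for the library function under test."""
    if len(data) < 4:
        raise ValueError("input too short")
    declared = int.from_bytes(data[:4], "big")  # 4-byte big-endian length
    if declared > len(data) - 4:
        raise ValueError("declared length exceeds payload")
    return declared

def main(argv) -> int:
    """Loader entry point: argv[1] is the path of one test case."""
    if len(argv) != 2:
        print("usage: loader.py <testcase>", file=sys.stderr)
        return 2
    with open(argv[1], "rb") as f:
        data = f.read()
    try:
        target_parse(data)
    except ValueError:
        pass  # a clean rejection of malformed input is not a finding
    return 0
```

Run as a script, this would be wired up with `sys.exit(main(sys.argv))`. Any failure other than a clean rejection then surfaces as an abnormal exit, which is what the fuzzer records as a finding.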
To fuzz the libfpdfemb.so library, we chose a combination of American Fuzzy Lop (AFL) and Quick Emulator (QEMU). The host machine is Ubuntu.
Figure 2: The fuzzing scheme.
We need to note one more thing. A Kindle device is based on an ARM processor. Therefore, our loader was compiled using arm-linux-gnueabi-g++. QEMU easily emulates ARM on x86.
A simple search for the words “CPDF” and “Codec” in the libfpdfemb.so library allowed us to find all the possible stream filters/codecs:
Jpx. Let’s take a look at one of them with an example.
Figure 3: Fragment of PDF page with jbig2 filter.
As you can see, an image Im1 with jbig2 filter is declared. Jbig2 is an image compression standard for bi-level images. The jbig2 encoder segments the input page into regions: text, halftone images, refinement, and others. These regions are held in the JBIG2Globals stream. When rendering a PDF page, libfpdfemb.so parses the JBIG2Globals stream and reconstructs the image.
The Jbig2Module object, defined in the libfpdfemb.so library, is responsible for decoding jbig2 compressed images.
The StartDecode method is declared as follows:
Among other filters, we fuzzed the jbig2 decoding algorithm using the
StartDecode function as the entry point and permuted the image size (
height arguments), the image stream (
src_size) and the
JBIG2Globals stream (
global_size). Below you can see the harness we used to invoke the
StartDecode. The base variable is the address of the
libfpdfemb.so library in memory.
As a result, we discovered a valuable heap overflow vulnerability in the JBIG2Globals decoding algorithm.
CVE-2021-30354. Heap overflow
Let’s take a look at the following
Figure 5: Malformed
Two page regions are defined here:
- The image information region (first 0x23 bytes). The image width is 0x80, the height is 1 and the stride is 0x10. The stride is calculated as ((width + 31) >> 5) << 2.
- The “refinement” region (from 0x23 to 0x4D bytes). This region contains jbig2 encoded information to refine the image. As only a part of the image can be refined, it also contains the coordinates of the refining rectangle. In our case, the provided rectangle parameters are: width – 0, height – 0x10, x – 0, y – 0x40000000.
This is a malformed stream. An oversized rectangle is defined in the refinement region.
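As a quick check of the stride formula quoted above, the following sketch evaluates it for the image width from the example. The function name is ours; only the formula comes from the text.

```python
def stride_bytes(width: int) -> int:
    """Row size in bytes: the width in bits rounded up to whole
    32-bit words, then multiplied by 4 bytes per word."""
    return ((width + 31) >> 5) << 2

# Width 0x80 (128 pixels, one bit each) needs four 32-bit words per row.
print(hex(stride_bytes(0x80)))  # 0x10
```

For a bi-level image each pixel is one bit, so a width of 0x80 gives a stride of 0x10 bytes, which matches the value stated in the image information region.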
What happens in this case? The algorithm tries to expand the base image to the new dimensions. The height of the new image is recalculated as height + y, and (height + y) * stride bytes of heap memory are allocated for the resized image. But there is a mistake in the expanding function that leads to a heap overflow: a missing check against INT_MAX when calculating the size in memory of the new image. With the values above (rectangle height 0x10, y 0x40000000, stride 0x10), the 32-bit register overflows, and 0x100 bytes are allocated for the image instead of 0x400000100.
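The arithmetic can be reproduced with a short sketch that emulates a 32-bit register by masking. The function names and the shape of the checked variant are illustrative assumptions, not the library's code or the actual fix; the values are taken from the refinement region described above.

```python
MASK32 = 0xFFFFFFFF
INT_MAX = 0x7FFFFFFF

def alloc_size_unchecked(height: int, y: int, stride: int) -> int:
    """Size as computed in a 32-bit register: high bits are silently lost."""
    return ((height + y) * stride) & MASK32

def alloc_size_checked(height: int, y: int, stride: int) -> int:
    """Same computation, but rejecting sizes that do not fit in a signed int."""
    size = (height + y) * stride
    if size > INT_MAX:
        raise OverflowError("image size exceeds INT_MAX")
    return size

print(hex(alloc_size_unchecked(0x10, 0x40000000, 0x10)))  # 0x100
```

The true product is 0x400000100, so the unchecked version returns a buffer size that is far smaller than the image the decoder then believes it owns.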
Figure 6: The
This means that by using refinement regions, we can “refine” the data outside of the image, and get the arbitrary write primitive. In the following example, the second refinement region overwrites 0x10 (stride) bytes at an offset 0x1234 * 0x10 bytes from the beginning of the image in the heap. The data blob (0x71 to 0x79 bytes) is decompressed by the jbig2 algorithm and then XORed with the heap content.
Figure 7: Controlled heap overflow.
We can create any number of refinement regions and overwrite parts of memory that are at a distance from one another. In addition, the fact that the writing is done through a XOR operation allows us to change only specific bits of memory rather than whole words, and to bypass ASLR protection if required.
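The effect of a XOR-based write can be modeled on a plain byte array. This is a simplified simulation under stated assumptions: the heap is a bytearray, the image starts at offset 0, and the data passed in is treated as already decompressed; the function name is hypothetical.

```python
def xor_write(heap: bytearray, image_base: int, row: int,
              stride: int, data: bytes) -> int:
    """XOR up to one row (stride bytes) of data into the simulated heap
    at image_base + row * stride and return the offset that was touched."""
    offset = image_base + row * stride
    for i, value in enumerate(data[:stride]):
        heap[offset + i] ^= value
    return offset

heap = bytearray(0x20000)      # simulated heap, far larger than the image
heap[0x12340] = 0b10100000     # pre-existing content outside the image
off = xor_write(heap, 0, 0x1234, 0x10, bytes([0b00000001]))
print(hex(off), bin(heap[off]))  # 0x12340 0b10100001
```

Only the lowest bit of the target byte changes and the other bits keep their previous values, which is why a XOR write can adjust part of a value without knowing or replacing the rest of it.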
As mentioned previously, the libfpdfemb.so library is part of the pdfreader process. The data and heap segments of this process are read/write/execute. ASLR is built into the Linux kernel and is controlled by the parameter /proc/sys/kernel/randomize_va_space. Its default value on Kindle devices is 1, which means the base address of the data segment is located immediately after the end of the executable code segment. In other words, there is no randomization for the data segment and the heap. These two facts make exploiting the discovered jbig2 vulnerability trivial.
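The meaning of this parameter can be checked on any Linux system with a small helper. The mode descriptions paraphrase the Linux kernel sysctl documentation; the helper names are ours.

```python
from pathlib import Path
from typing import Optional

# Modes of /proc/sys/kernel/randomize_va_space, paraphrased from the
# Linux kernel sysctl documentation.
MODES = {
    0: "no randomization",
    1: "stack, mmap base and vDSO randomized; heap (brk) not randomized",
    2: "mode 1 plus heap (brk) randomization",
}

def describe_aslr(raw: str) -> str:
    """Translate the raw sysctl value into a description."""
    return MODES.get(int(raw.strip()), "unknown")

def read_aslr(path: str = "/proc/sys/kernel/randomize_va_space") -> Optional[str]:
    """Return the description for this system, or None if the file is absent."""
    p = Path(path)
    return describe_aslr(p.read_text()) if p.exists() else None
```

With the value 1 reported for Kindle devices, the heap sits at a predictable position relative to the executable, consistent with the observation above.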
CVE-2021-30355. Improper Privilege Management
We now have an RCE vulnerability in the context of the pdfreader process. A user downloads the PDF book to their Kindle device, and when the book is opened, a malicious payload is launched.
The pdfreader process has the rights of the framework user: uid=9000(framework) gid=150(javausers) groups=150(javausers). It can send LIPC messages and access special internal files, but it is still limited. We want to become root to remove all restrictions.
So, the second stage of the research is to find an LPE vulnerability that allows the framework user to run code as the root user.
First, we jailbroke one of our Kindles, because having the files from the firmware is not enough to search for a logical LPE. We need to see running processes and open ports, and to be able to debug Kindle services.
A software jailbreak for some versions of Kindle firmware can be found on the Internet. But the most general way is to jailbreak through the serial port. Although this requires disassembling the device, this is what we did.
Figure 8: Jailbreak the Kindle via the serial port.
We got a jailbroken device, and then analyzed the services that have root rights, as well as the resources they access. Eventually, we found a logical error, or more accurately, improper privilege management, in one of the Kindle services. Great, there is no need to fuzz the device drivers.
The framework user has full access to the /var/tmp/framework directory, where it can create any executable file. Actually, this is the user’s working directory. For example, we can create a bash script file payload.sh that logs user privileges:
The framework user has read/write access to the /var/local/appreg.db sqlite database, which is essentially an application registry. This means that we can modify a database entry using the /usr/lib/libsqlite3.so library or by simply editing the file. We want to patch one of the “command” entries in the properties table in
For example, we can patch the entry com.lab126.browser: set the value field to /var/tmp/framework/payload.sh instead of /usr/bin/mesquite. The following SQL request does the work:
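A minimal sketch of such a request, run against an in-memory stand-in for the registry, is shown here. The table name properties, the value field, and the entry and path names come from the text. The column names handlerId and name, and the overall schema, are assumptions made for illustration and may not match the real appreg.db.

```python
import sqlite3

def make_registry() -> sqlite3.Connection:
    """Build an in-memory stand-in for the registry (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE properties (handlerId TEXT, name TEXT, value TEXT)"
    )
    conn.execute(
        "INSERT INTO properties VALUES (?, ?, ?)",
        ("com.lab126.browser", "command", "/usr/bin/mesquite"),
    )
    conn.commit()
    return conn

def patch_command(conn: sqlite3.Connection, handler: str, new_value: str) -> int:
    """Point the handler's "command" entry at new_value; return rows changed."""
    cur = conn.execute(
        "UPDATE properties SET value = ? "
        "WHERE handlerId = ? AND name = 'command'",
        (new_value, handler),
    )
    conn.commit()
    return cur.rowcount

def get_command(conn: sqlite3.Connection, handler: str) -> str:
    """Read back the command registered for the handler."""
    row = conn.execute(
        "SELECT value FROM properties WHERE handlerId = ? AND name = 'command'",
        (handler,),
    ).fetchone()
    return row[0]
```

Calling patch_command(conn, "com.lab126.browser", "/var/tmp/framework/payload.sh") on the stand-in changes exactly one row, after which get_command returns the script path instead of /usr/bin/mesquite.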
The framework can request the application manager, represented by the appmgrd service, to start an arbitrary application. We can send an LIPC message to open the browser app using the /usr/lib/liblipc.so library. This shell command does the same:
The application manager is responsible for launching built-in apps. To do this, it listens for the appropriate LIPC events. To start the browser app, it reads the entry com.lab126.browser from the appreg.db, and executes the command specified in the value field. As we patched this database entry, our payload.sh script is launched.
The appmgrd service has root rights. The “root: uid=0(root) gid=0(root)” string is logged by the
The described LPE vulnerability can be easily exploited from the
pdfreader process that we owned. The
liblipc.so libraries are already loaded into the process memory. By combining the two discovered vulnerabilities, any malicious payload can be run as root.
We demonstrated how an e-book can function as malware. As the malware code is executed with root user rights, just opening such a book could have led to irreparable damage. The attacker could have deleted your e-books, potentially gained full access to your Amazon account, converted your Kindle into a bot, attacked other devices in your local network, and more.
The described vulnerabilities were reported to Amazon in February 2021 and fixed in the 5.13.5 version of Kindle’s firmware in April 2021.