Faxploit: Sending Fax Back to the Dark Ages
Research By: Eyal Itkin and Yaniv Balmas
Fax: the brilliant technology that lifted mankind out of the dark ages of mail delivery, when only the postal service and carrier pigeons were available to deliver a physical message from a sender to a receiver.
Technology-wise, however, that was a long time ago. Today we are light years away from those dark days. In its place we have email, chat messengers, mobile communication channels, web-services, satellites using quantum messaging and more. So fax today is surely nothing but a relic that has been cast aside to the museum of old technologies, right?
Wrong. Fax is surprisingly still widely used even today. With over 300 million fax numbers in use, according to a simple Google search, it seems like we are still far from seeing fax be a thing of the past.
With this in mind, Check Point Research decided to take a deeper look into this old fashioned form of communication and see if fax, other than being a loud noisy beeper and a bureaucratic burden, is also a major network security risk.
Taking Over a Network Using Just a Fax Number
To provide some background, fax today is widely used in all-in-one printer devices by many industries worldwide. These all-in-one printers are connected to internal home or corporate networks through their Ethernet, WiFi, Bluetooth and other interfaces. In addition, they are connected to a PSTN phone line in order to support the fax functionality that they include.
Our research set out to ask whether an attacker, with merely a phone line at his disposal and equipped with nothing more than his target's fax number, would be able to attack an all-in-one printer by sending a malicious fax to it. If the answer was ‘yes’, then he could potentially gain complete control over the printer and possibly infiltrate the rest of the network connected to this printer.
So, after long and tedious research, we finally succeeded in this mission.
In fact, we found several critical vulnerabilities in all-in-one printers which allowed us to ‘faxploit’ the all-in-one printer and take complete control over it by sending a maliciously crafted fax.
From that point on, anything was possible. We decided the best way to showcase this control would be to use Eternal Blue to exploit any PC connected to the same network, and then use that PC to exfiltrate data back to the attacker by sending…a fax.
Still have questions? Skip the Technical Analysis and head straight to the Q&A.
To watch our talk on this research at DefCon 26, please click here.
Want a deeper look into this attack? Read on for our full technical research paper.
Reversing the Firmware
The first step in reverse engineering the firmware, once we loaded it into IDA, was to figure out what is being executed, and in what environment. After a quick recon phase, we found out these details:
The firmware is loaded to and executed by an ARM 32bit CPU, running in Big Endian mode. The main CPU uses a shared memory region to communicate with an MCU that controls the LCD screen.
Figure 1: Printer Architecture
The Operating System is a ThreadX-based [ref. 3] real-time Operating System by Green Hills [ref. 4]. It uses a flat memory model in which there are many tasks that run in Kernel-Mode, all sharing the same virtual address space. Since this is a flat memory model, we would expect the tasks to communicate with each other over a message queue (a FIFO). In addition, the virtual address space is fixed, and no ASLR-based mechanisms are deployed.
When we started to analyse the T.30 state machine task (“tT30”), though, we stumbled upon many traces that used seemingly unique IDs. A deeper investigation found that these IDs are also used in several lists of strings that start with the “DSID_” prefix. And indeed, the strings seem to match the logic near these traces, giving us important reversing hints. We built an Enum from all of the different DSIDs lists, giving us textual descriptions for many traces throughout the task.
Figure 3: DSID list used for the T.30 state machine.
Figure 4: using the DSID Enum together with the trace function.
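The idea of turning the DSID string lists into an Enum can be illustrated with a minimal sketch. Everything specific here is an assumption made for the example: the three strings, the base value, and the premise that IDs are assigned consecutively in list order are hypothetical and are not taken from the firmware.

```python
from enum import IntEnum

# Hypothetical DSID strings in the order they appear in one firmware list;
# the real lists and their numeric values are not reproduced here.
DSID_STRINGS = ["DSID_EXAMPLE_STATE_A", "DSID_EXAMPLE_STATE_B", "DSID_EXAMPLE_ERROR"]
DSID_BASE = 0x1000  # assumed ID of the first entry


def build_dsid_enum(names, base):
    """Map consecutive numeric IDs (an assumption) to the DSID strings."""
    return IntEnum("DSID", {name: base + i for i, name in enumerate(names)})


def describe_trace(dsid_enum, trace_id):
    """Return a readable name for a trace ID, or a hex fallback if unknown."""
    try:
        return dsid_enum(trace_id).name
    except ValueError:
        return f"UNKNOWN_0x{trace_id:X}"


DSID = build_dsid_enum(DSID_STRINGS, DSID_BASE)
```

With a mapping like this, a numeric ID seen in a trace call can be replaced by its textual description while reading the disassembly.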
Gaps Between Tasks
When reversing the T.30 state machine, and later reversing the task that handles the HDLC modem (“tFaxModem”), it seemed that there were several function pointer tables that we were missing. We found two common code patterns that looked like allocation / deallocation routines. These functions are used in each module to receive information from the previous module, and may also be used to dispatch the buffers to the next module. An example is shown in figure 5.
If we could not locate these functions, we would not be able to see how the data flowed inside the firmware, which would limit our understanding of it. Since we could not trace most of the function pointers to their initialization, we needed to take a more dynamic approach. We therefore needed a debugger.
Building a Debugger
- The Serial Debugging Interface
At first we analysed the board, searching for a serial debugging port. And soon enough, our (now broken) printer was connected to the serial debugger.
Figure 6: Connecting JTAGULATOR to the printer’s serial debugger.
After a few attempts to use the serial debugger we found that the debugging interface was limited by default:
Figure 7: The serial debugger refuses to obey our commands.
It seemed that we would need to elevate our privileges; and so we needed a vulnerability.
Searching for 1-Days
When trying to exploit a given firmware, it is always useful to check which open-source components are being used and compare their versions to known CVEs. In many cases 1-Days are enough, and they are surely good enough for debugging purposes. There are two main ways of identifying the open-source components in use:
- Use a string search in the firmware, and identify key strings from popular open sources
- Search the vendor’s website for open source licenses of the products
In addition, identifying useful vulnerabilities in these open sources can be done in many ways:
- You can search for CVE details that match the relevant library version
- If you are already familiar with several vulnerabilities, simply check if they are relevant
- Stay tuned – US CERT distributes a weekly e-mail containing the newly published CVEs
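The string-search approach from the first list can be sketched as a small script. The signature patterns below are illustrative assumptions based on banner strings these libraries commonly embed; they are not the patterns used in this research, and a real survey would need a much longer list.

```python
import re

# Illustrative signatures only (assumed, not taken from the research).
SIGNATURES = {
    "gSOAP": re.compile(rb"gSOAP[/ ]?(\d+\.\d+(?:\.\d+)?)?"),
    "OpenSSL": re.compile(rb"OpenSSL (\d+\.\d+\.\d+[a-z]?)"),
    "zlib": re.compile(rb"(?:inflate|deflate) (\d+\.\d+\.\d+)"),
}


def find_open_source(blob):
    """Return {library: sorted list of version strings seen in the blob}."""
    found = {}
    for name, pattern in SIGNATURES.items():
        for match in pattern.finditer(blob):
            version = match.group(1)
            found.setdefault(name, set()).add(
                version.decode("ascii") if version else "unknown"
            )
    return {name: sorted(versions) for name, versions in found.items()}
```

The versions reported by such a scan can then be compared against the CVE entries for each library.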
gSOAP debugging vulnerability – Devil’s Ivy
During our research we had already identified the gSOAP library, and when we saw reports regarding “Devil’s Ivy” on Twitter, we immediately checked it out. The code of CVE-2017-9765 [ref.5], a.k.a. “Devil’s Ivy”, can be seen in figure 8.
Figure 8: Decompiled code of the “Devil’s Ivy” vulnerability.
We could reach this vulnerability by sending a huge XML (> 2GB) to the printer over TCP port 53048 thus triggering a stack-based buffer overflow. Exploiting this vulnerability would give us full control over the printer, meaning that we could use this as a debugging vulnerability.
There were, however, two main drawbacks with this plan:
- Sending the exploit over the network would take a considerable amount of time, but with some optimizations we would be able to reduce the transmission time to around seven minutes.
- We would need to develop this exploit using only IDA and the basic serial dumps that would be generated on each failed attempt.
Exploiting Devil’s Ivy
The vulnerability gave us a controllable stack-based buffer overflow, with some limitations over our chars. The forbidden chars were:
- Unprintable: 0x00 – 0x19
- ‘?’: 0x3F
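A restriction like this is easy to check mechanically. The helper below is a minimal sketch that encodes the two forbidden ranges exactly as listed above; the function name is hypothetical.

```python
# Forbidden bytes as listed above: 0x00-0x19 and '?' (0x3F).
FORBIDDEN = frozenset(range(0x00, 0x1A)) | {0x3F}


def bad_byte_offsets(payload):
    """Return the offsets of all bytes that fall in the forbidden set."""
    return [i for i, b in enumerate(payload) if b in FORBIDDEN]
```

An empty result means the buffer contains none of the forbidden characters.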
A major advantage in this vulnerability was that our overflow was practically unlimited, enabling us to send the entire exploit chain to be stored on the target’s stack.
An important limitation we had to bear in mind when exploiting in an embedded environment (not on an Intel CPU) is the fact that the CPU has several caches. Our received packet would be stored in the Data Cache (D-Cache), while instructions were executed from the Instruction Cache (I-Cache). This means that even though there is no NX bit support, we could not simply return to our payload and execute it directly from the stack buffer, as the CPU would execute the code as it sees it through the I-Cache.
To bypass all of the different limitations, we had to use a bootstrapping exploit that consists of the following parts:
- Basic ROP that flushes the D-Cache and I-Cache.
- Decoded shellcode that loads our debugger’s network loader.
- The full debugger will be sent to the loader over the network.
We leave the task of constructing the full exploit chain as an exercise to the reader.
Our debugger is an instruction-based network debugger. It supports basic memory read / write requests and can be extended to support firmware-specific instructions as well. We used this debugger to extract memory dumps from the printer, and later on we extended it to test some of the features we used in our demonstration.
Once the debugger is configured with the addresses of the firmware’s API functions (such as memcpy, sleep, and send) it can be loaded to any address, as it is fully position independent (PIC). We uploaded our “Scout Debugger” to our GitHub, and it can be found here [ref.6].
ITU T.30 – Fax Protocol
When an all-in-one printer supports fax capabilities it means that it supports Group 3 (G3) fax protocols, which conform to the ITU T.30 standard [ref.2]. This standard defines the basic capabilities required from the sender and the receiver, while also outlining the different phases of the protocol, as can be seen in figure 9.
Figure 9: Diagram as taken from the ITU T.30 standard.
We will focus on Phase B and Phase C of the protocol. Phase B is responsible for the capability negotiation (handshake) between the sender and the receiver, while Phase C includes the transmission of the data frames according to the negotiated specifications.
The frames themselves are sent over the phone line using HDLC frames, as can be seen in figure 10.
Figure 10: Diagram as taken from the ITU T.30 standard.
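Each HDLC frame ends with a frame check sequence (FCS). The sketch below assumes the usual 16-bit HDLC FCS (the CRC-16/X-25 parameters) and leaves out flags and bit stuffing. The helper names are hypothetical, and any address or control byte values passed in are the caller's assumption rather than something stated in this article.

```python
def fcs16(data):
    """CRC-16/X-25: reflected polynomial 0x8408, init 0xFFFF, final XOR 0xFFFF."""
    crc = 0xFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            crc = (crc >> 1) ^ 0x8408 if crc & 1 else crc >> 1
    return crc ^ 0xFFFF


def append_fcs(frame_body):
    """Append the FCS, low byte first, to the address/control/information bytes."""
    crc = fcs16(frame_body)
    return frame_body + bytes([crc & 0xFF, crc >> 8])


def fcs_ok(frame_with_fcs):
    """Recompute the FCS over the body and compare it with the two trailing bytes."""
    if len(frame_with_fcs) < 2:
        return False
    body, trailer = frame_with_fcs[:-2], frame_with_fcs[-2:]
    return append_fcs(body)[-2:] == trailer
```

A receiver would drop any frame for which `fcs_ok` returns False before handing its contents to the T.30 state machine.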
Searching for attack vectors
It is a common misconception that faxes simply send TIFF files. In actual fact though, the T.30 protocol sends pages, while phase B negotiates parameters such as page height and page width, and phase C is used to transport the page’s data lines. This means that the final output will be a .tiff file that contains IFD tags that were built using the meta-data from the handshake. The .tiff file will later contain the page lines just as they were received over the phone line.
Although there are many vulnerabilities in .tiff parsers, these vulnerabilities are mostly found in code that parses IFD tags, and in our case these tags are built by the printer itself. The only processing applied to our page content is decompression during the printing process.
Unfortunately for us, there are multiple names for the compression schemes used by the .tiff file format, and we had to work them out. Here is the basic mapping, as we understood it using [ref.8 and ref.9]:
- TIFF Compression Type 2 = G3 without End-Of-Line (EOL) markers
- TIFF Compression Type 3 = G3 = ITU T.30 Compression T.4 = CCITT 1-D
- TIFF Compression Type 4 = G4 = ITU T.30 Compression T.6 = CCITT 2-D
The compression scheme is basically a Run-Length-Encoding (RLE) scheme using fixed Huffman tables for white codes and for black codes, as faxes are black and white.
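The run-length half of the scheme can be shown with a minimal sketch. It covers only the conversion between a pixel row and alternating white/black run lengths; the fixed Huffman code tables that turn each run length into bits are deliberately omitted, and the function names are hypothetical.

```python
def run_lengths(row):
    """Encode a row of pixels (0 = white, 1 = black) as alternating run lengths.

    The first run is always white, so a row that starts with a black pixel
    begins with a zero-length white run.
    """
    runs = []
    current, length = 0, 0
    for pixel in row:
        if pixel == current:
            length += 1
        else:
            runs.append(length)
            current, length = pixel, 1
    runs.append(length)
    return runs


def expand_runs(runs):
    """Rebuild the pixel row: even-indexed runs are white, odd-indexed are black."""
    row = []
    for index, length in enumerate(runs):
        row.extend([index % 2] * length)
    return row
```

In the real codec each run length is then replaced by a code word from the white or black table, depending on the colour of the run.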
We checked the decompression code for T.4 and T.6 and couldn’t find any interesting vulnerabilities there.
During phase B the modems exchange their capabilities so that they can decide on the best supported transmission method. We wrote a simple script to parse these messages according to the ITU T.30 standard, and found the interesting result shown in figure 11:
Figure 11: Parsing the DIS capabilities of the target.
It seemed that our printer supported the ITU T.81 (JPEG) format [ref.10], declared in Annex E of the ITU T.4 standard [ref.11]. In short, this meant we could send colour faxes. When we examined the code that handles colour faxes we made another useful discovery: the received data is stored to a .jpg file as is. In contrast to the .tiff case, in which the headers are built by the receiver, in the .jpg case we control the entire file.
We checked this behaviour against the standard and found that, since the JPEG format is complex, the headers (called markers [ref.12]) are indeed sent over the phone line, and the receiver should process them and decide what to keep. Some of the markers might not be supported by the receiver and should be dropped, and other markers (such as the COM marker) should always be skipped. In our firmware, and in the open-source implementations that we checked, the received content is always dumped to a file without any filtering, giving an attacker a great starting point.
Printing the coloured fax
So, to recap: when the target printer receives a colour fax it simply dumps its content into a .jpg file (“%s/jfxp_temp%d_%d.jpg”, to be precise), without any sanitization checks. However, receiving the fax is only the first step, as it should now be printed. The printer module first needs to verify the width and height of the received document, so it sends it for a basic parsing round.
The JPEG Parser
For some unknown reason, firmware developers tend to re-implement modules that are already implemented in popular open-source projects. This means that instead of using libjpeg [ref.13], the developers implemented their own JPEG parser. From an attacker’s point of view this is a jackpot, as finding a vulnerability in a complex file format parser looks very promising.
The parser itself is quite simple, and works like this:
- Check that the file starts with a Start Of Image (SOI) marker: 0xFFD8
- Run in a loop and parse each of the supported markers
- When finished, return the relevant data to the caller
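The three steps above can be sketched as a defensive marker walker. This is not the firmware's code. It follows the segment layout from the T.81 standard (a 2-byte marker followed by a 2-byte big-endian length that counts its own two bytes), and it handles only SOI, EOI and length-carrying segments.

```python
import struct

SOI, EOI, SOS = 0xFFD8, 0xFFD9, 0xFFDA


def iter_markers(data):
    """Yield (marker, payload) pairs up to the first SOS or EOI marker."""
    if len(data) < 2 or struct.unpack(">H", data[:2])[0] != SOI:
        raise ValueError("missing SOI marker")
    pos = 2
    while pos + 4 <= len(data):
        marker, length = struct.unpack(">HH", data[pos:pos + 4])
        if marker >> 8 != 0xFF:
            raise ValueError(f"expected a marker at offset {pos}")
        if marker == EOI:
            return
        # The length field counts its own two bytes, so it must be at least 2
        # and the segment must not run past the end of the file.
        if length < 2 or pos + 2 + length > len(data):
            raise ValueError(f"bad segment length at offset {pos}")
        yield marker, data[pos + 4:pos + 2 + length]
        if marker == SOS:
            return  # entropy-coded image data follows; not parsed in this sketch
        pos += 2 + length
```

Checking every length against the remaining input before touching the payload is what keeps each per-marker handler from reading outside the file.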
CVE-2018-5925 – Buffer-Overflow While Parsing COM Markers
According to the standard, a COM marker (0xFFFE) is a variable-sized text field representing a text comment. This was our first candidate for finding a parsing vulnerability, and ironically this marker was supposed to be dropped by the fax receiver according to the standard.
And indeed, we found the following vulnerability, as shown in Figure 12:
Figure 12: Decompiled code for the COM marker vulnerability.
The parsing module parses a 2-byte (Little Endian) length field and runs in a loop that copies data from our file into some global array. It looks like each entry in the array is of size 2100 bytes, while our length field can be as high as 64KB, granting us a massive controllable buffer-overflow.
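The missing check is a comparison between the attacker-supplied length and the size of the destination entry. A minimal sketch of a bounded copy follows, assuming 2100-byte slots as described above; the slot list and the function name are hypothetical stand-ins for the firmware's global array.

```python
COMMENT_SLOT_SIZE = 2100  # entry size reported above


def store_comment(slots, index, payload):
    """Copy a COM payload into a fixed-size slot, rejecting oversized input."""
    if not 0 <= index < len(slots):
        raise IndexError("no such comment slot")
    if len(payload) > COMMENT_SLOT_SIZE:
        raise ValueError(
            f"COM payload of {len(payload)} bytes exceeds {COMMENT_SLOT_SIZE}-byte slot"
        )
    # Same-length slice assignment keeps the slot at its fixed size.
    slots[index][:len(payload)] = payload
```

Since a 16-bit length field can describe up to 65535 bytes, any payload longer than one slot has to be rejected or truncated before the copy loop runs.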
CVE-2018-5924 – Stack-Based Buffer-Overflow while Parsing DHT Markers
Since the first vulnerability was located in a marker that shouldn’t be supported by standard-compliant implementations, we chose to keep on looking for vulnerabilities in additional markers. The DHT marker (0xFFC4) defines a special Huffman Table that should be used when decoding the data frames of the file.
This function was even simpler than the previous one, as shown in Figure 13:
Figure 13: Decompiled code for the DHT marker vulnerability.
- We can see that there is an initial parsing loop that reads 16 bytes, and because each byte represents a length field, all of the bytes are accumulated into an overall length variable.
- A local stack buffer of size 256 bytes is prepared for use – filled with zeros.
- A second for loop uses the previous length field, and copies data from our file into the local stack buffer
A simple calculation points out the vulnerability in this code: 16 * 255 = 4080 > 256. We have a controllable stack-based buffer overflow without any limitations on the characters we can use. We couldn’t hope for a better vulnerability.
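The arithmetic above also shows what a safe parser has to verify: the sum of the 16 count bytes must fit in the 256-byte table before anything is copied. The sketch below assumes the DHT payload layout from the T.81 standard (one class/identifier byte, 16 count bytes, then the symbols) and is not the vendor's patch.

```python
HUFFMAN_TABLE_SIZE = 256  # size of the local buffer described above


def parse_dht_table(payload):
    """Parse one Huffman table: class/id byte, 16 counts, then the symbols."""
    if len(payload) < 17:
        raise ValueError("DHT payload too short")
    table_info = payload[0]
    counts = payload[1:17]
    total = sum(counts)  # worst case is 16 * 255 = 4080
    if total > HUFFMAN_TABLE_SIZE:
        raise ValueError(f"DHT declares {total} symbols, limit is {HUFFMAN_TABLE_SIZE}")
    if 17 + total > len(payload):
        raise ValueError("DHT symbols run past the end of the segment")
    return table_info, list(counts), payload[17:17 + total]
```

Both checks are needed: the first protects the destination buffer, the second protects against reading beyond the received segment.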
Building an Exploit
We chose to exploit the DHT vulnerability as it was the easiest to exploit. If we recall, our debugging exploit also used a stack-based buffer overflow vulnerability, meaning we only needed to perform minor modifications to our debugging exploit.
Autonomous Payload – Implementing a Turing Machine
We could have used the same network-based loader that we used for our debugging exploit; however, our current attack vector had a major advantage: our full payload can be stored inside the sent “JPEG” file. Relying on the fact that no one performs any sanitization checks on our fax’s content, we could store our entire payload inside the sent document, without worrying about it not being a legal JPEG document.
Using this fact, together with the fact that the file’s file descriptor (fd) is stored in an accessible global variable, we wrote a file-based loader. The loader reads the payload from the file and loads it to memory. Later on, every time the payload wants to perform a task using some input, it reads the input from the same file and acts upon the instructions in it. Effectively, we implemented a basic Turing Machine that reads input from the tape (the sent fax) and acts accordingly.
Spreading Throughout the Network
Simply taking over a printer would be nice, but we wanted to do more. Indeed, if we could take over the entire computer network that the printer is part of, we could achieve a much bigger impact. So, knowing that one of the members of our Vulnerability Research team knows Eternal Blue quite well [ref.14] and that our Malware Research team did similar research on Double Pulsar [ref.15], we decided to implement both NSA tools using our file-based Turing Machine.
And so, our payload implemented the following features:
- Taking over the printer’s LCD screen – demonstrating full control over the printer itself.
- Checking if the printer’s network cable is connected.
- Using Eternal Blue and Double Pulsar to attack a victim computer in the network, taking full control over it.
To our knowledge, we now had the first (publicly documented) printer capable of using Eternal Blue and Double Pulsar to autonomously spread an attacker’s payload over a computer network.
Wrapping it all together
When we started our research, our goal was to show that the fax machine, which is now mostly embedded in all-in-one printers, poses a security risk that was yet to be considered by the research community. In our research we presented the ITU T.30 fax protocol, including some of its extensions, such as Annex E, which defines how to send colour faxes. These protocols, defined in the 90s, use complex state machines, complicated compression schemes and several hard-to-implement extensions.
Using the HP Officejet Pro 6830 all-in-one printer as a test case, we were able to demonstrate the security risk that lies in a modern implementation of the fax protocol. Using nothing but a phone line, we were able to send a fax that could take full control over the printer, and later spread our payload inside the computer network accessible to the printer.
We believe that this security risk should be given special attention by the community, changing the way that modern network architectures treat network printers and fax machines. From now on, a fax machine should be treated as a possible infiltration vector into the corporate network.
The responsible disclosure process was coordinated with HP Inc, who were very helpful and responsive throughout:
- 1 May 2018 – Vulnerabilities were disclosed to HP Inc.
- 1 May 2018 – HP Inc acknowledged our submission and started working on a patch.
- May – June 2018 – Coordinated effort to recreate the PoC and patch the vulnerabilities.
- 2-3 July 2018 – Face to Face meeting with HP Inc:
  - The vulnerabilities were demonstrated and discussed.
  - The patches by HP Inc were tested and approved by both parties.
- 23 July 2018 – The vulnerabilities were flagged as Critical.
- 1 August 2018 – HP Inc published the patched firmware on their site [ref.1].
- 12 August 2018 – Official public disclosure during DEFCON 26.
References
- [ref.1] HP Security Bulletin – https://support.hp.com/us-en/document/c06097712
- [ref.2] ITU T.30 (Fax) – https://www.itu.int/rec/T-REC-T.30-200509-I/en
- [ref.3] ThreadX – https://en.wikipedia.org/wiki/ThreadX
- [ref.4] Green Hills – https://www.ghs.com/
- [ref.5] CVE-2017-9765 (Devil’s Ivy) – http://cve.mitre.org/cgi-bin/cvename.cgi?name=2017-9765
- [ref.6] Scout Debugger on GitHub – https://github.com/CheckPointSW/Scout
- [ref.7] HP Digital Fax – https://support.hp.com/us-en/document/c03448663
- [ref.8] CCITT (Huffman) Encoding – https://www.fileformat.info/mirror/egff/ch09_05.htm
- [ref.9] Huffman/CCITT Compression in TIFF – https://www.mikekohn.net/file_formats/tiff.php
- [ref.10] ITU T.81 (JPEG) – https://www.w3.org/Graphics/JPEG/itu-t81.pdf
- [ref.11] ITU T.4 – https://www.itu.int/rec/T-REC-T.4-200307-I/en
- [ref.12] JPEG Markers – https://en.wikipedia.org/wiki/JPEG#Syntax_and_structure
- [ref.13] libjpeg – http://libjpeg.sourceforge.net/
- [ref.14] Eternal Blue (Check Point Research) – https://research.checkpoint.com/eternalblue-everything-know/
Questions and Answers:
Q: What the fax?
A: Check Point Research has uncovered critical vulnerabilities in a popular implementation of the fax protocol. These vulnerabilities allow an attacker with mere access to a phone line and a fax number to attack their victim’s all-in-one printer, gaining full control over the all-in-one printer and possibly the entire network it is connected to.
Q: Is this for real?
A: Yes. Take a look at this video to see it in action.
Q: Does this only apply to all-in-one printers?
A: No. We conducted our research on all-in-one printers; however, similar vulnerabilities are likely to be found in other fax implementations, such as fax-to-mail services, standalone fax machines, etc.
Q: Who cares about fax anyway?
A: Surprisingly, fax is still used by many industries, governments and individuals around the world. These include the healthcare, legal, banking and commercial sectors. Some use it because regulations require it, and others simply for legacy reasons.
Q: What does this mean?
A: Once an all-in-one printer has been compromised, anything is possible. It could be used to infiltrate the internal network, steal printed documents, mine Bitcoin, or practically anything.
Q: Does the fax need to be plugged-in?
A: Erm, yes. And so does the power supply.
Q: Does this apply to all fax machines?
A: Our research was done on HP Officejet all-in-one printers though this was merely a test-case. We strongly believe that similar vulnerabilities apply to other fax vendors too as this research concerns the fax communication protocols in general.
Q: Is it widespread?
A: By our estimates, there are currently hundreds of millions of fax machines still in use around the world. Financial reports from Wall Street indicate that tens of millions of all-in-one printers are sold worldwide each year.
Q: How bad is it?
A: It’s bad.
Q: Has it been fixed?
A: We worked closely with HP to fix the vulnerability and, following the process of responsible disclosure, they managed to release a patch before this publication.
Q: What can I do?
A: If you own an HP Officejet all-in-one printer then follow the instructions from HP here. For advice on how to better protect your network, please visit Check Point’s corporate blog.
Q: Am I affected?
A: If you are using an HP OfficeJet with fax capabilities and have not applied the patch then you are vulnerable.
Q: Has it been seen in the wild?
A: Not yet. Our research was intended to highlight a potential security risk.
Q: Can you share more technical details?
A: Yes, scroll up.
Q: Why is Check Point doing this research?
A: As a cutting edge research team, we believe it is our professional responsibility to look into known and unknown risks and vulnerabilities in the cyber threat landscape. By carrying out this kind of research, along with the rest of the cyber security community, we hope to make the online world safer.
Q: How can I exploit this vulnerability?
A: Check Point does not share exploitation tools or exploit code as a policy, nor will we give you the detailed instructions for creating one. One can assume, however, that other researchers will independently develop such code eventually. We can only encourage you to use it professionally and responsibly.
Q: I need someone to blame!
A: That’s not even a question.