Remote Cloud Execution – Critical Vulnerabilities in Azure Cloud Infrastructure (Part II)

January 30, 2020

Research by Ronen Shustin

Cloud Attack Part II

In the previous part, we talked about the Azure Stack architecture and mentioned that it can be extended with features that are not part of its core. Because Azure Stack lets us research cloud components offline, we took the opportunity to research Azure App Service. In this part, we take a deep dive into Azure App Service internals. We examine its architecture and attack vectors, and demonstrate how a critical vulnerability we found in one of its components affected Azure Cloud.

What is Azure App Service?

According to Microsoft, Azure App Service enables you to build and host web apps, mobile back ends, and RESTful APIs in the programming language of your choice, without managing infrastructure. It offers auto-scaling and high availability, supports both Windows and Linux, and enables automated deployments from GitHub, Azure DevOps, or any Git repo.

Azure App Service Architecture

Note – Some of the information presented below is taken from: https://channel9.msdn.com/Events/Ignite/2016/BRK3206

Azure App Service architecture can be broken down into six types of roles:

Controller

Manages App Service – The controller manages and monitors the App Service. It performs WinRM functions across the other roles to deploy software and to make sure everything works as intended.

File Server

Application Content – Stores the tenant’s application and data.

Management

API, UI Extensions and Data Service – The management role is where the API, the UI extensions, tenant portal, and admin portal are hosted.

Front End

App Service Routing – The Front End is a load balancer. Behind all the software load balancers in Azure/Azure Stack, this is the load balancer for the App Service, which dynamically decides to which web workers the requests are sent.

Publisher

FTP, Web Deploy – The publisher exposes FTP and Web Deploy endpoints, which are used to deploy the tenant application's file content onto the file server.

Web Workers

Application host – Servers that host the applications; the more of them, the better. This is where all the interesting stuff happens, and our research mostly focuses on this part.

Now that we understand what roles the App Service has, let’s examine a basic diagram of how they interact with each other (Default App Service configuration):

Web Workers

As mentioned previously, tenant applications run on the web worker machines, which means a tenant can run any code written in the supported languages. What happens if a tenant decides to run malicious code that tampers with the web worker machine or even with other running apps? This is an interesting question, but first, let’s dig into some internals.

Microsoft Web Hosting Framework

Code-named Antares, this provides the platform for running tenant apps. It has several components:

DWASSVC

This service is responsible for managing and running tenant applications. Usually, two or more IIS worker processes (w3wp.exe) are created for each tenant app. The first process runs Kudu, which allows the tenant to manage its own application in a simple and accessible way. The second one actually runs the tenant application. These processes, which together make up the “complete app”, run under the same user with relatively low permissions.

There is a lot more to this service, which we discuss later.

Kernel Mode Sandbox

One of the most interesting components is RsFilter.sys, a filter driver and another proprietary Azure component that does not have public symbols. We could write a whole article about it based on the reverse engineering we performed. However, this time we only briefly describe its major capabilities.

Process Creation/Deletion Tracking

It tracks every process creation or deletion using the PsSetCreateProcessNotifyRoutine callback.

Two main structures are created in the kernel for “Sandboxed Processes.” The first, which we called ProcessInfo, contains information about the process, such as its handle, process ID, image file name and more. Here is a snippet of how it looks in IDA Pro:

The second structure that is created (or gathered) is a property of the ProcessInfo structure, and we called it SandboxSettings. This structure is enormous and contains information about the sandbox environment of the process, for example, the process/thread count limits, network port limits, environment paths and more. Here is a short snippet:

Once the ProcessInfo structure is created, it is inserted into a global table so that the kernel driver can track it. An interesting concept to note here is that the SandboxSettings can be shared by all processes that are under the same sandbox. For example, the tenant application, its child processes, and Kudu have different ProcessInfo structures but the same SandboxSettings.
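To make the relationship between the two structures concrete, here is a minimal user-mode sketch in Python. It is only a model of the idea described above, not the driver's layout: the field names, the limit values, the UNC path, and the ProcessTable class are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SandboxSettings:
    # Hypothetical subset of fields; the real structure is far larger.
    max_process_count: int
    max_thread_count: int
    sandbox_remote_path: str


@dataclass
class ProcessInfo:
    pid: int
    image_file_name: str
    sandbox: SandboxSettings  # a shared reference, not a per-process copy


class ProcessTable:
    """Toy stand-in for the global table of tracked processes."""

    def __init__(self) -> None:
        self._by_pid: Dict[int, ProcessInfo] = {}

    def on_process_created(self, pid: int, image_file_name: str,
                           sandbox: SandboxSettings) -> ProcessInfo:
        # Mirrors the "create ProcessInfo, insert into table" step.
        info = ProcessInfo(pid, image_file_name, sandbox)
        self._by_pid[pid] = info
        return info

    def on_process_exited(self, pid: int) -> None:
        self._by_pid.pop(pid, None)

    def lookup(self, pid: int) -> Optional[ProcessInfo]:
        return self._by_pid.get(pid)


# Illustrative values only.
settings = SandboxSettings(max_process_count=32, max_thread_count=1024,
                           sandbox_remote_path=r"\\fileserver\share\tenant1")
table = ProcessTable()
app = table.on_process_created(4321, "w3wp.exe", settings)
kudu = table.on_process_created(4322, "w3wp.exe", settings)
```

Because both entries hold a reference to the same SandboxSettings object, a change to the sandbox limits is visible to every process in that sandbox, which is the sharing behavior described above.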

IOCTLs & FltPort Communication

These are the main communication interfaces to the driver from userspace. Some IOCTLs can be called by anyone, and some require permissions. Connecting to the filter port (FltPort) requires high permissions. Sandbox settings can be changed through this port.

File System Filter

The RsFilter registers a lot of file system-related callbacks to perform its duty.

If you have ever used Azure App Service, you probably know that your app can access D:\home and reach its home directory. Have you ever wondered how this works, and how every tenant app gets its own home directory by accessing the same path? The answer lies in the PreFilterOnCreateCallback function. We previously mentioned the SandboxSettings structure; one of its properties is called sandboxRemotePath, which contains a UNC file share path to the storage location of the app. DWASSVC sets this path at the start of the IIS worker process by communicating with the driver through the exposed filter port (FltPort). Therefore, when the app tries to access D:\home or other special paths, the filter driver matches and replaces them with the correct ones on the fly.
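The redirection idea can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the logic of PreFilterOnCreateCallback: the redirect_path function, the case-insensitive prefix match, and the example share path are all hypothetical.

```python
def redirect_path(requested: str, sandbox_remote_path: str,
                  virtual_root: str = "D:\\home") -> str:
    """Map a path under the virtual root onto the tenant's UNC share.

    Paths outside the virtual root are returned unchanged.
    """
    normalized = requested.replace("/", "\\")
    root = virtual_root.rstrip("\\")
    remote = sandbox_remote_path.rstrip("\\")

    # Exact match on the root itself (Windows paths are case-insensitive).
    if normalized.lower() == root.lower():
        return remote

    # Match only on a full path component, so D:\homework is not rewritten.
    prefix = root + "\\"
    if normalized.lower().startswith(prefix.lower()):
        return remote + "\\" + normalized[len(prefix):]

    return requested


# Hypothetical share path for one tenant.
share = r"\\fileserver\share\tenant1"
rewritten = redirect_path(r"D:\home\site\wwwroot", share)
```

Two tenants calling this with different sandbox_remote_path values would open different storage locations through the same D:\home path, which is the effect the text describes.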

The other callbacks implement file system limitations like disk quota, directory/file count and more.

Network Filter

The network filter implements some security mechanisms to prevent data leakage or increased attack surface. It has port whitelisting and local port filtering. For example, your app can’t connect to local ports it didn’t listen on. It also has maximum connection limitation, network range IP whitelist, disabled raw sockets and more.
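As a rough illustration of the kind of checks listed above, the following Python sketch models a connection policy. It is an assumption-laden toy, not the RsFilter implementation: the NetworkPolicy class, the default limit of 100 connections, and the treatment of "local" as the loopback address are hypothetical, and the example ports are arbitrary.

```python
import ipaddress
from dataclasses import dataclass, field
from typing import Set


@dataclass
class NetworkPolicy:
    # SMB-related ports, as mentioned in the text.
    blocked_remote_ports: Set[int] = field(default_factory=lambda: {139, 445})
    # Ports this sandbox itself is listening on.
    own_listening_ports: Set[int] = field(default_factory=set)
    # Hypothetical limit, for illustration only.
    max_connections: int = 100
    open_connections: int = 0

    def allow_connect(self, dest_ip: str, dest_port: int) -> bool:
        # Maximum connection limit.
        if self.open_connections >= self.max_connections:
            return False
        # Local port filtering: only ports the app listens on are reachable.
        if ipaddress.ip_address(dest_ip).is_loopback:
            return dest_port in self.own_listening_ports
        # Port blacklist for remote destinations.
        return dest_port not in self.blocked_remote_ports


policy = NetworkPolicy(own_listening_ports={8080})
```

With this policy, a loopback connection to port 8080 is allowed, while a loopback connection to a port the app did not open, or a remote connection to port 445, is refused.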

In the SMB 445 and 139 port blacklist (WannaCry anyone?), can someone spot the possible bug? 🙂

User Mode Sandbox

Each IIS worker process loads a DLL called RsHelper.dll. It’s compiled with Microsoft’s Detours library and hooks a lot of functions from DLLs like kernel32.dll, advapi32.dll, ws2_32.dll, httpapi.dll, ntdll.dll and more. The interesting part is that it also hooks the CreateProcessA/W functions. When these functions are called, it uses a well-known DLL injection technique (CreateRemoteThread) and injects itself into the created process. This is the first time we saw a legitimate use of DLL injection. Looking at the hooked functions, we can see that Microsoft implemented this mainly to prevent tenant apps from obtaining internal information about the web worker machines that they don’t need to know.

For more information about the App Service Sandbox, see this reference article that explains its capabilities. We can confirm we’ve seen most of those features implemented. However, there were some we didn’t see, for example, the Win32k.sys (User32/GDI32) restrictions. That said, they might actually be implemented on the public cloud.

After understanding how App Service works, we started looking for vulnerabilities from a local attack scenario.

Vulnerability Details

In this section, we share the details of a vulnerability we found in DWASSVC that, when exploited, allowed us to execute code as NT AUTHORITY\SYSTEM.

During our research, we used Process Explorer (from the Sysinternals Suite) to examine the running processes and see how they are executed: with what command line, which modules, and so on. We encountered some interesting parameters in the command line:

You can see the ‘-a’ parameter is supplied with a named pipe path. This raised some questions: what data is sent over this pipe, and can we influence it? To answer this, we first need to understand who created this pipe. DWASSVC is the service that starts the workers. As it’s written in C#, we used a decompiler to look at its code. Before a new worker is created, the service first creates a named pipe to be able to communicate with it. We can see this in the decompiled sources of Microsoft.Web.Hosting.ProcessModel.dll at WorkerProcess.cs:Start:

CreateIpmPipe is a native function implemented in DWASInterop.dll:

To be able to “dive” deeper into DWASInterop.dll, we had to reverse engineer it completely. It took us some time because there were no public symbols, and it’s written in C++. However, there were many debug strings that disclosed function names, and we also noticed that this DLL shares code with iisutil.dll from IIS (which has public symbols). Diffing them helped us in the reverse engineering process.

Let’s look at the internal implementation of CreateIpmPipe:

We can see the call to the internal CreateIpmMessagePipe function:

CreateIpmMessagePipe calls CreateNamedPipeW, which creates a named pipe. Looking at the parameters, it uses PIPE_ACCESS_DUPLEX, which means the pipe is bi-directional and both the server (DWASSVC) and the client (w3wp.exe) can read from and write to it. The flow then finishes and returns to the C# program. After it returns, DWASSVC starts the worker process and passes the pipe name as a parameter (the -a flag). Once the worker has started, it connects to the pipe, and communication begins.

So now we know how the named pipe is created, but for what purpose? Inter-process communication. For example, there is a “Worker Shutdown Request” message: instead of brutally killing the worker process, DWASSVC can send it a shutdown request. There are many other examples as well. With that said, the protocol implementation interested us, and we wanted to see if we could find any vulnerabilities.

This is the structure of the messages the worker can send to the DWASSVC service and which we called WorkerItem:

Field | Type | Description
opcode | DWORD (4 bytes) | The operation code
dataLength | DWORD (4 bytes) | The length of the data
data | variable | The actual data
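The layout above can be expressed with Python's struct module. This is a minimal sketch that assumes little-endian DWORDs (as on x86/x64 Windows) and no padding between fields; the function names pack_worker_item and parse_worker_item are ours and do not come from DWASInterop.dll.

```python
import struct

# Two unsigned 32-bit little-endian integers: opcode, dataLength.
HEADER = struct.Struct("<II")


def pack_worker_item(opcode: int, data: bytes) -> bytes:
    """Build a well-formed message: dataLength always matches the payload."""
    return HEADER.pack(opcode, len(data)) + data


def parse_worker_item(raw: bytes):
    """Split a message into (opcode, data), rejecting inconsistent lengths."""
    if len(raw) < HEADER.size:
        raise ValueError("message is shorter than the 8-byte header")
    opcode, data_length = HEADER.unpack_from(raw)
    data = raw[HEADER.size:]
    if data_length != len(data):
        raise ValueError("dataLength does not match the number of data bytes")
    return opcode, data


message = pack_worker_item(0x16, b"abc")
```

A message built this way is 8 header bytes plus the payload, so the declared dataLength and the number of bytes on the wire cannot disagree.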

When DWASSVC receives a message (in DWASInterop.dll), the IPM_MESSAGE_PIPE::MessagePipeCompletion callback is called. The first and only parameter passed to it is this, which is an IPM_MESSAGE_IMP instance.

The IPM_MESSAGE_IMP is a special class that DWASInterop uses to describe messages. It doesn’t have a lot of fields. Here is a snippet of the reverse-engineered class:

It contains a pointer to the workerItem and also has a property called workerItemSize, which is the size of the workerItem. The workerItemSize holds the complete size of the workerItem (opcode + length + data), whereas workerItem.dataLength holds only the data length.

When IPM_MESSAGE_PIPE::MessagePipeCompletion receives a message, there is an interesting edge case:

There is a call to the ReallocateWorkerItem function when the data is read.

There is a simple allocation of a new worker item, and then a copy of the previous data to the new structure.

When this function is called, workerItem->dataLength is passed to it, and the new worker item is allocated with that size. However, the memcpy is performed with workerItemSize. The two sizes are calculated automatically when the message is sent through the exported API of DWASInterop.dll or iisutil.dll (the WriteMessage API function). However, if an attacker can write directly to the named pipe, they can send a message like the following:

Field | Type | Value
opcode | DWORD (4 bytes) | 0x16 (at this stage, the value doesn’t matter)
dataLength | DWORD (4 bytes) | 0
data | variable | A * 100 (a long string)

The workerItemSize is calculated as 108 (4 + 4 + 100) and workerItem->dataLength is 0. In this case, the allocation with size 0 succeeds, and then a memcpy of 108 bytes is performed into the allocated area, resulting in a heap-based overflow with controlled content and size!
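The arithmetic can be checked with a short Python model. It only reproduces the two sizes described in the text and shows a consistency check that a receiver could apply; it does not touch a pipe, it is not the service's code, and it is not a description of Microsoft's fix. The function names and the little-endian encoding are assumptions.

```python
import struct

HEADER_SIZE = 8  # opcode (4 bytes) + dataLength (4 bytes)


def describe(raw: bytes) -> dict:
    """Report the declared data length and the size actually received."""
    opcode, data_length = struct.unpack_from("<II", raw)
    return {
        "opcode": opcode,
        "dataLength": data_length,    # taken from the sender-controlled header
        "workerItemSize": len(raw),   # total bytes read from the pipe
    }


def sizes_agree(raw: bytes) -> bool:
    """True when the header's dataLength matches the received payload size."""
    info = describe(raw)
    return info["workerItemSize"] == HEADER_SIZE + info["dataLength"]


# The message from the text: dataLength says 0, but 100 data bytes follow.
crafted = struct.pack("<II", 0x16, 0) + b"A" * 100
# A well-formed message with the same payload, for comparison.
well_formed = struct.pack("<II", 0x16, 100) + b"A" * 100

crafted_info = describe(crafted)
```

For the crafted message, dataLength is 0 while workerItemSize is 108, so any code that sizes a buffer from the first value and copies according to the second writes past the end of the buffer. Rejecting messages for which sizes_agree returns False is one way to rule out that mismatch.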

So how can an attacker send a message to DWASSVC (DWASInterop.dll)? By design, a C# Azure function runs in the context of the worker (w3wp.exe). This gives an attacker a way to enumerate the currently open handles, find the already opened named pipe handle, and send a specially crafted message. Here is how we triggered the vulnerability:

We created a C# Azure function which loads a native DLL and calls the load function.

The load function brute forces the handles until it finds an open one whose name starts with “iisipm”. Then it constructs the malicious message and sends it immediately. As a result, DWASSVC crashes.

Although we only demonstrated a crash, this vulnerability could be exploited to achieve privilege escalation.

Impact

Microsoft has various App Service plans:

  • Shared compute: Free and Shared, the two base tiers, run an app on the same Azure VM as other App Service apps, including apps of other customers. These tiers allocate CPU quotas to each app that runs on the shared resources, and the resources cannot scale out.
  • Dedicated compute: The Basic, Standard, Premium, and PremiumV2 tiers run apps on dedicated Azure VMs. Only apps in the same App Service plan share the same compute resources. The higher the tier, the more VM instances are available to you for scale-out.
  • Isolated: This tier runs dedicated Azure VMs on dedicated Azure Virtual Networks. It provides network isolation on top of compute isolation to your apps and provides the maximum scale-out capabilities.

For more information, see: https://docs.microsoft.com/en-us/azure/app-service/overview-hosting-plans

Exploiting this vulnerability in any of the plans could allow us to compromise Microsoft’s App Service infrastructure. However, exploiting it specifically on a Free/Shared plan could also allow us to compromise other tenants’ apps, data, and accounts, thus breaking the security model of App Service.

Conclusion

The cloud is not a magical place. Although it is considered safe, it is ultimately an infrastructure that consists of code that can have vulnerabilities – just as we demonstrated in this article.

This vulnerability was disclosed to and fixed by Microsoft, and was assigned CVE-2019-1372.
Microsoft acknowledged that this vulnerability worked on Azure Cloud and Azure Stack.