Exploiting SMBGhost (CVE-2020-0796) for a Local Privilege Escalation: Writeup + POC

Introduction

CVE-2020-0796 is a bug in the compression mechanism of SMBv3.1.1, also known as “SMBGhost”. The bug affects Windows 10 versions 1903 and 1909, and it was announced and patched by Microsoft about three weeks ago. Once we heard about it, we skimmed over the details and created a quick POC (proof of concept) that demonstrates how the bug can be triggered remotely, without authentication, by causing a BSOD (Blue Screen of Death). A couple of days ago we returned to this bug for more than just a remote DoS. The Microsoft Security Advisory describes the bug as a remote code execution (RCE) vulnerability, but there is no public POC that demonstrates RCE through this bug.

Initial Analysis

The bug is an integer overflow bug that happens in the Srv2DecompressData function in the srv2.sys SMB server driver. Here’s a simplified version of the function, with the irrelevant details omitted:

typedef struct _COMPRESSION_TRANSFORM_HEADER
{
    ULONG ProtocolId;
    ULONG OriginalCompressedSegmentSize;
    USHORT CompressionAlgorithm;
    USHORT Flags;
    ULONG Offset;
} COMPRESSION_TRANSFORM_HEADER, *PCOMPRESSION_TRANSFORM_HEADER;

typedef struct _ALLOCATION_HEADER
{
    // ...
    PVOID UserBuffer;
    // ...
} ALLOCATION_HEADER, *PALLOCATION_HEADER;

NTSTATUS Srv2DecompressData(PCOMPRESSION_TRANSFORM_HEADER Header, SIZE_T TotalSize)
{
    PALLOCATION_HEADER Alloc = SrvNetAllocateBuffer(
        (ULONG)(Header->OriginalCompressedSegmentSize + Header->Offset),
        NULL);
    if (!Alloc) {
        return STATUS_INSUFFICIENT_RESOURCES;
    }

    ULONG FinalCompressedSize = 0;

    NTSTATUS Status = SmbCompressionDecompress(
        Header->CompressionAlgorithm,
        (PUCHAR)Header + sizeof(COMPRESSION_TRANSFORM_HEADER) + Header->Offset,
        (ULONG)(TotalSize - sizeof(COMPRESSION_TRANSFORM_HEADER) - Header->Offset),
        (PUCHAR)Alloc->UserBuffer + Header->Offset,
        Header->OriginalCompressedSegmentSize,
        &FinalCompressedSize);
    if (Status < 0 || FinalCompressedSize != Header->OriginalCompressedSegmentSize) {
        SrvNetFreeBuffer(Alloc);
        return STATUS_BAD_DATA;
    }

    if (Header->Offset > 0) {
        memcpy(
            Alloc->UserBuffer,
            (PUCHAR)Header + sizeof(COMPRESSION_TRANSFORM_HEADER),
            Header->Offset);
    }

    Srv2ReplaceReceiveBuffer(some_session_handle, Alloc);
    return STATUS_SUCCESS;
}

The Srv2DecompressData function receives the compressed message sent by the client, allocates the required amount of memory, and decompresses the data. Then, if the Offset field is not zero, it copies the raw data that precedes the compressed data, unchanged, to the beginning of the allocated buffer.
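To make the message layout concrete, here is a minimal Python sketch that splits a received message into the header fields, the raw prefix and the compressed data. The field order and sizes come from the struct above; the helper name parse_message and the ProtocolId byte sequence (0xFC 'S' 'M' 'B') are assumptions added for illustration.

```python
import struct

# Field layout copied from the COMPRESSION_TRANSFORM_HEADER struct above:
# ULONG, ULONG, USHORT, USHORT, ULONG (little-endian, no padding).
HEADER_FORMAT = "<IIHHI"
HEADER_SIZE = struct.calcsize(HEADER_FORMAT)

# Assumption: the ProtocolId bytes on the wire are 0xFC 'S' 'M' 'B'.
PROTOCOL_ID = int.from_bytes(b"\xfcSMB", "little")


def parse_message(message):
    """Hypothetical helper: split a message the way Srv2DecompressData views it."""
    protocol_id, original_size, algorithm, flags, offset = struct.unpack_from(
        HEADER_FORMAT, message, 0)
    return {
        "protocol_id": protocol_id,
        "original_compressed_segment_size": original_size,
        "compression_algorithm": algorithm,
        "flags": flags,
        "offset": offset,
        # Offset bytes of raw data sit between the header and the compressed data.
        "raw_prefix": message[HEADER_SIZE:HEADER_SIZE + offset],
        "compressed": message[HEADER_SIZE + offset:],
    }
```

Note that both OriginalCompressedSegmentSize and Offset are taken from the message as is, which is why they are the interesting fields in the rest of the analysis.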

If we look carefully, we can see that two calculations can overflow for certain inputs: the allocation size passed to SrvNetAllocateBuffer (OriginalCompressedSegmentSize + Offset) and the compressed data size passed to SmbCompressionDecompress (TotalSize - sizeof(COMPRESSION_TRANSFORM_HEADER) - Offset). For example, most POCs that appeared shortly after the bug was published and crashed the system simply used the value 0xFFFFFFFF for the Offset field. That value triggers an integer overflow in the allocation size calculation, and as a result fewer bytes are allocated than required.

Later, it triggers a second integer overflow in the compressed data size calculation. The crash happens due to a memory access at the compressed data address ((PUCHAR)Header + sizeof(COMPRESSION_TRANSFORM_HEADER) + Header->Offset), which is far away from the received message. If the code verified the size calculation, it would bail out early, since the buffer length turns out to be negative and cannot be represented, which also makes the computed source address invalid.
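The two overflows can be reproduced with a few lines of Python that emulate the 32-bit truncation of the (ULONG) casts. This is only an arithmetic sketch; the function names and the sample sizes (0x100 and 0x40) are made up for illustration.

```python
MASK32 = 0xFFFFFFFF
HEADER_SIZE = 16  # sizeof(COMPRESSION_TRANSFORM_HEADER)


def requested_alloc_size(original_size, offset):
    """(ULONG)(OriginalCompressedSegmentSize + Offset): wraps modulo 2**32."""
    return (original_size + offset) & MASK32


def compressed_data_size(total_size, offset):
    """Return the untruncated value and its (ULONG) truncation."""
    raw = total_size - HEADER_SIZE - offset
    return raw, raw & MASK32


# Offset = 0xFFFFFFFF, as used by the early crash POCs.
alloc = requested_alloc_size(0x100, 0xFFFFFFFF)
raw, truncated = compressed_data_size(0x40, 0xFFFFFFFF)
print(hex(alloc), raw < 0, hex(truncated))
```

With these sample values the requested allocation shrinks to 0xff bytes, and the compressed data size is negative before truncation, while the source pointer is moved roughly 4 GB past the received message.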

Choosing what to overflow

There are only two relevant fields that we can control to cause an integer overflow: OriginalCompressedSegmentSize and Offset, so there aren’t that many options. After trying several combinations, the following combination caught our eye: what if we send a legit Offset value and a huge OriginalCompressedSegmentSize value? Let’s go over the three steps the code is going to execute:

  1. Allocate: The amount of allocated bytes will be smaller than the sum of both fields due to the integer overflow.
  2. Decompress: The decompression will receive a huge OriginalCompressedSegmentSize value, treating the target buffer as having practically limitless size. All other parameters are unaffected, so it will work as expected.
  3. Copy: If it’s ever going to be executed (will it?), the copy will work as expected.

Whether or not the Copy step is executed, this already looks interesting: we can trigger an out-of-bounds write in the Decompress stage, since we managed to allocate fewer bytes than necessary in the Allocate stage.
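The three steps can be sketched numerically for one such combination. The values below (Offset = 0x10 and OriginalCompressedSegmentSize = 0xFFFFFFFF) are illustrative choices, not necessarily the ones used in our exploit, and plan_steps is a hypothetical helper.

```python
MASK32 = 0xFFFFFFFF


def plan_steps(original_size, offset):
    """Summarize the sizes each of the three steps works with."""
    return {
        # Allocate: the sum is truncated to 32 bits before the allocation.
        "alloc_request": (original_size + offset) & MASK32,
        # Decompress: the target is assumed to have room for this many bytes.
        "decompress_limit": original_size,
        # Copy: Offset bytes of raw data go to the start of the buffer.
        "copy_length": offset,
    }


steps = plan_steps(original_size=0xFFFFFFFF, offset=0x10)
for name, value in steps.items():
    print(f"{name}: {value:#x}")
```

The requested allocation (0xf bytes) ends up far smaller than the size the decompression is allowed to write, which is what makes the out-of-bounds write possible.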

As you can see, using this technique we can trigger an overflow of any size and content, which is a great start. But what is located beyond our buffer? Let’s find out!

Diving into SrvNetAllocateBuffer

To answer this question, we need to look at the allocation function, in our case SrvNetAllocateBuffer. Here is the interesting part of the function:

PALLOCATION_HEADER SrvNetAllocateBuffer(SIZE_T AllocSize, PALLOCATION_HEADER SourceBuffer)
{
    // ...

    if (SrvDisableNetBufferLookAsideList || AllocSize > 0x100100) {
        if (AllocSize > 0x1000100) {
            return NULL;
        }
        Result = SrvNetAllocateBufferFromPool(AllocSize, AllocSize);
    } else {
        int LookasideListIndex = 0;
        if (AllocSize > 0x1100) {
            LookasideListIndex = /* some calculation based on AllocSize */;
        }

        SOME_STRUCT list = SrvNetBufferLookasides[LookasideListIndex];
        Result = /* fetch result from list */;
    }

    // Initialize some Result fields...

    return Result;
}

We can see that the allocation function does different things depending on the required amount of bytes. Large allocations (larger than about 16 MB) just fail. Medium allocations (larger than about 1 MB) use the SrvNetAllocateBufferFromPool function for the allocation. Small allocations (the rest) use lookaside lists for optimization.

Note: There’s also the SrvDisableNetBufferLookAsideList flag which can affect the functionality of the function, but it’s set by an undocumented registry setting and is disabled by default, so it’s not very interesting.
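The size-based branching can be summarized with a small Python model. The thresholds are taken from the listing above; the function name and the returned labels are our own shorthand.

```python
LOOKASIDE_MAX = 0x100100   # largest size served from the lookaside lists
POOL_MAX = 0x1000100       # largest size that can be allocated at all


def allocation_path(alloc_size, lookaside_disabled=False):
    """Return which branch of SrvNetAllocateBuffer a request would take."""
    if lookaside_disabled or alloc_size > LOOKASIDE_MAX:
        if alloc_size > POOL_MAX:
            return "fail"        # large: the function returns NULL
        return "pool"            # medium: SrvNetAllocateBufferFromPool
    return "lookaside"           # small: served from a lookaside list


for size in (0xF, 0x100100, 0x100101, 0x1000101):
    print(hex(size), allocation_path(size))
```

A request that was shrunk by the integer overflow to just a few bytes therefore takes the lookaside path.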

Lookaside lists effectively reserve a set of reusable, fixed-size buffers for the driver. One of their capabilities is defining custom allocation/free functions that are used for managing the buffers. Looking at references to the SrvNetBufferLookasides array, we found that it’s initialized in the SrvNetCreateBufferLookasides function, and by looking at it we learned the following:

  • The custom allocation function is defined as SrvNetBufferLookasideAllocate, which just calls SrvNetAllocateBufferFromPool.
  • 9 lookaside lists are created with the following sizes, as we quickly calculated with Python:
    >>> [hex((1 << (i + 12)) + 256) for i in range(9)]
    ['0x1100', '0x2100', '0x4100', '0x8100', '0x10100', '0x20100', '0x40100', '0x80100', '0x100100']

    It matches our finding that allocations larger than 0x100100 bytes are allocated without using lookaside lists.
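As a sketch of how a request might map to one of these lists, the snippet below picks the smallest list whose size fits. The listing above only says that the index is "some calculation based on AllocSize", so the smallest-fit rule and the lookaside_index helper are assumptions; the only confirmed detail is that sizes up to 0x1100 use index 0.

```python
# The nine list sizes, computed the same way as in the snippet above.
LOOKASIDE_SIZES = [(1 << (i + 12)) + 256 for i in range(9)]


def lookaside_index(alloc_size):
    """Assumed smallest-fit lookup; returns None if no list is large enough."""
    for index, size in enumerate(LOOKASIDE_SIZES):
        if alloc_size <= size:
            return index
    return None


print(lookaside_index(0xF), lookaside_index(0x1101), lookaside_index(0x100101))
```

Under this assumption, a tiny request produced by the overflow would be served from the first list, whose buffers are 0x1100 bytes.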

The conclusion is that every allocation request ends up in the SrvNetAllocateBufferFromPool function, so let’s take a look at it.

SrvNetAllocateBufferFromPool and the allocated buffer layout

The SrvNetAllocateBufferFromPool function allocates a buffer in the NonPagedPoolNx pool using the ExAllocatePoolWithTag function, and then fills some of the structures with data. In the layout of the allocated buffer, the user buffer is followed by the ALLOCATION_HEADER struct.

The only relevant parts of this layout for the scope of our research are the user buffer and the ALLOCATION_HEADER struct. We can see right away that by overflowing the user buffer, we end up overwriting the ALLOCATION_HEADER struct. Looks very convenient.

Overwriting the ALLOCATION_HEADER struct

Our first thought at this point was that due to the check that follows the SmbCompressionDecompress call:

if (Status < 0 || FinalCompressedSize != Header->OriginalCompressedSegmentSize) {
    SrvNetFreeBuffer(Alloc);
    return STATUS_BAD_DATA;
}

SrvNetFreeBuffer would be called and the function would fail, since we crafted OriginalCompressedSegmentSize to be a huge number, and FinalCompressedSize would be a smaller number representing the actual amount of decompressed bytes. So we analyzed the SrvNetFreeBuffer function, managed to replace the allocation pointer with a magic number, and waited for the free function to try to free it, hoping to leverage it later for a use-after-free or similar. But to our surprise, we got a crash in the memcpy function. That made us happy, since we hadn’t hoped to get there at all, but we had to check why it happened. The explanation can be found in the implementation of the SmbCompressionDecompress function:

NTSTATUS SmbCompressionDecompress(
    USHORT CompressionAlgorithm,
    PUCHAR UncompressedBuffer,
    ULONG  UncompressedBufferSize,
    PUCHAR CompressedBuffer,
    ULONG  CompressedBufferSize,
    PULONG FinalCompressedSize)
{
    // ...

    NTSTATUS Status = RtlDecompressBufferEx2(
        ...,
        FinalUncompressedSize,
        ...);
    if (Status >= 0) {
        *FinalCompressedSize = CompressedBufferSize;
    }

    // ...

    return Status;
}

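Basically, if the decompression succeeds, FinalCompressedSize is updated to hold the value of CompressedBufferSize, which is the size of the buffer. In the call from Srv2DecompressData, that argument is Header->OriginalCompressedSegmentSize, so the FinalCompressedSize != Header->OriginalCompressedSegmentSize check passes whenever the decompression succeeds. This deliberate update of the FinalCompressedSize return value seemed quite suspicious to us, since this little detail, together with the allocated buffer layout, allows for a very convenient exploitation of this bug.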
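The effect on the size check can be modeled in a few lines of Python. This is a simplified stand-in for the two functions, not a reimplementation: the decompression itself is reduced to a success flag, and the negative status value is a placeholder rather than a real NTSTATUS code.

```python
STATUS_SUCCESS = 0
STATUS_FAILURE = -1  # placeholder negative status, not a real NTSTATUS value


def smb_compression_decompress(buffer_size_argument, decompression_ok=True):
    """Model of the listing: on success, report the size that was passed in."""
    if decompression_ok:
        return STATUS_SUCCESS, buffer_size_argument
    return STATUS_FAILURE, 0


def passes_size_check(original_compressed_segment_size, decompression_ok=True):
    """Model of the check that follows the call in Srv2DecompressData."""
    status, final_size = smb_compression_decompress(
        original_compressed_segment_size, decompression_ok)
    return not (status < 0 or final_size != original_compressed_segment_size)


print(passes_size_check(0xFFFFFFFF))
print(passes_size_check(0xFFFFFFFF, decompression_ok=False))
```

In this model the check only fails when the decompression itself fails, regardless of how many bytes were actually written.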

Since the execution continues to the stage of copying the raw data, let’s review the call once again:

memcpy(
    Alloc->UserBuffer,
    (PUCHAR)Header + sizeof(COMPRESSION_TRANSFORM_HEADER),
    Header->Offset);

The target address is read from the ALLOCATION_HEADER struct, the one that we can overwrite. The content and the size of the buffer are controlled by us as well. Jackpot! Write-what-where in the kernel, remotely!
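The whole chain can be illustrated with a toy model in which a Python bytearray stands in for memory. Everything here is hypothetical: the addresses, the buffer size and the position of the UserBuffer pointer inside the header are invented for the simulation, and no real decompression or kernel memory is involved.

```python
import struct

# Toy layout inside one bytearray (all values are made up for this model).
USER_BUFFER_ADDR = 0x100
USER_BUFFER_SIZE = 0x20
HEADER_ADDR = USER_BUFFER_ADDR + USER_BUFFER_SIZE   # header follows the buffer
POINTER_FIELD_ADDR = HEADER_ADDR + 0x8              # hypothetical field offset


def build_overflow(offset, where):
    """Decompressed bytes that reach the pointer field and replace it."""
    start = USER_BUFFER_ADDR + offset
    padding = POINTER_FIELD_ADDR - start
    return b"A" * padding + struct.pack("<Q", where)


def simulate(memory, raw_prefix, decompressed):
    """Run the Decompress and Copy steps against the toy memory."""
    offset = len(raw_prefix)
    # The allocator stores the user buffer address in the header.
    struct.pack_into("<Q", memory, POINTER_FIELD_ADDR, USER_BUFFER_ADDR)
    # Decompress: written at UserBuffer + Offset with no effective bound.
    start = USER_BUFFER_ADDR + offset
    memory[start:start + len(decompressed)] = decompressed
    # Copy: the destination is read back from the (now overwritten) header.
    (target,) = struct.unpack_from("<Q", memory, POINTER_FIELD_ADDR)
    memory[target:target + offset] = raw_prefix
    return target
```

Calling simulate(memory, b"WHAT", build_overflow(4, 0x300)) on a bytearray(0x400) places the four "what" bytes at the chosen "where" address instead of at the start of the user buffer.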

Remote write-what-where implementation

We did a quick implementation of a Write-What-Where CVE-2020-0796 Exploit in Python, which is based on the CVE-2020-0796 DoS POC of maxpl0it. The code is fairly short and straightforward.

Local Privilege Escalation

Now that we have the write-what-where exploit, what can we do with it? Obviously we can crash the system. We might be able to trigger remote code execution, but we haven’t found a way to do that yet. If we use the exploit on localhost and leak additional information, we can use it for local privilege escalation, which has already been shown to be possible using several techniques.

The first technique we tried was proposed by Morten Schenk in his Black Hat USA 2017 talk. The technique involves overwriting a function pointer in the .data section of the win32kbase.sys driver, and then calling the appropriate function from user mode to gain code execution. j00ru wrote a great writeup about using this technique in WCTF 2018, and provided his exploit source code. We adjusted it for our write-what-where exploit, but found out that it doesn’t work since the thread that handles the SMB messages is not a GUI thread. Due to this, win32kbase.sys is not mapped, and the technique is not relevant (unless there’s a way to make it a GUI thread, something we didn’t research).

We ended up using the well-known technique covered by cesarcer in 2012 in his Black Hat presentation Easy Local Windows Kernel Exploitation. The technique involves leaking the current process token address by using the NtQuerySystemInformation(SystemHandleInformation) API, and then overwriting it, granting the current process token privileges that can then be used for privilege escalation. The Abusing Token Privileges For EoP research by Bryan Alexander (dronesec) and Stephen Breen (breenmachine) (2017) demonstrates several ways of using various token privileges for privilege escalation.
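To give an idea of what "granting token privileges" means at the data level, here is a small bitmask sketch. It assumes that the token keeps its privileges as 64-bit masks (one for present privileges and one for enabled ones) in which bit n corresponds to the privilege with LUID n, and that SeDebugPrivilege has LUID 20; these details are not taken from the text above, and the helper name is hypothetical. The sketch only computes the values and does not touch any real token.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF

# Assumption: LUID value of SeDebugPrivilege.
SE_DEBUG_PRIVILEGE = 20


def enable_privileges(present, enabled, luids):
    """Return new (present, enabled) masks with the given privileges set."""
    for luid in luids:
        bit = 1 << luid
        present |= bit
        enabled |= bit
    return present & MASK64, enabled & MASK64


present, enabled = enable_privileges(0, 0, [SE_DEBUG_PRIVILEGE])
print(hex(present), hex(enabled))
```

With a write-what-where primitive, values like these are what would be written over the corresponding fields of the leaked token.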

We based our exploit on the code that Alexandre Beaulieu kindly shared in his Exploiting an Arbitrary Write to Escalate Privileges writeup. We completed the privilege escalation after modifying our process’ token privileges by injecting a DLL into winlogon.exe. The DLL’s whole purpose is to launch a privileged instance of cmd.exe. Our complete Local Privilege Escalation Proof of Concept can be found here and is available for research / defensive purposes only.

Summary

We managed to demonstrate that the CVE-2020-0796 vulnerability can be exploited for local privilege escalation. Note that our exploit is limited to the medium integrity level, since it relies on API calls that are unavailable at lower integrity levels. Can we do more than that? Maybe, but it will require more research. There are many other fields that we can overwrite in the allocated buffer; perhaps one of them can help us achieve other interesting things, such as remote code execution.

POC Source Code

Remediation

  1. We recommend updating servers and endpoints to the latest Windows version to remediate this vulnerability. If possible, block port 445 until updates are deployed. Regardless of CVE-2020-0796, we recommend enabling host-isolation where possible.
  2. It is possible to disable SMBv3.1.1 compression in order to avoid triggering this bug; however, we recommend applying the full update instead if possible.

ZecOps Customers & Partners

ZecOps Digital Forensics and Incident Response (DFIR) customers can detect such exploitation attempts as “CVE-2020-0796” using ZecOps agentless solution: Neutrino for Servers and Endpoints. To try ZecOps technology and see a demo, you can contact us here

Researchers wanted

At ZecOps we’re working on offensive cyber security research for defensive purposes. We are hiring additional researchers & exploit developers in various platforms including iOS and Windows. If you are interested, contact us here.
