CVE-2018-4038
An exploitable arbitrary write vulnerability exists in the open document format parser of the Atlantis Word Processor, version 3.2.7.2, while trying to null-terminate a string. A specially crafted document can allow an attacker to pass an untrusted value as a length to a constructor. This constructor will miscalculate a length and then use it to calculate the position to write a null byte. This can allow an attacker to corrupt memory, which can result in code execution under the context of the application. An attacker must convince a victim to open a specially crafted document in order to trigger this vulnerability.
Atlantis Word Processor 3.2.7.1, 3.2.7.2
https://www.atlantiswordprocessor.com/en/
8.8 - CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H
CWE-131: Incorrect Calculation of Buffer Size
Atlantis Word Processor is a traditional word processor that provides a number of useful features for a variety of users. The software is fully compatible with other word processors, such as Microsoft Office Word 2007, and even has a similar interface to Microsoft Word. Atlantis also has the ability to encrypt document files and fully customize the interface. This application is written in Delphi and contains the majority of its capabilities within a single relocatable binary.
When opening up a document that follows the open document format specification, the application will first fingerprint it in order to determine the correct file format parser, which is performed by the following code. This code will first fetch the current TDoc object and then check one of its fields that represents the current file format enumeration. When the open document format is selected, this field will have the value “3,” which results in the execution of case 3. At [1], the application will then execute a function that fingerprints and continues parsing the document.
awp+0x1b3139:
005b3139 8b45e8 mov eax,dword ptr [ebp-18h] // TDoc
005b313c 8b80dc000000 mov eax,dword ptr [eax+0DCh] // TDoc.fileFormatEnumeration
005b3142 83f805 cmp eax,5
005b3145 776a ja awp+0x1b31b1 (005b31b1)
005b3147 ff24854e315b00 jmp dword ptr awp+0x1b314e (005b314e)[eax*4]
...
awp+0x1b3193:
005b3193 55 push ebp // Case 3
005b3194 e8fbc0ffff call awp+0x1af294 (005af294) // [1]
005b3199 59 pop ecx
005b319a 8885d7f8ffff mov byte ptr [ebp-729h],al
005b31a0 eb1c jmp awp+0x1b31be (005b31be)
When processing the document, the application will execute the function at 0x5af294. After initializing a couple of data structures and creating an instance of the TUnpackedZip object, the function will parse a couple of XML files in the document. At [2], the application will extract the “content.xml” file from the document, and then parse the file with the function call at [3]. Once this is performed, the function will then later navigate through the different elements in the “content.xml” file.
awp+0x1af294:
005af294 55 push ebp
005af295 8bec mov ebp,esp
005af297 81c4d8feffff add esp,0FFFFFED8h
005af29d 53 push ebx
005af29e 56 push esi
005af29f 57 push edi
005af2a0 33c0 xor eax,eax
...
005af3d5 8b17 mov edx,dword ptr [edi]
005af3d7 8d85e0feffff lea eax,[ebp-120h] // Filename
005af3dd b960f95a00 mov ecx,offset awp+0x1af960 (005af960)
005af3e2 e8c141e5ff call awp+0x35a8 (004035a8) // LStrCat3
005af3e7 8b95e0feffff mov edx,dword ptr [ebp-120h] // Filename
005af3ed 8b4508 mov eax,dword ptr [ebp+8] // Frame
005af3f0 8b805cf9ffff mov eax,dword ptr [eax-6A4h] // TUnpackedZip
005af3f6 33c9 xor ecx,ecx
005af3f8 e89301f2ff call awp+0xcf590 (004cf590) // [2] Extract file from ZIP
005af3fd 8b17 mov edx,dword ptr [edi]
005af3ff 8d85e0feffff lea eax,[ebp-120h] // Filename
005af405 b960f95a00 mov ecx,offset awp+0x1af960 (005af960)
005af40a e89941e5ff call awp+0x35a8 (004035a8) // LStrCat3
005af40f 8b85e0feffff mov eax,dword ptr [ebp-120h] // Filename
005af415 8b95e8feffff mov edx,dword ptr [ebp-118h]
005af41b e830f1f1ff call awp+0xce550 (004ce550) // [3] Parse XML
005af420 8945f8 mov dword ptr [ebp-8],eax // Stored here
Later, inside the same function, the application uses the function calls at [4] to descend through the different nodes within the parsed “content.xml” document. The first element that the application searches for is labelled as “office:body.” Immediately after locating this element, it searches for “office:text.” This element is then immediately passed through the %eax register to the function call at [5] in order for the application to process its children.
awp+0x1af679:
005af679 55 push ebp
005af67a ba40fb5a00 mov edx,offset awp+0x1afb40 (005afb40) // "office:body"
005af67f 8b45f8 mov eax,dword ptr [ebp-8] // content.xml
005af682 e825dcf1ff call awp+0xcd2ac (004cd2ac) // [4] Search through XML document for element
005af687 ba54fb5a00 mov edx,offset awp+0x1afb54 (005afb54) // "office:text"
005af68c e81bdcf1ff call awp+0xcd2ac (004cd2ac) // [4] Search through XML document for element
005af691 e8facdffff call awp+0x1ac490 (005ac490) // [5] Processes element's children
005af696 59 pop ecx
The function at 0x5ac490 is used by the application in order to process the child elements of an XML element. After storing some information to retain the current state of parsing, the application will then enter a loop at 0x5ac529. This loop is responsible for iterating through each of the child elements belonging to the XML element that is currently being parsed. For each child element, the application will pass the current index to the function call at [6], which returns the child element, and will increment the index at the end of the iteration. The child element’s tag name is stored at offset +0x20. This tag name is then passed to the function call at [7] in order to convert it into a token identifier. This token identifier is then used in a case statement to determine how to parse that particular element.
awp+0x1ac490:
005ac490 55 push ebp
005ac491 8bec mov ebp,esp
005ac493 81c4e8feffff add esp,0FFFFFEE8h
005ac499 53 push ebx
005ac49a 56 push esi
005ac49b 57 push edi
005ac49c 33d2 xor edx,edx
...
awp+0x1ac529:
005ac529 8bd3 mov edx,ebx // Index
005ac52b 8b45fc mov eax,dword ptr [ebp-4]
005ac52e e8d9b7e5ff call awp+0x7d0c (00407d0c) // [6] Fetch child element
005ac533 8bf0 mov esi,eax
005ac535 8b4620 mov eax,dword ptr [esi+20h] // XML Tag
005ac538 e81ba7ffff call awp+0x1a6c58 (005a6c58) // [7] Convert XML Tag to Token Id
005ac53d 25ff000000 and eax,0FFh
005ac542 83c0fb add eax,0FFFFFFFBh
005ac545 3db4000000 cmp eax,0B4h
005ac54a 0f8710150000 ja awp+0x1ada60 (005ada60)
005ac550 8a805dc55a00 mov al,byte ptr awp+0x1ac55d (005ac55d)[eax]
005ac556 ff248512c65a00 jmp dword ptr awp+0x1ac612 (005ac612)[eax*4]
...
005ada87 43 inc ebx // Next Index
005ada88 4f dec edi
005ada89 0f859aeaffff jne awp+0x1ac529 (005ac529)
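The range check and two-level table lookup that follow the call at [7] can be modelled in a few lines. This is only a sketch of the arithmetic visible in the listing above: the names `select_case`, `DEFAULT_CASE`, and `HYPOTHETICAL_TABLE` are illustrative, and the table contents are made up, since the real byte table at 0x5ac55d is not shown in this advisory.

```python
MASK32 = 0xFFFFFFFF
DEFAULT_CASE = -1  # hypothetical marker for the default branch at 0x5ada60

# Hypothetical stand-in for the byte table at 0x5ac55d (0xB5 entries).
HYPOTHETICAL_TABLE = bytes(range(0xB5))


def select_case(token_id, byte_table=HYPOTHETICAL_TABLE):
    """Model the dispatch: mask, rebase by 5, unsigned range check, lookup."""
    index = token_id & 0xFF            # and eax,0FFh
    index = (index - 5) & MASK32       # add eax,0FFFFFFFBh (adds -5 modulo 2**32)
    if index > 0xB4:                   # cmp eax,0B4h / ja default (unsigned compare)
        return DEFAULT_CASE
    return byte_table[index]           # byte table entry selects the jump table slot


if __name__ == "__main__":
    for token in (4, 5, 185, 186):
        print(token, select_case(token))
```

Because the comparison is unsigned, token identifiers below 5 wrap around to very large values and fall through to the default branch, just like identifiers above 185.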
When the application handles case 152 for “text:s,” the following code is executed. The “text:s” element represents a run of space characters at a given point within a document, and its “text:c” attribute holds the number of spaces. When processing this tag name, the application will first extract the “text:c” attribute via the function call at [8]. This attribute will then be converted from a WString into a long string and then converted to an integer at [9]. In order to prevent integer overflow, the application will then do a signedness check at [10] before passing the result as the length to a call to LStrSetLength at [11]. The provided proof-of-concept sets the value of “text:c” to 0x7fffffff, the largest positive signed 32-bit integer, which passes the signedness check.
awp+0x1ad474:
005ad474 8b4508 mov eax,dword ptr [ebp+8] // Frame
005ad477 8d48fc lea ecx,[eax-4] // XML element
005ad47a ba3cdd5a00 mov edx,offset awp+0x1add3c (005add3c) // "text:c"
005ad47f 8bc6 mov eax,esi
005ad481 e8eafbf1ff call awp+0xcd070 (004cd070) // [8] Check property
005ad486 84c0 test al,al
005ad488 747a je awp+0x1ad504 (005ad504)
005ad48a 8d85f8feffff lea eax,[ebp-108h] // Number
005ad490 8b5508 mov edx,dword ptr [ebp+8] // Frame
005ad493 8b52fc mov edx,dword ptr [edx-4] // XML element
005ad496 e88960e5ff call awp+0x3524 (00403524) // LStrFromWStr
005ad49b 8b85f8feffff mov eax,dword ptr [ebp-108h] // Number
005ad4a1 33d2 xor edx,edx
005ad4a3 e82cebe5ff call awp+0xbfd4 (0040bfd4) // [9] StrToInt
005ad4a8 8bd0 mov edx,eax
005ad4aa 33c0 xor eax,eax
005ad4ac e88f13e6ff call awp+0xe840 (0040e840) // [10] Signedness check
005ad4b1 8bd0 mov edx,eax
005ad4b3 8d45e4 lea eax,[ebp-1Ch]
005ad4b6 e8d163e5ff call awp+0x388c (0040388c) // [11] LStrSetLength
005ad4bb 837de400 cmp dword ptr [ebp-1Ch],0
005ad4bf 0f849b050000 je awp+0x1ada60 (005ada60)
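The sketch below illustrates why a signedness check alone does not help here. It assumes, purely for illustration, that the call at [10] clamps negative values to zero (the advisory only describes it as a signedness check) and that unparsable input becomes zero. The names `requested_length`, `bounded_length`, and `MAX_REPEAT` are hypothetical, and the bounded variant is one possible hardening, not the vendor's actual fix.

```python
INT32_MAX = 0x7FFFFFFF
MAX_REPEAT = 4096  # hypothetical upper bound, chosen only for this example


def requested_length(text_c):
    """Parse the attribute and reject only negative counts (assumed behaviour)."""
    try:
        count = int(text_c)
    except ValueError:
        count = 0          # assumption: unparsable input is treated as zero
    return max(0, count)   # negative values are clamped, large ones pass through


def bounded_length(text_c, limit=MAX_REPEAT):
    """Same parse, but with an explicit upper bound on the result."""
    return min(requested_length(text_c), limit)


if __name__ == "__main__":
    print(hex(requested_length("2147483647")))  # still INT32_MAX after the check
    print(bounded_length("2147483647"))         # capped at MAX_REPEAT
```

A length of zero leaves the string empty, which matches the comparison against 0 right after the call at [11] that skips further processing.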
Unfortunately, the implementation of LStrSetLength behaves insecurely when the application requests a very long string length. LStrSetLength is implemented by the following code. After performing a number of checks that tell Delphi whether a string needs to be resized or re-allocated, the implementation will make a call to NewAnsiString at [12]. This function will take the length as a parameter, add 9 to it, and then pass the result to System.GetMemory at [13]. Due to the way System.GetMemory calculates and then aligns the size of a chunk, a length of 0x7fffffff + 9 can result in an undersized allocation. Later, at [14], the original length of 0x7fffffff is used as an offset from the start of the string data in the chunk returned by System.GetMemory, and a null byte is written at that position. This write lands outside the bounds of the returned allocation, which can cause heap corruption and allow for code execution under the context of the application.
awp+0x388c:
0040388c 53 push ebx
0040388d 56 push esi
0040388e 57 push edi
0040388f 89c3 mov ebx,eax
00403891 89d6 mov esi,edx
00403893 31ff xor edi,edi
...
004038c2 89d0 mov eax,edx // Length
004038c4 e8c7faffff call awp+0x3390 (00403390) // [12] NewAnsiString
awp+0x3390:
00403390 85c0 test eax,eax
00403392 7e1c jle awp+0x33b0 (004033b0)
00403394 50 push eax
00403395 83c009 add eax,9 // Add 9 to length
00403398 e8a3efffff call awp+0x2340 (00402340) // [13] System.GetMemory
0040339d 83c008 add eax,8
004033a0 5a pop edx
004033a1 8950fc mov dword ptr [eax-4],edx
004033a4 c740f801000000 mov dword ptr [eax-8],1
004033ab c6041000 mov byte ptr [eax+edx],0 // [14] Write out of bounds
004033af c3 ret
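The size arithmetic in NewAnsiString can be reproduced with 32-bit wraparound to show what value reaches the allocator. This is a minimal sketch of the `add eax,9` step only; it does not model how System.GetMemory rounds or aligns the request, and the helper names are illustrative. The 9 extra bytes correspond to the 8-byte header written at [eax-8] and [eax-4] plus the 1-byte null terminator.

```python
MASK32 = 0xFFFFFFFF
HEADER_AND_TERMINATOR = 9  # 4-byte reference count + 4-byte length + 1 null byte


def to_signed32(value):
    """Reinterpret a 32-bit pattern as a signed two's-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def allocation_request(length):
    """Value passed to the allocator: length + 9, truncated to 32 bits."""
    return (length + HEADER_AND_TERMINATOR) & MASK32


if __name__ == "__main__":
    request = allocation_request(0x7FFFFFFF)
    print(hex(request))          # 32-bit pattern handed to the allocator
    print(to_signed32(request))  # the same pattern viewed as a signed integer
```

For the proof-of-concept length, the request becomes 0x80000008, which is negative when interpreted as a signed 32-bit integer, so the `jle` guard at the top of NewAnsiString (which only inspects the length before the addition) does not catch it.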
eax=0892366c ebx=0018ec0c ecx=08923664 edx=7fffffff esi=7fffffff edi=00000000
eip=004033ab esp=0018eae4 ebp=0018ec28 iopl=0 nv up ei pl nz na pe nc
cs=0023 ss=002b ds=002b es=002b fs=0053 gs=002b efl=00010206
awp+0x33ab:
004033ab c6041000 mov byte ptr [eax+edx],0 ds:002b:8892366b=??
0:000> ub .
awp+0x3392:
00403392 7e1c jle awp+0x33b0 (004033b0)
00403394 50 push eax
00403395 83c009 add eax,9
00403398 e8a3efffff call awp+0x2340 (00402340)
0040339d 83c008 add eax,8
004033a0 5a pop edx
004033a1 8950fc mov dword ptr [eax-4],edx
004033a4 c740f801000000 mov dword ptr [eax-8],1
0:000> r @edx
edx=7fffffff
0:000> da poi(@ebp-108)
0babda90 "2147483647"
0:000> .formats 0n2147483647
Evaluate expression:
Hex: 7fffffff
Decimal: 2147483647
Octal: 17777777777
Binary: 01111111 11111111 11111111 11111111
Chars: ...
Time: ***** Invalid
Float: low 1.#QNAN high 0
Double: 1.061e-314
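As a quick consistency check on the debugger output above, the faulting address can be recomputed from the register values in the crash dump. The function name `effective_address` is illustrative; the register values are copied from the dump.

```python
MASK32 = 0xFFFFFFFF


def effective_address(base, index):
    """32-bit effective address of [base+index], as in mov byte ptr [eax+edx],0."""
    return (base + index) & MASK32


EAX = 0x0892366C  # start of the string data (chunk address + 8)
EDX = 0x7FFFFFFF  # attacker-controlled length from "text:c"

if __name__ == "__main__":
    print(hex(effective_address(EAX, EDX)))
```

The result, 0x8892366b, matches the address shown by the debugger for the faulting write, roughly 2 GiB past the start of the string data.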
To use the proof of concept, simply open up or preview the document in the target application. The application should crash at the address specified due to heap corruption.
2018-11-16 - Vendor Disclosure
2018-11-20 - Vendor patched; Public Release
Discovered by a member of Cisco Talos.