CVE-2019-5030
A buffer overflow vulnerability exists in the PowerPoint document conversion function of Rainbow PDF Office Server Document Converter V7.0 Pro MR1 (7,0,2019,0220). While parsing a document text info container, the TxMasterStyleAtom::parse function incorrectly checks the bounds corresponding to the number of style levels, causing a vtable pointer to be overwritten, which leads to code execution.
Antenna House Rainbow PDF Office Server Document Converter v7.0 Pro MR1 for Linux64 (7,0,2019,0220)
https://www.rainbowpdf.com/trial-server-solutions/
8.8 - CVSS:3.0/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H
CWE-122: Heap-based Buffer Overflow
Rainbow PDF is a software solution, developed by Antenna House, that converts Microsoft Office 97-2016 documents into PDF.
Office document structures are sometimes complex and are subject to strict constraints. Failing to enforce these constraints may lead to several side effects while parsing.
The Microsoft MS-PPT PowerPoint binary file format documentation explains some of the structures needed to understand the issue described in this advisory. In particular, see below the format of the DocumentTextInfoContainer, RecordHeader and TextMasterStyleAtom structures. The RecordHeader is a generic structure, present at the beginning of each container and atom record. A container is a record that defines the structure and hierarchy of atom records and other container records. An atom record contains presentation data. By analogy with a file system, atom records are similar to files that contain data, and container records are similar to directories that provide structure and hierarchy for atom records. The DocumentTextInfoContainer record specifies the default text styles for the document, and the TextMasterStyleAtom specifies the character-level and paragraph-level formatting of a main master slide.
The RecordHeader has a fixed length of eight bytes and is described as follows:
+ recVer (4 bits): An unsigned integer that specifies the version of the record data that follows the record header. A value of 0xF specifies that the record is a container record.
+ recInstance (12bits): An unsigned integer that specifies the record instance data. Interpretation of the value is dependent on the particular record type.
+ recType (2 bytes): A `RecordType` enumeration that specifies the type of the record data that follows the record header.
+ recLen (4 bytes): An unsigned integer that specifies the length, in bytes, of the record data that follows the record header.
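The field layout above can be sketched as a small decoder. This is a minimal illustration, not Rainbow PDF's code: the names `RecordHeader`, `parseRecordHeader` and `isContainer` are hypothetical, and the sketch assumes the little-endian byte order used by the binary Office formats, with recVer in the low 4 bits of the first 16-bit word (consistent with the `recInstance >> 4` shift in the decompiled parser shown later in this advisory).

```cpp
#include <cstddef>
#include <cstdint>

// Decoded form of the eight-byte RecordHeader (hypothetical type name).
struct RecordHeader {
    std::uint8_t  recVer;       // low 4 bits of the first 16-bit word
    std::uint16_t recInstance;  // high 12 bits of the first 16-bit word
    std::uint16_t recType;      // RecordType enumeration value
    std::uint32_t recLen;       // length in bytes of the data after the header
};

// Hypothetical helper: decode a header from little-endian bytes.
// Returns false when fewer than eight bytes are available.
inline bool parseRecordHeader(const std::uint8_t *buf, std::size_t size,
                              RecordHeader &out) {
    if (buf == nullptr || size < 8) {
        return false;
    }
    const std::uint16_t verInst =
        static_cast<std::uint16_t>(buf[0] | (buf[1] << 8));
    out.recVer      = static_cast<std::uint8_t>(verInst & 0x0F);
    out.recInstance = static_cast<std::uint16_t>(verInst >> 4);
    out.recType     = static_cast<std::uint16_t>(buf[2] | (buf[3] << 8));
    out.recLen      = static_cast<std::uint32_t>(buf[4]) |
                      (static_cast<std::uint32_t>(buf[5]) << 8) |
                      (static_cast<std::uint32_t>(buf[6]) << 16) |
                      (static_cast<std::uint32_t>(buf[7]) << 24);
    return true;
}

// A recVer of 0xF marks a container record; anything else is an atom.
inline bool isContainer(const RecordHeader &h) {
    return h.recVer == 0x0F;
}
```

For example, the bytes `10 00 A3 0F 0C 00 00 00` would decode to an atom with recInstance 1, recType 0x0FA3 and recLen 12.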
The DocumentTextInfoContainer is described as:
+ rh (8 bytes): A `RecordHeader` structure with rh.recType set to the value RT_Environment (0x3F2).
+ [...] several optional variable-length atoms
+ textSIDefaultsAtom (variable): A `TextSIExceptionAtom` record.
+ textMasterStyleAtom (variable): A `TextMasterStyleAtom` record.
The TextMasterStyleAtom is described as follows:
+ rh (8 bytes): A `RecordHeader` structure where the recType value must be RT_TextMasterStyleAtom (0xFA3) and the recInstance value specifies the type of text to which the formatting applies.
+ cLevels (2 bytes): An unsigned integer that specifies the number of style levels. It MUST be less than or equal to 0x0005.
+ LstLvlx (variable): Up to five optional TextMasterStyleLevel structures that specify the master formatting for text. Each structure exists only if the cLevels value calls for it.
The specification states that the cLevels field MUST be less than or equal to 0x0005. This is important because the vulnerability depends on the value of this field.
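The constraint on cLevels can be expressed as a one-line validation. The sketch below is only an illustration of the rule stated in the specification; the names `kMaxStyleLevels` and `isValidLevelCount` are hypothetical, and this is not presented as the vendor's actual fix.

```cpp
#include <cstdint>

// Upper bound on cLevels stated by the MS-PPT specification.
constexpr std::uint16_t kMaxStyleLevels = 0x0005;

// Hypothetical check: true only when cLevels respects the documented limit.
// A parser would be expected to reject the record when this returns false,
// before using cLevels as a loop bound over fixed-size style tables.
constexpr bool isValidLevelCount(std::uint16_t cLevels) {
    return cLevels <= kMaxStyleLevels;
}
```

Because cLevels is a 16-bit field read directly from the file, any value from 6 to 0xFFFF is representable and has to be rejected explicitly.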
The function DfvPptReaderNS::TxMasterStyleAtom::parse is called to parse the Microsoft Office PowerPoint character-level and paragraph-level formatting of the main master slide.
bool __fastcall DfvPptReaderNS::TxMasterStyleAtom::parse(DfvPptReaderNS::TxMasterStyleAtom *this, DfvCommon::MSORecParseContext *context)
{
DfvPptReaderNS::TxMasterStyleAtom *TxMasterStyleAtomTable;
unsigned __int16 recInstance;
unsigned __int16 TextTypeEnum;
unsigned __int16 current_word_value;
unsigned __int16 index;
bool status;
DfvPptReaderNS::PFStyle *PFStyleAtom;
unsigned __int16 cLevels;
int offset;
TxMasterStyleAtomTable = this;
recInstance = this->recVer_recInstance;
offset = 0;
TextTypeEnum = recInstance >> 4;
if ( !DfvCommon::MSORecParseContext::readRecordData(context, this->recLen)
|| !DfvCommon::MSORecParseContext::getWord(context, &cLevels, offset) ) [1]
goto error_TxMasterStyleAtom;
offset += 2;
if ( cLevels ) [2]
{
index = 0;
status = true;
data_read = 1;
while ( 1 ) [3]
{
if ( TextTypeEnum <= 8u )
{
if ( TextTypeEnum <= 4 )
{
current_index = index;
if ( index ) [7]
{
DfvPptReaderNS::PFStyle::operator=( TxMasterStyleAtomTable + 96 * index + 0x18, TxMasterStyleAtomTable + 96 * index - 0x48); [8]
DfvPptReaderNS::CFStyle::operator=( TxMasterStyleAtomTable + 32 * index + 0x1F8, TxMasterStyleAtomTable + 0x20 * index + 0x1D8); [9]
}
goto read_data; [10]
}
if ( TextTypeEnum >= 5 )
{
if ( (unsigned int)DfvCommon::MSORecParseContext::getWord(context, &current_index, offset) )
{
offset += 2;
read_data:
if ( !data_read
|| (PFStyleObject = TxMasterStyleAtomTable + 96 * index + 24),
*((_WORD *)PFStyleObject + 4) = index,
DfvPptReaderNS::PFStyle::parse(PFStyleObject, context, &offset) [5]
&& DfvPptReaderNS::CFStyle::parse(TxMasterStyleAtomTable + 32 * index + 0x1F8,context,&offset))
{
goto next_entry; [6]
}
}
goto reset_read_data;
}
}
reset_read_data:
data_read = 0;
next_entry:
if ( ++index >= cLevels ) [4]
{
if ( data_read )
return 1LL;
error_TxMasterStyleAtom:
icu_52::UnicodeString::UnicodeString((icu_52::UnicodeString *)&current_index, 1, L"TxMasterStyleAtom", 17);
DfvPptReaderNS::PPTError::throwError((DfvPptReaderNS::PPTError *)0xD883, (unsigned __int64)&current_index, v6);
}
}
}
The function DfvPptReaderNS::TxMasterStyleAtom::parse uses DfvCommon::MSORecParseContext::getWord to read cLevels from the file at [1]; getWord returns a word value from the buffer that was previously read. The algorithm is straightforward: it checks that cLevels is nonzero at [2], then enters an infinite loop that starts at [3]. The bounds check at [4] compares the incremented index to cLevels and leaves the loop, with success or failure, once index is greater than or equal to cLevels. Note that cLevels is never compared against 0x0005, the limit documented by Microsoft.
Inside the loop, at [5], the two main parsing functions, DfvPptReaderNS::PFStyle::parse and DfvPptReaderNS::CFStyle::parse, read binary data and fill in the corresponding style objects. The interesting point is the presence of the constant strides 96*index and 32*index respectively, which is typical of indexed tables. Once the data is read, execution goes through [6] and branches directly to [4], where index is incremented. On the next iteration, the data previously read at [5] is copied at [8] and [9] from the previous element into the next element of the relevant table (the check at [7] skips this copy for the first element), and execution continues with a direct branch at [10]. This repeats until index reaches cLevels at [4].
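The table arithmetic can be worked through with the constants taken from [8] and [9]. The sketch below is illustrative only: the helper names are hypothetical, and the five-slot capacity is an assumption inferred from the specification limit and from the observation that five 96-byte PFStyle slots starting at 0x18 end exactly at 0x1F8, where the CFStyle slots begin.

```cpp
#include <cstddef>
#include <cstdint>

// Constants read from the decompiled expressions at [8] and [9].
constexpr std::size_t kPFStyleBase   = 0x18;   // "+ 96 * index + 0x18"
constexpr std::size_t kPFStyleStride = 96;
constexpr std::size_t kCFStyleBase   = 0x1F8;  // "+ 32 * index + 0x1F8"
constexpr std::size_t kCFStyleStride = 32;

// ASSUMPTION: each table holds five slots, matching the documented limit.
constexpr std::uint16_t kAssumedSlots = 5;

// Byte offset, relative to the TxMasterStyleAtom object, of slot `index`.
constexpr std::size_t pfStyleOffset(std::uint16_t index) {
    return kPFStyleBase + kPFStyleStride * index;
}
constexpr std::size_t cfStyleOffset(std::uint16_t index) {
    return kCFStyleBase + kCFStyleStride * index;
}

// True when `index` would address a slot outside the assumed tables.
constexpr bool isOutOfBounds(std::uint16_t index) {
    return index >= kAssumedSlots;
}

// The PFStyle table, under the five-slot assumption, ends where the
// CFStyle table starts.
static_assert(pfStyleOffset(kAssumedSlots) == kCFStyleBase,
              "five PFStyle slots end at the CFStyle base");
```

Under these assumptions, index 4 is the last valid slot (PFStyle at 0x198, CFStyle at 0x278). With index 5, the PFStyle write starts at 0x1F8, which is the start of the CFStyle slots, and the CFStyle write starts at 0x298, past the end of the assumed CFStyle table. Each further iteration moves both writes further beyond the space reserved for them.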
DfvPptReaderNS::PPTDocument *__fastcall DfvPptReaderNS::PPTDocument::PPTDocument(DfvPptReaderNS::PPTDocument *this)
{
[...]
*(_QWORD *)v6 = &`vtable for'DfvPptReaderNS::TxMasterStyleAtom + 2;
*(_WORD *)&v25[v7 - 16] = 0;
*(_WORD *)&v25[v7 - 8] = 0;
*(_WORD *)&v25[v7 - 2] = 0;
*(_WORD *)&v25[v7 - 6] = 8226;
*(_WORD *)&v25[v7 - 4] = 0;
*(_WORD *)&v25[v7 + 4] = 0;
*(_WORD *)&v25[v7 + 6] = 100;
*(_WORD *)&v25[v7 + 12] = 0;
*(_QWORD *)&v25[v7 - 24] = vtable_PFStyle; [11]
*(_WORD *)&v25[v7 + 8] = 0;
*(_WORD *)&v25[v7 + 10] = 0;
*(_WORD *)&v25[v7 + 14] = 0;
*(_WORD *)&v25[v7 + 16] = 576;
*(_DWORD *)&v25[v7 - 12] = 0;
*(_DWORD *)&v25[v7] = 0x1000000;
*((_QWORD *)v6 + 9) = vtable_TabStops;
v8 = v6 - v5;
*((_QWORD *)v6 + 10) = 0LL;
*(_WORD *)&v5[v8 + 104] = 0;
*(_WORD *)&v5[v8 + 106] = 0;
*(_QWORD *)&v5[v8 + 88] = 0LL;
*(_QWORD *)&v5[v8 + 96] = 0LL;
*(_WORD *)&v25[v7 + 64] = 0;
*(_WORD *)&v25[v7 + 66] = 7;
*(_WORD *)&v25[v7 + 68] = 0;
v9 = v21 - v5;
*(_QWORD *)&v1[v9 - 120] = vtable_PFStyle; [12]
*(_WORD *)&v1[v9 - 112] = 0;
*(_WORD *)&v1[v9 - 104] = 0;
*(_WORD *)&v1[v9 - 102] = 8226;
*(_WORD *)&v1[v9 - 100] = 0;
*(_WORD *)&v1[v9 - 98] = 0;
*(_WORD *)&v1[v9 - 92] = 0;
*(_WORD *)&v1[v9 - 90] = 100;
*(_WORD *)&v1[v9 - 88] = 0;
*(_WORD *)&v1[v9 - 86] = 0;
*(_WORD *)&v1[v9 - 84] = 0;
*(_WORD *)&v1[v9 - 82] = 0;
*(_WORD *)&v1[v9 - 80] = 576;
*(_DWORD *)&v1[v9 - 108] = 0;
*(_DWORD *)&v1[v9 - 96] = 0x1000000;
*((_QWORD *)v6 + 21) = vtable_TabStops;
*((_QWORD *)v6 + 22) = 0LL;
*(_WORD *)&v5[v8 + 200] = 0;
[...]
}
The overflow clearly results from the missing check against 0x0005 at [4], but the constructor DfvPptReaderNS::PPTDocument::PPTDocument must be examined to understand why. Without describing the whole function, we can see at [11] and [12] the construction of objects related to PFStyle. This function prepares all the objects related to the complete PowerPoint document and reserves fixed space for these objects and their corresponding vtable pointers. The overflow overwrites the vtable pointers of objects in the record, which an attacker can use to arbitrarily alter the execution flow of the program and thus execute arbitrary code.
2019-03-20 - Vendor Disclosure
2019-05-14 - Vendor Patched
2019-05-14 - Public Release
Discovered by Emmanuel Tacheau of Cisco Talos.