CVE-2016-4330
HDF5 is a file format maintained by a non-profit organization, The HDF Group. It is designed for storing and organizing large amounts of scientific data, and is used to exchange data structures between applications in industries such as GIS, via libraries such as GDAL and OGR or as part of software such as ArcGIS. The vulnerability exists because the library fails to check that the number of dimensions of an array read from the file fits within the space allocated for it. When the library then reads elements from the file into this array, a heap-based buffer overflow occurs, potentially leading to arbitrary code execution.
hdf5-1.8.16.tar.bz2
tools/h5ls: Version 1.8.16
tools/h5stat: Version 1.8.16
tools/h5dump: Version 1.8.16
http://www.hdfgroup.org/HDF5/
http://www.hdfgroup.org/HDF5/release/obtainsrc.html
http://www.hdfgroup.org/ftp/HDF5/current/src/hdf5-1.8.16.tar.bz2
8.6 – CVSS:3.0/AV:L/AC:L/PR:N/UI:R/S:C/C:H/I:H/A:H
The HDF file format is intended to be a general, self-describing file format for the various types of data structures used in the scientific community [1]. These data structures are stored in two types of objects, Datasets and Groups. Comparing the file format to a filesystem, a Dataset can be interpreted as a file, and a Group as a directory that can contain other Datasets or Groups. Associated with each entry is metadata containing user-defined named attributes that can be used to describe the dataset.
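The filesystem analogy can be sketched in a few lines of C. This is only an illustration of the Group/Dataset hierarchy and of resolving a '/'-separated path through it; the names `node`, `node_kind`, and `lookup` are hypothetical and are not part of the HDF5 library, which stores these objects quite differently on disk.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical model: a Group behaves like a directory, a Dataset like a file. */
typedef enum { NODE_GROUP, NODE_DATASET } node_kind;

typedef struct node {
    const char        *name;
    node_kind          kind;
    const struct node *children;   /* only meaningful for groups */
    size_t             nchildren;
} node;

/* Resolve a '/'-separated path starting at root; returns NULL if not found. */
static const node *lookup(const node *root, const char *path)
{
    const node *cur = root;

    while (*path == '/')
        path++;
    while (*path != '\0') {
        size_t      len  = strcspn(path, "/");   /* length of this component */
        const node *next = NULL;

        if (cur->kind != NODE_GROUP)
            return NULL;                         /* datasets have no children */
        for (size_t i = 0; i < cur->nchildren; i++) {
            const node *c = &cur->children[i];
            if (strlen(c->name) == len && strncmp(c->name, path, len) == 0) {
                next = c;
                break;
            }
        }
        if (next == NULL)
            return NULL;
        cur = next;
        path += len;
        while (*path == '/')
            path++;
    }
    return cur;
}
```

In this model, looking up a path that continues past a Dataset fails, just as a path cannot descend through a regular file.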
Within the HDF file format, paths are specified in the ‘/’-separated POSIX format. When reading a dataset, the library opens the object using H5D__open_oid. Inside this function, the library opens the object at its location and then reads its datatype by passing the H5O_DTYPE_ID value to H5O_msg_read.
src/H5Dint.c:1221
static herr_t
H5D__open_oid(H5D_t *dataset, hid_t dapl_id, hid_t dxpl_id)
{
/* Open the dataset object */
if(H5O_open(&(dataset->oloc)) < 0)
HGOTO_ERROR(H5E_DATASET, H5E_CANTOPENOBJ, FAIL, "unable to open")
/* Get the type and space */
if(NULL == (dataset->shared->type = (H5T_t *)H5O_msg_read(&(dataset->oloc), H5O_DTYPE_ID, NULL, dxpl_id))) // XXX
HGOTO_ERROR(H5E_DATASET, H5E_CANTINIT, FAIL, "unable to load type info from dataset header")
src/H5Omessage.c:463
void *
H5O_msg_read(const H5O_loc_t *loc, unsigned type_id, void *mesg,
hid_t dxpl_id)
{
H5O_t *oh = NULL; /* Object header to use */
void *ret_value; /* Return value */
/* Get the object header */
if(NULL == (oh = H5O_protect(loc, dxpl_id, H5AC_READ)))
HGOTO_ERROR(H5E_OHDR, H5E_CANTPROTECT, NULL, "unable to protect object header")
/* Call the "real" read routine */
if(NULL == (ret_value = H5O_msg_read_oh(loc->file, dxpl_id, oh, type_id, mesg))) // XXX: read the message from the object header
HGOTO_ERROR(H5E_OHDR, H5E_READERROR, NULL, "unable to read object header message")
Inside H5O_msg_read_oh, the library uses the type_id argument to determine which message type is being requested. This message type determines which callback is used to handle the message. The dispatch occurs within the macro H5O_LOAD_NATIVE at H5Omessage.c:545.
src/H5Omessage.c:517
void *
H5O_msg_read_oh(H5F_t *f, hid_t dxpl_id, H5O_t *oh, unsigned type_id,
void *mesg)
{
const H5O_msg_class_t *type; /* Actual H5O class type for the ID */
unsigned idx; /* Message's index in object header */
void *ret_value = NULL;
for(idx = 0; idx < oh->nmesgs; idx++)
if(type == oh->mesg[idx].type)
break;
H5O_LOAD_NATIVE(f, dxpl_id, 0, oh, &(oh->mesg[idx]), NULL)
Inside the H5O_LOAD_NATIVE macro, the application will select a structure containing function pointers out of the msg->type field. This structure contains various functions that are used to decode the message. When decoding a msg of type H5O_DTYPE_ID, the library will dispatch into the H5O_dtype_shared_decode function. This function will eventually call H5O_dtype_decode. Inside H5O_dtype_decode, the library will first allocate space using the call H5T__alloc. Afterwards, execution will continue onto H5O_dtype_decode_helper which is responsible for decoding the datatypes.
src/H5Oshared.h:50
static H5_INLINE void *
H5O_SHARED_DECODE(H5F_t *f, hid_t dxpl_id, H5O_t *open_oh, unsigned mesg_flags,
unsigned *ioflags, const uint8_t *p)
{
/* Decode native message directly */
if(NULL == (ret_value = H5O_SHARED_DECODE_REAL(f, dxpl_id, open_oh, mesg_flags, ioflags, p))) // XXX
HGOTO_ERROR(H5E_OHDR, H5E_CANTDECODE, NULL, "unable to decode native message")
} /* end else */
src/H5Odtype.c:1091
static void *
H5O_dtype_decode(H5F_t *f, hid_t H5_ATTR_UNUSED dxpl_id, H5O_t H5_ATTR_UNUSED *open_oh, unsigned H5_ATTR_UNUSED mesg_flags,
unsigned *ioflags/*in,out*/, const uint8_t *p)
{
/* Allocate datatype message */
if(NULL == (dt = H5T__alloc()))
HGOTO_ERROR(H5E_RESOURCE, H5E_NOSPACE, NULL, "memory allocation failed")
/* Perform actual decode of message */
if(H5O_dtype_decode_helper(f, ioflags, &p, dt) < 0)
HGOTO_ERROR(H5E_DATATYPE, H5E_CANTDECODE, NULL, "can't decode type")
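The dispatch pattern described above, in which a type ID selects a class structure whose function pointer decodes the matching message, can be sketched as follows. This is a simplified illustration and not the HDF5 implementation: `msg_class`, `msg`, `read_msg`, the two decode callbacks, and the `MSG_*` IDs are all hypothetical stand-ins.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical message class: a name plus a decode callback. */
typedef struct msg_class {
    const char *name;
    int (*decode)(const uint8_t *raw, size_t len);
} msg_class;

/* Illustrative decoders: the "datatype" one returns the first raw byte. */
static int decode_dtype(const uint8_t *raw, size_t len) { return len > 0 ? raw[0] : -1; }
static int decode_other(const uint8_t *raw, size_t len) { (void)raw; (void)len; return 0; }

enum { MSG_OTHER_ID = 0, MSG_DTYPE_ID = 1, MSG_NTYPES };   /* hypothetical IDs */

/* Table indexed by type ID, analogous to selecting a class from type_id. */
static const msg_class msg_classes[MSG_NTYPES] = {
    { "other", decode_other },
    { "dtype", decode_dtype },
};

/* A stored message remembers its class and its raw bytes. */
typedef struct msg {
    const msg_class *type;
    const uint8_t   *raw;
    size_t           len;
} msg;

/* Find the first message of the requested type and decode it via its callback. */
static int read_msg(const msg *msgs, size_t nmsgs, unsigned type_id)
{
    if (type_id >= MSG_NTYPES)
        return -1;

    const msg_class *type = &msg_classes[type_id];
    for (size_t idx = 0; idx < nmsgs; idx++)
        if (msgs[idx].type == type)
            return type->decode(msgs[idx].raw, msgs[idx].len);

    return -1;   /* no message of that type in the header */
}
```

The point of the pattern is that the decode callback receives bytes taken directly from the file, so every length or count it reads has to be validated inside the callback itself.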
Inside H5T__alloc, the library allocates space for an H5T_shared_t object. This structure is defined in H5Tpkg.h at line 288. The vulnerability involves the H5T_array_t member of the union u: the H5T_array_t structure holds a fixed array of H5S_MAX_RANK size_t fields. H5S_MAX_RANK is defined in src/H5Spublic.h:31 as 32.
src/H5T.c:3446
H5T_t *
H5T__alloc(void)
{
/* Allocate & initialize shared datatype structure */
if(NULL == (dt->shared = H5FL_CALLOC(H5T_shared_t))) // XXX: sizeof(H5T_shared_t)
HGOTO_ERROR(H5E_RESOURCE, H5E_NOSPACE, NULL, "memory allocation failed")
src/H5Spublic.h:31
#define H5S_MAX_RANK 32
src/H5Tpkg.h:288
typedef struct H5T_shared_t {
union {
H5T_array_t array; /* an array datatype */
} u;
} H5T_shared_t;
src/H5Tpkg.h:273
typedef struct H5T_array_t {
size_t nelem; /* total number of elements in array */
unsigned ndims; /* member dimensionality */
size_t dim[H5S_MAX_RANK]; /* size in each dimension */ // XXX: maximum of 32
} H5T_array_t;
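To make the fixed capacity concrete, the following sketch mirrors the H5T_array_t layout shown above under hypothetical names (`array_info`, `MAX_RANK`, `slots_past_end`, and `bytes_past_end` are illustrative and not HDF5 identifiers). It computes how far a loop that writes `ndims` entries would run past the end of `dim[]`. Where the enclosing heap block actually ends depends on the rest of H5T_shared_t, which this sketch does not model.

```c
#include <stddef.h>

#define MAX_RANK 32u   /* mirrors H5S_MAX_RANK from the excerpt above */

/* Simplified copy of the H5T_array_t layout. */
typedef struct array_info {
    size_t   nelem;            /* total number of elements in array */
    unsigned ndims;            /* member dimensionality */
    size_t   dim[MAX_RANK];    /* size in each dimension */
} array_info;

/* Number of dim[] slots written beyond the array if a loop runs ndims times. */
static size_t slots_past_end(unsigned ndims)
{
    return ndims > MAX_RANK ? (size_t)(ndims - MAX_RANK) : 0;
}

/* The same quantity expressed in bytes on the current platform. */
static size_t bytes_past_end(unsigned ndims)
{
    return slots_past_end(ndims) * sizeof(size_t);
}
```

For example, a dimension count of 0x69 (105) would index 73 slots beyond the 32 that `dim[]` provides.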
After allocating space for the H5T_array_t, the library returns to H5O_dtype_decode, which then calls H5O_dtype_decode_helper. When entering the H5T_ARRAY case, the library reads the number of dimensions from the file and checks its validity only via an assertion. Because assertions are enabled only when the library is compiled in debug mode, the preprocessor removes this check from release builds. Immediately afterwards, the library enters a loop that reads DWORDs from the file into the H5T_array_t.dim field. If the value of u.array.ndims is larger than 32, this loop writes past the end of the H5T_array_t that was allocated earlier. This leads to heap corruption and can lead to code execution in the context of the application using the library.
src/H5Odtype.c:133
static htri_t
H5O_dtype_decode_helper(H5F_t *f, unsigned *ioflags/*in,out*/, const uint8_t **pp, H5T_t *dt)
{
case H5T_ARRAY: /* Array datatypes */
/* Decode the number of dimensions */
dt->shared->u.array.ndims = *(*pp)++;
/* Double-check the number of dimensions */
HDassert(dt->shared->u.array.ndims <= H5S_MAX_RANK);
/* Decode array dimension sizes & compute number of elements */
for(i = 0, dt->shared->u.array.nelem = 1; i < (unsigned)dt->shared->u.array.ndims; i++) {
UINT32DECODE(*pp, dt->shared->u.array.dim[i]);
dt->shared->u.array.nelem *= dt->shared->u.array.dim[i];
} /* end for */
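For contrast, here is a minimal sketch of the same decode step with the dimension count validated by an ordinary runtime check, which remains in place regardless of whether assertions are compiled in. It is not the vendor's patch. `array_info`, `MAX_RANK`, `read_u32le`, and `decode_array_dims` are hypothetical names, the input-length check against `end` is an addition the excerpt does not have, and the 32-bit values are assumed to be little-endian. The version-dependent handling around line 524 and overflow of the `nelem` product are left out.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_RANK 32u   /* mirrors H5S_MAX_RANK */

typedef struct array_info {
    size_t   nelem;
    unsigned ndims;
    size_t   dim[MAX_RANK];
} array_info;

/* Read a 32-bit value (assumed little-endian) and advance the cursor. */
static uint32_t read_u32le(const uint8_t **pp)
{
    const uint8_t *p = *pp;
    uint32_t v = (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
                 ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
    *pp = p + 4;
    return v;
}

/* Returns 0 on success, -1 if the encoded array description is malformed. */
static int decode_array_dims(const uint8_t **pp, const uint8_t *end, array_info *out)
{
    if (*pp >= end)
        return -1;                              /* no byte left for the count */

    unsigned ndims = *(*pp)++;                  /* one-byte dimension count */

    if (ndims > MAX_RANK)
        return -1;                              /* runtime check, not an assertion */
    if ((size_t)(end - *pp) < (size_t)ndims * 4)
        return -1;                              /* not enough input for all dims */

    out->ndims = ndims;
    out->nelem = 1;
    for (unsigned i = 0; i < ndims; i++) {
        out->dim[i] = read_u32le(pp);           /* i < ndims <= MAX_RANK */
        out->nelem *= out->dim[i];
    }
    return 0;
}
```

With this shape, a count such as 0x69 is rejected before the loop runs, so no write to `dim[]` can land outside the array.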
$ gdb -q --args bin/h5stat poc.hdf
No symbol table is loaded. Use the "file" command.
Reading symbols from $HOME/hdf5-1.8.16/release/bin/h5stat...done.
(gdb) bp H5Odtype.c:518
Breakpoint 3 at 0x8147cb0: file ../../src/H5Odtype.c, line 518.
(gdb) bp H5Odtype.c:528 i < 0x1f
Breakpoint 4 at 0x8147cc9: file ../../src/H5Odtype.c, line 528.
(gdb) r
Starting program: $HOME/hdf5-1.8.16/release/bin/h5stat poc.hdf
Filename: poc.hdf
Breakpoint 3, H5O_dtype_decode_helper (f=f@entry=0x83f0e48, ioflags=ioflags@entry=0xbfffed6c, pp=pp@entry=0xbfffed1c, dt=dt@entry=0x83df358) at ../../src/H5Odtype.c:518
518 dt->shared->u.array.ndims = *(*pp)++;
(gdb) n
524 if(version < H5O_DTYPE_VERSION_3)
(gdb) n
518 dt->shared->u.array.ndims = *(*pp)++;
(gdb) n
524 if(version < H5O_DTYPE_VERSION_3)
(gdb) p dt->shared->u.array.ndims
$1 = 0x69
(gdb) c
Continuing.
*** Error in `$HOME/hdf5-1.8.16/release/bin/h5stat': free(): invalid pointer: 0x083f21e1 ***
Catchpoint 2 (signal SIGABRT), 0xb7ffecb0 in ?? ()
(gdb)
### Crash Analysis (Address Sanitizer)
=================================================================
==2398==ERROR: AddressSanitizer: heap-buffer-overflow on address 0xb2b20938 at pc 0xb626a9fe bp 0xbfb6d5d8 sp 0xbfb6d5d0
WRITE of size 4 at 0xb2b20938 thread T0
#0 0xb626a9fd in H5O_dtype_decode_helper $HOME/hdf5-1.8.16/asan/src/../../src/H5Odtype.c:529
#1 0xb6252881 in H5O_dtype_decode $HOME/hdf5-1.8.16/asan/src/../../src/H5Odtype.c:1108
#2 0xb621efd8 in H5O_dtype_shared_decode $HOME/hdf5-1.8.16/asan/src/../../src/H5Oshared.h:84
#3 0xb62faa5c in H5O_msg_read_oh $HOME/hdf5-1.8.16/asan/src/../../src/H5Omessage.c:554
#4 0xb62f88a6 in H5O_msg_read $HOME/hdf5-1.8.16/asan/src/../../src/H5Omessage.c:483
#5 0xb5798b96 in H5D__open_oid $HOME/hdf5-1.8.16/asan/src/../../src/H5Dint.c:1245
#6 0xb5795df7 in H5D_open $HOME/hdf5-1.8.16/asan/src/../../src/H5Dint.c:1153
#7 0xb563b3f9 in H5Dopen2 $HOME/hdf5-1.8.16/asan/src/../../src/H5D.c:368
#8 0x825351d in find_objs_cb $HOME/hdf5-1.8.16/asan/tools/lib/../../../tools/lib/h5tools_utils.c:580
#9 0x8270c4d in traverse_cb $HOME/hdf5-1.8.16/asan/tools/lib/../../../tools/lib/h5trav.c:237
#10 0xb5c2f66a in H5G_visit_cb $HOME/hdf5-1.8.16/asan/src/../../src/H5Gint.c:939
#11 0xb5c83a72 in H5G__node_iterate $HOME/hdf5-1.8.16/asan/src/../../src/H5Gnode.c:1026
#12 0xb5477c85 in H5B_iterate_helper $HOME/hdf5-1.8.16/asan/src/../../src/H5B.c:1175
#13 0xb54756db in H5B_iterate $HOME/hdf5-1.8.16/asan/src/../../src/H5B.c:1220
#14 0xb5cdc773 in H5G__stab_iterate $HOME/hdf5-1.8.16/asan/src/../../src/H5Gstab.c:565
#15 0xb5ca7af2 in H5G__obj_iterate $HOME/hdf5-1.8.16/asan/src/../../src/H5Gobj.c:707
#16 0xb5c2cbe2 in H5G_visit $HOME/hdf5-1.8.16/asan/src/../../src/H5Gint.c:1174
#17 0xb5fe7f7d in H5Lvisit_by_name $HOME/hdf5-1.8.16/asan/src/../../src/H5L.c:1378
#18 0x825c8fe in traverse $HOME/hdf5-1.8.16/asan/tools/lib/../../../tools/lib/h5trav.c:310
#19 0x82679c5 in h5trav_visit $HOME/hdf5-1.8.16/asan/tools/lib/../../../tools/lib/h5trav.c:1164
#20 0x82522c4 in init_objs $HOME/hdf5-1.8.16/asan/tools/lib/../../../tools/lib/h5tools_utils.c:655
#21 0x80cf3b7 in table_list_add $HOME/hdf5-1.8.16/asan/tools/h5dump/../../../tools/h5dump/h5dump.c:408
#22 0x80d12c1 in main $HOME/hdf5-1.8.16/asan/tools/h5dump/../../../tools/h5dump/h5dump.c:1470
#23 0xb5033a82 (/lib/i386-linux-gnu/libc.so.6+0x19a82)
#24 0x80cec04 in _start ($HOME/hdf5-1.8.16/asan/bin/h5dump+0x80cec04)
0xb2b20938 is located 0 bytes to the right of 168-byte region [0xb2b20890,0xb2b20938)
allocated by thread T0 here:
#0 0x80b791e in calloc ($HOME/hdf5-1.8.16/asan/bin/h5dump+0x80b791e)
#1 0xb6058d5b in H5MM_calloc $HOME/hdf5-1.8.16/asan/src/../../src/H5MM.c:107
#2 0xb6947712 in H5T__alloc $HOME/hdf5-1.8.16/asan/src/../../src/H5T.c:3462
#3 0xb62523b8 in H5O_dtype_decode $HOME/hdf5-1.8.16/asan/src/../../src/H5Odtype.c:1104
#4 0xb621efd8 in H5O_dtype_shared_decode $HOME/hdf5-1.8.16/asan/src/../../src/H5Oshared.h:84
#5 0xb62faa5c in H5O_msg_read_oh $HOME/hdf5-1.8.16/asan/src/../../src/H5Omessage.c:554
#6 0xb62f88a6 in H5O_msg_read $HOME/hdf5-1.8.16/asan/src/../../src/H5Omessage.c:483
#7 0xb5798b96 in H5D__open_oid $HOME/hdf5-1.8.16/asan/src/../../src/H5Dint.c:1245
#8 0xb5795df7 in H5D_open $HOME/hdf5-1.8.16/asan/src/../../src/H5Dint.c:1153
#9 0xb563b3f9 in H5Dopen2 $HOME/hdf5-1.8.16/asan/src/../../src/H5D.c:368
#10 0x825351d in find_objs_cb $HOME/hdf5-1.8.16/asan/tools/lib/../../../tools/lib/h5tools_utils.c:580
#11 0x8270c4d in traverse_cb $HOME/hdf5-1.8.16/asan/tools/lib/../../../tools/lib/h5trav.c:237
#12 0xb5c2f66a in H5G_visit_cb $HOME/hdf5-1.8.16/asan/src/../../src/H5Gint.c:939
#13 0xb5c83a72 in H5G__node_iterate $HOME/hdf5-1.8.16/asan/src/../../src/H5Gnode.c:1026
#14 0xb5477c85 in H5B_iterate_helper $HOME/hdf5-1.8.16/asan/src/../../src/H5B.c:1175
#15 0xb54756db in H5B_iterate $HOME/hdf5-1.8.16/asan/src/../../src/H5B.c:1220
#16 0xb5cdc773 in H5G__stab_iterate $HOME/hdf5-1.8.16/asan/src/../../src/H5Gstab.c:565
#17 0xb5ca7af2 in H5G__obj_iterate $HOME/hdf5-1.8.16/asan/src/../../src/H5Gobj.c:707
#18 0xb5c2cbe2 in H5G_visit $HOME/hdf5-1.8.16/asan/src/../../src/H5Gint.c:1174
#19 0xb5fe7f7d in H5Lvisit_by_name $HOME/hdf5-1.8.16/asan/src/../../src/H5L.c:1378
#20 0x825c8fe in traverse $HOME/hdf5-1.8.16/asan/tools/lib/../../../tools/lib/h5trav.c:310
#21 0x82679c5 in h5trav_visit $HOME/hdf5-1.8.16/asan/tools/lib/../../../tools/lib/h5trav.c:1164
#22 0x82522c4 in init_objs $HOME/hdf5-1.8.16/asan/tools/lib/../../../tools/lib/h5tools_utils.c:655
#23 0x80cf3b7 in table_list_add $HOME/hdf5-1.8.16/asan/tools/h5dump/../../../tools/h5dump/h5dump.c:408
#24 0x80d12c1 in main $HOME/hdf5-1.8.16/asan/tools/h5dump/../../../tools/h5dump/h5dump.c:1470
#25 0xb5033a82 (/lib/i386-linux-gnu/libc.so.6+0x19a82)
SUMMARY: AddressSanitizer: heap-buffer-overflow $HOME/hdf5-1.8.16/asan/src/../../src/H5Odtype.c:529 H5O_dtype_decode_helper
2016-05-08 - Discovery
2016-05-17 - Vendor Notification
2016-11-15 - Public Disclosure
[1] https://en.wikipedia.org/wiki/Hierarchical_Data_Format
[2] http://www.hdfgroup.org/HDF5/
Discovered by Cisco Talos.