Talos Vulnerability Report

TALOS-2024-1926

libigl readMSH improper array index validation vulnerability

May 28, 2024
CVE NUMBER

CVE-2024-23947, CVE-2024-23948, CVE-2024-23949, CVE-2024-23950, CVE-2024-23951

SUMMARY

Multiple improper array index validation vulnerabilities exist in the readMSH functionality of libigl v2.5.0. A specially crafted .msh file can lead to an out-of-bounds write. An attacker can provide a malicious file to trigger this vulnerability.

CONFIRMED VULNERABLE VERSIONS

The versions below were either tested or verified to be vulnerable by Talos or confirmed to be vulnerable by the vendor.

libigl v2.5.0

PRODUCT URLS

libigl - https://libigl.github.io/

CVSSv3 SCORE

8.8 - CVSS:3.1/AV:N/AC:L/PR:N/UI:R/S:U/C:H/I:H/A:H

CWE

CWE-787 - Out-of-bounds Write

DETAILS

libigl is a C++ geometry processing library that is designed to be simple to integrate into projects using a header-only construction for the code base. This library is widely utilized in industries ranging from Triple-A game development to 3D printing, and it can be found in many applications that require the geometry processing of various file formats.

When loading a .msh file via the readMSH function, the code invokes various functions from MshLoader.cpp, which contains the vulnerabilities described in this report.

In multiple locations throughout MshLoader.cpp, the index of a node, element, etc. is used to reference memory in the associated vector. However, these vectors are allocated based on the declared number of elements, nodes, etc., and no check is performed to ensure that the indexes of these objects are within the expected bounds. Furthermore, the MSH specification says "Note that the elm-numbers do not necessarily have to form a dense nor an ordered sequence." and "Note that the node-numbers do not necessarily have to form a dense nor an ordered sequence.", so even a legitimate MSH file could trigger these bugs.
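Because node and element numbers may be sparse and unordered, one way a loader could cope is to assign storage slots in order of appearance and keep a map from file number to slot, instead of using the number itself as an offset. The following is a minimal sketch under that assumption. The SparseNodeTable type and its member names are hypothetical and are not part of libigl.

```cpp
#include <array>
#include <cstddef>
#include <stdexcept>
#include <unordered_map>
#include <vector>

// Hypothetical helper, not part of libigl: stores nodes densely while
// accepting arbitrary (sparse, unordered) node numbers from the file.
class SparseNodeTable {
public:
    // Adds a node under its file-declared number and returns its dense slot.
    std::size_t add(long long node_number, const std::array<double, 3>& xyz) {
        const std::size_t slot = m_coords.size() / 3;
        // Register the number first so a duplicate leaves the storage untouched.
        if (!m_slot_of.emplace(node_number, slot).second) {
            throw std::runtime_error("duplicate node number");
        }
        m_coords.insert(m_coords.end(), xyz.begin(), xyz.end());
        return slot;
    }

    // Translates a file node number into a dense slot, or throws.
    std::size_t slot_of(long long node_number) const {
        const auto it = m_slot_of.find(node_number);
        if (it == m_slot_of.end()) {
            throw std::out_of_range("unknown node number");
        }
        return it->second;
    }

    std::size_t size() const { return m_coords.size() / 3; }
    const std::vector<double>& coords() const { return m_coords; }

private:
    std::vector<double> m_coords;                          // x0 y0 z0 x1 y1 z1 ...
    std::unordered_map<long long, std::size_t> m_slot_of;  // file number -> slot
};
```

With this layout, a file that declares two nodes numbered 1000 and 7 occupies slots 0 and 1, and later sections that refer to node 7 go through slot_of rather than indexing memory directly.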

CVE-2024-23947 - igl::MshLoader::parse_nodes (binary file)

When parsing the $Nodes section of the .msh file, the parse_nodes function is called and does the following:

IGL_INLINE void igl::MshLoader::parse_nodes(std::ifstream& fin) {
    size_t num_nodes;
[0] fin >> num_nodes;
[1] m_nodes.resize(num_nodes*3);

    if (m_binary) {
        size_t stride = (4+3*m_data_size);
        size_t num_bytes = stride * num_nodes;
        char* data = new char[num_bytes];
        igl::_msh_eat_white_space(fin);
        fin.read(data, num_bytes);
        
        for (size_t i=0; i < num_nodes; i++) {
            int node_idx;
[2]         memcpy(&node_idx, data+i*stride, sizeof(int));
            node_idx-=1;
            // directly move into vector storage
            // this works only when m_data_size==sizeof(Float)==sizeof(double)
[3]         memcpy(&m_nodes[node_idx*3], data+i*stride + 4, m_data_size*3);
        }
        delete [] data;
    } else {
        ...
    }
}

Using user input [0], the m_nodes vector is resized [1]. The node_idx is then read from user input [2], and the data that follows is stored in the vector at node_idx*3 without verifying that the index is within bounds [3]. This can lead to an out-of-bounds write relative to the start of the m_nodes vector.
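The missing check can be illustrated with a small sketch that validates the node number before the copy. This is not the vendor's fix. The function name copy_node_record is hypothetical, and the sketch assumes a record layout of a 4-byte node number followed by three doubles, as in the excerpt when m_data_size equals sizeof(double).

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not libigl code: copies one binary node record
// (4-byte node number followed by three doubles) into `nodes`, rejecting
// node numbers outside [1, nodes.size() / 3].
inline void copy_node_record(const char* record, std::vector<double>& nodes) {
    std::int32_t node_number;
    std::memcpy(&node_number, record, sizeof(node_number));

    const std::size_t num_nodes = nodes.size() / 3;
    if (node_number < 1 || static_cast<std::size_t>(node_number) > num_nodes) {
        throw std::runtime_error("node number out of range");
    }

    // The number is now known to be in range, so the 0-based slot is safe.
    const std::size_t slot = static_cast<std::size_t>(node_number) - 1;
    std::memcpy(&nodes[slot * 3], record + sizeof(node_number), 3 * sizeof(double));
}
```

Note that a range check of this kind rejects files with sparse node numbering, which the MSH specification permits, so a complete loader would likely need a number-to-slot mapping as well.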

CVE-2024-23948 - igl::MshLoader::parse_nodes (ascii file)

When parsing the $Nodes section of the .msh file, the parse_nodes function is called and does the following:

IGL_INLINE void igl::MshLoader::parse_nodes(std::ifstream& fin) {
    size_t num_nodes;
[0] fin >> num_nodes;
[1] m_nodes.resize(num_nodes*3);

    if (m_binary) {
           ...
    } else {
        int node_idx;
        for (size_t i=0; i<num_nodes; i++) {
[2]         fin >> node_idx;
            node_idx -= 1;
            // here it's 3D node explicitly
[3]         fin >> m_nodes[node_idx*3]
                >> m_nodes[node_idx*3+1]
                >> m_nodes[node_idx*3+2];
        }
    }
}

Using user input [0], the m_nodes vector is resized [1]. The node_idx is then read from user input [2], and the data that follows is stored in the vector at node_idx*3 without verifying that the index is within bounds [3]. This can lead to an out-of-bounds write relative to the start of the m_nodes vector.

CVE-2024-23949 - igl::MshLoader::parse_node_field (ascii file)

When parsing the $NodeData section of the .msh file, the parse_node_field function is called and does the following:

IGL_INLINE void igl::MshLoader::parse_node_field( std::ifstream& fin ) {

    ...

    for (size_t i=0; i<num_int_tags; i++)
[0]     fin >> int_tags[i];

    if (num_string_tags <= 0 || num_int_tags <= 2) {
        throw std::runtime_error("Unexpected number of field tags");
    }
    std::string fieldname = str_tags[0];
    int num_components    = int_tags[1];
[1] int num_entries       = int_tags[2];

    std::vector<Float> field( num_entries*num_components );

    if (m_binary) {
        ...
        delete [] data;
    } else {
        int node_idx;
        for (size_t i=0; i<num_entries; i++) {
[2]         fin >> node_idx;
            node_idx -= 1;
            for (size_t j=0; j<num_components; j++) {
[3]             fin >> field[node_idx*num_components+j];
            }
        }
    }
...
}

Using user input [0], the number of entries is set [1] and the field vector is sized accordingly. The node_idx is then read from user input [2], and the data that follows is stored in the vector at node_idx*num_components+j without verifying that the index is within bounds [3]. This can lead to an out-of-bounds write relative to the start of the field vector.
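In the excerpt, num_entries and num_components are plain int values taken from the file's integer tags, and their product sizes the field vector. A sketch of how both the size and the per-entry offset could be validated is shown here. The helper names checked_field_size and checked_field_offset are hypothetical and do not exist in libigl.

```cpp
#include <cstddef>
#include <limits>
#include <stdexcept>

// Hypothetical helper, not libigl code: turns the two integer tags into an
// element count for the field vector, rejecting negative values and products
// that would not fit in std::size_t.
inline std::size_t checked_field_size(int num_entries, int num_components) {
    if (num_entries < 0 || num_components <= 0) {
        throw std::runtime_error("invalid field dimensions");
    }
    const std::size_t entries = static_cast<std::size_t>(num_entries);
    const std::size_t components = static_cast<std::size_t>(num_components);
    if (entries > std::numeric_limits<std::size_t>::max() / components) {
        throw std::runtime_error("field size overflows");
    }
    return entries * components;
}

// Hypothetical helper: flat offset of the first component of a 1-based
// entry number, or throws if the number is outside [1, num_entries].
inline std::size_t checked_field_offset(long long entry_number,
                                        std::size_t num_entries,
                                        std::size_t num_components) {
    if (entry_number < 1 ||
        static_cast<unsigned long long>(entry_number) > num_entries) {
        throw std::out_of_range("entry number out of range");
    }
    return (static_cast<std::size_t>(entry_number) - 1) * num_components;
}
```

The offset helper assumes the vector was sized with checked_field_size, so that the product of an in-range entry and num_components cannot wrap.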

CVE-2024-23950 - igl::MshLoader::parse_element_field (binary file)

When parsing the $ElementData section of the .msh file, the parse_element_field function is called and does the following:

IGL_INLINE void igl::MshLoader::parse_element_field(std::ifstream& fin) {
    ...
    for (size_t i=0; i<num_int_tags; i++)
[0]     fin >> int_tags[i];

    if (num_string_tags <= 0 || num_int_tags <= 2) {
        throw std::runtime_error("Invalid file format");
    }
    std::string fieldname = str_tags[0];
    int num_components = int_tags[1];
[1] int num_entries = int_tags[2];
    std::vector<Float> field(num_entries*num_components);

    if (m_binary) {
        size_t num_bytes = (num_components * m_data_size + 4) * num_entries;
        char* data = new char[num_bytes];
        igl::_msh_eat_white_space(fin);
        fin.read(data, num_bytes);
        for (int i=0; i < num_entries; i++) {
            int elem_idx;
            // works with sizeof(int)==4
[2]         memcpy(&elem_idx, &data[i*(4+num_components*m_data_size)],4);
            elem_idx -= 1;

            // directly copy data into vector storage space
[3]         memcpy(&field[elem_idx*num_components], &data[i*(4+num_components*m_data_size) + 4], m_data_size*num_components);
        }
        delete [] data;
    } else {
        ...
    }
    ...
}

Using user input [0], the number of entries is set [1] and the field vector is sized accordingly. The elem_idx is then read from user input [2], and the data that follows is stored in the vector at elem_idx*num_components without verifying that the index is within bounds [3]. This can lead to an out-of-bounds write relative to the start of the field vector.
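Since the memcpy at [3] writes m_data_size*num_components bytes starting at a computed element, a check on the whole destination byte range covers both the start index and the copy length. The sketch below shows one way such a check might look. The function checked_copy_into is hypothetical and not part of libigl, and the caller is assumed to have already rejected negative element numbers before converting them to std::size_t.

```cpp
#include <cstddef>
#include <cstring>
#include <stdexcept>
#include <vector>

// Hypothetical helper, not libigl code: copies `num_bytes` raw bytes into
// `dst` starting at element `first`, but only if the whole destination byte
// range lies inside the vector's storage.
inline void checked_copy_into(std::vector<double>& dst, std::size_t first,
                              const char* src, std::size_t num_bytes) {
    const std::size_t capacity_bytes = dst.size() * sizeof(double);
    // `first <= dst.size()` guarantees the subtraction below cannot underflow.
    if (first > dst.size() ||
        num_bytes > capacity_bytes - first * sizeof(double)) {
        throw std::out_of_range("copy would leave the destination vector");
    }
    if (num_bytes != 0) {
        std::memcpy(reinterpret_cast<char*>(dst.data()) + first * sizeof(double),
                    src, num_bytes);
    }
}
```

A caller in the style of the excerpt would pass the element offset as `first` and the per-record payload size as `num_bytes`.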

CVE-2024-23951 - igl::MshLoader::parse_element_field (ascii file)

When parsing the $ElementData section of the .msh file, the parse_element_field function is called and does the following:

IGL_INLINE void igl::MshLoader::parse_element_field(std::ifstream& fin) {
    ...

    std::vector<int> int_tags(num_int_tags);
    for (size_t i=0; i<num_int_tags; i++)
[0]     fin >> int_tags[i];

    if (num_string_tags <= 0 || num_int_tags <= 2) {
        throw std::runtime_error("Invalid file format");
    }
    std::string fieldname = str_tags[0];
    int num_components = int_tags[1];
[1] int num_entries = int_tags[2];
    std::vector<Float> field(num_entries*num_components);

    if (m_binary) {
        ...
    } else {
        int elem_idx;
        for (size_t i=0; i<num_entries; i++) {
[2]         fin >> elem_idx;
            elem_idx -= 1;
            for (size_t j=0; j<num_components; j++) {
[3]             fin >> field[elem_idx*num_components+j];
            }
        }
    }
    ...
}

Using user input [0], the number of entries is set [1] and the field vector is sized accordingly. The elem_idx is then read from user input [2], and the data that follows is stored in the vector at elem_idx*num_components+j without verifying that the index is within bounds [3]. This can lead to an out-of-bounds write relative to the start of the field vector.

TIMELINE

2023-11-22 - Initial Vendor Contact
2023-11-28 - Initial Vendor Contact
2023-11-30 - Request for confirmation
2023-12-11 - Advisories sent
2024-02-07 - Four more advisories sent, after the initial two
2024-02-27 - Request for status update
2024-04-10 - Request for status update
2024-05-15 - Request for status update via GitHub issue, no reply
2024-05-28 - Public Release

CREDIT

Discovered by Philippe Laulheret of Cisco Talos.