CVE-2021-40400
An out-of-bounds read vulnerability exists in the RS-274X aperture macro outline primitive functionality of Gerbv 2.7.0 and dev (commit b5f1eacd) and the forked version of Gerbv (commit d7f42a9a). A specially-crafted Gerber file can lead to information disclosure. An attacker can provide a malicious file to trigger this vulnerability.
Gerbv 2.7.0
Gerbv dev (commit b5f1eacd)
Gerbv forked dev (commit d7f42a9a)
Gerbv - https://sourceforge.net/projects/gerbv/ Gerbv forked - https://github.com/gerbv/gerbv
9.3 - CVSS:3.0/AV:N/AC:L/PR:N/UI:N/S:C/C:L/I:N/A:H
CWE-119 - Improper Restriction of Operations within the Bounds of a Memory Buffer
Gerbv is an open-source software that allows to view RS-274X Gerber files, Excellon drill files and pick-n-place files. These file formats are used in industry to describe the layers of a printed circuit board and are a core part of the manufacturing process.
Some PCB (printed circuit board) manufacturers use software like Gerbv in their web interfaces as a tool to convert Gerber (or other supported) files into images. Users can upload Gerber files to the manufacturer website, which are converted to an image to be displayed in the browser, so that users can verify that what has been uploaded matches their expectations. Gerbv can do such conversions using the -x
switch (export). Moreover, gerbv can be compiled and used as a shared library. For these reasons, we consider this software as reachable via network without user interaction or privilege requirements.
Gerbv uses the function gerbv_open_image
to open files. In this advisory we’re interested in the RS-274X file-type.
int
gerbv_open_image(gerbv_project_t *gerbvProject, char *filename, int idx, int reload,
gerbv_HID_Attribute *fattr, int n_fattr, gboolean forceLoadFile)
{
...
dprintf("In open_image, about to try opening filename = %s\n", filename);
fd = gerb_fopen(filename);
if (fd == NULL) {
GERB_COMPILE_ERROR(_("Trying to open \"%s\": %s"),
filename, strerror(errno));
return -1;
}
...
if (gerber_is_rs274x_p(fd, &foundBinary)) { // [1]
dprintf("Found RS-274X file\n");
if (!foundBinary || forceLoadFile) {
/* figure out the directory path in case parse_gerb needs to
* load any include files */
gchar *currentLoadDirectory = g_path_get_dirname (filename);
parsed_image = parse_gerb(fd, currentLoadDirectory); // [2]
g_free (currentLoadDirectory);
}
}
...
A file is considered of type “RS-274X” if the function gerber_is_rs274x_p
[1] returns true. When true, the parse_gerb
is called [2] to parse the input file. Let’s first look at the requirements that we need to satisfy to have an input file be recognized as an RS-274X file:
gboolean
gerber_is_rs274x_p(gerb_file_t *fd, gboolean *returnFoundBinary)
{
...
while (fgets(buf, MAXL, fd->fd) != NULL) {
dprintf ("buf = \"%s\"\n", buf);
len = strlen(buf);
/* First look through the file for indications of its type by
* checking that file is not binary (non-printing chars and white
* spaces)
*/
for (i = 0; i < len; i++) { // [3]
if (!isprint((int) buf[i]) && (buf[i] != '\r') &&
(buf[i] != '\n') && (buf[i] != '\t')) {
found_binary = TRUE;
dprintf ("found_binary (%d)\n", buf[i]);
}
}
if (g_strstr_len(buf, len, "%ADD")) {
found_ADD = TRUE;
dprintf ("found_ADD\n");
}
if (g_strstr_len(buf, len, "D00") || g_strstr_len(buf, len, "D0")) {
found_D0 = TRUE;
dprintf ("found_D0\n");
}
if (g_strstr_len(buf, len, "D02") || g_strstr_len(buf, len, "D2")) {
found_D2 = TRUE;
dprintf ("found_D2\n");
}
if (g_strstr_len(buf, len, "M00") || g_strstr_len(buf, len, "M0")) {
found_M0 = TRUE;
dprintf ("found_M0\n");
}
if (g_strstr_len(buf, len, "M02") || g_strstr_len(buf, len, "M2")) {
found_M2 = TRUE;
dprintf ("found_M2\n");
}
if (g_strstr_len(buf, len, "*")) {
found_star = TRUE;
dprintf ("found_star\n");
}
/* look for X<number> or Y<number> */
if ((letter = g_strstr_len(buf, len, "X")) != NULL) {
if (isdigit((int) letter[1])) { /* grab char after X */
found_X = TRUE;
dprintf ("found_X\n");
}
}
if ((letter = g_strstr_len(buf, len, "Y")) != NULL) {
if (isdigit((int) letter[1])) { /* grab char after Y */
found_Y = TRUE;
dprintf ("found_Y\n");
}
}
}
...
/* Now form logical expression determining if the file is RS-274X */
if ((found_D0 || found_D2 || found_M0 || found_M2) && // [4]
found_ADD && found_star && (found_X || found_Y))
return TRUE;
return FALSE;
} /* gerber_is_rs274x */
For an input to be considered an RS-274X file, the file must first of all contain only printing characters [3]. The other requirements can be gathered by the conditional expression at [4]. An example of a minimal RS-274X file is the following:
%FSLAX26Y26*%
%MOMM*%
%ADD100C,1.5*%
D100*
X0Y0D03*
M02*
Even though not important for the purposes of the vulnerability itself, note that the checks use g_strstr_len
, so all those fields can be found anywhere in the file. For example, this file is also recognized as an RS-274X file, even though it will fail later checks in the execution flow:
%ADD0X0*
After an RS-274X file has been recognized, parse_gerb
is called, which in turn calls gerber_parse_file_segment
:
gboolean
gerber_parse_file_segment (gint levelOfRecursion, gerbv_image_t *image,
gerb_state_t *state, gerbv_net_t *curr_net,
gerbv_stats_t *stats, gerb_file_t *fd,
gchar *directoryPath)
{
...
while ((read = gerb_fgetc(fd)) != EOF) {
...
case '%':
dprintf("... Found %% code at line %ld\n", line_num);
while (1) {
parse_rs274x(levelOfRecursion, fd, image, state, curr_net,
stats, directoryPath, &line_num);
If our file starts with “%”, we end up calling parse_rs274x
:
static void
parse_rs274x(gint levelOfRecursion, gerb_file_t *fd, gerbv_image_t *image,
gerb_state_t *state, gerbv_net_t *curr_net, gerbv_stats_t *stats,
gchar *directoryPath, long int *line_num_p)
{
...
switch (A2I(op[0], op[1])){
...
case A2I('A','D'): /* Aperture Description */
a = (gerbv_aperture_t *) g_new0 (gerbv_aperture_t,1);
ano = parse_aperture_definition(fd, a, image, scale, line_num_p); // [6]
...
break;
case A2I('A','M'): /* Aperture Macro */
tmp_amacro = image->amacro;
image->amacro = parse_aperture_macro(fd); // [5]
if (image->amacro) {
image->amacro->next = tmp_amacro;
...
For this advisory, we’re interested in the AM
and AD
commands. For details on the Gerber format see the specification from Ucamco.
In summary, AM
defines a “macro aperture template”, which is, in other terms, a parametrized shape. It is a flexible way to define arbitrary shapes by building on top of simpler shapes (primitives). It allows to perform arithmetic operations and define variables. After a template has been defined, the AD
command is used to instantiate the template and optionally pass some parameters to customize the shape.
From the specification, this is the syntax of the AM
command:
<AM command> = AM<Aperture macro name>*<Macro content>
<Macro content> = {{<Variable definition>*}{<Primitive>*}}
<Variable definition> = $K=<Arithmetic expression>
<Primitive> = <Primitive code>,<Modifier>{,<Modifier>}|<Comment>
<Modifier> = $M|< Arithmetic expression>
<Comment> = 0 <Text>
While this is the syntax for the AD
command:
<AD command> = ADD<D-code number><Template>[,<Modifiers set>]*
<Modifiers set> = <Modifier>{X<Modifier>}
For this advisory, we’re interested in the “Outline” primitive (code 4
). From the specification:
An outline primitive is an area defined by its outline or contour. The outline is a polygon, consisting of linear segments only, defined by its start vertex and n subsequent vertices.
The outline primitive should contain the following fields:
+-----------------+----------------------------------------------------------------------------------------+
| Modifier number | Description |
+-----------------+----------------------------------------------------------------------------------------+
| 1 | Exposure off/on (0/1) |
+-----------------+----------------------------------------------------------------------------------------+
| 2 | The number of vertices of the outline = the number of coordinate pairs minus one. |
| | An integer ≥3. |
+-----------------+----------------------------------------------------------------------------------------+
| 3, 4 | Start point X and Y coordinates. Decimals. |
+-----------------+----------------------------------------------------------------------------------------+
| 5, 6 | First subsequent X and Y coordinates. Decimals. |
+-----------------+----------------------------------------------------------------------------------------+
| ... | Further subsequent X and Y coordinates. Decimals. |
| | The X and Y coordinates are not modal: both X and Y must be specified for all points. |
+-----------------+----------------------------------------------------------------------------------------+
| 3+2n, 4+2n | Last subsequent X and Y coordinates. Decimals. Must be equal to the start coordinates. |
+-----------------+----------------------------------------------------------------------------------------+
| 5+2n | Rotation angle, in degrees counterclockwise, a decimal. |
| | The primitive is rotated around the origin of the macro definition, |
| | i.e. the (0, 0) point of macro |
+----------------------------------------------------------------------------------------------------------+
Also the specification states that “The maximum number of vertices is 5000”, which is controlled by the modified number 2. So, depending on the number of vertices, the length of this primitive will change accordingly.
In the parse_rs274x
function, when an AM
command is found, the function parse_aperture_macro
is called [5]. Let’s see how this outline primitive is handled there:
gerbv_amacro_t *
parse_aperture_macro(gerb_file_t *fd)
{
gerbv_amacro_t *amacro;
gerbv_instruction_t *ip = NULL;
int primitive = 0, c, found_primitive = 0;
...
int equate = 0;
amacro = new_amacro();
...
/*
* Since I'm lazy I have a dummy head. Therefore the first
* instruction in all programs will be NOP.
*/
amacro->program = new_instruction();
ip = amacro->program;
while(continueLoop) {
c = gerb_fgetc(fd);
switch (c) {
...
case '*':
...
/*
* Check is due to some gerber files has spurious empty lines.
* (EagleCad of course).
*/
if (found_primitive) {
ip->next = new_instruction(); /* XXX Check return value */
ip = ip->next;
if (equate) {
ip->opcode = GERBV_OPCODE_PPOP;
ip->data.ival = equate;
} else {
ip->opcode = GERBV_OPCODE_PRIM; // [10]
ip->data.ival = primitive;
}
equate = 0;
primitive = 0;
found_primitive = 0;
}
break;
...
case ',':
if (!found_primitive) { // [8]
found_primitive = 1;
break;
}
...
break;
...
case '1':
case '2':
case '3':
case '4':
case '5':
case '6':
case '7':
case '8':
case '9':
case '.':
/*
* First number in an aperture macro describes the primitive
* as a numerical value
*/
if (!found_primitive) { // [7]
primitive = (primitive * 10) + (c - '0');
break;
}
(void)gerb_ungetc(fd);
ip->next = new_instruction(); /* XXX Check return value */ // [9]
ip = ip->next;
ip->opcode = GERBV_OPCODE_PUSH;
amacro->nuf_push++;
ip->data.fval = gerb_fgetdouble(fd);
if (neg)
ip->data.fval = -ip->data.fval;
neg = 0;
comma = 0;
break;
case '%':
gerb_ungetc(fd); /* Must return with % first in string
since the main parser needs it */
return amacro; // [11]
default :
/* Whitespace */
break;
}
if (c == EOF) {
continueLoop = 0;
}
}
free (amacro);
return NULL;
}
As we can see this function implements a set of opcodes for a virtual machine that is used to perform arithmetic operations, handle variable definitions and references via a virtual stack, and primitives.
Let’s take an outline primitive definition as example:
%AMX0*4,0,3,1,1,1*%
As discussed before, %AM
will land us in the parse_aperture_macro
function, and X0
is the name for the macro. The macro parsing starts with 4
[7]: this is the primitive number, which is read as a decimal number until a ,
is found [8]. After that, each field separated by ,
is read as a double
and added to the stack via PUSH
[9]. These form the arguments to the primitive. When *
is found [10], the primitive instruction is added, and with %
the macro is returned.
For reference, these are the prototypes for the macro and the program instructions:
struct amacro {
gchar *name;
gerbv_instruction_t *program;
unsigned int nuf_push;
struct amacro *next;
}
struct instruction {
gerbv_opcodes_t opcode;
union {
int ival;
float fval;
} data;
struct instruction *next;
}
Back to parse_rs274x
: When an AD
command is found, the function parse_aperture_definition
is called [6], which in turn calls simplify_aperture_macro
when the AD
command is using a template.
static int
simplify_aperture_macro(gerbv_aperture_t *aperture, gdouble scale)
{
...
gerbv_instruction_t *ip;
int handled = 1, nuf_parameters = 0, i, j, clearOperatorUsed = FALSE; // [18]
double *lp; /* Local copy of parameters */
double tmp[2] = {0.0, 0.0};
...
/* Allocate stack for VM */
s = new_stack(aperture->amacro->nuf_push + extra_stack_size); // [12]
if (s == NULL)
GERB_FATAL_ERROR("malloc stack failed in %s()", __FUNCTION__);
...
for(ip = aperture->amacro->program; ip != NULL; ip = ip->next) {
switch(ip->opcode) {
case GERBV_OPCODE_NOP:
break;
case GERBV_OPCODE_PUSH :
push(s, ip->data.fval); // [13]
break;
...
case GERBV_OPCODE_PRIM :
/*
* This handles the exposure thing in the aperture macro
* The exposure is always the first element on stack independent
* of aperture macro.
*/
switch(ip->data.ival) {
...
case 4 : // [14]
dprintf(" Aperture macro outline [4] (");
type = GERBV_APTYPE_MACRO_OUTLINE;
/*
* Number of parameters are:
* - number of points defined in entry 1 of the stack +
* start point. Times two since it is both X and Y.
* - Then three more; exposure, nuf points and rotation.
*/
nuf_parameters = ((int)s->stack[1] + 1) * 2 + 3; // [15]
break;
...
}
if (type != GERBV_APTYPE_NONE) {
if (nuf_parameters > APERTURE_PARAMETERS_MAX) { // [16]
GERB_COMPILE_ERROR(_("Number of parameters to aperture macro (%d) "
"are more than gerbv is able to store (%d)"),
nuf_parameters, APERTURE_PARAMETERS_MAX);
nuf_parameters = APERTURE_PARAMETERS_MAX; // [17]
}
/*
* Create struct for simplified aperture macro and
* start filling in the blanks.
*/
sam = g_new (gerbv_simplified_amacro_t, 1);
sam->type = type;
sam->next = NULL;
memset(sam->parameter, 0,
sizeof(double) * APERTURE_PARAMETERS_MAX);
memcpy(sam->parameter, s->stack, // [18]
sizeof(double) * nuf_parameters);
For this advisory, all the AD
command has to do is utilize the macro that we just created, without special parameters. Let’s consider the following aperture definition:
%ADD09X0*
For AD
to use the template, it has to execute the template in the virtual machine. To this end, a virtual stack is allocated at [12] to handle parameters. The size of this stack depends on nuf_push
, which is incremented at [9] every time a GERBV_OPCODE_PUSH
instruction is added to the program.
In the case of the sample macro previously discussed, our program will contain a series of GERBV_OPCODE_PUSH
instructions (pushing the numbers 0,3,1,1,1
to the stack, at [13]) and a GERBV_OPCODE_PRIM
instruction for primitive 4 (outline), executed at [14].
At [15] the number of vertices is taken from the second field in the stack (as per specification) and the number of parameters for the primitive is calculated. At [16] the code makes sure that nuf_parameters
is not bigger than APERTURE_PARAMETERS_MAX
(102), otherwise nuf_parameters
gets limited to APERTURE_PARAMETERS_MAX
[17]. Finally at [18] the parameters are copied from the stack into the newly allocated sam
structure.
The problem in this whole logic is that the stack buffer (s->stack
) created at [12] has a size that depends on nuf_push
, while the memcpy
happening at [18] has a size that depends on nuf_parameters
. In the sample macro, the value of nuf_parameters
is 3
, however an attacker could use any arbitrary number, which is taken verbatim at [15] by reading s->stack[1]
and used to calculate nuf_parameters
. At [17] the value of nuf_parameters
is restricted to a maximum of 102, meaning that an attacker can set an arbitrary nuf_parameters
value from 0 to 102, causing the memcpy
at [18] to range from 0 to 816 (i.e. 102 * sizeof(double)
).
If nuf_push
is smaller than nuf_parameters
, the memcpy
will cause an out-of-bounds read on the s->stack
buffer, which will lead to storing data of the nearby heap chunks into sam->parameter
. Since sam->parameter
is used to draw the shape for the macro being evaluated, this will result in the final drawing having a different shape, coordinate points and rotation, depending on the values stored in the nearby heap chunks. Since an attacker might be able to read the rendered image at the end of the parsing (e.g. if the service using Gerbv is converting a .gbr
file into a .png
and returning it to the user), it would be possible to extract heap metadata or contents by reading the resulting rendered image. The quality of the information depends on the dpi
chosen for the operation, however with careful heap manipulation, in the worst case this could result in an information leak of the process’ memory.
# ./gerbv -x png -o out aperture_macro_parameters_oobr.poc
** (process:267): CRITICAL **: 15:11:07.120: Number of parameters to aperture macro (2005) are more than gerbv is able to store (102)=================================================================
==267==ERROR: AddressSanitizer: heap-buffer-overflow on address 0xf3e027b8 at pc 0xf799d8be bp 0xfff31768 sp 0xfff31338
READ of size 816 at 0xf3e027b8 thread T0
#0 0xf799d8bd (/usr/lib/i386-linux-gnu/libasan.so.4+0x778bd)
#1 0x5664448d in simplify_aperture_macro ./src/gerber.c:2051
#2 0x56646257 in parse_aperture_definition ./src/gerber.c:2272
#3 0x56640cef in parse_rs274x ./src/gerber.c:1637
#4 0x56634211 in gerber_parse_file_segment ./src/gerber.c:243
#5 0x56639d97 in parse_gerb ./src/gerber.c:768
#6 0x5664fdb3 in gerbv_open_image ./src/gerbv.c:526
#7 0x5664d760 in gerbv_open_layer_from_filename_with_color ./src/gerbv.c:249
#8 0x565b9528 in main ./src/main.c:932
#9 0xf6b91f20 in __libc_start_main (/lib/i386-linux-gnu/libc.so.6+0x18f20)
#10 0x56577220 (./gerbv+0x16220)
0xf3e027b8 is located 0 bytes to the right of 120-byte region [0xf3e02740,0xf3e027b8)
allocated by thread T0 here:
#0 0xf7a0c124 in calloc (/usr/lib/i386-linux-gnu/libasan.so.4+0xe6124)
#1 0xf70165ca in g_malloc0 (/usr/lib/i386-linux-gnu/libglib-2.0.so.0+0x4e5ca)
#2 0x566439f1 in simplify_aperture_macro ./src/gerber.c:1922
#3 0x56646257 in parse_aperture_definition ./src/gerber.c:2272
#4 0x56640cef in parse_rs274x ./src/gerber.c:1637
#5 0x56634211 in gerber_parse_file_segment ./src/gerber.c:243
#6 0x56639d97 in parse_gerb ./src/gerber.c:768
#7 0x5664fdb3 in gerbv_open_image ./src/gerbv.c:526
#8 0x5664d760 in gerbv_open_layer_from_filename_with_color ./src/gerbv.c:249
#9 0x565b9528 in main ./src/main.c:932
#10 0xf6b91f20 in __libc_start_main (/lib/i386-linux-gnu/libc.so.6+0x18f20)
SUMMARY: AddressSanitizer: heap-buffer-overflow (/usr/lib/i386-linux-gnu/libasan.so.4+0x778bd)
Shadow bytes around the buggy address:
0x3e7c04a0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x3e7c04b0: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
0x3e7c04c0: 00 00 00 00 00 00 01 fa fa fa fa fa fa fa fa fa
0x3e7c04d0: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa
0x3e7c04e0: fa fa fa fa fa fa fa fa 00 00 00 00 00 00 00 00
=>0x3e7c04f0: 00 00 00 00 00 00 00[fa]fa fa fa fa fa fa fa fa
0x3e7c0500: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 04
0x3e7c0510: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
0x3e7c0520: fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa fa
0x3e7c0530: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd fd
0x3e7c0540: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
==267==ABORTING
2021-11-22 - Initial contact
2022-01-28 - 60 day follow up
2022-02-24 - 90 day public disclosure notice; vendor agreed
2022-02-28 - Public Release
Discovered by Claudio Bozzato of Cisco Talos.