It would be good if RCS_parsercsfile_i() could report an error on corrupt repository files. But RCS_parsercsfile_i() is supposed to only do a quick scan of the repository file's keys for HEAD, BRANCH, and EXPAND keys. Knowledge of the complete structure of a repository file is in RCS_reparsercsfile(). To keep this maintainable, RCS_parsercsfile_i() should not duplicate this knowledge. Maybe the best compromise is to add a simple and quick heuristic that reports an error for a set of corruptions, e.g. if the key is too long. I intended to make a patch for this, but then I found that rcsbuf_getkey() may call realloc(3) (via rcsbuf_fill()) between assigning the buffer pointer to *keyp and finding the end of the key. This is a bug in itself (because realloc(3) may move the buffer), which requires to change the handling of the buffers (i.e. the offsets must be remembered instead of the pointers). But rather than going on a bug-fixing trip that may result in a (minor?) redesign, I would like to have a workable fix for the initial bug. Therefore it may be better to catch only the "overwritten with zeroes" corruption (and similar). rcsfile(5) effectively says that keys are limited to "any visible graphic character except special ($ | , | . | : | ; | @)". So it may be a good heuristic to outlaw all control characters in keys (except for the whitespace characters, which end a key). ================================================================================ --- src/rcs.c +++ src/rcs.c @@ -166,6 +166,25 @@ #define whitespace(c) (spacetab[(unsigned char)c] != 0) +static const char invalidtab[] = { + 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 1, 1, /* 0x00 - 0x0f */ + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 0x10 - 0x1f */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 0x20 - 0x2f */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 0x30 - 0x3f */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 0x40 - 0x4f */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 0x50 - 0x5f */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 0x60 - 0x8f */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, /* 0x70 - 0x7f */ + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 0x80 - 0x8f */ + 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, /* 0x90 - 0x9f */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 0xa0 - 0xaf */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 0xb0 - 0xbf */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 0xc0 - 0xcf */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 0xd0 - 0xdf */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, /* 0xe0 - 0xef */ + 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 /* 0xf0 - 0xff */ +}; + static char *rcs_lockfile = NULL; static int rcs_lockfd = -1; @@ -1107,9 +1126,11 @@ { register const char * const my_spacetab = spacetab; register char *ptr, *ptrend; + register const char * const my_invalidtab = invalidtab; char c; #define my_whitespace(c) (my_spacetab[(unsigned char)c] != 0) +#define my_invalid(c) (my_invalidtab[(unsigned char)c] != 0) rcsbuf->vlen = 0; rcsbuf->at_string = 0; @@ -1186,6 +1207,9 @@ c = *ptr; if (c == ';' || my_whitespace (c)) break; + /* Sanity check: is this character invalid for a key (and not whitespace)? */ + if (my_invalid (c)) + return 0; } }