[Python-Dev] cpython: replace custom validation logic in the parse module with a simple DFA validator (original) (raw)

Christian Heimes christian at python.org
Sat Jun 4 20:26:01 EDT 2016


On 2016-06-02 11:32, benjamin.peterson wrote:

https://hg.python.org/cpython/rev/4a9159ea2536 changeset: 101601:4a9159ea2536 user: Benjamin Peterson <benjamin at python.org> date: Thu Jun 02 11:30:18 2016 -0700 summary: replace custom validation logic in the parse module with a simple DFA validator (closes #26526)

Patch from A. Skrobov. files: Misc/NEWS | 3 + Modules/parsermodule.c | 2545 +-------------------------- 2 files changed, 96 insertions(+), 2452 deletions(-)

diff --git a/Misc/NEWS b/Misc/NEWS --- a/Misc/NEWS +++ b/Misc/NEWS @@ -22,6 +22,9 @@ Library ------- +- Issue #26526: Replace custom parse tree validation in the parser + module with a simple DFA validator. + - Issue #27114: Fix SSLContext.loadwindowsstorecerts fails with PermissionError diff --git a/Modules/parsermodule.c b/Modules/parsermodule.c --- a/Modules/parsermodule.c +++ b/Modules/parsermodule.c @@ -670,9 +670,75 @@ static node* buildnodetree(PyObject *tuple); -static int validateexprtree(node *tree); -static int validatefileinput(node *tree); -static int validateencodingdecl(node *tree); + +static int +validatenode(node *tree) +{ + int type = TYPE(tree); + int nch = NCH(tree); + dfa *ntdfa; + state *dfastate; + int pos, arc; + + assert(ISNONTERMINAL(type)); + type -= NTOFFSET; + if (type >= PyParserGrammar.gndfas) { + PyErrFormat(parsererror, "Unrecognized node type %d.", TYPE(tree)); + return 0; + } + ntdfa = &PyParserGrammar.gdfa[type]; + REQ(tree, ntdfa->dtype); + + /* Run the DFA for this nonterminal. */ + dfastate = &ntdfa->dstate[ntdfa->dinitial]; + for (pos = 0; pos < nch; ++pos) {_ _+ node *ch = CHILD(tree, pos);_ _+ int chtype = TYPE(ch);_ _+ for (arc = 0; arc < dfastate->snarcs; ++arc) { + short alabel = dfastate->sarc[arc].albl; + assert(alabel < PyParserGrammar.gll.llnlabels);_ _+ if (PyParserGrammar.gll.lllabel[alabel].lbtype == chtype) {_ _+ /* The child is acceptable; if non-terminal, validate it recursively. */_ _+ if (ISNONTERMINAL(chtype) && !validatenode(ch))_ _+ return 0;_ _+_ _+ /* Update the state, and move on to the next child. */_ _+ dfastate = &ntdfa->dstate[dfastate->sarc[arc].aarrow]; + goto arcfound; + } + } + /* What would this state have accepted? */ + { + short alabel = dfastate->sarc->albl; + int nexttype; + if (!alabel) /* Wouldn't accept any more children */ + goto illegalnumchildren; + + nexttype = PyParserGrammar.gll.lllabel[alabel].lbtype; + if (ISNONTERMINAL(nexttype)) + PyErrFormat(parsererror, "Expected node type %d, got %d.", + nexttype, chtype); + else + PyErrFormat(parsererror, "Illegal terminal: expected %s.", + PyParserTokenNames[nexttype]);

Coverity doesn't that line:

CID 1362505 (#1 of 1): Out-of-bounds read (OVERRUN) 20. overrun-local: Overrunning array _PyParser_TokenNames of 58 8-byte elements at element index 255 (byte offset 2040) using index next_type (which evaluates to 255).

Can you add a check to verify, that next_type is not out-of-bounds, e.g.

next_type);

+ return 0; + } + +arcfound: + continue; + } + /* Are we in a final state? If so, return 1 for successful validation. */ + for (arc = 0; arc < dfastate->snarcs; ++arc) { + if (!dfastate->sarc[arc].albl) { + return 1; + } + } + +illegalnumchildren: + PyErrFormat(parsererror, + "Illegal number of children for %s node.", ntdfa->dname); + return 0; +}



More information about the Python-Dev mailing list