
Hi,

I tested the latest beta of Python 3.4 (b3) and noticed that there is a new marshal protocol, version 3.
The documentation says little about the new features and does not go into detail.
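One effect of version 3 can be observed directly. The sketch below is only an illustration and assumes that version 3 writes repeated references to the same object as back references instead of serializing the object again. The names `shared` and `data` are made up for this example.

```python
from marshal import dumps, loads

# One tuple object that is referenced 100 times from the same list.
shared = tuple(range(1000, 1010))
data = [shared] * 100

# Serialize the same structure with both protocol versions.
size_v2 = len(dumps(data, 2))
size_v3 = len(dumps(data, 3))
print("size p2: %d bytes" % size_v2)
print("size p3: %d bytes" % size_v3)

# Both versions should load back to an equal structure.
roundtrip_ok = loads(dumps(data, 2)) == data and loads(dumps(data, 3)) == data
print("round trip ok: %s" % roundtrip_ok)
```

If the assumption holds, the version 3 output should be noticeably smaller for data with many shared objects.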

I ran a performance test with the new protocol version and noticed that it is about two times slower in serialization than version 2. I tested it with a list of 500000 simple value tuples.
Nothing special. (This happens only if the tuple also contains a tuple.)

Copy of the test code:


from time import time
from marshal import dumps

def genData(amount=500000):
    for i in range(amount):
        yield (i, i+2, i*2, (i+1,i+4,i,4), "my string template %s" % i, 1.01*i, True)

data = list(genData())
print(len(data))
t0 = time()
result = dumps(data, 2)
t1 = time()
print("duration p2: %f" % (t1-t0))
t0 = time()
result = dumps(data, 3)
t1 = time()
print("duration p3: %f" % (t1-t0))



Is the overhead of the recursion detection really that high?

Note that this happens only if the tuples in the data list themselves contain a tuple.
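To check the nested-tuple observation in isolation, the two layouts can be timed side by side. This is only a sketch: `make_rows`, `time_dumps` and `compare` are helper names invented for this example, and the flat layout simply inlines the four values of the inner tuple.

```python
from marshal import dumps
from time import perf_counter

def make_rows(amount, nested):
    """Build the test rows, with or without the inner tuple."""
    rows = []
    for i in range(amount):
        inner = (i + 1, i + 4, i, 4)
        head = (i, i + 2, i * 2)
        tail = ("my string template %s" % i, 1.01 * i, True)
        if nested:
            rows.append(head + (inner,) + tail)  # inner tuple kept as one element
        else:
            rows.append(head + inner + tail)     # same values, flattened
    return rows

def time_dumps(data, version):
    """Return the wall-clock seconds one dumps() call takes."""
    start = perf_counter()
    dumps(data, version)
    return perf_counter() - start

def compare(amount=500000):
    """Time both layouts with protocol versions 2 and 3."""
    results = {}
    for nested in (False, True):
        data = make_rows(amount, nested)
        for version in (2, 3):
            results[(nested, version)] = time_dumps(data, version)
    return results
```

Calling `compare()` returns a dict keyed by `(nested, version)`. If the slowdown really depends on the inner tuple, only the `(True, 3)` entry should stand out.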


Regards,

Wolfgang