Msgpack - ValueError: buffer source array is read-only · Issue #11880 · pandas-dev/pandas
I get a ValueError when processing data with pandas. I followed these steps:
- convert to msgpack format with compress flag
- subsequently read file into a dataframe
- push to sql table with to_sql
On the third step I get ValueError: buffer source array is read-only.
This problem does not arise if I wrap the read_msgpack call inside pandas.concat.
Example
import numpy as np
import pandas as pd
from sqlalchemy import create_engine

eng = create_engine("sqlite:///:memory:")
df1 = pd.DataFrame({ 'A' : 1., 'B' : pd.Timestamp('20130102'), 'C' : pd.Series(1,index=list(range(4)),dtype='float32'), 'D' : np.array([3] * 4,dtype='int32'), 'E' : 'foo' })

# write with compression, then read back
df1.to_msgpack('test.msgpack', compress='zlib')
df2 = pd.read_msgpack('test.msgpack')
df2.to_sql('test', eng, if_exists='append', chunksize=1000) # throws ValueError

# workaround: wrap the read in pd.concat
df2 = pd.concat([pd.read_msgpack('test.msgpack')])
df2.to_sql('test', eng, if_exists='append', chunksize=1000) # works
This happens with both blosc and zlib compression. I have found a workaround, but the behaviour seems very odd, and for very large files the workaround incurs a small performance hit.
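A possible explanation, which is an assumption and not something confirmed in this report: if the compressed path rebuilds arrays directly on top of the decompressed bytes, those arrays would be read-only, and any code that needs a writable buffer would fail. That would also be consistent with pd.concat avoiding the error, since it copies the data. The sketch below uses only numpy and zlib (no pandas internals) to show the general effect; the variable names are illustrative.

```python
import zlib

import numpy as np

# Round-trip an int32 array through zlib, as a stand-in for a compressed block.
original = np.array([3] * 4, dtype="int32")
compressed = zlib.compress(original.tobytes())

# zlib.decompress returns immutable bytes, so an array viewing them is read-only.
restored = np.frombuffer(zlib.decompress(compressed), dtype="int32")
print("restored writeable:", restored.flags.writeable)

# Writing into the read-only view raises ValueError.
try:
    restored[0] = 0
    write_failed = False
except ValueError:
    write_failed = True
print("write to restored failed:", write_failed)

# An explicit copy owns its own memory and is writable again.
writable = restored.copy()
writable[0] = 0
print("copy writeable:", writable.flags.writeable)
```

If this is indeed the cause, an explicit copy of the frame might work as well as the pd.concat wrapper, at the same cost of one extra copy of the data.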
edit: @TomAugspurger changed the sql engine to sqlite