python - Writing and reading .mat using scipy.io changes dictionary contents



I am trying to write a dictionary to a .mat file using scipy.io.savemat(), but when I do, the contents change!

Here is the array I wish to assign to the dictionary key "genes":

vectorizeddf.index.values.astype(np.str_) 

which prints

array(['44m2.3', 'a0a087wsv2', 'a0a087wt57', ..., 'tert-rmrp_human',
       'tert-terc_human', 'wisp3 varinat'],
      dtype='<U44')

Then

import numpy as np
import scipy.io as sio

# Build the dictionary to save; string arrays are cast to numpy unicode
genedict = {"genes": vectorizeddf.index.values.astype(np.str_),
            "x": vectorizeddf.values,
            "id": vectorizeddf.columns.values.astype(np.str_)}
sio.savemat("goa_human.mat", genedict)

But when I load the dictionary using

goadict = sio.loadmat("goa_human.mat") 

my strings are padded with spaces!

>>> goadict['genes']
array(['44m2.3                                      ',
       'a0a087wsv2                                  ',
       'a0a087wt57                                  ', ...,
       'tert-rmrp_human                             ',
       'tert-terc_human                             ',
       'wisp3 varinat                               '],
      dtype='<U44')

which is far from ideal. On the other hand, when I access

genedict['id'] 

I get

array(['go:0000002', 'go:0000003', 'go:0000009', ..., 'go:2001303',
       'go:2001306', 'go:2001311'],
      dtype='<U10')

which is the original format of the array before saving. It seems to me that the issue is in the dtype, but I did my best to cast both of them to strings. I am not sure why one is <U44 and the other is <U10. How might I resolve this?

Thank you!

Let's try to save a variety of objects:

In [597]: d={'alist':['one','two','three','four'],
   .....: 'adict':{'one':np.arange(5)},
   .....: 'strs': np.array(['one','two','three','four']),
   .....: 'objs': np.array(['one','two','three','four'],dtype=object)}

In [598]: d
Out[598]:
{'alist': ['one', 'two', 'three', 'four'],
 'adict': {'one': array([0, 1, 2, 3, 4])},
 'objs': array(['one', 'two', 'three', 'four'], dtype=object),
 'strs': array(['one', 'two', 'three', 'four'],
       dtype='<U5')}

In [599]: io.savemat('test.mat',d)

In [600]: dd=io.loadmat('test.mat')

In [601]: dd
Out[601]:
{'adict': array([[([[0, 1, 2, 3, 4]],)]],
       dtype=[('one', 'O')]),
 'strs': array(['one  ', 'two  ', 'three', 'four '],
       dtype='<U5'),
 'alist': array(['one  ', 'two  ', 'three', 'four '],
       dtype='<U5'),
 '__header__': b'MATLAB 5.0....',
 '__version__': '1.0',
 'objs': array([[array(['one'],
       dtype='<U3'),
         array(['two'],
       dtype='<U3'),
         array(['three'],
       dtype='<U5'),
         array(['four'],
       dtype='<U4')]], dtype=object),
 '__globals__': []}

This is scipy version '0.14.1'; it is not a particularly new one, but I haven't read of any recent changes in the io code.

And in Octave I get:

octave:14> data = load('test.mat')
data =

  scalar structure containing fields:

    alist =

one
two
three
four

    adict =

      scalar structure containing fields:

        one =

          0  1  2  3  4

    objs =
    {
      [1,1] = one
      [1,2] = two
      [1,3] = three
      [1,4] = four
    }

    strs =

one
two
three
four

The list and the str array both produce (4,5) character arrays in Octave, while the dtype=object array produces a cell array of strings.

In both d and dd, the strs array is U5 and takes 80 bytes (4 words * 5 char/word * 4 bytes/char), but in dd the strings have been padded with blanks.

In [617]: d['strs'][0]
Out[617]: 'one'

In [618]: dd['strs'][0]
Out[618]: 'one  '

In [619]: d['strs'][0].tostring()
Out[619]: b'o\x00\x00\x00n\x00\x00\x00e\x00\x00\x00'

In [620]: dd['strs'][0].tostring()
Out[620]: b'o\x00\x00\x00n\x00\x00\x00e\x00\x00\x00 \x00\x00\x00 \x00\x00\x00'
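The 80-byte figure, and the differing widths in the question (<U44 versus <U10), both follow from how numpy sizes a string array: the width is set by the longest element. A minimal sketch, with made-up variable names `short` and `mixed`:

```python
import numpy as np

# numpy picks one fixed width per array: the length of its longest string.
short = np.array(['one', 'two'])
mixed = np.array(['one', 'two', 'three', 'four'])

print(short.dtype)  # width 3, because no entry is longer than 3 characters
print(mixed.dtype)  # width 5, because 'three' has 5 characters

# Each unicode character takes 4 bytes, so an element takes width * 4 bytes.
print(mixed.itemsize, mixed.nbytes)
```

So the "genes" array is U44 simply because its longest name has 44 characters, while the longest "id" has 10.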

I haven't looked into why arrays like d['strs'] don't display the strings with padding, that is, how numpy distinguishes between blanks and 'empty' bytes. Note that in Py3 the default string is unicode. I don't know if Py2 byte strings are different (except that they take 1 byte/char).
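As far as I can tell, the difference is that numpy fills unused positions with NUL characters and drops trailing NULs when an element is read, whereas blanks are ordinary characters and are kept. A small sketch that looks at the stored code points (the array names are only illustrative):

```python
import numpy as np

unpadded = np.array(['one', 'three'])   # 'one' is stored with trailing NULs
padded = np.array(['one  ', 'three'])   # 'one  ' is stored with real spaces

# View the raw storage as one 32-bit code point per character.
codes_unpadded = unpadded.view(np.uint32).reshape(2, 5)
codes_padded = padded.view(np.uint32).reshape(2, 5)
print(codes_unpadded[0])  # last two entries are 0 (NUL)
print(codes_padded[0])    # last two entries are 32 (space)

# Reading an element strips trailing NULs but keeps spaces.
print(len(unpadded[0]), len(padded[0]))
```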

So yes, io.savemat does change a string array (and lists) by adding blanks up to the full dtype width. The purpose is to create a MATLAB style character matrix.

@zeemonkeez's link covers this, including a way of converting the character matrix to a cell:
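If the padded strings are only ever read back in Python, one possible fix is to strip the blanks after loading. This is a sketch rather than the only way to do it; it writes to an in-memory buffer instead of a file, and it assumes none of the original strings end in whitespace that needs to be kept:

```python
from io import BytesIO

import numpy as np
from scipy.io import loadmat, savemat

strs = np.array(['one', 'two', 'three', 'four'])

# Round trip through an in-memory .mat file.
buf = BytesIO()
savemat(buf, {'strs': strs})
buf.seek(0)
loaded = loadmat(buf)['strs']   # strings padded to the common width

# np.char.rstrip removes trailing whitespace element by element.
restored = np.char.rstrip(loaded)
print(loaded)
print(restored)
```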

octave:25> cellstr(data.strs)
ans =
{
  [1,1] = one
  [2,1] = two
  [3,1] = three
  [4,1] = four
}

Python to MATLAB: exporting list of strings using scipy.io
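The cell array can also be produced on the Python side by saving a dtype=object array, which avoids the padding altogether. A hedged sketch of that round trip, again using an in-memory buffer; the unpacking step assumes loadmat's default options, under which each cell comes back as a one-element string array:

```python
from io import BytesIO

import numpy as np
from scipy.io import loadmat, savemat

# An object array is written as a cell array of strings, not a char matrix.
objs = np.array(['one', 'two', 'three', 'four'], dtype=object)

buf = BytesIO()
savemat(buf, {'objs': objs})
buf.seek(0)
cells = loadmat(buf)['objs']    # object array with one small array per cell

# Pull the string out of each one-element cell.
restored = [str(cell[0]) for cell in cells.ravel()]
print(restored)
```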

