python - How does one store a Pandas DataFrame as an HDF5 PyTables table (or CArray, EArray, etc.)?


I have the following pandas DataFrame:

import pandas as pd
df = pd.read_csv('filename.csv')

Now, I can use HDFStore to write the df object to a file (like adding key-value pairs to a Python dictionary):

store = pd.HDFStore('store.h5')
store['df'] = df

http://pandas.pydata.org/pandas-docs/stable/io.html

When I look at the contents, this object is a frame.

store  

outputs

<class 'pandas.io.pytables.HDFStore'>
File path: store.h5
/df            frame        (shape->[552,23252])

However, in order to use indexing, one should store this as a table object.

My approach was to try HDFStore.put(), i.e.

HDFStore.put(key="store.h", value=df, format=table)

However, this fails with the error:

TypeError: put() missing 1 required positional argument: 'self'

How does one save pandas DataFrames as PyTables tables?

Common part - create or open an existing HDFStore file:

store = pd.HDFStore('store.h5')

Try this if you want to have all columns indexed:

store.append('key_name', df, data_columns=True)

Or this if you want to have only a subset of columns indexed:

store.append('key_name', df, data_columns=['cola','colc','coln']) 
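Columns listed in data_columns can then be used in on-disk queries. The following is a minimal sketch of that, assuming PyTables (the `tables` package) is installed; the toy DataFrame, the temporary file path, and the column names `colA` and `colB` are placeholders and not taken from the question.

```python
import os
import tempfile

import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are placeholders.
df = pd.DataFrame({"colA": np.arange(10), "colB": np.arange(10) * 2.0})

# Write to a throwaway location so no existing store.h5 is touched.
path = os.path.join(tempfile.mkdtemp(), "store.h5")

# The context manager closes the file even if an error occurs.
with pd.HDFStore(path) as store:
    # append() writes in table format; data_columns makes colA queryable.
    store.append("key_name", df, data_columns=["colA"])
    # Only the rows matching the condition are read back from disk.
    subset = store.select("key_name", where="colA > 6")

print(subset)
```

A `where` condition on a column that was not declared in data_columns (here `colB`) would be rejected, which is the practical reason for listing the columns you intend to filter on.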

PS: HDFStore.append() saves DataFrames in table format by default.
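As for the put() attempt in the question: the TypeError comes from calling put() on the HDFStore class itself rather than on an opened store instance. In addition, key is the name of the node inside the file (not the file name), and format has to be passed as the string 'table'. A minimal sketch of a working call, assuming PyTables is installed and using made-up data and a temporary path:

```python
import os
import tempfile

import pandas as pd

# Hypothetical toy data standing in for the CSV from the question.
df = pd.DataFrame({"x": [1, 2, 3], "y": [0.1, 0.2, 0.3]})

path = os.path.join(tempfile.mkdtemp(), "store.h5")

with pd.HDFStore(path) as store:
    # put() is called on the instance, and format is the string "table".
    store.put("df", df, format="table")
    # The storer object reports whether the node was written as a table.
    is_table = store.get_storer("df").is_table
    # Read the frame back to check the round trip.
    roundtrip = store["df"]

print(is_table)
```

Unlike append(), put() replaces whatever is already stored under the key, so it suits one-shot writes, while append() suits data that grows over time.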

