python - How does one store a Pandas DataFrame as an HDF5 PyTables table (or CArray, EArray, etc.)?
I have the following Pandas DataFrame:

import pandas as pd
df = pd.read_csv('filename.csv')
Now, I can use HDFStore to write the df object to a file (like adding key-value pairs to a Python dictionary):

store = pd.HDFStore('store.h5')
store['df'] = df
http://pandas.pydata.org/pandas-docs/stable/io.html
When I look at the contents, this object is stored as a frame.

store

outputs

<class 'pandas.io.pytables.HDFStore'>
File path: store.h5
/df    frame    (shape->[552,23252])
However, in order to use indexing, one should store this as a table object.

My approach was to try HDFStore.put(), i.e.

HDFStore.put(key="store.h", value=df, format=table)
However, this fails with the error:

TypeError: put() missing 1 required positional argument: 'self'

How does one save Pandas DataFrames as PyTables tables?
The common part - create a new (or open an existing) HDFStore file:

store = pd.HDFStore('store.h5')
Try this if you want to have all columns indexed:

store.append('key_name', df, data_columns=True)
Or this if you want to have only a subset of columns indexed:

store.append('key_name', df, data_columns=['cola','colc','coln'])
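Once columns are registered as data_columns, they can be referenced in a where clause with HDFStore.select(). A minimal sketch, assuming a small stand-in DataFrame with the illustrative column names 'cola', 'colc', 'coln':

import pandas as pd

df = pd.DataFrame({'cola': range(10),
                   'colc': list('abcdefghij'),
                   'coln': range(10, 20)})

with pd.HDFStore('store.h5') as store:
    # append in table format with on-disk indexes on the listed columns
    store.append('key_name', df, data_columns=['cola', 'colc', 'coln'])
    # only data_columns can be used in the 'where' query
    subset = store.select('key_name', where='cola > 5')

print(subset)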
PS: HDFStore.append() saves DataFrames in the table format by default.
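As for the TypeError in the question: put() was called on the HDFStore class itself rather than on an opened store instance, so Python complained about the missing self argument. Calling it on the instance with format='table' also produces a queryable table; a minimal sketch, where the DataFrame and the key name are just placeholders:

import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3]})  # stand-in for the real DataFrame

store = pd.HDFStore('store.h5')
# calling put() on the instance supplies 'self'; format='table' writes a
# PyTables table (queryable, appendable) instead of the fixed 'frame' format
store.put('df', df, format='table', data_columns=True)
store.close()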