python - How does one store a Pandas DataFrame as an HDF5 PyTables table (or CArray, EArray, etc.)?
I have the following pandas DataFrame:
    import pandas as pd
    df = pd.read_csv('filename.csv')
Now, I can use HDFStore to write the df object to a file (like adding key-value pairs to a Python dictionary):
    store = pd.HDFStore('store.h5')
    store['df'] = df
http://pandas.pydata.org/pandas-docs/stable/io.html
When I look at the contents, this object is a frame.
    store
outputs
    <class 'pandas.io.pytables.HDFStore'>
    File path: store.h5
    /df            frame        (shape->[552,23252])
However, in order to use indexing, one should store this as a table object.
My approach was to try HDFStore.put(), i.e.
    HDFStore.put(key="store.h", value=df, format=table)
However, this fails with the error:
    TypeError: put() missing 1 required positional argument: 'self'
How does one save Pandas DataFrames as PyTables tables?
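The TypeError above is what Python reports when an instance method is called on the class itself, here HDFStore.put() rather than put() on an opened store. The following is a minimal sketch of the instance-based call, not a definitive fix: the helper name write_as_table, the key "df", and the file path passed in are illustrative assumptions, and the format is passed as the string "table".

```python
import pandas as pd


def write_as_table(df, path, key="df"):
    """Write df to an HDF5 file in table format and read it back.

    write_as_table is a hypothetical helper for illustration only.
    """
    # Open the store first; put() must be called on this instance.
    with pd.HDFStore(path) as store:
        # format must be the string "table", not a bare name.
        store.put(key, df, format="table")
        # The storer reports whether the node is a table or a fixed frame.
        is_table = store.get_storer(key).is_table
        roundtrip = store[key]
    return is_table, roundtrip
```

Calling write_as_table(df, 'store.h5') on a small DataFrame should return a true flag for the table check, together with the frame read back from the file.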
The common part is to create or open an existing HDFStore file:
    store = pd.HDFStore('store.h5')
Try this if you want to have all columns indexed:
    store.append('key_name', df, data_columns=True)
Or, if you want to have only a subset of columns indexed:
    store.append('key_name', df, data_columns=['cola','colc','coln'])
PS: HDFStore.append() saves DataFrames in table format by default.
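The point of declaring data columns is that they can then be used in on-disk queries. A minimal sketch of that round trip follows, assuming a small DataFrame with a numeric column named cola; the helper name append_and_query and the where expression are illustrative assumptions, not part of the original answer.

```python
import pandas as pd


def append_and_query(df, path, where, key="key_name"):
    """Append df as a table with queryable columns, then filter it on disk.

    append_and_query is a hypothetical helper for illustration only.
    """
    # mode="w" starts from an empty file so repeated runs do not append twice.
    with pd.HDFStore(path, mode="w") as store:
        # data_columns=True makes every column usable in a where clause.
        store.append(key, df, data_columns=True)
        # The where string is evaluated against the stored data columns.
        return store.select(key, where=where)
```

For example, append_and_query(df, 'store.h5', where="cola > 1") would return only the rows whose cola value is greater than 1, without loading the whole table into memory first.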