python - Pandas Dataframe: Accessing via composite index created by groupby operation

- May 15, 2014

I want to calculate a group-specific ratio from two datasets. The two dataframes are read from a database with:

leases = pd.read_sql_query(sql, connection)
sales = pd.read_sql_query(sql, connection)

One contains real estate offered for sale, the other rented objects. I group both of them by city and by the category I'm interested in:

leasegroups = leases.groupby(['idconjugate', "city"])
salegroups = sales.groupby(['idconjugate', "city"])

Now I want to know the ratio between the cheapest rental object per category and city and the most expensive sold object, to obtain a lower bound on the possible return:

minlease = leasegroups['price'].min()
maxsale = salegroups['price'].max()
ratios = minlease*12/maxsale
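To make the alignment concrete, here is a minimal sketch with made-up data (the prices and rows are invented and only stand in for the two database queries). Both groupby results are Series indexed by the pair (idconjugate, city), so the division matches rows on that pair and leaves NaN where a group exists on only one side.

```python
import pandas as pd

# Hypothetical toy data standing in for the two database queries.
leases = pd.DataFrame({
    "idconjugate": [1, 1, 1, 2],
    "city": ["foix", "foix", "esbly", "foix"],
    "price": [350, 400, 495, 600],
})
sales = pd.DataFrame({
    "idconjugate": [1, 1, 1],
    "city": ["foix", "esbly", "esbly"],
    "price": [58000, 117990, 100000],
})

minlease = leases.groupby(["idconjugate", "city"])["price"].min()
maxsale = sales.groupby(["idconjugate", "city"])["price"].max()

# The division aligns on the (idconjugate, city) MultiIndex;
# a group present in only one of the two Series becomes NaN.
ratios = minlease * 12 / maxsale
print(ratios)
```

In this toy case the group (2, "foix") has leases but no sales, which is why a null filter is needed afterwards.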

I get output of the form "category - city: ratio", but I cannot access the ratio object by city or by category. I tried creating a new dataframe with:

newframe = pd.DataFrame({"minleases" : minlease, "maxsales" : maxsale, "ratios" : ratios})
newframe = newframe.loc[newframe['ratios'].notnull()]

which gives me the correct rows, and newframe.index returns the groups.

index.name gives ['idconjugate', 'city'], but indexing by them results in a KeyError. How can I make an index out of the different groups: id0+city1, id0+city2, etc.?
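For reference, a minimal sketch of how such a two-level index can usually be addressed. The frame below is a hypothetical stand-in for newframe with invented values. The KeyError is consistent with 'idconjugate' and 'city' being index levels rather than columns.

```python
import pandas as pd

# Hypothetical frame shaped like newframe; the values are invented.
index = pd.MultiIndex.from_tuples(
    [(1, "esbly"), (1, "foix"), (2, "foix")],
    names=["idconjugate", "city"],
)
newframe = pd.DataFrame(
    {"maxsales": [117990, 58000, 90000], "minleases": [495, 350, 600]},
    index=index,
)
newframe["ratios"] = newframe["minleases"] * 12 / newframe["maxsales"]

# The level names are not columns, so column-style access fails.
try:
    newframe["city"]
except KeyError:
    print("city is an index level, not a column")

# One row: pass the full key as a tuple.
one = newframe.loc[(1, "foix"), "ratios"]

# All rows for one city across categories: cross-section on that level.
foix = newframe.xs("foix", level="city")

# Or turn the index levels back into ordinary columns.
flat = newframe.reset_index()
print(flat[flat["city"] == "foix"])
```

Which of the three fits best probably depends on whether the composite key is needed later; reset_index gives up the index, while loc and xs keep it.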

Edit: the output looks like this:

                               maxsales  minleases    ratios
idconjugate city
1           argeles gazost        59500        337  0.067966
            chelles              129000        519  0.048279
            enghien-les-bains    143000        696  0.058406
            esbly                117990        495  0.050343
            foix                  58000        350  0.072414

The goal is to select the top ratios and plot them with bokeh, which, as I understand it, takes a dataframe object and plots a column versus the index:

topselect = ratio.loc[ratio["ratios"] > ratio["ratios"].quantile(quant)]

dots = Dot(topselect, values='ratios', label=topselect.index, tools=[hover,],
           title="{}% best minimal lease/sale ratios per city and group".format(topperc*100), width=600)
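The selection step on its own, without the plotting, can be sketched as follows. The rows are copied from the output shown earlier; the value of quant is an assumption chosen only for illustration.

```python
import pandas as pd

# Rows copied from the output above; quant = 0.6 is an assumed value.
index = pd.MultiIndex.from_tuples(
    [(1, "argeles gazost"), (1, "chelles"), (1, "enghien-les-bains"),
     (1, "esbly"), (1, "foix")],
    names=["idconjugate", "city"],
)
ratio = pd.DataFrame(
    {"maxsales": [59500, 129000, 143000, 117990, 58000],
     "minleases": [337, 519, 696, 495, 350]},
    index=index,
)
ratio["ratios"] = ratio["minleases"] * 12 / ratio["maxsales"]

quant = 0.6
threshold = ratio["ratios"].quantile(quant)

# Boolean selection keeps the (idconjugate, city) index of the surviving rows.
topselect = ratio.loc[ratio["ratios"] > threshold]
print(topselect)
```

With these five rows only argeles gazost and foix lie above the 0.6 quantile, and their index entries are still tuples.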

I needed the index as a list in the original order, so the following worked:

ids = []
cities = []
for l in topselect.index:
    ids.append(str(int(l[0])))
    cities.append(l[1])

newind = [i+"_"+j for i,j in zip(ids, cities)]
topselect.index = newind

Now the plot shows 1_city1 ... 1_city2 ... n_cityx on the x-axis. I figure there must be some obvious way inside the pandas framework that I'm missing.
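One shorter way to build the same labels, as a sketch: each entry of a two-level index iterates as a tuple, so the loop and the zip can be collapsed into a single expression. The frame below is hypothetical, with invented values, and assumes the ids arrive as floats, as the int() call above suggests.

```python
import pandas as pd

# Hypothetical stand-in for topselect; ids are assumed to be floats.
index = pd.MultiIndex.from_tuples(
    [(1.0, "argeles gazost"), (1.0, "foix"), (2.0, "foix")],
    names=["idconjugate", "city"],
)
topselect = pd.DataFrame({"ratios": [0.067966, 0.072414, 0.05]}, index=index)

# Unpack each (id, city) tuple directly, keeping the original row order.
labels = ["{}_{}".format(int(i), city) for i, city in topselect.index]

# Equivalent with Index.map, which calls the function once per tuple.
mapped = topselect.index.map(lambda t: "{}_{}".format(int(t[0]), t[1]))

flat = topselect.copy()
flat.index = labels
print(flat)
```

Assigning to a copy avoids modifying a frame that is itself a selection from another frame.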
