python - Numpy vectorized summation with variable number of factors


I am computing a function that contains a summation over an index. The index runs between 0 and the integer part of t; ideally, I would like to be able to compute the summation for several values of t at once. In my real-life case, most of the values of t are small, but a small percentage can be one or two orders of magnitude larger than the average.
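For concreteness, a minimal loop-based sketch of what I am computing, using the example function from step 3 below (in my real problem the function is more complicated):

import numpy as np

def summation_loop(t):
    # For each t_i, sum the example function over the factors
    # j = 1, 2, ..., up to (roughly) the integer part of t_i.
    out = np.empty(t.size)
    for i, ti in enumerate(t):
        js = np.arange(1, int(ti) + 2)  # factors for this t_i
        out[i] = np.log(1 + (1 + js) * 5).sum()
    return out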

What I am doing is:
1) define the vector t, e.g. (my real-life data have a much larger number of entries, this is just to give the idea):

import numpy as np
t = np.random.exponential(5, 10)

2) create a matrix containing the factors between 0 and int(t), and zeroes elsewhere:

n = int(t.max())
j = ((np.arange(n) < t[:, np.newaxis]) * np.arange(1, n + 1)).astype(int).transpose()
print(j)

[[ 1 1 1 1 1 1 1 1 1 1]
[ 2 0 2 2 2 0 2 0 2 2]
[ 0 0 3 0 3 0 3 0 3 3]
[ 0 0 4 0 4 0 0 0 4 4]
[ 0 0 5 0 5 0 0 0 5 5]
[ 0 0 6 0 6 0 0 0 6 6]
[ 0 0 7 0 7 0 0 0 0 7]
[ 0 0 8 0 8 0 0 0 0 8]
[ 0 0 9 0 9 0 0 0 0 9]
[ 0 0 0 0 10 0 0 0 0 10]
[ 0 0 0 0 11 0 0 0 0 0]
[ 0 0 0 0 12 0 0 0 0 0]]

3) generate the single elements of the summation, using a mask to avoid applying the function to the zero entries:

a = np.log(1 + (1 + j) * 5) * (j > 0)

4) sum along the columns:

a.sum(axis=0) 

obtaining:

array([ 5.170484  ,  2.39789527, 29.96464821,  5.170484  , 42.29052851,
        2.39789527,  8.21500643,  2.39789527, 18.49060911, 33.9899999 ])

Is there a faster/better way to vectorize that? I have the feeling it is slow due to the large number of zeroes that do not contribute to the sum, but since I am a NumPy beginner I couldn't figure out a better way of writing it.

EDIT: in my actual problem, the function applied to j also depends on a second parameter tau (stored in a vector of the same size as t), so the items contained in each column are not the same.
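For illustration, a minimal sketch of how the masked-matrix approach above could take a per-element tau via broadcasting; the exact dependence on tau is not shown here, so the function below (tau replacing the constant 5) is an assumption:

import numpy as np

t = np.random.exponential(5, 10)
tau = np.random.uniform(1, 10, t.size)  # hypothetical second parameter

n = int(t.max())
j = ((np.arange(n) < t[:, np.newaxis]) * np.arange(1, n + 1)).astype(int).transpose()

# j has shape (n, t.size) and tau has shape (t.size,), so tau
# broadcasts across the rows: each column i uses its own tau[i].
a = np.log(1 + (1 + j) * tau) * (j > 0)
out = a.sum(axis=0)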

Looking at j, each column has numbers going from 1 to n, with n decided based on the corresponding t element. Then, summing along each column is the same as summing up to that n, because the rest of the elements are zeros anyway. Those partial sums can be calculated with np.cumsum, and the n values, i.e. the limits of each column in j, can be computed directly from t. These n values are then used as indices into the cumsum-ed array to give us the final output.

This should be pretty fast and memory efficient, given that the cumsum computation is done on a 1D array, compared to the summation in the original approach, which works on a 2D array along each column. Thus, we have a vectorized approach -

n = int(t.max())
vals = (np.log(1 + (1 + np.arange(1, n + 1)) * 5)).cumsum()
out = vals[(t.astype(int)).clip(max=n - 1)]
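To make the indexing step concrete, here's a tiny worked example (the input values are made up purely for illustration):

t = np.array([2.5, 0.3, 4.0])
n = int(t.max())                       # n = 4
vals = (np.log(1 + (1 + np.arange(1, n + 1)) * 5)).cumsum()
# vals[m] holds the partial sum over the factors 1 .. m+1

idx = t.astype(int).clip(max=n - 1)    # [2, 0, 3]
# clip keeps the largest t, whose integer part equals n,
# from indexing out of bounds
out = vals[idx]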

In terms of memory usage, we are generating three variables -

n    : scalar
vals : 1D array of n elements
out  : 1D array of t.size elements (this is the output anyway)

Runtime test and output verification -

In [5]: def original_app(t):
   ...:     n = int(t.max())
   ...:     j = ((np.arange(n) < t[:,None])*np.arange(1,n+1)).astype(int).transpose()
   ...:     a = np.log(1 + (1 + j) * 5)*(j>0)
   ...:     return a.sum(axis=0)
   ...:
   ...: def vectorized_app(t):
   ...:     n = int(t.max())
   ...:     vals = (np.log(1 + (1 + np.arange(1,n+1)) * 5)).cumsum()
   ...:     return vals[(t.astype(int)).clip(max=n-1)]
   ...:

In [6]: # Input array
   ...: t = np.random.exponential(5, 10000)

In [7]: %timeit original_app(t)
100 loops, best of 3: 9.62 ms per loop

In [8]: %timeit vectorized_app(t)
10000 loops, best of 3: 50.1 µs per loop

In [9]: np.allclose(original_app(t), vectorized_app(t))  # Verify outputs
Out[9]: True
