I have this piece of code that I am trying to optimize. It uses a list comprehension and works.
series1 = np.asarray(range(10)).astype(float)
series2 = series1[::-1]
ntup = zip(series1,series2)
[['', 't:'+str(series2)][series1 > series2] for series1,series2 in ntup ]
#['', '', '', '', '', 't:4.0', 't:3.0', 't:2.0', 't:1.0', 't:0.0']
I am trying to use np.where() here. Is there a solution with numpy (without the series being consumed)?
series1 = np.asarray(range(10)).astype(float)
series2 = series1[::-1]
np.where(series1 > series2 ,'t:'+ str(series2),'' )
The result is this:
array(['', '', '', '', '', 't:[ 9. 8. 7. 6. 5. 4. 3. 2. 1. 0.]',
't:[ 9. 8. 7. 6. 5. 4. 3. 2. 1. 0.]',
't:[ 9. 8. 7. 6. 5. 4. 3. 2. 1. 0.]',
't:[ 9. 8. 7. 6. 5. 4. 3. 2. 1. 0.]',
't:[ 9. 8. 7. 6. 5. 4. 3. 2. 1. 0.]'],
dtype='|S43')
Answer 0 (score: 2)
We can use a vectorized approach based on np.core.defchararray.add to prepend 't:' to the valid strings, and np.where to choose, based on the condition, between the prefixed string and the default value of an empty string.
So, we would have an implementation like so -
np.where(series1>series2,np.core.defchararray.add('t:',series2.astype(str)),'')
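The original np.where call repeated the whole array because str(series2) is evaluated once, on the full array, before np.where ever runs. Here is a minimal sketch of the difference. It assumes a NumPy version where np.char.add is available as the public alias of np.core.defchararray.add; the names whole and labels are only illustrative.

```python
import numpy as np

series1 = np.arange(10, dtype=float)
series2 = series1[::-1]

# str() on an array yields ONE Python string describing the whole array,
# so np.where would broadcast that same string into every selected slot.
whole = 't:' + str(series2)

# astype(str) converts element by element; np.char.add then prepends 't:'
# to each converted element.
labels = np.char.add('t:', series2.astype(str))
result = np.where(series1 > series2, labels, '')

print(result.tolist())
# ['', '', '', '', '', 't:4.0', 't:3.0', 't:2.0', 't:1.0', 't:0.0']
```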
Boost it up!
We can boost the performance further by applying np.core.defchararray.add only to the valid elements selected by the mask series1>series2: initialize an array with the default empty strings, then assign only the valid values into it.
So, the modified version would look something like this -
mask = series1>series2
out = np.full(series1.size,'',dtype='U34')
out[mask] = np.core.defchararray.add('t:',series2[mask].astype(str))
Runtime test
Vectorized versions as functions :
def vectorized_app1(series1,series2):
    mask = series1>series2
    return np.where(mask,np.core.defchararray.add('t:',series2.astype(str)),'')

def vectorized_app2(series1,series2):
    mask = series1>series2
    out = np.full(series1.size,'',dtype='U34')
    out[mask] = np.core.defchararray.add('t:',series2[mask].astype(str))
    return out
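Before timing, it may be worth confirming that both functions agree with the list comprehension. The sketch below redefines them so it runs on its own. It swaps in np.char.add for np.core.defchararray.add, on the assumption that the alias is available in your NumPy version; expected is just an illustrative name.

```python
import numpy as np

def vectorized_app1(series1, series2):
    mask = series1 > series2
    return np.where(mask, np.char.add('t:', series2.astype(str)), '')

def vectorized_app2(series1, series2):
    mask = series1 > series2
    # Start from empty strings and fill only the masked positions.
    out = np.full(series1.size, '', dtype='U34')
    out[mask] = np.char.add('t:', series2[mask].astype(str))
    return out

series1 = np.arange(10, dtype=float)
series2 = series1[::-1]

# Reference result from the plain list comprehension.
expected = ['t:' + str(s2) if s1 > s2 else ''
            for s1, s2 in zip(series1, series2)]

print(vectorized_app1(series1, series2).tolist() == expected)
print(vectorized_app2(series1, series2).tolist() == expected)
```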
Timings on a bigger dataset -
In [283]: # Setup input arrays
...: series1 = np.asarray(range(10000)).astype(float)
...: series2 = series1[::-1]
...:
In [284]: %timeit [['', 't:'+str(s2)][s1 > s2] for s1,s2 in zip(series1, series2)]
10 loops, best of 3: 32.1 ms per loop # OP/@hpaulj's soln
In [285]: %timeit vectorized_app1(series1,series2)
10 loops, best of 3: 20.5 ms per loop
In [286]: %timeit vectorized_app2(series1,series2)
100 loops, best of 3: 10.4 ms per loop
As the OP noted in the comments, we can probably play around with the dtype used for series2 before appending. So, I used U32 there to keep the output dtype the same as with the str dtype, i.e. series2.astype('U32') inside the np.core.defchararray.add call. The new timings for the vectorized approaches were -
In [290]: %timeit vectorized_app1(series1,series2)
10 loops, best of 3: 20.1 ms per loop
In [291]: %timeit vectorized_app2(series1,series2)
100 loops, best of 3: 10.1 ms per loop
So, there's some further marginal improvement there!
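The dtype remark can be checked directly. This is a small sketch assuming float64 input: astype(str) already picks a 32-character unicode dtype, so asking for 'U32' explicitly gives the same dtype, while a narrower width silently truncates longer representations. The array wide is a made-up value used only to show the truncation.

```python
import numpy as np

series2 = np.arange(10, dtype=float)[::-1]

as_str = series2.astype(str)
as_u32 = series2.astype('U32')
print(as_str.dtype == as_u32.dtype)  # True

# A width that is too small cuts the text off instead of raising an error.
wide = np.array([1234.5])
print(wide.astype('U3').tolist())    # ['123']
```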
Answer 1 (score: 1)
Your list comprehension works just fine with lists; there is no real need to use arrays. And for operations like this, arrays probably won't give any speed advantage.
In [521]: series1=[float(i) for i in range(10)]
In [522]: series2=series1[::-1]
In [523]: [['', 't:'+str(s2)][s1 > s2] for s1,s2 in zip(series1, series2)]
Out[523]: ['', '', '', '', '', 't:4.0', 't:3.0', 't:2.0', 't:1.0', 't:0.0']
As @Divakar noted, there is an np.char.add function that performs string operations on arrays. My experience is that these functions are only marginally faster than list operations, and when you take into account the overhead of creating the arrays, they may be slower.
=========
The array version, as shown by @Divakar:
In [539]: aseries1=np.array(series1)
In [540]: aseries2=np.array(series2)
In [541]: np.where(aseries1>aseries2, np.char.add('t:',aseries2.astype('U3')), '')
Out[541]:
array(['', '', '', '', '', 't:4.0', 't:3.0', 't:2.0', 't:1.0', 't:0.0'],
dtype='<U5')
A couple of time tests:
In [542]: timeit [['', 't:'+str(s2)][s1 > s2] for s1,s2 in zip(series1, series2)]
100000 loops, best of 3: 15.5 µs per loop
In [543]: timeit np.where(aseries1>aseries2, np.char.add('t:',aseries2.astype('U3')), '')
10000 loops, best of 3: 63 µs per loop
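To repeat this comparison outside IPython, something like the following could be used. It is only a sketch: the helper names with_lists and with_arrays are made up for illustration, the sizes and repeat count are arbitrary, and the numbers will vary by machine and NumPy version.

```python
import timeit
import numpy as np

def with_lists(s1, s2):
    # Plain-Python version operating on lists of floats.
    return ['t:' + str(b) if a > b else '' for a, b in zip(s1, s2)]

def with_arrays(a1, a2):
    # Array version; the arrays are built outside the timed call.
    return np.where(a1 > a2, np.char.add('t:', a2.astype(str)), '')

for n in (10, 10000):
    s1 = [float(i) for i in range(n)]
    s2 = s1[::-1]
    a1, a2 = np.array(s1), np.array(s2)
    t_list = timeit.timeit(lambda: with_lists(s1, s2), number=20)
    t_arr = timeit.timeit(lambda: with_arrays(a1, a2), number=20)
    print(f"n={n}: list {t_list / 20:.2e} s, array {t_arr / 20:.2e} s per call")
```

Running both a small and a large size shows whether the fixed per-call overhead of the array functions dominates for your input.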
Answer 2 (score: 1)
This works for me. Fully vectorized.
import numpy as np
series1 = np.arange(10)
series2 = series1[::-1]
empties = np.repeat('', series1.shape[0])
ts = np.repeat('t:', series1.shape[0])
s2str = series2.astype(str)  # np.str was removed in newer NumPy; the builtin str behaves the same here
m = np.vstack([empties, np.core.defchararray.add(ts, s2str)])
cmp = np.int64(series1 > series2)
idx = np.arange(m.shape[1])
res = m[cmp, idx]
print(res)
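The last step relies on integer (fancy) indexing with one row index per column: m[cmp, idx] picks row 0 (the empty string) or row 1 (the prefixed string) separately for each column. A toy sketch of that selection, with made-up values:

```python
import numpy as np

m = np.array([['a0', 'a1', 'a2'],
              ['b0', 'b1', 'b2']])

rows = np.array([0, 1, 0])    # which row to take in each column
cols = np.arange(m.shape[1])  # 0, 1, 2: every column once

# Pairs up (rows[i], cols[i]) and returns one element per pair.
picked = m[rows, cols]
print(picked.tolist())  # ['a0', 'b1', 'a2']
```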