I am trying to multiply two arrays element by element to form a single string.
Can anyone advise?
import numpy as np

def array_translate(array):
    intlist = [x for x in array if isinstance(x, int)]
    strlist = [x for x in array if isinstance(x, str)]
    joinedlist = np.multiply(intlist, strlist)
    return "".join(joinedlist)

print(array_translate(["Cat", 2, "Dog", 3, "Mouse", 1]))  # => "CatCatDogDogDogMouse"
I get this error:
File "/Users/peteryoon/PycharmProjects/Test3/Test3.py", line 8, in array_translate
joinedlist = np.multiply(intlist, strlist)
numpy.core._exceptions.UFuncTypeError: ufunc 'multiply' did not contain a loop with signature matching types (dtype('<U21'), dtype('<U21')) -> dtype('<U21')
I was able to solve it with the list comprehension below, but I am curious how this would work with NumPy.
def array_translate(array):
    intlist = [x for x in array if isinstance(x, int)]
    strlist = [x for x in array if isinstance(x, str)]
    return "".join(intlist*strlist for intlist, strlist in zip(intlist, strlist))

print(array_translate(["Cat", 2, "Dog", 3, "Mouse", 1]))  # => "CatCatDogDogDogMouse"
Answer 0 (score: 4)
In [79]: arr = np.array(['Cat','Dog','Mouse'])
In [80]: cnt = np.array([2,3,1])
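Timings for various alternatives. The relative rankings may change with the size of the arrays (and with whether you start from lists or arrays), so you need to test them yourself: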
In [93]: timeit ''.join(np.repeat(arr,cnt))
7.98 µs ± 57.7 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [94]: timeit ''.join([str(wd)*i for wd,i in zip(arr,cnt)])
5.96 µs ± 167 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [95]: timeit ''.join(arr.astype(object)*cnt)
13.3 µs ± 50.9 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [96]: timeit ''.join(np.char.multiply(arr,cnt))
27.4 µs ± 307 ns per loop (mean ± std. dev. of 7 runs, 10000 loops each)
In [100]: timeit ''.join(np.frompyfunc(lambda w,i: w*i,2,1)(arr,cnt))
10.4 µs ± 164 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [101]: %%timeit f = np.frompyfunc(lambda w,i: w*i,2,1)
...: ''.join(f(arr,cnt))
7.95 µs ± 93.2 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
In [102]: %%timeit x=arr.tolist(); y=cnt.tolist()
...: ''.join([str(wd)*i for wd,i in zip(x,y)])
1.36 µs ± 39.7 ns per loop (mean ± std. dev. of 7 runs, 1000000 loops each)
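To repeat these measurements outside IPython and on larger inputs, something like the following could be used. It is only a sketch: the `bench` helper, its parameters, and the randomly generated test data are assumptions made for illustration, not part of the timings above.

```python
import timeit

import numpy as np


def bench(n_words, number=200):
    """Return seconds per call for three of the approaches on n_words strings."""
    rng = np.random.default_rng(0)
    # Hypothetical test data: random picks from the three example words.
    arr = np.array(["Cat", "Dog", "Mouse"])[rng.integers(0, 3, n_words)]
    cnt = rng.integers(1, 4, n_words)  # repeat counts in 1..3

    candidates = {
        "np.repeat": lambda: "".join(np.repeat(arr, cnt)),
        "list comprehension": lambda: "".join(
            [str(wd) * i for wd, i in zip(arr, cnt)]
        ),
        "object dtype": lambda: "".join(arr.astype(object) * cnt),
    }

    expected = candidates["np.repeat"]()
    results = {}
    for name, fn in candidates.items():
        # Sanity check: every approach must build the same string.
        assert fn() == expected
        results[name] = timeit.timeit(fn, number=number) / number
    return results


if __name__ == "__main__":
    for n in (3, 1000):
        for name, seconds in bench(n).items():
            print(f"n={n:5d}  {name:20s} {seconds * 1e6:10.2f} µs")
```

The numbers will differ from machine to machine and between NumPy versions, so treat them as relative comparisons only.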
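np.repeat works for arrays of any kind.
The list comprehension uses Python's string multiplication, so it does not generalize to other dtypes. It is usually the fastest, especially when starting from lists.
The object dtype approach converts the string dtype to Python strings, then delegates the operation to string multiplication.
np.char applies string methods to the elements of an array. It is convenient, but rarely fast.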
In [104]: timeit ''.join(np.repeat(arr,cnt).tolist())
4.04 µs ± 197 ns per loop (mean ± std. dev. of 7 runs, 100000 loops each)
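As a rough illustration of the notes above, the sketch below runs three of the approaches side by side on the same small arrays. The traceback in the question shows np.multiply finding no loop for the '<U21' string operands; each approach here sidesteps that in a different way. The variable names are made up for this example.

```python
import numpy as np

arr = np.array(["Cat", "Dog", "Mouse"])  # fixed-width unicode dtype ('<U5')
cnt = np.array([2, 3, 1])

# np.repeat duplicates whole elements, so the element dtype does not matter.
repeated = np.repeat(arr, cnt)

# With object dtype each element is an ordinary Python str, so `*` is
# delegated to Python's own string repetition.
via_object = arr.astype(object) * cnt

# np.char.multiply applies string repetition element by element.
via_char = np.char.multiply(arr, cnt)

results = ["".join(repeated), "".join(via_object), "".join(via_char)]
print(results)
```

All three entries of `results` should be the same string; only the intermediate arrays differ (six short strings from np.repeat, three longer ones from the other two).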
Answer 1 (score: 3)
Maybe use np.repeat:
z = np.array(['Cat', 'Dog', 'Mouse'], dtype='<U5')
"".join(np.repeat(z, (2, 3, 1)))
# => 'CatCatDogDogDogMouse'
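Applied to the function from the question, this might look like the sketch below. It assumes the input holds equally many strings and integers, in matching order, and is non-empty; the local names `words` and `counts` are chosen here for illustration.

```python
import numpy as np


def array_translate(items):
    # Split the mixed list into words and repeat counts (order is preserved).
    words = [x for x in items if isinstance(x, str)]
    counts = [x for x in items if isinstance(x, int)]
    # np.repeat repeats each word by its matching count; converting the
    # result to a list first hands plain Python strings to str.join.
    return "".join(np.repeat(words, counts).tolist())


print(array_translate(["Cat", 2, "Dog", 3, "Mouse", 1]))  # => "CatCatDogDogDogMouse"
```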