My problem is as follows. I have a (large) 3D dataset of points in real physical space (x, y, z). It is generated by nested for loops, like this:
import numpy as np

# Generate the given data with its ordering (z outermost, x innermost)
x_samples = 2
y_samples = 3
z_samples = 4
given_dat = np.zeros(((x_samples*y_samples*z_samples),3))
row_ind = 0
for z in range(z_samples):
    for y in range(y_samples):
        for x in range(x_samples):
            row = [x+.1,y+.2,z+.3]
            given_dat[row_ind,:] = row
            row_ind += 1
for row in given_dat:
    print(row)
To compare it with another set of data, I would like to reorder the given data into my desired order, as follows (unorthodox, I know):
# Generate data with the desired ordering (z outermost, y innermost)
x_samples = 2
y_samples = 3
z_samples = 4
desired_dat = np.zeros(((x_samples*y_samples*z_samples),3))
row_ind = 0
for z in range(z_samples):
    for x in range(x_samples):
        for y in range(y_samples):
            row = [x+.1,y+.2,z+.3]
            desired_dat[row_ind,:] = row
            row_ind += 1
for row in desired_dat:
    print(row)
I have written a function that does what I want, but it is horribly slow and inefficient:
def bad_method(x_samp,y_samp,z_samp,data):
    zs = np.unique(data[:,2])
    xs = np.unique(data[:,0])
    rowlist = []
    # For every (z, x) pair, scan the whole array for matching rows
    for z in zs:
        for x in xs:
            for row in data:
                if row[0] == x and row[2] == z:
                    rowlist.append(row)
    new_data = np.vstack(rowlist)
    return new_data
# Shows that my function does what I want
fix = bad_method(x_samples,y_samples,z_samples,given_dat)
print('Unreversed data')
print(given_dat)
print('Reversed Data')
print(fix)
# If it didn't work this will throw an exception
assert(np.array_equal(desired_dat,fix))
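How can I improve the function so that it runs faster? My datasets usually have around 2 million rows. It must be possible to do this with some clever slicing/indexing, which I am sure would be faster, but I am having a hard time figuring out how. Thanks for your help!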
Answer 0 (score: 2)
You can reshape your array, swap the axes as needed, and reshape again:
# (No need to copy if you don't want to keep the given_dat ordering)
data = np.copy(given_dat).reshape((z_samples, y_samples, x_samples, 3))
# swap the "y" and "x" axes
data = np.swapaxes(data, 1,2)
# back to 2-D array
data = data.reshape((x_samples*y_samples*z_samples,3))
assert(np.array_equal(desired_dat,data))
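The reshape works because the rows were written with z as the slowest-changing index and x as the fastest, so the flat array is already a (z, y, x, 3) grid in C order. As a minimal, self-contained sketch of the same idea, the snippet below wraps the steps in a function and cross-checks it against a sort-based alternative. The helper names `reorder_zyx_to_zxy` and `reorder_by_sorting` are hypothetical (not from the question), the test grids are built with `np.meshgrid` instead of the original loops, and the `np.lexsort` variant is a different technique added only for comparison; it assumes the coordinate values repeat exactly, as they do here.

```python
import numpy as np


def reorder_zyx_to_zxy(data, x_samples, y_samples, z_samples):
    """Reorder rows from (z, y, x) loop order to (z, x, y) loop order.

    Assumes `data` has shape (x_samples*y_samples*z_samples, 3) and was
    filled with z as the outermost loop and x as the innermost loop.
    """
    grid = data.reshape(z_samples, y_samples, x_samples, 3)
    # Swap the y and x axes; equivalent to np.swapaxes(grid, 1, 2)
    swapped = grid.transpose(0, 2, 1, 3)
    # Flattening the swapped (non-contiguous) array makes a copy
    return swapped.reshape(-1, 3)


def reorder_by_sorting(data):
    """Alternative: sort rows by z, then x, then y.

    np.lexsort treats the LAST key as the primary sort key.
    """
    order = np.lexsort((data[:, 1], data[:, 0], data[:, 2]))
    return data[order]


x_samples, y_samples, z_samples = 2, 3, 4

# Given ordering: z outermost, then y, then x (x changes fastest)
z, y, x = np.meshgrid(np.arange(z_samples), np.arange(y_samples),
                      np.arange(x_samples), indexing="ij")
given = np.column_stack([x.ravel() + .1, y.ravel() + .2, z.ravel() + .3])

# Desired ordering: z outermost, then x, then y (y changes fastest)
z, x, y = np.meshgrid(np.arange(z_samples), np.arange(x_samples),
                      np.arange(y_samples), indexing="ij")
desired = np.column_stack([x.ravel() + .1, y.ravel() + .2, z.ravel() + .3])

by_reshape = reorder_zyx_to_zxy(given, x_samples, y_samples, z_samples)
by_sort = reorder_by_sorting(given)

assert np.array_equal(by_reshape, desired)
assert np.array_equal(by_sort, desired)
```

The reshape version never compares coordinate values, so it should scale roughly linearly with the number of rows, but it depends on knowing the sample counts and the original loop order. The sort version needs neither, at the cost of an O(n log n) sort.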