Python - Reversing data generated with a particular loop order

Time: 2016-10-15 19:36:43

Tags: python numpy

Question: How can I perform the following task more efficiently?

My problem is as follows. I have a (large) dataset of 3D points in real physical space (x, y, z). It was generated by nested for loops, like this:

import numpy as np

# Generate the given data with its ordering (x varies fastest, then y, then z)
x_samples = 2
y_samples = 3
z_samples = 4
given_dat = np.zeros(((x_samples*y_samples*z_samples),3))
row_ind = 0
for z in range(z_samples):
    for y in range(y_samples):
        for x in range(x_samples):
            row = [x+.1,y+.2,z+.3]
            given_dat[row_ind,:] = row
            row_ind += 1
for row in given_dat:
    print(row)

To compare it with another set of data, I want to reorder the given data into my desired order, shown below (unorthodox, I know):

# Generate data with desired ordering
x_samples = 2
y_samples = 3
z_samples = 4
desired_dat = np.zeros(((x_samples*y_samples*z_samples),3))
row_ind = 0
for z in range(z_samples):
    for x in range(x_samples):
        for y in range(y_samples):
            row = [x+.1,y+.2,z+.3]
            desired_dat[row_ind,:] = row
            row_ind += 1
for row in desired_dat:
    print(row)

I have written a function that does what I want, but it is very slow and inefficient:

def bad_method(x_samp,y_samp,z_samp,data):
    zs = np.unique(data[:,2])
    xs = np.unique(data[:,0])
    rowlist = []
    for z in zs:
        for x in xs:
            for row in data:
                if row[0] == x and row[2] == z:
                    rowlist.append(row)
    new_data = np.vstack(rowlist)
    return new_data
# Show that my function does what I want
fix = bad_method(x_samples,y_samples,z_samples,given_dat)
print('Unreversed data')
print(given_dat)
print('Reversed Data')
print(fix)
# If it didn't work this will throw an exception
assert(np.array_equal(desired_dat,fix))

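How can I improve the function so that it runs faster? My datasets typically have about 2 million rows. It must be possible to do this with some clever slicing/indexing, which I am sure would be faster, but I am having a hard time figuring out how. Thanks for your help!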

1 answer:

Answer 0 (score: 2)

You can reshape your array, swap the axes as needed, and reshape again:

# (No need to copy if you don't want to keep the given_dat ordering)
data = np.copy(given_dat).reshape(( z_samples, y_samples, x_samples, 3))
# swap the "y" and "x" axes
data = np.swapaxes(data, 1,2)
# back to 2-D array
data = data.reshape((x_samples*y_samples*z_samples,3))

assert(np.array_equal(desired_dat,data))
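
For reference, here is a self-contained sketch of the same idea wrapped in a function, plus a sort-based alternative for cases where the sample counts are not known in advance. The function names `reorder_by_reshape` and `reorder_by_lexsort` are made up for this illustration and are not part of the original answer. The test data is rebuilt with `np.meshgrid` rather than the loops from the question. The `np.lexsort` variant assumes that each coordinate increases with its loop index and that equal coordinates compare exactly equal; it costs a sort, so the reshape approach is likely the better fit when the grid shape is known.

```python
import numpy as np


def reorder_by_reshape(data, x_samples, y_samples, z_samples):
    """Reorder rows from (z, y, x) loop order to (z, x, y) loop order."""
    # Rows were written with x varying fastest, then y, then z, so the
    # flat row index unrolls to axes (z, y, x) plus the 3 coordinates.
    grid = data.reshape(z_samples, y_samples, x_samples, 3)
    # Swap the y and x axes so that y now varies fastest.
    grid = grid.swapaxes(1, 2)
    # Flattening the swapped (non-contiguous) view produces a copy,
    # so the input array is left untouched.
    return grid.reshape(-1, 3)


def reorder_by_lexsort(data):
    """Same reordering without knowing the sample counts (assumption:
    coordinates increase with the loop index)."""
    # np.lexsort treats the LAST key as the primary one:
    # sort by z first, then x, then y.
    order = np.lexsort((data[:, 1], data[:, 0], data[:, 2]))
    return data[order]


# Rebuild the two orderings from the question without explicit loops.
x_samples, y_samples, z_samples = 2, 3, 4
x = np.arange(x_samples) + 0.1
y = np.arange(y_samples) + 0.2
z = np.arange(z_samples) + 0.3

# Given ordering: z outer loop, y middle loop, x inner loop.
zz, yy, xx = np.meshgrid(z, y, x, indexing="ij")
given = np.column_stack([xx.ravel(), yy.ravel(), zz.ravel()])

# Desired ordering: z outer loop, x middle loop, y inner loop.
zz, xx, yy = np.meshgrid(z, x, y, indexing="ij")
desired = np.column_stack([xx.ravel(), yy.ravel(), zz.ravel()])

fixed = reorder_by_reshape(given, x_samples, y_samples, z_samples)
print(np.array_equal(fixed, desired))
print(np.array_equal(reorder_by_lexsort(given), desired))
```

Both functions return a new array and leave `data` unchanged, so the explicit `np.copy` is not needed in this form.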