How to extract values from a Python array, perform an operation on them, and then put them back into the original array

Time: 2017-07-13 22:04:51

Tags: python arrays indexing

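I am working with a huge dataset. What I want to do is take all values > 0 from an array and put them into a new array, run statistics on those extracted values, and then put the new values back into the original array.

Suppose I have an array [0,0,0,0,0, . . . .32, .44,0,0,0] (the object arr in the script below): I want to pull out values such as .32, .44, etc., and put them into a new array, arr2.

I then want to run a statistical analysis (PCA) on the second array, obtain new values that correspond to the original positions in the original array, and replace the original values with these new ones. I have started coding this below, but I don't know how to extract the values > 0 while keeping track of their positions in the array.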

import os
import nibabel as nb
import numpy as np
import numpy.linalg as npl
import nibabel as nib
import matplotlib.pyplot as plt
from matplotlib.mlab import PCA
#from dipy.io.image import load_nifti, save_nifti

np.set_printoptions(precision=4, suppress=True)
FA = './all_FA_skeletonised.nii'

from dipy.io.image import load_nifti
img = nib.load(FA)
data = img.get_data()
data.shape        #get x,y,z and subject # parameters from image

#place subject number into a variable
vol_shape = data.shape[:-1] # x,y,z coordinates
n_vols = data.shape[-1]   # 28 subjects volumes

# N is the num of voxels (dimensions) in a volume
N = np.prod(vol_shape)

#- Reshape first dimension of whole image data array to N, and take
#- transpose
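arr2 = []
arr = data.reshape(N, n_vols).T  # 28 X 7,200,000 array
# Attempted extraction of values > 0 (this is the part that does not work:
# each `a` is a whole row, and the positions of the values are lost)
for a in arr:
    if a > 0:
        arr2.append(a)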

row_means = np.outer(np.mean(arr2, axis=1), np.ones(N))
X = arr2 - row_means # mean center data array

#- Calculate unscaled covariance matrix of X
unscaled_covariance = X.dot(X.T)
unscaled_covariance.shape

# Calculate U, S, VT with SVD on unscaled covariance matrix
U, S, VT = npl.svd(unscaled_covariance)
#- Use subplots to make axes to plot first 10 principal component
#- vectors
#- Plot one component vector per sub-plot.
fig, axes = plt.subplots(10, 1)
for i, ax in enumerate(axes):
    ax.plot(U[:, i])

#- Calculate scalar projections for projecting X onto U
#- Put results into array C.
C = U.T.dot(X)

#- Put values in C back into original data matrix

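2 Answers:

Answer 0: (score: 1)

I would extract the desired values together with their positions (in the original array) and store them in a dictionary as index_in_the_original_array: value_in_the_original_array. Then I would do the computation on the values of the dictionary. Finally, since we kept the indices (as the keys of the dictionary), we can use them to replace the values in the original array. In code: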

from pprint import pprint

original_array = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

# Collecting all values & indices of the elements that are greater than 5:
my_dictionary = {index: value for index, value in enumerate(original_array) if value > 5}
pprint(original_array)      # [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
pprint(my_dictionary)       # {5: 6, 6: 7, 7: 8, 8: 9, 9: 10}

# doing the processing (Here just incrementing the values by 2):
my_dictionary = {key: my_dictionary[key] + 2 for key in my_dictionary.keys()}
pprint(my_dictionary)       # {5: 8, 6: 9, 7: 10, 8: 11, 9: 12}

# Replacing the new values into the original array:
for key in my_dictionary.keys():
    original_array[key] = my_dictionary[key]

pprint(original_array)      # [1, 2, 3, 4, 5, 8, 9, 10, 11, 12]

Update:

If we want to avoid using a dictionary, we can do the following, which is essentially the same as above.

import numpy as np

def process_data(data):
    return data * 5

original_array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])
new_array = np.array([[index, value] for index, value in enumerate(original_array) if value > 5])
print(new_array)    # [[ 5  6]
                    #  [ 6  7]
                    #  [ 7  8]
                    #  [ 8  9]
                    #  [ 9 10]]

# doing the processing (Here, just using the above function that multiplies the values by 5):
new_array[:, 1] = process_data(new_array[:, 1])
print(new_array)    # [[ 5 30]
                    #  [ 6 35]
                    #  [ 7 40]
                    #  [ 8 45]
                    #  [ 9 50]]

# Replacing the new values into the original array:
for indx, val in new_array:
    original_array[indx] = val

print(original_array)  # [ 1  2  3  4  5 30 35 40 45 50]
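
As a possible alternative to the explicit index/value pairs above, NumPy boolean-mask indexing can do the same extract, process, and replace steps without a Python loop. The following is only a minimal sketch of that swapped-in technique; `process_data` is the same placeholder as above, and the names `mask` and `extracted` are introduced here for illustration.

```python
import numpy as np

def process_data(values):
    # Placeholder processing step, mirroring the example above
    return values * 5

original_array = np.array([1, 2, 3, 4, 5, 6, 7, 8, 9, 10])

# Boolean mask marking the positions of the values to extract
mask = original_array > 5

# Extract the selected values (this is a copy), process them,
# and write the results back to the same positions
extracted = original_array[mask]
original_array[mask] = process_data(extracted)

print(original_array)  # [ 1  2  3  4  5 30 35 40 45 50]
```

The mask plays the role of the stored indices: as long as the processed values keep the order of `extracted`, assigning through `original_array[mask]` puts each one back where it came from.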

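Answer 1: (score: 0)

Edit: I misunderstood the question (see the comments), so here is an update.

Suppose we have a=[0,0,1,2,0,3] and b=[.1, .1, .1], and want to combine them to get [0, 0, .1, .1, 0, .1], i.e. the zeros stay at the same indices and all other values are replaced: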

import numpy as np
b = np.array([.1, .1, .1])
a = np.array([0,0,1,2,0,3], dtype='float64')  # expects same dtype
np.place(a, a>0, b)  # modify in place

If you need the original values, back up a before the np.place line.
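
A minimal sketch of such a backup, assuming the same `a` and `b` as above (the name `a_backup` is made up for this example):

```python
import numpy as np

a = np.array([0, 0, 1, 2, 0, 3], dtype='float64')
b = np.array([.1, .1, .1])

# Take an independent copy first; a plain `a_backup = a` would only
# create a second name for the same array
a_backup = a.copy()

np.place(a, a > 0, b)  # overwrites the positive entries of `a` in place

print(a)         # modified array
print(a_backup)  # original values, untouched
```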

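Previous version:

I'm not sure I understood you correctly. I assume that by 'keeping the positions in the array' you mean that, for example, [0,0,1,2,0,3,0] should evaluate to [1,2,3] (and not [1,3,2] or something else). You can do this with a[a != 0], where a is your array. If you only want to strip leading/trailing zeros, try numpy.trim_zeros.

If the input is a 2D array or a matrix, things are different, because you need to preserve its shape.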
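
To make the difference between the two options concrete, here is a small sketch using the example array from the paragraph above (the variable names are illustrative only):

```python
import numpy as np

a = np.array([0, 0, 1, 2, 0, 3, 0])

# Boolean indexing drops every zero and keeps the remaining order
nonzero_values = a[a != 0]

# trim_zeros removes only the leading and trailing zeros
trimmed = np.trim_zeros(a)

print(nonzero_values)
print(trimmed)
```

Note that the zero between 2 and 3 survives `np.trim_zeros` but not the boolean-index version.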
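
One way this might be handled for a 2D array is sketched below: a boolean mask with the same shape as the array selects the values, and assigning through that mask writes the results back without changing the shape. The sample data and the mean-centering step are placeholders chosen for illustration; they are not the asker's data or PCA.

```python
import numpy as np

# Small made-up 2D array standing in for the real data
arr = np.array([[0.0, 0.32, 0.0],
                [0.44, 0.0, 0.5]])

mask = arr > 0        # boolean array with the same shape as arr
values = arr[mask]    # 1-D array of the selected values, in row-major order

# Placeholder for the real analysis: mean-center the extracted values
new_values = values - values.mean()

result = arr.copy()
result[mask] = new_values  # written back in the same row-major order

print(result.shape)
print(result)
```

Because extraction and assignment through the same mask use the same ordering, each new value lands at the position its source value came from, and the zeros are left untouched.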