Question

I have a DataFrame df1 that looks like this:

     Sample_names    esv0     esv1   esv2   ...    esv918  esv919  esv920  esv921
0    pr1gluc8NH1     2.1      3.5   6222   ...         0       0       0       0
1    pr1gluc8NH2  3189.0     75.0   9045   ...         0       0       0       0
2  pr1gluc8NHCR1     0.0   2152.0  12217   ...         0       0       0       0
3  pr1gluc8NHCR2     0.0  17411.0   1315   ...         0       1       0       0
4     pr1sdm8NH1   365.0      7.0   4117   ...         0       0       0       0
5     pr1sdm8NH2  4657.0     18.0  13520   ...         0       0       0       0
6   pr1sdm8NHCR1     0.0    139.0   3451   ...         0       0       0       0
7   pr1sdm8NHCR2  1130.0   1439.0   4163   ...         0       0       0       0

I want to perform some operations on each row and replace the rows with the results, using a for loop.

for i in range(len(df1)):
    x = df1.iloc[i].values  # all the values in row i
    x = np.vstack(x[1:]).astype(float)  # convert everything except the first element (a string) from object dtype to a 2D float array
    x = x / np.sum(x)  # normalize so the values sum to 1
    df1.iloc[i, 1:] = x  # this is the step that should replace part of the old row with the new array

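However, this raises "ValueError: Must have equal len keys and value when setting with an ndarray". The length of x equals the length of a row of df1 minus 1 (I don't want to replace the first column, Sample_names).

I also tried df1 = df1.replace(df1.iloc[i, 1:], x). That raises "TypeError: value argument must be scalar, dict, or Series".

I would appreciate any ideas on how to achieve this.

Thanks.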

Answer 1

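You need to reshape the array x, because its shape is (n, 1), where n is the number of esv-like columns. The target df1.iloc[i, 1:] is one-dimensional with n elements, so the shapes do not match.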
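To see where the (n, 1) shape comes from, here is a minimal sketch. The row is a made-up miniature of one row of df1 with only three esv values, not the asker's real data.

```python
import numpy as np

# Hypothetical row: a sample name followed by three esv values,
# stored as an object array the way df1.iloc[i].values would be.
row = np.array(["pr1gluc8NH1", 2.1, 3.5, 6222], dtype=object)

# np.vstack treats every scalar as its own 1x1 array and stacks them
# vertically, so the result is a column vector rather than a flat array.
x = np.vstack(row[1:]).astype(float)
print(x.shape)            # (3, 1)

# squeeze() drops the length-1 axis, giving the 1D shape the row slice expects.
print(x.squeeze().shape)  # (3,)
```

x.ravel() or x.flatten() should work equally well here; squeeze() is just the option used in this answer.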

Change the line df1.iloc[i, 1:] = x to:

df1.iloc[i, 1:] = x.squeeze()
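As a rough end-to-end check, the corrected loop can be run on a small stand-in frame. The frame below is hypothetical (two rows, three float esv columns). The loop-free version at the end is an alternative that is not part of the original answer; it assumes every column after Sample_names is numeric.

```python
import numpy as np
import pandas as pd

# Hypothetical miniature of df1; all esv columns are floats so that
# writing normalized values back does not change any column dtype.
df1 = pd.DataFrame({
    "Sample_names": ["pr1gluc8NH1", "pr1gluc8NH2"],
    "esv0": [2.0, 3189.0],
    "esv1": [3.0, 75.0],
    "esv2": [5.0, 9045.0],
})

# Loop version with the squeeze() fix applied.
fixed = df1.copy()
for i in range(len(fixed)):
    x = np.vstack(fixed.iloc[i].values[1:]).astype(float)  # shape (n, 1)
    x = x / np.sum(x)                                      # normalize to sum 1
    fixed.iloc[i, 1:] = x.squeeze()                        # shape (n,) fits the row slice

# Alternative without a loop: divide each row by its own sum.
esv_cols = df1.columns[1:]
vectorized = df1.copy()
vectorized[esv_cols] = df1[esv_cols].div(df1[esv_cols].sum(axis=1), axis=0)

print(fixed)
print(vectorized)
```

Both frames should hold the same values, with every row of esv columns summing to 1. On a frame with hundreds of esv columns the vectorized form is likely to be faster, since it avoids per-row Python overhead.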
