I have three pandas columns whose elements are lists. To combine these lists, I can write out the column names explicitly and add them together with +:
import pandas as pd
df = pd.DataFrame({'allmz': [[1,2,3],[2,4,5],[2,5,5],[2,3,5],[1,4,5]], 'allint': [[11,31,31],[21,41,51],[41,51,51],[11,31,51],[1,51,11]], 'allx': [[6,7,3],[2,4,5],[2,5,5],[2,9,5],[3,4,5]]})
df['new'] = df['allmz'] + df['allint'] + df['allx']
print(df)
         allint      allmz       allx                             new
0  [11, 31, 31]  [1, 2, 3]  [6, 7, 3]  [1, 2, 3, 11, 31, 31, 6, 7, 3]
1  [21, 41, 51]  [2, 4, 5]  [2, 4, 5]  [2, 4, 5, 21, 41, 51, 2, 4, 5]
2  [41, 51, 51]  [2, 5, 5]  [2, 5, 5]  [2, 5, 5, 41, 51, 51, 2, 5, 5]
3  [11, 31, 51]  [2, 3, 5]  [2, 9, 5]  [2, 3, 5, 11, 31, 51, 2, 9, 5]
4   [1, 51, 11]  [1, 4, 5]  [3, 4, 5]   [1, 4, 5, 1, 51, 11, 3, 4, 5]
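However, if there are too many column names to write out, is there a way to do this by looping (or without looping) over a list of column names instead, such as columns = ['allmz','allint','allx']?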
Answer 0 (score: 3)
Option 1
Slice the columns and call sum along axis=1, so that the lists in each row are added together.
df['new'] = df[['allmz','allint','allx']].sum(axis=1)
df
         allint      allmz       allx                             new
0  [11, 31, 31]  [1, 2, 3]  [6, 7, 3]  [1, 2, 3, 11, 31, 31, 6, 7, 3]
1  [21, 41, 51]  [2, 4, 5]  [2, 4, 5]  [2, 4, 5, 21, 41, 51, 2, 4, 5]
2  [41, 51, 51]  [2, 5, 5]  [2, 5, 5]  [2, 5, 5, 41, 51, 51, 2, 5, 5]
3  [11, 31, 51]  [2, 3, 5]  [2, 9, 5]  [2, 3, 5, 11, 31, 51, 2, 9, 5]
4   [1, 51, 11]  [1, 4, 5]  [3, 4, 5]   [1, 4, 5, 1, 51, 11, 3, 4, 5]
Option 2
Another option, using np.concatenate:
import numpy as np
# Assumes every list has the same length, so the values form a regular array
v = df[['allmz','allint','allx']].values.tolist()
df['new'] = np.concatenate(v, axis=0).reshape(len(df), -1).tolist()
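Either option can take its column names from a list, which is what the question asks for. The following is a minimal sketch, not part of the original answer: it assumes the columns list from the question and swaps in itertools.chain, which also copes with lists of different lengths. The shortened sample data is made up for illustration.

```python
from itertools import chain

import pandas as pd

# Illustrative data; the second 'allint' list is deliberately shorter
df = pd.DataFrame({
    'allmz': [[1, 2, 3], [2, 4, 5]],
    'allint': [[11, 31, 31], [21, 41]],
    'allx': [[6, 7, 3], [2, 4, 5]],
})

columns = ['allmz', 'allint', 'allx']

# zip walks the selected columns row by row;
# chain.from_iterable flattens each row's lists into a single list
combined = [list(chain.from_iterable(row)) for row in zip(*(df[c] for c in columns))]
df['new'] = pd.Series(combined, index=df.index)

print(df['new'].tolist())
# [[1, 2, 3, 11, 31, 31, 6, 7, 3], [2, 4, 5, 21, 41, 2, 4, 5]]
```

Unlike the reshape in Option 2, this does not require the combined rows to have equal length.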
Answer 1 (score: 2)
You can use Python's built-in sum function.
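columns = ['allmz', 'allint', 'allx']
df['new'] = sum((df[c] for c in columns[1:]), df[columns[0]])  # start from the first column so sum() adds Series to Series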
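A closely related sketch, assuming the columns list from the question: functools.reduce with operator.add applies + pairwise across the selected columns and needs no start value. The two-row sample data is illustrative only.

```python
from functools import reduce
import operator

import pandas as pd

df = pd.DataFrame({
    'allmz': [[1, 2, 3], [2, 4, 5]],
    'allint': [[11, 31, 31], [21, 41, 51]],
    'allx': [[6, 7, 3], [2, 4, 5]],
})

columns = ['allmz', 'allint', 'allx']

# Equivalent to (df['allmz'] + df['allint']) + df['allx'];
# adding two object columns concatenates the lists row by row
df['new'] = reduce(operator.add, (df[c] for c in columns))

print(df['new'].tolist())
# [[1, 2, 3, 11, 31, 31, 6, 7, 3], [2, 4, 5, 21, 41, 51, 2, 4, 5]]
```

Because the column names come from the columns list, a previously created 'new' column is not picked up by accident.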
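Answer 2 (score: 1)
If you have a large number of column names, a simple way to solve this is shown below:
col = df.loc[:, "allint":"allx"]
where "allint" is the first column name in the range and "allx" is the last. The label slice includes both endpoints and follows the DataFrame's column order.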
df['new'] = col.sum(axis=1)
df
This gives you the same result as writing out every column name.
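Since a label slice depends on the order of the columns, it can be worth checking which columns it actually selects before summing. A minimal sketch follows. The explicit columns argument is an assumption added here so that the order does not depend on the pandas version.

```python
import pandas as pd

data = {
    'allmz': [[1, 2, 3], [2, 4, 5]],
    'allint': [[11, 31, 31], [21, 41, 51]],
    'allx': [[6, 7, 3], [2, 4, 5]],
}

# Fix the column order explicitly (assumption for this sketch)
df = pd.DataFrame(data, columns=['allmz', 'allint', 'allx'])

# .loc label slices include both endpoints and follow the column order
selected = df.loc[:, 'allmz':'allx'].columns.tolist()
print(selected)   # ['allmz', 'allint', 'allx']

# Starting at 'allint' skips 'allmz', because it comes earlier in this order
partial = df.loc[:, 'allint':'allx'].columns.tolist()
print(partial)    # ['allint', 'allx']
```

With this column order, the slice "allint":"allx" would leave out allmz, so the start and end labels should be chosen to match the actual order of df.columns.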