Sum of columns with strings and numbers in pandas

Time: 2019-04-26 05:26:54

Tags: pandas

I need to sum columns a and b, which contain strings in the first row:

>>> df
   a  b
0  c  d
1  1  2
2  3  4
>>> df['sum'] = df.sum(1)
>>> df
   a  b sum
0  c  d  cd
1  1  2   3
2  3  4   7

I only want to add the numeric values and get output like this:

>>> df
   a  b sum
0  c  d  "dummyString/NaN"
1  1  2   3
2  3  4   7

I only need to add some of the columns:

df['sum']=df['a']+df['b']

1 answer:

Answer 0: (score: 3)

Solution for mixed data (numbers together with strings):

I think the simplest approach is to sum first and then use to_numeric to convert the non-numeric results to NaN:

df['sum'] = pd.to_numeric(df[['a','b']].sum(1), errors='coerce')

Or:

df['sum'] = pd.to_numeric(df['a']+df['b'], errors='coerce')
print (df)
   a  b  sum
0  c  d  NaN
1  1  2  3.0
2  3  4  7.0

Edit:

If the numbers are stored as strings, convert each column to numeric first and then sum:

df['sum'] = pd.to_numeric(df['a'], errors='coerce') + pd.to_numeric(df['b'], errors='coerce')
print (df)
   a  b  sum
0  c  d  NaN
1  1  2  3.0
2  3  4  7.0
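
The order of the two steps matters once the numbers are stored as text. Here is a minimal sketch of the difference, not part of the original answer. The frames `mixed` and `as_text` and the two helper functions are hypothetical names made up for illustration. Both frames are built with `dtype=object` so the example behaves the same whichever string dtype pandas would otherwise infer.

```python
import pandas as pd

# Hypothetical data: real integers next to strings...
mixed = pd.DataFrame({"a": ["c", 1, 3], "b": ["d", 2, 4]}, dtype=object)
# ...and the same numbers stored as text.
as_text = pd.DataFrame({"a": ["c", "1", "3"], "b": ["d", "2", "4"]}, dtype=object)


def sum_then_coerce(df):
    # Add the columns first, then turn non-numeric results into NaN.
    return pd.to_numeric(df["a"] + df["b"], errors="coerce")


def coerce_then_sum(df):
    # Turn each column into numbers first (non-numeric -> NaN), then add.
    return (pd.to_numeric(df["a"], errors="coerce")
            + pd.to_numeric(df["b"], errors="coerce"))


print(sum_then_coerce(mixed).tolist())    # [nan, 3.0, 7.0]
print(coerce_then_sum(mixed).tolist())    # [nan, 3.0, 7.0]

# With text digits, "+" concatenates: '1' + '2' is '12', which parses as 12.
print(sum_then_coerce(as_text).tolist())  # [nan, 12.0, 34.0]
print(coerce_then_sum(as_text).tolist())  # [nan, 3.0, 7.0]
```

When the numeric cells are real integers, both orders give the same result. When they are strings, only converting first produces the arithmetic sum.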