I need to sum columns a and b, but the first row contains strings:
>>> df
a b
0 c d
1 1 2
2 3 4
>>> df['sum'] = df.sum(1)
>>> df
a b sum
0 c d cd
1 1 2 3
2 3 4 7
I only want to add the numeric values and get output like this:
>>> df
a b sum
0 c d "dummyString/NaN"
1 1 2 3
2 3 4 7
I just need to add the columns:
df['sum']=df['a']+df['b']
Answer 0 (score: 3)
Solution for mixed data (numbers together with strings):
I think the simplest approach is to sum first and then use to_numeric with errors='coerce' to convert the non-numeric results to NaN:
df['sum'] = pd.to_numeric(df[['a','b']].sum(1), errors='coerce')
Or:
df['sum'] = pd.to_numeric(df['a']+df['b'], errors='coerce')
print (df)
a b sum
0 c d NaN
1 1 2 3.0
2 3 4 7.0
Edit:
If the numbers are stored as string representations, first convert each column to numeric and then sum:
df['sum'] = pd.to_numeric(df['a'], errors='coerce') + pd.to_numeric(df['b'], errors='coerce')
print (df)
a b sum
0 c d NaN
1 1 2 3.0
2 3 4 7.0
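For reference, here is a minimal self-contained sketch of the convert-then-sum approach. The frames `mixed` and `strings` are reconstructions of the question's data, and `numeric_sum` is a hypothetical helper name used only for illustration; none of them come from the original post.

```python
import pandas as pd


def numeric_sum(frame, columns):
    """Hypothetical helper: coerce each column to numbers, then add row-wise."""
    # errors="coerce" turns anything that cannot be parsed as a number into NaN
    converted = frame[columns].apply(pd.to_numeric, errors="coerce")
    # skipna=False keeps NaN for any row that had a non-numeric value
    return converted.sum(axis=1, skipna=False)


# Assumed reconstruction of the question's frame: strings mixed with integers
mixed = pd.DataFrame({"a": ["c", 1, 3], "b": ["d", 2, 4]})

# Assumed variant from the edit: every value is stored as a string
strings = pd.DataFrame({"a": ["c", "1", "3"], "b": ["d", "2", "4"]})

mixed["sum"] = numeric_sum(mixed, ["a", "b"])
strings["sum"] = numeric_sum(strings, ["a", "b"])

print(mixed)
print(strings)
```

Converting before adding matters in the all-strings case: df['a'] + df['b'] would concatenate '1' and '2' into '12' rather than produce 3.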