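I want to perform an operation on one column of an Excel file. The column contains both string and integer data, so its dtype is object.
My data looks like this in Excel (a mix of strings and numbers):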
Time Spent
3600
0
None
1800
0
I tried the following code:
if (df['Time Spent']=='None').all():
    df['Time Spent'] = 0
else:
    df['Time Spent'] = df['Time Spent'].astype('int')/3600
The error I get:
Index([u'Issue Key', u'Issue Id', u'Summary', u'Assignee', u'Priority',
u'Issue Type', u'Status', u'Tag', u'Original Estimate', u'Time Spent',
u'Resolution Date', u'Created Date'],
dtype='object')
Traceback (most recent call last):
File "dashboard_migration_graph_Resolved.py", line 60, in <module>
df['Time Spent'] = df['Time Spent'].astype('int')/3600
File "/usr/lib64/python2.7/site-packages/pandas/util/_decorators.py", line 118, in wrapper
return func(*args, **kwargs)
File "pandas/_libs/lib.pyx", line 854, in pandas._libs.lib.astype_intsafe
File "pandas/_libs/src/util.pxd", line 91, in util.set_value_at_unsafe
ValueError: invalid literal for long() with base 10: 'None'
Answer 0 (score: 3)
Use to_numeric with errors='coerce' to convert every non-numeric value to a missing value, then add Series.fillna before the division:
df['Time Spent'] = pd.to_numeric(df['Time Spent'], errors='coerce').fillna(0)/3600
print (df)
Time Spent
0 1.0
1 0.0
2 0.0
3 0.5
4 0.0
If None should be returned as a missing value, just remove fillna. The None entries then become NaN instead, so the column can still be numeric:
df['Time Spent'] = pd.to_numeric(df['Time Spent'], errors='coerce')/3600
print (df)
Time Spent
0 1.0
1 0.0
2 NaN
3 0.5
4 0.0
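For anyone who wants to try both variants without the original spreadsheet, here is a minimal self-contained sketch. It assumes the column can be rebuilt from the five values shown in the question; the names seconds, Hours (filled) and Hours (missing) are made up for illustration and are not from the original post.

```python
import pandas as pd

# Sample data rebuilt from the question (an assumption, not the real Excel file).
# Mixing integers with the string 'None' gives the column object dtype.
df = pd.DataFrame({'Time Spent': [3600, 0, 'None', 1800, 0]})

# errors='coerce' turns every value that cannot be parsed as a number into NaN.
seconds = pd.to_numeric(df['Time Spent'], errors='coerce')

# Variant 1: treat unparseable values as zero hours.
df['Hours (filled)'] = seconds.fillna(0) / 3600

# Variant 2: keep unparseable values as missing (NaN).
df['Hours (missing)'] = seconds / 3600

print(df)
```

Which variant fits depends on whether a missing time should count as zero in later sums and averages, or be skipped.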
Answer 1 (score: 2)
I can't comment (not enough reputation), but have you tried:
df['Time Spent'] = df['Time Spent'].replace('None', 0).astype(int)/3600
Hope this works for you.
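A small sketch of the replace approach, again assuming the column can be rebuilt from the values in the question. It also shows one limitation: only the exact string 'None' is handled, so any other non-numeric entry still makes astype(int) raise. The 'N/A' value and the names hours, messy and failed are hypothetical and used only for illustration.

```python
import pandas as pd

# Sample data rebuilt from the question (an assumption, not the real Excel file).
df = pd.DataFrame({'Time Spent': [3600, 0, 'None', 1800, 0]})

# Swap the literal string 'None' for 0, then convert seconds to hours.
hours = df['Time Spent'].replace('None', 0).astype(int) / 3600
print(hours.tolist())

# Hypothetical column with a different placeholder string.
messy = pd.Series([3600, 'N/A'], dtype=object)
failed = False
try:
    messy.replace('None', 0).astype(int)
except ValueError as exc:
    # 'N/A' is not replaced, so the integer conversion fails.
    failed = True
    print('astype failed:', exc)
```

If the column may contain placeholders other than 'None', pd.to_numeric with errors='coerce' is the more forgiving choice.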