For example, I have a DataFrame with two columns:
A B
00:01:05 2018-10-10 23:58:10
I want to get a third column, C, that is the sum of A and B:
A B C
00:01:05 2018-10-10 23:58:10 2018-10-10 23:59:15
If I do this:
df['C']= df['A'] + df['B']
I get this error:
cannot add DatetimeArray and DatetimeArray
Answer 0 (score: 2)
Here is your sample DataFrame:
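import pandas as pd
sample = pd.DataFrame()
sample['A'] = ['00:01:05']
sample['B'] = ['2018-10-10 23:58:10']
Convert column B to pd.Timestamp and column A to pd.Timedelta:
sample['B'] = pd.to_datetime(sample['B'])
# No unit argument here: the string '00:01:05' already spells out hours, minutes and seconds
sample['A'] = pd.to_timedelta(sample['A'])
Then add the columns as usual:
sample['C'] = sample['A'] + sample['B']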
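A note on the unit argument of pd.to_timedelta: it is meant for plain numbers, whereas a string such as '00:01:05' already carries its own units. The following is only a small sketch of that difference; the variable names from_string and from_number are made up for illustration.

```python
import pandas as pd

# A string like '00:01:05' already encodes hours, minutes and seconds.
from_string = pd.to_timedelta("00:01:05")

# `unit` tells pandas how to interpret a bare number: here, 65 seconds.
from_number = pd.to_timedelta(65, unit="s")

# Both describe the same duration of one minute and five seconds.
print(from_string == from_number)
```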
Answer 1 (score: 2)
Convert column A to timedeltas with to_timedelta and, if necessary, column B to datetimes with to_datetime:
df = pd.DataFrame({'A':['00:01:05'],
'B':['2018-10-10 23:58:10']})
df['C'] = pd.to_timedelta(df['A']) + pd.to_datetime(df['B'])
print (df)
A B C
0 00:01:05 2018-10-10 23:58:10 2018-10-10 23:59:15
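The error message in the question mentions two DatetimeArray operands, which suggests that column A may already have been parsed as datetime64 rather than left as strings. If that is the case, one possible approach is to strip the date part by subtracting midnight of the same day. This is a sketch under that assumption; the dummy date 1900-01-01 and the name offset are invented for the example.

```python
import pandas as pd

# Assumed scenario: column A was parsed as datetime64, so it carries a dummy date.
df = pd.DataFrame({
    "A": pd.to_datetime(["1900-01-01 00:01:05"]),
    "B": pd.to_datetime(["2018-10-10 23:58:10"]),
})

# Subtracting midnight of the same day leaves only the time of day, as a Timedelta.
offset = df["A"] - df["A"].dt.normalize()

# Timestamp + Timedelta is a valid operation, unlike Timestamp + Timestamp.
df["C"] = df["B"] + offset
print(df["C"].iloc[0])
```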
If column A contains Python time objects (datetime.time):
df['C'] = pd.to_timedelta(df['A'].astype(str)) + pd.to_datetime(df['B'])
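To make the datetime.time case concrete, here is a minimal self-contained sketch. It assumes column A holds datetime.time objects, which stringify to the HH:MM:SS form that to_timedelta can parse; the sample values are the ones from the question.

```python
import datetime as dt

import pandas as pd

# Assumed input: column A holds datetime.time objects rather than strings.
df = pd.DataFrame({
    "A": [dt.time(0, 1, 5)],
    "B": ["2018-10-10 23:58:10"],
})

# str(datetime.time(0, 1, 5)) is '00:01:05', which to_timedelta can parse.
as_delta = pd.to_timedelta(df["A"].astype(str))

df["C"] = as_delta + pd.to_datetime(df["B"])
print(df["C"].iloc[0])
```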