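I get NaN when I subtract two pandas DataFrame columns

Time: 2019-03-28 11:29:50

Tags: pandas datetime dataframe

I have a data frame with several columns, and I want the time difference between two columns that contain times. I first converted both columns to datetime objects with pd.to_datetime, but when I subtract the two columns and assign the result to a new column, the result ends up as NaN values.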

ops_data_clean_1.loc['Package committed-time'] = pd.to_datetime(ops_data_clean_1['Package committed-time'])
ops_data_clean_1.loc['Flight launched-time'] = pd.to_datetime(ops_data_clean_1['Flight launched-time'])
ops_data_clean_1['time_to_launch'] = ops_data_clean_1.loc['Flight launched-time'] - ops_data_clean_1.loc['Package committed-time']
ops_data_clean_1.head()

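2 Answers:

Answer 0 (score: 1)

I think your trouble comes from the way you are using .loc.

.loc['Package committed-time'] basically says: select the ROW whose index label is 'Package committed-time', and there is no such row.

But you want to select the column with that name. Use plain ops_data_clean_1['Package committed-time'] to access the column, or ops_data_clean_1.loc[:, 'Package committed-time'].

For more information on .loc, see the pandas documentation.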
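The row-versus-column distinction described above can be checked on a small frame. The sketch below is only an illustration: the frame `df` and its two sample rows are made up for this purpose, and it assumes a default integer index as in the question.

```python
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df = pd.DataFrame({
    'Package committed-time': ['2018-01-01 00:00:30', '2018-01-01 00:49:00'],
    'Flight launched-time': ['2018-01-01 01:00:30', '2018-01-01 02:49:00'],
})

# Reading with a single label looks in the row index, so this lookup fails.
try:
    df.loc['Package committed-time']
    row_lookup_failed = False
except KeyError:
    row_lookup_failed = True

# Both of these select the column instead.
by_brackets = df['Package committed-time']
by_loc = df.loc[:, 'Package committed-time']

# Assigning with a single label adds a ROW with that label;
# the column itself is not converted.
df.loc['Package committed-time'] = pd.to_datetime(df['Package committed-time'])

print(row_lookup_failed)
print(by_brackets.equals(by_loc))
print(df.index.tolist())
```

After the assignment the index contains the extra label 'Package committed-time'. The new row should hold only missing values, because the assigned Series is matched against the column names rather than the row numbers, which would explain the NaN results in the question.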

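Answer 1 (score: 1)

I think your problem is that you are using loc when you only need to access a single column of the data frame. Simply removing loc from your code solves it.

See the following toy example: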

import pandas as pd

ops_data_clean_1 = pd.DataFrame()

ops_data_clean_1['Package committed-time'] = ['2018-01-01 00:00:30', '2018-01-01 00:49:00', '2018-03-01 00:00:45']
ops_data_clean_1['Flight launched-time'] = ['2018-01-01 01:00:30', '2018-01-01 02:49:00', '2018-03-01 00:54:45']

ops_data_clean_1['Package committed-time'] = pd.to_datetime(ops_data_clean_1['Package committed-time'])
ops_data_clean_1['Flight launched-time'] = pd.to_datetime(ops_data_clean_1['Flight launched-time'])

ops_data_clean_1['time_to_launch'] = ops_data_clean_1['Flight launched-time'] - ops_data_clean_1['Package committed-time']

ops_data_clean_1.head()

# Output

  Package committed-time Flight launched-time time_to_launch
0    2018-01-01 00:00:30  2018-01-01 01:00:30       01:00:00
1    2018-01-01 00:49:00  2018-01-01 02:49:00       02:00:00
2    2018-03-01 00:00:45  2018-03-01 00:54:45       00:54:00

If you do want to use loc, you have to use : to select all rows of the data frame, e.g. ops_data_clean_1.loc[:, 'Flight launched-time']

The code then becomes

ops_data_clean_1 = pd.DataFrame()

ops_data_clean_1['Package committed-time'] = ['2018-01-01 00:00:30', '2018-01-01 00:49:00', '2018-03-01 00:00:45']
ops_data_clean_1['Flight launched-time'] = ['2018-01-01 01:00:30', '2018-01-01 02:49:00', '2018-03-01 00:54:45']

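# Note: in pandas 2.0 and later, .loc[:, col] = ... tries to write into the existing
# column first, so the column may keep its object dtype; plain ops_data_clean_1[col] = ...
# always replaces the column.
ops_data_clean_1.loc[:, 'Package committed-time'] = pd.to_datetime(ops_data_clean_1['Package committed-time'])
ops_data_clean_1.loc[:, 'Flight launched-time'] = pd.to_datetime(ops_data_clean_1['Flight launched-time'])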

ops_data_clean_1['time_to_launch'] = ops_data_clean_1.loc[:, 'Flight launched-time'] - ops_data_clean_1.loc[:, 'Package committed-time']

ops_data_clean_1.head()

# Output

  Package committed-time Flight launched-time time_to_launch
0    2018-01-01 00:00:30  2018-01-01 01:00:30       01:00:00
1    2018-01-01 00:49:00  2018-01-01 02:49:00       02:00:00
2    2018-03-01 00:00:45  2018-03-01 00:54:45       00:54:00
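
To confirm that the conversion took effect, it can help to inspect the dtypes before trusting the subtraction. A small sketch with the same toy data (the variable name `df` is just a shorthand chosen here):

```python
import pandas as pd

df = pd.DataFrame({
    'Package committed-time': ['2018-01-01 00:00:30', '2018-01-01 00:49:00', '2018-03-01 00:00:45'],
    'Flight launched-time': ['2018-01-01 01:00:30', '2018-01-01 02:49:00', '2018-03-01 00:54:45'],
})

# Plain column assignment replaces each column, so the dtype changes.
for col in ['Package committed-time', 'Flight launched-time']:
    df[col] = pd.to_datetime(df[col])

df['time_to_launch'] = df['Flight launched-time'] - df['Package committed-time']

# The two time columns should report a datetime64 dtype and the result a timedelta64 dtype.
print(df.dtypes)
# Count of missing values in the difference; zero if every string was parsed.
print(df['time_to_launch'].isna().sum())
```

If the time columns still show up as object here, the conversion was not stored in the columns, and the subtraction is unlikely to give the expected result.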