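I'm trying to add new rows to a pandas DataFrame dynamically. The index is a timestamp, and I can't figure out how to insert a new row without messing up the index. The first part of the code creates the DataFrame: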
import pandas as pd

data = {'time_stamp': ['2014-05-01 18:47:05.069', '2014-05-01 18:47:05.119', '2014-05-02 18:47:05.230'],
        'col_a': [34, 25, 26],
        'col_b': [21, 32, 43]}
df = pd.DataFrame(data, columns=['time_stamp', 'col_a', 'col_b'])
df['time_stamp'] = pd.to_datetime(df['time_stamp'], format="%Y-%m-%d %H:%M:%S.%f")
df.index = df['time_stamp']  # Make time_stamp the index
del df['time_stamp']  # Drop the initial time_stamp column
print(df)
Result:
col_a col_b
time_stamp
2014-05-01 18:47:05.069 34 21
2014-05-01 18:47:05.119 25 32
2014-05-02 18:47:05.230 26 43
Trying to add a row with concat (append has the same problem):
# Insert a new row (corresponding to an incoming update message with a timestamp and a new value for col_a)
ts = pd.to_datetime("2014-05-04 18:47:05.487", format="%Y-%m-%d %H:%M:%S.%f")
new_row = pd.DataFrame([[11]], columns=["col_a"])
df = pd.concat([df, pd.DataFrame(new_row)], ignore_index=False)
print(df)
Result:
col_a col_b
2014-05-01 18:47:05.069000 34 21.0
2014-05-01 18:47:05.119000 25 32.0
2014-05-02 18:47:05.230000 26 43.0
0 11 NaN
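If I extend "new_row" with a column named "time_stamp" holding the corresponding timestamp, it creates a new column called "time_stamp" instead of inserting the new value into the index: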
col_a col_b time_stamp
2014-05-01 18:47:05.069000 34 21.0 NaT
2014-05-01 18:47:05.119000 25 32.0 NaT
2014-05-02 18:47:05.230000 26 43.0 NaT
0 11 NaN 2014-05-04 18:47:05.487
Any ideas would be much appreciated.
Answer 0 (score: 3)
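Let's try using the index parameter of the pd.DataFrame constructor. concat aligns rows by their index labels, so the new row needs the timestamp as its index rather than the default 0.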
ts = pd.to_datetime("2014-05-04 18:47:05.487", format="%Y-%m-%d %H:%M:%S.%f")
new_row = pd.DataFrame([[11]], columns = ["col_a"], index=[ts])
df1 = pd.concat([df, pd.DataFrame(new_row)], ignore_index=False)
print(df1)
Output:
col_a col_b
2014-05-01 18:47:05.069 34 21.0
2014-05-01 18:47:05.119 25 32.0
2014-05-02 18:47:05.230 26 43.0
2014-05-04 18:47:05.487 11 NaN
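As a possible alternative for the single-row case, label-based assignment with .loc can enlarge the frame in place when the label does not exist yet, so no intermediate DataFrame is needed. This is a minimal sketch rather than part of the answer above: it rebuilds the question's data so it runs on its own, and it assumes that only col_a is supplied, leaving col_b as NaN. Dtypes after enlargement may differ between pandas versions.

```python
import pandas as pd

# Rebuild the example frame from the question with a DatetimeIndex.
df = pd.DataFrame(
    {"col_a": [34, 25, 26], "col_b": [21, 32, 43]},
    index=pd.to_datetime(
        [
            "2014-05-01 18:47:05.069",
            "2014-05-01 18:47:05.119",
            "2014-05-02 18:47:05.230",
        ]
    ),
)
df.index.name = "time_stamp"

# Timestamp of the incoming update message.
ts = pd.Timestamp("2014-05-04 18:47:05.487")

# Assigning to a label that is not in the index appends a new row
# ("setting with enlargement"); columns not assigned are filled with NaN.
df.loc[ts, "col_a"] = 11

print(df)
```

Growing a DataFrame one row at a time copies data on each insertion, so if many update messages arrive it may be cheaper to collect them in a list and call pd.concat once.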