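I have two time-series dataframes (about 45,000 rows versus 5 rows). One has timestamps down to the millisecond, the other has timestamps to the second. I want to create a new column in the larger dataframe such that: a) a value is attached to the row in the larger dataframe whose timestamp is closest (in seconds) to a timestamp in the smaller dataframe, and b) the value is NaN for all other timestamps.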
larger df =
timestamp price
0 2018-04-24 06:01:02.600 1
1 2018-04-24 06:01:02.600 1
2 2018-04-24 06:01:02.600 2
3 2018-04-24 06:01:02.600 4
4 2018-04-24 06:01:02.775 2
5 2018-04-24 06:01:02.825 3
6 2018-04-24 06:01:03.050 5
7 2018-04-24 06:01:03.125 6
8 2018-04-24 06:01:03.275 7
9 2018-04-24 06:01:03.300 4
10 2018-04-24 06:01:03.300 3
11 2018-04-24 06:01:03.950 5
12 2018-04-24 06:01:04.050 5
smaller df =
timestamp price
0 24/04/2018 06:01:02 2
1 24/04/2018 12:33:37 4
2 24/04/2018 14:29:34 5
3 24/04/2018 15:02:50 6
4 24/04/2018 15:20:04 7
desired df =
timestamp price newCol
0 2018-04-24 06:01:02.600 1 aValue
1 2018-04-24 06:01:02.600 1 NaN
2 2018-04-24 06:01:02.600 2 NaN
3 2018-04-24 06:01:02.600 4 NaN
4 2018-04-24 06:01:02.775 2 NaN
5 2018-04-24 06:01:02.825 3 NaN
6 2018-04-24 06:01:03.050 5 NaN
7 2018-04-24 06:01:03.125 6 NaN
8 2018-04-24 06:01:03.275 7 NaN
9 2018-04-24 06:01:03.300 4 NaN
10 2018-04-24 06:01:03.300 3 NaN
11 2018-04-24 06:01:03.950 5 NaN
12 2018-04-24 06:01:04.050 5 NaN
Thank you very much for any help. I am still too new to programming in general to solve this easily.
Many thanks
Answer 0 (score: 1)
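To use each value only once, I had to keep track of the timestamp from the smaller dataframe. So I keep the timestamp as a column when I move it into the index (set_index with drop=False), and then use reindex with method='nearest'. After that, I use duplicated within a mask, so that only the first row matched to each smaller-dataframe timestamp keeps its value.
A related option is pandas.merge_asof.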
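# Keep 'timestamp' as a column so each match can be traced back to its small-frame row.
df_small_new = df_small.set_index('timestamp', drop=False)
# Align to the large frame's timestamps, taking the nearest small-frame row for each.
df_small_new = df_small_new.reindex(df_large.timestamp, method='nearest')
# Blank out repeated matches so each small-frame value is used only once.
df_large.assign(
    newcol=df_small_new.price.mask(df_small_new.timestamp.duplicated()).values)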
timestamp price newcol
0 2018-04-24 06:01:02.600 1 2.0
1 2018-04-24 06:01:02.600 1 NaN
2 2018-04-24 06:01:02.600 2 NaN
3 2018-04-24 06:01:02.600 4 NaN
4 2018-04-24 06:01:02.775 2 NaN
5 2018-04-24 06:01:02.825 3 NaN
6 2018-04-24 06:01:03.050 5 NaN
7 2018-04-24 06:01:03.125 6 NaN
8 2018-04-24 06:01:03.275 7 NaN
9 2018-04-24 06:01:03.300 4 NaN
10 2018-04-24 06:01:03.300 3 NaN
11 2018-04-24 06:01:03.950 5 NaN
12 2018-04-24 06:01:04.050 5 NaN
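For anyone who wants to run the reindex approach end to end, here is a minimal self-contained sketch. It uses a few rows taken from the question. The explicit day-first parsing and the sort_values call are my assumptions about how the smaller dataframe has to be prepared (reindex with a method needs a sorted, unique index), not something stated in the answer.

```python
import pandas as pd

# A few rows from the question's larger dataframe.
df_large = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2018-04-24 06:01:02.600", "2018-04-24 06:01:02.600",
        "2018-04-24 06:01:02.775", "2018-04-24 06:01:03.050",
        "2018-04-24 06:01:04.050",
    ]),
    "price": [1, 1, 2, 5, 5],
})

# First two rows of the question's smaller dataframe, still as strings.
df_small = pd.DataFrame({
    "timestamp": ["24/04/2018 06:01:02", "24/04/2018 12:33:37"],
    "price": [2, 4],
})
# Assumption: the strings are day-first, so parse them with an explicit format.
df_small["timestamp"] = pd.to_datetime(
    df_small["timestamp"], format="%d/%m/%Y %H:%M:%S"
)

# reindex(method='nearest') needs a sorted, unique index on the smaller frame.
df_small_new = df_small.sort_values("timestamp").set_index("timestamp", drop=False)
df_small_new = df_small_new.reindex(df_large["timestamp"], method="nearest")

# Keep the value only on the first row matched to each small-frame timestamp.
result = df_large.assign(
    newcol=df_small_new["price"]
    .mask(df_small_new["timestamp"].duplicated())
    .to_numpy()
)
print(result)
```

One thing to be aware of: duplicated() keeps the first large-frame row that maps to a given small-frame timestamp, which is not necessarily the row closest to it. In the sample data the two coincide, but with larger gaps between timestamps they may not.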
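For pandas.merge_asof, use the 'price' column of the smaller dataframe as the source of the new values and set direction to 'nearest'.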
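A rough sketch of how merge_asof could be used for this, under my own assumptions rather than as the answer's exact code: the smaller dataframe goes on the left, so each of its timestamps picks exactly one closest row in the larger dataframe. The column name "row" is a hypothetical helper, and the last two rows of the sample df_large are made-up rows added only so that the second small timestamp has a nearby match.

```python
import pandas as pd

# First three rows follow the question; the last two are hypothetical rows
# added for illustration only.
df_large = pd.DataFrame({
    "timestamp": pd.to_datetime([
        "2018-04-24 06:01:02.600", "2018-04-24 06:01:02.775",
        "2018-04-24 06:01:03.050", "2018-04-24 12:33:36.900",
        "2018-04-24 12:33:37.200",
    ]).astype("datetime64[ns]"),
    "price": [1, 2, 5, 6, 7],
})
df_small = pd.DataFrame({
    "timestamp": pd.to_datetime(
        ["24/04/2018 06:01:02", "24/04/2018 12:33:37"],
        format="%d/%m/%Y %H:%M:%S",
    ).astype("datetime64[ns]"),
    "price": [2, 4],
})

# Carry the larger frame's row labels through the merge ("row" is a made-up name).
right = (
    df_large[["timestamp"]]
    .reset_index()
    .rename(columns={"index": "row"})
    .sort_values("timestamp")
)
# Rename 'price' so it does not clash with the larger frame's 'price'.
left = df_small.rename(columns={"price": "newcol"}).sort_values("timestamp")

# For each small-frame timestamp, find the single closest large-frame row.
matches = pd.merge_asof(left, right, on="timestamp", direction="nearest")

# Write the matched values back; every other row stays NaN.
result = df_large.copy()
result["newcol"] = float("nan")
result.loc[matches["row"], "newcol"] = matches["newcol"].to_numpy()
print(result)
```

Both key columns are cast to the same datetime resolution because merge_asof requires matching key dtypes. If two small-frame timestamps happen to pick the same large-frame row, the later one overwrites the earlier one in this sketch.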