I have a data frame in R that looks like this:
library(tidyverse)
df_input_r <- data.frame(
id = c(7181, 7183, 7183, 7183, 7185, 7185, 7185, 7191, 7191),
date = c("2020-01-01", "2020-01-01", "2020-01-02", "2020-01-03",
"2020-01-01", "2020-01-02", "2020-01-03",
"2020-01-01", "2020-01-02"),
rank = c(1, 1, 2, 3, 1, 2, 3, 1, 2)
)
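I want to perform a simple operation in pandas: group by id and keep only the second-to-last entry for each id, retaining all columns. An id with only one row (7181 here) should get "-" and 0 as placeholders. The expected output is: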
df_output <- data.frame(stringsAsFactors=FALSE,
id = c(7181, 7183, 7185, 7191),
date_mod = c("-", "2020-01-02", "2020-01-02", "2020-01-01"),
rank_mod = c(0, 2, 2, 1)
)
    id   date_mod rank_mod
1 7181          -        0
2 7183 2020-01-02        2
3 7185 2020-01-02        2
4 7191 2020-01-01        1
I want to open this R data frame in pandas via reticulate, perform the operation in pandas, and get the output data frame back in R.
The pandas code I wrote is:
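grouped = df.groupby('id')  # the grouping column is named id in the data frame above
# nlargest(2) yields (id, row label) pairs; dict() keeps the last pair per id,
# i.e. the row with the second-largest rank (or the only row for a single-row id)
index_tokeep = dict(grouped['rank'].nlargest(2).index.tolist())
df.loc[list(index_tokeep.values()), :]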
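For reference, here is a minimal sketch of one way the intended result might be produced in pandas. It assumes that "second-to-last" means the second-highest rank within each id, and that ids with a single row should get "-" and 0, as in the expected output above. The function name `second_to_last` and the variable `df` are illustrative names, not part of the original code.

```python
import pandas as pd

# Same data as df_input_r, rebuilt here so the sketch is self-contained.
df = pd.DataFrame({
    "id": [7181, 7183, 7183, 7183, 7185, 7185, 7185, 7191, 7191],
    "date": ["2020-01-01", "2020-01-01", "2020-01-02", "2020-01-03",
             "2020-01-01", "2020-01-02", "2020-01-03",
             "2020-01-01", "2020-01-02"],
    "rank": [1, 1, 2, 3, 1, 2, 3, 1, 2],
})

def second_to_last(df):
    """Return one row per id holding its second-to-last entry (hypothetical helper)."""
    # Order rows so that "last" means the highest rank within each id.
    ordered = df.sort_values(["id", "rank"])
    # Count positions from the end of each group: 0 = last row, 1 = second-to-last.
    pos_from_end = ordered.groupby("id").cumcount(ascending=False)
    picked = ordered.loc[pos_from_end == 1, ["id", "date", "rank"]]
    picked = picked.rename(columns={"date": "date_mod", "rank": "rank_mod"})
    # Left-join onto the full id list so single-row ids are kept as missing values.
    all_ids = ordered[["id"]].drop_duplicates()
    out = all_ids.merge(picked, on="id", how="left")
    # Assumed placeholders for ids without a second-to-last row.
    out["date_mod"] = out["date_mod"].fillna("-")
    out["rank_mod"] = out["rank_mod"].fillna(0).astype(int)
    return out

result = second_to_last(df)
print(result)
```

If this were saved in a script and loaded from R with reticulate's `source_python()`, the function could presumably be called on df_input_r directly, since reticulate converts R data frames to pandas DataFrames and back when crossing the language boundary.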
Thanks for your help.