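Good afternoon, I have the following problem and hope someone can help me find the right solution. The situation: suppose you have an unbalanced panel data set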
| ID | Value | Time |
| 1 | 12 | 2011 |
| 1 | 8 | 2012 |
| 1 | 10 | 2013 |
| 2 | 24 | 2011 |
| 2 | 10 | 2012 |
| 3 | 1 | 2011 |
| 3 | 8 | 2012 |
| 3 | 2 | 2013 |
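What I am trying to do is compute the mean of Value for each ID and insert that mean for every year of that individual. The result should look like this: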
| ID | Value | Time |
| 1 | 10 | 2011 |
| 1 | 10 | 2012 |
| 1 | 10 | 2013 |
| 2 | 17 | 2011 |
| 2 | 17 | 2012 |
| 3 | 4 | 2011 |
| 3 | 4 | 2012 |
| 3 | 4 | 2013 |
I have seen many questions of the same type, but none of the solutions keep the data in panel form. Does anyone know how to solve this in R?
Answer 0 (score: 2)
library(dplyr)
df <- data.frame(ID = c(1,1,1,2,2,3,3,3),
                 Value = c(12,8,10,24,10,1,8,2),
                 Time = c(2011,2012,2013,2011,2012,2011,2012,2013))
df %>%
  group_by(ID) %>%
  summarise(Value = round(mean(Value))) %>%
  right_join(df %>% select(-Value), by = "ID")
# A tibble: 8 x 3
     ID Value  Time
  <dbl> <dbl> <dbl>
1     1    10  2011
2     1    10  2012
3     1    10  2013
4     2    17  2011
5     2    17  2012
6     3     4  2011
7     3     4  2012
8     3     4  2013
EDIT
As Sotos points out below, this is a better solution:
df %>% group_by(ID) %>% mutate(Value = round(mean(Value)))
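If the original values should survive alongside the mean, the same `mutate()` idea can write to a separate column instead of overwriting `Value`. This is only a sketch: `MeanValue` and `panel` are hypothetical names that do not appear in the question, and the mean is left unrounded here.

```r
library(dplyr)

df <- data.frame(ID = c(1, 1, 1, 2, 2, 3, 3, 3),
                 Value = c(12, 8, 10, 24, 10, 1, 8, 2),
                 Time = c(2011, 2012, 2013, 2011, 2012, 2011, 2012, 2013))

# MeanValue is a hypothetical column name; Value itself stays untouched
panel <- df %>%
  group_by(ID) %>%
  mutate(MeanValue = mean(Value)) %>%  # group mean repeated on every row of the ID
  ungroup()                            # drop the grouping so later steps are not grouped
```

Because `mutate()` never collapses rows, the result still has one row per ID and year, which is the panel form asked for.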
Answer 1 (score: 1)
With data.table this becomes a "one-liner":
library(data.table)
setDT(df)[, Value := round(mean(Value)), by = ID][]
   ID Value Time
1:  1    10 2011
2:  1    10 2012
3:  1    10 2013
4:  2    17 2011
5:  2    17 2012
6:  3     4 2011
7:  3     4 2012
8:  3     4 2013
# Data: the table from the question, read directly with fread()
df <- fread(
"| ID | Value | Time |
| 1 | 12 | 2011 |
| 1 | 8 | 2012 |
| 1 | 10 | 2013 |
| 2 | 24 | 2011 |
| 2 | 10 | 2012 |
| 3 | 1 | 2011 |
| 3 | 8 | 2012 |
| 3 | 2 | 2013 |",
sep = "|", drop = c(1L, 5L))
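One thing to keep in mind with the one-liner: `setDT()` converts `df` by reference and `:=` overwrites `Value` in place, so the original values are gone afterwards. A minimal sketch that works on a copy and stores the mean in a new column follows; `dt` and `MeanValue` are hypothetical names chosen for illustration.

```r
library(data.table)

df <- data.frame(ID = c(1, 1, 1, 2, 2, 3, 3, 3),
                 Value = c(12, 8, 10, 24, 10, 1, 8, 2),
                 Time = c(2011, 2012, 2013, 2011, 2012, 2011, 2012, 2013))

# as.data.table() returns a copy, so df itself is left unchanged
dt <- as.data.table(df)

# MeanValue is a hypothetical column name; := adds it by reference to dt only
dt[, MeanValue := mean(Value), by = ID]
```

The row count and the `ID`/`Time` pairs are unchanged, so the unbalanced panel structure is preserved.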
Answer 2 (score: 1)
The base R solution via ave:
round(ave(df$Value, df$ID))
#[1] 10 10 10 17 17 4 4 4
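The call above only returns a vector. To get the panel layout from the question, that vector can be assigned back into the data frame, roughly like this (a sketch that overwrites `Value`, as the desired output in the question does):

```r
df <- data.frame(ID = c(1, 1, 1, 2, 2, 3, 3, 3),
                 Value = c(12, 8, 10, 24, 10, 1, 8, 2),
                 Time = c(2011, 2012, 2013, 2011, 2012, 2011, 2012, 2013))

# ave() uses FUN = mean by default and returns one value per input row,
# so the result lines up with the existing rows of df
df$Value <- round(ave(df$Value, df$ID))
```

Since `ave()` returns a vector of the same length as its input, no join or reshaping is needed and the `ID` and `Time` columns stay as they were.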