Adding empty rows to a data frame in R

Time: 2015-07-17 17:29:38

Tags: r dataframe dplyr

I have a data frame

data<-data.frame(Type=c("A","B","D","D","E","E"),
                 Ratio=c(5,6,3,3,4,4),
                 Number=c(65,74,43,34,23,12),
                 Letter=c("P","K","M","M","N","B"),
                 Season=c("Fall","Spring","Winter",
                          "Summer","Spring","Winter"))
> data
Type Ratio Number Letter  Season
A     5      65     P       Fall
B     6      74     K       Spring
D     3      43     M       Winter
D     3      34     M       Summer
E     4      23     N       Spring
E     4      12     B       Winter

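I want to add new rows wherever a value of 'Type' is used only once (A and B). Beneath each of those rows I want to add a row that contains the same Type, Ratio and Number as the row above it, but NA for Letter and Season. I have used

group_by(Type) 

to start.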

I would like my final data frame to look like this

Type Ratio Number Letter  Season
A     5      65     P       Fall
A     5      65     NA       NA
B     6      74     K       Spring
B     6      74     NA       NA
D     3      43     M       Winter
D     3      34     M       Summer
E     4      23     N       Spring
E     4      12     B       Winter

Thanks!

4 answers:

Answer 0: (score: 2)

Another data.table solution:

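library(data.table)

# For a Type that occurs once, .SD[1:2] indexes one row past the end, which yields an all-NA row
setDT(data)[, if (.N == 1L) 
                c(Number = list(Number), .SD[1:2, .(Letter, Season)]) 
              else .SD, 
by=.(Type, Ratio)]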
#    Type Ratio Number Letter Season
# 1:    A     5     65      P   Fall
# 2:    A     5     65     NA     NA
# 3:    B     6     74      K Spring
# 4:    B     6     74     NA     NA
# 5:    D     3     43      M Winter
# 6:    D     3     34      M Summer
# 7:    E     4     23      N Spring
# 8:    E     4     12      B Winter
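
The step that produces the NA row here is the out-of-range row index. A minimal sketch of that behaviour on its own, assuming a hypothetical one-row table named one (not part of the original answer):

```r
library(data.table)

# Hypothetical one-row table, standing in for .SD of a Type that occurs once
one <- data.table(Letter = "P", Season = "Fall")

# Asking for rows 1:2 of a one-row data.table returns the real row
# followed by a row of NAs
padded <- one[1:2]
print(padded)
```

Because Number is supplied as a length-one list in the answer, it appears to be recycled across both rows, which is why the NA row still carries the original Number.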

Answer 1: (score: 1)

Using data.table

library(data.table) #1.9.5+
setDT(data)
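# Copy of the rows whose Type occurs once, without Letter and Season
extra <- data[, if (.N == 1) .SD[, !c("Letter", "Season"), with = FALSE], by = Type]
# fill = TRUE pads the missing columns with NA; setkey() then sorts by Type
data <- setkey(rbindlist(list(data, extra), fill = TRUE), Type)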
> data
   Type Ratio Number Letter Season
1:    A     5     65      P   Fall
2:    A     5     65     NA     NA
3:    B     6     74      K Spring
4:    B     6     74     NA     NA
5:    D     3     43      M Winter
6:    D     3     34      M Summer
7:    E     4     23      N Spring
8:    E     4     12      B Winter

Answer 2: (score: 1)

Base R:

d1 <- as.data.frame(table(data$Type))
d2 <- data[data$Type %in% d1[d1$Freq<2,1], 1:3]
d2[, c("Letter", "Season")] <- NA
d3 <- rbind(data, d2)
d3[order(d3$Type), ]

Using dplyr and base R. Following Nick Kennedy's answer, I improved my solution by using bind_rows, so I no longer need to create the NA columns myself.

library(dplyr)
d1 <- data %>% group_by(Type) %>% summarize(count = n()) %>% filter (count<2) 
d2 <- data[data$Type %in% d1$Type, 1:3]
d3 <- bind_rows(data, d2)
d3[order(d3$Type), ]

Output:

  Type Ratio Number Letter Season
1    A     5     65      P   Fall
7    A     5     65   <NA>   <NA>
2    B     6     74      K Spring
8    B     6     74   <NA>   <NA>
3    D     3     43      M Winter
4    D     3     34      M Summer
5    E     4     23      N Spring
6    E     4     12      B Winter
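
The dplyr version can skip the explicit NA step because bind_rows matches columns by name and fills any column missing from one input with NA. A minimal sketch of just that behaviour, using two hypothetical toy frames (full and partial are illustrative names, not from the question):

```r
library(dplyr)

# Hypothetical toy frames: partial lacks the Letter column
full <- data.frame(Type = "A", Ratio = 5, Letter = "P",
                   stringsAsFactors = FALSE)
partial <- data.frame(Type = "A", Ratio = 5,
                      stringsAsFactors = FALSE)

# Columns are matched by name; Letter is filled with NA for the second row
combined <- bind_rows(full, partial)
print(combined)
```

Base rbind, by contrast, requires both data frames to have the same columns, which is why the base R version adds Letter and Season as NA first.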

Answer 3: (score: 1)

Here is a one-line dplyr solution (printed over multiple lines for clarity):

data %>%
  group_by(Type) %>%
  do(if(nrow(.) > 1) . else bind_rows(., select(., Type, Ratio, Number)))

Or, if you prefer nested pipes to do():

data %>%
  group_by(Type) %>%
  bind_rows(.,
    filter(., n() < 2) %>%
    select(Type, Ratio, Number)
  ) %>%
  arrange(Type)
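
Later dplyr releases mark do() as superseded, so a variant without it may be useful. The following is a sketch under that assumption, not the answerer's code: it counts rows per Type with add_count, and the names n_type, singles and result are made up for illustration.

```r
library(dplyr)

# The data frame from the question
data <- data.frame(Type = c("A", "B", "D", "D", "E", "E"),
                   Ratio = c(5, 6, 3, 3, 4, 4),
                   Number = c(65, 74, 43, 34, 23, 12),
                   Letter = c("P", "K", "M", "M", "N", "B"),
                   Season = c("Fall", "Spring", "Winter",
                              "Summer", "Spring", "Winter"),
                   stringsAsFactors = FALSE)

# Rows whose Type occurs exactly once, keeping only the columns to repeat
singles <- data %>%
  add_count(Type, name = "n_type") %>%
  filter(n_type == 1) %>%
  select(Type, Ratio, Number)

# bind_rows fills Letter and Season with NA for the appended rows;
# arrange then groups each appended row with its Type
result <- bind_rows(data, singles) %>%
  arrange(Type)
print(result)
```

This keeps the whole operation ungrouped, so no ungroup() call is needed afterwards.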