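Using R, how can I take multiple result tables that have different result columns and combine them by row so that all results are captured, with NA or a blank wherever a table lacks a given column? Basically, I need to take the data I have
and turn it into the data I want.
Note that I am not concerned about Brand, Model, and Year; they can simply be stacked on top of each other.
Apologies for the poorly formatted post; I am still finding my feet here.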
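Answer 0 (score: 5)
We can put the data frames in a list and use rbindlist from data.table; with fill = TRUE, columns missing from a table are filled with NA: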
library(data.table)
rbindlist(list(df1, df2, df3), use.names = TRUE, fill=TRUE)
Or use bind_rows from dplyr, which fills missing columns with NA by default:
library(dplyr)
bind_rows(df1, df2, df3)
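If neither package is available, the same fill-with-NA behaviour can be approximated in base R. The following is only a minimal sketch: rbind_fill is a hypothetical helper name, not a function from data.table or dplyr, and the two small data frames are made up for illustration.

```r
# Hypothetical helper (not part of any package): row-bind data frames
# whose columns differ, padding absent columns with NA.
rbind_fill <- function(...) {
  dfs <- list(...)
  # Union of all column names, in order of first appearance
  all_cols <- unique(unlist(lapply(dfs, names)))
  padded <- lapply(dfs, function(d) {
    for (col in setdiff(all_cols, names(d))) {
      d[[col]] <- rep(NA, nrow(d))
    }
    d[all_cols]  # same column order for every table
  })
  do.call(rbind, padded)
}

# Illustrative data (assumed, not from the question)
a <- data.frame(Model = "A", Year = 2010:2011, Dogs = c(0.71, 0.76),
                stringsAsFactors = FALSE)
b <- data.frame(Model = "B", Year = 2010, Ducks = 0.67,
                stringsAsFactors = FALSE)

res <- rbind_fill(a, b)
print(res)
```

For more than a handful of large tables, rbindlist or bind_rows is likely to be faster; this sketch is mainly useful for seeing what "fill" does.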
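If the data is in a single file, as shown in the image, read it with readLines, split the lines into one block per table, read each block into a list of data frames, and combine them with rbindlist: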
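# Read the file as raw lines and strip surrounding whitespace
lines1 <- trimws(readLines("temp1.csv"))
# Each header line (starting with "Brand") begins a new table
i1 <- cumsum(grepl("^Brand", lines1))
# Flag the non-blank lines so the separators between tables are dropped
i2 <- lines1 != ''
# Parse each block of lines as a whitespace-delimited table
lst <- lapply(split(lines1[i2], i1[i2]),
              function(x) read.csv(text = x, sep = ""))
rbindlist(lst, use.names = TRUE, fill = TRUE)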
Or:
bind_rows(lst)
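To see how the grouping index in the single-file approach works, here is a small sketch that uses an in-memory character vector in place of readLines("temp1.csv"). The sample lines are assumed for illustration and are not the contents of the original file.

```r
# Assumed sample content: two whitespace-delimited tables, each with its
# own header line, separated by a blank line
lines1 <- c("Brand Model Year Dogs Cats",
            "1 A 2010 0.71 0.64",
            "1 A 2011 0.76 0.06",
            "",
            "Brand Model Year Dogs Ducks",
            "1 B 2010 0.12 0.67")

# The running count of header matches increments at every header line,
# so all lines of one table share the same group number
i1 <- cumsum(grepl("^Brand", lines1))
i2 <- lines1 != ""          # TRUE for non-blank lines
print(i1)

# One data frame per group
lst <- lapply(split(lines1[i2], i1[i2]),
              function(x) read.csv(text = x, sep = ""))
print(lapply(lst, names))
```

The resulting list holds one data frame per table, each with its own set of columns, and can be passed to rbindlist(..., fill = TRUE) or bind_rows as in the answer.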
# Data used in the examples above
df1 <- data.frame(Brand = 1, Model = "A", Year = 2010:2014,
                  Dogs = c(0.71, 0.76, 0.40, 0.39, 0.67),
                  Cats = c(0.64, 0.06, 0.18, 0.20, 0.23),
                  Rabbits = c(0.56, 0.96, 0.90, 0.38, 0.73),
                  stringsAsFactors = FALSE)
df2 <- data.frame(Brand = 1, Model = "B", Year = c(2010, 2011, 2013, 2014),
                  Dogs = c(0.12, 0.43, 0.79, 0.29),
                  Ducks = c(0.67, 0.48, 0.80, 0.70),
                  stringsAsFactors = FALSE)
df3 <- data.frame(Brand = 1, Model = "C", Year = 2013:2014,
                  Dogs = c(0.76, 0.98),
                  Cats = c(0.90, 0.84),
                  Lions = c(0.12, 0.22),
                  Wolves = c(0.75, 0.61),
                  stringsAsFactors = FALSE)
Answer 1 (score: 0)
Using merge() with all = TRUE (a full outer join on the shared columns):
> Reduce(function(x, y) merge(x, y, all=TRUE), list(df1, df2, df3))
   Brand Model Year Dogs Cats Rabbits Ducks Lions Wolves
1      1     A 2010 0.71 0.64    0.56    NA    NA     NA
2      1     A 2011 0.76 0.06    0.96    NA    NA     NA
3      1     A 2012 0.40 0.18    0.90    NA    NA     NA
4      1     A 2013 0.39 0.20    0.38    NA    NA     NA
5      1     A 2014 0.67 0.23    0.73    NA    NA     NA
6      1     B 2010 0.12   NA      NA  0.67    NA     NA
7      1     B 2011 0.43   NA      NA  0.48    NA     NA
8      1     B 2013 0.79   NA      NA  0.80    NA     NA
9      1     B 2014 0.29   NA      NA  0.70    NA     NA
10     1     C 2013 0.76 0.90      NA    NA  0.12   0.75
11     1     C 2014 0.98 0.84      NA    NA  0.22   0.61
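One caveat worth checking before relying on merge() for stacking: with no by argument it joins on every column the tables share, so rows that agree on all shared columns are combined into one row rather than kept separately. A small sketch with made-up data frames (x, y, and z are assumptions for illustration, not the data from the question):

```r
# Made-up data for illustration
x <- data.frame(Model = "A", Year = 2010, Dogs = 0.5,
                stringsAsFactors = FALSE)
y <- data.frame(Model = "B", Year = 2010, Ducks = 0.7,
                stringsAsFactors = FALSE)
z <- data.frame(Model = "A", Year = 2010, Dogs = 0.5, Cats = 0.2,
                stringsAsFactors = FALSE)

# No row of y matches x on the shared columns (Model, Year),
# so the full outer join keeps both rows and pads with NA
stacked <- merge(x, y, all = TRUE)
print(stacked)

# z matches x on every shared column (Model, Year, Dogs),
# so the two rows are joined into a single row
joined <- merge(x, z, all = TRUE)
print(joined)
```

In the data used in this thread the Model value differs between tables, so no rows collapse and the result matches a plain row-wise stack.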