I have a .dbf file exported from ArcGIS 10.1 that I need to reorganize. An example of the data:
V1 V2
40.000000000000000 41.000000000000000
40.000000000000000 42.000000000000000
41.000000000000000 40.000000000000000
41.000000000000000 42.000000000000000
41.000000000000000 43.000000000000000
42.000000000000000 40.000000000000000
42.000000000000000 41.000000000000000
42.000000000000000 43.000000000000000
43.000000000000000 41.000000000000000
43.000000000000000 42.000000000000000
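The format I need has one row per unique value in the first column, with all of the corresponding values from the second column appearing in that row, for example: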
V1 V2 V3 V4
40.000000000000000 41.000000000000000 42.000000000000000
41.000000000000000 40.000000000000000 42.000000000000000 43.000000000000000
42.000000000000000 40.000000000000000 41.000000000000000 43.000000000000000
43.000000000000000 41.000000000000000 42.000000000000000
If anyone could help me with this, I would greatly appreciate it. Thanks!
Answer 0 (score: 2)
You can split the data frame on its first column with the split function, then use lapply to extract the vectors:
dat = data.frame(X1=c(40, 40, 41, 41, 41, 42, 42, 42, 43, 43),
X2=c(41, 42, 40, 42, 43, 40, 41, 43, 41, 42))
res <- lapply(split(dat, dat[,1]), function(d) c(d[1,1], sort(unique(d[,2]))))
res
# $`40`
# [1] 40 41 42
#
# $`41`
# [1] 41 40 42 43
#
# $`42`
# [1] 42 40 41 43
#
# $`43`
# [1] 43 41 42
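Most people would probably prefer to keep the data in this list format, but you can also combine the list into a matrix, right-padding the shorter vectors with NA values: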
max.len <- max(unlist(lapply(res, length)))
do.call(rbind, lapply(res, function(x) { length(x) <- max.len ; x }))
# [,1] [,2] [,3] [,4]
# 40 40 41 42 NA
# 41 41 40 42 43
# 42 42 40 41 43
# 43 43 41 42 NA
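The two steps above can be wrapped into one function that returns a data frame with the V1, V2, ... column names shown in the question. This is only a minimal sketch: the name widen_pairs is hypothetical, and it assumes the key column is numeric, as in the example data.

```r
# Hypothetical helper, not part of any package.
# Assumes `key` is numeric, as in the question's example.
widen_pairs <- function(key, value) {
  # One sorted, de-duplicated vector of values per unique key
  groups <- lapply(split(value, key), function(v) sort(unique(v)))
  # Right-pad every vector with NA up to the longest one
  max.len <- max(lengths(groups))
  padded <- do.call(rbind, lapply(groups, function(v) { length(v) <- max.len; v }))
  # Put the key in the first column and name the columns V1, V2, ...
  out <- data.frame(as.numeric(names(groups)), padded, row.names = NULL)
  names(out) <- paste0("V", seq_len(ncol(out)))
  out
}

dat <- data.frame(X1 = c(40, 40, 41, 41, 41, 42, 42, 42, 43, 43),
                  X2 = c(41, 42, 40, 42, 43, 40, 41, 43, 41, 42))
wide <- widen_pairs(dat$X1, dat$X2)
wide
```

Using do.call(rbind, ...) rather than sapply keeps the result a matrix even when every key has only a single value.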
Answer 1 (score: 2)
You can also do this with dplyr and tidyr:
library(dplyr)
library(tidyr)
# Number the rows within each X1 group, then spread X2 across those positions
dat %>%
  group_by(X1) %>%
  mutate(Time = seq_along(X1)) %>%
  spread(Time, X2)
#Source: local data frame [4 x 4]
#X1 1 2 3
#1 40 41 42 NA
#2 41 40 42 43
#3 42 40 41 43
#4 43 41 42 NA
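In tidyr 1.0.0 and later, spread is superseded by pivot_wider. The same reshaping might be sketched as follows, assuming a recent tidyr is installed. The column name Time is simply carried over from the answer above.

```r
library(dplyr)
library(tidyr)

dat <- data.frame(X1 = c(40, 40, 41, 41, 41, 42, 42, 42, 43, 43),
                  X2 = c(41, 42, 40, 42, 43, 40, 41, 43, 41, 42))

wide <- dat %>%
  group_by(X1) %>%
  mutate(Time = row_number()) %>%   # position of each row within its X1 group
  ungroup() %>%
  pivot_wider(names_from = Time, values_from = X2)
wide
```

Groups with fewer values than the longest one are filled with NA in the trailing columns.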
Answer 2 (score: 1)
This is really a reshape problem, but you don't have a "time" variable.
You can easily create a "time" variable like this:
dat$time <- with(dat, ave(X1, X1, FUN = seq_along))
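To see what that line produces, here is a small sketch using the dat data frame from the first answer (repeated so the snippet runs on its own). The result is a counter that restarts at 1 for each value of X1:

```r
dat <- data.frame(X1 = c(40, 40, 41, 41, 41, 42, 42, 42, 43, 43),
                  X2 = c(41, 42, 40, 42, 43, 40, 41, 43, 41, 42))

# ave() applies seq_along within each group of X1 and returns the
# results in the original row order
dat$time <- with(dat, ave(X1, X1, FUN = seq_along))
dat$time
# [1] 1 2 1 2 3 1 2 3 1 2
```

The counter follows the current row order within each group, so sort the data first if the order of the resulting columns matters.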
From there, use base R's reshape ...
reshape(dat, direction = "wide", idvar="X1", timevar="time")
# X1 X2.1 X2.2 X2.3
# 1 40 41 42 NA
# 3 41 40 42 43
# 6 42 40 41 43
# 9 43 41 42 NA
... or dcast from "reshape2" ...
library(reshape2)
dcast(dat, X1 ~ time, value.var="X2")
# X1 1 2 3
# 1 40 41 42 NA
# 2 41 40 42 43
# 3 42 40 41 43
# 4 43 41 42 NA