I'm currently struggling with my data. Consider the following dataset:
a <- c("city1", "city2", "city3", "city4")
b <- c(1,3,5,9)
c <- c(4,1,3,10)
# Build the data frame directly; cbind() would first coerce b and c to character
data <- data.frame(a, b, c)
> data
      a b  c
1 city1 1  4
2 city2 3  1
3 city3 5  3
4 city4 9 10
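The result I want pairs each city's b and c values with those of every other city and appends them as extra columns (each city's row is repeated once for every other city in the dataset):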
      a b  c    a1 b1 c1
1 city1 1  4 city2  3  1
1 city1 1  4 city3  5  3
1 city1 1  4 city4  9 10
2 city2 3  1 city1  1  4
2 city2 3  1 city3  5  3
2 city2 3  1 city4  9 10
and so on...
How can I do this?
Answer 0 (score: 2)
You want a cross join. You can use merge with by = NULL:
merge(data, data, by = NULL)
     a.x b.x c.x   a.y b.y c.y
1  city1   1   4 city1   1   4
2  city2   3   1 city1   1   4
3  city3   5   3 city1   1   4
4  city4   9  10 city1   1   4
5  city1   1   4 city2   3   1
6  city2   3   1 city2   3   1
7  city3   5   3 city2   3   1
8  city4   9  10 city2   3   1
9  city1   1   4 city3   5   3
10 city2   3   1 city3   5   3
11 city3   5   3 city3   5   3
12 city4   9  10 city3   5   3
13 city1   1   4 city4   9  10
14 city2   3   1 city4   9  10
15 city3   5   3 city4   9  10
16 city4   9  10 city4   9  10
Or you can use the sqldf package:
library(sqldf)
sqldf("select * from data as a cross join data as b")
       a b  c     a b  c
1  city1 1  4 city1 1  4
2  city1 1  4 city2 3  1
3  city1 1  4 city3 5  3
4  city1 1  4 city4 9 10
5  city2 3  1 city1 1  4
6  city2 3  1 city2 3  1
7  city2 3  1 city3 5  3
8  city2 3  1 city4 9 10
9  city3 5  3 city1 1  4
10 city3 5  3 city2 3  1
11 city3 5  3 city3 5  3
12 city3 5  3 city4 9 10
13 city4 9 10 city1 1  4
14 city4 9 10 city2 3  1
15 city4 9 10 city3 5  3
16 city4 9 10 city4 9 10
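In both cases you get the full cross join, including the rows that pair a city with itself. Afterwards you can subset the data.frame to get what you want. For example: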
data2 <- merge(data, data, by = NULL)
subset(data2, a.x != a.y)
     a.x b.x c.x   a.y b.y c.y
2  city2   3   1 city1   1   4
3  city3   5   3 city1   1   4
4  city4   9  10 city1   1   4
5  city1   1   4 city2   3   1
7  city3   5   3 city2   3   1
8  city4   9  10 city2   3   1
9  city1   1   4 city3   5   3
10 city2   3   1 city3   5   3
12 city4   9  10 city3   5   3
13 city1   1   4 city4   9  10
14 city2   3   1 city4   9  10
15 city3   5   3 city4   9  10
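If you also want the column names and row order shown in the question, the filtered cross join can be renamed and sorted afterwards. This is only a minimal sketch in base R: the variable name pairs is made up for illustration, and the sample data is rebuilt here so the snippet runs on its own.

```r
# Sample data from the question, built without cbind() so b and c stay numeric
data <- data.frame(
  a = c("city1", "city2", "city3", "city4"),
  b = c(1, 3, 5, 9),
  c = c(4, 1, 3, 10)
)

# Full cross join, then drop the rows that pair a city with itself
pairs <- merge(data, data, by = NULL)
pairs <- subset(pairs, a.x != a.y)

# Rename to the layout requested in the question (a, b, c, a1, b1, c1)
names(pairs) <- c("a", "b", "c", "a1", "b1", "c1")

# Sort so that all rows for city1 come first, then city2, and so on
pairs <- pairs[order(pairs$a, pairs$a1), ]
rownames(pairs) <- NULL

print(pairs)
```

Sorting is optional; it only changes the order of the rows, not which pairs are present.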
Answer 1 (score: 1)
You can do this simply by using expand.grid:
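# Index pairs for all 4 x 4 row combinations, minus rows 1, 6, 11, 16 (a city paired with itself)
idx <- expand.grid(1:4, 1:4)[-seq(1, 16, by = 5), ]
# idx[, 2] varies slowest, so it picks the left-hand city; idx[, 1] picks the city paired with it
cbind(data[idx[, 2], ], data[idx[, 1], ])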
        a b  c     a b  c
1   city1 1  4 city2 3  1
1.1 city1 1  4 city3 5  3
1.2 city1 1  4 city4 9 10
2   city2 3  1 city1 1  4
2.1 city2 3  1 city3 5  3
2.2 city2 3  1 city4 9 10
3   city3 5  3 city1 1  4
3.1 city3 5  3 city2 3  1
3.2 city3 5  3 city4 9 10
4   city4 9 10 city1 1  4
4.1 city4 9 10 city2 3  1
4.2 city4 9 10 city3 5  3
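The index trick above hard-codes 4 rows. A possible generalisation to any number of rows is sketched below. It is not the answerer's code: the names n, idx, first, second and result are illustrative, and the "1" suffix on the second set of columns is an assumption taken from the layout in the question.

```r
# Sample data from the question, rebuilt so the snippet runs on its own
data <- data.frame(
  a = c("city1", "city2", "city3", "city4"),
  b = c(1, 3, 5, 9),
  c = c(4, 1, 3, 10)
)

n <- nrow(data)

# All ordered pairs of row indices; the first argument of expand.grid varies fastest
idx <- expand.grid(second = seq_len(n), first = seq_len(n))

# Drop the pairs where a row would be matched with itself
idx <- idx[idx$first != idx$second, ]

# Bind the two row selections side by side; setNames() avoids duplicate column names
result <- cbind(
  data[idx$first, ],
  setNames(data[idx$second, ], paste0(names(data), "1"))
)
rownames(result) <- NULL

print(result)
```

Filtering on idx$first != idx$second replaces the -seq(1, 16, by = 5) step, so nothing needs to change when the data frame grows.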