R: append the values from other rows to each row

Time: 2015-02-27 13:49:16

Tags: r matrix

I am currently struggling with my data. Please consider the following dataset:

        a <- c("city1", "city2", "city3", "city4")
        b <- c(1,3,5,9)
        c <- c(4,1,3,10)
        data <- data.frame(a, b, c)  # no cbind(): it would coerce b and c to character

> data
      a b  c
1 city1 1  4
2 city2 3  1
3 city3 5  3
4 city4 9 10

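What I would like to end up with is a dataset in which each city's row has the b and c values of every other city attached at the end (so each city's row is repeated once for every other city in the dataset):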

      a b  c  a1     b1  c1  
1 city1 1  4  city2  3   1
1 city1 1  4  city3  5   3
1 city1 1  4  city4  9   10
2 city2 3  1  city1  1   4
2 city2 3  1  city3  5   3
2 city2 3  1  city4  9   10

etc.

How can I do this?

2 answers:

Answer 0 (score: 2)

You want a cross join. You can use merge with by = NULL:

merge(data, data, by = NULL)
     a.x b.x c.x   a.y b.y c.y
1  city1   1   4 city1   1   4
2  city2   3   1 city1   1   4
3  city3   5   3 city1   1   4
4  city4   9  10 city1   1   4
5  city1   1   4 city2   3   1
6  city2   3   1 city2   3   1
7  city3   5   3 city2   3   1
8  city4   9  10 city2   3   1
9  city1   1   4 city3   5   3
10 city2   3   1 city3   5   3
11 city3   5   3 city3   5   3
12 city4   9  10 city3   5   3
13 city1   1   4 city4   9  10
14 city2   3   1 city4   9  10
15 city3   5   3 city4   9  10
16 city4   9  10 city4   9  10

Or you can use the sqldf package:

library(sqldf)
sqldf("select * from data  as a cross join data as b")
      a b  c     a b  c
1  city1 1  4 city1 1  4
2  city1 1  4 city2 3  1
3  city1 1  4 city3 5  3
4  city1 1  4 city4 9 10
5  city2 3  1 city1 1  4
6  city2 3  1 city2 3  1
7  city2 3  1 city3 5  3
8  city2 3  1 city4 9 10
9  city3 5  3 city1 1  4
10 city3 5  3 city2 3  1
11 city3 5  3 city3 5  3
12 city3 5  3 city4 9 10
13 city4 9 10 city1 1  4
14 city4 9 10 city2 3  1
15 city4 9 10 city3 5  3
16 city4 9 10 city4 9 10
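
As a possible variation on the sqldf call above, the self-pairs could be filtered out inside the query itself, and the right-hand columns given distinct names. This is only a sketch: the aliases x and y, the output names a1, b1, c1 and the result name pairs_sql are illustrative choices, and the block only runs the query if sqldf happens to be installed.

```r
# Example data from the question, with b and c kept numeric.
data <- data.frame(
  a = c("city1", "city2", "city3", "city4"),
  b = c(1, 3, 5, 9),
  c = c(4, 1, 3, 10),
  stringsAsFactors = FALSE
)

# Run the query only when the sqldf package is available.
if (requireNamespace("sqldf", quietly = TRUE)) {
  library(sqldf)
  # Cross join the table with itself, drop rows where a city is paired
  # with itself, and rename the right-hand columns to a1, b1, c1.
  pairs_sql <- sqldf("
    select x.a, x.b, x.c, y.a as a1, y.b as b1, y.c as c1
    from data as x
    cross join data as y
    where x.a <> y.a
    order by x.a, y.a
  ")
  print(pairs_sql)
}
```

The order by clause is there because SQL does not guarantee a row order without it.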

In both cases you get the full cross join. Afterwards you can subset the data.frame to get what you want. For example:

data2 <- merge(data, data, by = NULL)
subset(data2, a.x!= a.y)
     a.x b.x c.x   a.y b.y c.y
2  city2   3   1 city1   1   4
3  city3   5   3 city1   1   4
4  city4   9  10 city1   1   4
5  city1   1   4 city2   3   1
7  city3   5   3 city2   3   1
8  city4   9  10 city2   3   1
9  city1   1   4 city3   5   3
10 city2   3   1 city3   5   3
12 city4   9  10 city3   5   3
13 city1   1   4 city4   9  10
14 city2   3   1 city4   9  10
15 city3   5   3 city4   9  10
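
To get closer to the exact layout asked for in the question (rows grouped by the first city, columns named a, b, c, a1, b1, c1), the merge result can be sorted and renamed. This is a minimal sketch in base R; the variable name pairs is an arbitrary choice, and the data is rebuilt here so the block runs on its own.

```r
# Example data from the question, with b and c kept numeric.
data <- data.frame(
  a = c("city1", "city2", "city3", "city4"),
  b = c(1, 3, 5, 9),
  c = c(4, 1, 3, 10),
  stringsAsFactors = FALSE
)

# Cross join, then drop the rows where a city is paired with itself.
pairs <- merge(data, data, by = NULL)
pairs <- pairs[pairs$a.x != pairs$a.y, ]

# Sort by the left-hand city first, then by the right-hand city.
pairs <- pairs[order(pairs$a.x, pairs$a.y), ]

# Rename to the a, b, c, a1, b1, c1 layout used in the question.
names(pairs) <- c("a", "b", "c", "a1", "b1", "c1")
rownames(pairs) <- NULL

print(pairs)
```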

Answer 1 (score: 1)

You can simply subset using an index created with expand.grid:

idx <- expand.grid(1:4,1:4)[-seq(1,16,by=5),]
cbind(data[idx[,2],],data[idx[,1],])
        a b  c     a b  c
1   city1 1  4 city2 3  1
1.1 city1 1  4 city3 5  3
1.2 city1 1  4 city4 9 10
2   city2 3  1 city1 1  4
2.1 city2 3  1 city3 5  3
2.2 city2 3  1 city4 9 10
3   city3 5  3 city1 1  4
3.1 city3 5  3 city2 3  1
3.2 city3 5  3 city4 9 10
4   city4 9 10 city1 1  4
4.1 city4 9 10 city2 3  1
4.2 city4 9 10 city3 5  3
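
The hard-coded 1:4 and seq(1,16,by=5) above are specific to four rows: in a 4 x 4 grid the self-pairs sit at positions 1, 6, 11 and 16. A sketch of the same idea for any number of rows might compare the two index columns directly. The column names left and right and the suffix "1" are illustrative choices, not part of the original answer.

```r
# Example data from the question, with b and c kept numeric.
data <- data.frame(
  a = c("city1", "city2", "city3", "city4"),
  b = c(1, 3, 5, 9),
  c = c(4, 1, 3, 10),
  stringsAsFactors = FALSE
)

n <- nrow(data)

# All index pairs; the first column of expand.grid varies fastest.
idx <- expand.grid(right = seq_len(n), left = seq_len(n))

# Keep only pairs that combine two different rows.
idx <- idx[idx$left != idx$right, ]

left  <- data[idx$left, ]
right <- data[idx$right, ]
names(right) <- paste0(names(right), "1")  # a1, b1, c1

result <- cbind(left, right)
rownames(result) <- NULL
print(result)
```

Renaming the right-hand columns before cbind avoids the duplicated column names a, b, c that appear in the output above.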