Merging two list components

Time: 2011-10-14 01:46:10

Tags: list r merge

I have a very large list; a small example looks like this:

A <- c("A", "a", "A", "a", "A")
B <- c("A", "A", "a", "a", "a")
C <- c(1, 2, 3, 1, 4) 
mylist <- list(A=A, B=B, C= C)

The expected output merges A with B element by element, so that each component looks like AB:

AA, aA, Aa, aa, Aa

Preferably each pair should be sorted so that the uppercase letter always comes first:

AA, Aa, Aa, aa, Aa

So the new list or matrix should have two columns or two rows:

AA, Aa, Aa, aa, Aa
1,   2, 3,   1, 4

Now I want to compute the mean of C for each class: "AA", "Aa" and "aa".

It looks simple, but I could not figure it out easily.

3 answers:

Answer 0: (score: 2)

> (ab <- paste(A, B, sep="") )  # the joining step
[1] "AA" "aA" "Aa" "aa" "Aa"
> (ab <- sub("([a-z])([A-Z])", "\\2\\1", ab) ) # swap lowercase uppercase
[1] "AA" "Aa" "Aa" "aa" "Aa"

> rbind(ab, C)                  # matrix
   [,1] [,2] [,3] [,4] [,5]
ab "AA" "Aa" "Aa" "aa" "Aa"
C  "1"  "2"  "3"  "1"  "4" 
> data.frame(alleles=ab, count=C)  # dataframes are lists
  alleles count
1      AA     1
2      Aa     2
3      Aa     3
4      aa     1
5      Aa     4
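
The answer above stops at the merged table, while the question also asks for the mean of C per class. A minimal sketch of that last step, assuming base R's tapply() is acceptable here; the name class_means is made up for illustration:

```r
A <- c("A", "a", "A", "a", "A")
B <- c("A", "A", "a", "a", "a")
C <- c(1, 2, 3, 1, 4)

# Join the two components, then put the uppercase letter first ("aA" -> "Aa")
ab <- paste(A, B, sep = "")
ab <- sub("([a-z])([A-Z])", "\\2\\1", ab)

# Mean of C within each class (class_means is a hypothetical name)
class_means <- tapply(C, ab, mean)
print(class_means)
```

With this data, "Aa" collects the values 2, 3 and 4, so its mean is 3, while "AA" and "aa" each hold a single value of 1.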

Answer 1: (score: 2)

I would arrange the data in a data.frame and use the plyr package, like this:
> A <- c("A", "a", "A", "a", "A")
> B <- c("A", "A", "a", "a", "a")
> C <- c(1, 2, 3, 1, 4) 
> (groups <- paste(A, B, sep=""))  # not sorted, so each group stays aligned with its C value
[1] "AA" "aA" "Aa" "aa" "Aa"
> my.df <- data.frame(A=A, B=B, C=C, group=groups)

> require(plyr)
> result <- ddply(my.df, "group", transform, group.means=mean(C))
> result[order(result$group, decreasing=TRUE),]
  A B C group group.means
5 A A 1    AA         1.0
3 A a 3    Aa         3.5
4 A a 4    Aa         3.5
2 a A 2    aA         2.0
1 a a 1    aa         1.0
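
Note that the output above still treats "aA" and "Aa" as separate groups, whereas the question wants them merged. A rough sketch of one way to do that, assuming the regex swap from Answer 0 is applied first; base R's ave() is swapped in here for ddply(..., transform, ...), so plyr is not needed for this variant:

```r
A <- c("A", "a", "A", "a", "A")
B <- c("A", "A", "a", "a", "a")
C <- c(1, 2, 3, 1, 4)

# Normalise each pair so the uppercase letter comes first ("aA" -> "Aa")
group <- sub("([a-z])([A-Z])", "\\2\\1", paste(A, B, sep = ""))
my.df <- data.frame(A = A, B = B, C = C, group = group)

# ave() returns, for every row, the mean of C within that row's group
my.df$group.means <- ave(my.df$C, my.df$group)
print(my.df)
```

The result keeps one row per observation, like the transform call above, but only three distinct groups remain.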

Answer 2: (score: 1)

Using your data:

A <- c("A", "a", "A", "a", "A")
B <- c("A", "A", "a", "a", "a")
C <- c(1, 2, 3, 1, 4) 

I define a data.frame that uses the combination of A and B as the key column:

AB <- paste(A, B, sep='')
df <- data.frame(id=AB, C=C)

> df
  id C
1 AA 1
2 aA 2
3 Aa 3
4 aa 1
5 Aa 4

If you need to order this data.frame before aggregating:

df <- df[order(AB, decreasing=TRUE),]

> df
  id C
1 AA 1
3 Aa 3
5 Aa 4
2 aA 2
4 aa 1

Use aggregate to compute the mean for each id:

meanDF <- aggregate(C~id, data=df, mean)

> meanDF
  id   C
1 aa 1.0
2 aA 2.0
3 Aa 3.5
4 AA 1.0

But if you want to order after aggregating, then:

df <- data.frame(id=AB, C=C)
meanDF <- aggregate(C~id, data=df, mean)
meanDF <- meanDF[order(meanDF$id, decreasing=TRUE),]
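
The row order produced by order() on character ids depends on the locale's collation rules, and "aA" is still kept apart from "Aa" here. A hedged sketch of an alternative, assuming the three classes are known in advance: merge "aA" into "Aa" with the regex swap from Answer 0 and make id a factor with explicit levels, so aggregate() lists the groups in that order.

```r
A <- c("A", "a", "A", "a", "A")
B <- c("A", "A", "a", "a", "a")
C <- c(1, 2, 3, 1, 4)

# Merge "aA" into "Aa", then fix the class order explicitly
AB <- sub("([a-z])([A-Z])", "\\2\\1", paste(A, B, sep = ""))
df <- data.frame(id = factor(AB, levels = c("AA", "Aa", "aa")), C = C)

# aggregate() lists the groups in factor-level order
meanDF <- aggregate(C ~ id, data = df, mean)
print(meanDF)
```

Fixing the levels up front avoids a separate order() step and gives the same row order regardless of locale.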