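R: compare 2 files with a for loop and divide certain columns

Time: 2013-08-08 13:40:45

Tags: r loops

Please help!

I am new to R and trying to get into it. I keep getting the same warning: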

    Warning messages:
    1: In if ((AttrSum[i, 1] == AllAttr[i:length(AllAttr), 1])) { :
      the condition has length > 1 and only the first element will be used

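I have 3 csv files: 1- AttrSum [90,2], with 90 rows and 2 columns; 2- TC_RC [80,12]; 3- AllAttr [70,20], with 20 columns and 70 rows. I want to check whether the first column of AttrSum, AttrSum[,1], is equal to the 12th column of TC_RC, TC_RC[,12], and then divide the 19th column of AllAttr, AllAttr[,19], by the 2nd column of AttrSum, AttrSum[,2], for the corresponding entries of the 1st column (those present in TC_RC[,12], i.e. RC_ind). I am doing it like this: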

AttrSum <- read.csv()
AllAttr <- read.csv()

RC_sum <- AttrSum[AttrSum[,1] %in% TC_RC[,12], col_ind] # values
RC_ind <- AttrSum[AttrSum[,1] %in% TC_RC[,12], 1] # names

len_attrSum <- length(AttrSum)
CV <- c()

for (i in 1:len_attrSum) {
    if (all(RC_ind[i] == AllAttr[i:length(AllAttr), 1])) {
        CV[i] <- (RC_sum[i] / AllAttr[,19])
    }
}

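Sorry for such a basic question, but I am stuck here. I know there is something wrong with my loop, but I can't see what it is. I have read An Introduction to R, but I still can't figure it out.

Thanks in advance

PS: 1-AttrSum [90,2] file.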

Case    x
2   1.784309
3   2.836969
4   0.791783
5   1.812687
8   0.385067
.......
90  0.771613


2-TC_RC[80,12] file. 12 columns,80 rows.
10  41.166667   1   0.364352    47  0.944911    49       26.833333  26.833333   1   0.324537    49

100 40.625          0   0.112847    55  0.953485    107 33.625  42.25   0   0.117361    109

101 29.75          0    0.082639    111 0.917909    107 12.625  29.75   0   0.082639    111
3-AllAttr [90,19] file.

    Case    V16 V15 V14 V8  V9  V9.1    V10 V11 V12 V13 V1  V2  V3  V4  V5  V6  Vl7 VB
      2 0.577967    0.023869    0.021571    0.481754    0.61584 0   0   0   0   0   0.024057    0.039251    0   0   0   0   0   4
      3 0.327011    0.095338    0.025591    0.785795    0.511902    0.516165    0   0   0   0   0.033882    0.028056    0.513229    0   0   0   0   4
    

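2 answers:

Answer 0: (score: 5)

The if statement in R is not vectorized. So if you run the code if(1:2 == 1:2), you will get the same warning (in R 4.2.0 and later it is an error). Instead, wrap your comparison in all or any, for example if(all(1:2 == 1:2)).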
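
As a minimal sketch of the point above (the vectors `x` and `y` and the variable `msg` are made up for illustration), `==` returns one logical value per element, while `all()` and `any()` collapse that vector to the single TRUE or FALSE that `if` expects:

```r
x <- 1:2
y <- 1:2

# `==` compares element by element and returns a logical vector
cmp <- x == y
print(cmp)

# all() / any() reduce the vector to a single logical value,
# which is what if() needs as its condition
if (all(cmp)) {
  msg <- "every element matches"
} else if (any(cmp)) {
  msg <- "some elements match"
} else {
  msg <- "no element matches"
}
print(msg)
```

Whether `all` or `any` is the right choice depends on whether the branch should run only when every element matches or when at least one does.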

Answer 1: (score: 1)

Here is my solution (not tested, since you did not provide reproducible sample data):

AttrSum$new[AttrSum[,1] %in% TC_RC [,12]]<-
AllAttr[,19][AttrSum[,1] %in% TC_RC [,12]]/AttrSum[,2][AttrSum[,1] %in% TC_RC [,12]]

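Note: I am assuming that you want to check each row of the first column of AttrSum against all rows of column 12 of TC_RC, and then output the resulting quotient (the right-hand side) to a new column of AttrSum.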
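
The one-liner above pairs rows of AllAttr and AttrSum by position. If the two files do not list their cases in the same order, a possible variant is to look up each case with `match()` instead. This is only a sketch under that assumption, not the method used in this answer; the data and the column names `Case`, `x`, `VB` and `key` are made up for illustration:

```r
# Small made-up data frames; the column names are hypothetical
AttrSum <- data.frame(Case = c(2, 3, 4, 5), x = c(2.0, 4.0, 8.0, 16.0))
AllAttr <- data.frame(Case = c(3, 2, 5), VB = c(4, 4, 8))
TC_RC   <- data.frame(key = c(2, 5, 111))

# Rows of AttrSum whose Case appears in TC_RC
keep <- AttrSum$Case %in% TC_RC$key

# For each kept Case, find the row of AllAttr holding the same Case
idx <- match(AttrSum$Case[keep], AllAttr$Case)

# Quotient AllAttr value / AttrSum value; rows that were not kept stay NA
AttrSum$new <- NA_real_
AttrSum$new[keep] <- AllAttr$VB[idx] / AttrSum$x[keep]
print(AttrSum)
```

Here `match()` returns NA for a case that is missing from AllAttr, so the quotient for that row is NA rather than a value taken from an unrelated row.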

Updated, with some changes to the data of TC_RC (2, 5 and 11), to show that the code above works:

AttrSum<-structure(list(col1 = 1:5, col2 = c(1.784309, 2.836969, 0.791783, 
1.812687, 0.385067)), .Names = c("col1", "col2"), row.names = c(NA, 
-5L), class = "data.frame")
> AttrSum
  col1     col2
1    1 1.784309
2    2 2.836969
3    3 0.791783
4    4 1.812687
5    5 0.385067
AllAttr<-structure(list(col111 = 2:3, col112 = c(0.577967, 0.327011), 
    col113 = c(0.023869, 0.095338), col114 = c(0.021571, 0.025591
    ), col115 = c(0.481754, 0.785795), col116 = c(0.61584, 0.511902
    ), col117 = c(0, 0.516165), col118 = c(0L, 0L), col119 = c(0L, 
    0L), col120 = c(0L, 0L), col121 = c(0L, 0L), col122 = c(0.024057, 
    0.033882), col123 = c(0.039251, 0.028056), col124 = c(0, 
    0.513229), col125 = c(0L, 0L), col126 = c(0L, 0L), col127 = c(0L, 
    0L), col128 = c(0L, 0L), col129 = c(4L, 4L)), .Names = c("col111", 
"col112", "col113", "col114", "col115", "col116", "col117", "col118", 
"col119", "col120", "col121", "col122", "col123", "col124", "col125", 
"col126", "col127", "col128", "col129"), class = "data.frame", row.names = c(NA, 
-2L))
> AllAttr
  col111   col112   col113   col114   col115   col116   col117 col118 col119 col120 col121   col122   col123   col124 col125
1      2 0.577967 0.023869 0.021571 0.481754 0.615840 0.000000      0      0      0      0 0.024057 0.039251 0.000000      0
2      3 0.327011 0.095338 0.025591 0.785795 0.511902 0.516165      0      0      0      0 0.033882 0.028056 0.513229      0
  col126 col127 col128 col129
1      0      0      0      4
2      0      0      0      4
TC_RC<-structure(list(col11 = c(10L, 100L, 101L), col12 = c(41.166667, 
40.625, 29.75), col13 = c(1L, 0L, 0L), col14 = c(0.364352, 0.112847, 
0.082639), col15 = c(47L, 55L, 111L), col16 = c(0.944911, 0.953485, 
0.917909), col17 = c(49L, 107L, 107L), col18 = c(26.833333, 33.625, 
12.625), col19 = c(26.833333, 42.25, 29.75), col20 = c(1L, 0L, 
0L), col21 = c(0.324537, 0.117361, 0.082639), col22 = c(2L, 5L, 
111L)), .Names = c("col11", "col12", "col13", "col14", "col15", 
"col16", "col17", "col18", "col19", "col20", "col21", "col22"
), class = "data.frame", row.names = c(NA, -3L))
> TC_RC
  col11    col12 col13    col14 col15    col16 col17    col18    col19 col20    col21 col22
1    10 41.16667     1 0.364352    47 0.944911    49 26.83333 26.83333     1 0.324537     2
2   100 40.62500     0 0.112847    55 0.953485   107 33.62500 42.25000     0 0.117361     5
3   101 29.75000     0 0.082639   111 0.917909   107 12.62500 29.75000     0 0.082639   111

> AttrSum[,1] %in% TC_RC [,12]
[1] FALSE  TRUE FALSE FALSE  TRUE

AttrSum$new[AttrSum[,1] %in% TC_RC [,12]]<-
    AllAttr[,19][AttrSum[,1] %in% TC_RC [,12]]/AttrSum[,2][AttrSum[,1] %in% TC_RC [,12]]

> AttrSum
  col1     col2      new
1    1 1.784309       NA
2    2 2.836969 1.409955
3    3 0.791783       NA
4    4 1.812687       NA
5    5 0.385067       NA

If you do not want to append it to AttrSum, you can do the following instead:

hello<-c()
hello[AttrSum[,1] %in% TC_RC [,12]]<-
        AllAttr[,19][AttrSum[,1] %in% TC_RC [,12]]/AttrSum[,2][AttrSum[,1] %in% TC_RC [,12]]
> hello
[1]       NA 1.409955       NA       NA       NA
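
A side note on the loop in the question, separate from the solution above: `length()` applied to a data frame returns the number of columns, not rows, so `1:length(AttrSum)` visits only as many indices as there are columns. A minimal sketch with a made-up data frame `df` (all names here are hypothetical):

```r
df <- data.frame(Case = 1:5, x = c(1.1, 2.2, 3.3, 4.4, 5.5))

# A data frame is a list of columns, so length() counts columns
n_cols <- length(df)
# nrow() gives the number of rows
n_rows <- nrow(df)
print(c(n_cols, n_rows))

# seq_len(nrow(df)) iterates over row indices and is safe for 0 rows
rows_visited <- integer(0)
for (i in seq_len(n_rows)) {
  rows_visited <- c(rows_visited, i)
}
print(rows_visited)
```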