How to improve a loop in R

Time: 2015-04-26 23:47:47

Tags: r for-loop apply

I have a large file in which I need to match admin IDs to users:

      TABLE1              TABLE 2
INDEX  V1   IDS            AdmID
  1     A   30               30
  2     U   3                123
  3     U   25               60
  4     U   4                 .
  5     U   5                 .
  6     A   123               .
  7     U   7        
  8     U   8        
  9     U   9        
  10    A   60      
  11    U   26
  12    U    2  
  .     .    .       
  .     .    .       
  .     .    .       

I would like something like this:

     COMPLETE TABLE                  
INDEX   V1  IDS   ADMIN_ID         
  1     A   30      30               
  2     U   3       30               
  3     U   25      30              
  4     U   4       30               
  5     U   5       30               
  6     A   123    123               
  7     U   7      123
  8     U   8      123
  9     U   9      123
  10    A   60      60
  11    U   26      60
  12    U    2      60
  .     .    .       .
  .     .    .       .
  .     .    .       .

So I wrote this loop, but it takes forever to finish. Any ideas on how to use apply() in this case?

ln <- 10000  # number of records in the Adm table
# TABLE2 = index of the adm ids

for (k in 1:ln) {
  w <- TABLE2$A_ID[k]  # IDs of the admins
  for (i in seq(from = AdmID[k], to = AdmID[k + 1], by = 1)) {
    TABLE1$ADMIN_ID[i] <- w
  }
}

1 answer:

Answer 0 (score: 0)

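It would be easier if you recorded how the mapping is applied, i.e. how many rows each admin covers (admin$ind). Take the cumulative sum of those counts and reverse the mapping table, admin. The IDs can then be filled in successively, in your case for rows up to 12, then 9, then 5.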

df <- data.frame(index = c(1:12),
                 v1 = c("A","U","U","U","U","A","U","U","U","A","U","U"),
                 ids = 13:24,
                 admin = 0)
# rule for assigning ids: ind is the number of rows each admin covers
admin <- data.frame(ind = c(5,4,3), id = c(30,123,60))
# get cumulative sum and reverse admin table
admin$cum <- cumsum(admin$ind)
admin <- admin[nrow(admin):1,]
admin
#   ind  id cum
# 3   3  60  12
# 2   4 123   9
# 1   5  30   5

# admin ids are filled in successively for rows up to 12, then 9, then 5;
# each pass overwrites the lower part of the previous one
for (i in seq_along(admin$cum)) {
  df$admin[seq_len(nrow(df)) <= admin$cum[i]] <- admin$id[i]
}
df
#    index v1 ids admin
# 1      1  A  13    30
# 2      2  U  14    30
# 3      3  U  15    30
# 4      4  U  16    30
# 5      5  U  17    30
# 6      6  A  18   123
# 7      7  U  19   123
# 8      8  U  20   123
# 9      9  U  21   123
# 10    10  A  22    60
# 11    11  U  23    60
# 12    12  U  24    60
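
The loop above still makes one pass per admin. As a rough sketch only (not part of the original answer), the same cumulative boundaries can be looked up for all rows at once with base R's findInterval(). The column name `block` is made up for illustration, and the sketch assumes the counts in admin$ind add up to the number of rows in df.

```r
# Same toy data as in the answer above.
df <- data.frame(index = 1:12,
                 v1 = c("A","U","U","U","U","A","U","U","U","A","U","U"),
                 ids = 13:24)
admin <- data.frame(ind = c(5, 4, 3), id = c(30, 123, 60))

# Last row covered by each admin: 5, 9, 12.
admin$cum <- cumsum(admin$ind)

# With left.open = TRUE, findInterval() counts how many boundaries lie
# strictly below each row number, so adding 1 gives the block number.
# 'block' is a hypothetical helper name, not something from the question.
block <- findInterval(seq_len(nrow(df)), admin$cum, left.open = TRUE) + 1

# Look up the admin id for every row in a single vectorized step.
df$admin <- admin$id[block]
```

There is no need to reverse the admin table here, because each row is assigned exactly once instead of being overwritten by later passes.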

Here is another version that uses the individual matching rule rather than the cumulative one.

df <- data.frame(index = c(1:12),
                 v1 = c("A","U","U","U","U","A","U","U","U","A","U","U"),
                 ids = 13:24)
# rule for assigning ids: ind is the number of rows each admin covers
admin <- data.frame(ind = c(5,4,3), id = c(30,123,60))

df$admin <- do.call(c, lapply(1:length(admin$ind), function(x) {
  rep(admin$id[x], sum(as.numeric(row.names(df)) <= admin$ind[x]))
}))
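
If the counts per admin are not known in advance, the block structure might be recoverable from the data itself. The following is a sketch under two assumptions that hold in the example tables of the question but may not hold in the real file: every block starts with a row where V1 is "A", and the admin ID of a block is the IDS value of that "A" row. The names `is_admin` and `block` are made up for illustration.

```r
# Toy version of TABLE1 from the question.
TABLE1 <- data.frame(
  V1  = c("A","U","U","U","U","A","U","U","U","A","U","U"),
  IDS = c(30, 3, 25, 4, 5, 123, 7, 8, 9, 60, 26, 2)
)

# Flag the admin rows.
is_admin <- TABLE1$V1 == "A"

# The running count of admin rows numbers the blocks: 1,1,1,1,1,2,2,2,2,3,3,3.
# Assumption: the first row is an admin row, so the count is never 0.
block <- cumsum(is_admin)

# Take the IDS of the admin rows and repeat each one across its block.
TABLE1$ADMIN_ID <- TABLE1$IDS[is_admin][block]
```

This avoids both the nested for loop and apply(), since cumsum() and vector indexing already work on the whole column.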