I can't get this for loop with an if statement to work

Time: 2015-05-20 19:24:34

Tags: r

This is my data, with the columns infoemployer, inforst, and interrst. The data frame is called tyearb.

                                                     infoemployer                 inforst
1                                                         Comcast               Jeff Dunn
6                                                   Cummins, Inc.       Rebekah Smith
38                                                         DaVita        Andy Nielsen
42                                                       Deloitte           Chase Russell
66                                              Duff & Phelps LLC     Tanner Anderson
76                                                 Frito-Lay Inc.     Tanner  Anderson
88                                              Intel Corporation          Jake Graff
96      J.P. Morgan- (J.P. Morgan is part of JPMorgan Chase & Co)        Andy Nielsen
97                                                         Lenovo      Nelson Anievas
98                                                        PepsiCo     Tanner Anderson
100                                              Procter & Gamble      Andee Flinders
102 Sears Holdings Corporation, formerly Sears, Roebuck & Company     Tanner Anderson
103                                       The Walt Disney Company Kylie Rothlisberger
106                                        Union Pacific Railroad          Jake Graff
116                                                          USAA       Rebekah Smith
117                                                       Walmart       Chase Russell
237                                                                              <NA>
238                                                         Apple                <NA>
239                              Brandes Investment Partners L.P.                <NA>
240                      EY (formerly known as Ernst & Young) LLP                <NA>
242                                            Grant Thornton LLP                <NA>
243                                                      KPMG LLP                <NA>
245                                                    Moss Adams                <NA>
246                                            Pariveda Solutions                <NA>
248                             PwC (PricewaterhouseCoopers, LLC)                <NA>
250                                                         RCLCO                <NA>
251                                     Strata Fund Services, LLC                <NA>
               interrst
1                  <NA>
6         Rebekah Smith
38         Andy Nielsen
42        Chase Russell
66      Tanner Anderson
76      Tanner Anderson
88           Jake Graff
96         Andy Nielsen
97       Nelson Anievas
98      Tanner Anderson
100      Andee Flinders
102     Tanner Anderson
103 Kylie Rothlisberger
106          Jake Graff
116       Rebekah Smith
117       Chase Russell
237      Austin Pollard
238      Brady Tengberg
239           Jeff Dunn
240       Rebekah Smith
242           Jeff Dunn
243      Andee Flinders
245          Jake Graff
246      Nelson Anievas
248      Nelson Anievas
250          Jake Graff
251        Andy Nielsen

My code is as follows:

levels(tyearb[,2]) <- c(levels(tyearb[,2]), levels(tyearb[,3]))

for (i in 1:length(tyearb)) {
  if (is.na(tyearb[i,2])) {
    tyearb[i,2] = tyearb[i,3]
  }
}

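I just want to keep every current value in inforst unless it is <NA>, in which case I want to insert the value from interrst. I realize I could copy all of the interrst values except the first one into inforst, but I obviously can't do that with a larger dataset, where more information would be lost.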

I have looked at many examples that combine for loops and if statements, but I just can't get it to work for me. Can someone explain?
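One likely reason the loop appears to do nothing: for a data frame, length() returns the number of columns, not the number of rows, so 1:length(tyearb) only visits rows 1 to 3. The sketch below is a minimal illustration under stated assumptions, not the asker's exact setup. It uses four rows copied from the table above and assumes character columns rather than factors; the names loop_fixed and vectorised are hypothetical.

```r
# Four rows copied from the table above; character columns are assumed
# here (stringsAsFactors = FALSE), so no factor levels need extending.
tyearb <- data.frame(
  infoemployer = c("Comcast", "DaVita", "Apple", "KPMG LLP"),
  inforst      = c("Jeff Dunn", "Andy Nielsen", NA, NA),
  interrst     = c(NA, "Andy Nielsen", "Brady Tengberg", "Andee Flinders"),
  stringsAsFactors = FALSE
)

n_cols <- length(tyearb)  # number of columns of the data frame
n_rows <- nrow(tyearb)    # number of rows

# Loop over the rows instead of the columns
loop_fixed <- tyearb
for (i in seq_len(nrow(loop_fixed))) {
  if (is.na(loop_fixed[i, 2])) {
    loop_fixed[i, 2] <- loop_fixed[i, 3]
  }
}

# Vectorised version of the same rule, without an explicit loop
vectorised <- tyearb
missing_rst <- is.na(vectorised$inforst)
vectorised$inforst[missing_rst] <- vectorised$interrst[missing_rst]
```

Both versions leave the existing inforst values alone and only fill the rows where inforst is missing. If the columns are factors, as the levels() line in the question suggests, converting them with as.character() first is one way to avoid level mismatches.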

1 Answer:

Answer 0: (score: 2)

A data.table solution (which will be very fast even on very large datasets):

library(data.table)
DT[is.na(z), z := y]

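Here z is the column you want to test for NA, and y is the column whose values you want to insert (though you can replace y with any expression).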
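Applied to the question's column names, the same idea might look like the sketch below. It assumes the data.table package is installed and that the columns are character vectors; the two rows are copied from the question's table, and DT is simply a small stand-in for tyearb.

```r
library(data.table)

# Two rows copied from the question's table; character columns assumed
DT <- data.table(
  infoemployer = c("Comcast", "Apple"),
  inforst      = c("Jeff Dunn", NA),
  interrst     = c(NA, "Brady Tengberg")
)

# For rows where inforst is NA, overwrite inforst with interrst.
# := modifies DT by reference, so no reassignment is needed.
DT[is.na(inforst), inforst := interrst]
```

For an existing data frame, as.data.table(tyearb) would produce a data.table to run the same line on.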