apply and which function: operator error

Time: 2019-06-06 12:24:52

Tags: r apply

I have a reference table:

ref <- data.frame("Strong"=c("A","A","B","B","C","C","D"),
              "Medium"=c("A","B","B","C","C","D","D"),
              "Moderate"=c("B","C","C","C","D","D","D"),
              "Weak"=c("C","C","D","D","D","D","D"))
rownames(ref) <- c("WS1","WS2","WS3","WS4","WS5","WS6","WS7")

And a large data frame (a sample is shown below):

df <- data.frame("Rad"=c("Weak","Weak","Weak","Moderate","Moderate"), "Wind"=c("WS4","WS3","WS3","WS2","WS4"))

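I need to look up the Wind and Rad values from df in the reference table ref. To do this, I use the following code to retrieve the indices, and then use those index values to copy the values from ref: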

df$x <- apply(df,1,function(x){which(colnames(ref) == df[x,"Rad"])})
df$x <- apply(df,1,function(x){which(colnames(ref) == x$Rad)})

df$y <- apply(df,1,function(x){which(rownames(ref) == df[x,"Wind"])})
df$y <- apply(df,1,function(x){which(rownames(ref) == x$Wind)})

The expected output is:

   Rad     Wind  PG
 1 Weak     WS4  D
 2 Weak     WS3  D
 3 Weak     WS3  D
 4 Moderate WS2  C
 5 Moderate WS4  C

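The code above works, but there are problems:

  • I shouldn't have to write each line "twice", but if I run only the second one, the code does not run.
  • The first line does not do what I expect (and, since its syntax is wrong, it shouldn't), yet if I don't run the "failing" one first, the second line will not run.
  • Finally, although this works, I'm sure there are simpler ways to do what I'm doing. Any tips would be much appreciated!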
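
A minimal sketch of what is probably going wrong, assuming a fresh session where df has only the Rad and Wind columns: apply() converts the data frame to a character matrix, so each row reaches the function as a named atomic vector, and `$` is not defined for atomic vectors. Indexing by name with `[[` is one way around that. The names dollar_attempt, x_idx and y_idx are made up for illustration.

```r
# Same data as in the question; stringsAsFactors = FALSE keeps the columns as plain strings
ref <- data.frame("Strong"   = c("A","A","B","B","C","C","D"),
                  "Medium"   = c("A","B","B","C","C","D","D"),
                  "Moderate" = c("B","C","C","C","D","D","D"),
                  "Weak"     = c("C","C","D","D","D","D","D"),
                  stringsAsFactors = FALSE)
rownames(ref) <- c("WS1","WS2","WS3","WS4","WS5","WS6","WS7")

df <- data.frame("Rad"  = c("Weak","Weak","Weak","Moderate","Moderate"),
                 "Wind" = c("WS4","WS3","WS3","WS2","WS4"),
                 stringsAsFactors = FALSE)

# Each row arrives as an atomic character vector, so x$Rad raises an error;
# tryCatch captures the message instead of stopping the script
dollar_attempt <- tryCatch(
  apply(df, 1, function(x) which(colnames(ref) == x$Rad)),
  error = function(e) conditionMessage(e)
)

# Indexing the row vector by name works in a single pass
x_idx <- apply(df, 1, function(x) which(colnames(ref) == x[["Rad"]]))
y_idx <- apply(df, 1, function(x) which(rownames(ref) == x[["Wind"]]))
```

Here dollar_attempt ends up holding the error message rather than indices, while x_idx and y_idx hold the column and row positions in ref for every row of df.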

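3 answers:

Answer 0: (score: 2)

Another approach, using data.table. It should run quickly even on large datasets. It uses the same logic as @IceCreamToucan's solution, but stays within data.table.

Explanation: using the melted ref table, perform an update join on df.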

library( data.table )

setDT(df)[ melt( setDT( ref, keep.rownames = TRUE ), id.vars = "rn" ), 
           PG := i.value, 
           on = .( Wind == rn, Rad == variable )][]

#         Rad Wind PG
# 1:     Weak  WS4  D
# 2:     Weak  WS3  D
# 3:     Weak  WS3  D
# 4: Moderate  WS2  C
# 5: Moderate  WS4  C
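
To make the "melt, then join" idea concrete without any packages, here is a rough base R sketch. It is a stand-in for illustration, not the data.table code above: the long table is built by hand, and the join is emulated with match() on a pasted key. The names ref_long, key_df and key_ref are hypothetical.

```r
ref <- data.frame("Strong"   = c("A","A","B","B","C","C","D"),
                  "Medium"   = c("A","B","B","C","C","D","D"),
                  "Moderate" = c("B","C","C","C","D","D","D"),
                  "Weak"     = c("C","C","D","D","D","D","D"),
                  stringsAsFactors = FALSE)
rownames(ref) <- c("WS1","WS2","WS3","WS4","WS5","WS6","WS7")

df <- data.frame("Rad"  = c("Weak","Weak","Weak","Moderate","Moderate"),
                 "Wind" = c("WS4","WS3","WS3","WS2","WS4"),
                 stringsAsFactors = FALSE)

# Long ("melted") form of ref: one row per (Wind, Rad) pair, column by column
ref_long <- data.frame(
  Wind = rep(rownames(ref), times = ncol(ref)),
  Rad  = rep(colnames(ref), each = nrow(ref)),
  PG   = unlist(ref, use.names = FALSE),
  stringsAsFactors = FALSE
)

# Emulate the join: build a combined key on both sides and look it up
key_df  <- paste(df$Wind, df$Rad)
key_ref <- paste(ref_long$Wind, ref_long$Rad)
df$PG   <- ref_long$PG[match(key_df, key_ref)]
```

Each of the 7 x 4 cells of ref becomes one row of ref_long, so a lookup on the pair (Wind, Rad) identifies exactly one PG value. Rows of df without a match in ref would get NA.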

Answer 1: (score: 1)

We can use match on the rownames and colnames of ref with the Wind and Rad columns, respectively, and use the result to subset ref.
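
The code for this answer did not survive on this page. The following is a sketch of what such a match-based lookup could look like, not necessarily the answerer's exact code: the row and column positions are combined into a two-column index matrix, which picks one cell per row of df. The names i and j are illustrative.

```r
ref <- data.frame("Strong"   = c("A","A","B","B","C","C","D"),
                  "Medium"   = c("A","B","B","C","C","D","D"),
                  "Moderate" = c("B","C","C","C","D","D","D"),
                  "Weak"     = c("C","C","D","D","D","D","D"),
                  stringsAsFactors = FALSE)
rownames(ref) <- c("WS1","WS2","WS3","WS4","WS5","WS6","WS7")

df <- data.frame("Rad"  = c("Weak","Weak","Weak","Moderate","Moderate"),
                 "Wind" = c("WS4","WS3","WS3","WS2","WS4"),
                 stringsAsFactors = FALSE)

i <- match(df$Wind, rownames(ref))  # row position in ref for each Wind value
j <- match(df$Rad,  colnames(ref))  # column position in ref for each Rad value

# A two-column matrix index extracts one (row, column) cell per row of df
df$PG <- as.matrix(ref)[cbind(i, j)]
```

This avoids apply() entirely, since match() is vectorised over the whole column.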

Answer 2: (score: 1)

library(tidyverse)
library(data.table) # for melt

ref_long <- 
  ref %>% 
    rownames_to_column('row') %>% 
    melt('row')

df %>% 
  left_join(ref_long, by = c('Rad' = 'variable', 'Wind' = 'row'))

#        Rad Wind value
# 1     Weak  WS4     D
# 2     Weak  WS3     D
# 3     Weak  WS3     D
# 4 Moderate  WS2     C
# 5 Moderate  WS4     C