Question

I am trying to populate a "name" column from an existing string column. Most of the strings look like this:

;Parties;John, Smith;Defendant;

I have been able to extract the name with this code:

data$name <- str_to_upper(str_extract(data$column, "(?<=;Parties;)(\\D){4,60}(?=;Defendant)"))

However, there are some NA values in my name column:

name
JOHN, SMITH
JANE, DOE
BOB, ROSS
NA

sum(is.na(data$name)) 
[1] 888

When I look at these NA rows with data$column[4], they look like this:

Agency;Michael, Scott;Defendant

I am trying to fill in the NAs in the name column. Here is my code:

data$name <- if(is.na(data$name)) {
str_to_upper(str_extract(data$column, "(?<=Agency;)(\\D){4,60}(?=;Defendant)"))
}

But I get this error:

In if (is.na(data$name)) { :
the condition has length > 1 and only the first element will be used

Any ideas? Thanks!

Answer 1

This answer is based entirely on @r2evans's comment, but I think it is good practice to mark the question as solved so it can be closed.

library(stringr)

# Flag the rows whose name is still missing, then fill only those rows
isna <- is.na(data$name)
data$name[isna] <- str_to_upper(
                     str_extract(data$column[isna],
                                 "(?<=Agency;)(\\D){4,60}(?=;Defendant)"))
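
For anyone who wants to try this without the original data, here is a minimal sketch of the whole two-pass approach. The two rows in the toy data frame are made up to mirror the two string shapes shown in the question; everything else reuses the patterns from the question as-is.

```r
library(stringr)

# Toy data (hypothetical rows) mirroring the two string shapes in the question
data <- data.frame(
  column = c(";Parties;John, Smith;Defendant;",
             "Agency;Michael, Scott;Defendant"),
  stringsAsFactors = FALSE
)

# First pass: the ";Parties;" pattern; rows without it come back as NA
data$name <- str_to_upper(
  str_extract(data$column, "(?<=;Parties;)(\\D){4,60}(?=;Defendant)"))

# Second pass: a logical index restricts the fill to the rows still NA
isna <- is.na(data$name)
data$name[isna] <- str_to_upper(
  str_extract(data$column[isna], "(?<=Agency;)(\\D){4,60}(?=;Defendant)"))

print(data$name)
```

The original attempt failed because `if` expects a single TRUE or FALSE, while `is.na(data$name)` returns one value per row. Logical indexing (or `ifelse()`) works element-wise, so names that were already extracted are left untouched.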

