Question

I have an R data frame that looks like this:

data.1       data.character
a            **str1**,str2,str2,str3,str4,str5,str6
b            str3,str4,str5
c            **str1**,str6

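I am currently using grepl to determine whether the column data.character contains my search string "<str>". If it does, I want to concatenate all the matching row values from data.1 into a single string, joined by a separator.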

For example, if I use grepl(str1,data.character), it matches two rows of df$data.1, and I would like output like this:

a,c (the rows whose data.character contains str1)

I am currently using two for loops, but I know this is not an efficient approach. Could anyone suggest a more elegant and less time-consuming way?

Answer 1

You are almost there (and now for my long-winded answer).

# Data
df <- read.table(text="data.1       data.character
       a            **str1**,str2,str2,str3,str4,str5,str6
       b            str3,str4,str5
       c            **str1**,str6",header=T,stringsAsFactors=F)

Match the string

# In your question you used grepl, which produces a logical vector (TRUE if
# the string is present)

grepl("str1" , df$data.character)
#[1]  TRUE FALSE  TRUE

# In my comment I used grep, which produces a positional index of the vector
# if the string is present (this was due to me not reading your grepl properly
# rather than because of any property of grep)

grep("str1" , df$data.character)
# [1] 1 3
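
To make the point above concrete, here is a minimal sketch showing that grep and grepl select the same elements when used for subsetting. The vectors x and labels are made-up stand-ins for the two columns in the question, not the original data.

```r
# Hypothetical stand-ins for df$data.character and df$data.1
x <- c("str1,str2", "str3,str4", "str1,str6")
labels <- c("a", "b", "c")

# grep() returns the same positions as which() applied to grepl()
same_positions <- identical(grep("str1", x), which(grepl("str1", x)))

# Subsetting by position or by logical mask selects the same elements
by_index <- labels[grep("str1", x)]
by_mask <- labels[grepl("str1", x)]

# When nothing matches, both forms give an empty character vector
no_index <- labels[grep("str9", x)]
no_mask <- labels[grepl("str9", x)]
```

For plain positive subsetting like this, either function works, so the choice is mostly a matter of taste.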

Then subset the vector you want at the positions produced by grep (or grepl):

(s <- df$data.1[grepl("str1" , df$data.character)])
# [1] "a" "c"  first and third elements are selected

Paste these together into the desired format (the collapse argument defines the separator between elements):

paste(s,collapse=",")
# [1] "a,c"
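
As a small aside on why collapse is the argument to use here rather than sep, the sketch below compares the two on the same two-element vector. The variable names are only illustrative.

```r
s <- c("a", "c")

# collapse joins the elements of one vector into a single string
joined <- paste(s, collapse = ",")

# sep only separates the different arguments passed to paste(), so with a
# single vector argument the elements are left as separate strings
unchanged <- paste(s, sep = ",")

# An empty selection (no rows matched) collapses to an empty string
empty <- paste(character(0), collapse = ",")
```

Here joined is the single string "a,c", unchanged is still a vector of length two, and empty is "", which is worth knowing if the search string may match nothing.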

Or, more concisely:

paste(df$data.1[grep("str1" , df$data.character)],collapse=",")
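
If this is needed for several search strings, the one-liner could be wrapped in a small helper. The sketch below is only one possible way to do that: the function name collapse_matches and the simplified data frame are hypothetical, and fixed = TRUE is an assumption that the search string should be treated literally rather than as a regular expression.

```r
# Hypothetical helper: join the data.1 values of rows whose data.character
# contains `pattern` as a literal substring
collapse_matches <- function(df, pattern, sep = ",") {
  hits <- grepl(pattern, df$data.character, fixed = TRUE)
  paste(df$data.1[hits], collapse = sep)
}

# Simplified example data (not the original data frame)
df <- data.frame(
  data.1 = c("a", "b", "c"),
  data.character = c("str1,str2,str3", "str3,str4,str5", "str1,str6"),
  stringsAsFactors = FALSE
)

res1 <- collapse_matches(df, "str1")
res3 <- collapse_matches(df, "str3")
```

Note that this is substring matching, so searching for "str1" would also match a value such as "str10". If exact items are required, splitting data.character on the comma first would be a safer approach.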
