Question

Here is a data frame/tibble and a character vector (the vector is a column taken from another tibble):

df1 <- structure(list(Twitter_name = c("CHESHIREKlD", "JellyComons", 
"kirmiziburunlu", "erkekdeyimleri", "herosFrance", "IkishanShah"
), Declared_followers = c(60500L, 43100L, 31617L, 27852L, 26312L, 
16021L), Real_followers = c(60241, 43054, 31073, 27853, 25736, 
15856), Twitter_Id = c("783866366", "1424086592", "2367932244", 
"3352977681", "2580703352", "521094407")), .Names = c("Twitter_name", 
"Declared_followers", "Real_followers", "Twitter_Id"), row.names = c(NA, 
-6L), class = c("tbl_df", "tbl", "data.frame"))

myId <- c("867211097882804224", "868806957133688832", "549124465", "822580282452754432",
"109344546", "482666188", "61716107", "3642392237", "595318933", 
"833365943044628480", "1045015087", "859830740669800448", "860562940059045888", 
"2854457294", "871784135983067136", "866922354554814464", "4839343547", 
"849451474572759040", "872084673526214656", "794841530053853184")

N.B.: df1 has been shortened here; it actually has 128 observations.
I would like to test every element of df1$Twitter_Id to see whether it is in myId. I can run this:

 > match(myId[1], df1$Twitter_Id)

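But:

it stops at the first occurrence;
I need to apply the match() function to all the elements of myId.

Using lapply() or other functions from dplyr and the tidyverse packages, I could not find a clean and simple way to do this.

Thank you for your help.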

Edit: I need to be more explicit about the whole case.

myTw <- structure(list(id_str = c("893445199661330433", "893116842558050304", 
"892739336466305024", "892401780105019393", "892401594272296963", 
"892365572486430720", "891964139756818432")), .Names = "id_str", row.names = c(NA, 
-7L), class = c("tbl_df", "tbl", "data.frame"))

These are tweet IDs. What I am looking for is which Twitter users retweeted these tweets. For that, I use the retweeters() function from the twitteR package.

library(twitteR)
MyRtw <- retweeters(myTw[1])

MyRtw <- c("889135428028084224", "867211097882804224", "868806957133688832", 
"549124465", "822580282452754432", "109344546", "482666188", 
"61716107", "3642392237", "595318933", "833365943044628480", 
"1045015087", "859830740669800448", "860562940059045888", "2854457294", 
"871784135983067136", "866922354554814464", "4839343547", "849451474572759040", 
"872084673526214656")

This is a list of Twitter user IDs. Finally, I want to see which users in df1$Twitter_Id retweeted myTw[1].

Answer 1

You can use the '%in%' operator.
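As a minimal sketch of that idea: %in% is vectorised, so it tests every element of the left-hand vector in one call, and match() likewise accepts a whole vector as its first argument. The data below is illustrative only. The first two IDs are taken from df1 in the question, while the third row ("hypothetical_user") is an assumption added so that the example contains one match, since none of the six sample rows shown in the question appear in myId.

```r
# Illustrative data: rows 1-2 come from df1 in the question; row 3 is a
# hypothetical user whose ID is borrowed from myId so that one match exists.
df_demo <- data.frame(
  Twitter_name = c("CHESHIREKlD", "JellyComons", "hypothetical_user"),
  Twitter_Id   = c("783866366", "1424086592", "549124465"),
  stringsAsFactors = FALSE
)
my_ids <- c("867211097882804224", "549124465", "109344546")

# %in% returns one TRUE/FALSE per element of df_demo$Twitter_Id
is_match <- df_demo$Twitter_Id %in% my_ids

# Keep only the rows whose Twitter_Id appears in my_ids
matched <- df_demo[is_match, ]

# match() is vectorised as well: for each element of my_ids it gives the
# position in df_demo$Twitter_Id, or NA when the ID is absent
positions <- match(my_ids, df_demo$Twitter_Id)
```

Here is_match can be used directly as a row filter, while positions is useful when you need to know where each ID sits in the data frame.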

Edit: Maybe this is what you want. Here I used the data from the original post (before the edit).

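# For each Twitter_Id in df1, count how many elements of myId are equal to it
matchVector <- NULL
for (id in df1$Twitter_Id) {
  matchCounter <- sum(myId %in% id)  # number of times this id occurs in myId
  matchVector <- c(matchVector, matchCounter)
}

# Store the counts as a new column of df1
df1$numberOfMatches <- matchVector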
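The loop above can also be written without growing a vector step by step. The following is a rough sketch of one alternative using vapply(), not necessarily what the original answerer had in mind. The names df_demo and retweeter_ids are hypothetical, and the duplicated ID is there only to show that the result is a count rather than a TRUE/FALSE flag.

```r
# Illustrative data (hypothetical): one ID from df1, one ID borrowed from myId
df_demo <- data.frame(
  Twitter_Id = c("783866366", "549124465"),
  stringsAsFactors = FALSE
)
# "549124465" is repeated on purpose so that its count is 2
retweeter_ids <- c("549124465", "109344546", "549124465")

# For every Twitter_Id, count how many elements of retweeter_ids equal it.
# vapply() checks that each result is a single integer.
df_demo$numberOfMatches <- vapply(
  df_demo$Twitter_Id,
  function(id) sum(retweeter_ids %in% id),
  integer(1),
  USE.NAMES = FALSE
)
```

If only a yes/no answer per row is needed, the counting step is unnecessary and a plain df_demo$Twitter_Id %in% retweeter_ids should be enough.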
