I have a data frame of comments that looks like this (df1):
Comments
Apple laptops are really good for work,we should buy them
Apple Iphones are too costly,we can resort to some other brands
Google search is the best search engine
Android phones are great these days
I lost my visa card today
I have another data frame of merchant names that looks like this (df2):
Merchant_Name
Google
Android
Geoni
Visa
Apple
MC
WallMart
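If a merchant name from df2 appears in a comment in df1, I want to add that merchant name as a second column of df1, in R. The match does not have to be exact; approximate matching is needed. Also, df1 contains about 500K rows! My final output data frame might look like this: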
Comments                                                         Merchant
Apple laptops are really good for work,we should buy them        Apple
Apple Iphones are too costly,we can resort to some other brands  Apple
Google search is the best search engine                          Google
Android phones are great these days                              Android
I lost my visa card today                                        Visa
How can I do this efficiently in R? Thanks.
Answer 0 (score: 5)
This is a job for regex. Take a look at the grepl command inside lapply.
# Sample data from the question
comments = c(
  'Apple laptops are really good for work,we should buy them',
  'Apple Iphones are too costly,we can resort to some other brands',
  'Google search is the best search engine ',
  'Android phones are great these days',
  'I lost my visa card today'
)
brands = c(
  'Google',
  'Android',
  'Geoni',
  'Visa',
  'Apple',
  'MC',
  'WallMart'
)
# For each brand, collect the comments that mention it (case-insensitive)
brandinpattern = lapply(
  brands,
  function(brand) {
    commentswithbrand = grepl(x = tolower(comments), pattern = tolower(brand))
    if (sum(commentswithbrand) > 0) {
      data.frame(
        comment = comments[commentswithbrand],
        brand = brand
      )
    } else {
      # No comment mentions this brand: contribute no rows
      data.frame()
    }
  }
)
# Stack the per-brand results into one data frame
brandinpattern = do.call(rbind, brandinpattern)
> brandinpattern
comment brand
1 Google search is the best search engine Google
2 Android phones are great these days Android
3 I lost my visa card today Visa
4 Apple laptops are really good for work,we should buy them Apple
5 Apple Iphones are too costly,we can resort to some other brands Apple
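The lapply approach above makes one grepl pass per brand and returns rows grouped by brand rather than in the original comment order. For a large df1, one possible variation is to build a single alternation pattern and scan each comment once. The following is only a sketch: the names pattern, hits, matched and result are illustrative, it assumes the merchant names contain no regex metacharacters, and it keeps only the first merchant found in each comment.

```r
# Sample data copied from the question.
comments <- c(
  "Apple laptops are really good for work,we should buy them",
  "Apple Iphones are too costly,we can resort to some other brands",
  "Google search is the best search engine",
  "Android phones are great these days",
  "I lost my visa card today"
)
brands <- c("Google", "Android", "Geoni", "Visa", "Apple", "MC", "WallMart")

# One alternation pattern such as "google|android|...".
# Assumption: brand names contain no regex metacharacters.
pattern <- paste(tolower(brands), collapse = "|")

# regexpr() finds the first match in each comment (-1 when there is none).
hits <- regexpr(pattern, tolower(comments))

# Pull out the matched text, leaving NA where nothing matched.
matched <- rep(NA_character_, length(comments))
matched[hits != -1] <- regmatches(tolower(comments), hits)

# Map the lower-cased match back to the original merchant spelling.
result <- data.frame(
  Comments = comments,
  Merchant = brands[match(matched, tolower(brands))],
  stringsAsFactors = FALSE
)
print(result)
```

This keeps one row per comment in the original order, with NA for comments that mention no merchant. It is still plain substring matching, so a short name such as MC can match inside longer words.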
Answer 1 (score: 0)
Try this:
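# Start with an empty result and append one row per matched comment
final_df <- data.frame(Comments = character(), Merchant_Name = character(), stringsAsFactors = FALSE)
for (i in df1$Comments) {
  for (j in df2$Merchant_Name) {
    # Case-insensitive substring match
    if (grepl(tolower(j), tolower(i))) {
      final_df[nrow(final_df) + 1, ] <- c(i, j)
      break  # keep only the first matching merchant
    }
  }
}
final_df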
## Comments Merchant_Name
##1 Apple laptops are really good for work,we should buy them Apple
##2 Apple Iphones are too costly,we can resort to some other brands Apple
##3 Google search is the best search engine Google
##4 Android phones are great these days Android
##5 I lost my visa card today Visa
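Both answers use exact substring matching, while the question asks for approximate matches. Base R's agrepl does fuzzy matching based on edit distance, so a rough sketch could look like the one below. The sample comments with misspellings, the 5-character cutoff, and the names min_fuzzy_len and merchant are all assumptions made for illustration and are not from the original post. Short names such as MC or Visa are matched exactly here, because allowing even one edit on a very short pattern matches almost anything.

```r
# Hypothetical comments with misspellings (not from the question).
comments <- c(
  "I bought a Gogle phone",
  "Walmart had a sale",
  "Nothing relevant here"
)
brands <- c("Google", "Android", "Geoni", "Visa", "Apple", "MC", "WallMart")

# Assumption: only names with at least this many characters are fuzzy-matched.
min_fuzzy_len <- 5

merchant <- rep(NA_character_, length(comments))
for (brand in brands) {
  # Only look at comments that have no merchant assigned yet.
  idx <- which(is.na(merchant))
  if (length(idx) == 0) break
  hit <- if (nchar(brand) >= min_fuzzy_len) {
    # max.distance = 0.1 is a fraction of the pattern length, rounded up,
    # which allows about one edit for names of 5 to 10 characters.
    agrepl(brand, comments[idx], max.distance = 0.1, ignore.case = TRUE)
  } else {
    grepl(brand, comments[idx], ignore.case = TRUE)
  }
  merchant[idx[hit]] <- brand
}

result <- data.frame(Comments = comments, Merchant = merchant,
                     stringsAsFactors = FALSE)
print(result)
```

The loop runs over the short brand list and each call is vectorized over the comments, which should scale better to 500K rows than a nested loop over both. The order of brands decides which merchant wins when several could match, and the distance threshold would need tuning on real data.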