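I would like to know how to find an exact lowercase string in the Search Term column.
What I have done so far:
Using RStudio, I read the CSV file with the following command: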
books = read.csv("~/Desktop/R Class /BBbooklist.csv", sep=",",header= TRUE,na.strings = "?",stringsAsFactors = FALSE)
The CSV file has the following header:
[1] "Department" "Search.Term" "Search.Frequency" "ASIN"
[5] "X.1.Clicked.Title" "Click.Share" "Conversion.Share" "ASIN.1"
[9] "X.2.Clicked.Title" "Click.Share.1" "Conversion.Share.1" "ASIN.2"
[13] "X.3.Clicked.Title" "Click.Share.2" "Conversion.Share.2"
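Here is an image of the data frame named books:
What I want is to retrieve all rows whose search term is 'adult coloring books'
and save those rows to a file named keywords.csv
I tried several options: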
dta.subset<-subset(books,Search.Term = 'adult coloring books')
as well as a grep command like:
grep(pattern = "adult coloring books", x = string, value = T)
Answer 0 (score: 0)
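From the OP's question it is not clear whether the term can appear in any column or only in one specific column. I will assume it can appear in any column, since that is the more general case. If the question is about one specific column, my solution may still produce the desired result, depending on the structure of the data.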
# dummy data frame
set.seed(1)
df = data.frame(matrix(sample(letters, size = 50, replace = T), nrow=10, ncol=5))
df
X1 X2 X3 X4 X5
1 g f y m v
2 j e f p q
3 o r q m u
4 x j d e o
5 f u g v n
6 x m k r u
7 y s a u a
8 r z j c m
9 q j w s t
10 b u i k s
# say we want all the rows for which there's "a" in at least one column
# vector of logic indexes, telling you which row has a letter "a"
x = apply(df, 1, function(x) {any(grepl(x, pattern = "a")) })
# what you want
df[x, ]
X1 X2 X3 X4 X5
7 y s a u a
Now, if you do not need to match a regular expression, this can be simpler:
ind = apply(df=="a", 1, function(x) any(x))
df[ind, ]
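To make the difference between the two approaches concrete, here is a small sketch on a made-up character vector (`terms` is only for illustration, it is not from the question). `grepl` with a plain pattern matches anywhere inside the string, while `==` only matches the whole string; both are case-sensitive by default.

```r
# Made-up example values, not taken from the OP's data
terms <- c("a", "ab", "A", "b")

# Regex match: TRUE wherever "a" occurs anywhere in the string
regex_hits <- grepl(pattern = "a", x = terms)

# Exact match: TRUE only where the whole string equals "a"
exact_hits <- terms == "a"

# An anchored regex behaves like the exact match
anchored_hits <- grepl(pattern = "^a$", x = terms)
```

So if the goal is an exact lowercase string, `==` (or an anchored pattern) is probably the safer choice.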
Now that the question has been edited, I realize the OP really should read up on subsetting in R, as someone suggested in the comments.
df[df$X3=="a", ]
X1 X2 X3 X4 X5
7 y s a u a