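When I load a text file into R, the text is truncated and I can't get an accurate count. Is there another command I should use so that I read the entire text file?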
library(stringr)
> readr::read_file("Apple_Wikipedia.txt")
[1] "Apple Inc. is an American multinational technology company headquartered in Cupertino, California that designs, develops, and sells consumer electronics, computer software, and online services. The company's hardware products include the iPhone smartphone, the iPad tablet computer, the Mac personal computer, the iPod portable media player, the Apple Watch smartwatch, the Apple TV digital media player, and the HomePod smart speaker. Apple's consumer software includes the macOS and iOS operating systems, the iTunes media player, the Safari web browser, and the iLife and iWork creativity and productivity suites. Its online services include the iTunes Store, the iOS App Store and Mac App Store, Apple Music, and iCloud.\r\nApple was founded by Steve Jobs, Steve Wozniak, and Ronald Wayne in April 1976 to develop and sell personal computers. It was incorporated as Apple Computer, Inc. in January 1977, and sales of its computers saw significant momentum and revenue growth for the company.... <truncated>
> x <- c("Apple","ios", "iphone")
> str_count(x)
[1] 5 3 6
Answer 0 (score: 1)
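First, you need to assign the text to an actual object in R. At the moment you are only reading the text in without saving it anywhere, and then calling str_count on x with no pattern argument. With the default pattern (an empty string), str_count counts characters, so it just returns the number of characters in 'Apple' (5), 'ios' (3) and 'iphone' (6). The display of the text in the R console will still be truncated at some point, but the data will be stored in full. The following should work.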
library(stringr)
apple_wiki <- readr::read_file("Apple_Wikipedia.txt")
x <- c("Apple","iOS", "iPhone")
str_count(apple_wiki, x)
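If you want to convince yourself that only the console display is truncated, one option is to check the length of the stored string. The sketch below is only an illustration: it writes a made-up sample text to a temporary file (a hypothetical stand-in for Apple_Wikipedia.txt), reads it back with readr::read_file, and inspects the object rather than printing it. The names tmp, wiki_text, n_chars and n_apple are invented for this example.

```r
library(stringr)

# Hypothetical stand-in for Apple_Wikipedia.txt, written to a temporary file
tmp <- tempfile(fileext = ".txt")
cat(strrep("Apple makes the iPhone and iOS. ", 500), file = tmp)

# Assign the file contents to an object instead of just printing them
wiki_text <- readr::read_file(tmp)

# The console may shorten what it prints, but the object holds the whole file
n_chars <- str_length(wiki_text)          # total characters stored
n_apple <- str_count(wiki_text, "Apple")  # matches across the entire string
print(c(characters = n_chars, apple = n_apple))
```

Here the 32-character sample sentence is repeated 500 times, so the stored string should have 16000 characters and 500 matches for "Apple", even though printing wiki_text directly would show only part of it.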
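Also note that str_count is case-sensitive, so take care to match your terms to the Wiki entry exactly, or use a regular expression or a text transformation to get around it.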
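As a rough sketch of those two workarounds, assuming a short made-up sample_text in place of the real apple_wiki object: the first option wraps the terms in stringr's regex() with ignore_case = TRUE, and the second lower-cases the text with str_to_lower() before counting. The variable names are hypothetical.

```r
library(stringr)

# Hypothetical sample text standing in for apple_wiki
sample_text <- "Apple sells the iPhone. The iphone runs iOS; IOS apps come from the App Store."
terms <- c("apple", "ios", "iphone")

# Option 1: case-insensitive regular expression
counts_regex <- str_count(sample_text, regex(terms, ignore_case = TRUE))

# Option 2: lower-case the text first, then match lower-case terms
counts_lower <- str_count(str_to_lower(sample_text), terms)

print(counts_regex)  # 1 2 2
print(counts_lower)  # 1 2 2
```

Both approaches count substrings, so a term such as "ios" would also match inside a longer word that contains it. If that matters for your text, you may want to add word boundaries (\\b) to the patterns.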