(R) Text mining: how do I view the detailed contents of a <<PlainTextDocument>>?

Time: 2019-12-25 08:34:09

Tags: r tm

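I have just started learning text mining. Following a book, I used tm::inspect() to look at the first document in the "crude" dataset, but unlike the book's example, R shows me the output below instead of the detailed content described in the book.

Why does this happen, and how can I fix it? Thanks! (Sorry for my poor English.)

My code: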

library(tm)
data(crude) 
inspect(crude[1])                                        
summary(crude)

And the output:

> inspect(crude[1])
<<VCorpus>>
Metadata:  corpus specific: 0, document level (indexed): 0
Content:  documents: 1

$`reut-00001.xml`
<<PlainTextDocument>>
Metadata:  15
Content:  chars: 527

> summary(crude)
    Length Class             Mode
127 2      PlainTextDocument list
144 2      PlainTextDocument list
191 2      PlainTextDocument list
194 2      PlainTextDocument list
211 2      PlainTextDocument list
236 2      PlainTextDocument list
237 2      PlainTextDocument list

1 answer:

Answer 0 (score: 0)

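Could it be that you left out a pair of square brackets? crude[1] returns a corpus that contains one document (hence the <<VCorpus>> header in your output), whereas crude[[1]] extracts the document itself.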

library(tm)
data("crude")

inspect(crude[[1]])

For me, it prints the following:

<<PlainTextDocument>>
Metadata:  15
Content:  chars: 527

Diamond Shamrock Corp said that
effective today it had cut its contract prices for crude oil by
1.50 dlrs a barrel.
    The reduction brings its posted price for West Texas
Intermediate to 16.00 dlrs a barrel, the copany said.
    "The price reduction today was made in the light of falling
oil product prices and a weak crude oil market," a company
spokeswoman said.
    Diamond is the latest in a line of U.S. oil companies that
have cut its contract, or posted, prices over the last two days
citing weak oil markets.
 Reuter
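
The difference between the two forms of indexing can be checked directly. The following is a minimal sketch, assuming a recent version of tm in which `as.character()` and `meta()` work on a PlainTextDocument; the variable names `one_doc_corpus` and `doc` are only illustrative.

```r
library(tm)
data("crude")

# Single brackets subset the corpus: the result is still a corpus
# (VCorpus) that happens to hold one document.
one_doc_corpus <- crude[1]

# Double brackets extract the element: the document itself.
doc <- crude[[1]]

class(one_doc_corpus)  # includes "VCorpus"
class(doc)             # includes "PlainTextDocument"

# Other ways to look at a single document
writeLines(as.character(doc))  # body text only
meta(doc)                      # all document-level metadata
meta(doc, "heading")           # one metadata field

# Text of several documents at once
lapply(crude[1:2], as.character)
```

This mirrors ordinary R list indexing, where `x[1]` returns a shorter list and `x[[1]]` returns the element stored in it.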