How can I track words with special characters using the streamR package in R?

Asked: 2014-04-18 03:51:53

Tags: r twitter stream

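I am using the streamR package to stream some tweets, but it does not work with some Portuguese words, such as "polícia", "médico", "audiência" and "política". If I use "policia" instead, it only returns tweets containing "policia", which is Spanish; it does not return tweets with the Portuguese "polícia".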

I am on R 3.1, Windows 7, streamR 0.2.1. Here is a piece of code:

> filterStream(file="acento.json", track="polícia", timeout=60, oauth=twitCred)
Capturing tweets...
Connection to Twitter stream was closed after 61 seconds with up to 4 tweets downloaded.
> df <- parseTweets("acento.json")
Error in readLines(tweets, encoding = "UTF-8") : 
  5 arguments passed to .Internal(readLines) which requires 6

The message saying that 4 tweets were downloaded looks like a default message, because the resulting JSON file never exceeds 1 KB.

> filterStream(file="acento1.json", track="política", timeout=60, oauth=twitCred)
Capturing tweets...
Connection to Twitter stream was closed after 62 seconds with up to 4 tweets downloaded.
> df <- parseTweets("acento1.json")
Error in readLines(tweets, encoding = "UTF-8") : 
  5 arguments passed to .Internal(readLines) which requires 6

Can anyone give me some tips on how to solve this?

1 Answer:

Answer 0: (score: 2)

Try writing "polícia" with a Unicode escape (\u00ed) for the accented character:

filterStream(file="acento.json", track="pol\u00edcia", timeout=30, oauth= twitCred)
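
The same escaping can be applied to the other words from the question. The following is only a minimal sketch, not a tested fix: the vector name `keywords` is made up for illustration, and the final call is left commented out because it assumes the `twitCred` OAuth object from the question and that `track` accepts a character vector, as the streamR documentation describes.

```r
# Sketch only: build the track keywords with \u escapes so the strings
# are UTF-8 no matter how the script file itself is encoded.
keywords <- c(
  "pol\u00edcia",    # polícia
  "m\u00e9dico",     # médico
  "audi\u00eancia",  # audiência
  "pol\u00edtica"    # política
)

# R marks strings that contain \u escapes as UTF-8.
print(Encoding(keywords))

# The accented letter counts as one character but takes two bytes in UTF-8.
print(nchar(keywords[1], type = "chars"))
print(nchar(keywords[1], type = "bytes"))

# Assumption: `twitCred` is the OAuth object from the question.
# filterStream(file = "acento.json", track = keywords,
#              timeout = 30, oauth = twitCred)
```

If `Encoding()` reports "UTF-8" for every keyword, the strings handed to `filterStream` no longer depend on the Windows locale or on the encoding of the script file.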