"utf8towcs" error when trying to scrape user comments from YouTube

Time: 2019-03-31 01:09:30

Tags: r

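I am trying to retrieve user comments about smartphone brands. Below are the error I get and the R code that produces it. Could someone please help me find a permanent fix for this problem?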

The code behaves inconsistently: it sometimes runs after installing the tm and paramhelper packages.

Error:

  

Error in sort.list(y) : invalid input 'please remove this shit video   so people dont want to steal lenovo phone ðŸ˜¢' in 'utf8towcs'
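For context, the trailing `ðŸ˜¢` in the message looks like the bytes of an emoji being read in the wrong encoding, and "invalid input ... in 'utf8towcs'" is what R reports when a string it treats as UTF-8 contains bytes that are not valid UTF-8. The following is a minimal sketch, not a confirmed fix for this script: it uses a made-up `comments` vector (hypothetical data) to show how base R's `validUTF8()` can flag such strings and how `iconv()` with `sub = ""` can drop the offending bytes.

```r
# Hypothetical sample data: the second string holds a truncated
# multi-byte sequence (bytes F0 9F), so it is not valid UTF-8.
comments <- c("nice phone", "bad \xf0\x9f bytes")

# validUTF8() (base R 3.3.0 or later) reports which strings are valid UTF-8.
is_valid <- validUTF8(comments)

# iconv() with sub = "" removes bytes that cannot be converted,
# so every element of the result is valid UTF-8.
cleaned <- iconv(comments, from = "UTF-8", to = "UTF-8", sub = "")

print(is_valid)
print(validUTF8(cleaned))
```

Applied to the script, the same `iconv()` call would be run on the comment text before any `gsub()` or factor conversion touches it. Note that this discards emoji fragments rather than preserving them.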

Code:

library(urltools)  # provides param_set()
library(stringr)   # provides str_replace_all()

videoIdData <- read.csv('D:\\DWBI Final Data\\VideoId.csv', stringsAsFactors = FALSE)
str(videoIdData)

for (i in seq_along(videoIdData$videoID)) {
  print(paste("The id is", i))

  commentSearchUrl <- "https://www.googleapis.com/youtube/v3/commentThreads?part=snippet%2C+replies&maxResults=100&textFormat=plainText&videoId=iTgmR4pcR9Q%20&fields=items%2CnextPageToken&key=AIzaSyAS-uSQhftToHWhbVYh1u5mqjNOvTUrGJ8"
  commentSearchUrl <- param_set(commentSearchUrl, key = "videoId", value = videoIdData$videoID[i] )
  print(commentSearchUrl)

  init_results <- httr::content(httr::GET(commentSearchUrl))
  data <- init_results$items

  if(length(data)!= 0){

    organize_data = function(){

      sub_data <- lapply(data, function(x) {
        data.frame(

          Comment = x$snippet$topLevelComment$snippet$textDisplay,
          Date = x$snippet$topLevelComment$snippet$publishedAt,
          stringsAsFactors=FALSE)
      })
    }

    sample <- organize_data()
    L <- length(sample)
    sample <- data.frame(matrix(unlist(sample), nrow=L, byrow=T))
    colnames(sample) <- c("Comment","Date")
    sample$Brand <- videoIdData$Brand[i]
    sample$Comment <- gsub("[^[:alnum:] ]", "", sample$Comment)

    sample$Comment <- sub("^\\s*<U\\+\\w+>\\s*", "", sample$Comment)
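    # grepl() returns a logical vector, so negating it selects the non-matching rows;
    # fixed = TRUE matches "<U+1798>" literally instead of treating "+" as a regex quantifier
    sampleBrand <- sample[!grepl("<U+1798>", sample$Comment, fixed = TRUE), ]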
    sample$Date <- substring(sample$Date ,1,10)
    sample$Comment <- gsub("[[:punct:]]", "", sample$Comment)
    sample <- sample[sample$Date < "2018-09-01",]
    sample$Comment <- str_replace_all(sample$Comment, "[^[:alnum:]]", " ")

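    # Write the header only when the file is first created, so appended chunks do not repeat it
    write.table(sample, "Comments.csv", sep = ",", col.names = !file.exists("Comments.csv"), append = TRUE, row.names = FALSE)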
  }
}
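One thing that may be relevant: in R versions before 4.0, `data.frame()` converts strings to factors by default, and the `data.frame(matrix(unlist(sample), ...))` step does not pass `stringsAsFactors = FALSE`. That is a plausible route to the `sort.list(y)` call named in the error, though I have not confirmed it. Below is a minimal sketch of the same flattening step that keeps the columns as character vectors. The `items` list is a made-up stand-in for `init_results$items` (hypothetical data), and `comments_df` and `before_cutoff` are illustrative names.

```r
# Hypothetical stand-in for init_results$items: two comment threads.
items <- list(
  list(snippet = list(topLevelComment = list(snippet = list(
    textDisplay = "Great camera",
    publishedAt = "2018-08-15T10:00:00.000Z")))),
  list(snippet = list(topLevelComment = list(snippet = list(
    textDisplay = "Battery drains fast",
    publishedAt = "2018-09-20T12:30:00.000Z"))))
)

# Build one single-row data frame per thread, keeping strings as character.
rows <- lapply(items, function(x) {
  s <- x$snippet$topLevelComment$snippet
  data.frame(Comment = s$textDisplay, Date = s$publishedAt,
             stringsAsFactors = FALSE)
})

# Stack the rows directly instead of going through matrix(unlist(...)).
comments_df <- do.call(rbind, rows)

# Keep only the date part and compare as Date values rather than strings.
comments_df$Date <- as.Date(substring(comments_df$Date, 1, 10))
before_cutoff <- comments_df[comments_df$Date < as.Date("2018-09-01"), ]

print(before_cutoff)
```

Because no factor is created here, the comment text never goes through the sorting of factor levels. Any remaining invalid bytes would still need to be cleaned before the `gsub()` calls.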

0 answers:

No answers