String comparison returns FALSE even though it should be TRUE

Time: 2017-05-07 16:09:13

Tags: r

I am currently using RStudio for web scraping, but I have run into a problem.

#define NUMSTU 50

#include <stdio.h>

//function prototype
void printdata();

//Global variables

int stuID[NUMSTU];
int stuCount;
int totStu;

int main ()
{
   //Reset the global counters; redeclaring them here would shadow the globals that printdata() reads
   stuCount = 0;
   totStu = 0;
   int studentID;
    //Prompt user for number of students in class

    printf("Please enter number of students in class:");
    scanf ("%d", &totStu);

   for (stuCount = 0; stuCount <totStu; stuCount++)
   {    
   //Prompt user for student ID number

   printf("\n Please enter student's ID number:");
  scanf("%d", &studentID);
  stuID[stuCount] = studentID;  //store at the current index; stuID[NUMSTU] is past the end of the array

  }

 //Call Function to print data
 printdata();

 return 0;
 }//end main


 void printdata(){

 //This function will display collected data
 //Input: Globals stuID[NUMSTU]
//Output: none



//Display column headers
printf("\n\n stuID\n");

//loop and display student ID numbers
for (stuCount = 0; stuCount <totStu; stuCount++){
printf("%d\n", stuID[stuCount]);  //print one element, not the array itself
}
}

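The comparison with "A [+]" always returns FALSE, and I don't know why. I compared my code with that of other people who use exactly the same method and get TRUE. Does anyone know how to fix this?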

1 Answer:

Answer 0 (score: 2)

The web page is UTF-8 encoded, and that seems to be what causes the problem.

library(rvest)
page_html <- read_html("http://competitie.vttl.be/index.php?menu=6&sel=36665&result=1&category=1")
grade <- page_html %>% html_nodes("td:nth-child(1) :nth-child(2) :nth-child(3) .DBTable_first") %>% html_text()
grade
[1] "A [+]"
Encoding(grade)
[1] "UTF-8"
Encoding(grade) <- "unknown"
grade
[1] "AÂ [+]"

Note the extra character!
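The extra "Â" is what the two UTF-8 bytes of a non-breaking space (U+00A0) look like when they are read as Latin-1, so the "space" in the scraped text is evidently not an ordinary space. A minimal sketch of how to check this, using a hand-built string as a stand-in for the scraped value (the `grade` below is an assumption, not the real page content):

```r
# Hypothetical stand-in for the scraped value:
# "A", then a non-breaking space (U+00A0), then "[+]"
grade <- "A\u00a0[+]"

# Looks the same when printed, but is not equal to the version with a plain space
grade == "A [+]"

# Compare the code points: 160 is the non-breaking space, 32 is a regular space
utf8ToInt(grade)
utf8ToInt("A [+]")
```

Printing the code points like this is a quick way to spot invisible differences between two strings that look identical on screen.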

One solution is:

 grade <- page_html %>% html_nodes("td:nth-child(1) :nth-child(2) :nth-child(3) .DBTable_first") %>% html_text()
 grade <- iconv(grade, "UTF-8", "ASCII", "")
 identical(grade,"A[+]")
[1] TRUE

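NB: converting from UTF-8 to ASCII removes the space (it is evidently a non-breaking space, which has no ASCII equivalent and is dropped), so the comparison is now against "A[+]", without a space.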
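If you would rather keep the space and compare against "A [+]", an alternative is to replace the non-breaking space with a regular one instead of dropping it. This is a sketch under the assumption that the odd character is U+00A0; `grade` here is a hand-built stand-in for the scraped value, and `grade_clean` is just an illustrative name:

```r
# Hypothetical stand-in for the scraped value (non-breaking space after "A")
grade <- "A\u00a0[+]"

# Replace every non-breaking space with an ordinary space
grade_clean <- gsub("\u00a0", " ", grade, fixed = TRUE)

# The comparison with a normal space now succeeds
identical(grade_clean, "A [+]")
```

This keeps any other non-ASCII characters in the text intact, whereas the iconv() approach strips all of them.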

BTW, I had to adjust the CSS selector string in html_nodes to get this to work.