Remove rows with 0.0000 in specific columns using readHTMLTable()

Time: 2016-01-25 23:47:36

Tags: r xml httr

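I am trying to remove the rows that have 0.0000 in specific columns of an HTML table read with the readHTMLTable() function from the XML package, but without success. My code: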

    # Packages
    require(httr)
    require(XML)

    # Function to read the HTML table and
    # remove rows with 0.0000 in columns 9 and 10
    readFE <- function(x, URL = "") {
      FILE <- GET(url = URL)
      tables <- getNodeSet(htmlParse(FILE), "//table")
      FE_tab <- readHTMLTable(tables[[1]],
                              header = c("empresa", "desc_projeto", "desc_regiao",
                                         "cadastrador_por", "cod_talhao", "descricao",
                                         "formiga_area", "qtd_destruido", "latitude",
                                         "longitude", "data_cadastro"),
                              colClasses = c("character", "character", "character",
                                             "character", "character", "character",
                                             "character", "character", "character",
                                             "character", "character"),
                              trim = TRUE, stringsAsFactors = FALSE
      )
      # Drop the first row of the parsed table
      results <- FE_tab[-(1), ]
      # Attempt to drop rows with 0.0000 in columns 9 and 10
      results <- results[!apply(results, 1, function(x){any(x[,9:10]==0.00000000)}), ]
      results
    }

Example:

    tableFE <- readFE(URL = "https://www.dropbox.com/s/mb316ghr4irxipr/TALHOES_AGENTES.htm?dl=1")
    tableFE  ## Doesn't work!!

1 answer:

Answer 0 (score: 0)

Here is one way to do it with rvest / xml2:
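Whichever package parses the table, the filtering step itself can be sketched in base R. Two things go wrong in the question's filter line. Inside `apply(results, 1, ...)` each row arrives as a plain character vector, so `x[, 9:10]` fails with an "incorrect number of dimensions" error. The columns are also read as character, so `"0.0000" == 0` compares strings and is never true. The sketch below is an illustration, not the answer's own code: it uses a small made-up data frame (`tab`, with invented values) in place of the downloaded table, and assumes columns 9 and 10 are the `latitude` and `longitude` columns named in the question.

```r
# Column names taken from the question's readHTMLTable() call.
cols <- c("empresa", "desc_projeto", "desc_regiao", "cadastrador_por",
          "cod_talhao", "descricao", "formiga_area", "qtd_destruido",
          "latitude", "longitude", "data_cadastro")

# Hypothetical stand-in for the parsed table: all columns are character,
# as they would be with the question's colClasses. Values are made up.
tab <- as.data.frame(matrix("a", nrow = 3, ncol = length(cols)),
                     stringsAsFactors = FALSE)
names(tab) <- cols
tab$latitude  <- c("-22.1234", "0.0000", "-21.9876")
tab$longitude <- c("-47.5678", "-47.0001", "0.00000000")

# Convert columns 9 and 10 to numeric before comparing with zero.
# Values that cannot be parsed become NA (warnings suppressed here).
lat <- suppressWarnings(as.numeric(tab[[9]]))
lon <- suppressWarnings(as.numeric(tab[[10]]))

# Flag rows where either coordinate is exactly zero; NA counts as "not zero".
has_zero <- (!is.na(lat) & lat == 0) | (!is.na(lon) & lon == 0)

# Keep only the rows without a zero coordinate.
results <- tab[!has_zero, ]
results
```

With this sample data only the first row remains. Comparing whole columns avoids `apply()` altogether, and the conversion to numeric treats "0.0000" and "0.00000000" as the same value.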