我想读取一个以UTF8
编码的形状文件。当我使用rgdal::readOGR
读取它时,它工作正常,但是sf::st_read
无法获得正确的编码。有关如何解决此问题的任何建议?
对于一个可复制的示例,我要读取的形状文件可以下载here。
rgdal::readOGR
shp <- rgdal::readOGR(shp_file, encoding = "UTF-8")
head(shp@data)
> ID CD_GEOCODM NM_MUNICIP
> 0 53 1200013 ACRELÂNDIA
> 1 54 1200054 ASSIS BRASIL
> 2 55 1200104 BRASILÉIA
> 3 56 1200138 BUJARI
> 4 57 1200179 CAPIXABA
> 5 58 1200203 CRUZEIRO DO SUL
sf::st_read
sf <- sf::st_read(shp_file, stringsAsFactors=F, options = "ENCODING=UTF8")
head(sf)
> ID CD_GEOCODM NM_MUNICIP geometry
> 1 53 1200013 ACREL<c2>NDIA POLYGON ((-67.14117 -9.6833...
> 2 54 1200054 ASSIS BRASIL POLYGON ((-69.79978 -10.506...
> 3 55 1200104 BRASIL<c9>IA POLYGON ((-69.58835 -10.643...
> 4 56 1200138 BUJARI POLYGON ((-68.31643 -9.2954...
> 5 57 1200179 CAPIXABA POLYGON ((-67.84667 -10.287...
> 6 58 1200203 CRUZEIRO DO SUL POLYGON ((-72.89221 -7.4995...
# I have also tried manually adding the encoding but it still doesn't work.
sf <- sf::st_read(shp_file, stringsAsFactors=F, options = "ENCODING=UTF8")
Encoding(sf$NM_MUNICIP) <- "UTF-8"
sf$NM_MUNICIP
> [1] "ACREL\xc2NDIA" "ASSIS BRASIL" "BRASIL\xc9IA" "BUJARI"
> [5] "CAPIXABA" "CRUZEIRO DO SUL" "EPITACIOL\xc2NDIA" "FEIJ\xd3"
> [9] "JORD\xc3O" "M\xc2NCIO LIMA" "MANOEL URBANO" "MARECHAL THAUMATURGO"
> [13] "PL\xc1CIDO DE CASTRO" "PORTO WALTER" "RIO BRANCO" "RODRIGUES ALVES"
> [17] "SANTA ROSA DO PURUS" "SENADOR GUIOMARD" "SENA MADUREIRA" "TARAUAC\xc1"
> [21] "XAPURI" "PORTO ACRE"
其他形状文件(例如this other one)也会出现相同的问题,所以我认为这不是文件本身的问题。
答案 0 :(得分:2)
当我尝试使用sf::st_read
而不是WINDOWS-1252
编码并以UTF8
读取它时,它对我有用。希望对您有所帮助!
sf <- sf::st_read("ac_municipios/12MUE250GC_SIR.shp", options = "ENCODING=WINDOWS-1252")