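I created a path in Google Earth, then copied and pasted it into a KML file following these instructions (https://developers.google.com/kml/faq#validation - How do I create a KML file?).
Parsing the file with xmlInternalTreeParse from R's XML package works without problems: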
doc2<-xmlInternalTreeParse("ROUTE_3.kml")
But this is what I get when I try xpathApply:
xpathApply(doc2,"/kml//coordinates",xmlValue)
list()
After I removed the attributes from the kml tag, I got the following:
xpathApply(doc2,"/kml//coordinates",xmlValue)
[[1]]
[1] "4.538678046760991,43.96218242485241,0 4.536099605055323,43.96220903572051,0
4.53771014982657,43.96415063050954,0 4.536106012183452,43.96535632643623,0
4.538664824256699,43.9660402294286,0 4.539486616025195,43.96777930035288,0
4.54165951159373,43.96623221715382,0 4.543909553814832,43.96588360581748,0
4.541906820403621,43.96447824521096,0 4.543519784610379,43.96288529313735,0
4.540449258644572,43.9633940089841,0 4.544185719673153,43.9516337999984,0
4.536212701406948,43.94157791460842,0 4.539125112498221,43.96125976359349,0"
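I checked the original KML file with http://www.kmlvalidator.com/home.htm, which reported that the file is "valid and conforms to best practices". I am new to XPath (and to XML in general), so any advice on how to handle this while keeping the kml tag attributes would be appreciated.
Now that I have the coordinates as an element of a list, is there a clever way to create a three-column data frame with lon, lat and elv as the column headers? I tried the following, but I am sure there is a better way (credit: Split column at delimiter in data frame). If you have a more direct solution, please let me know. Thanks.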
ll<-xpathApply(doc2,"/kml//coordinates",xmlValue)
s<-ll[[1]]
ss<-strsplit(s,split=" ")
df <- data.frame(do.call('rbind', strsplit(as.character(ss[[1]]),',',fixed=TRUE)))
colnames(df)<-c("lon", "lat", "elv")
df
lon lat elv
1 4.538678046760991 43.96218242485241 0
2 4.536099605055323 43.96220903572051 0
3 4.53771014982657 43.96415063050954 0
4 4.536106012183452 43.96535632643623 0
5 4.538664824256699 43.9660402294286 0
6 4.539486616025195 43.96777930035288 0
7 4.54165951159373 43.96623221715382 0
8 4.543909553814832 43.96588360581748 0
9 4.541906820403621 43.96447824521096 0
10 4.543519784610379 43.96288529313735 0
11 4.540449258644572 43.9633940089841 0
12 4.544185719673153 43.9516337999984 0
13 4.536212701406948 43.94157791460842 0
14 4.539125112498221 43.96125976359349 0
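One more direct route, offered as a sketch rather than a definitive answer: base R's read.table can parse the coordinate string itself once each triple sits on its own line, and it yields numeric columns instead of character ones. The string s below is a shortened stand-in for ll[[1]], and the variable name records is made up for illustration.

```r
# Shortened stand-in for the string returned by xpathApply (ll[[1]])
s <- "\n\t4.538678046760991,43.96218242485241,0 4.536099605055323,43.96220903572051,0 \n"

# Trim the ends, then turn every run of whitespace into a newline so that
# each "lon,lat,elv" triple becomes one record
records <- gsub("[[:space:]]+", "\n", trimws(s))

# Let read.table split on commas and convert the fields to numbers
df <- read.table(text = records, sep = ",", col.names = c("lon", "lat", "elv"))
str(df)
```

Splitting on any run of whitespace means the same code copes with coordinates separated by spaces, tabs or newlines.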
Here is the original KML file:
<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2" xmlns:gx="http://www.google.com/kml/ext/2.2" xmlns:kml="http://www.opengis.net/kml/2.2" xmlns:atom="http://www.w3.org/2005/Atom">
<Document>
<name>KmlFile</name>
<StyleMap id="m_ylw-pushpin">
<Pair>
<key>normal</key>
<styleUrl>#s_ylw-pushpin</styleUrl>
</Pair>
<Pair>
<key>highlight</key>
<styleUrl>#s_ylw-pushpin_hl</styleUrl>
</Pair>
</StyleMap>
<Style id="s_ylw-pushpin">
<IconStyle>
<scale>1.1</scale>
<Icon>
<href>http://maps.google.com/mapfiles/kml/pushpin/ylw-pushpin.png</href>
</Icon>
<hotSpot x="20" y="2" xunits="pixels" yunits="pixels"/>
</IconStyle>
</Style>
<Style id="s_ylw-pushpin_hl">
<IconStyle>
<scale>1.3</scale>
<Icon>
<href>http://maps.google.com/mapfiles/kml/pushpin/ylw-pushpin.png</href>
</Icon>
<hotSpot x="20" y="2" xunits="pixels" yunits="pixels"/>
</IconStyle>
</Style>
<Placemark>
<name>ROUTE_3</name>
<styleUrl>#m_ylw-pushpin</styleUrl>
<LineString>
<tessellate>1</tessellate>
<coordinates>
4.538678046760991,43.96218242485241,0
4.536099605055323,43.96220903572051,0
4.53771014982657,43.96415063050954,0
4.536106012183452,43.96535632643623,0
4.538664824256699,43.9660402294286,0
4.539486616025195,43.96777930035288,0
4.54165951159373,43.96623221715382,0
4.543909553814832,43.96588360581748,0
4.541906820403621,43.96447824521096,0
4.543519784610379,43.96288529313735,0
4.540449258644572,43.9633940089841,0
4.544185719673153,43.9516337999984,0
4.536212701406948,43.94157791460842,0
4.539125112498221,43.96125976359349,0
</coordinates>
</LineString>
</Placemark>
</Document>
</kml>
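Update: I did a bit of reading, in particular the Details part of the XML package documentation section titled "Find matching nodes in an internal XML tree/DOM". I now understand that the attributes on the kml tag are namespace declarations, so I corrected the xpathApply call to: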
xpathApply(doc2,"/kml:kml//kml:coordinates",xmlValue)
Note that each step in the path now carries the kml: namespace prefix.
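The prefix does not have to be one declared in the file. A minimal sketch, assuming the XML package is installed: it binds a prefix of our own choosing to the KML namespace URI through the namespaces argument of xpathApply. The prefix k and the inline kml_text sample are made up for illustration.

```r
library(XML)

# Tiny stand-in for the Google Earth file (illustrative data only);
# it declares the KML namespace as the default, with no "kml:" prefix
kml_text <- '<?xml version="1.0" encoding="UTF-8"?>
<kml xmlns="http://www.opengis.net/kml/2.2">
<Document><Placemark><LineString>
<coordinates>4.5386,43.9621,0 4.5360,43.9622,0</coordinates>
</LineString></Placemark></Document>
</kml>'

doc <- xmlParse(kml_text, asText = TRUE)

# Bind an arbitrary prefix ("k") to the KML namespace URI and use it in the path
ns <- c(k = "http://www.opengis.net/kml/2.2")
with_ns <- xpathApply(doc, "/k:kml//k:coordinates", xmlValue, namespaces = ns)
```

This form is handy for files that only declare the default namespace, since what XPath matches on is the URI and not the prefix text.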
Now I can use the KML file without modifying it. Here is an example wrapped in a function:
library(XML)
KML_geo_path_coordinates_to_dataframe<-function(kml_file){
#this requires the xml library
doc2<-xmlInternalTreeParse(kml_file)
#the namespace issue (kml:) is explained in the getNodeSet(XML) R documentation under Details
ll<-xpathApply(doc2,"/kml:kml//kml:coordinates",xmlValue)
# ll delivers a list, I take the element I need out...a long string of coordinates separated by " "
s<-ll[[1]]
#however it may need some clean up: trim the ends, then split on any run of
#whitespace (spaces, tabs or newlines) so each element is one lon,lat,elv set
ss<-strsplit(trimws(s),split="[[:space:]]+")
df <- data.frame(do.call('rbind', strsplit(as.character(ss[[1]]),',',fixed=TRUE)))
colnames(df)<-c("lon", "lat", "elv")
return(df)
}
Answer 0 (score: 0)
Implementing @Gavin's excellent suggestion (assuming the file is named map.kml):
library(rgdal)
setwd("<directory containing kml file>")
system(paste("ogrinfo", "map.kml")) # diagnostic to identify the layers
# Had to open data source read-only.
# INFO: Open of `map.kml'
# using driver `KML' successful.
# 1: KmlFile (Line String) <- This is the layer name
map <- readOGR(dsn="map.kml",layer="KmlFile")
df <- data.frame(map@lines[[1]]@Lines[[1]]@coords)
colnames(df) <- c("lon","lat")
df
# lon lat
# 1 4.538678 43.96218
# 2 4.536100 43.96221
# 3 4.537710 43.96415
# 4 4.536106 43.96536
# 5 4.538665 43.96604
# ...
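Some notes:
The KML driver for readOGR(...) requires the file name (optionally with a path) as dsn, and the text of the kml name tag as layer. The system call at the start identifies the layers.
readOGR(...) discards the z dimension. So if you need it, this approach will not work for you.
Where the coordinates sit inside the returned object depends on the geometry type and the number of elements. In your case, you have only one path.
Your file actually has an error on line 2 (the xmlns:gx namespace declaration is missing its closing quote). You need to fix it or the file will not import.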
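If the elevation column matters, one alternative to readOGR is the sf package. This is a swapped-in technique, not something the answer above used. The following is a sketch under assumptions: sf is installed, its GDAL build includes a KML driver, and the small file written to a temporary path is illustrative only. Whether a Z column appears depends on what the driver reports, so the code keeps whichever of X, Y and Z are present.

```r
library(sf)

# Write a tiny illustrative KML file to a temporary location
kml_path <- tempfile(fileext = ".kml")
writeLines(c(
  '<?xml version="1.0" encoding="UTF-8"?>',
  '<kml xmlns="http://www.opengis.net/kml/2.2">',
  '<Document><name>KmlFile</name>',
  '<Placemark><name>ROUTE_3</name><LineString>',
  '<coordinates>4.5386,43.9621,0 4.5360,43.9622,0 4.5377,43.9641,0</coordinates>',
  '</LineString></Placemark>',
  '</Document></kml>'
), kml_path)

# Read the first layer of the file; GDAL detects the format
route <- st_read(kml_path, quiet = TRUE)

# st_coordinates returns a matrix with X, Y, optionally Z, plus index columns
xyz <- st_coordinates(route)
keep <- intersect(c("X", "Y", "Z"), colnames(xyz))
df <- as.data.frame(xyz[, keep, drop = FALSE])
```

The columns can then be renamed to lon, lat and elv with colnames(df), as in the examples above.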