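I have a very large XML file from which I want to extract a few fields.
I have got as far as looking at each entry individually, but how do I put the results into a data frame?
A snippet of my XML file: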
`<xml>
<Placemark>
<name>Abercorn SS</name>
<ExtendedData>
<SchemaData schemaUrl="#S_Negotiated_Primary_catchments_from_January_to_Dec_2015_SSSS">
<SimpleData name="Centre_name">Abercorn SS</SimpleData>
<SimpleData name="Centre_code">0591</SimpleData>
<SimpleData name="Year_level">PY_06</SimpleData>
<SimpleData name="Region_Name">Central Queensland</SimpleData>
</SchemaData>
</ExtendedData>
<coordinates>
150.904245,-24.97653499999998,0 150.904806,-24.97412299999999,0 150.904388,-24.974128,0 150.874011,-24.97011800000001,0 150.876696,-24.95033100000001,0 150.815075,-24.942558,0 150.814086,-24.948353,0 150.811966,-24.95216899999999,0 150.81239,-24.958529,0 150.814615,-24.96262800000001,0 150.805857,-24.961506,0 150.805127,-24.960027,0 150.803347,-24.95676499999999,0 150.803497,-24.95399,0 150.804612,-24.95119399999999,0 150.807595,-24.949109,0 150.807979,-24.948274,0 150.808042,-24.947767,0 150.806137,-24.94549799999999,0 150.80589,-24.94416599999999,0 150.805892,-24.94126700000001,0 150.804494, 150.975314,-25.03418900000001,0 150.973299,-25.033167,0 150.972944,-25.03298699999999,0 150.950341,-25.02149500000002,0 150.947855,-25.023683,0 150.946617,-25.02333300000002,0 150.942635,-25.022637,0 150.938578,-25.01867799999999,0 150.937075,-25.01831200000001,0 150.931733,-25.012552,0 150.932783,-25.007307,0 150.923093,-24.99487199999998,0 150.922398,-24.97895700000001,0 150.90426,-24.97665499999999,0 150.904287,-24.97654000000001,0 150.904245,-24.97653499999998,0
</coordinates>
</Placemark>
<Placemark>
<name>Abergowrie SS</name>
<ExtendedData>
<SchemaData schemaUrl="#S_Negotiated_Primary_catchments_from_January_to_Dec_2015_SSSS">
<SimpleData name="Centre_name">Abergowrie SS</SimpleData>
<SimpleData name="Centre_code">1275</SimpleData>
<SimpleData name="Year_level">PY_06</SimpleData>
<SimpleData name="Region_Name">North Queensland</SimpleData>
</SchemaData>
</ExtendedData>
<coordinates>
145.897021,-18.41322500000002,0 145.8956070000001,-18.413049,0 145.894936,-18.412542,0 145.894353,-18.41183899999999,0 145.893489,-18.41142200000002,0 145.892407,-18.41094100000001,0 145.891309,-18.41027500000002,0 145.890642,-18.41,0 145.88994,-18.409505,0 145.888815,-18.41014499999999,0 145.888433,-18.41028599999999,0 145.888159,-18.41060899999999,0 145.88775,-18.410905,0 145.88711,-18.41106999999999,0 145.88677,-18.41115799999999,0 145.886496,-18.411468,0 145.885986,-18.41255499999999,0 145.884587,-18.411924,-18.41395900000002,0 145.948824,-18.41326900000002,0 145.9491859999999,-18.412522,0 145.949185,-18.412293,0 145.9493659999999,-18.41194799999999,0 145.9493659999999,-18.411431,0 145.94898,-18.411168,0 145.948762,-18.41111,0 145.9483390000001,-18.410766,0 145.948279,-18.410169,0 145.948374,-18.40835500000001,0 145.948311,-18.40482999999999,0 145.948274,-18.40427900000001,0 145.948213,-18.404107,0 145.9480680000001,-18.403969,0 145.947996,-18.40354500000001,0 145.948152,-18.40293600000002,0 145.948755,-18.402017,0 145.949056,-18.401305,0 145.949176,-18.400891,0 145.949634,-18.400179,0 145.950056,-18.39960400000001,0 145.950237,-18.399202,0 145.950381,-18.39868599999999,0 145.950381,-18.39784700000002,0 145.950476,-18.39675600000001,0 145.950561,-18.39646899999999,0 145.95062,-18.39535599999999,0 145.950718,-18.39488700000001,0 145.946246,-18.396872,0 145.93556,-18.40287699999999,0 145.895762,-18.42523000000001,0 145.897021,-18.41322500000002,0
</coordinates>
</Placemark>
<Placemark>
<name>Acacia Ridge SS</name>
<ExtendedData>
<SchemaData schemaUrl="#S_Negotiated_Primary_catchments_from_January_to_Dec_2015_SSSS">
<SimpleData name="Centre_name">Acacia Ridge SS</SimpleData>
<SimpleData name="Centre_code">0025</SimpleData>
<SimpleData name="Year_level">0P_06</SimpleData>
<SimpleData name="Region_Name">Metropolitan</SimpleData>
</SchemaData>
</ExtendedData>
<coordinates>
153.020092,-27.56231199999998,0 153.018846,-27.562119,0 153.018031,-27.562006,0 153.016631,-27.56179,0 153.016649,-27.56171300000001,0 153.016258,-27.56165100000001,0 153.015945,-27.56212899999999,0 153.014001,-27.561825,0 153.01191,-27.56153000000001,0 153.011866,-27.561704,0 153.00854,-27.56121200000002,0 153.0085390000001,-27.56121099999999,-27.584082,0 153.030365,-27.584339,0 153.031037,-27.58056199999998,0 153.031997,-27.575302,0 153.029273,-27.57489899999999,0 153.029174,-27.57393100000001,0 153.026711,-27.573589,0 153.026697,-27.57344300000001,0 153.026285,-27.57339400000001,0 153.026092,-27.573321,0 153.026056,-27.57316099999999,0 153.024937,-27.572928,0 153.024536,-27.57261299999999,0 153.024385,-27.57127099999999,0 153.024084,-27.56948399999999,0 153.022069,-27.569139,0 153.02197,-27.569593,0 153.021903,-27.56958199999999,0 153.021779,-27.56966499999998,0 153.021555,-27.569635,0 153.021294,-27.569836,0 153.020912,-27.56977599999999,0 153.020894,-27.56980899999999,0 153.020811,-27.56980899999999,0 153.0193749999999,-27.56958600000002,0 153.019659,-27.56821400000001,0 153.019728,-27.567812,0 153.019814,-27.56760700000001,0 153.020334,-27.56768899999999,0 153.020667,-27.56599800000001,0 153.021182,-27.56606499999998,0 153.021525,-27.56434500000001,0 153.021457,-27.56385299999999,0 153.020902,-27.563771,0 153.020997,-27.563251,0 153.020104,-27.56311700000001,0 153.020174,-27.56274200000001,0 153.020188,-27.56267099999999,0 153.020245,-27.56236499999999,0 153.020258,-27.562367,0 153.020263,-27.562339,0 153.020092,-27.56231199999998,0
</coordinates>
</Placemark>
</xml>`
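I am only after `name`, `Region_Name`, and `coordinates`.
I want to put these values in a data frame and then do some further string manipulation. For example, for the first Placemark I am after: Abercorn SS, Central Queensland, and its coordinates.
EDIT: I have edited the XML snippet; please see it above. I have written some R code, but the CSV it produces is rather messy.
My code is: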
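`library(XML)  # provides xmlTreeParse, xmlRoot, xmlSApply, xmlValue
# Parse the file into an R-level tree and take its root node
doc <- xmlTreeParse("PrimaryCatchments.xml")
top <- xmlRoot(doc)
# For each Placemark, take the text value of each of its direct children
extract <- xmlSApply(top, function(x) xmlSApply(x, xmlValue))
# Transpose so that each Placemark becomes a row
extract_df <- data.frame(t(extract), row.names = NULL)
# Convert every column to character before writing
my.df <- data.frame(lapply(extract_df, as.character), stringsAsFactors = FALSE)
write.csv(my.df, "Extract.csv", row.names = FALSE)`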
This writes the file in a very messy way. Where am I going wrong?
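
To make the target concrete, here is a minimal sketch of the layout described above: one row per `Placemark`, with columns for the name, the region and the raw coordinate string. It assumes the XML package's `xmlParse`, `xpathSApply` and `xmlValue`, and it assumes every `Placemark` contains exactly one `name`, one `Region_Name` entry and one `coordinates` element, so that the three vectors line up. The inline `xml_text` is a shortened stand-in for the real file (coordinate lists cut to two points), and `placemarks_df` is just an illustrative name.

```r
library(XML)

# Shortened stand-in for the file: two Placemarks in the same shape as the
# snippet above, with each coordinate list cut down to two points.
xml_text <- '<xml>
<Placemark>
<name>Abercorn SS</name>
<ExtendedData>
<SchemaData schemaUrl="#S_Negotiated_Primary_catchments_from_January_to_Dec_2015_SSSS">
<SimpleData name="Centre_name">Abercorn SS</SimpleData>
<SimpleData name="Centre_code">0591</SimpleData>
<SimpleData name="Year_level">PY_06</SimpleData>
<SimpleData name="Region_Name">Central Queensland</SimpleData>
</SchemaData>
</ExtendedData>
<coordinates>
150.904245,-24.97653499999998,0 150.904806,-24.97412299999999,0
</coordinates>
</Placemark>
<Placemark>
<name>Abergowrie SS</name>
<ExtendedData>
<SchemaData schemaUrl="#S_Negotiated_Primary_catchments_from_January_to_Dec_2015_SSSS">
<SimpleData name="Centre_name">Abergowrie SS</SimpleData>
<SimpleData name="Centre_code">1275</SimpleData>
<SimpleData name="Year_level">PY_06</SimpleData>
<SimpleData name="Region_Name">North Queensland</SimpleData>
</SchemaData>
</ExtendedData>
<coordinates>
145.897021,-18.41322500000002,0 145.8956070000001,-18.413049,0
</coordinates>
</Placemark>
</xml>'

# Parse into an internal document so that XPath queries can be used
doc <- xmlParse(xml_text, asText = TRUE)

# One XPath query per wanted field; each returns one string per Placemark
# (this relies on the one-of-each assumption stated above)
placemarks_df <- data.frame(
  name        = xpathSApply(doc, "//Placemark/name", xmlValue),
  region_name = xpathSApply(doc, "//Placemark//SimpleData[@name='Region_Name']", xmlValue),
  # trimws() drops the newlines surrounding the coordinate text
  coordinates = trimws(xpathSApply(doc, "//Placemark/coordinates", xmlValue)),
  stringsAsFactors = FALSE
)

print(placemarks_df[, c("name", "region_name")])
```

For the real file, `xmlParse("PrimaryCatchments.xml")` would replace the inline text, and `write.csv(placemarks_df, "Extract.csv", row.names = FALSE)` would write one line per Placemark. If some Placemarks lack one of the fields, the three vectors would differ in length and `data.frame()` would fail, so in that case a per-Placemark loop would be the safer shape.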