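I'm new to R and don't know much about XML/HTML. I have this txt file that I'm trying to parse and convert into a data frame. Below is a sample of what the text file looks like. It contains invalid characters.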
Here is the sample, followed by my code after using gsub to remove those characters. I still get an error:
"x"
"1" "<?xml version=\"1.0\" ?><response status='ok'><serviceRequestList><serviceRequest><accountManagerId>11111</accountManagerId><billable>0</billable><billableTotal>0.0000000000</billableTotal><billingStatus>Not Billed</billingStatus><costTotal>0.0000</costTotal><customerContactEmail>example@imperial.nhs.uk</customerContactEmail><customerContactId>2222222</customerContactId><customerContactName>Example Example</customerContactName><customerContactPhone>0044 (0)000 111 2222</customerContactPhone><customerContactPhoneMobile></customerContactPhoneMobile><customerId>444444</customerId><customerLocationCity>London</customerLocationCity><customerLocationCountry>United Kingdom</customerLocationCountry><customerLocationId>9999999</customerLocationId><customerLocationName>Example's Hospital</customerLocationName><customerLocationNotes></customerLocationNotes><customerLocationPostalCode>W2 1NY</customerLocationPostalCode><customerLocationState>Greater London</customerLocationState><customerLocationStreetAddress>Example Street</customerLocationStreetAddress><customerLocationZone></customerLocationZone><customerName>Example Healthcare (EXAMPLE)</customerName><dateTimeCreated>2010-04-06T09:47:25</dateTimeCreated><dateTimeClosed>2011-05-24T07:32:05.240</dateTimeClosed><description>Example - Ex/CANCELED</description><detailedDescription></detailedDescription><priority>3</priority><priorityLabel>Medium</priorityLabel><serviceManagerId>0</serviceManagerId><serviceRequestId>1007</serviceRequestId><status>Closed</status><timeOpen_hours>9909.7500000</timeOpen_hours><type></type></serviceRequest><serviceRequest><accountManagerId>11111</accountManagerId><billable>0</billable><billableTotal>0.0000000000</billableTotal><billingStatus>Not Billed</billingStatus><costTotal>0.0000</costTotal><customerContactEmail>example.example@gstt.nhs.uk, example2.example2@gstt.nhs.uk</customerContactEmail><customerContactId>5555555</customerContactId><customerContactName>Ex 
Example</customerContactName><customerContactPhone>88888 444444</customerContactPhone><customerContactPhoneMobile>07817 738912</customerContactPhoneMobile><customerId>957056</customerId><customerLocationCity>London</customerLocationCity><customerLocationCountry>United Kingdom</customerLocationCountry><customerLocationId>1372407</customerLocationId><customerLocationName>Example' Hospital</customerLocationName><customerLocationNotes></customerLocationNotes><customerLocationPostalCode>GH1 7EH</customerLocationPostalCode><customerLocationState>Greater London</customerLocationState><customerLocationStreetAddress>Example Bridge Road</customerLocationStreetAddress><customerLocationZone></customerLocationZone><customerName>Example' Trust (EXTT)</customerName><dateTimeCreated>2010-06-10T07:37:58</dateTimeCreated><dateTimeClosed>2010-06-10T07:42:40</dateTimeClosed><description>Software -Example - 55555</description><detailedDescription>The example that I have created.
This is an example, I made up the data.
This is another line for the example. 
</detailedDescription><priority>3</priority><priorityLabel>Medium</priorityLabel><serviceManagerId>0</serviceManagerId><serviceRequestId>6007</serviceRequestId><status>Closed</status><timeOpen_hours>0.0833000</timeOpen_hours><type>Problem</type></serviceRequest></serviceRequestList></response>"
library(httr)
library(xml2)
library(dplyr)
library(XML)
library(plyr)
#Change working directory to be able to save on network share drive
setwd("\\\\mwo-file\\Example")
#Parsing the clean XML File
data <- xmlTreeParse("Sample.txt")
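I get this error:
Error: 1: Start tag expected, '<' not found
Even when I edit the text file so that < is the first character, it still doesn't work. When I try htmlTreeParse instead, as below, it works, but the object looks messy and I don't know how to retrieve the information from the nodes I want.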
#Parsing the clean XML File
data <- htmlTreeParse("Sample.txt")
data
Does anyone know the best way to parse this sample file and build a data frame from it?
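
For reference, here is a minimal sketch of one direction this could take, using only xml2. It rests on several assumptions that are not confirmed above: that the file is an XML string wrapped in a "x" header line, a "1" row label and escaped quotes (as the sample suggests), and that every serviceRequest has the same child elements. The names raw_lines, txt, requests, rows and df are made up for illustration, and the inline sample is a shortened stand-in for Sample.txt.

```r
library(xml2)

# Hypothetical, shortened stand-in for the contents of Sample.txt.
# With the real file this would be: raw_lines <- readLines("Sample.txt")
raw_lines <- c(
  "\"x\"",
  paste0(
    "\"1\" \"<?xml version=\\\"1.0\\\" ?><response status='ok'><serviceRequestList>",
    "<serviceRequest><serviceRequestId>1007</serviceRequestId>",
    "<status>Closed</status><type></type></serviceRequest>",
    "<serviceRequest><serviceRequestId>6007</serviceRequestId>",
    "<status>Closed</status><type>Problem</type></serviceRequest>",
    "</serviceRequestList></response>\""
  )
)

# Work on one string so that line breaks inside elements are preserved.
txt <- paste(raw_lines, collapse = "\n")

# Drop everything before the first "<" (the "x" header and the "1" label).
txt <- sub("^[^<]*", "", txt, perl = TRUE)

# Drop everything after the last ">" (the closing quote).
txt <- sub(">[^>]*$", ">", txt, perl = TRUE)

# Turn the escaped quotes (\") back into plain quotes.
txt <- gsub("\\\"", "\"", txt, fixed = TRUE)

# Parse the cleaned string and select every <serviceRequest> element.
doc <- read_xml(txt)
requests <- xml_find_all(doc, ".//serviceRequest")

# One row per <serviceRequest>, one column per child element.
# rbind() assumes every request has the same set of children.
rows <- lapply(requests, function(node) {
  fields <- xml_children(node)
  as.data.frame(
    as.list(setNames(xml_text(fields), xml_name(fields))),
    stringsAsFactors = FALSE
  )
})
df <- do.call(rbind, rows)
```

All columns come back as character, so numeric or date fields such as billableTotal or dateTimeCreated would still need converting afterwards.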