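I have two complex XML files, and I want to find the differences between them.
What I need is to find:
I have tried compareXMLDocs from the XML package, but the results are not satisfactory.
Example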
XML1
<root>
  <first>name1</first>
  <second>id1</second>
  <third>
    <third.1>something</third.1>
    <third.2>something else</third.2>
  </third>
  <fifth>no differences</fifth>
</root>
XML2
<root>
  <second>id2</second>
  <third>
    <third.1>something2</third.1>
    <third.2>something else2</third.2>
  </third>
  <fourth>blahblah</fourth>
  <fifth>no differences</fifth>
</root>
So when I compare them with compareXMLDocs, I get:
> compareXMLDocs(a, b)
$inA
first
1
$inB
fourth
1
$countDiffs
named integer(0)
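I can see that the first tag exists only in XML1 and the fourth tag exists only in XML2. But I cannot tell, for example, that the values in third.1 and third.2 are different. That is what I am looking for. I don't understand what countDiffs is for. It doesn't seem useful here.
I also tried converting the XML to a data frame, but the output format is not very helpful. It gets even worse for large XML files with deep trees.
I would like the result for this example to be a data frame like this: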
Path                 A               B
/root/first          name1           NA
/root/second         id1             id2
/root/third/third.1  something       something2
/root/third/third.2  something else  something else2
/root/fourth         NA              blahblah
Answer 0 (score: 3)
Data:
library(xml2)
library(tidyverse)
read_xml("<root>
<first>name1</first>
<second>id1</second>
<third>
<third.1>something</third.1>
<third.2>something else</third.2>
</third>
<fifth>no differences</fifth>
</root>
") -> d1
read_xml("
<root>
<second>id2</second>
<third>
<third.1>something2</third.1>
<third.2>something else2</third.2>
</third>
<fourth>blahblah</fourth>
<fifth>no differences</fifth>
</root>
") -> d2
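Make a quick helper function:
# NOTE: this will not handle attributes
as_path_df <- function(x) {
  as_list(x) %>%       # nested list mirroring the XML tree
    unlist() %>%       # flatten; names become dot-separated paths
    as.list() %>%
    as_tibble() %>%    # one column per path (as_data_frame() is deprecated)
    gather(key, val)   # reshape to long form: one row per path
}
Here is what the helper does: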
(d1_p <- as_path_df(d1))
## # A tibble: 5 x 2
## key val
## <chr> <chr>
## 1 first name1
## 2 second id1
## 3 third.third.1 something
## 4 third.third.2 something else
## 5 fifth no differences
(d2_p <- as_path_df(d2))
## # A tibble: 5 x 2
## key val
## <chr> <chr>
## 1 second id2
## 2 third.third.1 something2
## 3 third.third.2 something else2
## 4 fourth blahblah
## 5 fifth no differences
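The dot-separated keys above can be ambiguous when tag names themselves contain dots (as in third.1). As an alternative, here is a minimal sketch that keys each leaf element by its XPath location via xml2's xml_path(). The helper name as_xpath_df is hypothetical, not part of xml2, and like the helper above it ignores attributes.

```r
library(xml2)

# Hypothetical helper (an assumption for illustration, not an xml2 function):
# one row per leaf element, keyed by the element's XPath location.
as_xpath_df <- function(x) {
  # "//*[not(*)]" selects elements that have no child elements
  leaves <- xml_find_all(x, "//*[not(*)]")
  data.frame(
    path = xml_path(leaves),
    val = xml_text(leaves),
    stringsAsFactors = FALSE
  )
}

# Small stand-in document so the sketch runs on its own
doc <- read_xml(
  "<root><first>name1</first><third><third.1>something</third.1></third></root>"
)
res <- as_xpath_df(doc)
print(res)
```

The resulting path column can be joined on in the same way as key in the steps that follow.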
Keys?
setdiff(d1_p$key, d2_p$key)
## [1] "first"
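Note that setdiff() is not symmetric: the call above only reports keys that are missing from the second document. A small sketch checking both directions, with the key vectors typed in by hand (copied from the output shown earlier) so that it runs on its own:

```r
# Keys as they appear in d1_p$key and d2_p$key
keys_d1 <- c("first", "second", "third.third.1", "third.third.2", "fifth")
keys_d2 <- c("second", "third.third.1", "third.third.2", "fourth", "fifth")

only_in_d1 <- setdiff(keys_d1, keys_d2)  # present in the first document only
only_in_d2 <- setdiff(keys_d2, keys_d1)  # present in the second document only

print(only_in_d1)
print(only_in_d2)
```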
Values?
rename(d1_p, d1_val = val) %>%
  left_join(rename(d2_p, d2_val = val), by = "key") %>%
  mutate(same = (d1_val == d2_val))
## # A tibble: 5 x 4
## key d1_val d2_val same
## <chr> <chr> <chr> <lgl>
## 1 first name1 <NA> NA
## 2 second id1 id2 FALSE
## 3 third.third.1 something something2 FALSE
## 4 third.third.2 something else something else2 FALSE
## 5 fifth no differences no differences TRUE
You could probably just use is.na() on the _val columns or on the same column, instead of setdiff(), for the missing-key part. But setdiff() is super fast.
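The left join keeps only keys that exist in the first document, so fourth never shows up. If a single table closer to the one asked for in the question is wanted, one possible sketch is a full join followed by a filter. The tibbles are typed in by hand here (copied from the helper output shown earlier) so that the block runs on its own; the column names A and B follow the question, and diff_df is just an illustrative name.

```r
library(dplyr)

# Key/value tables as shown in the helper output above
d1_p <- tibble(
  key = c("first", "second", "third.third.1", "third.third.2", "fifth"),
  val = c("name1", "id1", "something", "something else", "no differences")
)
d2_p <- tibble(
  key = c("second", "third.third.1", "third.third.2", "fourth", "fifth"),
  val = c("id2", "something2", "something else2", "blahblah", "no differences")
)

diff_df <- full_join(
  rename(d1_p, A = val),
  rename(d2_p, B = val),
  by = "key"
) %>%
  # keep rows where the key is missing on one side or the values differ
  filter(is.na(A) | is.na(B) | A != B)

print(diff_df)
```

Rows whose values match in both documents (fifth here) are dropped, and keys found in only one document keep NA on the other side.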