XML-TEI in R: length(nodes) with one differing value

Asked: 2017-03-11 23:07:58

Tags: r xml tei

Sorry, there is not yet much documentation on parsing XML-TEI in R, especially for R beginners.

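I am counting several sets of nodes with the 'getNodeSet' function; the queries differ by only one value inside a 'contains'. The goal is to count the words with '@type = verb' that match one specific 'contains', among all those that satisfy contains(@ana,'#action'). Example: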

# differing value: contains(@ana, '#displacement')
nodes <- getNodeSet(doc, "//ns:w[contains(@type,'verb') and contains(@ana,'#action') and contains(@ana, '#displacement') and contains(@ana, '#ANT')]", ns)
VARIABLE_NAME01 <- length(nodes)
VARIABLE_NAME01
# result in the console: 2

# differing value: contains(@ana, '#put_together')
nodes <- getNodeSet(doc, "//ns:w[contains(@type,'verb') and contains(@ana,'#action') and contains(@ana, '#put_together') and contains(@ana, '#ANT')]", ns)
VARIABLE_NAME02 <- length(nodes)
VARIABLE_NAME02
# result in the console: 0

# differing value: contains(@ana, '#destruction')
nodes <- getNodeSet(doc, "//ns:w[contains(@type,'verb') and contains(@ana,'#action') and contains(@ana, '#destruction') and contains(@ana, '#ANT')]", ns)
VARIABLE_NAME03 <- length(nodes)
VARIABLE_NAME03
# result in the console: 7

But writing the same thing every time is of course very tedious, and it is not very pretty.

Is it possible to have something like the following (sorry, this is not proper code, just an example of what I need):

#a condition
For node=getNodeSet(doc,"//ns:w[contains(@type,'verb') and contains(@ana,'#action') and not(contains(@ana, '#ANT'))]" 
#add in contains
(    
(@ana, 'DIFFERENT VALUE01') FOR VARIABLE01
(@ana, 'DIFFERENT VALUE02') FOR VARIABLE02
(@ana, 'DIFFERENT VALUE03') FOR VARIABLE03
)
#etc.
Any ideas?

Afterwards, I need to be able to add up the results:

add_result <- sum(VARIABLE_NAME01, VARIABLE_NAME02, VARIABLE_NAME03)
add_result

However, I was thinking of something like this:

nodes=sum(
 (getNodeSet(doc,"//ns:w[contains(@ana,'#action') and contains(@type,'verb')]", ns)),
 (getNodeSet(doc,"//ns:w[contains(@type,'verb') and contains(@type,'verb')]", ns))
 )
add_result <- length(nodes) 
add_result

Then I would look for other nodes with different values. Unfortunately, it does not work.
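The attempt above most likely fails because `sum()` needs numbers, while `getNodeSet()` returns a list of nodes. A minimal sketch of adding the counts instead, assuming a made-up TEI fragment and the standard TEI namespace bound to the prefix `ns` (the fragment and the helper name `add_counts` are illustrative, not from the original corpus):

```r
# Toy TEI fragment (hypothetical data, for illustration only)
tei <- '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body><p>
  <w type="verb" ana="#action #displacement #ANT">went</w>
  <w type="verb" ana="#action #destruction #ANT">broke</w>
  <w type="noun" ana="#ANT">house</w>
</p></body></text></TEI>'

# Add up the sizes of several node sets (numbers), not the node sets themselves
add_counts <- function(...) sum(lengths(list(...)))

# The XML part only runs when the XML package is installed
if (requireNamespace("XML", quietly = TRUE)) {
  doc <- XML::xmlParse(tei, asText = TRUE)
  ns  <- c(ns = "http://www.tei-c.org/ns/1.0")
  a <- XML::getNodeSet(doc, "//ns:w[contains(@ana,'#displacement')]", ns)
  b <- XML::getNodeSet(doc, "//ns:w[contains(@ana,'#destruction')]", ns)
  add_result <- add_counts(a, b)
  print(add_result)
}
```

Note that a `w` element matching both queries would be counted twice; combining the conditions with `or` in a single XPath expression avoids that.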

Thanks in advance for your advice.

2 Answers:

Answer 0: (score: 0)

What I have done so far:

# store each count right after its query, before 'nodes' is overwritten
nodes <- getNodeSet(doc, "//ns:w[contains(@type,'verb') and contains(@ana,'#action') and contains(@ana, '#displacement') and contains(@ana, '#ANT')]", ns)
VARIABLE_NAME01 <- length(nodes)
nodes <- getNodeSet(doc, "//ns:w[contains(@type,'verb') and contains(@ana,'#action') and contains(@ana, '#put_together') and contains(@ana, '#ANT')]", ns)
VARIABLE_NAME02 <- length(nodes)

I don't know whether there is a simpler way to do it.
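One possible way to avoid retyping the query is to wrap it in a small function, so that only the changing value of `@ana` is passed in. This is a sketch under assumptions: the names `build_xpath` and `count_verbs` are hypothetical, and `doc` and `ns` are expected to be the parsed document and namespace vector used in the calls above.

```r
# Build the XPath for one value of @ana; only `value` changes between queries
build_xpath <- function(value) {
  sprintf(
    paste0("//ns:w[contains(@type,'verb') and contains(@ana,'#action') ",
           "and contains(@ana,'%s') and contains(@ana,'#ANT')]"),
    value
  )
}

# Count the matching <w> elements for one value (needs the XML package)
count_verbs <- function(doc, value, ns) {
  length(XML::getNodeSet(doc, build_xpath(value), ns))
}

# Inspect the generated query for one value
build_xpath("#displacement")
```

With this in place, each count becomes a one-liner such as `VARIABLE_NAME01 <- count_verbs(doc, "#displacement", ns)`.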

Answer 1: (score: 0)

A colleague who is an R expert helped me and suggested:

# Values of @ana to count; each string carries its own XPath single quotes
typeAction <- c("'#displacement'", "'#put_together'", "'#agression'", "'#confrontation'", "'#movement'", "'#otherAction'")

# Total number of matching verbs across all action types
total_action_ANT <- 0
for (i in seq_along(typeAction)) {
  xpath <- paste0("//ns:w[contains(@type,'verb') and contains(@ana,'#action') and contains(@ana, ", typeAction[i], ") and contains(@ana, '#ANT')]")
  total_action_ANT <- total_action_ANT + length(getNodeSet(doc, xpath, ns))
}
total_action_ANT

# Keep the node set of each action type, then tabulate the counts
nodelist <- list()
for (i in seq_along(typeAction)) {
  xpath <- paste0("//ns:w[contains(@type,'verb') and contains(@ana,'#action') and contains(@ana, ", typeAction[i], ") and contains(@ana, '#ANT')]")
  nodelist[[i]] <- getNodeSet(doc, xpath, ns)
}
str(nodelist)
resultats <- cbind(action = typeAction, occurences = lengths(nodelist))
resultats

It works well! Hope this helps.
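As an alternative to running one XPath query per value, the common part of the query can be run once and the `@ana` strings counted in R. The sketch below is only an illustration: the TEI fragment is made up, `count_values` is a hypothetical helper, and the substring test with `grepl(fixed = TRUE)` is meant to mirror what XPath `contains()` does.

```r
# Count how many @ana strings mention each value (plain substring match)
count_values <- function(ana, values) {
  vapply(values, function(v) sum(grepl(v, ana, fixed = TRUE)), integer(1))
}

typeAction <- c("#displacement", "#put_together", "#destruction")

# The XML part only runs when the XML package is installed
if (requireNamespace("XML", quietly = TRUE)) {
  # Toy TEI fragment (hypothetical data, for illustration only)
  tei <- '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body><p>
    <w type="verb" ana="#action #displacement #ANT">went</w>
    <w type="verb" ana="#action #destruction #ANT">broke</w>
    <w type="verb" ana="#action #destruction #ANT">burned</w>
  </p></body></text></TEI>'
  doc <- XML::xmlParse(tei, asText = TRUE)
  ns  <- c(ns = "http://www.tei-c.org/ns/1.0")

  # One query for the shared conditions, returning the @ana attribute values
  ana <- unlist(XML::xpathSApply(
    doc,
    "//ns:w[contains(@type,'verb') and contains(@ana,'#action') and contains(@ana,'#ANT')]",
    XML::xmlGetAttr, "ana",
    namespaces = ns
  ))

  counts <- count_values(ana, typeAction)  # named integer vector, one entry per value
  print(counts)
  print(sum(counts))                       # overall total
}
```

The result is a named vector, so individual counts can be read with `counts["#destruction"]` and the total with `sum(counts)`, without keeping one variable per value.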