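Sorry, but there is not much documentation on parsing XML-TEI in R, especially for beginners in R.
I am counting several sets of nodes with the 'getNodeSet' function. The XPath expressions differ in only one value inside a 'contains()'. The goal is to count the '@type = verb' words for each specific value, and every query also includes 'contains(@ana,'#action')'. Example: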
#different value "@ana, '#displacement'"
nodes=getNodeSet(doc,"//ns:w[contains(@type,'verb') and contains(@ana,'#action') and contains(@ana, '#displacement') and contains(@ana, '#ANT')]", ns)
VARIABLE_NAME01 <- length(nodes)
VARIABLE_NAME01
#result in the console [2]
#different value "@ana, '#put_together'"
nodes=getNodeSet(doc,"//ns:w[contains(@type,'verb') and contains(@ana,'#action') and contains(@ana, '#put_together') and contains(@ana, '#ANT')]", ns)
VARIABLE_NAME02 <- length(nodes)
VARIABLE_NAME02
#result in the console [0]
#different value "@ana, '#destruction'"
nodes=getNodeSet(doc,"//ns:w[contains(@type,'verb') and contains(@ana,'#action') and contains(@ana, '#destruction') and contains(@ana, '#ANT')]", ns)
VARIABLE_NAME03 <- length(nodes)
VARIABLE_NAME03
#result in the console [7]
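But writing the same thing every time is of course very tedious, and it is not very elegant.
Is it possible to have something like the following (sorry, this is not proper code, just an example of what I need):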
#a condition
For node=getNodeSet(doc,"//ns:w[contains(@type,'verb') and contains(@ana,'#action') and not(contains(@ana, '#ANT'))]"
#add in contains
(
(@ana, 'DIFFERENT VALUE01') FOR VARIABLE01
(@ana, 'DIFFERENT VALUE02') FOR VARIABLE02
(@ana, 'DIFFERENT VALUE03') FOR VARIABLE03
)
#etc.
Any ideas?
Afterwards, I need to be able to add up the results:
add_result <- sum(VARIABLE_NAME01, VARIABLE_NAME02, VARIABLE_NAME03)
add_result
However, I was thinking of something like:
nodes=sum(
(getNodeSet(doc,"//ns:w[contains(@ana,'#action') and contains(@type,'verb')]", ns)),
(getNodeSet(doc,"//ns:w[contains(@type,'verb') and contains(@type,'verb')]", ns))
)
add_result <- length(nodes)
add_result
Then I would look for other nodes with different values. Unfortunately, this does not work.
Thanks in advance for your advice.
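A note on why the last attempt fails: 'sum()' expects numbers, whereas 'getNodeSet()' returns a list of nodes, so each node set has to be counted with 'length()' before the counts are added. Below is a minimal sketch, assuming the XML package. The tiny TEI fragment, the 'ns' prefix mapping and the variable names are invented for illustration and are not the asker's data.

```r
library(XML)

# Hypothetical miniature TEI document, for illustration only
tei <- '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body><p>
  <w type="verb" ana="#action #displacement #ANT">go</w>
  <w type="verb" ana="#action #destruction #ANT">break</w>
  <w type="verb" ana="#action #destruction">burn</w>
  <w type="noun" ana="#ANT">house</w>
</p></body></text></TEI>'
doc <- xmlParse(tei, asText = TRUE)

# Map the prefix "ns" to the TEI namespace so that "ns:w" resolves
ns <- c(ns = "http://www.tei-c.org/ns/1.0")

displacement <- getNodeSet(doc, "//ns:w[contains(@type,'verb') and contains(@ana,'#displacement') and contains(@ana,'#ANT')]", ns)
destruction <- getNodeSet(doc, "//ns:w[contains(@type,'verb') and contains(@ana,'#destruction') and contains(@ana,'#ANT')]", ns)

# Count each node set first, then add the counts
add_result <- sum(length(displacement), length(destruction))
add_result
```

If a word could match both expressions, adding the counts would count it twice. Joining the two XPath expressions with '|' in a single 'getNodeSet()' call returns each matching node only once.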
Answer 0 (score: 0)
What I have done so far:
#count each node set before 'nodes' is overwritten by the next query
nodes=getNodeSet(doc,"//ns:w[contains(@type,'verb') and contains(@ana,'#action') and contains(@ana, '#displacement') and contains(@ana, '#ANT')]", ns)
VARIABLE_NAME01 <- length(nodes)
nodes=getNodeSet(doc,"//ns:w[contains(@type,'verb') and contains(@ana,'#action') and contains(@ana, '#put_together') and contains(@ana, '#ANT')]", ns)
VARIABLE_NAME02 <- length(nodes)
I do not know whether there is a simpler way to do it.
Answer 1 (score: 0)
A colleague, an R expert, helped me and suggested:
#values that change between queries (already quoted for insertion into the XPath)
typeAction=c("'#displacement'","'#put_together'","'#agression'","'#confrontation'","'#movement'","'#otherAction'")
#total number of matching nodes over all values
total_action_ANT=0
for (i in seq_along(typeAction)) total_action_ANT=total_action_ANT+length(getNodeSet(doc,paste0("//ns:w[contains(@type,'verb') and contains(@ana,'#action') and contains(@ana, ",typeAction[i],") and contains(@ana, '#ANT')]"), ns))
total_action_ANT
#keep the node set found for each value
nodelist=list()
for (i in seq_along(typeAction)) nodelist[[i]]=getNodeSet(doc,paste0("//ns:w[contains(@type,'verb') and contains(@ana,'#action') and contains(@ana, ",typeAction[i],") and contains(@ana, '#ANT')]"), ns)
str(nodelist)
#one row per value with its number of occurrences
resultats = cbind(action=typeAction,occurences=unlist(lapply(nodelist,length)))
resultats
It works well! Hope this helps.
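The same idea can probably be written more compactly with a small helper and 'vapply()', which yields one named count per value in a single step. This is only a sketch under assumptions: 'count_action()' is a hypothetical helper name, and the sample document is invented for illustration. In practice 'doc' and 'ns' would be the objects already used above.

```r
library(XML)

# Hypothetical miniature TEI document, for illustration only
tei <- '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><body><p>
  <w type="verb" ana="#action #displacement #ANT">go</w>
  <w type="verb" ana="#action #destruction #ANT">break</w>
  <w type="verb" ana="#action #destruction #ANT">crush</w>
  <w type="verb" ana="#action #put_together">join</w>
  <w type="noun" ana="#action #destruction #ANT">ruin</w>
</p></body></text></TEI>'
doc <- xmlParse(tei, asText = TRUE)
ns <- c(ns = "http://www.tei-c.org/ns/1.0")

# Hypothetical helper: number of verbs tagged '#action', '#ANT' and 'value'
count_action <- function(value, doc, ns) {
  xpath <- sprintf(
    "//ns:w[contains(@type,'verb') and contains(@ana,'#action') and contains(@ana,'%s') and contains(@ana,'#ANT')]",
    value
  )
  length(getNodeSet(doc, xpath, ns))
}

typeAction <- c("#displacement", "#put_together", "#destruction")

# One integer count per value; the result is named after the values
counts <- vapply(typeAction, count_action, integer(1), doc = doc, ns = ns)
counts

# Total over all values
sum(counts)
```

Note that 'contains()' is a substring test, so '#action' would also match a longer value that merely starts with '#action'. Whether that matters depends on the tag set used in the corpus.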