I want to select a few of the values in an attribute that holds several values, and print them. For this example:
#in R
interpRef <- getNodeSet(doc,"//ns:ref[contains(@ana, 'whatAction')]", ns)
interpRef_ana <- for (i in 1:length(interpRef)) print(paste(xmlGetAttr(interpRef[[i]],"ana")))
I get this result:
[[1]]
<ref ana="whatAction #ktu1-3_ii_l6b_tḫtṣb #verb.competition #contend">Action belongs to verb competition subcategory contend
<stage ana="whatResult #result #defeate_ofOpposition"/></ref>
[[2]]
<ref ana="whatAction #ktu1-3_ii_l7_tmḫṣ #verb.emotion #humiliation">Action belongs to verb emotion, subcategory humiliation
<stage ana="whatResult #result #defeate_ofOpposition"/></ref>
[[3]]
<ref ana="whatAction #ktu1-3_ii_l8_tṣmt #verb.emotion #humiliation">Action belongs to verb emotion, subcategory humiliation</ref>
#print
[1] "whatAction #ktu1-3_ii_l6b_tḫtṣb #verb.competition #contend"
[1] "whatAction #ktu1-3_ii_l7_tmḫṣ #verb.emotion #humiliation"
[1] "whatAction #ktu1-3_ii_l8_tṣmt #verb.emotion #humiliation"
I only need a few of the values in the @ana attribute (values 2 and 3), printed for example as:
[1] "#ktu1-3_ii_l6b_tḫtṣb #contend"
[1] "#ktu1-3_ii_l7_tmḫṣ #humiliation"
[1] "#ktu1-3_ii_l8_tṣmt #humiliation"
I have made several attempts. One of them is shown below, but it does not work:
interpRef_ana <- for (i in 1:length(interpRef)) print(paste(xmlGetAttr(interpRef[[i]],"ana",[2:3])))
==== XML example ====
Each <ref> sits inside an <interp>, and every @ana follows the same hierarchy, with its vocabulary drawn from a predefined taxonomy.
<interp xml:id="ktu1-3_ii_l6b_int" ana="#ktu1-3_ii_l6b" corresp="#ktu1-3_ii_6b">
<desc>
<ref ana="whatAction #ktu1-3_ii_l6b_tḫtṣb #verb.competition #contend"
>Action belongs to verb competition subcategory contend
<stage ana="whatResult #result #defeate_ofOpposition" />
</ref>
<castList>
<castItem>
<persName type="character" ana="#whatCharacter #Character #ANT #Female">
<state ana="#whatRole #active" />ʾAnatu
</persName>
</castItem>
</castList>
<view>
<placeName ana="#whatContext #battle">battle
<location ana="#whatSphere #outside" />
</placeName>
</view>
<stage ana="#whatBehavior">
<span ana="#toDestroy #five_dD #rage">Voluntary
intentionality, to destroy of her free will, with rage
(level five).</span>
<span ana="#AffectEntity_and_other">The result of action has
an impact on ʾAnatu and others</span>
</stage>
</desc>
</interp>
<interp xml:id="ktu1-3_ii_l7_int" ana="#ktu1-3_ii_l7" corresp="#ktu1-3_ii_l7">
<desc>
<ref ana="whatAction #ktu1-3_ii_l7_tmḫṣ #verb.emotion #humiliation"
>Action belongs to verb emotion, subcategory humiliation
<stage ana="whatResult #result #defeate_ofOpposition" />
</ref>
<castList>
<castItem>
<persName type="character" ana="#whatCharacter #Character #ANT #Female">
<state ana="#whatRole #active" />ʾAnatu
</persName>
<persName type="character" cert="low" ana="#Character #UNK #Unknown">
<state ana="#behav #passive" />People from the West
</persName>
</castItem>
</castList>
<view>
<placeName ana="#whatContext #battle">battle
<location ana="#whatSphere #outside" />outside her household
</placeName>
</view>
<stage ana="#whatBehavior">
<span ana="#toDestroy #free #five_dD">Voluntary
intentionality, to destroy of her free will, with rage
(level five)Five.</span>
<span ana="#affectEntity_and_other">The result of action has
an impact on ʾAnatu and others</span>
</stage>
</desc>
</interp>
==== Update ====
I tried the stringr library, and in principle it works: I can select the attribute values I need:
x <- for (i in 1:length(interp)) print((cbind((y=(KTU = (xmlGetAttr(interp[[i]],"ana")))), (z=(verb.category = (xmlGetAttr(interpRef[[i]],"ana")))))))
x1 <- print (cbind(word(word(y,-1)),(word(z, -3, -2))))
x1
> x <- for (i in 1:length(interp)) print((cbind((y=(KTU = (xmlGetAttr(interp[[i]],"ana")))), (z=(verb.category = (xmlGetAttr(interpRef[[i]],"ana")))))))
[,1] [,2]
[1,] "#ktu1-3_ii_l5b-6a" "whatAction #ktu1-3_ii_l5b-6a_tmtḫṣ #verb.competition #contend"
[,1] [,2]
[1,] "#ktu1-3_ii_l6b" "whatAction #ktu1-3_ii_l6b_tḫtṣb #verb.competition #contend"
[,1] [,2]
[1,] "#ktu1-3_ii_l7" "whatAction #ktu1-3_ii_l7_tmḫṣ #verb.emotion #humiliation"
[,1] [,2]
[1,] "#ktu1-3_ii_l8" "whatAction #ktu1-3_ii_l8_tṣmt #verb.emotion #humiliation"
[,1] [,2]
[1,] "ktu1-3_ii_l11b_12a" "whatAction #ktu1-3_ii_l11b-12a_ʿtkt #put_together #action"
[,1] [,2]
[1,] "#ktu1-3_ii_l12b_13a" "whatAction #ktu1-3_ii_l12b-13a_šnst #put_together #action"
[,1] [,2]
[1,] "#ktu1-3_ii_l13b_14a" "whatAction #ktu1-3_ii_l13b-14a_tġlt #action #movement"
[,1] [,2]
[1,] "#ktu1-3_ii_l15b_16a" "whatAction #ktu1-3_ii_l5b_6a_tmtḫṣ #confrontation #action"
> x
NULL
> x1 <- print (cbind(word(word(y,-1)),(word(z, -3, -2))))
[,1] [,2]
[1,] "#ktu1-3_ii_l15b_16a" "#ktu1-3_ii_l5b_6a_tmtḫṣ #confrontation"
> x1
[,1] [,2]
[1,] "#ktu1-3_ii_l15b_16a" "#ktu1-3_ii_l5b_6a_tmtḫṣ #confrontation"
But this returns the attribute values of a single occurrence, not a list. So I tried adding for (i in 1:length(interp)):
x1 <- for (i in 1:length(interp)) print (cbind(word(word(y,-1)),(word(z, -3, -2))))
> x1 <- for (i in 1:length(interp)) print (cbind(word(word(y,-1)),(word(z, -3, -2))))
[,1] [,2]
[1,] "#ktu1-3_ii_l15b_16a" "#ktu1-3_ii_l5b_6a_tmtḫṣ #confrontation"
[,1] [,2]
[1,] "#ktu1-3_ii_l15b_16a" "#ktu1-3_ii_l5b_6a_tmtḫṣ #confrontation"
[,1] [,2]
[1,] "#ktu1-3_ii_l15b_16a" "#ktu1-3_ii_l5b_6a_tmtḫṣ #confrontation"
[,1] [,2]
[1,] "#ktu1-3_ii_l15b_16a" "#ktu1-3_ii_l5b_6a_tmtḫṣ #confrontation"
[,1] [,2]
[1,] "#ktu1-3_ii_l15b_16a" "#ktu1-3_ii_l5b_6a_tmtḫṣ #confrontation"
[,1] [,2]
[1,] "#ktu1-3_ii_l15b_16a" "#ktu1-3_ii_l5b_6a_tmtḫṣ #confrontation"
[,1] [,2]
[1,] "#ktu1-3_ii_l15b_16a" "#ktu1-3_ii_l5b_6a_tmtḫṣ #confrontation"
[,1] [,2]
[1,] "#ktu1-3_ii_l15b_16a" "#ktu1-3_ii_l5b_6a_tmtḫṣ #confrontation"
> x1
I just get the same occurrence repeated 8 times (= the actual number of occurrences).
Thanks in advance for your help.
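The repetition described above can be reproduced in base R. The sketch below assumes that y and z still held the values left by the last iteration of the earlier loop, which the question does not state outright. The strings in ana are shortened, hypothetical stand-ins for the real attribute values.

```r
# Hypothetical stand-ins for the attribute strings read from the XML
ana <- c("whatAction #a_one #verb.competition #contend",
         "whatAction #b_two #verb.emotion #humiliation")

# A for loop evaluates to NULL, so assigning the loop itself stores nothing
res <- for (i in seq_along(ana)) z <- ana[i]
is.null(res)

# After the loop, z holds only the value from the last iteration
z

# A loop body that never uses i repeats that single value on every pass
repeated <- character(0)
for (i in seq_along(ana)) repeated <- c(repeated, z)

# Indexing by i inside the body collects one value per element instead
collected <- character(length(ana))
for (i in seq_along(ana)) collected[i] <- ana[i]
```

Under that assumption, the fix is to read the attribute and store the result inside the same loop body, indexed by i.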
Answer 0 (score: 0)
I found a solution; maybe it will help:
library(XML)      # xmlGetAttr()
library(stringr)  # word()
listInterp <- list()
for (i in seq_along(interp)) {
  y <- xmlGetAttr(interp[[i]], "ana")     # KTU reference on <interp>
  z <- xmlGetAttr(interpRef[[i]], "ana")  # verb category on <ref>
  # keep the last word of y and the 2nd and 3rd words of z
  listInterp[[i]] <- paste(word(y, -1), word(z, -3, -2), sep = ": ")
}
listInterp <- lapply(listInterp, gsub, pattern = "#", replacement = "")  # strip the "#" prefixes
listInterp
#result
[[1]]
[1] "ktu1-3_ii_l5b-6a: ktu1-3_ii_l5b-6a_tmtḫṣ verb.competition"
[[2]]
[1] "ktu1-3_ii_l6b: ktu1-3_ii_l6b_tḫtṣb verb.competition"
[[3]]
[1] "ktu1-3_ii_l7: ktu1-3_ii_l7_tmḫṣ verb.emotion"
[[4]]
[1] "ktu1-3_ii_l8: ktu1-3_ii_l8_tṣmt verb.emotion"
[...]
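A possible alternative without stringr is to split each attribute string on whitespace and index the resulting tokens by position. This is only a sketch: pick_values() is a hypothetical helper written for this example, not a function from any package, and the input strings are copied from the question rather than read from the XML.

```r
# Attribute strings copied from the question; with the XML package they could
# presumably be collected as: ana <- sapply(interpRef, xmlGetAttr, "ana")
ana <- c("whatAction #ktu1-3_ii_l6b_tḫtṣb #verb.competition #contend",
         "whatAction #ktu1-3_ii_l7_tmḫṣ #verb.emotion #humiliation",
         "whatAction #ktu1-3_ii_l8_tṣmt #verb.emotion #humiliation")

# pick_values() is a hypothetical helper, not part of any package:
# split each string on whitespace and keep the tokens at the given positions
pick_values <- function(x, positions) {
  tokens <- strsplit(x, "[[:space:]]+")
  vapply(tokens, function(tok) paste(tok[positions], collapse = " "),
         character(1))
}

picked <- pick_values(ana, c(2, 4))
print(picked)
```

With positions c(2, 4) this keeps the identifier and the subcategory, as in the output requested in the question. With c(2, 3) it keeps the same two tokens as word(z, -3, -2) does for these four-token strings.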