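My idea is to take a line of WordNet text, assign each part of the line to its own variable, and feed those variables into an RDFLib graph as triples.
Here is a sample line from the text file: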
13797906 23 n 04 flood 0 inundation 0 deluge 0 torrent 0 005 @ 13796604 n 0000 + 00603894 a 0401 + 00753137 v 0302 + 01527311 v 0203 + 02361703 v 0101 | an overwhelming number or amount; "a flood of requests"; "a torrent of abuse"
Here is my code:
from rdflib import URIRef, Graph
from StringIO import StringIO
G = Graph()
F = open("new_2.txt", "r")
for line in F:
    L = line.split()
    L2 = line.strip().split('|')
    synset_offset = L[0]
    lex_filenum = L[1]
    ss_type = L[2]
    gloss = L2[1]
    before_at, after_at = line.split('@', 1)
    N = int(L[3])
    K = int(before_at.split()[-1])
    word = L[4:4 + 2 * N:2]
    iw = iter(word)
    S = after_at.split()[0:0 + 4 * K:4]
    ip = iter(S)
    SS = after_at.split()[1:1 + 4 * K:4]
    iss = iter(SS)
    ST = after_at.split()[2:2 + 4 * K:4]
    ist = iter(ST)
    line1 = '''http://www.example.org/lexicon#'''+synset_offset+''' http://www.w3.org/1999/02/22-rdf-syntax-ns#lex_filenum '''+lex_filenum+''''''
    line2 = '''http://www.example.org/lexicon#'''+synset_offset+''' http://www.monnetproject.eu/lemon#ss_type '''+ss_type+''''''
    line3 = ''''''
    #line4 = '''http://www.example.org/lexicon#'''+synset_offset+''' http://www.monnetproject.eu/lemon#gloss '''gloss'''
    for item in word:
        line3 += '''http://www.example.org/lexicon#'''+synset_offset+''' http://www.monnetproject.eu/lemon#lexical_entry '''+iw.next()+'''\n'''
    line5 = ''''''
    for item in S:
        line5 += '''http://www.example.org/lexicon#'''+synset_offset+''' http://www.monnetproject.eu/lemon#has_ptr '''+ip.next()+'''\n'''
    line6 = ''''''
    for item in SS:
        line6 += '''http://www.example.org/lexicon#'''+ip.next()+''' http://www.monnetproject.eu/lemon#pos '''+iss.next()+'''\n'''
    line7 = ''''''
    for item in ST:
        line7 += '''http://www.example.org/lexicon#'''+ip.next()+''' http://www.monnetproject.eu/lemon#source_target '''+ist.next()+'''\n'''
    contents = '''\
'''+line1+'''
'''+line2+'''
'''+line3+'''
'''+line5+'''
'''+line6+'''
'''+line7+''''''#'''+line4+'''
    tabfile = StringIO(contents)
    for line in tabfile:
        triple = line.split()
        triple = (URIRef(t) for t in triple)
        G.add(triple)
print G.serialize(format='nt')
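Everything works perfectly up to line5. (line4 is commented out for other reasons; I don't need it yet.)
This is the error I get when I include line5, line6, and line7: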
G.add(triple)
File "/usr/lib/python2.7/site-packages/rdflib-4.1_dev-py2.7.egg/rdflib/graph.py", line 352, in add
def add(self, (s, p, o)):
ValueError: need more than 0 values to unpack
I don't understand what difference between line3 and line5 could cause the error; line3 works flawlessly!
Answer 0 (score: 0)
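It looks like S = after_at.split()[0:0 + 4 * K:4] produces an empty result, which would mean S is an empty list. Also, although you iterate over every item with for item in S, item is never actually used inside that loop!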
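Whether S is really empty is easy to check directly. The sketch below is written for Python 3 (the question's code is Python 2) and leaves RDFLib out so it runs on its own; parse_pointers and to_triples are hypothetical helper names, not part of any library. It applies the same slicing to the sample line from the question, then shows a second thing that may be worth ruling out: line3 already ends with a newline and contents adds another one before line5, so a blank row appears, and a blank row splits into zero fields, which would fit "need more than 0 values to unpack".

```python
# Python 3 sketch; the question's code targets Python 2.
SAMPLE = ('13797906 23 n 04 flood 0 inundation 0 deluge 0 torrent 0 '
          '005 @ 13796604 n 0000 + 00603894 a 0401 + 00753137 v 0302 '
          '+ 01527311 v 0203 + 02361703 v 0101 | an overwhelming number '
          'or amount; "a flood of requests"; "a torrent of abuse"')


def parse_pointers(line):
    """Return (offset, pos, source_target) tuples, one per pointer.

    Hypothetical helper: uses the same slicing as S, SS and ST in the
    question, but zips the three slices instead of sharing one iterator.
    """
    before_at, after_at = line.split('@', 1)
    k = int(before_at.split()[-1])        # pointer count, "005" -> 5
    fields = after_at.split()
    offsets = fields[0:4 * k:4]           # S in the question
    pos_tags = fields[1:1 + 4 * k:4]      # SS in the question
    source_targets = fields[2:2 + 4 * k:4]  # ST in the question
    return list(zip(offsets, pos_tags, source_targets))


def to_triples(contents):
    """Split text into 3-item tuples, skipping blank rows (hypothetical)."""
    triples = []
    for row in contents.splitlines():
        parts = row.split()
        if not parts:
            continue                      # a blank row would unpack to 0 values
        if len(parts) != 3:
            raise ValueError('expected 3 fields, got %d: %r' % (len(parts), row))
        triples.append(tuple(parts))
    return triples


pointers = parse_pointers(SAMPLE)
print(pointers[0])   # ('13796604', 'n', '0000')
print(len(pointers))  # 5

# Same shape as the question's contents: a row ending in "\n", then
# another "\n" added between the pieces, which leaves a blank row.
contents = 's p o1\n' + '\n' + 's p o2\n'
print([row.split() for row in contents.splitlines()])
# [['s', 'p', 'o1'], [], ['s', 'p', 'o2']]
print(to_triples(contents))
# [('s', 'p', 'o1'), ('s', 'p', 'o2')]
```

For the sample line the three slices are not empty, so if other lines of new_2.txt look the same, the blank rows are the more likely trigger. Zipping the slices also avoids calling ip.next() in all three loops: in the question, ip is used up by the line5 loop, so the line6 and line7 loops would have nothing left to read from it.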