Question

Here is my Python code. It operates on data (a list of lines) that looks like this:

1 Check _ VERB VB _ 0 ROOT _ _ 
2 out _ PRT RP _ 1 prt _ _ 
3 this _ DET DT _ 4 det _ _ 
4 video _ NOUN NN _ 1 dobj _ _ 
5 about _ ADP IN _ 4 prep _ _ 
6 Northwest _ NOUN NNP _ 7 nn _ _ 
7 Arkansas _ NOUN NNP _ 5 pobj _ _ 
8 - _ . , _ 7 punct _ _ 
9 https _ X ADD _ 7 appos _ _ 
10 : _ . : _ 4 punct _ _ 
11 hello _ NUM CD _ 12 num _ _ 
12 # _ NOUN NN _ 4 dep _ _ 
13 TEAMWALMART _ NOUN NNP _ 12 appos _ _ 

1 Check _ VERB VB _ 0 ROOT _ _ 
2 out _ PRT RP _ 1 prt _ _ 
3 this _ DET DT _ 6 det _ _ 
4 # _ NOUN NN _ 5 dep _ _ 
5 AMAZING _ VERB VBG _ 6 amod _ _ 
6 vide _ NOUN NN _ 1 dobj _ _ 
7 o _ NOUN NNP _ 6 partmod _ _ 
8 about _ ADP IN _ 7 prep _ _ 
9 Walmart _ NOUN NNP _ 8 pobj _ _ 
10 # _ . $ _ 9 dep _ _ 
11 MORETHANEXPECTED _ VERB VBN _ 9 dep _ _

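The Python code has two parts: 1. Read the lines and build the original text by collecting the second token from each of the lines above. 2. Create a word tag for each word, which looks like {'pos_tag': 'NNP', 'position': '13', 'dep_rel': 'appos', 'parent': '12', 'word': 'TEAMWALMART'}.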

Python code:

for line in data:
    if line:
        tweet.append(line)
    if not line:
        original_tweet = get_original_tweet(tweet)
        word_tags = get_word_tags(tweet)
        new_json_object['text'] = original_tweet
        new_json_object['tags'] = word_tags
        new_json_object_list.append(new_json_object.copy())
        del tweet[:]
print new_json_object_list

def get_original_tweet(tweet_lines):
    tweet_words = []
    for line in tweet_lines:
        tweet_words.append(line.split()[1])
    original_tweet = ' '.join(tweet_words)
    return original_tweet

def get_word_tags(tweet_lines):
    word_tags = {}
    word_tags_list = []
    for line in tweet_lines:
        words = line.split()
        word_tags['word'] = words[1]
        word_tags['position'] = words[0]
        word_tags['pos_tag'] = words[4]
        word_tags['dep_rel'] = words[7]
        word_tags['parent'] = words[6]
        word_tags_list.append(word_tags.copy())
        word_tags.clear()
    return word_tags_list

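This code works, but it only processes the first group of lines, i.e. up to line 13 of the data. It ignores the second tweet. I don't know what I'm missing. The code looks correct to me. Can someone help me debug it?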

Answer 1

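I don't have enough reputation to comment, so I'll write a super short answer.

I think you just need to add an empty line at the end of your data list. The loop only processes a tweet when it reaches an empty line, so if the data ends right after the last tweet, that tweet is collected but never turned into an output object.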
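If you would rather not change the data, another option is to flush whatever is left in the buffer once the loop ends. The following is only a rough sketch, written for Python 3 (the question uses Python 2's print statement). The function name parse_tweets is made up for illustration, and the sample data is a shortened copy of the lines from the question with no trailing empty line.

```python
def get_original_tweet(tweet_lines):
    # The second whitespace-separated token on each line is the word itself.
    return ' '.join(line.split()[1] for line in tweet_lines)


def get_word_tags(tweet_lines):
    tags = []
    for line in tweet_lines:
        words = line.split()
        tags.append({
            'word': words[1],
            'position': words[0],
            'pos_tag': words[4],
            'dep_rel': words[7],
            'parent': words[6],
        })
    return tags


def parse_tweets(data):
    # Hypothetical helper name, not from the question.
    results = []
    tweet = []
    for line in data:
        if line.strip():
            tweet.append(line)
        elif tweet:
            # An empty line closes the current tweet.
            results.append({'text': get_original_tweet(tweet),
                            'tags': get_word_tags(tweet)})
            tweet = []
    if tweet:
        # Flush the last tweet when the data does not end with an empty line.
        results.append({'text': get_original_tweet(tweet),
                        'tags': get_word_tags(tweet)})
    return results


# Shortened sample taken from the question; note there is no trailing empty line.
data = [
    "1 Check _ VERB VB _ 0 ROOT _ _",
    "2 out _ PRT RP _ 1 prt _ _",
    "",
    "1 Check _ VERB VB _ 0 ROOT _ _",
    "2 out _ PRT RP _ 1 prt _ _",
    "3 this _ DET DT _ 6 det _ _",
]

result = parse_tweets(data)
print(len(result))
```

With the final flush in place, both tweets come back even though the list does not end with an empty line. Building a fresh list and dict for each tweet also avoids the copy() and clear() bookkeeping from the original code.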
