Question

Community,

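I am struggling to append two sublists (p and t) that I extract from a text file. The print(p, t) call works, but the append call after it does not (I also tried output.extend([p, t])). The lists contain: p = pronouns (spoken by the test persons) and t = test persons (abbreviated as VP + number). It would also be great to get not only the pronoun but also the line it occurs in, which unfortunately does not work with the current code. I also get an indentation error that my colleague, using the same code, does not get.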

Thanks!

import re

    with open (r'./Transliteration_Task1_DE.txt', 'r')as file:

        pro=["ich", "mir", "mich", "wir", "uns", "du", "dir", "dich"]
        t=""    #variable for testpersons
        output=list()
        for line in file:
            words=list()
            words=line.split(" ")
            #print(words)
            if re.match(r'.*VP.*', line):
                t=line
                words=line.split(" ")
                #print(words)
            for w in words:
                #print(w)
                for p in pro:
                    if p == w:
                        print(p, t)
                        output.append([p,t])
        for o in output:
            print(output) #output should be a list with sublists (testpersons and pronouns)

Answer 1

Your code can be simplified:

pronouns = ["ich", "mir", "mich", "wir", "uns", "du", "dir", "dich"]
output = []

with open(r'./Transliteration_Task1_DE.txt', 'r') as file:
    for line_number, line in enumerate(file):
        words = line.split()  # Split the line on whitespaces such that words contains a list of words from the line.

        if "VP" in line:  # Only do something if the line contains "VP" - you don't need a regular expression.
            for pronoun in pronouns:  # Search all pronouns
                if pronoun in words:  # If the pronoun is in the list of words, append it to the output
                    print(pronoun, line_number, line)
                    output.append([pronoun, line_number, line])

for o in output:
    print(o)

To get the line number, simply wrap the file handle in enumerate.
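
As a small illustration of enumerate on a file-like object, the sketch below uses io.StringIO as a stand-in for the real file; the sample lines are invented. Passing start=1 is optional: it makes the numbering 1-based, whereas the default counts from 0.

```python
import io

# Stand-in for the opened text file; the content is invented for illustration.
sample_file = io.StringIO("VP1: ich gehe\nVP2: wir kommen\n")

numbered_lines = []
for line_number, line in enumerate(sample_file, start=1):
    # Iterating over a file keeps the trailing newline, so strip it here.
    numbered_lines.append((line_number, line.rstrip("\n")))

print(numbered_lines)
```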

To check whether the line contains the string VP, the in operator is the more Pythonic way.

The same goes for the second nested for loop: just use in to check whether the pronoun is contained in the list of words.
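
One detail worth keeping in mind is that in behaves differently on a string and on a list: on a string it is a substring test, on a list it compares whole elements. A minimal sketch with an invented sample line:

```python
line = "VP3: das freut mich sehr"  # invented sample line
words = line.split()

has_vp = "VP" in line          # substring test on the string
substring_hit = "ich" in line  # also a substring test: matches inside "mich"
word_hit = "ich" in words      # list membership: only whole words match

print(has_vp, substring_hit, word_hit)
```

This is why the pronoun check is done against the list of words rather than against the raw line: otherwise "ich" would be reported for every line that contains "mich" or "dich".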

Also, it helps to use more readable variable names. Single-character names are usually confusing and hard to read.

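Also, keep in mind that your input lines may contain punctuation or mixed upper/lower case that you may need to strip. If you want the match to be case-insensitive, convert all words to lowercase (see the lower() method of str).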
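
A rough sketch of such a normalization step is shown below. The helper name normalize_words and the sample line are made up for illustration, and string.punctuation only covers ASCII punctuation, so typographic quotes or similar characters would need extra handling.

```python
import string

pronouns = {"ich", "mir", "mich", "wir", "uns", "du", "dir", "dich"}


def normalize_words(line):
    """Hypothetical helper: lowercase each word and strip surrounding ASCII punctuation."""
    cleaned = (word.strip(string.punctuation).lower() for word in line.split())
    # Drop tokens that consisted of punctuation only.
    return [word for word in cleaned if word]


line = "VP7: Ich glaube, das hilft uns!"  # invented sample line
found = [word for word in normalize_words(line) if word in pronouns]
print(found)
```

Note that lowercasing the words also turns "VP7:" into "vp7", so the "VP" check should still be done on the original line.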

Answer 2

If that is what you want, you can join two lists with the + operator:

>>> p = [0, 1]
>>> q = [2, 3]
>>> p + q
[0, 1, 2, 3]

Or unpack the elements with * (asterisk) inside a new list (iterable unpacking, Python 3.5+):

>>> [*p, *q]
[0, 1, 2, 3]

Or use the .extend() list method, which modifies p in place:

>>> p.extend(q)
>>> print(p)
[0, 1, 2, 3]
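
Since the question asks for a list of [pronoun, testperson] sublists, it may help to contrast append with extend. The sketch below uses invented values: append adds its argument as a single element, which keeps the pairs together, while extend adds each item of its argument separately and so flattens them.

```python
pronoun = "ich"
testperson = "VP1"

nested = []
nested.append([pronoun, testperson])  # adds ONE element: the sublist itself
nested.append(["wir", "VP2"])

flat = []
flat.extend([pronoun, testperson])    # adds TWO elements: the items of the sublist
flat.extend(["wir", "VP2"])

print(nested)  # [['ich', 'VP1'], ['wir', 'VP2']]
print(flat)    # ['ich', 'VP1', 'wir', 'VP2']
```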

Joining two sublists
