Question

I have a .txt file containing sentences of variable length, each prefixed with its own ID, like this:

MR1 firstWord secondWord thirdword fourthWord
MR2 some sentence written again.
MR3 some other sentence with variable length of words.

I want to convert it into a .csv file with two columns:

MR1 firstWord
MR1 secondWord
MR1 thirdword
MR1 fourthWord
MR2 some
MR2 sentence
.....
....
....

My approach was to use a nested loop, but with my logic the output file looks like this:

MR1 firstWord secondWord thirdword fourthWord
MR1 some sentence written again.
MR1 some other sentence with variable length of words.
MR2 firstWord secondWord thirdword fourthWord
MR2 some sentence written again.
MR2 some other sentence with variable length of words.
MR3 ....

Every ID ends up paired with every sentence in the file, which is obviously wrong.

Any help in achieving the expected result above would be greatly appreciated. Thanks.

Answer 1

You could do something like this:

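Split each line and store the result in a variable.
Iterate over the elements of the list, starting from index 1.
On each iteration, print element 0 together with the current element of the sliced list.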

Example:

>>> s = 'MR1 firstWord secondWord thirdword fourthWord'
>>> for i in s.split()[1:]:
        print(s.split()[0], i)


MR1 firstWord
MR1 secondWord
MR1 thirdword
MR1 fourthWord

The exact code would be:

with open("file", "r") as myfile:
    for line in myfile:
        # The first token is the ID; the remaining tokens are the words.
        m = line.split()
        for i in m[1:]:
            print(m[0], i)
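The loop above prints the pairs instead of writing a file. Since the question asks for a .csv file, here is a minimal sketch of the same idea using the standard `csv` module. The names `split_pairs` and `txt_to_csv` are made up for illustration, and UTF-8 encoding is an assumption about the input file.

```python
import csv


def split_pairs(lines):
    """Yield (id, word) pairs; the first token of each line is taken as the ID."""
    for line in lines:
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        for word in tokens[1:]:
            yield tokens[0], word


def txt_to_csv(src_path, dst_path):
    """Write one (id, word) row per word of the source file (hypothetical helper)."""
    # newline="" lets the csv module control line endings itself.
    with open(src_path, "r", encoding="utf-8") as src, \
         open(dst_path, "w", newline="", encoding="utf-8") as dst:
        csv.writer(dst).writerows(split_pairs(src))
```

Calling `txt_to_csv("file", "output.csv")` should then produce one row per word, with the ID in the first column. The output name `output.csv` is only a placeholder.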

Convert sentences of different lengths in a .txt file into a 2-column .csv file using Python
