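Unable to extract text into lists correctly

Time: 2019-04-23 11:53:18

Tags: python python-3.x pathlib

I have a list of files, and I want to extract the text from each file and put each file's text into its own list. However, the output comes out as plain strings, which makes it hard to tell which text belongs to which file.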

-----------This is the list of files:

[WindowsPath('C:/Users/xxxx/Desktop/test_folder/final test.txt'),  WindowsPath('C:/Users/xxxx/Desktop/test_folder/iptest.txt'), WindowsPath('C:/Users/xxxx/Desktop/test_folder/New Text Document.txt'), WindowsPath('C:/Users/xxxx/Desktop/test_folder/test2.txt')]

-----------The output I get is:

rgerg



egfreg



secret

dafreagr 343.23.12.53.100 aefref
secret

grre

regreg



ergre

test is working

-----------The output I want is:

[['rgerg','egfreg','secret'],
['dafreagr 343.23.12.53.100 aefref'],
['secret','grre','regreg','ergre'],
['test is working']]

------------Alternatively, separate individual lists would also work:

['rgerg','egfreg','secret']
['dafreagr 343.23.12.53.100 aefref']
['secret','grre','regreg','ergre']
['test is working']

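-------------I have a function "loader()" that currently extracts the text of a file:

for items in txt_files:
    for item in loader(items):
        words = item
        print(words)

I could not get the desired output with either lists or dictionaries. I'm not sure what to do.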

Updated output:

Output of items:

C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\XXXX\Desktop\test\New Microsoft Word Document.docx
C:\Users\xxxx\Desktop\test\secretwe.docx
C:\Users\xxxx\Desktop\test\secretwe.docx
C:\Users\xxxx\Desktop\test\secretwe.docx
C:\Users\xxxx\Desktop\test\secretwe.docx
C:\Users\xxxx\Desktop\test\secretwe.docx
C:\Users\xxxx\Desktop\test\secretwe.docx
C:\Users\xxxx\Desktop\test\secretwe.docx

Output of item:

S
e
c
r
e
t












S
e
c
r
e
t

t
h
i
s

i
s

a

t
e
s
t

d
o
c
u
m
e
n
t

f
o
r

k
e
y
w
o
r
d

s
c
a
n
s
.




T
h
i
s

i
s

a

t
e
s
t
.




S
e
c
r
e
t
s
e
c
r
e
t

1 answer:

Answer 0: (score: 1)

You need to declare a result list and then append data to it.

For example:

result = []
for items in txt_files:
    temp = []
    for item in loader(items):
        temp.append(item)
    result.append(temp)

print(result)
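
Note that the snippet above only yields the desired result if loader() returns an iterable of lines. The updated output in the question suggests it returns a single string instead, and iterating over a string produces one character at a time. The following is a minimal sketch of that behaviour; the loader() defined here is a hypothetical stand-in that returns fixed text, not the asker's actual function.

```python
def loader(path):
    # Hypothetical stand-in for the asker's loader(): it ignores the path
    # and returns the whole "file" as one string.
    return "Secret\nthis is a test"

# Iterating over a string yields single characters, not lines.
chars = [item for item in loader("dummy.txt")]

# Splitting on line boundaries yields one entry per line instead.
lines = loader("dummy.txt").splitlines()

print(chars[:6])
print(lines)
```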

Edit based on the comments

result = []
for items in txt_files:
    result.append(loader(items).splitlines())
print(result)
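
One caveat: splitlines() keeps blank lines as empty strings, while the desired output in the question has none. A possible refinement, sketched below, drops blank lines and also builds a dictionary keyed by file name, since the question mentions trying dictionaries. The loader() stand-in, the sample contents, and the helper name non_empty_lines are assumptions made for illustration only.

```python
from pathlib import Path

# Hypothetical sample contents, loosely modelled on the output in the question.
SAMPLES = {
    "final test.txt": "rgerg\n\n\n\negfreg\n\n\n\nsecret\n",
    "iptest.txt": "dafreagr 343.23.12.53.100 aefref\n",
}

def loader(path):
    # Hypothetical stand-in for the asker's loader(): returns the file text as one string.
    return SAMPLES[Path(path).name]

def non_empty_lines(text):
    # Split into lines, strip surrounding whitespace, and skip blank lines.
    return [line.strip() for line in text.splitlines() if line.strip()]

txt_files = [Path("final test.txt"), Path("iptest.txt")]

# One list of lines per file, in the same order as txt_files.
result = [non_empty_lines(loader(p)) for p in txt_files]

# The same data keyed by file name, so each text can be traced back to its file.
by_file = {p.name: non_empty_lines(loader(p)) for p in txt_files}

print(result)
print(by_file)
```

The dictionary form addresses the original complaint directly, because every list of lines stays attached to the name of the file it came from.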