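I have code that works for one specified file. How do I iterate the same functionality over multiple files?
The following code works for the file test3.txt. I have multiple files in one folder (test1.txt, test2.txt, test3.txt ...). Could you help me loop over each file? I believe I have to change lines 6-7. Please help. I am new to Python...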
import os,csv,datefinder,re
import numpy as np
os.chdir('C:\Users\dul\Dropbox\Article')
with open("test3.txt", 'r') as file1:
    text1=file1.read()
#locate the date of the article
matches = list(datefinder.find_dates(text1))
if len(matches) > 0:
    date=matches[1]
    strdate = str(date)
else:
    print 'No dates found'
#locate the name of the company
matchcomp = re.search(r'Keywords:([^,]*)(,|$)', text1).group(1).strip()
#count the number of words in the article
matchcount = re.search(r'(.*) words', text1).group(1).strip()
#determine the article
def matchwho():
    if 'This story was generated by' in text1:
        return('1')
    elif 'This story includes elements generated' in text1:
        return('2')
    elif 'Elements of this story were generated' in text1:
        return('2')
    elif 'Portions of this story were generated' in text1:
        return('2')
    else:
        return('3')
matchw =str(matchwho())
#list the returns in a line
combid = matchcomp + "," + strdate + "," + matchw + "," + matchcount
#save in txt format
with open('outfile.txt', 'w') as outfile:
    outfile.write(combid)
I would like the returns for each file to be appended to outfile.txt.
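Answer 0: (score: 0)
First, move everything from line 6 onward (starting at the with open("test3.txt", 'r') line) into a new function named process_file that takes a parameter filename, then replace test3.txt inside that function with filename.
Now you can write the following at the end of the script: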
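# raw string so the backslashes in the Windows path are not treated as escapes
for f in os.listdir(r'C:\Users\dul\Dropbox\Article'):
    process_file(f)
That will do the job. Note that the bare file names only resolve because the script has already called os.chdir on the same folder.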
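A minimal sketch of that pattern, in case it helps. The body of process_file here is only a placeholder (it counts words instead of extracting the date and company), and process_folder and OUTPUT_NAME are names I made up for illustration. It opens the output file in append mode and skips the output file itself, since it lives in the same folder as the articles:

```python
import os

OUTPUT_NAME = 'outfile.txt'  # same output file name as in the question


def process_file(filename):
    """Return one summary line for a file (placeholder logic only)."""
    with open(filename, 'r') as infile:
        text = infile.read()
    # Placeholder: the real extraction (date, company, word count) goes here.
    return os.path.basename(filename) + ',' + str(len(text.split()))


def process_folder(folder):
    """Append one line per .txt file in `folder` to the output file."""
    out_path = os.path.join(folder, OUTPUT_NAME)
    for name in sorted(os.listdir(folder)):
        # Skip non-text files and the output file itself.
        if not name.endswith('.txt') or name == OUTPUT_NAME:
            continue
        line = process_file(os.path.join(folder, name))
        # 'a' appends to the file instead of overwriting it.
        with open(out_path, 'a') as outfile:
            outfile.write(line + '\n')
```

Calling process_folder on the article folder would then add one line per text file; running it a second time adds the same lines again, because nothing is ever truncated.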
Answer 1: (score: 0)
How about wrapping all the code in a function that can be called once for each file:
import os,csv,datefinder,re
import numpy as np
# raw string so the backslashes in the Windows path are not treated as escapes
os.chdir(r'C:\Users\dul\Dropbox\Article')

def matchwho(text_to_match):
    if 'This story was generated by' in text_to_match:
        return '1'
    elif 'This story includes elements generated' in text_to_match:
        return '2'
    elif 'Elements of this story were generated' in text_to_match:
        return '2'
    elif 'Portions of this story were generated' in text_to_match:
        return '2'
    else:
        return '3'

def extract_data(filename):
    with open(filename, 'r') as file1:
        text1 = file1.read()
    #locate the date of the article
    matches = list(datefinder.find_dates(text1))
    #matches[1] is the second date found, so at least two dates are needed
    if len(matches) > 1:
        date = matches[1]
        strdate = str(date)
    else:
        print('No dates found')
        strdate = ''  #keep the output line well formed when no date is found
    #locate the name of the company
    matchcomp = re.search(r'Keywords:([^,]*)(,|$)', text1).group(1).strip()
    #count the number of words in the article
    matchcount = re.search(r'(.*) words', text1).group(1).strip()
    #determine the article
    matchw = str(matchwho(text1))
    #list the returns in a line
    combid = matchcomp + "," + strdate + "," + matchw + "," + matchcount
    #save in txt format; 'a' appends one line per file instead of overwriting
    with open('outfile.txt', 'a') as outfile:
        outfile.write(combid + '\n')

files = os.listdir('.')
for file in files:
    #skip outfile.txt so the output is not processed as an article
    if file.endswith('.txt') and file != 'outfile.txt':
        extract_data(file)
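*Note: I have not tested this code, because I don't have the .txt files being processed. There may be bugs, but I think it shows how to get the file names and feed them into a processing function. If this solves your problem, you can click the check mark next to the post :)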
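One likely source of such bugs is a file that lacks the Keywords: or words line: re.search then returns None, and calling .group(1) on it raises AttributeError. A small sketch of a guard, assuming you would rather write an empty field than stop the whole loop. safe_search is a hypothetical helper name, and the sample text is made up:

```python
import re


def safe_search(pattern, text, default=''):
    """Return group 1 of the first match, stripped, or `default` if none."""
    match = re.search(pattern, text)
    return match.group(1).strip() if match else default


# Made-up sample text, only to exercise the helper.
sample = 'Keywords: Acme Corp, earnings\n512 words'
company = safe_search(r'Keywords:([^,]*)(,|$)', sample)
count = safe_search(r'(.*) words', sample)
missing = safe_search(r'Keywords:([^,]*)(,|$)', 'no keywords here')
```

Inside extract_data, the two re.search(...).group(1).strip() calls could be replaced by safe_search with the same patterns.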