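I have a large txt file containing strings. I want to apply several functions to these strings in the shortest possible time, so I thought of using threads. Suppose I have the following two functions: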
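def length(string):
    # Print the number of characters in the string.
    print(len(string))

def numberOfPoints(string):
    # Print how many '.' characters the string contains.
    print(string.count('.'))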
Without threads, I would do this:
with open('bigfile.txt', 'r') as f:
    samples = f.read().splitlines()

for s in samples:
    length(s)
    numberOfPoints(s)
My problem comes up when I want to use threads. I would do this:
import threading

with open('bigfile.txt', 'r') as f:
    samples = f.read().splitlines()

for s in samples:
    # Two new threads are created (and joined) for every single string.
    thread1 = threading.Thread(target=length, args=(s,))
    thread2 = threading.Thread(target=numberOfPoints, args=(s,))
    thread1.start()
    thread2.start()
    thread1.join()
    thread2.join()
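But doing it this way, I feel I am missing the inherent benefit that threads provide. I want one thread to compute the length of every string in the file, while another thread counts the points in every string in the file. In my loop, I am creating (2 * number of strings in the file) threads. How can I load all the strings from the file without reading the file twice, like this: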
import threading

def length(samples):
    # Print the length of every string.
    for s in samples:
        print(len(s))

def numberOfPoints(samples):
    # Print the number of '.' characters in every string.
    for s in samples:
        print(s.count('.'))

# Read the file once; both threads share the same list.
with open('bigfile.txt', 'r') as f:
    samples = f.read().splitlines()

thread1 = threading.Thread(target=length, args=(samples,))
thread2 = threading.Thread(target=numberOfPoints, args=(samples,))
thread1.start()
thread2.start()
thread1.join()
thread2.join()
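For reference, here is a minimal sketch of the same two-thread idea in which each worker appends its results to its own list instead of printing, so the output of the two threads cannot interleave. The names `lengths`, `point_counts`, `run_both`, and the `demo` list are hypothetical and only stand in for the functions and file contents above. Note also that in CPython the global interpreter lock lets only one thread execute Python bytecode at a time, so for CPU-bound work like this the sketch shows the structure I am after and is not necessarily faster than the plain loop.

```python
import threading


def lengths(samples, out):
    # Worker 1: append the length of every string to `out`.
    for s in samples:
        out.append(len(s))


def point_counts(samples, out):
    # Worker 2: append the number of '.' characters in every string to `out`.
    for s in samples:
        out.append(s.count('.'))


def run_both(samples):
    # Each thread writes only to its own list, so no lock is needed here.
    length_results, point_results = [], []
    t1 = threading.Thread(target=lengths, args=(samples, length_results))
    t2 = threading.Thread(target=point_counts, args=(samples, point_results))
    t1.start()
    t2.start()
    t1.join()
    t2.join()
    return length_results, point_results


if __name__ == "__main__":
    # Hypothetical in-memory sample standing in for the lines of bigfile.txt.
    demo = ["a.b.c", "hello", "x."]
    print(run_both(demo))
```

Only two threads are created in total, regardless of how many strings the list holds, and the list is built once and shared by both.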