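I scraped a large number of words from a dictionary and used them to create one huge CSV file, with one word per row.
I have another function that reads from that huge CSV file and creates smaller CSV files.
The function is supposed to create CSV files of only 500 words/rows each, but something is off. The first file has 501 words/rows. The rest have 502 words/rows.
Man, maybe I'm tired, but I can't seem to spot what exactly is causing this in my code. Or is there nothing wrong with my code at all?
Below is the part of the function that I assume is causing the problem. The full function can be seen further below.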
def create_csv_files():
    limit = 500
    count = 0
    filecount = 1
    zfill = 3
    filename = 'C:\\Users\\Anthony\\Desktop\\Scrape\\Dictionary\\terms{}.csv'.format('1'.zfill(zfill))
    with open('C:\\Users\\Anthony\\Desktop\\Scrape\\Results\\dictionary.csv') as readfile:
        csvReader = csv.reader(readfile)
        for row in csvReader:
            term = row[0]
            if ' ' in term:
                term = term.replace(' ', '')
            if count <= limit:
                count += 1
            else:
                count = 0
                filecount += 1
                filename = 'C:\\Users\\Anthony\\Desktop\\Scrape\\Dictionary\\terms{}.csv'.format(str(filecount).zfill(zfill))
            aw = 'a' if os.path.exists(filename) else 'w'
            with open(filename, aw, newline='') as writefile:
                fieldnames = [ 'term' ]
                writer = csv.DictWriter(writefile, fieldnames=fieldnames)
                writer.writerow({
                    'term': term
                })

Full function:

def create_csv_files():
    limit = 500
    count = 0
    filecount = 1
    zfill = 3
    idiomsfilename = 'C:\\Users\\Anthony\\Desktop\\Scrape\\Dictionary\\idioms.csv'
    filename = 'C:\\Users\\Anthony\\Desktop\\Scrape\\Dictionary\\terms{}.csv'.format('1'.zfill(zfill))
    with open('C:\\Users\\Anthony\\Desktop\\Scrape\\Results\\dictionary.csv') as readfile:
        csvReader = csv.reader(readfile)
        for row in csvReader:
            term = row[0]
            if 'idiom' in row[0] and row[0] != ' idiom':
                term = row[0][:-5]
                aw = 'a' if os.path.exists(idiomsfilename) else 'w'
                with open(idiomsfilename, aw, newline='') as idiomsfile:
                    idiomsfieldnames = ['idiom']
                    idiomswriter = csv.DictWriter(idiomsfile, fieldnames=idiomsfieldnames)
                    idiomswriter.writerow({
                        'idiom':term
                    })
                continue
            else:
                if ' ' in term:
                    term = term.replace(' ', '')
                if count <= limit:
                    count += 1
                else:
                    count = 0
                    filecount += 1
                    filename = 'C:\\Users\\Anthony\\Desktop\\Scrape\\Dictionary\\terms{}.csv'.format(str(filecount).zfill(zfill))
                aw = 'a' if os.path.exists(filename) else 'w'
                with open(filename, aw, newline='') as writefile:
                    fieldnames = [ 'term' ]
                    writer = csv.DictWriter(writefile, fieldnames=fieldnames)
                    writer.writerow({
                        'term': term
                    })
                print(term)
Answer 0 (score: 2)
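The files end up with odd row counts because of your if-else condition.
You increment count whenever count is less than or equal to limit. On the very first iteration you increment to 1, then write your first term, then increment again, and so on. Because you use <= rather than a strict inequality, you still increment at count = 500 and write the 501st word.
From the second file onward, the first word is written at count = 0, because the else branch resets the counter but the row is still written. The cycle ends again at count = 501, so this time you write 502 words.
To fix this, check for count >= limit and create a new file when that is true. Increment count after writing to the CSV file rather than before. That should help.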
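The off-by-one can be reproduced without touching any files. The sketch below is only an illustration: simulate, total_rows, and the fixed flag are hypothetical names that are not part of the original code. It replays the counter logic in memory and tallies how many rows each output file would receive.

```python
def simulate(total_rows, limit=500, fixed=False):
    """Return the number of rows each output file would receive."""
    counts = {}      # file number -> rows "written" to it
    count = 0
    filecount = 1
    for _ in range(total_rows):
        if fixed:
            # Corrected logic: roll over first, count after the write.
            if count >= limit:
                count = 0
                filecount += 1
        else:
            # Original logic: count before the write, compared with <=.
            if count <= limit:
                count += 1
            else:
                count = 0
                filecount += 1
        counts[filecount] = counts.get(filecount, 0) + 1  # stands in for the write
        if fixed:
            count += 1
    return [counts[k] for k in sorted(counts)]


print(simulate(1600))              # [501, 502, 502, 95]
print(simulate(1600, fixed=True))  # [500, 500, 500, 100]
```

With 1600 input rows, the original logic yields 501 rows in the first file and 502 in each later full file, which matches the counts reported in the question.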
import csv
import os


def create_csv_files():
    limit = 500
    count = 0
    filecount = 1
    zfill = 3
    filename = 'C:\\Users\\Anthony\\Desktop\\Scrape\\Dictionary\\terms{}.csv'.format('1'.zfill(zfill))
    with open('C:\\Users\\Anthony\\Desktop\\Scrape\\Results\\dictionary.csv') as readfile:
        csvReader = csv.reader(readfile)
        for row in csvReader:
            term = row[0]
            if ' ' in term:
                term = term.replace(' ', '')
            # The old `if count <= limit` branch is removed; only the former
            # `else` body remains, now guarded by `count >= limit`.
            if count >= limit:
                count = 0
                filecount += 1
                filename = 'C:\\Users\\Anthony\\Desktop\\Scrape\\Dictionary\\terms{}.csv'.format(str(filecount).zfill(zfill))
            aw = 'a' if os.path.exists(filename) else 'w'
            with open(filename, aw, newline='') as writefile:
                fieldnames = [ 'term' ]
                writer = csv.DictWriter(writefile, fieldnames=fieldnames)
                writer.writerow({
                    'term': term
                })
            count += 1  # Increment here, after the row has been written
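
As an alternative design, and not what the code above does, the counter can be avoided entirely by reading the rows in fixed-size chunks with itertools.islice. This is a minimal sketch under assumptions: split_csv, source_path, and out_dir are hypothetical names, blank rows are skipped, and each output file is opened once in 'w' mode, so unlike the append logic above it overwrites existing files.

```python
import csv
import os
from itertools import islice


def split_csv(source_path, out_dir, limit=500, zfill=3):
    """Write column 0 of source_path into files of at most `limit` rows each."""
    written = []
    with open(source_path, newline='') as readfile:
        # Skip blank rows so that row[0] is always available.
        rows = (row for row in csv.reader(readfile) if row)
        filecount = 0
        while True:
            chunk = list(islice(rows, limit))  # next `limit` rows, fewer at the end
            if not chunk:
                break
            filecount += 1
            name = 'terms{}.csv'.format(str(filecount).zfill(zfill))
            path = os.path.join(out_dir, name)
            with open(path, 'w', newline='') as writefile:
                writer = csv.writer(writefile)
                for row in chunk:
                    # Same clean-up as the original: drop spaces from the term.
                    writer.writerow([row[0].replace(' ', '')])
            written.append(path)
    return written
```

Because every chunk holds at most limit rows by construction, there is no comparison operator to get wrong, and each file is opened once per chunk instead of once per row.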