I'm trying to read the first 4 digits of each line in a .dat file and store them as I loop over the file. The .dat file looks like this:
0004 | IP
0006 | IP
0008 | IP
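I want to write a loop that reads the first four digits of each line, collects them on every iteration until the whole file has been read, and then writes them to an output file.
I wrote this, but all it basically does is convert the .dat to a csv: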
import csv

with open('stores.dat', 'r') as input_file:
    lines = input_file.readlines()
    newLines = []
    for line in lines:
        newLine = line.strip('|').split()
        newLines.append(newLine)

with open('file.csv', 'w') as output_file:
    file_writer = csv.writer(output_file)
    file_writer.writerows(newLines)
Answer 0 (score: 1)
Since you know you want to read 4 characters each time, you can just take a slice:
import csv

# you can open multiple file handles at the same time
# (newline='' lets the csv module control line endings itself)
with open('stores.dat', 'r') as input_file, \
        open('file.csv', 'w', newline='') as output_file:
    file_writer = csv.writer(output_file)
    # iterate over the file handle directly to get the lines
    for line in input_file:
        row = line[:4]  # slice the first 4 chars
        # make sure this is wrapped as a list otherwise
        # you'll get unsightly commas in your rows
        file_writer.writerow([row])
which outputs:
$ cat file.csv
0004
0006
0008
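If the leading field is not guaranteed to be exactly 4 characters wide, a fixed slice will cut it short or pick up part of the separator. A rough sketch of an alternative is to take everything before the first `|` instead. The helper name `leading_fields` and the sample list are made up for illustration, and this assumes every data line contains the `|` separator shown in the question:

```python
def leading_fields(lines, delimiter="|"):
    """Return the text before the first delimiter on each non-blank line."""
    fields = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        # partition() splits at the first occurrence of the delimiter only
        head, _, _ = line.partition(delimiter)
        fields.append(head.strip())
    return fields


# Hypothetical sample mirroring the lines shown in the question
sample = ["0004 | IP\n", "0006 | IP\n", "0008 | IP\n"]
print(leading_fields(sample))  # ['0004', '0006', '0008']
```

An open file handle can be passed in place of `sample`, since iterating over it also yields one line at a time.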
Answer 1 (score: 0)
If every line always starts with four digits, then it's simple:
with open('stores.dat', 'r') as input_file:
    lines = input_file.readlines()
    newLines = []
    for line in lines:
        newLine = line[:4]
        newLines.append(newLine)
Otherwise, you can use a regular expression to do the job, for example:
import re

with open('stores.dat', 'r') as input_file:
    lines = input_file.readlines()
    newLines = []
    for line in lines:
        # \d{4} matches a run of exactly four digits
        newLine = re.findall(r'\d{4}', line)[0]
        newLines.append(newLine)
Note that re.findall() returns a list containing all of the matches in the line, so the trailing [0] returns only the first match, i.e. the first element of that list.
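One caveat: indexing the result of re.findall() with [0] raises an IndexError on any line that has no match (a blank line, for instance). A minimal sketch that skips such lines instead, assuming the digits you want are always at the very start of the line. The names `FOUR_DIGITS` and `leading_codes` are just illustrative:

```python
import re

# Compile once: four digits, matched from the start of the line.
FOUR_DIGITS = re.compile(r"\d{4}")


def leading_codes(lines):
    """Collect the four leading digits of each line, skipping lines without them."""
    codes = []
    for line in lines:
        match = FOUR_DIGITS.match(line)  # match() only looks at the start of the string
        if match:
            codes.append(match.group())
    return codes


# Hypothetical sample: the middle line has no leading digits and is skipped
print(leading_codes(["0004 | IP\n", "no digits here\n", "0008 | IP\n"]))
```

Whether to skip bad lines silently or raise an error depends on how much you trust the input file; skipping is only one option.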