I have a text file from which I am parsing one column of data. The result is one big list (50 elements):
CLB, HNRG, LPI, MTDR, MVO, NRGY, PSE, PVR, RRC, WES, ACMP, ATLS, ATW, BP, BWP, COG, DGAS, DNR, EPB, EPL, EXLP, NOV, OIS, PNRG, SEP, APL, ARP, CVX, DMLP, DRQ, DWSN, EC, ECA, FTI, GLOG, IMO, LINE, NFX, OILT, PNG, QRE, RGP, RRMS, SDRL, SNP, TLP, VNR, XOM, XTXI, AHGP
Now I want a new line after every 10 elements in that list. My approach was to break the list onto a new line at every 10th comma. Here is my attempt:
import csv
import re

filename = input("Please enter file name to extract data from: ")
with open(filename) as f:
    next(f)
    data = f.readlines()
    my_list2 = []
    ticker_list = []
    for line in data:
        my_list = line.split()
        my_list2.append(my_list[1])
    for item in my_list2:
        ticker_list = ', '.join(my_list2)
    count = 0
    for item in ticker_list:
        if item == ",":
            count += 1
            if count == 10:
                ticker_list = [i.split('\n')[0] for i in ticker_list]
    print (ticker_list)
##with open("ticker_data.txt", "w") as file:
##    file.write(', '.join(ticker_list))
But it doesn't seem to work. Does anyone have a solution that would give me this result in a txt file:
CLB, HNRG, LPI, MTDR, MVO, NRGY, PSE, PVR, RRC, WES,
ACMP, ATLS, ATW, BP, BWP, COG, DGAS, DNR, EPB, EPL,
EXLP, NOV, OIS, PNRG, SEP, APL, ARP, CVX, DMLP, DRQ,
DWSN, EC, ECA, FTI, GLOG, IMO, LINE, NFX, OILT, PNG,
QRE, RGP, RRMS, SDRL, SNP, TLP, VNR, XOM, XTXI, AHGP
Thanks. I'm using Python 3, by the way.
Answer 0 (score: 1)
OK, using a file named rawdata.txt that looks like this:
CLB, HNRG, LPI, MTDR, MVO, NRGY, PSE, PVR, RRC, WES, ACMP, ATLS, ATW, BP, BWP, COG, DGAS, DNR, EPB, EPL, EXLP, NOV, OIS, PNRG, SEP, APL, ARP, CVX, DMLP, DRQ, DWSN, EC, ECA, FTI, GLOG, IMO, LINE, NFX, OILT, PNG, QRE, RGP, RRMS, SDRL, SNP, TLP, VNR, XOM, XTXI, AHGP
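Here is a script that reads each line and splits it into multiple rows of at most 10 symbols each (adjusted for Python 3: range instead of xrange, and the output file opened in text mode):
import csv

with open('rawdata.txt') as f:
    # newline='' lets the csv module control line endings (Python 3)
    with open('ticker_data.csv', 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        for line in f:
            # strip the trailing newline so the last ticker stays clean
            data = line.strip().split(', ')
            # split the row into consecutive chunks of 10 symbols
            chunks = [data[x:x+10] for x in range(0, len(data), 10)]
            for chunk in chunks:
                writer.writerow(chunk)
This produces a file containing: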
CLB,HNRG,LPI,MTDR,MVO,NRGY,PSE,PVR,RRC,WES
ACMP,ATLS,ATW,BP,BWP,COG,DGAS,DNR,EPB,EPL
EXLP,NOV,OIS,PNRG,SEP,APL,ARP,CVX,DMLP,DRQ
DWSN,EC,ECA,FTI,GLOG,IMO,LINE,NFX,OILT,PNG
QRE,RGP,RRMS,SDRL,SNP,TLP,VNR,XOM,XTXI,AHGP
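If the goal is the exact layout from the question (comma-and-space separators, with a trailing comma on every line except the last), the csv module is not strictly needed. The following is a minimal sketch and not part of the original answer; the helper name format_rows and its per_row parameter are made up for illustration, and the sample list is shortened to keep the output small.

```python
def format_rows(items, per_row=10):
    """Join items with ', ' and start a new line after every per_row items."""
    rows = [
        ", ".join(items[i:i + per_row])
        for i in range(0, len(items), per_row)
    ]
    # Joining the rows with ',\n' leaves a trailing comma on every line
    # except the last one.
    return ",\n".join(rows)


if __name__ == "__main__":
    sample = ["CLB", "HNRG", "LPI", "MTDR", "MVO"]
    print(format_rows(sample, per_row=2))
```

Writing the returned string to ticker_data.txt with file.write(...) should give the kind of text file the question asks for.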
Answer 1 (score: 0)
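Another option is to use slicing and range (xrange in Python 2):
import csv

# assumes ticker_list is a list of ticker strings, not a joined string
with open("output.txt", "w", newline="") as out:
    writer = csv.writer(out)
    for x in range(0, len(ticker_list), 10):
        writer.writerow(ticker_list[x:x+10])
range yields the numbers from 0 up to the length of the list in steps of 10. For each of those indices we write a slice of length 10, starting at that index, to the CSV file. csv.writer takes care of adding the comma separators and so on.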
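To see which start indices and slices this loop produces, here is a small self-contained sketch using Python 3's range. The 7-item sample list and the chunk size of 3 are chosen only to keep the output short, and io.StringIO stands in for the output file.

```python
import csv
import io

ticker_list = ["CLB", "HNRG", "LPI", "MTDR", "MVO", "NRGY", "PSE"]
chunk_size = 3

# Start indices produced by range: one per output row.
starts = list(range(0, len(ticker_list), chunk_size))

buffer = io.StringIO()  # in-memory stand-in for the output file
writer = csv.writer(buffer)
for x in starts:
    # Slicing past the end of a list is safe: the last slice is just shorter.
    writer.writerow(ticker_list[x:x + chunk_size])

print(starts)
print(buffer.getvalue())
```

The last row holds a single ticker because 7 is not a multiple of 3; no special handling is needed for the remainder.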
Answer 2 (score: 0)
You can do it like this (Python 2):
import csv
from itertools import izip_longest

with open('/tmp/line.csv','r') as fin:
    cr=csv.reader(fin)
    n=10
    data=izip_longest(*[iter(list(cr)[0])]*n,fillvalue='')
    print '\n'.join(', '.join(t) for t in data)
With your data, this prints:
CLB, HNRG, LPI, MTDR, MVO, NRGY, PSE, PVR, RRC, WES
ACMP, ATLS, ATW, BP, BWP, COG, DGAS, DNR, EPB, EPL
EXLP, NOV, OIS, PNRG, SEP, APL, ARP, CVX, DMLP, DRQ
DWSN, EC, ECA, FTI, GLOG, IMO, LINE, NFX, OILT, PNG
QRE, RGP, RRMS, SDRL, SNP, TLP, VNR, XOM, XTXI, AHGP
Clarification (Py 3)
I would write your program this way:
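import csv
from itertools import zip_longest

n=10
# newline='' on the output lets csv.writer control line endings
with open('/tmp/rawdata.txt','r') as fin, open('/tmp/out.csv','w',newline='') as fout:
    reader=csv.reader(fin)
    writer=csv.writer(fout)
    # flatten every row of the input into one stream of elements
    source=(e for line in reader for e in line)
    # n references to the same generator group the stream into n-tuples
    for t in zip_longest(*[source]*n):
        # drop the None padding in the final, shorter group
        writer.writerow(list(e for e in t if e))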
What changes: whatever n is, every output row has n elements, except the last row, which may have fewer than n.
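The zip_longest(*[source]*n) idiom can be hard to read at first. It works because the list holds n references to the same iterator, so each output tuple pulls n consecutive items from it. Below is a small sketch of that idea; the function name chunked is hypothetical, and unlike the code above it filters on None rather than on truthiness, so empty strings would be kept (a genuine None in the input would still be dropped).

```python
from itertools import zip_longest


def chunked(iterable, n):
    """Yield lists of up to n consecutive items from iterable."""
    it = iter(iterable)
    # n references to the SAME iterator: each tuple consumes n items in order.
    for group in zip_longest(*[it] * n):
        # zip_longest pads the final tuple with None; drop that padding.
        yield [e for e in group if e is not None]


if __name__ == "__main__":
    for row in chunked(["CLB", "HNRG", "LPI", "MTDR", "MVO"], 2):
        print(", ".join(row))
```

With five items and n set to 2, the final row contains only the single leftover item.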