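I want to start a new list when row[4] is a digit, then extend it with the following row when row[4] is not a digit, but I'm getting duplicated results. Can someone point me in the right direction?
Here is a sample CSV file: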
Name,Last,,,Account
joe,joe last,,,11111
joe address,city,state,zip,
,,,,
sam,sam last,,,22222
sam address,city,state,zip,
,,,,
bob,bob last,,,33333
bob address,city,state,zip,
My code:
import csv
import os

localdir = 'C:\\Users\\User\\My Documents'
fn = 'test_file.csv'
with open(os.path.join(localdir, fn), 'rb') as fopen:
    csvdata = list(csv.reader(fopen))
data = []
for row in csvdata:
    if not row[0] or row[0].startswith('Name'):
        continue
    if row[4].isdigit():
        accts = []
    accts += row
    data.append(accts)
for line in data:
    print(line)
My result:
['joe', 'joe last', '', '', '11111', 'joe address', 'city', 'state', 'zip', '']
['joe', 'joe last', '', '', '11111', 'joe address', 'city', 'state', 'zip', '']
['sam', 'sam last', '', '', '22222', 'sam address', 'city', 'state', 'zip', '']
['sam', 'sam last', '', '', '22222', 'sam address', 'city', 'state', 'zip', '']
['bob', 'bob last', '', '', '33333', 'bob address', 'city', 'state', 'zip', '']
['bob', 'bob last', '', '', '33333', 'bob address', 'city', 'state', 'zip', '']
What I want:
['joe', 'joe last', '', '', '11111', 'joe address', 'city', 'state', 'zip', '']
['sam', 'sam last', '', '', '22222', 'sam address', 'city', 'state', 'zip', '']
['bob', 'bob last', '', '', '33333', 'bob address', 'city', 'state', 'zip', '']
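Answer 0 (score: 3)
The problem is that you append accts once for every row in the file, so each record ends up in data twice.
Change your if (the last 4 lines of the loop) to: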
    if row[4].isdigit():
        accts = []
    else:
        data.append(accts)
    accts += row
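For reference, here is a minimal sketch of the whole loop with that change applied. It assumes Python 3 and reads the sample from an in-memory string instead of the file on disk; `merge_records` and `SAMPLE` are hypothetical names used only for illustration. Note that `accts += row` extends the list in place, so the list already stored in `data` grows as well.

```python
import csv
import io

# In-memory copy of the sample file from the question (assumption: Python 3).
SAMPLE = """\
Name,Last,,,Account
joe,joe last,,,11111
joe address,city,state,zip,
,,,,
sam,sam last,,,22222
sam address,city,state,zip,
,,,,
bob,bob last,,,33333
bob address,city,state,zip,
"""


def merge_records(text):
    """Hypothetical helper: join each account row with the address row after it."""
    data = []
    accts = []
    for row in csv.reader(io.StringIO(text)):
        # Skip empty lines, blank separator rows, and the header.
        if not row or not row[0] or row[0].startswith('Name'):
            continue
        if row[4].isdigit():
            accts = []          # account row: start a new record
        else:
            data.append(accts)  # address row: store the record exactly once
        accts += row            # in-place extend, so the list in data grows too
    return data


for line in merge_records(SAMPLE):
    print(line)
```

Each record is appended only when its address row is seen, which is why the duplicates disappear.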
Or you could rewrite the logic so it is easier to follow:
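with open(os.path.join(localdir, fn), 'rb') as fopen:
    data = []
    reader = csv.reader(fopen)
    header = next(reader)  # skip the header row
    for row in reader:
        next_row = next(reader)  # address row that follows the account row
        # Default of None: the sample file has no blank row after the last record.
        blank_row = next(reader, None)
        data.append(row + next_row)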
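(This only works if you are sure the format is consistent.)
Answer 1 (score: 2)
You just need to skip the header, take three rows at a time, and keep the first two: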
from itertools import islice
import csv

with open("out.csv") as f:
    next(f)  # skip the header line
    r = csv.reader(f)
    out = [row[0] + row[1] for row in iter(lambda: list(islice(r, 3)), [])]
Output:
[['joe', 'joe last', '', '', '11111', 'joe address', 'city', 'state', 'zip', ''],
['sam', 'sam last', '', '', '22222', 'sam address', 'city', 'state', 'zip', ''],
['bob', 'bob last', '', '', '33333', 'bob address', 'city', 'state', 'zip', '']]
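The chunking relies on the two-argument form of `iter`, which keeps calling a function until it returns the sentinel value. A small sketch of just that idiom, separate from the CSV handling; `chunks` is a hypothetical helper name, not part of `itertools`:

```python
from itertools import islice


def chunks(iterable, size):
    """Hypothetical helper: return an iterator over lists of up to `size` items."""
    it = iter(iterable)
    # iter(callable, sentinel) calls the lambda repeatedly and stops as soon
    # as it returns [], which happens once the underlying iterator is empty.
    return iter(lambda: list(islice(it, size)), [])


groups = list(chunks(range(8), 3))
print(groups)  # [[0, 1, 2], [3, 4, 5], [6, 7]]
```

The last chunk may be shorter than `size`, which is why the final record in the sample file (no trailing blank row) still works with `row[0] + row[1]`.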
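With Python 3 we can unpack each group directly; the starred name absorbs the blank row, or nothing at all if it is missing, so no error is raised: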
from itertools import islice
import csv

with open("out.csv") as f:
    next(f)  # skip the header line
    r = csv.reader(f)
    print([a + b for a, b, *_ in iter(lambda: list(islice(r, 3)), [])])
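Answer 2 (score: 1)
This is not a standard CSV file, because alternating rows have different meanings. Fortunately, csv.reader is an iterator, so it is easy to grab the following row with next() whenever you need it.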
import csv

# Write a small test file for debugging.
with open('test_file.csv', 'w') as fp:
    fp.write(""" Name, Lastname, , , Account
joe, joe last, , , 11111
joe address, city, state, zip,
, , , ,
sam, sam last, , , 22222
sam address, city, state, zip,
, , , ,
bob, bob last, , , 33333
bob address, city, state, zip,
, , , ,
""")

with open('test_file.csv') as fp:
    reader = csv.reader(fp)
    for row in reader:
        row = [c.strip() for c in row]
        # skip empty lines and rows w/o col 0, then check digit
        if row and row[0] and row[4].isdigit():
            # add next line
            row.extend(c.strip() for c in next(reader))
            print(row)
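If you would rather collect the merged rows than print them, the same idea can be wrapped in a function. This is only a sketch: `merge_account_rows` is a hypothetical name, and using `next(reader, [])` is my assumption about how to handle an account row with no address row after it (it is kept on its own instead of raising StopIteration).

```python
import csv
import io


def merge_account_rows(lines):
    """Hypothetical helper: return account rows joined with the row after them."""
    reader = csv.reader(lines)
    merged = []
    for row in reader:
        row = [c.strip() for c in row]
        # Skip empty lines, blank separator rows, and the header.
        if len(row) > 4 and row[0] and row[4].isdigit():
            # next(reader, []) avoids StopIteration if the address row is missing.
            row.extend(c.strip() for c in next(reader, []))
            merged.append(row)
    return merged


sample = io.StringIO(
    "Name,Last,,,Account\n"
    "joe,joe last,,,11111\n"
    "joe address,city,state,zip,\n"
    ",,,,\n"
    "sam,sam last,,,22222\n"
)
print(merge_account_rows(sample))
```

csv.reader accepts any iterable of lines, so the function works with an open file object as well as the in-memory sample used here.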