I have a list containing data points and 'identifiers' that looks like this:
['identifier', 1, 2, 3, 4, 'identifier', 10, 11, 12, 13, 'identifier', ...]
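I would like to write this list to a CSV file and start a new column for each identifier, e.g.
for data in list:
    if data == 'identifier':
        ==> create a new column in the CSV file and print the subsequent data points
I look forward to hearing your suggestions.
Cheers,
-Sebastian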
Answer 0 (score: 0)
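This solution does not write the data to a CSV file, but with the csv library that is a simple step. What it does is restructure the data you provided into a list of lists, where each sublist is one row of data.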
l = ['identifier', 1, 2, 3, 'identifier', 10, 11, 12, 13, 'identifier', 4, 3, 2, 1, 10]
def split_list(l, on):
    """Splits a list on an identifier and returns a list of lists split on the
    identifier without including it."""
    splits = []
    cache = []
    for v in l:
        # Check if this is an identifier
        if v == on:
            # Add the cache to splits unless it is empty
            if cache:
                splits.append(cache)
            # Empty the cache
            cache = []
        else:
            cache.append(v)
    # Add the last cache to splits if it is not empty
    if cache:
        splits.append(cache)
    return splits
def reshape_list(l, default=None):
    """Takes a list of lists, assuming each list is a column of values, and
    reshapes it into a list of rows. If the lists are not all the same length,
    default (None unless specified) is used to fill the empty spots."""
    result = []
    # Get the length of the longest list (0 if there are no columns)
    maxlen = max(map(len, l), default=0)
    for i in range(maxlen):
        # Create each row
        row = []
        # Extract the values from the columns
        for column in l:
            if i < len(column):
                row.append(column[i])
            else:
                row.append(default)
        result.append(row)
    return result
print(l)
t = split_list(l, 'identifier')
print(t)
r = reshape_list(t)
print(r)
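The remaining CSV step could look roughly like the following. This is only a minimal sketch: the helper name write_rows is made up for illustration, the sample rows are what reshape_list would return for the list above, and an in-memory buffer stands in for a real file.

```python
import csv
import io

def write_rows(rows, fileobj):
    """Write each sublist of rows as one CSV line; None becomes an empty field."""
    csv.writer(fileobj).writerows(rows)

# Hypothetical sample in the shape reshape_list() returns for the list above
rows = [[1, 10, 4], [2, 11, 3], [3, 12, 2], [None, 13, 1], [None, None, 10]]

# An in-memory buffer keeps the sketch self-contained; for a real file use
# open('output.csv', 'w', newline='') instead (the file name is a placeholder).
buffer = io.StringIO()
write_rows(rows, buffer)
print(buffer.getvalue())
```

Passing newline='' when opening a real file lets the csv module control the line endings itself, which avoids blank lines between rows on Windows.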
Answer 1 (score: 0)
Generate demo data:
import random
random.seed(20180119)  # remove to get random data between runs
id = 'identifier'
def genData():
    data = []
    for n in range(10+random.randint(1,10)):
        data.append(id)
        data.extend(random.choices(range(1,20),k=random.randint(3,12)))
    print(data)
    return data
Output (formatted):
['identifier', 18, 6, 19, 10, 12, 18, 17, 12,
'identifier', 10, 17, 17, 10, 15, 12, 16, 18, 19, 18, 14, 9,
'identifier', 6, 10, 1, 14, 4,
'identifier', 3, 7, 7, 4, 8, 2, 16, 8, 1, 8, 16, 6,
'identifier', 6, 17, 8, 8, 13, 15, 7, 9, 4, 10, 15,
'identifier', 17, 8, 3, 8, 2, 19, 16, 2, 5, 6,
'identifier', 18, 6, 18, 19, 7, 8, 14, 7, 7, 19,
'identifier', 13, 7, 4, 13,
'identifier', 15, 8, 17, 8, 1, 12, 16, 7, 5, 19, 14, 9,
'identifier', 18, 16, 10, 7, 16, 18, 19, 6, 15, 8, 13, 15,
'identifier', 15, 2, 18, 13, 7,
'identifier', 17, 19, 15, 4, 18, 7, 13, 17, 8, 9,
'identifier', 9, 17, 18, 8, 17, 17, 17,
'identifier', 3, 16, 15, 13, 9,
'identifier', 15, 12, 2, 16, 2, 5, 16, 18]
Partition the data:
def partitionData(idToUse, dataToUse):
    """Yields sublists of dataToUse, each starting with idToUse."""
    lastId = None
    for (i, n) in enumerate(dataToUse):  # identify subslices of data
        if n != idToUse:
            continue
        if lastId is not None:  # found a later id
            yield dataToUse[lastId:i]  # yield sublist including idToUse
        lastId = i  # data before the first id is discarded
    if lastId is not None and dataToUse[-1] != idToUse:  # yield rest of data
        yield dataToUse[lastId:]
Write the data to result.csv:
import csv
import itertools
data = genData()
partitioned = partitionData(id, data)
with open('result.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile, delimiter=";")
    # like zip, but fills up shorter ones with None till longest index
    writer.writerows(itertools.zip_longest(*partitioned, fillvalue=None))
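To make the transposition step easier to follow, here is a small sketch of what itertools.zip_longest does with columns of unequal length. The three sample columns are made up for illustration, and an in-memory buffer stands in for result.csv.

```python
import csv
import io
import itertools

# Hypothetical partitions: each sublist starts with the identifier
columns = [['identifier', 1, 2, 3], ['identifier', 10, 11], ['identifier', 4]]

buffer = io.StringIO()  # in-memory stand-in for result.csv
writer = csv.writer(buffer, delimiter=';')
# zip_longest turns the columns into rows, padding short columns with None,
# which csv.writer emits as an empty field
writer.writerows(itertools.zip_longest(*columns))
print(buffer.getvalue())
# identifier;identifier;identifier
# 1;10;4
# 2;11;
# 3;;
```

Each identifier ends up as the header of its own column, and shorter columns simply leave trailing fields empty.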
Links:
- itertools.zip_longest
- csv-writer
Answer 2 (score: -1)
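You can do something like this, assuming l is your list and every identifier is followed by exactly four values (hence the reshape(-1,5)):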
import pandas as pd
import numpy as np
pd.DataFrame(np.array(l).reshape(-1,5)).set_index(0).T.to_csv('my_file.csv',index=0)
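The same reshape-and-transpose idea can be illustrated without pandas or NumPy. This is a rough sketch, not a replacement for the one-liner: the helper name groups_to_rows is made up, and it assumes, as the one-liner does, that every group has the same fixed size.

```python
def groups_to_rows(flat, group_size):
    """Cut flat into consecutive groups of group_size and transpose them,
    so that each group becomes one column."""
    if len(flat) % group_size != 0:
        raise ValueError("list length is not a multiple of group_size")
    groups = [flat[i:i + group_size] for i in range(0, len(flat), group_size)]
    return list(zip(*groups))

l = ['identifier', 1, 2, 3, 4, 'identifier', 10, 11, 12, 13]
rows = groups_to_rows(l, 5)
print(rows)
# [('identifier', 'identifier'), (1, 10), (2, 11), (3, 12), (4, 13)]
```

If the groups can differ in length, as in the generated demo data of the previous answer, a fixed-size reshape does not apply and the list has to be split on the identifier instead.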
Answer 3 (score: -1)
If the dataset is not too large, you should first prepare the data and then serialize it to a CSV file.
import csv
dataset = ['identifier', 1, 2, 3, 4, 'identifier', 10, 11, 12, 13, 'identifier', 21, 22, 23, 24]
# Collect the values that follow each identifier into their own column
columns = []
col = []
for datapoint in dataset:
    if datapoint == 'identifier':
        if col:
            columns.append(col)
            col = []
    else:
        col.append(datapoint)
if col:
    columns.append(col)
rows_count = max((len(c) for c in columns), default=0)
# newline='' prevents blank lines between rows on Windows
with open('result.csv', 'w', newline='') as csvfile:
    writer = csv.writer(csvfile, delimiter=";")
    for x in range(rows_count):
        data = []
        for col in columns:
            if len(col) > x:
                data.append(col[x])
            else:
                data.append("")
        writer.writerow(data)