I want to write a function that takes input like this:
1405684432, d8:c7:c8:5e:7c:2d, SUTD_GLAB, 72
1405684432, d8:c7:c8:5e:7c:2c, SUTD_BOT, 72
1405684432, d8:c7:c8:5e:7c:2b, SUTD_Student, 72
1405684432, d8:c7:c8:5e:7c:2a, SUTD_Staff, 72
1405684433, d8:c7:c8:5e:7c:29, SUTD_ILP2, 71
1405684433, d8:c7:c8:5e:7d:eb, SUTD_Student, 57
1405684433, d8:c7:c8:5e:7d:ea, SUTD_Staff, 57
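The output should give me two lists or files grouped by the first column: lines whose first-column number is the same go into the same list. The result should look like this:
List one: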
1405684432, d8:c7:c8:5e:7c:2d, SUTD_GLAB, 72
1405684432, d8:c7:c8:5e:7c:2c, SUTD_BOT, 72
1405684432, d8:c7:c8:5e:7c:2b, SUTD_Student, 72
1405684432, d8:c7:c8:5e:7c:2a, SUTD_Staff, 72
List two:
1405684433, d8:c7:c8:5e:7c:29, SUTD_ILP2, 71
1405684433, d8:c7:c8:5e:7d:eb, SUTD_Student, 57
1405684433, d8:c7:c8:5e:7d:ea, SUTD_Staff, 57
I don't know which approach I should use.
Answer 0 (score: 2)
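You can use itertools.groupby(). (This assumes the input is sorted by that column, because groupby only merges consecutive items that share the same key.)
Example: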
import itertools
data = """\
1405684432, d8:c7:c8:5e:7c:2d, SUTD_GLAB, 72
1405684432, d8:c7:c8:5e:7c:2c, SUTD_BOT, 72
1405684432, d8:c7:c8:5e:7c:2b, SUTD_Student, 72
1405684432, d8:c7:c8:5e:7c:2a, SUTD_Staff, 72
1405684433, d8:c7:c8:5e:7c:29, SUTD_ILP2, 71
1405684433, d8:c7:c8:5e:7d:eb, SUTD_Student, 57
1405684433, d8:c7:c8:5e:7d:ea, SUTD_Staff, 57
"""
data = data.splitlines()
keyfunc = lambda x: x.split(',')[0]
# data.sort(key=keyfunc)  # uncomment if the input is not sorted by the first column
for key, group in itertools.groupby(data, key=keyfunc):
    print("group: " + key)
    for line in group:
        print(line)
Output:
group: 1405684432
1405684432, d8:c7:c8:5e:7c:2d, SUTD_GLAB, 72
1405684432, d8:c7:c8:5e:7c:2c, SUTD_BOT, 72
1405684432, d8:c7:c8:5e:7c:2b, SUTD_Student, 72
1405684432, d8:c7:c8:5e:7c:2a, SUTD_Staff, 72
group: 1405684433
1405684433, d8:c7:c8:5e:7c:29, SUTD_ILP2, 71
1405684433, d8:c7:c8:5e:7d:eb, SUTD_Student, 57
1405684433, d8:c7:c8:5e:7d:ea, SUTD_Staff, 57
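Since the question asks for actual lists rather than printed output, here is a minimal sketch of collecting the groups. The function name group_by_first_column is my own, not part of itertools, and the sketch sorts first so it does not rely on the input order:

```python
import itertools


def group_by_first_column(lines):
    """Return a list of lists, one inner list per distinct first-column value."""
    def first_column(line):
        return line.split(',')[0].strip()

    # sorted() is stable, so lines keep their relative order inside each group.
    ordered = sorted(lines, key=first_column)
    # Each group is a lazy iterator tied to the groupby object,
    # so it is copied into a real list before moving on.
    return [list(group)
            for _, group in itertools.groupby(ordered, key=first_column)]


# Deliberately unsorted sample built from the lines in the question.
sample = [
    "1405684433, d8:c7:c8:5e:7c:29, SUTD_ILP2, 71",
    "1405684432, d8:c7:c8:5e:7c:2d, SUTD_GLAB, 72",
    "1405684433, d8:c7:c8:5e:7d:eb, SUTD_Student, 57",
    "1405684432, d8:c7:c8:5e:7c:2c, SUTD_BOT, 72",
]
list_one, list_two = group_by_first_column(sample)
```

Note that the keys are compared as strings here, which is fine for timestamps of equal length; convert them with int() in first_column if the lengths can differ.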
For reference, see the itertools.groupby() entry in the Python standard library documentation.
Answer 1 (score: 0)
I would use a dictionary to keep track of the first column. A solution could look something like this:
def split_on_first_column(data):
    # Map each first-column value to the list of lines that share it.
    result = dict()
    for line in data:
        key = line.split(',')[0]
        if key not in result:
            result[key] = [line]
        else:
            result[key].append(line)
    return result.values()
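In Python 2 this returns a list of lists; in Python 3 it returns a view object over the lists, so wrap it in list() if you need an actual list.
Note that the lines are stored as whole strings and are not split further into lists.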
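If you do want each line split into its columns, a variant along these lines might work. This is only a sketch: split_rows_on_first_column is a name I made up, and it uses collections.defaultdict instead of the explicit membership check:

```python
from collections import defaultdict


def split_rows_on_first_column(data):
    """Group lines by first column, storing each line as a list of fields."""
    groups = defaultdict(list)
    for line in data:
        # Split on commas and drop the space that follows each comma.
        fields = [field.strip() for field in line.split(',')]
        groups[fields[0]].append(fields)
    # list() gives a real list on both Python 2 and Python 3.
    # On Python 3.7+ the groups come out in first-seen order.
    return list(groups.values())


rows = split_rows_on_first_column([
    "1405684432, d8:c7:c8:5e:7c:2d, SUTD_GLAB, 72",
    "1405684432, d8:c7:c8:5e:7c:2c, SUTD_BOT, 72",
    "1405684433, d8:c7:c8:5e:7c:29, SUTD_ILP2, 71",
])
```

Unlike groupby, this does not need the input to be sorted, because lines with the same key end up in the same bucket wherever they appear.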
Answer 2 (score: 0)
Python code:
import csv

groups = {}
with open("data.csv") as data:
    reader = csv.reader(data)
    for row in reader:
        if len(row) > 0:  # skip blank lines
            col1 = row[0].strip()
            # setdefault creates the list the first time a key is seen
            groups.setdefault(col1, []).append(row)

for key in groups:
    print("=== {0} ===".format(key))
    print("\n".join(",".join(row) for row in groups[key]))
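Output (the order of the groups follows the dictionary's iteration order; on Python 3.7+ that is insertion order, so the 1405684432 group would be printed first):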
=== 1405684433 ===
1405684433, d8:c7:c8:5e:7c:29, SUTD_ILP2, 71
1405684433, d8:c7:c8:5e:7d:eb, SUTD_Student, 57
1405684433, d8:c7:c8:5e:7d:ea, SUTD_Staff, 57
=== 1405684432 ===
1405684432, d8:c7:c8:5e:7c:2d, SUTD_GLAB, 72
1405684432, d8:c7:c8:5e:7c:2c, SUTD_BOT, 72
1405684432, d8:c7:c8:5e:7c:2b, SUTD_Student, 72
1405684432, d8:c7:c8:5e:7c:2a, SUTD_Staff, 72
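The question also mentions files as a possible output. As a rough sketch, a dictionary shaped like groups (key mapped to a list of csv rows) could be written out one file per group. The write_groups helper and the group_<key>.csv naming are assumptions of mine, not something from the answers above:

```python
import csv
import os
import tempfile


def write_groups(groups, out_dir):
    """Write each group's rows to out_dir/group_<key>.csv and return the paths."""
    paths = []
    for key, rows in groups.items():
        path = os.path.join(out_dir, "group_{0}.csv".format(key))
        # newline="" lets the csv module control line endings (Python 3).
        with open(path, "w", newline="") as handle:
            csv.writer(handle).writerows(rows)
        paths.append(path)
    return paths


# Rows as csv.reader would produce them from the sample data:
# the space after each comma stays attached to the following field.
groups = {
    "1405684432": [["1405684432", " d8:c7:c8:5e:7c:2d", " SUTD_GLAB", " 72"]],
    "1405684433": [["1405684433", " d8:c7:c8:5e:7c:29", " SUTD_ILP2", " 71"]],
}
out_dir = tempfile.mkdtemp()  # throwaway directory for the example
paths = write_groups(groups, out_dir)
```

Each file then contains only the lines for one first-column value, in the order they were read.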