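To keep things simple, I want to sort downloaded data (CSV) into 3 categories. Does anyone know of any tips or similar projects I could look at, or Python tools I should look into?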
a,b,c
d,e,f
g
The data I download could contain any combination of the values above.
https://docs.google.com/spreadsheets/d/1GU7jVLA-YzqRTxyLMdbymdJ6b1RtB09bpOjIDX6eJok/edit?usp=sharing
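These are 2 basic examples of what the data is downloaded as and what I would like to convert it to.
The real data will have 10 to 15 investments and about 4 categories. I just want to know whether this kind of sorting is possible. It gets tricky because our investment names are long, and some names are similar but belong to different categories.
If anyone can point me in the right direction, i.e. whether I need a dictionary, or some basic framework or code to look at, that would be awesome.
I'm keen to learn but don't know where to start. Cheers. This is my first proper coding project.
I'm not fussed about the format of the output, as long as the information is clearly categorized and each category is totalled :)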
Answer 0 (score: 0)
You don't need a framework; the built-ins are enough (as is often the case in Python).
from collections import defaultdict
# Input data "rows". These would probably be loaded from a file.
raw_data = [
    ('a', 1000.00),
    ('b', 2000.00),
    ('d', 3000.00),
    ('e', 4000.00),
    ('g', 5000.00),
    ('g', 10000.00),
    ('c', 5000.00),
    ('d', 2000.00),
    ('a', 4000.00),
    ('e', 5000.00),
]
# Category definitions, mapping a category name to the row "types" (first column).
categories = {
    'Shares': {'a', 'b', 'c'},
    'Bonds': {'d', 'e', 'f'},
    'Cash': {'g'},
}
# Build an inverse map that makes lookups faster later.
# This will look like e.g. {"a": "Shares", "b": "Shares", ...}
category_map = {}
for category, members in categories.items():
    for member in members:
        category_map[member] = category
# Initialize an empty defaultdict to group the rows with.
rows_per_category = defaultdict(list)
# Iterate through the raw data...
for row in raw_data:
    row_type = row[0]  # Grab the first column per row (named row_type to avoid shadowing the built-in type),
    category = category_map[row_type]  # map it through the category map (raises KeyError if the type has no category),
    rows_per_category[category].append(row)  # and put it in the defaultdict.
# Iterate through the now collated rows in sorted-by-category order:
for category, rows in sorted(rows_per_category.items()):
    # Sum the second column (value) for the total.
    total = sum(row[1] for row in rows)
    # Print header.
    print("###", category)
    # Print each row.
    for row in rows:
        print(row)
    # Print the total and an empty line.
    print("=== Total", total)
    print()
This will output something like:
### Bonds
('d', 3000.0)
('e', 4000.0)
('d', 2000.0)
('e', 5000.0)
=== Total 14000.0
### Cash
('g', 5000.0)
('g', 10000.0)
=== Total 15000.0
### Shares
('a', 1000.0)
('b', 2000.0)
('c', 5000.0)
('a', 4000.0)
=== Total 12000.0
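The rows above are hard-coded, and the lookup fails on any name that has no category. As a rough sketch of how the same idea might look with a real download, the following reads the rows with the standard `csv` module and uses `dict.get` with a fallback category instead of a plain lookup. The column headers `name` and `value`, the investment names, the `"Uncategorized"` label, and the helper `totals_by_category` are all made up for illustration; the actual file layout may differ. Because the map is keyed on the exact full name, long names that look similar can still land in different categories.

```python
import csv
import io
from collections import defaultdict

# Hypothetical CSV content standing in for the downloaded file.
# In practice you would use: open("download.csv", newline="")
SAMPLE_CSV = """name,value
Global Equity Fund,1000.00
Global Equity Bond Fund,3000.00
Cash Account,5000.00
Global Equity Fund,4000.00
Mystery Fund,250.00
"""

# Exact-name lookup: similar-looking names may map to different categories.
CATEGORY_MAP = {
    "Global Equity Fund": "Shares",
    "Global Equity Bond Fund": "Bonds",
    "Cash Account": "Cash",
}


def totals_by_category(csv_file, category_map, default="Uncategorized"):
    """Sum the value column per category; unmapped names go to `default`."""
    totals = defaultdict(float)
    for record in csv.DictReader(csv_file):
        name = record["name"].strip()
        # .get() avoids a KeyError when a name has no category yet.
        category = category_map.get(name, default)
        totals[category] += float(record["value"])
    return dict(totals)


totals = totals_by_category(io.StringIO(SAMPLE_CSV), CATEGORY_MAP)
for category, total in sorted(totals.items()):
    print(category, total)
```

Anything that ends up under the fallback label is a hint that the category map needs a new entry, which is usually easier to deal with than a crash halfway through the file.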