Question

Basically I have a list of tuples containing dates and prices, like:

[ ("2013-02-12", 200.0), ("2012-02-25", 300.0), ("2000-03-04", 100.0), ("2000-03-05", 50.0)]

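The function needs to find the average stock value for each month, then return a list of tuples containing the date (month and year) and the stock price. Something like: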

[(250.0, "02-2013"), (100.0, "03-2000"), (50.0, "03-2000")]

This is the code I have so far:

def average_data(list_of_tuples = []):

    list_of_averages = []
    current_year_int = 2013
    current_month_int = 2
    sum_float = float()
    count = 0
    for dd_tuple in list_of_tuples:
        date_str = dd_tuple[0]
        data_float = dd_tuple[1]
        date_list = date_str.split("-")
        year_int = int(date_list[0])
        month_int = int(date_list[1])
        date_year_str = "Date: " + str(month_int) + "-" + str(year_int);


        if month_int != current_month_int:
            average_float = sum_float / count
            average_list = [date_year_str, average_float]
            average_tuple = tuple(average_list)
            list_of_averages.append(average_tuple)
            current_month_int = month_int
            sum_float += data_float


        sum_float += data_float
        count += 1
        current_month_int = month_int
        current_year_int = year_int


    return list_of_averages

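It returns an average, but not the correct one, and maybe not all of them? I have tried looking at examples on the internet and asking my TA (this is for a Python class), but to no avail. Could someone point me in the right direction?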

Edit: Based on the suggestions, the if statement should now look like this, correct?

    if month_int != current_month_int:
        average_float = sum_float / count
        average_list = [date_year_str, average_float]
        average_tuple = tuple(average_list)
        list_of_averages.append(average_tuple)
        current_month_int = month_int
        sum_float = 0.0
        count = 0
        sum_float += data_float
        count += 1

Edit: Thanks for the help, everyone! I now have the code running.

Answer 1

>>> lis = [ ("2013-02-12", 200.0), ("2012-02-25", 300.0), ("2000-03-04", 100.0), ("2000-03-05", 50.0)]
>>> from collections import defaultdict
>>> dic = defaultdict(list)
>>> for k, val in lis:
...     key = "-".join(k.split('-')[:-1][::-1])
...     dic[key].append(val)
...
>>> [(sum(v)/float(len(v)),k)  for k,v in dic.items()]

[(200.0, '02-2013'), (300.0, '02-2012'), (75.0, '03-2000')]

A simpler version of the code above:

lis = [ ("2013-02-12", 200.0), ("2012-02-25", 300.0), ("2000-03-04", 100.0), ("2000-03-05", 50.0)]
dic = {}
for date, val in lis:
    #split the date string at '-' and assign the first  2 items to  year,month
    year, month = date.split('-')[:2]
    #now check if (month,year) is there in the dict
    if (month, year) not in dic:
        #if the tuple was not found then initialise one with an empty list
        dic[month,year] = []

    dic[month,year].append(val) # append val to the (month,year) key

print(dic)
#Now iterate over key,value items and do some calculations to get the desired output
sol =[]
for key, val in dic.items():
    new_key = "-".join(key)
    avg = sum(val) / len(val)
    sol.append((avg, new_key))
print(sol)

Output

#print dic
{('03', '2000'): [100.0, 50.0],
 ('02', '2013'): [200.0],
 ('02', '2012'): [300.0]}
#print sol
[(75.0, '03-2000'), (200.0, '02-2013'), (300.0, '02-2012')]

Answer 2

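I'm never sure how much to give away on homework questions, but how about I get you part of the way there by using a dictionary. I've tried to keep the example simple so that it's easy to understand what's happening.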

monthly_prices = {}
for dd_tuple in list_of_tuples:
    date, price = dd_tuple
    year, month, _ = date.split("-")
    # this will be a list
    curr_prices = monthly_prices.setdefault((year, month), [])
    curr_prices.append(price)

This will give you a mapping of (year, month) tuples to lists of prices. Try to go from there.

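setdefault checks whether the key already exists in the mapping and, if it doesn't, sets the key to the given default value. (A defaultdict is essentially nice syntactic sugar for this, and avoids having to create a new list on every iteration.)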
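To make the comparison concrete, here is a minimal sketch of the same grouping done both ways, followed by one possible final step that turns each list into an average. The names `prices`, `by_month`, `by_month_dd` and `averages` are made up for this illustration, and the sample data is the list from the question.

```python
from collections import defaultdict

prices = [("2013-02-12", 200.0), ("2012-02-25", 300.0),
          ("2000-03-04", 100.0), ("2000-03-05", 50.0)]

# Plain dict + setdefault: the [] argument is built on every call,
# but it is only stored when the key has not been seen before.
by_month = {}
for date, price in prices:
    year, month, _ = date.split("-")
    by_month.setdefault((year, month), []).append(price)

# defaultdict(list): same grouping, but a new list is only created
# when a missing key is actually looked up.
by_month_dd = defaultdict(list)
for date, price in prices:
    year, month, _ = date.split("-")
    by_month_dd[(year, month)].append(price)

# One possible last step (an assumption, not part of the answer above):
# reduce each list of prices to its average.
averages = {key: sum(vals) / len(vals) for key, vals in by_month.items()}
```

Both loops end up with the same mapping; which one to use is mostly a matter of taste.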

Answer 3

Let's add a duplicate date to your example so we can see some averaging:

l = [ ("2013-02-12", 200.0), ("2012-02-25", 300.0), ("2000-03-04", 100.0), ("2000-03-05", 50.0), ("2013-02-12", 100.0)]

"2013-02-12" shows up twice, totaling 300.0, so the average should be 150.0.

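I don't know whether you've learned about dictionaries yet, or better still defaultdict, but that's what I'm using here. With a defaultdict, you can specify in the constructor what should be returned when a key is not found: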

from collections import defaultdict

d = defaultdict(float) # we'll use this to keep a running sum per date
d_count = defaultdict(int) # this one will keep track of how many times the date shows up

We could also use collections.Counter to keep the count, but we would have to iterate over the list one extra time, which isn't great for speed when the list is large.
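For comparison, the Counter variant might look roughly like this. It is only a sketch: the names `totals`, `counts` and `averages` are illustrative, and the second pass over the list is exactly the extra iteration mentioned above.

```python
from collections import Counter, defaultdict

l = [("2013-02-12", 200.0), ("2012-02-25", 300.0), ("2000-03-04", 100.0),
     ("2000-03-05", 50.0), ("2013-02-12", 100.0)]

# First pass: running sum per date.
totals = defaultdict(float)
for date, value in l:
    totals[date] += value

# Second pass: Counter tallies how often each date appears.
counts = Counter(date for date, _ in l)

# Average per date = sum for that date / number of entries for that date.
averages = {date: totals[date] / counts[date] for date in totals}
```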

Now you need to go through the list and add the values to the dictionary, using the date as the key:

for k,v in l:
    d[k] += v # add the value
    d_count[k] += 1 # increment the count

So you should now have two dictionaries that look like this:

>>> d
defaultdict(<type 'float'>, {'2013-02-12': 300.0, '2012-02-25': 300.0, '2000-03-05': 50.0, '2000-03-04': 100.0})

>>> d_count
defaultdict(<type 'int'>, {'2013-02-12': 2, '2012-02-25': 1, '2000-03-05': 1, '2000-03-04': 1})

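Now, since both dictionaries have the same keys, you can iterate over the items in the dictionary and divide the value for each date by the count for that date, which gives you the average by date.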

for k,v in d.iteritems():
    d[k] /= d_count[k]

"d" should now contain your final averages by date:

>>> d
defaultdict(<type 'float'>, {'2013-02-12': 150.0, '2012-02-25': 300.0, '2000-03-05': 50.0, '2000-03-04': 100.0})

>>> d['2013-02-12']
150.0

>>> for k,v in d.iteritems():
...     print k, v
...

2013-02-12 150.0
2012-02-25 300.0
2000-03-05 50.0
2000-03-04 100.0
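The walkthrough above averages per full date, while the question asks for one average per month. A minimal sketch of the same sum-and-count idea keyed on (year, month) instead, written for Python 3, could look like this. The function name `monthly_averages` and the choice to sort the result by year and month are assumptions made for the example.

```python
from collections import defaultdict

def monthly_averages(pairs):
    """Return [(average, "MM-YYYY"), ...] for ("YYYY-MM-DD", price) pairs."""
    totals = defaultdict(float)  # running sum per (year, month)
    counts = defaultdict(int)    # number of prices per (year, month)
    for date, price in pairs:
        year, month, _ = date.split("-")
        totals[(year, month)] += price
        counts[(year, month)] += 1
    # Sorting the keys gives a stable, chronological order.
    return [(totals[key] / counts[key], "{}-{}".format(key[1], key[0]))
            for key in sorted(totals)]

l = [("2013-02-12", 200.0), ("2012-02-25", 300.0), ("2000-03-04", 100.0),
     ("2000-03-05", 50.0), ("2013-02-12", 100.0)]
result = monthly_averages(l)
```

With the five-item list, the two March 2000 prices are averaged together, and so are the two February 2013 prices.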

Answer 4

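Inside the if block, sum_float and count are never reset to 0, so as the program keeps running, the average ends up spanning several months. Try resetting them there and it should fix your problem. Another thing to check in your logic is whether your list of tuples is sorted; if it isn't, that can complicate things.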
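Putting both points together, a sketch of the running-sum approach might look like the following. It is one possible reading of the fix, not the asker's final code: it sorts the input first, compares year and month together (so February 2012 and February 2013 stay separate), resets the sum and count on every month change, and flushes the last month after the loop. The helper `_label` and the (average, "MM-YYYY") output order are assumptions taken from the example in the question.

```python
def _label(key):
    """Format a (year, month) key as "MM-YYYY" (hypothetical helper)."""
    year, month = key
    return "{}-{}".format(month, year)

def average_data(list_of_tuples):
    averages = []
    current_key = None  # (year, month) currently being accumulated
    total = 0.0
    count = 0
    # "YYYY-MM-DD" strings sort chronologically as plain strings.
    for date_str, price in sorted(list_of_tuples):
        year, month, _ = date_str.split("-")
        key = (year, month)
        if key != current_key:
            if count:  # close out the previous month
                averages.append((total / count, _label(current_key)))
            # reset the accumulators for the new month
            current_key, total, count = key, 0.0, 0
        total += price
        count += 1
    if count:  # the last month is never followed by a change, so flush it
        averages.append((total / count, _label(current_key)))
    return averages

data = [("2013-02-12", 200.0), ("2012-02-25", 300.0),
        ("2000-03-04", 100.0), ("2000-03-05", 50.0)]
result = average_data(data)
```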

Python: Finding the average stock value for each month
