Question

I have a tuple:

for i in my_tup:
   print(i)

Output:

   (Transfer, 0:33:20, Cycle 1)                 
   (Transfer, 0:33:10, Cycle 1)                    
   (Download, 0:09:10, Cycle 1)          
   (Transfer, 0:33:10, Cycle 1)                   
   (Download, 0:13:00, Cycle 1)            
   (Download, 0:12:30, Cycle 2)           
   (Transfer, 0:33:10, Cycle 2)              
   (Download, 0:02:00, Cycle 2)            
   (Transfer, 0:33:00, Cycle 2)              
   (Transfer, 0:33:00, Cycle 2)               
   (Transfer, 0:33:00, Cycle 2)            
   (Transfer, 0:32:40, Cycle 2)

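I am trying to count the occurrences of 'Transfer' per cycle category, i.e. how many Transfers occur in Cycle 1, how many in Cycle 2, and so on.

I can work it out for the first cycle but not for the ones after it (the real output has many more cycles).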

accumulatedList = []
count = 0
for i in range(0, len(my_tup)):
     if my_tup[i][0] == 'Transfer' and my_tup[i][2] == 'Cycle 1':
          count +=1
     accumulatedList.append(count)

I'm not sure how to do this for the other cycles.

Answer 1

This is simple with the pandas library:

import pandas as pd
df = pd.DataFrame(my_tup, columns=['Category', 'TimeSpan', 'Cycle'])
g = df.groupby(['Category', 'Cycle']).size()

It returns:

Category  Cycle  
Download  Cycle 1    2
          Cycle 2    2
Transfer  Cycle 1    3
          Cycle 2    5
dtype: int64

If you only care about Transfer, slice the result by its index:

g['Transfer']

Cycle
Cycle 1    3
Cycle 2    5
dtype: int64
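If a plain dict is more convenient than a Series, one possible variation is to filter first and then count. This is only a sketch: it assumes `my_tup` holds the same 3-tuples as in the question, and `transfer_counts` is just an illustrative name.

```python
import pandas as pd

# Sample data, assumed to mirror the question's my_tup
my_tup = [('Transfer', '0:33:20', 'Cycle 1'), ('Transfer', '0:33:10', 'Cycle 1'),
          ('Download', '0:09:10', 'Cycle 1'), ('Transfer', '0:33:10', 'Cycle 1'),
          ('Download', '0:13:00', 'Cycle 1'), ('Download', '0:12:30', 'Cycle 2'),
          ('Transfer', '0:33:10', 'Cycle 2'), ('Download', '0:02:00', 'Cycle 2'),
          ('Transfer', '0:33:00', 'Cycle 2'), ('Transfer', '0:33:00', 'Cycle 2'),
          ('Transfer', '0:33:00', 'Cycle 2'), ('Transfer', '0:32:40', 'Cycle 2')]

df = pd.DataFrame(my_tup, columns=['Category', 'TimeSpan', 'Cycle'])

# Keep only the Transfer rows, count how often each cycle appears,
# and convert the resulting Series to a regular dict
transfer_counts = (
    df.loc[df['Category'] == 'Transfer', 'Cycle']
      .value_counts()
      .to_dict()
)
print(transfer_counts)
```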

Answer 2

You can use collections.Counter for an O(n) solution.

from collections import Counter

c = Counter()

for cat, time, cycle in lst:
    if cat == 'Transfer':
        c[cycle] += 1

Result

Counter({'Cycle 1': 3,
         'Cycle 2': 5})

Setup

lst =  [('Transfer', '0:33:20', 'Cycle 1'),                 
        ('Transfer', '0:33:10', 'Cycle 1'),        
        ('Download', '0:09:10', 'Cycle 1'),        
        ('Transfer', '0:33:10', 'Cycle 1'),                 
        ('Download', '0:13:00', 'Cycle 1'),          
        ('Download', '0:12:30', 'Cycle 2'),         
        ('Transfer', '0:33:10', 'Cycle 2'),            
        ('Download', '0:02:00', 'Cycle 2'),          
        ('Transfer', '0:33:00', 'Cycle 2'),            
        ('Transfer', '0:33:00', 'Cycle 2'),             
        ('Transfer', '0:33:00', 'Cycle 2'),          
        ('Transfer', '0:32:40', 'Cycle 2')]

Explanation

Each time the category is "Transfer", increment the count stored under that cycle's key in the collections.Counter object.
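The same idea can be written more compactly by feeding Counter a generator expression, and it extends to counting every (category, cycle) pair in one pass. A minimal sketch, assuming `lst` is the list from the setup; `pair_counts` and `transfer_counts` are illustrative names:

```python
from collections import Counter

# Same sample data as in the setup
lst = [('Transfer', '0:33:20', 'Cycle 1'), ('Transfer', '0:33:10', 'Cycle 1'),
       ('Download', '0:09:10', 'Cycle 1'), ('Transfer', '0:33:10', 'Cycle 1'),
       ('Download', '0:13:00', 'Cycle 1'), ('Download', '0:12:30', 'Cycle 2'),
       ('Transfer', '0:33:10', 'Cycle 2'), ('Download', '0:02:00', 'Cycle 2'),
       ('Transfer', '0:33:00', 'Cycle 2'), ('Transfer', '0:33:00', 'Cycle 2'),
       ('Transfer', '0:33:00', 'Cycle 2'), ('Transfer', '0:32:40', 'Cycle 2')]

# Count only the Transfer rows, keyed by cycle
transfer_counts = Counter(cycle for cat, _, cycle in lst if cat == 'Transfer')

# Count every (category, cycle) combination in a single pass
pair_counts = Counter((cat, cycle) for cat, _, cycle in lst)

print(transfer_counts)
print(pair_counts[('Download', 'Cycle 1')])
```

Keying on the pair avoids a second loop if you later need the Download counts as well.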

Answer 3

You can do it with pandas:

import pandas as pd

df = pd.DataFrame([("Transfer", "0:33:20", "Cycle 1"),
("Transfer", "0:33:10", "Cycle 1"),
("Download", "0:09:10", "Cycle 1"),
("Transfer", "0:33:10", "Cycle 1"),
("Download", "0:13:00", "Cycle 1"),
("Download", "0:12:30", "Cycle 2"),
("Transfer", "0:33:10", "Cycle 2"),
("Download", "0:02:00", "Cycle 2"),
("Transfer", "0:33:00", "Cycle 2"),
("Transfer", "0:33:00", "Cycle 2"),
("Transfer", "0:33:00", "Cycle 2"),
("Transfer", "0:32:40", "Cycle 2")])

# Keep only the Transfer rows (column 0), then count rows per cycle (column 2)
counts = df[df[0] == "Transfer"].groupby(2).size()

counts["Cycle 1"]
counts["Cycle 2"]

Answer 4

You can use a dictionary to hold the results:

result = {}

for var, _, cycle in my_tup:
    if var == 'Transfer':
        try:
            result[cycle] += 1
        except KeyError:
            result[cycle] = 1

Then result looks like:

{'Cycle 1': 3, 'Cycle 2': 5}
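If you prefer to avoid the try/except, `dict.get` with a default of 0 does the same job. This is just a sketch of that alternative, with sample data assumed to match the question's `my_tup`:

```python
# Sample data, assumed to mirror the question's my_tup
my_tup = (('Transfer', '0:33:20', 'Cycle 1'), ('Transfer', '0:33:10', 'Cycle 1'),
          ('Download', '0:09:10', 'Cycle 1'), ('Transfer', '0:33:10', 'Cycle 1'),
          ('Download', '0:13:00', 'Cycle 1'), ('Download', '0:12:30', 'Cycle 2'),
          ('Transfer', '0:33:10', 'Cycle 2'), ('Download', '0:02:00', 'Cycle 2'),
          ('Transfer', '0:33:00', 'Cycle 2'), ('Transfer', '0:33:00', 'Cycle 2'),
          ('Transfer', '0:33:00', 'Cycle 2'), ('Transfer', '0:32:40', 'Cycle 2'))

result = {}

for var, _, cycle in my_tup:
    if var == 'Transfer':
        # get() returns 0 the first time a cycle is seen, so no KeyError handling is needed
        result[cycle] = result.get(cycle, 0) + 1

print(result)
```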

Answer 5

Sort and group by the first and last items of each tuple; then iterate over the groups and add the Transfer groups to a dictionary.

import operator, itertools

a = [('Transfer', '0:33:20', 'Cycle 1'),('Transfer', '0:33:10', 'Cycle 1'),
     ('Download', '0:09:10', 'Cycle 1'),('Transfer', '0:33:10', 'Cycle 1'),
     ('Download', '0:13:00', 'Cycle 1'),('Download', '0:12:30', 'Cycle 2'),
     ('Transfer', '0:33:10', 'Cycle 2'),('Download', '0:02:00', 'Cycle 2'),
     ('Transfer', '0:33:00', 'Cycle 2'),('Transfer', '0:33:00', 'Cycle 2'),
     ('Transfer', '0:33:00', 'Cycle 2'),('Transfer', '0:32:40', 'Cycle 2')]

key = operator.itemgetter(0,2)

a.sort(key=key)
d = {}
for (direction, cycle), group in itertools.groupby(a, key):
    g = list(group)
    if direction == 'Transfer':
        d[cycle] = len(g)
    #print(direction, cycle, g)

>>> d
{'Cycle 1': 3, 'Cycle 2': 5}
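Note that itertools.groupby only merges adjacent items with equal keys, which is why the data has to be sorted by the same key first. If you would rather not reorder the original list in place, a variation using sorted() might look like this (a sketch on a copy of the sample data; the counting via sum() is my own substitution for len(list(group))):

```python
import itertools
import operator

a = [('Transfer', '0:33:20', 'Cycle 1'), ('Transfer', '0:33:10', 'Cycle 1'),
     ('Download', '0:09:10', 'Cycle 1'), ('Transfer', '0:33:10', 'Cycle 1'),
     ('Download', '0:13:00', 'Cycle 1'), ('Download', '0:12:30', 'Cycle 2'),
     ('Transfer', '0:33:10', 'Cycle 2'), ('Download', '0:02:00', 'Cycle 2'),
     ('Transfer', '0:33:00', 'Cycle 2'), ('Transfer', '0:33:00', 'Cycle 2'),
     ('Transfer', '0:33:00', 'Cycle 2'), ('Transfer', '0:32:40', 'Cycle 2')]

# Group on (category, cycle), i.e. items 0 and 2 of each tuple
key = operator.itemgetter(0, 2)

d = {}
# sorted() returns a new list, so the order of `a` is left untouched
for (direction, cycle), group in itertools.groupby(sorted(a, key=key), key):
    if direction == 'Transfer':
        # Count the group's items without building an intermediate list
        d[cycle] = sum(1 for _ in group)

print(d)
```

The sort makes this O(n log n), so the Counter or dictionary approaches are the lighter choice when all you need is the counts.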

Counting tuples that match multiple conditions