I've only been learning Python for a short while, and I'm trying to present my data in a cleaner form than before. I now have some data stored as tuples.
I want to know how many items each person bought.
Assume that different names are different people.
So how can I get that information from data like the following:
('John', '5', 'Coke')
('Mary', '1', 'Pie')
('Jack', '3', 'Milk')
('Mary', '2', 'Water')
('John', '3', 'Coke')
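I don't know what to do next. I can't come up with any approach, not even a naive one.
Answer 0: (score: 8)
I suggest using the name and the drink together as the key of a collections.Counter: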
from collections import Counter

count = Counter()
for name, amount, drink in tuples:
    key = name, drink
    count.update({key: int(amount)})  # increment the value

# represent the aggregated data
for (name, drink), amount in count.items():
    print('{}: {} {}'.format(name, amount, drink))
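Update: I did some simple measurements and found that
count[name, drink] += value
is not only more readable but also much faster than calling update, which should not come as a surprise. In addition, defaultdict(int) is faster still (roughly twice as fast), presumably because Counter also performs some ordering.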
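The measurement code itself is not shown in the answer, so the following is only a sketch of how such a comparison could be set up with timeit. The function names, the repeated sample data and the repetition count are assumptions made for illustration; absolute timings depend on the machine and the Python version.

```python
import timeit
from collections import Counter, defaultdict

# Sample data from the question, repeated so the timings are measurable
# (the repetition factor is an arbitrary assumption).
tuples = [
    ('John', '5', 'Coke'),
    ('Mary', '1', 'Pie'),
    ('Jack', '3', 'Milk'),
    ('Mary', '2', 'Water'),
    ('John', '3', 'Coke'),
] * 200


def with_update():
    """Aggregate by calling Counter.update with a one-item dict."""
    count = Counter()
    for name, amount, drink in tuples:
        count.update({(name, drink): int(amount)})
    return count


def with_counter_add():
    """Aggregate with += on a Counter."""
    count = Counter()
    for name, amount, drink in tuples:
        count[name, drink] += int(amount)
    return count


def with_defaultdict():
    """Aggregate with += on a defaultdict(int)."""
    count = defaultdict(int)
    for name, amount, drink in tuples:
        count[name, drink] += int(amount)
    return count


if __name__ == '__main__':
    for func in (with_update, with_counter_add, with_defaultdict):
        seconds = timeit.timeit(func, number=20)
        print('{:<18} {:.4f}s'.format(func.__name__, seconds))
```

All three functions build the same totals, so the choice between them is a matter of readability and speed rather than of results.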
Answer 1: (score: 2)
Rearranging the order of the data may help:
John: 8 Coke
Mary: 1 Pie
Mary: 2 Water
Jack: 3 Milk
It may be more insightful when written as
(John, Coke) : 8
(Mary, Pie) : 1
(Mary, Water): 2
(Jack, Milk) : 3
If you know SQL, this is more or less equivalent to groupby(name, dish) and sum(count).
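To make the SQL analogy concrete, here is a minimal sketch using Python's built-in sqlite3 module with an in-memory database. The table name `orders` and the column name `quantity` are assumptions chosen for illustration; the point is only that GROUP BY over the (name, dish) pair combined with SUM yields one total per pair.

```python
import sqlite3

data = [
    ('John', '5', 'Coke'),
    ('Mary', '1', 'Pie'),
    ('Jack', '3', 'Milk'),
    ('Mary', '2', 'Water'),
    ('John', '3', 'Coke'),
]

conn = sqlite3.connect(':memory:')
# Hypothetical table layout, used only for this illustration.
conn.execute('CREATE TABLE orders (name TEXT, quantity INTEGER, dish TEXT)')
conn.executemany(
    'INSERT INTO orders (name, quantity, dish) VALUES (?, ?, ?)',
    [(name, int(count), dish) for name, count, dish in data],
)

# One row per (name, dish) pair, with the quantities summed.
rows = conn.execute(
    'SELECT name, dish, SUM(quantity) FROM orders '
    'GROUP BY name, dish ORDER BY name, dish'
).fetchall()
conn.close()

for name, dish, total in rows:
    print('({}, {}): {}'.format(name, dish, total))
```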
So, in Python, you can create a dictionary keyed by that pair:
data = [
    ('John', '5', 'Coke'),
    ('Mary', '1', 'Pie'),
    ('Jack', '3', 'Milk'),
    ('Mary', '2', 'Water'),
    ('John', '3', 'Coke'),
]

orders = {}
for name, count, dish in data:
    if (name, dish) in orders:
        orders[(name, dish)] += int(count)
    else:
        # first entry
        orders[(name, dish)] = int(count)
More Pythonic is to use collections.defaultdict:
from collections import defaultdict

orders = defaultdict(int)  # missing keys start at 0
for name, count, dish in data:
    orders[(name, dish)] += int(count)
Or, as @bereal pointed out, collections.Counter.
Format the data as needed.
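As one possible way to do that formatting, the sketch below rebuilds the aggregated dictionary and renders it in two shapes: one text line per pair, and tuples in the same (name, amount, item) layout as the input. The names `lines` and `as_tuples` are assumptions for illustration, and converting the total back to a string is a choice made only to match the input layout.

```python
from collections import defaultdict

data = [
    ('John', '5', 'Coke'),
    ('Mary', '1', 'Pie'),
    ('Jack', '3', 'Milk'),
    ('Mary', '2', 'Water'),
    ('John', '3', 'Coke'),
]

orders = defaultdict(int)
for name, count, dish in data:
    orders[(name, dish)] += int(count)

# One text line per (name, dish) pair. On Python 3.7+ a dict keeps
# insertion order, so pairs appear in the order they were first seen.
lines = ['{}: {} {}'.format(name, total, dish)
         for (name, dish), total in orders.items()]
print('\n'.join(lines))

# Same tuple layout as the input, with the total converted back to a string.
as_tuples = [(name, str(total), dish)
             for (name, dish), total in orders.items()]
print(as_tuples)
```

If a stable alphabetical order is preferred, iterating over `sorted(orders.items())` instead of `orders.items()` would give that.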
Answer 2: (score: 1)
Suppose you have a list of tuples:
tuples = [('John', '5', 'Coke'),
          ('Mary', '1', 'Pie'),
          ('Jack', '3', 'Milk'),
          ('Mary', '2', 'Water'),
          ('John', '3', 'Coke')]

memory = {}
# First, we calculate the amount for each pair
for entry in tuples:
    # The key is the (name, item) pair, for example
    # ('John', 'Coke'), ('Mary', 'Pie'), ('Jack', 'Milk'), ...
    key = (entry[0], entry[2])
    number = int(entry[1])
    if key in memory:
        memory[key] += number
    else:
        memory[key] = number

# Afterwards, we format the information as (name, total, item) tuples.
# The names entry and result avoid shadowing the built-ins tuple and list.
result = []
for key in memory:
    result.append((key[0], memory[key], key[1]))