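What I'm trying to do here is get a report of how many emails different users have sent (using a MailChimp-like application), but I want two different metrics in a single query. I want to know how many emails each user sent in total: if they sent 3 emails to 100 contacts each, that should show 300. But I also want to know how many unique emails were sent, which in that case would show 3.
I'd like to get something like this: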
-------------------------------------------------------------
| Full Name | Username | Total Sent | Unique Mails |
|-------------|-----------------|------------|--------------|
| John Doe | jdoe@mail.com | 12000 | 4 |
| James Smith | jsmith@mail.com | 6000 | 12 |
| Jane Jones | jjones@mail.com | 4000 | 2 |
| ... | ... | ... | ... |
-------------------------------------------------------------
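That way I'd know that John sent a few emails to a lot of contacts, while James sent more emails to fewer contacts.
Here is what my query looks like. I've changed the table and column names, but otherwise it is an exact representation.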
SELECT
CONCAT(Usernames.FirstName, ' ', Usernames.LastName) AS 'Full Name',
Usernames.Username,
COUNT(Sent_Mail_Contacts.IDContact) AS `Total Sent`,
COUNT(Mass_Mail.IDMass_Mail) AS `Individual E-Mails`
FROM Usernames
LEFT JOIN Sent_Mail_Contacts ON Usernames.Username = Sent_Mail_Contacts.Username
LEFT JOIN Mass_Mail ON Usernames.Username = Mass_Mail.Username
GROUP BY Usernames.Username
ORDER BY `Total Sent`
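I have one table with the usernames, one with the individual contacts each mail was sent to, and one with the unique emails.
So does my query make sense? Is this even possible? Because right now, when I run it, it gives me something like this: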
-------------------------------------------------------------
| Full Name | Username | Total Sent | Unique Mails |
|-------------|-----------------|------------|--------------|
| John Doe | jdoe@mail.com | 12000 | 12000 |
| James Smith | jsmith@mail.com | 6000 | 6000 |
| Jane Jones | jjones@mail.com | 4000 | 4000 |
| ... | ... | ... | ... |
-------------------------------------------------------------
It gives me the same number in both columns, and it takes 7 minutes to process.
Here are samples of the three tables, if that helps:
Usernames
------------------------------------------------
| Username | FirstName | LastName | ... |
|-----------------|-----------|----------|-----|
| jdoe@mail.com | John | Doe | ... |
| jsmith@mail.com | James | Smith | ... |
| jjones@mail.com | Jane | Jones | ... |
| ... | ... | ... | ... |
------------------------------------------------
Mass_Mail
----------------------------------------------------
| ID_Mass_Mail | Username | Date | ... |
|--------------|----------------|------------|-----|
| 1 | jdoe@mail.com | 2019-01-16 | ... |
| 2 | jdoe@mail.com | 2019-01-29 | ... |
| 3 | jjones@mail.com | 2019-02-14 | ... |
| ... | ... | ... | ... |
----------------------------------------------------
Sent_Mail_Contacts
---------------------------------------------------------------------
| ID_Mass_Mail | Username | Contact_ID | Contact_Email | ... |
|--------------|----------------|------------|----------------|-----|
| 1 | jdoe@mail.com | 1 | bob@mail.com | ... |
| 1 | jdoe@mail.com | 2 | jim@mail.com | ... |
| 1 | jdoe@mail.com | 3 | cindy@mail.com | ... |
| ... | ... | ... | ... | ... |
| 2 | jdoe@mail.com | 4 | mike@mail.com | ... |
| 2 | jdoe@mail.com | 2 | jim@mail.com | ... |
| 2 | jdoe@mail.com | 3 | cindy@mail.com | ... |
| ... | ... | ... | ... | ... |
---------------------------------------------------------------------
Answer 0 (score: 1)
Use COUNT(DISTINCT ...):
SELECT
CONCAT(Usernames.FirstName, ' ', Usernames.LastName) AS 'Full Name',
Usernames.Username,
COUNT(Sent_Mail_Contacts.IDContact) AS `Total Sent`,
COUNT(DISTINCT Mass_Mail.IDMass_Mail) AS `Individual E-Mails`
FROM Usernames
LEFT JOIN Sent_Mail_Contacts ON Usernames.Username = Sent_Mail_Contacts.Username
LEFT JOIN Mass_Mail ON Usernames.Username = Mass_Mail.Username
GROUP BY Usernames.Username
ORDER BY `Total Sent`
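NB: this will not make the query any faster, though. As a first step, you should at least make sure the JOINs use primary key / foreign key relationships: Usernames(Username), Sent_Mail_Contacts(Username), Mass_Mail(Username).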
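A minimal sketch of why the two plain COUNTs come out equal and what DISTINCT changes. It runs on Python's built-in sqlite3 rather than MySQL, with made-up rows, and it assumes the column names from the sample tables (ID_Mass_Mail, Contact_ID) instead of the ones in the query. In this toy data, joining both tables on Username pairs every contact row with every mail row, so the non-distinct count is multiplied as well; counting each table in its own subquery is one possible way around that, not necessarily what the answer intended.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Usernames (Username TEXT PRIMARY KEY, FirstName TEXT, LastName TEXT);
CREATE TABLE Mass_Mail (ID_Mass_Mail INTEGER PRIMARY KEY, Username TEXT);
CREATE TABLE Sent_Mail_Contacts (ID_Mass_Mail INTEGER, Username TEXT, Contact_ID INTEGER);
INSERT INTO Usernames VALUES ('jdoe@mail.com', 'John', 'Doe');
-- Illustrative data: 2 mails, each sent to 3 contacts (6 sends in total).
INSERT INTO Mass_Mail VALUES (1, 'jdoe@mail.com'), (2, 'jdoe@mail.com');
INSERT INTO Sent_Mail_Contacts VALUES
  (1, 'jdoe@mail.com', 1), (1, 'jdoe@mail.com', 2), (1, 'jdoe@mail.com', 3),
  (2, 'jdoe@mail.com', 4), (2, 'jdoe@mail.com', 2), (2, 'jdoe@mail.com', 3);
""")

# Both joins on Username: 6 contact rows x 2 mail rows = 12 joined rows.
joined = conn.execute("""
SELECT COUNT(smc.Contact_ID),
       COUNT(mm.ID_Mass_Mail),
       COUNT(DISTINCT mm.ID_Mass_Mail)
FROM Usernames u
LEFT JOIN Sent_Mail_Contacts smc ON u.Username = smc.Username
LEFT JOIN Mass_Mail mm ON u.Username = mm.Username
GROUP BY u.Username
""").fetchone()

# Counting each table separately before joining avoids the multiplication.
pre_aggregated = conn.execute("""
SELECT COALESCE(s.total_sent, 0), COALESCE(m.unique_mails, 0)
FROM Usernames u
LEFT JOIN (SELECT Username, COUNT(*) AS total_sent
           FROM Sent_Mail_Contacts GROUP BY Username) s
       ON s.Username = u.Username
LEFT JOIN (SELECT Username, COUNT(*) AS unique_mails
           FROM Mass_Mail GROUP BY Username) m
       ON m.Username = u.Username
""").fetchone()

print(joined)          # (12, 12, 2)
print(pre_aggregated)  # (6, 2)
```

If the real data behaves like this sample, it would be worth checking whether Total Sent matches a plain COUNT(*) on Sent_Mail_Contacts for one user.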
Answer 1 (score: 1)
Assuming the values in IDMass_Mail identify unique emails, you only need to edit the last COUNT so that it uses the DISTINCT keyword.
COUNT(DISTINCT Mass_Mail.IDMass_Mail) AS `Individual E-Mails`
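This returns the number of unique values, grouped by Username.
If you are able to add indexes on the Username column in the Sent_Mail_Contacts and Mass_Mail tables, that should also improve performance.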
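As a rough illustration of the indexing suggestion, the sketch below creates an index on Username in both tables and asks the engine whether a lookup by Username would use it. It uses Python's built-in sqlite3 as a stand-in for MySQL; the index names idx_mm_username and idx_smc_username are hypothetical. The CREATE INDEX statements have the same form in MySQL, where EXPLAIN would take the place of EXPLAIN QUERY PLAN.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Mass_Mail (ID_Mass_Mail INTEGER PRIMARY KEY, Username TEXT);
CREATE TABLE Sent_Mail_Contacts (ID_Mass_Mail INTEGER, Username TEXT, Contact_ID INTEGER);
-- Hypothetical index names; one index per table on the join column.
CREATE INDEX idx_mm_username ON Mass_Mail (Username);
CREATE INDEX idx_smc_username ON Sent_Mail_Contacts (Username);
""")

# Ask SQLite how it would execute a per-user lookup.
plan = conn.execute(
    "EXPLAIN QUERY PLAN "
    "SELECT COUNT(*) FROM Sent_Mail_Contacts WHERE Username = ?",
    ("jdoe@mail.com",),
).fetchall()

# The fourth column of each plan row holds the human-readable detail.
plan_text = " ".join(row[3] for row in plan)
print(plan_text)
```

The printed plan should mention idx_smc_username when the index is picked up; the exact wording differs between SQLite versions.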
Answer 2 (score: 1)
I managed to do this with a single query (the actual table and column names have been changed for privacy reasons).
SELECT
Account.Account_Name AS `account`,
Usernames.Username AS `username`,
COUNT(Mass_Mail_Reached_Contacts.ID_Contact) AS `total_emails`,
COUNT(Mass_Mail_Reached_Contacts.ID_Mass_Mail) /
(
SELECT COUNT(*)
FROM
Mass_Mail_Reached_Contacts
WHERE
Mass_Mail_Reached_Contacts.DATE >= '2019-02-01'
AND
Mass_Mail_Reached_Contacts.DATE <= '2019-02-28'
)
* 100 AS `%`,
COUNT(DISTINCT Mass_Mail.ID_Mass_Mail) AS `unique_emails`,
COUNT(Mass_Mail_Reached_Contacts.ID_Mass_Mail) /
COUNT(DISTINCT Mass_Mail.ID_Mass_Mail)
AS `avg_contacts_per_email`
FROM
Usernames
LEFT JOIN Mass_Mail_Reached_Contacts ON Mass_Mail_Reached_Contacts.Username = Usernames.Username
LEFT JOIN Account ON Account.ID_Account = Usernames.ID_Account
LEFT JOIN Mass_Mail ON Mass_Mail.ID_Mass_Mail = Mass_Mail_Reached_Contacts.ID_mass_mail
WHERE
Mass_Mail_Reached_Contacts.DATE >= '2019-02-01'
AND
Mass_Mail_Reached_Contacts.DATE <= '2019-02-28'
GROUP BY
Usernames.Username
HAVING COUNT(DISTINCT Mass_Mail.ID_Mass_Mail) > 0
ORDER BY
`total_emails` DESC
I can now get a table like this:
Emails Stats
--------------------------------------------------------------------------------------
| account | username | total_emails | % | unique_emails | avg_contacts_per_email |
|----------|--------------|--------------|-------|---------------|------------------------|
| Bob inc. | bob@mail.com | 28,550 | 14.52 | 12 | 2379.17 |
| ... | ... | ... | ... | ... | ... |
--------------------------------------------------------------------------------------
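Answer 3 (score: 0)
To start with: why do both Mass_Mail and Sent_Mail_Contacts contain Username? That looks redundant. Or can Sent_Mail_Contacts.ID_Mass_Mail be null?
At least for this query, I think we can ignore the Username in Sent_Mail_Contacts entirely. What really links the two tables is ID_Mass_Mail, and you forgot this join condition in your query.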
select
concat_ws(' ', u.firstname, u.lastname) as full_name,
u.username,
-- one joined row per contact reached
count(smc.id_mass_mail) as total_sent,
-- each mail repeats once per contact, so count it only once
count(distinct mm.id_mass_mail) as individual_e_mails
from usernames u
left join mass_mail mm on mm.username = u.username
-- contacts are linked to a mail through id_mass_mail
left join sent_mail_contacts smc on smc.id_mass_mail = mm.id_mass_mail
group by u.username
order by total_sent;
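To check the idea of linking the contacts to the mails through ID_Mass_Mail, here is a small sketch on made-up rows using Python's built-in sqlite3 (not MySQL). It assumes the column names from the sample tables and counts the mail id with DISTINCT, because each mail row repeats once per contact after the join.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Usernames (Username TEXT PRIMARY KEY);
CREATE TABLE Mass_Mail (ID_Mass_Mail INTEGER PRIMARY KEY, Username TEXT);
CREATE TABLE Sent_Mail_Contacts (ID_Mass_Mail INTEGER, Contact_ID INTEGER);
INSERT INTO Usernames VALUES ('jdoe@mail.com'), ('jsmith@mail.com'), ('jjones@mail.com');
-- Illustrative data: jdoe has 2 mails with 3 contacts each,
-- jjones has 1 mail with 1 contact, jsmith has sent nothing.
INSERT INTO Mass_Mail VALUES (1, 'jdoe@mail.com'), (2, 'jdoe@mail.com'), (3, 'jjones@mail.com');
INSERT INTO Sent_Mail_Contacts VALUES
  (1, 1), (1, 2), (1, 3),
  (2, 4), (2, 2), (2, 3),
  (3, 5);
""")

rows = conn.execute("""
SELECT u.Username,
       COUNT(smc.ID_Mass_Mail)         AS total_sent,
       COUNT(DISTINCT mm.ID_Mass_Mail) AS individual_e_mails
FROM Usernames u
LEFT JOIN Mass_Mail mm ON mm.Username = u.Username
LEFT JOIN Sent_Mail_Contacts smc ON smc.ID_Mass_Mail = mm.ID_Mass_Mail
GROUP BY u.Username
ORDER BY total_sent
""").fetchall()

for row in rows:
    print(row)
# ('jsmith@mail.com', 0, 0)
# ('jjones@mail.com', 1, 1)
# ('jdoe@mail.com', 6, 2)
```

Because the contacts hang off the mail rather than the user, each contact row is joined exactly once, so total_sent is not multiplied by the number of mails.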