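We have a problem. Our customers are complaining that they receive duplicate emails in their inboxes. On some days, up to 5 or 6 copies of exactly the same email arrive at the same time. We can't figure out why. The code has been rewritten at least once, but the problem persists.
I'll try to explain... but it's a bit complicated :O(
Every night (early morning), we want to send our users a daily report containing usage statistics. So we have a cron job: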
<cron>
<url>/redacted/report/url</url>
<description>Send out daily reports to active subscribers</description>
<schedule>every 2 hours</schedule>
</cron>
The cron job hits the servlet's GET method:
protected void doGet(HttpServletRequest req, HttpServletResponse resp) throws ServletException, IOException {
    AccountFilter filter = AccountFilter.forWebSafeName(req.getParameter("filter"));
    createTasks(filter, null);
}
This calls the createTasks method with a null cursor:
private void createTasks(AccountFilter accountFilter, String cursor) {
    try {
        PagedResults<Account> pagedAccounts = accountRepository.getAccounts(accountFilter.getFilter(), 50, cursor);
        createTaskBatch(pagedAccounts);
        // If there are still more results in cursor, then send cursor back to this servlet's doPost method so we don't hit the request time limit
        if (pagedAccounts.getCursor() != null) {
            getQueue(QUEUE_NAME).add(withUrl(WORKER_URL).param(CURSOR_KEY, pagedAccounts.getCursor()).param(FILTER_KEY, accountFilter.getWebSafeName()));
        }
    } catch (Exception ex) {
        logger.log(Level.WARNING, "Problem creating daily report task batch for filter " + accountFilter.getWebSafeName(), ex);
    }
}
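It fetches 50 accounts and iterates over them, creating a new queued job for each email that should be sent at this time. Before a new queued task is created, there is code that explicitly checks the last-sent timestamp and updates it. If anything, this should err on the side of not sending a report rather than sending duplicates: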
private void createTaskBatch(PagedResults<Account> pagedAccounts) {
    // GAE datastore query might return duplicate results?!
    List<Account> list = pagedAccounts.getResults();
    Set<Account> noDuplicates = new HashSet<>(list);
    int dups = list.size() - noDuplicates.size();
    if (dups > 0) {
        logger.warning("Accounts paged results contained " + dups + " duplicates!");
    }
    for (Account account : noDuplicates) {
        try {
            if (lastReportSentOver12HoursAgo(account)) {
                List<Parent> parents = parentRepository.getVerifiedParentsForAccount(account.getId());
                if (eitherParentSubscribed(parents)) {
                    List<AccountUser> users = accountUserRepository.listUsers(account.getId());
                    List<Device> devices = getUserDevices(account, users);
                    if (!devices.isEmpty()) {
                        DateTimeZone tz = getMostCommonTimezone(devices);
                        if (null == tz) {
                            logger.warning("No timezone found for account: " + account.getId());
                        } else {
                            // Send early in the morning as the report contains the previous day's stats
                            if (now(tz).getHourOfDay() < 7) {
                                // mark sent now because queue might not be processed for a while
                                // and the next cursor set might contain some of the same accounts
                                accountRepository.markReportSent(account.getId(), now());
                                getQueue(QUEUE_NAME).add(withUrl(DailyReportServlet.WORKER_URL).param(DailyReportServlet.ACCOUNT_ID, account.getId()).param(DailyReportServlet.COMMON_TIMEZONE, tz.getID()));
                            }
                        }
                    }
                }
            }
        } catch (Exception ex) {
            logger.log(Level.WARNING, "Problem creating daily report task for " + account.getId(), ex);
        }
    }
}
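Note that the check (lastReportSentOver12HoursAgo) and the update (markReportSent) above are two separate steps, so two requests handling the same account at the same moment could in principle both pass the check before either one records the send. The following is only a minimal sketch of an atomic check-and-mark, not our actual code: ReportClaims and tryClaim are hypothetical names, and the in-memory map merely stands in for accountRepository. On App Engine the same idea would need a datastore transaction, which is not shown here.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical in-memory stand-in for the per-account "last report sent" store. */
class ReportClaims {
    // Same 12-hour gap that lastReportSentOver12HoursAgo is assumed to enforce.
    private static final Duration MIN_GAP = Duration.ofHours(12);

    private final ConcurrentMap<String, Instant> lastSent = new ConcurrentHashMap<>();

    /**
     * Checks the last-sent time and records {@code now} in one atomic step.
     * Returns true only for the caller that actually recorded the new timestamp.
     */
    boolean tryClaim(String accountId, Instant now) {
        boolean[] claimed = {false};
        // compute() runs atomically per key, so concurrent callers are serialized here.
        lastSent.compute(accountId, (id, previous) -> {
            if (previous == null || Duration.between(previous, now).compareTo(MIN_GAP) > 0) {
                claimed[0] = true;
                return now;       // report is due: record the claim
            }
            return previous;      // too recent: keep the old timestamp
        });
        return claimed[0];
    }
}
```

With something like this, the send task would be enqueued only when tryClaim returns true, so a second request for the same account within the 12-hour window would be skipped.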
The servlet's POST method handles subsequent pages of results via the cursor:
public void doPost(HttpServletRequest req, HttpServletResponse resp) throws IOException {
    AccountFilter accountFilter = AccountFilter.forWebSafeName(req.getParameter(FILTER_KEY));
    logger.log(Level.INFO, "doPost hit from task queue with filter " + accountFilter.getWebSafeName());
    String cursor = req.getParameter(CURSOR_KEY);
    createTasks(accountFilter, cursor);
}
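There is another servlet that handles each report task; it just builds the email content and calls send on the com.sendgrid.SendGrid class.
Eventual consistency in the datastore seems a possible candidate, but that should resolve within a few seconds, and I don't see how it would explain the number of customers complaining or the number of duplicates some customers see.
Help! Any ideas? Are we being stupid somewhere?
UPDATE
For clarity... the email-sending task queue eventually ends up in this method, which catches exceptions and reports them to us. We see no exceptions for the duplicate cases: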
private void sendReport(Account account, DateTimeZone tz) throws IOException, EntityNotFoundException {
    try {
        boolean sent = false;
        Map<String, Object> root = buildEmailData(account, tz);
        for (Parent parent : parentRepository.getVerifiedParentsForAccount(account.getId())) {
            if (parent.getEmailPreferences().isSubscribedReports()) {
                emailBuilder.send(account, parent, root, "report", EmailSender.NOTIFICATION);
                sent = true;
            }
        }
        if (sent) {
            accountRepository.markReportSent(account.getId(), now());
        }
    } catch (Exception ex) {
        String message = "Problem building report email for account " + account.getId();
        logger.log(Level.WARNING, message, ex);
        new TeamNotificationEvent(message + " : exception: " + ex.getMessage()).fire();
        throw new IOException(message, ex);
    }
}
UPDATE 2, after adding extra debug logging
I'm seeing two POSTs on the same task queue at the same time with the same cursor:
09:35:08.397 2015-04-30 200 0 B 3.78s /ws/notification/daily-report-task-creator
    0.1.0.2 - - [30/Apr/2015:01:35:08 -0700] "POST /ws/notification/daily-report-task-creator HTTP/1.1" 200 0 "http://screentimelabs.appspot.com/ws/notification/daily-report-task-creator" "AppEngine-Google; (+http://code.google.com/appengine)" "screentimelabs.appspot.com" ms=3782 cpu_ms=662 queue_name=dailyReports task_name=8168414365365326983 instance=00c61b117c33a909790f0d1882657e04f40b2c7e app_engine_release=1.9.20
    09:35:04.618 com.screentime.service.taskqueue.reports.DailyReportTaskCreatorServlet createTasks: createTasks called for filter: ACTIVE with cursor: E-ABAIICO2oQc35zY3JlZW50aW1lbGFic3InCxIHQWNjb3VudCIaamFybW8ua2Fya2thaW5lbkBnbWFpbC5jb20MiAIAFA
09:35:08.432 2015-04-30 200 0 B 8.84s /ws/notification/daily-report-task-creator
    0.1.0.2 - - [30/Apr/2015:01:35:08 -0700] "POST /ws/notification/daily-report-task-creator HTTP/1.1" 200 0 "http://screentimelabs.appspot.com/ws/notification/daily-report-task-creator" "AppEngine-Google; (+http://code.google.com/appengine)" "screentimelabs.appspot.com" ms=8837 cpu_ms=1348 queue_name=dailyReports task_name=50170612326424582061 instance=00c61b117c2bffe8de313e96fea8aeb813f4b20f app_engine_release=1.9.20 trace_id=7e5c0348382e66cf4e2c6ba400529fb7
    09:34:59.608 com.screentime.service.taskqueue.reports.DailyReportTaskCreatorServlet createTasks: createTasks called for filter: ACTIVE with cursor: E-ABAIICO2oQc35zY3JlZW50aW1lbGFic3InCxIHQWNjb3VudCIaamFybW8ua2Fya2thaW5lbkBnbWFpbC5jb20MiAIAFA
Searching for one particular account ID, I see these requests:
09:35:08.397 2015-04-30 200 0 B 3.78s /ws/notification/daily-report-task-creator
09:35:08.432 2015-04-30 200 0 B 8.84s /ws/notification/daily-report-task-creator
09:35:08.443 2015-04-30 200 0 B 6.73s /ws/notification/daily-report-task-creator
09:35:10.541 2015-04-30 200 0 B 4.03s /ws/notification/daily-report-task-creator
09:35:10.690 2015-04-30 200 0 B 11.09s /ws/notification/daily-report-task-creator
09:35:13.678 2015-04-30 200 0 B 862ms /ws/notification/daily-report-worker
09:35:13.829 2015-04-30 500 0 B 1.21s /ws/notification/daily-report-worker
09:35:14.677 2015-04-30 200 0 B 1.56s /ws/notification/daily-report-worker
09:35:14.961 2015-04-30 200 0 B 346ms /ws/notification/daily-report-worker
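Some of them have the same cursor value.
Answer 0 (score: 0)
I'm guessing here, since I haven't seen the task queue code. You may not be handling errors correctly in the task queue. If a task ends with an error, GAE will re-queue it. So even if some of the emails have already been sent, the task will still run again. You need a way to remember what the task has already processed, so that a retry doesn't process it again.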
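The idea of remembering what has already been processed can be sketched roughly as follows. This is an illustration under assumptions, not the asker's code: SentLedger, IdempotentReportTask and the key format are hypothetical, the Consumer stands in for whatever actually delivers one email, and the in-memory set stands in for a persistent store such as the datastore. Each recipient is claimed before the send and released again if the send throws, so a retried task only emails recipients that have not been served yet.

```java
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Consumer;

/** Hypothetical record of (account, day, recipient) combinations already emailed. */
class SentLedger {
    private final Set<String> sent = ConcurrentHashMap.newKeySet();

    private static String key(String accountId, String reportDay, String recipient) {
        return accountId + "|" + reportDay + "|" + recipient;
    }

    /** Returns true only the first time this combination is recorded. */
    boolean markIfNew(String accountId, String reportDay, String recipient) {
        return sent.add(key(accountId, reportDay, recipient));
    }

    /** Releases a claim so that a later retry can attempt the send again. */
    void forget(String accountId, String reportDay, String recipient) {
        sent.remove(key(accountId, reportDay, recipient));
    }
}

/** Hypothetical task body that stays safe when the queue retries it. */
class IdempotentReportTask {
    private final SentLedger ledger;
    private final Consumer<String> sender; // assumed: delivers the report to one recipient

    IdempotentReportTask(SentLedger ledger, Consumer<String> sender) {
        this.ledger = ledger;
        this.sender = sender;
    }

    void run(String accountId, String reportDay, List<String> recipients) {
        for (String recipient : recipients) {
            if (!ledger.markIfNew(accountId, reportDay, recipient)) {
                continue; // already handled by an earlier attempt
            }
            try {
                sender.accept(recipient);
            } catch (RuntimeException ex) {
                // Release the claim, then rethrow so the queue retries the task.
                ledger.forget(accountId, reportDay, recipient);
                throw ex;
            }
        }
    }
}
```

If a run fails partway through, the recipients that already received the email stay in the ledger, and the retry picks up only the remaining ones.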