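My task is: "Take the transactions table, group the rows by transaction date, and calculate the statuses. This produces the statistics that will be rendered on the page."
This is my method for generating those statistics: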
    public static function getStatistics(Website $website = null)
    {
        if ($website == null) return [];

        // Load every transaction of this website into memory, newest first.
        $query = \DB::table('transactions')->where("website_id", $website->id)->orderBy("dt", "desc")->get();
        // Group the converted rows by their "dt" value.
        $transitions = collect(static::convertDate($query))->groupBy("dt");
        $statistics = collect();

        foreach ($transitions as $date => $trans) {
            // Count the rows of each status within the current date.
            $subscriptions = $trans->where("status", 'subscribe')->count();
            $unsubscriptions = $trans->where("status", 'unsubscribe')->count();
            $prolongations = $trans->where("status", 'rebilling')->count();
            $redirections = $trans->where("status", 'redirect_to_lp')->count();
            // Guard against division by zero when there were no redirects.
            $conversion = $redirections == 0 ? 0 : ((float) ($subscriptions / $redirections));
            $earnings = $trans->sum("pay");

            $statistics->push((object)[
                "date" => $date,
                "subscriptions" => $subscriptions,
                'unsubscriptions' => $unsubscriptions,
                'prolongations' => $prolongations,
                'redirections' => $redirections,
                'conversion' => round($conversion, 2),
                'earnings' => $earnings,
            ]);
        }

        return $statistics;
    }
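If the number of transaction rows is below 100,000, everything works correctly. But once the count exceeds 150-200k, nginx throws a 502 Bad Gateway. What would you advise? I have no experience with big-data processing. Maybe my implementation is fundamentally wrong?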
Answer 0 (score: 3)
Big data is never easy, but I would suggest using Laravel's chunk instead of get.
https://laravel.com/docs/5.1/eloquent (Ctrl+F "::chunk")
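What ::chunk does is select n rows at a time and let you process them piece by piece, so the whole result set never has to sit in memory at once. This is handy because it lets you stream updates to the browser, but in the ~150k results range I would recommend looking into how to push this work to a background process rather than handling it during the request.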
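As a rough sketch of what piece-by-piece processing could look like, the plain-PHP example below folds each chunk of rows into running per-day totals instead of collecting every row first. The helper names `accumulateChunk` and `finalizeTotals` are hypothetical, the sample rows are made up, and the code assumes `dt` arrives as a "YYYY-MM-DD ..." string; the column names are the ones from the question.

```php
<?php
declare(strict_types=1);

/**
 * Fold one chunk of transaction rows into running per-day totals.
 * Rows are assumed to be objects with dt, status and pay properties.
 */
function accumulateChunk(array $totals, iterable $rows): array
{
    // Map each status value to the counter it increments.
    $counters = [
        'subscribe'      => 'subscriptions',
        'unsubscribe'    => 'unsubscriptions',
        'rebilling'      => 'prolongations',
        'redirect_to_lp' => 'redirections',
    ];

    foreach ($rows as $row) {
        // Assumption: dt is a "YYYY-MM-DD ..." string, so the first 10 chars are the day.
        $day = substr((string) $row->dt, 0, 10);
        if (!isset($totals[$day])) {
            $totals[$day] = [
                'subscriptions' => 0, 'unsubscriptions' => 0,
                'prolongations' => 0, 'redirections' => 0, 'earnings' => 0.0,
            ];
        }
        if (isset($counters[$row->status])) {
            $totals[$day][$counters[$row->status]]++;
        }
        $totals[$day]['earnings'] += (float) $row->pay;
    }

    return $totals;
}

/** Derive the conversion ratio once every chunk has been folded in. */
function finalizeTotals(array $totals): array
{
    foreach ($totals as $day => $t) {
        $totals[$day]['conversion'] = $t['redirections'] === 0
            ? 0.0
            : round($t['subscriptions'] / $t['redirections'], 2);
    }
    krsort($totals); // newest day first, mirroring orderBy("dt", "desc")
    return $totals;
}

// Made-up stand-in data: two "chunks" in place of what chunk() would deliver.
$chunks = [
    [
        (object) ['dt' => '2016-04-02 10:00:00', 'status' => 'redirect_to_lp', 'pay' => 0],
        (object) ['dt' => '2016-04-02 10:05:00', 'status' => 'subscribe', 'pay' => 1.5],
    ],
    [
        (object) ['dt' => '2016-04-02 11:00:00', 'status' => 'redirect_to_lp', 'pay' => 0],
        (object) ['dt' => '2016-04-01 09:00:00', 'status' => 'rebilling', 'pay' => 2.0],
    ],
];

$totals = [];
foreach ($chunks as $rows) {
    $totals = accumulateChunk($totals, $rows);
}
$stats = finalizeTotals($totals);
```

In a Laravel application the `foreach` over `$chunks` would presumably be replaced by a `chunk()` call on the query, with the callback passing each batch to `accumulateChunk`. Memory use then depends on the number of distinct days rather than on the number of rows.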
Answer 1 (score: 1)
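So, after several days of reading up on this problem, I found only one correct answer:
Do not process the raw data in PHP. It is better to do it in SQL!
In our case, we are using PostgreSQL.
Below is the SQL query that helped me; maybe it will help someone else.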
    WITH
    cte_range(dt) AS
    (
        SELECT
            generate_series('2016-04-01 00:00:00'::timestamp with time zone, '{$date} 00:00:00'::timestamp with time zone, INTERVAL '1 day')
    ),
    cte_data AS
    (
        SELECT
            date_trunc('day', dt) AS dt,
            COUNT(*) FILTER (WHERE status = 'subscribe') AS count_subscribes,
            COUNT(*) FILTER (WHERE status = 'unsubscribe') AS count_unsubscribes,
            COUNT(*) FILTER (WHERE status = 'rebilling') AS count_rebillings,
            COUNT(*) FILTER (WHERE status = 'redirect_to_lp') AS count_redirects_to_lp,
            SUM(pay) AS earnings,
            CASE
                WHEN COUNT(*) FILTER (WHERE status = 'redirect_to_lp') > 0 THEN 100.0 * COUNT(*) FILTER (WHERE status = 'subscribe')::float / COUNT(*) FILTER (WHERE status = 'redirect_to_lp')::float
                ELSE 0
            END
            AS conversion_percent
        FROM
            transactions
        WHERE
            website_id = {$website->id}
        GROUP BY
            date_trunc('day', dt)
    )
    SELECT
        to_char(cte_range.dt, 'YYYY-MM-DD') AS day,
        COALESCE(cte_data.count_subscribes, 0) AS count_subscribe,
        COALESCE(cte_data.count_unsubscribes, 0) AS count_unsubscribes,
        COALESCE(cte_data.count_rebillings, 0) AS count_rebillings,
        COALESCE(cte_data.count_redirects_to_lp, 0) AS count_redirects_to_lp,
        COALESCE(cte_data.conversion_percent, 0) AS conversion_percent,
        COALESCE(cte_data.earnings, 0) AS earnings
    FROM
        cte_range
    LEFT JOIN
        cte_data
        ON cte_data.dt = cte_range.dt
    ORDER BY
        cte_range.dt DESC
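The query above interpolates `{$date}` and `{$website->id}` straight into the SQL string. One possible variation, shown here only as a sketch, is to pass both values as bound parameters instead. The function name `buildDailyStatsQuery` is hypothetical, the query is shortened to two aggregates to keep the example small, and the website id and end date in the usage line are made-up values.

```php
<?php
declare(strict_types=1);

/**
 * Build a shortened version of the daily-statistics query that uses
 * positional placeholders instead of interpolated PHP variables.
 * Returns [sql, bindings].
 */
function buildDailyStatsQuery(int $websiteId, string $endDate): array
{
    // Reject anything that is not a real calendar date in YYYY-MM-DD form.
    $parsed = DateTimeImmutable::createFromFormat('!Y-m-d', $endDate);
    if ($parsed === false || $parsed->format('Y-m-d') !== $endDate) {
        throw new InvalidArgumentException("Invalid end date: {$endDate}");
    }

    $sql = <<<'SQL'
WITH cte_range(dt) AS (
    SELECT generate_series(
        CAST('2016-04-01' AS timestamp with time zone),
        CAST(? AS timestamp with time zone),
        INTERVAL '1 day'
    )
),
cte_data AS (
    SELECT
        date_trunc('day', dt) AS dt,
        COUNT(*) FILTER (WHERE status = 'subscribe') AS count_subscribes,
        SUM(pay) AS earnings
    FROM transactions
    WHERE website_id = ?
    GROUP BY date_trunc('day', dt)
)
SELECT
    to_char(cte_range.dt, 'YYYY-MM-DD') AS day,
    COALESCE(cte_data.count_subscribes, 0) AS count_subscribes,
    COALESCE(cte_data.earnings, 0) AS earnings
FROM cte_range
LEFT JOIN cte_data ON cte_data.dt = cte_range.dt
ORDER BY cte_range.dt DESC
SQL;

    // Bindings are listed in the order the placeholders appear in the SQL.
    return [$sql, [$endDate, $websiteId]];
}

// Made-up example values.
[$sql, $bindings] = buildDailyStatsQuery(42, '2016-05-01');
```

With Laravel, the pair could then be run as `\DB::select($sql, $bindings)`, which leaves quoting of the values to the database driver.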