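This question is for all the MySQL experts out there. I have been building an ad network and have been working hard to optimize the main SQL call that fetches the ads. It checks for ads that meet a number of criteria.
This is the current SQL query I am running: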
SELECT
w.nid as w_nid,
w.uid as w_uid,
w.status as w_status,
w.landing_page as w_landing_page,
w.starting_bid as w_starting_bid,
w.daily_budget as w_daily_budget,
w.revshare as w_revshare,
w.filters as w_filters,
w.device as w_device,
w.os as w_os,
w.conversion as w_conversion,
w.max_ctr as w_max_ctr,
w.frequency as w_frequency,
w.ad_title as w_ad_title,
w.ad_desc as w_ad_desc,
w.ad_728x90 as w_ad_728x90,
w.ad_300x250 as w_ad_300x250,
w.ad_160x600 as w_ad_160x600,
w.match_type as w_match_type,
wg.nid as wg_nid,
wg.geo as wg_geo,
wk.keyword as wk_keyword,
wk.nid as wk_nid,
IFNULL(wcs.estimate,0) as wcs_spend,
IFNULL(((wcs.conversions/wcs.conversion_impressions)*100),0) as wcs_ctr,
pss.bid as pss_bid,
pss.ctr as pss_ctr,
wci.count as wci_count,
ws.*
FROM
websites w
LEFT JOIN
websites_geos wg
ON
wg.nid = w.nid
LEFT JOIN
websites_keywords wk
ON
wk.nid = w.nid
LEFT JOIN
api_bucket_website_daily_clicks wcs
ON
wcs.nid = w.nid AND wcs.date = CURDATE()
LEFT JOIN
publisher_subid_stats pss
ON
pss.uniq = CONCAT(w.nid,'_',:pid,'_',:subid)
LEFT JOIN
websites_cur_ips wci
ON
wci.unique = CONCAT(CURDATE(),:ip,w.nid)
LEFT JOIN
websites_subids ws
ON
w.nid = ws.nid AND CONCAT(:pid,'_',:subid) = ws.subid
WHERE
(
(
match_type = 0 /* MATCH RON KEYWORDS */
)
OR
(
wk.keyword = :keyword /* MATCH EXACT KEYWORD */
AND
match_type = 1
)
OR
(
:keyword LIKE CONCAT('%',wk.keyword,'%') /* MATCH PHRASE KEYWORD */
AND
match_type = 2
)
OR
(
:keyword LIKE CONCAT('%',REPLACE(wk.keyword, ' ', '%'),'%') /* MATCH BROAD KEYWORD */
AND
match_type = 3
)
)
AND
wg.geo = :geo
AND
w.os = :os
AND
w.device = :device
AND
w.enabled = 1
AND
w.conversion IN (:conversiontype)
AND
((:sectoday/86400) * w.daily_budget) >= IFNULL(wcs.estimate,0)
AND
IFNULL(wci.count,0) < w.frequency
AND
ws.nid IS NULL
AND
((:adometry = 0) OR (:adometry = 1 AND w.filters = 0))
AND
(
(
IFNULL(((wcs.conversions/wcs.conversion_impressions)*100),0) <= w.max_ctr
AND
IFNULL(((wcs.conversions/wcs.conversion_impressions)*100),0) >= IFNULL(pss.ctr,0)
)
OR
(
IFNULL(((wcs.conversions/wcs.conversion_impressions)*100),0) >= IFNULL(pss.ctr,0)
)
)
ORDER BY
IFNULL(pss.bid,w.starting_bid) DESC, RAND()
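I look at these queries and I cry, because even though it responds super fast right now, we are planning to receive over 1 billion queries a day with more than 500 advertisers, and I want to make sure it is as optimized as possible. If you need any more information, please let me know!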
Answer 0 (score: 1)
A few remarks that might be worth investigating. Then again, they might not help at all, so you will want to test them (thoroughly) first!
1. I would probably convert the part of the WHERE clause that deals with match_type into a CASE WHEN construct.
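2. You have AND wg.geo = :geo in the WHERE clause, yet websites_geos is linked through a LEFT OUTER JOIN. Either move the filter into the JOIN condition or turn the join into a normal (inner) JOIN.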
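3. ((:sectoday/86400) * w.daily_budget) >= IFNULL(wcs.estimate,0)
   The reason for the IFNULL() on the right-hand side is the LEFT OUTER JOIN to api_bucket_website_daily_clicks, right? If so, you could move the filter into the LEFT JOIN condition and drop the IFNULL(). It might also be useful to compute the (:sectoday/86400) part once and store it in a variable.
4. IFNULL(wci.count,0) < w.frequency (same remark as for item 3)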
5. ((:adometry = 0) OR (:adometry = 1 AND w.filters = 0))
   Assuming :adometry is a bit (only ever 0 or 1), you could shorten this to (:adometry = 0 OR w.filters = 0).
6. The part
   IFNULL(((wcs.conversions/wcs.conversion_impressions)*100),0) <= w.max_ctr
   AND
   IFNULL(((wcs.conversions/wcs.conversion_impressions)*100),0) >= IFNULL(pss.ctr,0)
   can be shortened to
   IFNULL(((wcs.conversions/wcs.conversion_impressions)*100),0) BETWEEN IFNULL(pss.ctr,0) AND w.max_ctr
   This probably won't be any faster, but IMHO it is easier to read.
7. When doing item 6 and "mentally" replacing IFNULL(((wcs.conversions/wcs.conversion_impressions)*100),0) with actual_ctr, I get the following code:
   actual_ctr BETWEEN IFNULL(pss.ctr,0) AND w.max_ctr
   OR
   actual_ctr >= IFNULL(pss.ctr,0)
   which IMHO does not make sense (a bug?).
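8. I was going to remark that RAND() might not be "refreshed" for each record, but it turns out that it is in MySQL. (FYI: it does not work that way in MSSQL, though.)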
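9. It is odd that you select ws.*: given the ws.nid IS NULL requirement on websites_subids, I assume you are actually looking for rows in websites that have no match in websites_subids, so all fields of said table will come back as NULL. I cannot help but wonder how that is supposed to be interpreted.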
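The two boolean rewrites in the list (items 5 and 7) can be checked without touching the real tables. The following is only a sketch: it feeds a few hand-picked values through both forms of each predicate, assumes :adometry is only ever 0 or 1, and uses made-up sample numbers for the CTR columns.

```sql
-- Item 5: compare the original predicate with the shortened one.
-- Assumption: adometry is a bit, so only 0 and 1 are tried.
SELECT a.adometry,
       f.filters,
       ((a.adometry = 0) OR (a.adometry = 1 AND f.filters = 0)) AS original_form,
       (a.adometry = 0 OR f.filters = 0)                        AS short_form
FROM (SELECT 0 AS adometry UNION ALL SELECT 1) AS a
CROSS JOIN (SELECT 0 AS filters UNION ALL SELECT 1) AS f;
-- original_form and short_form should agree on every row.

-- Item 7: "(x BETWEEN lo AND hi) OR (x >= lo)" versus plain "x >= lo".
-- The sample values below are invented for illustration only.
SELECT v.actual_ctr,
       v.pss_ctr,
       v.max_ctr,
       ((v.actual_ctr BETWEEN v.pss_ctr AND v.max_ctr)
         OR (v.actual_ctr >= v.pss_ctr)) AS original_form,
       (v.actual_ctr >= v.pss_ctr)       AS reduced_form
FROM (
    SELECT 1 AS actual_ctr, 2 AS pss_ctr, 5 AS max_ctr  -- below pss_ctr
    UNION ALL SELECT 3, 2, 5                            -- inside the range
    UNION ALL SELECT 9, 2, 5                            -- above max_ctr
) AS v;
-- The two columns should agree here as well, including on the row
-- where actual_ctr exceeds max_ctr.
```

If the second query behaves as expected, the w.max_ctr cap never rejects a row in the original WHERE clause, which is why item 7 looks like a bug rather than a stylistic issue.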
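As a more general remark: I don't know how your data is distributed between active and inactive records (is it 1%-99% or 50%-50%?), but if performance becomes a problem I would consider dropping the enabled field from the table and changing the logic to use two tables, one for the "obsolete" records and one for the active ones. If needed, you could still provide a UNION of the two tables in case you don't want to change too much of the logic that maintains the data (partitioning might also be an option).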
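A rough sketch of what that split could look like. The names websites_active, websites_inactive and websites_all are hypothetical, and the enabled column is left in place here only to keep the copy statements simple; it could be dropped from both tables afterwards.

```sql
-- Hypothetical table names; both start out with the layout of websites.
CREATE TABLE websites_active   LIKE websites;
CREATE TABLE websites_inactive LIKE websites;

-- One-off copy of the existing rows into the matching table.
INSERT INTO websites_active
    SELECT * FROM websites WHERE enabled = 1;
INSERT INTO websites_inactive
    SELECT * FROM websites WHERE enabled <> 1 OR enabled IS NULL;

-- Optional view for code that still expects to see every record.
CREATE VIEW websites_all AS
    SELECT * FROM websites_active
    UNION ALL
    SELECT * FROM websites_inactive;
```

The ad-serving query would then read from websites_active and lose the w.enabled = 1 condition, while maintenance code could keep reading from websites_all. Whether this beats a simple index on enabled depends on the active/inactive ratio, so it is worth measuring first.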
Answer 1 (score: 0)
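A few suggestions: (1) if you can help it, don't join on functions such as CONCAT; (2) in a similar vein, try to reduce the calculations in the WHERE clause, e.g. "((:sectoday / 86400) * w.daily_budget) >= IFNULL(wcs.estimate,0)", perhaps by storing the results of those calculations in an indexed table/column; (3) consider creating a FULLTEXT index on wk.keyword (I'm guessing there isn't one currently, given the query's WHERE conditions) and using a single MATCH AGAINST clause instead of multiple LIKE clauses. Good luck!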
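For suggestion (3), a minimal sketch might look like the following. The index name ft_wk_keyword is made up, and the statement assumes websites_keywords uses a storage engine with full-text support (MyISAM, or InnoDB on MySQL 5.6 and later).

```sql
-- Hypothetical index name; adds a full-text index on the keyword column.
ALTER TABLE websites_keywords
    ADD FULLTEXT INDEX ft_wk_keyword (keyword);

-- Find keyword rows that share words with the incoming search phrase.
-- :keyword is the same bound parameter used in the original query.
SELECT wk.nid,
       wk.keyword,
       MATCH(wk.keyword) AGAINST (:keyword IN NATURAL LANGUAGE MODE) AS relevance
FROM websites_keywords wk
WHERE MATCH(wk.keyword) AGAINST (:keyword IN NATURAL LANGUAGE MODE);
```

Note that this is not a drop-in replacement for the LIKE conditions: those test whether the stored keyword is contained in the search phrase, whereas MATCH AGAINST ranks rows by the words they share with the phrase and ignores stopwords and very short tokens. The exact, phrase and broad match types would probably still need a follow-up check on the (much smaller) candidate set.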