My task is to implement the following PostgreSQL queries using only RDD operations. Here are the queries:
Query 1:
SELECT DISTINCT c.name, count(p.pid) FROM clubs c
JOIN teams t on c.cid = t.cid
JOIN tournaments d on t.tid = t.tid
JOIN players p on p.ncid = c.ncid
WHERE c.cid = 45 AND d.tyear = 2014
GROUP BY c.name
ORDER BY count DESC
Query 2:
SELECT DISTINCT t.tyear, c.name, (SELECT max(m.matchdate) - min(m.matchdate) FROM matches m WHERE t.tyear = date_part('year', m.matchdate)) AS days FROM tournaments t
JOIN hosts h ON t.tyear = h.tyear
JOIN countries c on c.cid = h.cid
JOIN stadiums s on s.cid = c.cid
JOIN matches m on m.sid = s.sid
GROUP BY t.tyear, c.name, s.sid
ORDER BY days DESC
Does anyone know how to compute these queries using only RDD operations? Any help would be greatly appreciated.
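
As a rough starting point, here is a minimal sketch of how Query 1 might map onto RDD transformations: each WHERE becomes a filter, each JOIN becomes a join on (key, value) pairs followed by a map that re-keys for the next join, GROUP BY with count becomes map to (name, 1) plus reduceByKey, and ORDER BY becomes sortBy. Several things here are assumptions, not taken from the question: the row layouts (clubs as (cid, ncid, name), teams as (tid, cid), tournaments as (tid, tyear), players as (pid, ncid)), the function name players_per_club, and the reading of the second join condition, written as t.tid = t.tid in the query, as d.tid = t.tid. LocalRDD is a hypothetical in-memory stand-in that only mimics the handful of RDD methods used, so the sketch can run without a Spark installation; the pipeline function itself calls only filter, map, join, reduceByKey, sortBy, and collect.

```python
from collections import defaultdict
from operator import add


class LocalRDD:
    """Hypothetical in-memory stand-in for the few RDD methods used below."""

    def __init__(self, rows):
        self.rows = list(rows)

    def filter(self, f):
        return LocalRDD(r for r in self.rows if f(r))

    def map(self, f):
        return LocalRDD(f(r) for r in self.rows)

    def join(self, other):
        # Inner join of (key, value) pairs -> (key, (left_value, right_value)).
        right = defaultdict(list)
        for k, v in other.rows:
            right[k].append(v)
        return LocalRDD((k, (v, w)) for k, v in self.rows for w in right.get(k, []))

    def reduceByKey(self, f):
        acc = {}
        for k, v in self.rows:
            acc[k] = f(acc[k], v) if k in acc else v
        return LocalRDD(acc.items())

    def sortBy(self, keyfunc, ascending=True):
        return LocalRDD(sorted(self.rows, key=keyfunc, reverse=not ascending))

    def collect(self):
        return list(self.rows)


def players_per_club(clubs, teams, tournaments, players, cid=45, tyear=2014):
    # WHERE c.cid = 45, keyed by cid: (cid, (ncid, name))
    c = clubs.filter(lambda r: r[0] == cid).map(lambda r: (r[0], (r[1], r[2])))
    # teams keyed by cid: (cid, tid)
    t = teams.map(lambda r: (r[1], r[0]))
    # JOIN teams ON c.cid = t.cid, then re-key by tid: (tid, (ncid, name))
    by_tid = c.join(t).map(lambda kv: (kv[1][1], kv[1][0]))
    # WHERE d.tyear = 2014, keyed by tid: (tid, tyear)
    d = tournaments.filter(lambda r: r[1] == tyear).map(lambda r: (r[0], r[1]))
    # JOIN tournaments (assumed ON d.tid = t.tid); the value (ncid, name)
    # is itself a pair keyed by ncid, ready for the next join
    by_ncid = by_tid.join(d).map(lambda kv: kv[1][0])
    # players keyed by ncid: (ncid, pid)
    p = players.map(lambda r: (r[1], r[0]))
    # JOIN players ON p.ncid = c.ncid, then GROUP BY c.name with count(p.pid)
    # (assumes pid is never NULL, so every joined row counts)
    counts = by_ncid.join(p).map(lambda kv: (kv[1][0], 1)).reduceByKey(add)
    # ORDER BY count DESC
    return counts.sortBy(lambda kv: kv[1], ascending=False)


# Made-up sample rows, only to exercise the pipeline.
clubs = LocalRDD([(45, 7, "Club A"), (46, 8, "Club B")])
teams = LocalRDD([(1, 45), (2, 45), (3, 46)])
tournaments = LocalRDD([(1, 2014), (2, 2010), (3, 2014)])
players = LocalRDD([(100, 7), (101, 7), (102, 8)])
result = players_per_club(clubs, teams, tournaments, players).collect()
```

With a real SparkContext, the inputs would instead come from something like sc.parallelize(rows) or from parsed text files, and the same function should apply unchanged. Because reduceByKey already yields one row per name, the DISTINCT in the query needs no extra step. Query 2 could follow the same pattern, with the max/min date subquery computed as a separate RDD keyed by year (map to (year, date), then reduceByKey for min and max) and joined in afterwards.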