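Complex SQL query with multiple tables and relationships

Time: 2013-07-31 06:44:50

Tags: sql postgresql

In this query I have to list pairs of players, with their player IDs and player names, who play for exactly the same teams. If one player plays for 3 teams, the other player has to play for exactly the same 3 teams, no fewer and no more. If two players currently do not play for any team, they should also be included. The query should return (playerID1, playerName1, playerID2, playerName2) with no repetition: for example, if player 1's info comes before player 2's, there should not be another tuple with player 2's info coming before player 1's.

For example, if player A plays for the Yankees and the Red Sox, and player B plays for the Yankees, the Red Sox and the Dodgers, I should not get that pair. They both have to play for the Yankees and the Red Sox and for no other team. Right now, my query finds a pair as soon as the two players share any team.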

Tables:
player(playerID: integer, playerName: string)
team(teamID: integer, teamName: string, sport: string)
plays(playerID: integer, teamID: integer)

Example data:
PLAYER    
playerID    playerName
1           Rondo
2           Allen
3           Pierce
4           Garnett
5           Perkins

TEAM      
teamID     teamName       sport
1          Celtics        Basketball
2          Lakers         Basketball
3          Patriots       Football
4          Red Sox        Baseball
5          Bulls          Basketball

PLAYS
playerID    TeamID
1           1
1           2
1           3
2           1
2           3
3           1
3           3

So I am supposed to get this as the answer:

 2, Allen, 3, Pierce 
 4, Garnett, 5, Perkins

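2, Allen, 3, Pierce is an answer because both of them play only for the CELTICS and the PATRIOTS. 4, Garnett, 5, Perkins is an answer because neither player plays for any team, and such pairs should be output as well.

Right now my query is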

SELECT p1.PLAYERID, 
       f1.PLAYERNAME, 
       p2.PLAYERID, 
       f2.PLAYERNAME 
FROM   PLAYER f1, 
       PLAYER f2, 
       PLAYS p1 
       FULL OUTER JOIN PLAYS p2 
                    ON p1.PLAYERID < p2.PLAYERID 
                       AND p1.TEAMID = p2.TEAMID 
GROUP  BY p1.PLAYERID, 
          f1.PLAYERID, 
          p2.PLAYERID, 
          f2.PLAYERID 
HAVING Count(p1.PLAYERID) = Count(*) 
       AND Count(p2.PLAYERID) = Count(*) 
       AND p1.PLAYERID = f1.PLAYERID 
       AND p2.PLAYERID = f2.PLAYERID; 

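I am not 100% sure, but I think this finds players who play for the same team, whereas I want to find players who play for exactly the same set of teams, as explained above.

I am stuck on how to approach it from here. Any hints on how to solve this problem would be appreciated. Thanks for your time.

16 answers:

Answer 0: (score: 4)

I believe this query will do what you want: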

SELECT array_agg(players), player_teams
FROM (
  SELECT DISTINCT t1.t1player AS players, t1.player_teams
  FROM (
    SELECT
      p.playerid AS t1id,
      concat(p.playerid,':', p.playername, ' ') AS t1player,
      array_agg(pl.teamid ORDER BY pl.teamid) AS player_teams
    FROM player p
    LEFT JOIN plays pl ON p.playerid = pl.playerid
    GROUP BY p.playerid, p.playername
  ) t1
INNER JOIN (
  SELECT
    p.playerid AS t2id,
    array_agg(pl.teamid ORDER BY pl.teamid) AS player_teams
  FROM player p
  LEFT JOIN plays pl ON p.playerid = pl.playerid
  GROUP BY p.playerid, p.playername
) t2 ON t1.player_teams=t2.player_teams AND t1.t1id <> t2.t2id
) innerQuery
GROUP BY player_teams


Result:
PLAYERS               PLAYER_TEAMS
2:Allen,3:Pierce      1,3
4:Garnett,5:Perkins

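For each player in plays, it uses array_agg over teamid to match players that have exactly the same team configuration. I included a column with the teams as an example, but it can be dropped from the select list without affecting the result, as long as it is not removed from the GROUP BY clause.

SQL Fiddle example. Tested with PostgreSQL 9.2.4.

Edit: fixed a bug with duplicate rows.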
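
The matching idea described in this answer (build one ordered team list per player, then group players whose lists are equal) can be illustrated outside the database. The following is only a minimal Python sketch using the sample data from the question; the function name `group_by_team_signature` is made up for illustration and is not part of the answer's query.

```python
from collections import defaultdict

# Sample data copied from the question.
players = {1: "Rondo", 2: "Allen", 3: "Pierce", 4: "Garnett", 5: "Perkins"}
plays = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 3)]


def group_by_team_signature(players, plays):
    """Group player ids by their sorted tuple of team ids (hypothetical helper)."""
    teams = defaultdict(set)
    for player_id, team_id in plays:
        teams[player_id].add(team_id)

    groups = defaultdict(list)
    for player_id in sorted(players):
        # Players with no rows in plays get the empty signature (),
        # which plays the role of the NULL array from the LEFT JOIN.
        signature = tuple(sorted(teams.get(player_id, ())))
        groups[signature].append(player_id)

    # Keep only signatures shared by at least two players.
    return {sig: ids for sig, ids in groups.items() if len(ids) > 1}


result = group_by_team_signature(players, plays)
for signature, ids in sorted(result.items()):
    print(signature, [f"{i}:{players[i]}" for i in ids])
```

Sorting the team ids before comparing corresponds to the ORDER BY inside array_agg: without a fixed order, two equal team sets could produce different lists.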

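Answer 1: (score: 1)

It seems the OP may no longer be interested, but in case anyone else finds it useful, here is the query in pure SQL (pure for me at least ;))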

SELECT M.p1, pr1.playername, M.p2, pr2.playername FROM player pr1 
INNER JOIN player pr2 INNER JOIN
(
   SELECT plays1.player p1, plays2.player p2, plays1.team t1 FROM plays plays1 
   INNER JOIN plays plays2 
   ON (plays1.player < plays2.player AND plays1.team = plays2.team)
   GROUP BY plays1.player, plays2.player HAVING COUNT(*) = 
((SELECT COUNT(*) FROM plays plays3 WHERE plays3.player = plays1.player) + 
(SELECT COUNT(*) FROM plays plays4 WHERE plays4.player = plays2.player)) /2
) M ON pr1.playerID = M.p1 AND pr2.playerID = M.p2 
UNION ALL
SELECT M.pid, M.pname, N.pid2, N.pname2 FROM
(
(SELECT p.playerID pid, p.playerName pname, pl.team FROM player p
 LEFT JOIN plays pl ON p.playerId = pl.player WHERE pl.team IS NULL) M
 INNER JOIN
 (SELECT p.playerID pid2, p.playerName pname2, pl.team FROM player p
  LEFT JOIN plays pl ON p.playerId = pl.player WHERE pl.team IS NULL) N 
 ON (pid < pid2)
)

Answer 2: (score: 1)

No big deal, here is the solution

with gigo as(select a.playerid as playerid,count(b.teamname) as nteams from player a 
full outer join plays c on a.playerid=c.playerid full outer join team b 
on b.teamid=c.teamid group by a.playerid)
select array_agg(a.*),g.nteams from player a inner join gigo g on a.playerid=g.playerid 
group by g.nteams having count(a.*)>1 order by g.nteams desc

Answer 3: (score: 1)

This solution works for me:

SELECT TMP1. PLAYERID,TMP2.PLAYERID FROM
(
    SELECT a.playerid , a.teamid,b.team_sum 
    FROM plays  A
    INNER JOIN 
    (
        SELECT PLAYERID,SUM(teamid) AS team_sum
        FROM plays
        GROUP BY 1
     ) B

    ON a.playerid=b.playerid
 ) TMP1

INNER JOIN

(
    SELECT a.playerid , a.teamid,b.team_sum
    FROM plays  A

    INNER JOIN 
    (
        SELECT PLAYERID,SUM(teamid) AS team_sum
        FROM plays
        GROUP BY 1
    ) B

ON a.playerid=b.playerid

)TMP2
ON TMP1.PLAYERID < TMP2.PLAYERID
AND TMP1.TEAMID=TMP2.TEAMID
AND TMP1.TEAM_SUM=TMP2.TEAM_SUM
GROUP BY 1,2

UNION ALL
SELECT n1,n2 FROM  
(
    SELECT TMP3.PLAYERID AS n1,TMP4.PLAYERID AS n2 FROM 
    PLAYER  TMP3
    INNER JOIN PLAYER TMP4
    ON TMP3.PLAYERID<TMP4.PLAYERID
    WHERE TMP3.PLAYERID  NOT IN (SELECT  PLAYERID FROM plays  )
    AND tmp4.playerid NOT IN (SELECT playerid FROM plays)
) TMP5

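Answer 4: (score: 0)

I can think of two possible solutions:

  1. Cursor: loop over each player and compare them with every other player until you reach a conclusion.
  2. Recursive query: the same idea, slightly more complicated, but arguably the better approach. It probably performs better too.

Can you provide some sample data so that I can create an example?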
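
The cursor idea (compare every player with every other player) can be sketched as a plain loop. This is a rough Python illustration under the assumption that the data fits in memory and that every row in plays refers to an existing player; `matching_pairs` is a hypothetical name, not an existing function.

```python
from itertools import combinations

# Sample data copied from the question.
players = [(1, "Rondo"), (2, "Allen"), (3, "Pierce"), (4, "Garnett"), (5, "Perkins")]
plays = [(1, 1), (1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 3)]


def matching_pairs(players, plays):
    """Return (id1, name1, id2, name2) for players with identical team sets."""
    # Start every player with an empty set so team-less players are included.
    team_sets = {player_id: set() for player_id, _ in players}
    for player_id, team_id in plays:
        team_sets[player_id].add(team_id)

    pairs = []
    # combinations() over the sorted list yields each pair once, lower id first.
    for (id1, name1), (id2, name2) in combinations(sorted(players), 2):
        if team_sets[id1] == team_sets[id2]:
            pairs.append((id1, name1, id2, name2))
    return pairs


print(matching_pairs(players, plays))
```

The loop does a quadratic number of comparisons, which is fine for a handful of players but is the main reason a set-based query is usually preferred inside the database.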

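Answer 5: (score: 0)

It looks like the fundamental data type you want is a set, not an array. So one option might be to use PL/Python with code similar to that below (see the bottom of this answer for functions that might be suitable for this purpose). Of course, this is by no means a "pure SQL" approach.

But sticking with PostgreSQL (although not standard SQL), you may also want to use DISTINCT with array_agg. Note that the following only gives the first pair that meets the criteria (in principle there could be many more).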

WITH teams AS (
  SELECT playerID, array_agg(DISTINCT teamID ORDER BY teamID) AS teams
  FROM plays
  GROUP BY playerID),
teams_w_nulls AS (
  SELECT a.playerID, b.teams
  FROM player AS a
  LEFT JOIN teams AS b
  ON a.playerID=b.playerID),
player_sets AS (
  SELECT teams, array_agg(DISTINCT playerID ORDER BY playerID) AS players
  FROM teams_w_nulls
  GROUP BY teams
  -- Exclude players who share a team list only with themselves.
  HAVING array_length(array_agg(DISTINCT playerID ORDER BY playerID),1)>1)
SELECT a.teams, b.playerID, b.playerName, c.playerID, c.playerName
FROM player_sets AS a
INNER JOIN player AS b
ON a.players[1]=b.playerID
INNER JOIN player AS c
ON a.players[2]=c.playerID;

The query above gives the following output:

 teams | playerid | playername | playerid | playername 
-------+----------+------------+----------+------------
 {1,3} |        2 | Allen      |        3 | Pierce
       |        4 | Garnett    |        5 | Perkins
(2 rows)

Example PL/Python functions:

CREATE OR REPLACE FUNCTION set(the_list integer[])
  RETURNS integer[] AS
$BODY$
    return list(set(the_list))
$BODY$
  LANGUAGE plpython2u;

CREATE OR REPLACE FUNCTION pairs(a_set integer[])
  RETURNS SETOF integer[] AS
$BODY$
    def pairs(x):
        for i in range(len(x)):
            for j in x[i+1:]:
                yield [x[i], j]
    return list(pairs(a_set))
$BODY$
  LANGUAGE plpython2u;

SELECT set(ARRAY[1, 1, 2, 3, 4, 5, 6, 6]);

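A version of the code above that uses these functions (the output is similar, but this approach selects all pairs when a given set of teams has more than one):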

WITH teams AS (
  SELECT playerID, set(array_agg(teamID)) AS teams
  FROM plays
  GROUP BY playerID),
teams_w_nulls AS (
  SELECT a.playerID, b.teams
  FROM player AS a
  LEFT JOIN teams AS b
  ON a.playerID=b.playerID),
player_pairs AS (
  SELECT teams, pairs(set(array_agg(playerID))) AS pairs
  FROM teams_w_nulls
  GROUP BY teams)
  -- No need to exclude players who share a team list only
  -- with themselves.
SELECT teams, pairs[1] AS player_1, pairs[2] AS player_2
FROM player_pairs;

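Answer 6: (score: 0)

We build a query from each player's team count and the sum of ascii(team_name) + team_id, which we call team_value. We then self-join that query with itself where count and team_value match but the ids are not equal, which gives us the IDs we want to fetch.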

select * from player where player_id in 
(
 select set2.player_id orig
 from
 (select count(*) count,b.player_id , nvl(sum(a.team_id+ascii(team_name)),0) team_value
   from plays a, player b , team c
   where a.player_id(+)=b.player_id
    and a.team_id = c.team_id(+)
   group by b.player_id) set1,
(select count(*) count,b.player_id , nvl(sum(a.team_id+ascii(team_name)),0) team_value
   from plays a, player b , team c
   where a.player_id(+)=b.player_id
    and a.team_id = c.team_id(+)
   group by b.player_id) set2
where set1.count=set2.count and set1.team_value=set2.team_value
  and set1.player_id<>set2.player_id
)

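Answer 7: (score: 0)

This is a simple query using UNION and 2-3 simple joins. The first query, before the UNION, returns the names and IDs of players who play for the same number of teams as another player. The second query, after the UNION, returns the names and IDs of players who do not play for any team.

Just copy and paste this query and try to execute it; you will see the expected result.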

    select playername,c.playerid from 
    (select a.cnt, a.playerid from 
    (select count(1) cnt , PLAYERID from plays group by  PLAYERID) a ,
    (select count(1) cnt , PLAYERID from plays group by  PLAYERID) b 
    where a.cnt=b.cnt
    and  a.playerid<> b.playerid ) c ,PLAYER  d
    where c.playerid=d.playerid
    UNION
    select e.playername,e.playerid 
    from player e 
    left outer join plays f on 
    e.playerid=f.playerid where nvl(teamid,0 )=0

Answer 8: (score: 0)

Try this. Here test is the PLAYS table from your question.

select group_concat(b.name),a.teams from
(SELECT playerid, group_concat(distinct teamid ORDER BY teamid) AS teams
  FROM test
  GROUP BY playerid) a, player b
where a.playerid=b.playerid
group by a.teams
union
select group_concat(c.name order by c.playerid),null from player c where c.playerid not in (select        playerid from test);

Answer 9: (score: 0)

For anyone interested, this simple query works for me

SELECT UNIQUE PLR1.PID,PLR1.PNAME, PLR2.PID, PLR2.PNAME
FROM PLAYS PLY1,PLAYS PLY2, PLAYER PLR1, PLAYER PLR2
WHERE PLR1.PID < PLR2.PID AND PLR1.PID = PLY1.PID(+) AND PLR2.PID = PLY2.PID(+)
AND NOT EXISTS(( SELECT PLY3.TEAMID FROM PLAYS PLY3 WHERE PLY3.PID = PLR1.PID) 
MINUS
( SELECT PLY4.TEAMID FROM PLAYS PLY4 WHERE PLY4.PID = PLR2.PID));
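
A single MINUS only checks that the first player's teams are contained in the second player's teams; set equality needs the difference to be empty in both directions. Below is a minimal sketch of the two-way check, run through Python's sqlite3 module so that it is self-contained. It assumes the column names from the question (playerID, teamID) rather than the PID names used above, and uses EXCEPT, which is SQLite's and PostgreSQL's spelling of MINUS.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE player (playerID INTEGER PRIMARY KEY, playerName TEXT);
    CREATE TABLE plays (playerID INTEGER, teamID INTEGER);
    INSERT INTO player VALUES
        (1, 'Rondo'), (2, 'Allen'), (3, 'Pierce'), (4, 'Garnett'), (5, 'Perkins');
    INSERT INTO plays VALUES
        (1, 1), (1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 3);
""")

# Two players match when neither has a team that the other lacks.
query = """
SELECT a.playerID, a.playerName, b.playerID, b.playerName
FROM player a
JOIN player b ON a.playerID < b.playerID
WHERE NOT EXISTS (
        SELECT teamID FROM plays WHERE playerID = a.playerID
        EXCEPT
        SELECT teamID FROM plays WHERE playerID = b.playerID)
  AND NOT EXISTS (
        SELECT teamID FROM plays WHERE playerID = b.playerID
        EXCEPT
        SELECT teamID FROM plays WHERE playerID = a.playerID)
ORDER BY a.playerID, b.playerID
"""
rows = conn.execute(query).fetchall()
print(rows)
```

Players without any team pass both checks against each other, because both differences are empty, so the pair with no teams is covered without a separate UNION branch.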

Answer 10: (score: 0)

select p1.playerId, p2.playerId, count(p1.playerId)
from plays p1, plays p2
WHERE p1.playerId<p2.playerId
and p1.teamId = p2.teamId
GROUP BY p1.playerId, p2.playerId
having count(*) = (select count(*) from plays where playerid = p1.playerid)
   and count(*) = (select count(*) from plays where playerid = p2.playerid)

Answer 11: (score: 0)

WITH temp AS (
  SELECT p.playerid, p.playername, listagg(t.teamname,',') WITHIN GROUP (ORDER BY t.teamname) AS teams
  FROM player p full OUTER JOIN plays p1 ON p.playerid = p1.playerid
    LEFT JOIN team t ON p1.teamid = t.teamid GROUP BY (p.playerid , p.playername))
SELECT concat(concat(t1.playerid,','), t1.playername), t1.teams 
FROM temp t1 WHERE nvl(t1.teams,' ') IN (
  SELECT nvl(t2.teams,' ') FROM temp t2 
  WHERE t1.playerid <> t2.playerid) 
ORDER BY t1.playerid

Answer 12: (score: 0)

This is ANSI SQL and does not use any special functions.

SELECT   TAB1.T1_playerID AS playerID1 , TAB1.playerName1  ,   
  TAB1.T2_playerID AS playerID2, TAB1. playerName2
 FROM
(select   T1.playerID AS T1_playerID ,  T3. playerName  AS  playerName1 ,

T2.playerID  AS T2_playerID ,  T4. playerName AS playerName2  ,COUNT (T1.TeamID) AS MATCHING_TEAM_ID_CNT
FROM PLAYS T1
INNER JOIN PLAYS T2  ON(  T1.TeamID = T2.TeamID AND T1.playerID <> T2.playerID )
INNER JOIN player T3 ON (  T1.playerID=T3.playerID)
INNER JOIN player T4 ON (  T2.playerID=T4.playerID)
 GROUP BY 1,2,3,4
) TAB1

INNER JOIN 
( SELECT  T1.playerID AS playerID, COUNT(T1.TeamID) AS TOTAL_TEAM_CNT
 FROM PLAYS  T1
GROUP BY T1.playerID) TAB2
ON(TAB1.T2_playerID=TAB2.playerID AND    
  TAB1.MATCHING_TEAM_ID_CNT =TAB2.TOTAL_TEAM_CNT)

INNER JOIN 
( SELECT  T1.playerID AS playerID, COUNT(T1.TeamID) AS TOTAL_TEAM_CNT
FROM PLAYS  T1
GROUP BY T1.playerID 
) TAB3
ON( TAB1. T1_playerID = TAB3.playerID  AND 
 TAB1.MATCHING_TEAM_ID_CNT=TAB3.TOTAL_TEAM_CNT)
WHERE playerID1  < playerID2

    UNION ALL (
    SELECT   T1.playerID, T1.playerName ,T2.playerID,T2.playerName
    FROM
    PLAYER T1 INNER JOIN PLAYER T2
    ON (T1.playerID<T2.playerID) 
    WHERE T1.playerID NOT IN ( SELECT playerID FROM PLAYS))

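Answer 13: (score: 0)

Assuming your teamID is unique, this query will work. It simply identifies all players who have exactly the same teams by summing the teamID values; if a player has no teams, the sum is null. It then counts the number of matches over the team sums. I tested it using SQL Fiddle with PostgreSQL 9.3.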

SELECT 
     b.playerID
    ,b.playerName
FROM (
--Join the totals of teams to your player information and then count over the team matches.
        SELECT 
                p.playerID
                ,p.playerName
                ,m.TeamMatches
                ,COUNT(*) OVER(PARTITION BY TeamMatches) as Matches
        FROM player p
                LEFT JOIN (
                --Assuming your teamID is unique as it should be. If it is then a sum of the team ids for a player will give you each team they play for. 
                --If for some reason your team id is not unique then rank the table and join same as below. 
                    SELECT 
                         ps.playerName
                        ,ps.playerID
                        ,SUM(t.teamID) as TeamMatches
                    FROM plays p
                            LEFT JOIN team t ON p.teamID = t.teamID
                            LEFT JOIN player ps ON p.playerID = ps.playerID
                    GROUP BY 
                            ps.playerName
                        ,ps.playerID
                ) m ON p.playerID = m.playerID
) b
WHERE
b.Matches <> 1
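
One thing to keep in mind with a sum-based signature is that two different team sets can add up to the same value even when every teamID is unique. A small Python sketch using team ids from the question's TEAM table; the helper names are made up for illustration, and the frozenset comparison is only shown as a contrast, not as part of the query above.

```python
def sum_signature(team_ids):
    """Signature in the style of SUM(teamID): order-free but lossy."""
    return sum(team_ids)


def set_signature(team_ids):
    """Signature that keeps the full set of team ids."""
    return frozenset(team_ids)


celtics_red_sox = [1, 4]   # Celtics (1) and Red Sox (4)
lakers_patriots = [2, 3]   # Lakers (2) and Patriots (3)

# Both lists sum to 5, so the sum cannot tell them apart.
sums_collide = sum_signature(celtics_red_sox) == sum_signature(lakers_patriots)
# The sets are different, so a set comparison keeps them apart.
sets_equal = set_signature(celtics_red_sox) == set_signature(lakers_patriots)

print(sums_collide, sets_equal)
```

With the sample rows in the question no such collision occurs, so the sum happens to give the expected pairs there.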

Answer 14: (score: -1)

This query should solve it by self-joining PLAYS: compare the player IDs, then compare the number of matching rows with each player's total row count.

select p1.playerId, p2.playerId, count(p1.playerId)
from plays p1, plays p2
WHERE p1.playerId<p2.playerId
and p1.teamId = p2.teamId
GROUP BY p1.playerId, p2.playerId
having count(*) = (select count(*) from plays where playerid = p1.playerid)
   and count(*) = (select count(*) from plays where playerid = p2.playerid)
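
The counting idea can be tried out with Python's sqlite3 module. The sketch below is only an illustration: it adds a hypothetical player 6 (teams 1, 3 and 4, not in the question's data) to show why the shared-team count has to equal the total of both players, and it assumes plays has no duplicate rows. Players with no rows in plays never appear in this self-join, so they would still need separate handling.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE plays (playerID INTEGER, teamID INTEGER);
    INSERT INTO plays VALUES
        (1, 1), (1, 2), (1, 3), (2, 1), (2, 3), (3, 1), (3, 3),
        (6, 1), (6, 3), (6, 4);  -- hypothetical extra player, not in the question
""")

# Shared-team count compared with the first player's total only.
base = """
SELECT p1.playerID, p2.playerID
FROM plays p1
JOIN plays p2 ON p1.playerID < p2.playerID AND p1.teamID = p2.teamID
GROUP BY p1.playerID, p2.playerID
HAVING COUNT(*) = (SELECT COUNT(*) FROM plays WHERE playerID = p1.playerID)
"""
# Additional condition: the count must also equal the second player's total.
second_side = "AND COUNT(*) = (SELECT COUNT(*) FROM plays WHERE playerID = p2.playerID)\n"
order = "ORDER BY 1, 2"

one_sided = conn.execute(base + order).fetchall()
two_sided = conn.execute(base + second_side + order).fetchall()

print(one_sided)  # also contains pairs where player 6 has an extra team
print(two_sided)  # only pairs with identical team sets
```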

Answer 15: (score: -2)

Create the function in SQL Server 2008

ALTER FUNCTION [dbo].[fngetTeamIDs] ( @PayerID int ) RETURNS varchar(101) AS Begin

declare @str varchar(1000)

SELECT @str= coalesce(@str + ', ', '') + CAST(a.TeamID AS varchar(100)) FROM (SELECT DISTINCT TeamID from Plays where PayerId=@PayerID) a

return @str

END

-- select dbo.fngetTeamIDs(2)

The query starts here

drop table #temp,#A,#B,#C,#D

(select PayerID,count(*) count 
into #temp
from Plays 
group by PayerID)


select *
into #A
from #temp as T

where T.count in (
        select T1.count from #temp as T1
        group by T1.count having count(T1.count)>1 
)

select A.*,P.TeamID
into #B
from #A A inner join Plays P
on A.PayerID=P.PayerID
order by A.count


select B.PayerId,B.count, 
(
select dbo.fngetTeamIDs(B.PayerId)
) as TeamIDs
into #C
from #B B
group by B.PayerId,B.count


select TeamIDs 
into #D
from #c as C
group by C.TeamIDs
having count(C.TeamIDs)>1

select C.PayerId,P.PlayerName,D.TeamIDs
from #D D inner join #C C
on D.TeamIDs=C.TeamIDs
inner join Player P
on C.PayerID=P.PlayerID