Outer join to a subquery for the most common plant species

Time: 2015-10-23 18:26:05

Tags: mysql sql join

MySQL 5.5.43

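I am developing a database of 7,200 cannabis strains and need to display a list of strains along with the species most commonly reported by each strain's breeders.

The subject is quite confusing, so here are some facts to help you understand where my confusion lies: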

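  • Each cannabis strain is one of the following species: Indica, Sativa or Ruderalis, or it may be a cross of all three.
  • A popular strain may have as many as 30 different breeders producing seeds for that strain.
  • Each breeder of a strain may use slightly different crosses/genetics and report a different species. For example: Breeder1 claims StrainX is 100% Indica, while Breeder2 claims StrainX is Mostly Indica (perhaps 90% Indica and 10% Sativa). Apparently Sativa plants have an uplifting effect and Indica is more of a downer, so recording the small differences for each strain is important for medicinal purposes.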

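EXAMPLE STRAIN:

For a very popular strain called White Widow, this is the result set I produce. It has 29 different breeders, each claiming a different species. As you can see in the results, the most popular species for this strain is Indica/Sativa (an equal hybrid).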

SELECT
  s.id,
  b.id AS breederID,
  b.breederName AS breederName,
  GROUP_CONCAT(DISTINCT sp.species ORDER BY sp.species ASC SEPARATOR '/') AS species
FROM strains AS s
LEFT JOIN strainBreedersDir AS sbd ON s.id = sbd.strainID
LEFT JOIN breeders AS b ON sbd.breederID = b.id
LEFT JOIN strainBreederSpeciesDir AS sbsd ON s.id = sbsd.strainID AND sbd.breederID = sbsd.breederID
LEFT JOIN species AS sp ON sbsd.speciesID = sp.id
WHERE s.id = 6782
GROUP BY s.id, sbd.breederID

Database result set

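The result I want

I want to show a list of strain names and, next to each one, a list of its breeders and the species most popularly/averagely claimed across all of those breeders. As I showed above, the most popular species recorded by breeders for this strain is Indica/Sativa, and I would like it displayed like this: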

strainID  | strainName      | breeders                 | averageSpecies
--------------------------------------------------------------------------
6782      | White Widow     | Green House Seeds,       | Indica/Sativa
          |                 | Barney's Farm

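What I have done:

Instead of showing the most popular species next to each strain, I am showing the first recorded species next to each strain. I thought that would be enough, but the first instance of a species may be empty, because around 100 strains currently have an unknown species. I don't want the first instance to be "Unknown" when other breeders of that strain actually do know what the species is. So I think it is best to determine the most frequently recorded species and show that instead. This is where I have got to so far: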

SELECT
  s.id,
  s.strainName,
  GROUP_CONCAT(DISTINCT b.breederName ORDER BY b.breederName ASC separator ', ') AS breeders,
  COALESCE(NULLIF(ps.primarySpecies,''),'Unknown') AS primarySpecies
FROM strains AS s
LEFT JOIN strainBreedersDir AS sbd ON s.id = sbd.strainID
LEFT JOIN breeders AS b ON sbd.breederID = b.id
LEFT OUTER JOIN (
  SELECT
    sbd.breederID AS breederID,
    GROUP_CONCAT(DISTINCT sp.species ORDER BY sp.species ASC SEPARATOR '/') AS primarySpecies
  FROM strains AS s
  LEFT JOIN strainBreedersDir AS sbd ON s.id = sbd.strainID
  LEFT JOIN strainBreederSpeciesDir AS sbsd ON s.id = sbsd.strainID AND sbd.breederID = sbsd.breederID
  LEFT JOIN species AS sp ON sbsd.speciesID = sp.id
  GROUP BY s.id, sbd.breederID
) AS ps ON sbd.breederID = ps.breederID
WHERE s.id = 6782
GROUP BY s.id

RESULT

id   | strainName   | breeders           | species
----------------------------------------------------------
6782 | White Widow  | Green House Seeds, | Indica/Sativa
     |              | Barney's Farm      |

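But I can't work out how to modify the OUTER JOIN to show the most popular species rather than just the first joined row. I have tried many variations of outer join queries, most of which failed, and I have lost track of what I have already tried.

How can I show the most popular species?

Database structure: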

strains

id (PK AUTO)     |  strainName (UNIQUE)
---------------------------------------------
6782             |  White Widow

-

strainBreedersDir

strainID (FK UNIQUE)     | breederID (UNIQUE)
---------------------------------------------
6782                     | 16
6782                     | 23

-

breeders

id (PK AUTO)      | breederName (UNIQUE)
---------------------------------------------
16                | Green House Seeds
23                | Barney's Farm

-

strainBreederSpeciesDir

strainID (FK UNIQUE)  | breederID (INT UNIQUE)  | speciesID (INT UNIQUE)
----------------------------------------------------------------------
6782                  | 16                      | 1
6782                  | 16                      | 2
6782                  | 23                      | 5

-

species

id (PK AUTO)  | species (UNIQUE)
-------------------------------------
1             | Indica
2             | Sativa
3             | Ruderalis
4             | Mostly Indica
5             | Mostly Sativa
6             | Mostly Ruderalis
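
For anyone who wants to reproduce the sample rows locally, the tables above could be recreated roughly as follows. This is only a sketch: the column types, the VARCHAR lengths, and the reading of the per-column "UNIQUE" notes as composite unique keys are all assumptions, since only the column names and sample rows are given.

```sql
-- Sketch only: types, lengths and key definitions are assumptions.
CREATE TABLE strains (
  id INT AUTO_INCREMENT PRIMARY KEY,
  strainName VARCHAR(100) NOT NULL UNIQUE
);

CREATE TABLE breeders (
  id INT AUTO_INCREMENT PRIMARY KEY,
  breederName VARCHAR(100) NOT NULL UNIQUE
);

CREATE TABLE species (
  id INT AUTO_INCREMENT PRIMARY KEY,
  species VARCHAR(50) NOT NULL UNIQUE
);

-- Which breeders produce which strain (assumed unique per pair).
CREATE TABLE strainBreedersDir (
  strainID INT NOT NULL,
  breederID INT NOT NULL,
  UNIQUE KEY (strainID, breederID),
  FOREIGN KEY (strainID) REFERENCES strains (id)
);

-- Which species each breeder claims for a strain (assumed unique per triple).
CREATE TABLE strainBreederSpeciesDir (
  strainID INT NOT NULL,
  breederID INT NOT NULL,
  speciesID INT NOT NULL,
  UNIQUE KEY (strainID, breederID, speciesID),
  FOREIGN KEY (strainID) REFERENCES strains (id)
);

-- Sample rows copied from the tables above.
INSERT INTO strains (id, strainName) VALUES (6782, 'White Widow');

INSERT INTO breeders (id, breederName) VALUES
  (16, 'Green House Seeds'),
  (23, 'Barney''s Farm');

INSERT INTO species (id, species) VALUES
  (1, 'Indica'), (2, 'Sativa'), (3, 'Ruderalis'),
  (4, 'Mostly Indica'), (5, 'Mostly Sativa'), (6, 'Mostly Ruderalis');

INSERT INTO strainBreedersDir (strainID, breederID) VALUES
  (6782, 16), (6782, 23);

INSERT INTO strainBreederSpeciesDir (strainID, breederID, speciesID) VALUES
  (6782, 16, 1), (6782, 16, 2), (6782, 23, 5);
```

With these rows, breeder 16 claims Indica and Sativa for White Widow, and breeder 23 claims Mostly Sativa.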

HERE IS AN SQLFIDDLE - provided by Juan Carlos Oropeza.

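1 answer:

Answer 0: (score: 1)

I take it you want to aggregate over your working query.

I would probably do this differently, but since I have left your working query unchanged, this may give you what you need. The subqueries come in because GROUP_CONCAT makes things harder: we are counting on that concatenated field, so I can't just stick a COUNT directly in there (unless someone can show me a better way), and then I select the MAX from the result. You can switch MAX for AVG.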

SELECT
  MAX(aggregated.theCount),
  aggregated.id,
  aggregated.breederID,
  aggregated.breeders AS mostPopularBreeders,
  aggregated.species AS mostPopularSpecies,
  AllStrainBreeders.allBreeders AS strainBreeders
FROM (
  -- Count how many breeders report each species combination.
  SELECT
    speciesWithBreeder.id,
    speciesWithBreeder.breederID,
    speciesWithBreeder.breederName,
    GROUP_CONCAT(DISTINCT speciesWithBreeder.breederName ORDER BY speciesWithBreeder.breederName ASC SEPARATOR ', ') AS breeders,
    speciesWithBreeder.species,
    COUNT(*) AS theCount
  FROM (
    -- The original working query: one row per breeder with its species combination.
    SELECT
      s.id,
      b.id AS breederID,
      b.breederName AS breederName,
      GROUP_CONCAT(DISTINCT sp.species ORDER BY sp.species ASC SEPARATOR '/') AS species
    FROM strains AS s
    LEFT JOIN strainBreedersDir AS sbd ON s.id = sbd.strainID
    LEFT JOIN breeders AS b ON sbd.breederID = b.id
    LEFT JOIN strainBreederSpeciesDir AS sbsd ON s.id = sbsd.strainID AND sbd.breederID = sbsd.breederID
    INNER JOIN species AS sp ON sbsd.speciesID = sp.id
    WHERE s.id = 6782
    GROUP BY s.id, sbd.breederID
  ) AS speciesWithBreeder
  GROUP BY speciesWithBreeder.species
  ORDER BY COUNT(*) DESC
) AS aggregated
LEFT JOIN (
  -- All breeders of the strain as a single comma-separated list.
  SELECT
    sbd.strainID,
    GROUP_CONCAT(DISTINCT b.breederName ORDER BY b.breederName ASC SEPARATOR ',') AS allBreeders
  FROM breeders b
  LEFT JOIN strainBreedersDir sbd ON sbd.breederID = b.id AND sbd.strainID = 6782
  GROUP BY sbd.strainID
) AS AllStrainBreeders
  ON aggregated.id = AllStrainBreeders.strainID
-- Note: this relies on MySQL's permissive GROUP BY (ONLY_FULL_GROUP_BY disabled).
-- The non-aggregated columns are taken from an arbitrary row of each group;
-- MySQL does not guarantee it is the first row of the ordered derived table.
GROUP BY aggregated.id
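
As a possible alternative, here is a rough sketch that covers every strain at once and does not depend on which row MySQL happens to pick for non-aggregated columns. It counts each breeder's species combination per strain, then takes the first element of a GROUP_CONCAT ordered by that count using SUBSTRING_INDEX. The derived-table aliases (`br`, `pb`, `c`, `topClaim`), the `claims` column and the `'|'` separator are names made up for illustration, and the sketch assumes no species name contains `'|'`. It has not been run against the real data.

```sql
SELECT
  s.id AS strainID,
  s.strainName,
  br.breeders,
  COALESCE(topClaim.mostCommonSpecies, 'Unknown') AS averageSpecies
FROM strains AS s
LEFT JOIN (
  -- All breeders of each strain as one comma-separated list.
  SELECT
    sbd.strainID,
    GROUP_CONCAT(DISTINCT b.breederName ORDER BY b.breederName ASC SEPARATOR ', ') AS breeders
  FROM strainBreedersDir AS sbd
  INNER JOIN breeders AS b ON b.id = sbd.breederID
  GROUP BY sbd.strainID
) AS br ON br.strainID = s.id
LEFT JOIN (
  -- Keep only the most frequently claimed combination per strain.
  -- Ties are broken alphabetically by the second ORDER BY key.
  SELECT
    c.strainID,
    SUBSTRING_INDEX(
      GROUP_CONCAT(c.species ORDER BY c.claims DESC, c.species ASC SEPARATOR '|'),
      '|', 1
    ) AS mostCommonSpecies
  FROM (
    -- How many breeders claim each combination for each strain.
    SELECT pb.strainID, pb.species, COUNT(*) AS claims
    FROM (
      -- One row per strain and breeder, e.g. 'Indica/Sativa'.
      -- Breeders with no species rows never appear here, so unknown
      -- claims cannot outvote known ones.
      SELECT
        sbsd.strainID,
        sbsd.breederID,
        GROUP_CONCAT(DISTINCT sp.species ORDER BY sp.species ASC SEPARATOR '/') AS species
      FROM strainBreederSpeciesDir AS sbsd
      INNER JOIN species AS sp ON sp.id = sbsd.speciesID
      GROUP BY sbsd.strainID, sbsd.breederID
    ) AS pb
    GROUP BY pb.strainID, pb.species
  ) AS c
  GROUP BY c.strainID
) AS topClaim ON topClaim.strainID = s.id
ORDER BY s.strainName;
```

A strain falls back to 'Unknown' only when none of its breeders has recorded a species. Adding `WHERE s.id = 6782` before the final ORDER BY limits the output to White Widow.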