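I would like to count the total types of apples (organic only) by continent, broken down by country, including a total count of the bags that are mixed.
For example, food item B1 is an organic golden apple from the USA, so it should count as "1" for organic_fd and "1" for golden_bag. A1 is also organic food, from Argentina, but it contains both granny and red delicious apples, so it counts as "1" for mixed_bag, "1" for granny_bag, and "1" for red_bag.
Finally, E1 and F1 are both Fuji apples from Laos, but one is organic and the other is not. So fuji_bag should total 2 and organic_fd should total 1.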
Table X:
food_item | food_area | food_loc | food_exp
A1 | lxgs | argentina | 1/1/20
B1 | iyan | usa | 5/31/21
C1 | lxgs | peru | 4/1/20
D1 | wa8e | norway | 10/1/19
E1 | 894a | laos | 5/1/19
F1 | 894a | laos | 9/17/19
Table Y:
food_item | organic
A1 | Y
B1 | Y
C1 | N
D1 | N
E1 | Y
F1 | N
Table Z:
food_item | food_type
A1 | 189
A1 | 190
B1 | 191
C1 | 189
D1 | 192
E1 | 193
F1 | 193
SELECT continent, country,
SUM(organic) AS organic_fd, SUM(Granny) AS granny_bag,
SUM(Red_delc) AS red_bag, SUM(Golden) AS golden_bag,
SUM(Gala) AS gala_bag, SUM(Fuji) AS fuji_bag,
SUM(CASE WHEN Granny + Red_delc + Golden + Gala + Fuji > 1 THEN 1 ELSE 0 END) AS mixed_bag
FROM (SELECT (CASE SUBSTR (x.food_area, 4, 1)
WHEN 's' THEN 'SA' WHEN 'n' THEN 'NA'
WHEN 'e' THEN 'EU' WHEN 'a' THEN 'AS' ELSE NULL END) continent,
x.food_loc country, COUNT(y.organic) AS Organic,
COUNT(CASE WHEN z.food_type = '189' THEN 1 END) AS Granny,
COUNT(CASE WHEN z.food_type = '190' THEN 1 END) AS Red_delc,
COUNT(CASE WHEN z.food_type = '191' THEN 1 END) AS Golden,
COUNT(CASE WHEN z.food_type = '192' THEN 1 END) AS Gala,
COUNT(CASE WHEN z.food_type = '193' THEN 1 END) AS Fuji
FROM x LEFT JOIN z ON x.food_item = z.food_item
LEFT JOIN y on x.food_item = y.food_item and y.organic = 'Y'
WHERE x.food_exp > sysdate
GROUP BY SUBSTR (x.food_area, 4, 1), x.food_loc, y.organic) h
GROUP BY h.continent, h.country, h.organic
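I am not getting the correct output. For example, Laos shows up TWICE, once for the organic count and once for the non-organic count: one row shows 1 organic_fd and 1 fuji_bag, and another row shows 0 organic_fd and another 1 fuji_bag. I want the total counts in a single row. (Also, if I add more food items, my mixed_bag shows a count of "1" for almost every record/row.)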
Below is the desired output:
| continent | country   | organic_fd | granny_bag | red_bag | golden_bag | gala_bag | fuji_bag | mixed_bag |
| SA        | argentina | 1          | 1          | 1       | 0          | 0        | 0        | 1         |
| SA        | peru      | 0          | 1          | 0       | 0          | 0        | 0        | 0         |
| NA        | usa       | 1          | 0          | 0       | 1          | 0        | 0        | 0         |
| EU        | norway    | 0          | 0          | 0       | 0          | 1        | 0        | 0         |
| AS        | laos      | 1          | 0          | 0       | 0          | 0        | 2        | 0         |
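So, suppose I add another food item, G1 from Norway, which contains 3 kinds of organic apples: fuji, red, granny. Norway would then have a count of 1 for each of these columns: mixed_bag, organic_fd, fuji_bag, red_bag, granny_bag (in addition to the earlier count of 1 for gala_bag). If you then added H1, identical to G1, each of the following would total 2: mixed_bag, organic_fd, fuji_bag, red_bag, granny_bag.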
Answer 0: (score: 1)
Query:
def f(x):
a = x - x.iloc[0]
b = x.count()
c = x.index - x.index[0] + 1
return pd.DataFrame({'Diff':a, 'Count':b, 'Index':c})
df = df.join(df.groupby('name')['value'].apply(f))
print(df)
name value Diff Count Index
0 A 1 0 2 1
1 A 3 2 2 2
2 B 1 0 4 1
3 B 2 1 4 2
4 B 3 2 4 3
5 B 1 0 4 4
6 C 2 0 3 1
7 C 3 1 3 2
8 C 3 1 3 3
You can try this query here: https://rextester.com/TSSH87409.
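Answer 1: (score: 0)
There is a one-to-many relationship between x and z, so the join can produce several rows for a single row of x, as it does for A1. So first you have to number the rows of x; that is what my subquery t1 does, in addition to mapping the values. Then group by that row number and take max() of each counted column (granny, organic, and so on), as I did in subquery t2. Finally, sum the values.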
with
t1 as (
select rn, food_item, food_area, food_loc country, food_exp, food_type,
decode(substr(food_area, 4, 1), 's', 'SA', 'n', 'NA', 'e', 'EU', 'a', 'AS') continent,
case organic when 'Y' then 1 else 0 end org,
case when food_type = '189' then 1 else 0 end gra,
case when food_type = '190' then 1 else 0 end red,
case when food_type = '191' then 1 else 0 end gol,
case when food_type = '192' then 1 else 0 end gal,
case when food_type = '193' then 1 else 0 end fuj
from (select rownum rn, x.* from x) x join y using (food_item) join z using (food_item)
where food_exp > sysdate),
t2 as (
select rn, country, continent, max(org) org, max(gra) gra,
max(red) red, max(gol) gol, max(gal) gal, max(fuj) fuj,
case when max(gra) + max(red) + max(gol) + max(gal) + max(fuj) > 1
then 1 else 0
end mix
from t1 group by rn, country, continent)
select continent, country, sum(org) organic_fd, sum(gra) granny_bag, sum(red) red_bag,
sum(gol) golden_bag, sum(gal) gala_bag, sum(fuj) fuji_bag, sum(mix) mixed_bag
from t2
group by continent, country
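For anyone who wants to check the three steps (flag each joined row, collapse to one row per item with max(), then sum per country) without an Oracle instance, here is a rough sketch that ports the idea to SQLite through Python's sqlite3 module. Several things in it are my assumptions rather than part of the original query: SQLite's rowid stands in for rownum, a CASE expression stands in for decode(), the function name apple_counts is made up, and the food_exp > sysdate filter is left out because every sample date is already in the past.

```python
import sqlite3

# Sample rows from Tables X, Y and Z in the question (food_exp left out).
X = [("A1", "lxgs", "argentina"), ("B1", "iyan", "usa"), ("C1", "lxgs", "peru"),
     ("D1", "wa8e", "norway"), ("E1", "894a", "laos"), ("F1", "894a", "laos")]
Y = [("A1", "Y"), ("B1", "Y"), ("C1", "N"), ("D1", "N"), ("E1", "Y"), ("F1", "N")]
Z = [("A1", "189"), ("A1", "190"), ("B1", "191"), ("C1", "189"),
     ("D1", "192"), ("E1", "193"), ("F1", "193")]

QUERY = """
WITH t1 AS (  -- one row per (item, apple type), with 0/1 flags
  SELECT x.rowid AS rn, x.food_loc AS country,
         CASE substr(x.food_area, 4, 1)
           WHEN 's' THEN 'SA' WHEN 'n' THEN 'NA'
           WHEN 'e' THEN 'EU' WHEN 'a' THEN 'AS' END AS continent,
         CASE WHEN y.organic = 'Y' THEN 1 ELSE 0 END AS org,
         CASE WHEN z.food_type = '189' THEN 1 ELSE 0 END AS gra,
         CASE WHEN z.food_type = '190' THEN 1 ELSE 0 END AS red,
         CASE WHEN z.food_type = '191' THEN 1 ELSE 0 END AS gol,
         CASE WHEN z.food_type = '192' THEN 1 ELSE 0 END AS gal,
         CASE WHEN z.food_type = '193' THEN 1 ELSE 0 END AS fuj
  FROM x JOIN y ON y.food_item = x.food_item
         JOIN z ON z.food_item = x.food_item),
t2 AS (       -- collapse to one row per item, so each flag counts once
  SELECT rn, country, continent, max(org) AS org, max(gra) AS gra,
         max(red) AS red, max(gol) AS gol, max(gal) AS gal, max(fuj) AS fuj,
         CASE WHEN max(gra) + max(red) + max(gol) + max(gal) + max(fuj) > 1
              THEN 1 ELSE 0 END AS mix
  FROM t1 GROUP BY rn, country, continent)
SELECT continent, country, sum(org), sum(gra), sum(red),
       sum(gol), sum(gal), sum(fuj), sum(mix)
FROM t2 GROUP BY continent, country ORDER BY country
"""

def apple_counts(x=X, y=Y, z=Z):
    """Load the rows into an in-memory database and return one tuple per country."""
    con = sqlite3.connect(":memory:")
    con.executescript("""
        CREATE TABLE x (food_item TEXT, food_area TEXT, food_loc TEXT);
        CREATE TABLE y (food_item TEXT, organic TEXT);
        CREATE TABLE z (food_item TEXT, food_type TEXT);
    """)
    con.executemany("INSERT INTO x VALUES (?, ?, ?)", x)
    con.executemany("INSERT INTO y VALUES (?, ?)", y)
    con.executemany("INSERT INTO z VALUES (?, ?)", z)
    rows = con.execute(QUERY).fetchall()
    con.close()
    return rows

if __name__ == "__main__":
    for row in apple_counts():
        print(row)
```

Passing extra rows, for instance a G1 from Norway with three apple types, is a quick way to see whether the mixed_bag count behaves as the question describes. Because the per-country grouping no longer includes the organic flag, Laos should come back as a single row.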
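The CTE query above (t1, t2) gives the expected output; please test it and adjust it as needed. I noticed that you use left joins. If some rows in X have no data in Y or Z, you may have to add nvl() to the calculations. You should probably also move the hard-coded mapping values into a table, since hard-coding them is not good practice. Hope this helps :)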
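To illustrate those last two suggestions, here is a small sketch, again in SQLite via Python's sqlite3 rather than Oracle. The lookup table continent_map and the food item I1 are hypothetical names I made up for the example. COALESCE plays the role that nvl(expr, 0) would play in Oracle, and the comparison-as-integer trick (y.organic = 'Y' evaluating to 1, 0 or NULL) is SQLite behavior; in Oracle you would typically write a CASE expression instead.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
  CREATE TABLE x (food_item TEXT, food_area TEXT, food_loc TEXT);
  CREATE TABLE y (food_item TEXT, organic TEXT);
  CREATE TABLE z (food_item TEXT, food_type TEXT);
  -- Hypothetical lookup table replacing the hard-coded continent mapping.
  CREATE TABLE continent_map (code TEXT PRIMARY KEY, continent TEXT);
  INSERT INTO continent_map VALUES ('s', 'SA'), ('n', 'NA'), ('e', 'EU'), ('a', 'AS');
  -- E1 comes from the question; I1 is a made-up item with no rows in y or z.
  INSERT INTO x VALUES ('E1', '894a', 'laos'), ('I1', '894a', 'laos');
  INSERT INTO y VALUES ('E1', 'Y');
  INSERT INTO z VALUES ('E1', '193');
""")

per_item = con.execute("""
  SELECT x.food_item, m.continent,
         max(y.organic = 'Y')                  AS org_raw,  -- NULL when y has no row
         COALESCE(max(y.organic = 'Y'), 0)     AS org,      -- NULL turned into 0
         COALESCE(max(z.food_type = '193'), 0) AS fuj
  FROM x
  LEFT JOIN continent_map m ON m.code = substr(x.food_area, 4, 1)
  LEFT JOIN y ON y.food_item = x.food_item
  LEFT JOIN z ON z.food_item = x.food_item
  GROUP BY x.food_item, m.continent
  ORDER BY x.food_item
""").fetchall()

for row in per_item:
    print(row)
```

The org_raw column is only there to show where the NULL comes from: with left joins, an item missing from Y or Z still appears, and without COALESCE (or nvl) its flags would be NULL instead of 0.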