Question

I have a table like this:

Location 1 | Location 2 | ID (autoIncremented)

The location columns use this syntax:

Country*State*City

So I can have values like these:

USA*NY*BROOKLYN
USA*WASHINGTON*SEATTLE
USA*WASHINGTOM*BELLINGHAM
CANADA*BC*VANCOUVER
CANADA*MANITOBA*WINNIPEG
MEXICO*MEHICO*MEXICOCITY

I want to get a result like this:

Country 1 | Country 2 | count([count of all the occurrences together])

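But I am stuck on how to do this. I want to count the combinations of countries that appear together. I need to extract the country part, so I use: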

substring_index(location1, '*', 1) as country
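As a quick check of what that expression returns, here is a small sketch that uses literal strings instead of the table. With a positive count, SUBSTRING_INDEX returns everything to the left of the Nth delimiter; with a negative count, it returns everything to the right, counting from the end.

```sql
-- Sketch using literal values in the Country*State*City format from the question.
SELECT SUBSTRING_INDEX('USA*NY*BROOKLYN', '*', 1)  AS country,       -- 'USA'
       SUBSTRING_INDEX('USA*NY*BROOKLYN', '*', 2)  AS country_state, -- 'USA*NY'
       SUBSTRING_INDEX('USA*NY*BROOKLYN', '*', -1) AS city;          -- 'BROOKLYN'
```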

This is the closest I have come to a full query, but it does not work:

select 
substring_index(location1, '*', 1) as country1,
substring_index(location2, '*', 1) as country2
count(*)
FROM location_table
GROUP BY [not sure which to group by]

Answer 1

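Here is one option. This is not valid SQL (because I use aliases in GROUP BY); it only illustrates the idea. You will need to repeat the substring expressions instead, which is the penalty for working with non-normalized data.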

GROUP BY LEAST(country1, country2), GREATEST(country1, country2)

The above assumes that USA | CANADA and CANADA | USA should be counted together.
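One way to make this idea runnable without relying on aliases in GROUP BY is to compute the countries in a derived table first. This is only a sketch: the names countries, country_a, country_b, and pair_count are made up for illustration, and it assumes the location_table, location1, and location2 names from the question.

```sql
-- Sketch: extract the country columns once in a derived table,
-- then group on the order-normalized pair in the outer query.
SELECT LEAST(country1, country2)    AS country_a,   -- alphabetically first country
       GREATEST(country1, country2) AS country_b,   -- alphabetically last country
       COUNT(*)                     AS pair_count
FROM (
    SELECT SUBSTRING_INDEX(location1, '*', 1) AS country1,
           SUBSTRING_INDEX(location2, '*', 1) AS country2
    FROM location_table
) AS countries
GROUP BY LEAST(country1, country2), GREATEST(country1, country2);
```

Because LEAST and GREATEST compare the two strings, a row holding USA and CANADA and a row holding CANADA and USA both produce the pair CANADA, USA and fall into the same group.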

Answer 2

In MySQL, you can use aliases in GROUP BY, so if you want to preserve the ordering:

SELECT substring_index(location1, '*', 1) as country1,
       substring_index(location2, '*', 1) as country2,
       count(*)
FROM location_table
GROUP BY country1, country2;

If you want all pairs, regardless of ordering:

SELECT LEAST(substring_index(location1, '*', 1), substring_index(location2, '*', 1)) as country1,
       GREATEST(substring_index(location1, '*', 1), substring_index(location2, '*', 1)) as country2,
       count(*)
FROM location_table
GROUP BY country1, country2;
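To try the order-independent query, something like the following could be used. The column types and the row pairings are assumptions made for illustration (the question only shows individual location values, not which ones occur together), and the ORDER BY is added only to make the output order predictable.

```sql
-- Hypothetical schema and rows, for illustration only.
CREATE TABLE location_table (
    id        INT AUTO_INCREMENT PRIMARY KEY,
    location1 VARCHAR(100) NOT NULL,
    location2 VARCHAR(100) NOT NULL
);

INSERT INTO location_table (location1, location2) VALUES
    ('USA*NY*BROOKLYN',          'CANADA*BC*VANCOUVER'),
    ('CANADA*MANITOBA*WINNIPEG', 'USA*WASHINGTON*SEATTLE'),
    ('USA*WASHINGTON*SEATTLE',   'MEXICO*MEHICO*MEXICOCITY'),
    ('USA*NY*BROOKLYN',          'CANADA*MANITOBA*WINNIPEG');

-- Count pairs regardless of which column each country appears in.
SELECT LEAST(substring_index(location1, '*', 1), substring_index(location2, '*', 1)) as country1,
       GREATEST(substring_index(location1, '*', 1), substring_index(location2, '*', 1)) as country2,
       count(*) as pair_count
FROM location_table
GROUP BY country1, country2
ORDER BY country1, country2;

-- Expected rows:
-- CANADA | USA | 3
-- MEXICO | USA | 1
```

With the order-preserving query instead, the same rows would split into USA | CANADA (2), CANADA | USA (1), and USA | MEXICO (1).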

Get the count of combinations when countries appear together
