SQL: Select the first row for each duplicated value

Time: 2018-12-05 15:35:10

Tags: mysql sql

With the following code:

SELECT
    farm_products.id AS ID,
    farm_products.product AS Product1,
    shop_products.product AS Product2
FROM
    farm_products,
    shop_products,
    shop_farm
WHERE
    farm_products.id = shop_farm.farm_id AND
    shop_farm.farm_id = shop_products.id AND
    farm_products.product != shop_products.product;

I get the following output:

+-------+----------+---------+
|ID     | Product1 | Product2|
+-------+----------+---------+
|06     | 'Apple'  | 'Grape' |
+-------+----------+---------+
|06     | 'Orange' | 'Grape' |
+-------+----------+---------+
|06     | 'Pear'   | 'Apple' |
+-------+----------+---------+
|07     | 'Apple'  | 'Pear'  |
+-------+----------+---------+
|08     | 'Kiwi'   | 'Grape' |
+-------+----------+---------+
|08     | 'Grape'  | 'Orange'|
+-------+----------+---------+

I want a table in which only the first row for each ID appears.

In other words, I want output that looks like this:

+-------+----------+---------+
|ID     | Product1 | Product2|
+-------+----------+---------+
|06     | 'Apple'  | 'Grape' |
+-------+----------+---------+
|07     | 'Apple'  | 'Pear'  |
+-------+----------+---------+
|08     | 'Kiwi'   | 'Grape' |
+-------+----------+---------+

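I tried using DISTINCT to remove all the duplicate IDs, but that (obviously) does not work. I would like to avoid nested queries and keep the code as simple as possible.

Can anyone help?

2 answers:

Answer 0: (score: 1)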

Use row_number().

You can write the query as follows (window functions such as row_number() require MySQL 8.0 or later):

select * from
(select *, row_number() over (partition by id order by Product1) rn
 from table_name
) t where rn = 1
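As a minimal sketch of the row_number() approach, the snippet below loads the question's sample output into an in-memory SQLite database (through Python's sqlite3 module, which needs SQLite 3.25 or later for window functions) and keeps one row per ID. The table name `joined` and the `seq` column are assumptions made for illustration: a SQL result set has no inherent row order, so "first" has to be defined by an explicit ordering column. Ordering by Product1, as in the query above, would pick 'Grape' rather than 'Kiwi' for ID 08.

```python
import sqlite3

# Hypothetical stand-in for the joined result shown in the question.
# `seq` is an assumed column recording the order in which rows should count.
rows = [
    (1, "06", "Apple", "Grape"),
    (2, "06", "Orange", "Grape"),
    (3, "06", "Pear", "Apple"),
    (4, "07", "Apple", "Pear"),
    (5, "08", "Kiwi", "Grape"),
    (6, "08", "Grape", "Orange"),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE joined (seq INTEGER, id TEXT, product1 TEXT, product2 TEXT)"
)
conn.executemany("INSERT INTO joined VALUES (?, ?, ?, ?)", rows)

# Number the rows inside each id group by seq, then keep only row number 1.
query = """
SELECT id, product1, product2
FROM (
    SELECT id, product1, product2,
           ROW_NUMBER() OVER (PARTITION BY id ORDER BY seq) AS rn
    FROM joined
) AS t
WHERE rn = 1
ORDER BY id
"""
first_rows = conn.execute(query).fetchall()
print(first_rows)
```

With the sample data this yields one row each for IDs 06, 07 and 08, matching the desired output in the question.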

Answer 1: (score: 0)

This will work:

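from pyspark.sql.window import Window
from pyspark.sql.functions import monotonically_increasing_id, row_number

# row_number() needs an ordered window. Ordering by
# monotonically_increasing_id() is an assumption here; the original
# call left the ordering empty.
w = Window.orderBy(monotonically_increasing_id())

# Give both existing DataFrames (df and forecasts) a 1-based row index.
df = df.withColumn("rowid", row_number().over(w))
forecasts = forecasts.withColumn("rowid", row_number().over(w))

# Join row by row on that index, then drop the helper column.
mergedDF = df.join(forecasts, "rowid").drop("rowid")
mergedDF.show()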