SQL: correct filter for float values stored in a string column

Time: 2018-12-04 12:22:25

Tags: php mysql sql database

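I have a table invoices with a column total of type varchar(255). It holds values like "500.00", "5'199.00", "129.60", "1.00", and so on. I need to select records and filter them by the total column. For example, find the records whose total is at most 180.

I tried: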

SELECT total from invoices WHERE invoices.total <= '180'

But the result is:

125.25
100.50
1593.55 - not correct
4'799.00 - not correct
1.00
-99.00
2406.52 - not correct

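How can I fix this and write a correct filter for this column? Thanks!

3 Answers:

Answer 0 (score: 1)

You can use the CAST() function to convert the value to a numeric type before comparing it.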
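A minimal sketch of why the cast matters, using literal values rather than the invoices table. The DECIMAL(20, 4) target type is an assumption; any numeric type wide enough for the data would do. When both sides are strings, MySQL compares them character by character, so '1593.55' sorts before '180'.

```sql
-- String comparison: '5' < '8' at the second character, so the test is true.
SELECT '1593.55' <= '180' AS string_compare;                         -- 1

-- Numeric comparison after an explicit cast: 1593.55 is not <= 180.
SELECT CAST('1593.55' AS DECIMAL(20, 4)) <= 180 AS numeric_compare;  -- 0
```

Values with a thousands separator, such as 5'199.00, would still need the apostrophe removed before the cast, since the conversion stops at the first non-numeric character.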


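Answer 1 (score: 0)

Why are you storing numbers as strings? That is a fundamental problem with your data model, and you should fix it.

Sometimes we are stuck with other people's really, really, really bad decisions. In that case, you can try to work around the problem with an explicit conversion: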

SELECT i.total 
FROM invoices i
WHERE CAST(REPLACE(i.total, '''', '') as DECIMAL(20, 4)) <= 180;

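Note that this breaks down if total contains other unexpected characters. Depending on the context and SQL mode, MySQL either raises an error or, more commonly in a plain SELECT, issues a truncation warning and converts only the leading numeric part.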
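One way to find rows that would not convert cleanly is a pattern check before relying on the cast. This is only a sketch: the pattern assumes a valid value is an optional minus sign, digits, and an optional decimal part once apostrophes and surrounding spaces are removed, which is a guess about the data rather than something stated in the question.

```sql
-- List totals that still contain unexpected characters after cleanup.
-- Rows where total is NULL are not returned by this check.
SELECT i.total
FROM invoices i
WHERE REPLACE(TRIM(i.total), '''', '') NOT REGEXP '^-?[0-9]+([.][0-9]+)?$';
```

If this returns no rows, the CAST in the query above should see only clean numeric strings.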

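Answer 2 (score: 0)

If the string starts with a number and is followed by non-numeric characters, you can convert it to a number with the CAST() function, or implicitly by adding 0: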

SELECT CAST('1234abc' AS UNSIGNED); -- 1234
SELECT '1234abc'+0; -- 1234
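Applied to values like the ones in the question, the same implicit conversion behaves as sketched below. The REPLACE step is borrowed from the other answer and is an assumption about how the thousands separator should be handled.

```sql
SELECT '129.60' + 0;                         -- 129.6
SELECT '5''199.00' + 0;                      -- 5 (conversion stops at the apostrophe)
SELECT REPLACE('5''199.00', '''', '') + 0;   -- 5199
```

So the leading-number trick alone is not enough for totals that contain a separator inside the number.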

To extract a number from an arbitrary string, you can add a custom function, such as this one:

DELIMITER $$

CREATE FUNCTION `ExtractNumber`(in_string VARCHAR(50)) 
RETURNS INT
NO SQL
BEGIN
    -- Position of the current character in the digit list (0 if not a digit)
    DECLARE ctrNumber INT;
    -- Digits collected so far
    DECLARE finNumber VARCHAR(50) DEFAULT '';
    DECLARE sChar VARCHAR(1);
    DECLARE inti INTEGER DEFAULT 1;

    -- CHAR_LENGTH counts characters, matching how SUBSTRING indexes the string
    IF CHAR_LENGTH(in_string) > 0 THEN
        WHILE(inti <= CHAR_LENGTH(in_string)) DO
            SET sChar = SUBSTRING(in_string, inti, 1);
            SET ctrNumber = FIND_IN_SET(sChar, '0,1,2,3,4,5,6,7,8,9'); 
            -- Keep only the characters 0-9; everything else is dropped
            IF ctrNumber > 0 THEN
                SET finNumber = CONCAT(finNumber, sChar);
            END IF;
            SET inti = inti + 1;
        END WHILE;
        RETURN CAST(finNumber AS UNSIGNED);
    ELSE
        RETURN 0;
    END IF;    
END$$

DELIMITER ;

Once the function is defined, you can use it in a query:

SELECT total from invoices WHERE ExtractNumber(invoices.total) <= 180
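A quick check of what the function returns for values like those in the question, assuming ExtractNumber has been created as shown above. Because it keeps only the digits 0-9, the decimal point and the sign are dropped:

```sql
SELECT ExtractNumber('1234abc');   -- 1234
SELECT ExtractNumber('129.60');    -- 12960
SELECT ExtractNumber('-99.00');    -- 9900
```

This suggests the function fits strings that hold whole, non-negative numbers. For decimal totals such as the ones in the invoices table, stripping the separator and casting to DECIMAL is likely the safer route.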