Passing a list loaded from a text file to a SQL query in Spark SQL

Time: 2019-02-11 06:56:02

Tags: sql scala apache-spark jdbc apache-spark-sql

I have read account numbers from a text file that uses , as the delimiter:

val csv = spark.read.text("src/main/resources/in/insight/account_issues.txt")

//implicits
import spark.sqlContext.implicits._

val string_account = csv.map(_.getString(0)).collect.toList.toString()
//print(string_account)

val query = s"""(SELECT
               |    ACCOUNT_NUMBER,
               |    CASE WHEN STMT.CRF_TYPE='CREDIT' THEN STMT.AMOUNT_LCY
               |        ELSE NULL
               |    END as 'CreditAmount',
               |    CASE WHEN STMT.CRF_TYPE='DEBIT' THEN STMT.AMOUNT_LCY
               |        ELSE  NULL
               |    END as 'DebitAmount',
               |    STMT.BOOKING_DATE,
               |    STMT.VALUE_DATE,
               |    CRF_TYPE
               |FROM [InsightLanding].[dbo].[v_STMT_ENTRY] AS STMT
               |    LEFT JOIN [InsightWarehouse].[dbo].[v_Account] AS A ON a.AccountNum = STMT.ACCOUNT_NUMBER
               |
               |WHERE STMT.MIS_DATE='$BusinessDate'
               | AND STMT.ACCOUNT_NUMBER IN ($string_account) ) tmp """.stripMargin

val responseWithSelectedColumns = spark
  .read
  .format("jdbc")
  .option("url", url)
  .option("driver", driver)
  .option("dbtable", query)
  .load()

I can't get this to work; instead I get the following error:

: 'List' is not a recognized built-in function name

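What is wrong with my code?

1 answer:

Answer 0: (score: 3)

When you create string_account, you call toString() on the list. That gives you a string of the form List(...), which SQL Server then tries to interpret as a function call named List. For example: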

scala> List(1,2,3).toString()
res0: String = List(1, 2, 3)

What you want to use instead is mkString(","):

scala> List(1,2,3).mkString(",")
res1: String = "1,2,3"

In your case that would be:

val string_account = csv.map(_.getString(0)).collect.toList.mkString(",")
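
To see why the first version fails, it can help to look at the text that actually ends up inside the IN (...) clause. The following is a minimal sketch that runs in the plain Scala REPL without Spark; the accounts list and its values are hypothetical stand-ins for whatever collect returns from the file:

```scala
// Hypothetical account numbers standing in for the values collected from the file.
val accounts = List("1001", "1002", "1003")

// toString() includes the collection name, mkString(",") only joins the elements.
val viaToString = accounts.toString()
val viaMkString = accounts.mkString(",")

// Same interpolation pattern as the last line of the query in the question.
val badClause  = s"AND STMT.ACCOUNT_NUMBER IN ($viaToString)"
val goodClause = s"AND STMT.ACCOUNT_NUMBER IN ($viaMkString)"

println(badClause)  // AND STMT.ACCOUNT_NUMBER IN (List(1001, 1002, 1003))
println(goodClause) // AND STMT.ACCOUNT_NUMBER IN (1001,1002,1003)
```

The first clause is what produces the 'List' is not a recognized built-in function name error, since the database sees List(...) as a function call.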

Note: if you prefer, you can have string_account carry the parentheses itself by using mkString("(", ",", ")"), instead of writing them in the SQL query.
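
A rough sketch of the three-argument form, again in plain Scala with a hypothetical accounts list. The question does not say whether ACCOUNT_NUMBER is a numeric or a character column; the quoted variant below is only an assumption for the case where it is a character column and each value needs single quotes:

```scala
// Hypothetical account numbers; in the question these come from the text file.
val accounts = List("1001", "1002", "1003")

// Three-argument mkString: prefix, separator, suffix.
val numericList = accounts.mkString("(", ",", ")")

// Assumption: ACCOUNT_NUMBER is a character column, so each value is wrapped
// in single quotes. Embedded single quotes are doubled as a minimal escape.
val quotedList = accounts
  .map(_.trim)
  .filter(_.nonEmpty)
  .map(a => "'" + a.replace("'", "''") + "'")
  .mkString("(", ",", ")")

// The parentheses now come from the string, so the query text omits them.
val clause = s"AND STMT.ACCOUNT_NUMBER IN $quotedList"

println(numericList) // (1001,1002,1003)
println(clause)      // AND STMT.ACCOUNT_NUMBER IN ('1001','1002','1003')
```

Since the values are interpolated straight into the SQL text, this is only reasonable when the file contents are trusted.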