I have a JSON document in the following format:
{"Request": {"TrancheList": {"Tranche": [{"TrancheId": "500192163","OwnedAmt": "26500000", "Curr": "USD" }, { "TrancheId": "500213369", "OwnedAmt": "41000000","Curr": "USD"}]},"FxRatesList": {"FxRatesContract": [{"Currency": "CHF","FxRate": "0.97919983706115"},{"Currency": "AUD", "FxRate": "1.2966804979253"},{ "Currency": "USD","FxRate": "1"},{"Currency": "SEK","FxRate": "8.1561012531034"},{"Currency": "NOK", "FxRate": "8.2454981641398"},{"Currency": "JPY","FxRate": "111.79999785344"},{"Currency": "HKD","FxRate": "7.7568025218916"},{"Currency": "GBP","FxRate": "0.69425159677867"}, {"Currency": "EUR","FxRate": "0.88991723769689"},{"Currency": "DKK", "FxRate": "6.629598372301"}]},"isExcludeDeals": "true","baseCurrency": "USD"}}
I am trying to get the FxRate value of the Currency entry that equals the baseCurrency field.
I am reading the JSON from an HDFS cluster:
val hdfsRequest = spark.read.json("localhost/user/request.json")
val baseCurrency = hdfsRequest.select("Request.baseCurrency")
var fxRates = hdfsRequest.select("Request.FxRatesList.FxRatesContract")
val fxRatesDF = fxRates.select(explode(fxRates("FxRatesContract"))).toDF("FxRatesContract").select("FxRatesContract.Currency", "FxRatesContract.FxRate").filter($"Currency=baseCurrency")
The error I get when running this line is:
org.apache.spark.sql.AnalysisException: cannot resolve '`Currency=baseCurrency`' given input columns: [Currency, FxRate];
How do I use the baseCurrency variable in a DataFrame filter expression in Scala / Spark?
Thanks
Answer 0: (score: 4)
If the base currency is just a single value, you can do the following:
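import org.apache.spark.sql.functions.explode
import spark.implicits._  // enables the $"..." column syntax and .map on a DataFrame

val hdfsRequest = spark.read.json("localhost/user/request.json")
// Pull the single baseCurrency value back to the driver as an Option[String]
val baseCurrency = hdfsRequest.select("Request.baseCurrency")
  .map(_.getString(0)).collect.headOption
val fxRates = hdfsRequest.select("Request.FxRatesList.FxRatesContract")
val fxRatesDF = fxRates.select(explode(fxRates("FxRatesContract")))
  .toDF("FxRatesContract")
  .select("FxRatesContract.Currency", "FxRatesContract.FxRate")
  // Compare the Currency column with the collected value; an empty string matches no row
  .filter($"Currency" === baseCurrency.getOrElse(""))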
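For reference, the original error comes from `$"Currency=baseCurrency"` being read as one column name instead of a comparison. If you would rather not collect the value to the driver, a possible alternative is to keep baseCurrency as a column next to the exploded rates and compare the two columns directly. This is only a minimal sketch, written as it would be pasted into spark-shell: the in-memory `sample` string is a trimmed copy of the request from the question, and the names `request`, `base`, `fx` and `baseRate` are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.explode

// In spark-shell this simply returns the existing session
val spark = SparkSession.builder().master("local[*]").appName("fx-sketch").getOrCreate()
import spark.implicits._

// Assumption: a trimmed, in-memory copy of the request stands in for the HDFS file
val sample = Seq(
  """{"Request": {"FxRatesList": {"FxRatesContract": [{"Currency": "CHF", "FxRate": "0.97919983706115"}, {"Currency": "USD", "FxRate": "1"}]}, "baseCurrency": "USD"}}"""
).toDS()
val request = spark.read.json(sample)

// Carry baseCurrency along with every exploded rate, then compare column to column
val baseRate = request
  .select($"Request.baseCurrency".as("base"),
          explode($"Request.FxRatesList.FxRatesContract").as("fx"))
  .where($"fx.Currency" === $"base")
  .select($"fx.Currency", $"fx.FxRate")

baseRate.show()
```

This avoids the round trip to the driver and still works if the input ever holds more than one request, since each request's rates are compared with that request's own baseCurrency.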