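A query that pulls the results below out as key-value pairs

Time: 2018-06-24 22:23:46

Tags: scala apache-spark cassandra

Context: I am using a DataStax DSE 5.1.8 cluster with Spark version 2.0.2.17 and Scala version 2.11.11.

Problem: I have a table with this structure definition: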

CREATE TABLE test (
location text,
cell text,
testdate timestamp,
model text,
testnumber bigint,
reading bigint,
scan bigint,
testtype text,
readingtype text,
payloadtype text,
datatype text,
measurementid timeuuid,
aircraftengineposition text,
aircrafttailnumber text,
aircrafttypename text,
analysisnumber bigint,
archiveddate timestamp,
archivemethod text,
createddate timestamp,
edwid bigint,
engineserialnumber text,
etdscampus text,
etdsrecordid bigint,
etdssetid bigint,
fleet text,
flightnumber text,
jnumber bigint,
method text,
modifieddate timestamp,
parameters map<text, frozen<tuple<text, text, bigint, double, text>>>,
payloadchecksum blob,
payloadlocation text,
payloaduri text,
payloaduserreference text,
powerset text,
rating text,
status text,
vehicle text,
PRIMARY KEY ((location, cell, testdate, model, testnumber, reading, scan, testtype, readingtype, payloadtype, datatype), measurementid)
) WITH CLUSTERING ORDER BY (measurementid DESC)

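My partition key is: location, cell, testdate, model, testnumber, reading, scan, testtype, readingtype, payloadtype, datatype. The clustering column is measurementid.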

When I query the parameters column of this table in Cassandra, I get, for example, one row of tuples like the one below:

{'PF121A': ('R', None, None, 1131.65942, None), 'PT25AA': ('R', None, None, 29.40011, None), 'PT25AB': ('R', None, None, 29.83459, None), 'PT25AC': ('R', None, None, 29.93993, None), 'PT25AD': ('R', None, None, 14.02732, None), 'PT25AE': ('R', None, None, 31.12416, None), 'PT25AF': ('R', None, None, 31.06807, None), 'PT25SA': ('R', None, None, 14.02681, None), 'PT25SB': ('R', None, None, 29.39588, None), 'PT25SC': ('R', None, None, 29.6147, None), 'PT25SD': ('R', None, None, 29.97557, None), 'PT25SE': ('R', None, None, 30.94819, None), 'PT25SF': ('R', None, None, 30.66682, None), 'PT31FA': ('R', None, None, 814.65393, None), 'PT31FB': ('R', None, None, 818.90576, None), 'PT31FC': ('R', None, None, 819.23676, None), 'PT31FD': ('R', None, None, 814.65485, None), 'PT31FE': ('R', None, None, 814.4751, None), 'PT31FF': ('R', None, None, 813.31396, None), 'PT31RA': ('R', None, None, 812.73108, None), 'PT31RB': ('R', None, None, 814.26917, None), 'PT31RC': ('R', None, None, 814.9386, None), 'PT31RD': ('R', None, None, 814.97534, None), 'PT31RE': ('R', None, None, 814.57043, None), 'PT31RF': ('R', None, None, 813.4953, None), 'PT49JA': ('R', None, None, 137.66408, None), 'PT49JB': ('R', None, None, 136.09262, None), 'PT49JC': ('R', None, None, 134.53801, None), 'PT49JD': ('R', None, None, 135.47989, None), 'PT49JE': ('R', None, None, 134.17297, None), 'PT49WA': ('R', None, None, 138.44234, None), 'PT49WB': ('R', None, None, 136.7596, None), 'PT49WC': ('R', None, None, 135.41495, None), 'PT49WD': ('R', None, None, 135.69455, None), 'PT49WE': ('R', None, None, 133.96593, None), 'PT50EA': ('R', None, None, 20.66105, None), 'PT50EB': ('R', None, None, 20.15389, None), 'PT50EC': ('R', None, None, 20.01845, None), 'PT50ED': ('R', None, None, 19.94046, None), 'PT50EE': ('R', None, None, 14.02631, None), 'PT50EF': ('R', None, None, 20.40714, None), 'PT50SA': ('R', None, None, 20.47363, None), 'PT50SB': ('R', None, None, 19.98885, None), 'PT50SC': ('R', None, None, 19.87228, None), 
'PT50SD': ('R', None, None, 19.74927, None), 'PT50SE': ('R', None, None, 19.84222, None), 'PT50SF': ('R', None, None, 14.0299, None), 'PUBDAT': ('R', None, None, 14717, None), 'PUBTIM': ('R', None, None, 6.1759e+07, None), 'TACALL': ('R', None, None, 217.15161, None), 'TACALR': ('R', None, None, 215.13361, None), 'TACAUL': ('R', None, None, 192.04047, None), 'TACAUR': ('R', None, None, 207.94643, None), 'TACFLL': ('R', None, None, 166.98624, None), 'TACFLR': ('R', None, None, 153.83304, None), 'TACFUL': ('R', None, None, 180.14462, None), 'TACFUR': ('R', None, None, 168.80762, None), 'TACMUL': ('R', None, None, 183.63214, None), 'TACMUR': ('R', None, None, 111.59378, None), 'TM25AA': ('R', None, None, 209.83342, None), 'TM25AB': ('R', None, None, 209.16489, None), 'TM25AC': ('R', None, None, 209.9166, None), 'TM25AD': ('R', None, None, 216.28339, None), 'TM25AE': ('R', None, None, 221.4989, None), 'TM25AF': ('R', None, None, 227.64745, None), 'TM25SA': ('R', None, None, 209.70547, None), 'TM25SB': ('R', None, None, 208.61472, None), 'TM25SC': ('R', None, None, 208.71387, None), 'TM25SD': ('R', None, None, 212.55757, None), 'TM25SE': ('R', None, None, 222.05238, None), 'TM25SF': ('R', None, None, 227.34381, None), 'TM31FA': ('R', None, None, 1303.27625, None), 'TM31FB': ('R', None, None, 1302.17749, None), 'TM31FC': ('R', None, None, 1301.16431, None), 'TM31FD': ('R', None, None, 1300.43945, None), 'TM31FE': ('R', None, None, 1306.04688, None), 'TM31FF': ('R', None, None, 1308.64978, None), 'TM31RA': ('R', None, None, 1308.25098, None), 'TM31RB': ('R', None, None, 1307.29443, None), 'TM31RC': ('R', None, None, 1307.47229, None), 'TM31RD': ('R', None, None, 1308.46118, None), 'TM31RE': ('R', None, None, 1311.94226, None), 'TM31RF': ('R', None, None, 1317.01111, None), 'TM50EA': ('R', None, None, 1066.51196, None), 'TM50EB': ('R', None, None, 1037.17004, None), 'TM50EC': ('R', None, None, 1035.68445, None), 'TM50ED': ('R', None, None, 1030.4646, None), 'TM50EE': ('R', 
None, None, 1022.53888, None), 'TM50EF': ('R', None, None, 1016.54425, None), 'TM50SA': ('R', None, None, 1064.77722, None), 'TM50SB': ('R', None, None, 1044.97632, None), 'TM50SC': ('R', None, None, 1044.73755, None), 'TM50SD': ('R', None, None, 1046.18323, None), 'TM50SE': ('R', None, None, 1047.10889, None), 'TM50SF': ('R', None, None, 1042.55457, None), 'XNHCT1': ('R', None, None, 10639.88574, None), 'XNLCT1': ('R', None, None, 2448.99121, None)} 

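My question is how to write a Scala query that turns the result above into key-value pairs, where the value is the double field of each tuple. For example, taken from the result above: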

KEY: VALUE PAIR

PF121A: 1131.65942
PT25AA: 29.40011 
PT25AB: 29.83459 
PT25AC: 29.93993 
PT25AD: 14.02732
PT25AE: 31.12416 
PT25AF: 31.06807
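
To make the target shape concrete, here is a minimal sketch in plain Scala, with no Spark or Cassandra involved. It assumes each value of the parameters map can be modeled as a 5-field tuple whose fourth field holds the reading, with nullable fields modeled as Option. The names ParametersToPairs, Param, reading, sample, and toPairs are hypothetical and only serve the illustration.

```scala
object ParametersToPairs {
  // Hypothetical model of tuple<text, text, bigint, double, text>;
  // fields that may be null in Cassandra are modeled as Option.
  type Param = (String, Option[String], Option[Long], Option[Double], Option[String])

  // Helper (an assumption for brevity): builds a tuple shaped like the sample row,
  // where only the first and fourth fields are set.
  private def reading(value: Double): Param = ("R", None, None, Some(value), None)

  // Keep the map key and the double field (4th position);
  // entries without a double value are dropped. Sorted by key for stable output.
  def toPairs(parameters: Map[String, Param]): Seq[(String, Double)] =
    parameters.toSeq
      .collect { case (key, (_, _, _, Some(value), _)) => key -> value }
      .sortBy(_._1)

  // Three entries copied from the sample row above.
  val sample: Map[String, Param] = Map(
    "PF121A" -> reading(1131.65942),
    "PT25AA" -> reading(29.40011),
    "PT25AB" -> reading(29.83459)
  )

  def main(args: Array[String]): Unit =
    toPairs(sample).foreach { case (key, value) => println(s"$key: $value") }
}
```

Running main prints one "KEY: VALUE" line per map entry, which is the layout I am after.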

Below is the first query I ran from Spark, but I don't know how to continue:

scala> val parameters = sc.cassandraTable(String).select("parameters")
parameters: com.datastax.spark.connector.rdd.CassandraTableScanRDD[String] = CassandraTableScanRDD[1] at RDD at CassandraRDD.scala:19

Result for 5 rows:

scala> parameters.take(10)
res1: Array[String] = Array({PT25AF: (R, null, null, 14.898366928100586, null),PT31RD: (R, null, null, 73.38668060302734, null),PT49WD: (R, null, null, 20.127065658569336, null),TM50SC: (R, null, null, 803.0751342773438, null),PT50SE: (R, null, null, 14.305357933044434, null),TACAUL: (R, null, null, 211.5594940185547, null),PT31RE: (R, null, null, 73.51492309570312, null),PT31FE: (R, null, null, 73.38947296142578, null),TM50SF: (R, null, null, 810.0723876953125, null),PT50ED: (R, null, null, 14.262494087219238, null),PT49JC: (R, null, null, 20.577911376953125, null),TACMUR: (R, null, null, 113.4200668334961, null),TM50EF: (R, null, null, 802.0999145507812, null),PT50EE: (R, null, null, 14.02810001373291, null),PT25SC: (R, null, null, 14.932862281799316, null),PT25AA: (R, null, null, 14..
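
Since the RDD above holds each row as a String, one possible continuation is to parse that string. The sketch below is only a rough idea under a strong assumption: that every row is formatted exactly like the REPL output above, i.e. entries of the form KEY: (f1, f2, f3, f4, f5) with no commas or parentheses inside the fields. The names ParseParametersString, Entry, and parse are hypothetical.

```scala
import scala.util.Try

object ParseParametersString {
  // Matches one entry such as "PT25AF: (R, null, null, 14.898366928100586, null)".
  // Group 1 is the key, group 2 is the comma-separated tuple body.
  private val Entry = """(\w+): \(([^)]*)\)""".r

  // Extracts (key, 4th tuple field) pairs from one stringified row.
  // Entries that do not have 5 fields, or whose 4th field is not numeric
  // (for example "null"), are skipped.
  def parse(row: String): List[(String, Double)] =
    Entry.findAllMatchIn(row).toList.flatMap { m =>
      val fields = m.group(2).split(",").map(_.trim)
      if (fields.length == 5)
        Try(fields(3).toDouble).toOption.map(value => m.group(1) -> value)
      else
        None
    }

  def main(args: Array[String]): Unit = {
    // Two entries copied from the REPL output above.
    val row = "{PT25AF: (R, null, null, 14.898366928100586, null)," +
      "PT31RD: (R, null, null, 73.38668060302734, null)}"
    parse(row).foreach { case (key, value) => println(s"$key: $value") }
  }
}
```

If this holds up, something like parameters.flatMap(ParseParametersString.parse) should give an RDD of (key, value) pairs, though I have not verified it on the cluster. String parsing is fragile, so reading the column in a typed form instead of as a String would probably be the more robust route.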

Thank you for your help.

0 answers:

No answers yet.