I am reading data from a Kafka topic in a Spark Streaming job. I need to create a key-value RDD from the data.
val messages = KafkaUtils.createStream(streamingContext, "localhost:2181","abc",topics, StorageLevel.MEMORY_ONLY)
messages.print()
// create key value RDD out of CustomerId and Tokens
val xactionByCustomer = messages.map(_._2).map {
  transaction =>
    val key = transaction.customerId
    var tokens = transaction.tokens
    (key, tokens)
}
Error:
[error] /home/ec2-user/alok/marseille/src/main/scala/com/jcalc/feed/MarkovPredictor.scala:115: value customerId is not a member of String
[error] val key = transaction.customerId
[error] ^
[error] /home/ec2-user/alok/marseille/src/main/scala/com/jcalc/feed/MarkovPredictor.scala:116: value tokens is not a member of String
[error] var tokens = transaction.tokens
[error] ^
[error] two errors found
[error] (compile:compileIncremental) Compilation failed
Sample data:
(null,W3Q6TF3CCI,X84N230CIH,NNN)
(null,O8IV7KEXT0,G1D590G05V,NNS)
(null,LBQKYNE081,MYU0O7JC5H,NHN)
(null,SRB4P501SW,E0FTI4RN7X,LHL)
(null,HELRFMAXVS,W6F704TN21,LHN)
(null,FS4PLQLI63,TK1O9YHS15,NNN)
(null,KI70UDVJLC,4ANBDAW7SU,LNN)
(null,IP6IVPGCWQ,MD93GGGBKA,NNN)
(null,976N9RPXSP,JKU0SV7UMH,LNL)
(null,J4V3AB1YVT,J9WXC1BRAY,LHN)
I am interested in only the 2nd and 4th values for the pair RDD. Any help?
Answer 0 (score: 0)
Your data looks like a tuple of type (String, String, String, String), so, since you are interested in the 2nd and 4th values, mapping:
val xactionByCustomer = messages.map(row => (row._2, row._4))
should be enough.
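One caveat: KafkaUtils.createStream yields a DStream of (String, String) pairs (message key, message value), which is why the compiler reports transaction as a String after map(_._2). If that is the case here, each printed line is probably a pair whose key is null and whose value is the single comma-separated string, and row._4 would not compile either. A minimal sketch under that assumption, splitting the value and keeping its first and last fields (parseTransaction and TransactionParsing are hypothetical names, not part of Spark):

```scala
// Assumption: each Kafka message value is one comma-separated line with
// exactly three fields, e.g. "W3Q6TF3CCI,X84N230CIH,NNN", where the first
// field is the customer id and the last field is the tokens.
// parseTransaction is a hypothetical helper, not a Spark API.
object TransactionParsing {
  def parseTransaction(value: String): Option[(String, String)] =
    value.split(",").map(_.trim) match {
      case Array(customerId, _, tokens) => Some((customerId, tokens))
      case _                            => None // skip malformed lines
    }

  def main(args: Array[String]): Unit = {
    val sample = Seq(
      "W3Q6TF3CCI,X84N230CIH,NNN",
      "O8IV7KEXT0,G1D590G05V,NNS",
      "bad line"
    )
    // Prints (W3Q6TF3CCI,NNN) and (O8IV7KEXT0,NNS); "bad line" is dropped.
    sample.flatMap(parseTransaction(_)).foreach(println)
  }
}
```

In the streaming job this would presumably be applied as messages.map(_._2).flatMap(TransactionParsing.parseTransaction(_)), which gives a DStream of (customerId, tokens) pairs. Returning an Option keeps a single malformed message from failing the whole batch.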