How to flatten an array in a Spark RDD

Time: 2017-03-10 07:14:14

Tags: apache-spark

I created a Spark RDD from MongoDB data. After creating the RDD, I get the result below. Sample result set:

Document{{_id=5737782,
SurveyNo=54ebc800-6gd3f-11e6-8ccb-ef65b1f62b86,
currenceNo=b5ae2a30-6e09-11e6-b9e1-757ghs218348,
"QA" : [
        {
            "QuestionId" : "8cdffd91-1bad-11e5-aa85-d53d21cd02a4", 
            "QuestionText" : "How are you?", 
            "ScaledAnswerValue" : NumberInt(75), 
            "AnswerValue" : NumberInt(3), 
            "AnswerText" : "Horrible", 
            "AnsweredDate" : ISODate("2014-12-10T07:00:43.958+0000"), 
            "Order" : NumberInt(0), 
            "Status" : "Failed", 
            "AnswerSelectors" : [
                {
                    "Value" : NumberInt(0), 
                    "Text" : "Horrible", 
                    "Order" : NumberInt(1)
                }, 
                {
                    "Value" : NumberInt(1), 
                    "Text" : "Poor", 
                    "Order" : NumberInt(2)
                }, 
                {
                    "Value" : NumberInt(2), 
                    "Text" : "Fair", 
                    "Order" : NumberInt(3)
                }, 
                {
                    "Value" : NumberInt(3), 
                    "Text" : "Good", 
                    "Order" : NumberInt(4)
                }, 
                {
                    "Value" : NumberInt(4), 
                    "Text" : "Excellent", 
                    "Order" : NumberInt(5)
                }
            ], 
            "AnswerType" : "Mood"
        }
    ]

I want to select only QA.QuestionId, QA.QuestionText, and QA.ScaledAnswerValue. How can I flatten this QA array?
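
For illustration, here is a minimal Python sketch of the flattening being asked about. It assumes each RDD record can be treated as a dict-like object with a `QA` list. The sample above is printed as a `Document{{...}}`, so in Scala or Java the field access would look different. `flatten_qa` and `sample_doc` are hypothetical names introduced here, and `sample_doc` is a trimmed copy of the sample record. The function emits one tuple per `QA` entry, which is the kind of per-record iterable a `flatMap` consumes.

```python
from typing import Any, Dict, Iterator, Optional, Tuple

# Hypothetical stand-in for one RDD record, trimmed from the sample above.
sample_doc: Dict[str, Any] = {
    "_id": 5737782,
    "SurveyNo": "54ebc800-6gd3f-11e6-8ccb-ef65b1f62b86",
    "QA": [
        {
            "QuestionId": "8cdffd91-1bad-11e5-aa85-d53d21cd02a4",
            "QuestionText": "How are you?",
            "ScaledAnswerValue": 75,
            "AnswerValue": 3,
            "AnswerText": "Horrible",
            "Status": "Failed",
            "AnswerType": "Mood",
        }
    ],
}

Row = Tuple[Optional[str], Optional[str], Optional[int]]


def flatten_qa(doc: Dict[str, Any]) -> Iterator[Row]:
    """Yield one (QuestionId, QuestionText, ScaledAnswerValue) tuple per QA entry."""
    # A missing or null QA field yields no rows instead of raising.
    for qa in doc.get("QA") or []:
        yield (
            qa.get("QuestionId"),
            qa.get("QuestionText"),
            qa.get("ScaledAnswerValue"),
        )


# Plain-Python equivalent of a flatMap over a collection of records.
rows = [row for doc in [sample_doc] for row in flatten_qa(doc)]
print(rows)
```

With PySpark, the same function could be passed to `RDD.flatMap`, for example `rdd.flatMap(flatten_qa)`, assuming `rdd` holds records of this shape.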

0 answers:

No answers yet