How to convert JavaRDD<Row> to JavaRDD<String> in Apache Spark Java

Asked: 2017-11-02 13:10:42

Tags: java apache-spark

I am trying to run JavaRDD operations on Row-typed data, but I cannot parse or iterate over the JavaRDD<Row> data.

Schema:

root
 |-- categories: array (nullable = true)
 |    |-- element: string (containsNull = true)
 |-- discount: long (nullable = true)
 |-- expiration: string (nullable = true)
 |-- id: long (nullable = true)
 |-- maxCashback: string (nullable = true)
 |-- minTicket: long (nullable = true)
 |-- name: string (nullable = true)
 |-- rules: struct (nullable = true)
 |    |-- cardRequired: boolean (nullable = true)
 |    |-- cardType: array (nullable = true)
 |    |    |-- element: string (containsNull = true)
 |    |-- usageLimit: long (nullable = true)
 |    |-- vendor: array (nullable = true)
 |    |    |-- element: string (containsNull = true)

Data:

+--------------------+--------+----------+---+-----------+---------+--------------------+--------------------+
|          categories|discount|expiration| id|maxCashback|minTicket|                name|               rules|
+--------------------+--------+----------+---+-----------+---------+--------------------+--------------------+
|      [Movie, Event]|    null|31-03-2018|  1|        100|        1|ICICI Bank Credit...|[true,WrappedArra...|
|             [Movie]|      10|30-11-2017|  2|        100|        2|RBL Credit Card O...|[true,WrappedArra...|
|             [Movie]|    null|30-11-2017|  3|        150|        2|SBI RUPAY PLATINU...|[true,WrappedArra...|
|             [Movie]|    null|31-10-2017|  4|        150|        2|IDEA Select Prepa...|[true,WrappedArra...|
|[Movie, Event, Sp...|      10|31-10-2017|  5|        150|        1|Mobikwik Wallet O...|[true,WrappedArra...|
|[Movie, Event, Sp...|    null|      null|  6|         {}|        1|       Payback Point|[null,WrappedArra...|
+--------------------+--------+----------+---+-----------+---------+--------------------+--------------------+

Code snippet:

JavaRDD<Row> applicableOffers = offers.toJavaRDD();
applicableOffers.foreach((a) -> {
    // Column positions looked up by name (not used yet)
    int fieldNoTicket = a.fieldIndex("minTicket");
    int fieldNoCashback = a.fieldIndex("maxCashback");
    int fieldNoDiscount = a.fieldIndex("discount");

    System.out.println("a : " + a);
});

Output:

a : [WrappedArray(Movie, Event),null,31-03-2018,1,100,1,ICICI Bank Credit Card Offer,[true,WrappedArray(Credit),null,WrappedArray(ICICI)]]
a : [WrappedArray(Movie),10,30-11-2017,2,100,2,RBL Credit Card Offer,[true,WrappedArray(Credit),15,WrappedArray(RBL)]]
a : [WrappedArray(Movie),null,30-11-2017,3,150,2,SBI RUPAY PLATINUM DEBIT CARD OFFER,[true,WrappedArray(Platinum Debit),null,WrappedArray(SBI)]]
a : [WrappedArray(Movie),null,31-10-2017,4,150,2,IDEA Select Prepaid Offer,[true,WrappedArray(SIM),null,WrappedArray(IDEA)]]
a : [WrappedArray(Movie, Event, Sports),10,31-10-2017,5,150,1,Mobikwik Wallet Offer,[true,WrappedArray(eWallet),null,WrappedArray(Mobikwik)]]
a : [WrappedArray(Movie, Event, Sports),null,null,6,{},1,Payback Point,[null,WrappedArray(Credit, Debit),null,WrappedArray(ICICI,SBI,Canara)]]

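All I need to do is run an action that calculates the discount on an order of USD 1000 and outputs the resulting value and the name of each offer, using Apache Spark in Java.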

1 answer:

Answer 0 (score: 0)

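I managed to find a workaround: use fieldIndex(colName) to look up a column's index, then a typed getter such as getString(index) or getLong(index) to read the value. Note that discount is nullable, and getLong(index) throws a NullPointerException on a null value, so check isNullAt(index) first.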

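int orderValue = 1000; // order value: USD 1000

applicableOffers.foreach((a) -> {
    // Look up column positions by name
    int nameIdx = a.fieldIndex("name");
    int discountIdx = a.fieldIndex("discount");

    String offerName = a.getString(nameIdx);
    // discount is nullable and getLong throws on null, so check first.
    // Treating a missing discount as 0 is an assumption.
    long discount = a.isNullAt(discountIdx) ? 0L : a.getLong(discountIdx);

    // computeCashBack is my own helper (not shown here)
    System.out.println("Offer:" + offerName + "  Total:" + computeCashBack(orderValue, discount));
});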
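
The snippet above calls computeCashBack without showing it. Below is a minimal sketch of what such a helper might look like in plain Java. Everything about it is an assumption, not something stated in the post: the class name CashbackSketch, the rule that discount is a percentage of the order value, the rule that a null discount means no cashback, and the optional cap taken from the maxCashback string column (which contains values such as "100" and "{}" in the sample data).

```java
class CashbackSketch {

    // Hypothetical rule: discount is a percentage of the order value;
    // a missing (null) discount yields no cashback.
    static long computeCashBack(int orderValue, Long discount) {
        if (discount == null) {
            return 0L;
        }
        return orderValue * discount / 100;
    }

    // Hypothetical variant that also caps the result at maxCashback.
    // maxCashback arrives as a string; null or non-numeric values
    // such as "{}" are treated as "no cap".
    static long computeCashBack(int orderValue, Long discount, String maxCashback) {
        long raw = computeCashBack(orderValue, discount);
        Long cap = parseCap(maxCashback);
        return cap == null ? raw : Math.min(raw, cap);
    }

    // Returns the numeric cap, or null when the text is missing or not a number.
    static Long parseCap(String text) {
        if (text == null) {
            return null;
        }
        try {
            return Long.parseLong(text.trim());
        } catch (NumberFormatException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        int orderValue = 1000;
        System.out.println(computeCashBack(orderValue, 10L));        // 100
        System.out.println(computeCashBack(orderValue, null));       // 0
        System.out.println(computeCashBack(orderValue, 20L, "150")); // 150
        System.out.println(computeCashBack(orderValue, 10L, "{}"));  // 100
    }
}
```

Inside the foreach lambda, the three-argument variant could be fed with a.getString(a.fieldIndex("maxCashback")), since maxCashback is a string column in the schema. Whether a cap should apply at all depends on the real offer rules, which the post does not spell out.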