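I am trying to write an RDD to a Cassandra table. As the log below shows, TableWriter reports writing 0 rows several times before it finally writes a row to Cassandra.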
18/10/22 07:15:50 INFO TableWriter: Wrote 0 rows to log_by_date in 0.171 s.
18/10/22 07:15:50 INFO Executor: Finished task 4.0 in stage 0.0 (TID 4). 622 bytes result sent to driver
18/10/22 07:15:50 INFO TableWriter: Wrote 0 rows to log_by_date in 0.220 s.
18/10/22 07:15:50 INFO Executor: Finished task 1.0 in stage 0.0 (TID 1). 665 bytes result sent to driver
18/10/22 07:15:50 INFO TableWriter: Wrote 0 rows to log_by_date in 0.194 s.
18/10/22 07:15:50 INFO TableWriter: Wrote 0 rows to log_by_date in 0.224 s.
18/10/22 07:15:50 INFO Executor: Finished task 6.0 in stage 0.0 (TID 6). 708 bytes result sent to driver
18/10/22 07:15:50 INFO TableWriter: Wrote 0 rows to log_by_date in 0.231 s.
18/10/22 07:15:50 INFO Executor: Finished task 5.0 in stage 0.0 (TID 5). 622 bytes result sent to driver
18/10/22 07:15:50 INFO Executor: Finished task 3.0 in stage 0.0 (TID 3). 622 bytes result sent to driver
18/10/22 07:15:50 INFO TableWriter: Wrote 0 rows to log_by_date in 0.246 s.
18/10/22 07:15:50 INFO Executor: Finished task 0.0 in stage 0.0 (TID 0). 708 bytes result sent to driver
18/10/22 07:15:50 INFO TaskSetManager: Finished task 4.0 in stage 0.0 (TID 4) in 418 ms on localhost (executor driver) (1/8)
18/10/22 07:15:50 INFO TaskSetManager: Finished task 2.0 in stage 0.0 (TID 2) in 433 ms on localhost (executor driver) (2/8)
18/10/22 07:15:50 INFO TaskSetManager: Finished task 5.0 in stage 0.0 (TID 5) in 426 ms on localhost (executor driver) (3/8)
18/10/22 07:15:50 INFO TaskSetManager: Finished task 3.0 in stage 0.0 (TID 3) in 433 ms on localhost (executor driver) (4/8)
18/10/22 07:15:50 INFO TaskSetManager: Finished task 0.0 in stage 0.0 (TID 0) in 456 ms on localhost (executor driver) (5/8)
18/10/22 07:15:50 INFO TaskSetManager: Finished task 1.0 in stage 0.0 (TID 1) in 436 ms on localhost (executor driver) (6/8)
18/10/22 07:15:50 INFO TaskSetManager: Finished task 6.0 in stage 0.0 (TID 6) in 424 ms on localhost (executor driver) (7/8)
18/10/22 07:15:50 INFO **TableWriter: Wrote 1 rows to log_by_date in 0.342 s.**
Why does it fail to save the first several times, and how can I tune this for production?
Answer 0: (score: 1)
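This is not a failure, as user10465355 pointed out. When Spark breaks a job into tasks, the work may be distributed unevenly, or there may not be enough work to give every task something to do. That leaves some tasks with empty partitions, so when the Spark Cassandra Connector processes them, they each write 0 rows.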
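The effect can be reproduced without Spark at all. The following is a minimal pure-Python sketch, not Spark or connector code: it assumes the RDD held a single record and was split into 8 partitions by even index ranges (an assumption about how the data was created, since the question does not show it). The names `slice_into_partitions` and `simulate_write` are hypothetical helpers made up for this illustration.

```python
def slice_into_partitions(records, num_partitions):
    """Split records into contiguous slices by even index ranges.

    This only imitates range-based slicing of a local collection;
    it is an illustration, not Spark's actual implementation.
    """
    n = len(records)
    return [
        records[(i * n) // num_partitions:((i + 1) * n) // num_partitions]
        for i in range(num_partitions)
    ]


def simulate_write(records, num_partitions):
    """Return how many rows each simulated task would write."""
    return [len(part) for part in slice_into_partitions(records, num_partitions)]


if __name__ == "__main__":
    # One record, eight tasks: most tasks have nothing to write.
    for task_id, count in enumerate(simulate_write(["one log record"], 8)):
        print(f"task {task_id}: wrote {count} rows")
```

With one record and eight slices, seven of the simulated tasks end up empty and only one has a row to write, which matches the shape of the log in the question: seven "Wrote 0 rows" lines and one "Wrote 1 rows" line.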
For example, say: