The usual way to create a JavaRDD from a list is JavaSparkContext.parallelize(List). However, in Spark 2.0 SparkSession is used as the entry point, and I don't know how to create a JavaRDD from a List.
Answer 0 (score: 5)
I ran into the same problem. Here is what I have done so far:
List<String> list = Arrays.asList("Any", "List", "with", "Strings");
Dataset<String> listDS = sparkSession.createDataset(list, Encoders.STRING());
JavaRDD<String> javaRDDString = listDS.toJavaRDD();
One reason I do it this way is that, for example, I wanted to use flatMap, which worked for me on JavaRDD<String> but not on Dataset<String>.
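To illustrate the flatMap point above, here is a minimal sketch of the Dataset-to-JavaRDD round trip followed by a flatMap call. It assumes an already-built SparkSession named sparkSession and is meant as an outline to sit inside a method, not a complete program:

```java
import java.util.Arrays;
import java.util.List;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.sql.Dataset;
import org.apache.spark.sql.Encoders;

// Assumes an existing SparkSession called sparkSession.
List<String> list = Arrays.asList("Any", "List", "with", "Strings");
Dataset<String> listDS = sparkSession.createDataset(list, Encoders.STRING());

// On a JavaRDD, flatMap takes a FlatMapFunction that returns an Iterator,
// so a lambda splitting each line into words works directly:
JavaRDD<String> words = listDS.toJavaRDD()
    .flatMap(line -> Arrays.asList(line.split(" ")).iterator());
```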
Hope this helps.
Answer 1 (score: 3)
Solution: Spark shell (Spark 2.0)
import org.apache.spark.api.java.JavaSparkContext
// Wrap the shell's implicit SparkContext (sc) in a JavaSparkContext
val jsc = new JavaSparkContext(sc)
val list: java.util.List[Int] = java.util.Arrays.asList(1, 2, 3, 4, 5)
jsc.parallelize(list)
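For a standalone Java application (rather than the shell, where sc already exists), the same idea can be expressed by wrapping the SparkContext held by the SparkSession. A hedged sketch, assuming a local session named spark:

```java
import java.util.Arrays;
import org.apache.spark.api.java.JavaRDD;
import org.apache.spark.api.java.JavaSparkContext;
import org.apache.spark.sql.SparkSession;

SparkSession spark = SparkSession.builder()
    .appName("list-to-rdd")   // hypothetical app name for illustration
    .master("local[*]")
    .getOrCreate();

// Wrap the session's underlying SparkContext in a JavaSparkContext,
// then parallelize the list as in Spark 1.x.
JavaSparkContext jsc = JavaSparkContext.fromSparkContext(spark.sparkContext());
JavaRDD<Integer> rdd = jsc.parallelize(Arrays.asList(1, 2, 3, 4, 5));
```

Since JavaSparkContext is only a thin wrapper, this does not create a second context; it reuses the one the session already owns.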