Error when trying to load a file with sc.textFile on Windows

Date: 2016-06-08 07:56:22

Tags: scala hadoop apache-spark

I am new to Hadoop, and I am trying to load a local file with the sc.textFile command:

val data = sc.textFile("file:///D:\\test.txt")

After that, I try some operations on the data, and I get this error:

java.lang.IllegalArgumentException: java.net.URISyntaxException: Relative path in absolute URI: D:test.txt
        at org.apache.hadoop.fs.Path.initialize(Path.java:206)
        at org.apache.hadoop.fs.Path.<init>(Path.java:172)
        at org.apache.hadoop.fs.Path.<init>(Path.java:94)
        at org.apache.hadoop.fs.Globber.glob(Globber.java:248)
        at org.apache.hadoop.fs.FileSystem.globStatus(FileSystem.java:1698)
        at org.apache.hadoop.mapred.FileInputFormat.singleThreadedListStatus(FileInputFormat.java:229)
        at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:200)
        at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:270)
        at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:179)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
        at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:28)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
        at org.apache.spark.rdd.FlatMappedRDD.getPartitions(FlatMappedRDD.scala:30)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
        at org.apache.spark.rdd.MappedRDD.getPartitions(MappedRDD.scala:28)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:204)
        at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:202)
        at scala.Option.getOrElse(Option.scala:120)
        at org.apache.spark.rdd.RDD.partitions(RDD.scala:202)
        at org.apache.spark.Partitioner$.defaultPartitioner(Partitioner.scala:65)
        at org.apache.spark.rdd.PairRDDFunctions.reduceByKey(PairRDDFunctions.scala:279)
        at $iwC$$iwC$$iwC$$iwC.<init>(<console>:16)
        at $iwC$$iwC$$iwC.<init>(<console>:21)
        at $iwC$$iwC.<init>(<console>:23)
        at $iwC.<init>(<console>:25)
        at <init>(<console>:27)
        at .<init>(<console>:31)
        at .<clinit>(<console>)
        at .<init>(<console>:7)
        at .<clinit>(<console>)
        at $print(<console>)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.repl.SparkIMain$ReadEvalPrint.call(SparkIMain.scala:789)
        at org.apache.spark.repl.SparkIMain$Request.loadAndRun(SparkIMain.scala:1062)
        at org.apache.spark.repl.SparkIMain.loadAndRunReq$1(SparkIMain.scala:615)
        at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:646)
        at org.apache.spark.repl.SparkIMain.interpret(SparkIMain.scala:610)
        at org.apache.spark.repl.SparkILoop.reallyInterpret$1(SparkILoop.scala:814)
        at org.apache.spark.repl.SparkILoop.interpretStartingWith(SparkILoop.scala:859)
        at org.apache.spark.repl.SparkILoop.command(SparkILoop.scala:771)
        at org.apache.spark.repl.SparkILoop.processLine$1(SparkILoop.scala:616)
        at org.apache.spark.repl.SparkILoop.innerLoop$1(SparkILoop.scala:624)
        at org.apache.spark.repl.SparkILoop.loop(SparkILoop.scala:629)
        at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply$mcZ$sp(SparkILoop.scala:954)
        at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:902)
        at org.apache.spark.repl.SparkILoop$$anonfun$process$1.apply(SparkILoop.scala:902)
        at scala.tools.nsc.util.ScalaClassLoader$.savingContextLoader(ScalaClassLoader.scala:135)
        at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:902)
        at org.apache.spark.repl.SparkILoop.process(SparkILoop.scala:997)
        at org.apache.spark.repl.Main$.main(Main.scala:31)
        at org.apache.spark.repl.Main.main(Main.scala)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
        at java.lang.reflect.Method.invoke(Method.java:606)
        at org.apache.spark.deploy.SparkSubmit$.launch(SparkSubmit.scala:328)
        at org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:75)
        at org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
Caused by: java.net.URISyntaxException: Relative path in absolute URI: D:test.txt
        at java.net.URI.checkPath(URI.java:1804)
        at java.net.URI.<init>(URI.java:752)
        at org.apache.hadoop.fs.Path.initialize(Path.java:203)
        ... 69 more
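A side note on what the trace above means: the message suggests the drive letter `D:` is being parsed as a URI *scheme*, leaving a relative path behind it, which `java.net.URI` rejects. A minimal sketch of the same check (my reconstruction of what Hadoop's `Path.initialize` runs into, not taken from the original post):

```java
import java.net.URI;
import java.net.URISyntaxException;

public class UriCheck {
    public static void main(String[] args) {
        try {
            // Mimic "D:" being taken as a scheme with a relative path behind it.
            // java.net.URI rejects a relative path when a scheme is present,
            // which is the check behind the "Relative path in absolute URI" error.
            new URI("D", null, "test.txt", null);
            System.out.println("no exception");
        } catch (URISyntaxException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

Running this prints the same "Relative path in absolute URI" reason seen in the stack trace.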

I have tried a few variations of the sc.textFile command, such as:

val data = sc.textFile("file:///D:/test.txt")

and then I get the error:

java.lang.IllegalArgumentException: java.net.URISyntaxException: Expected scheme-specific part at index 2: D

I have been trying for a long time, but I cannot get this basic thing to work. Can someone please help? Thanks in advance!

1 Answer:

Answer 0 (score: 1):

Setting spark.sql.warehouse.dir solved this problem for me.

Here is the snippet:

System.setProperty("spark.sql.warehouse.dir", "file:///C:/spark-warehouse");
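Independent of the warehouse-dir setting, the original URI problem can often be sidestepped by letting java.nio.file build a well-formed file: URI for the platform and passing the resulting string to sc.textFile. This is an illustrative sketch of an alternative approach, not part of the original answer:

```java
import java.nio.file.Paths;

public class FileUriDemo {
    public static void main(String[] args) {
        // java.nio.file converts an OS-native path into a well-formed file: URI.
        // On Windows, "D:\\test.txt" becomes file:///D:/test.txt, a form
        // Hadoop's Path can parse without the drive-letter ambiguity.
        String uri = Paths.get("D:\\test.txt").toAbsolutePath().toUri().toString();
        System.out.println(uri);
    }
}
```

The printed string is what you would hand to sc.textFile instead of building the URI by hand.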