I am trying to read an existing file on S3 from within my application. Here is my snippet:
sc.hadoopConfiguration.set("fs.s3.awsAccessKeyId", "MYKEY")
sc.hadoopConfiguration.set("fs.s3.awsSecretAccessKey", "MYSECRET")
val a = sc.textFile("s3://myBucket/TNRealtime/output/2016/01/27/22/45/00/a.txt").map{line => line.split(",")}
val b = a.collect // **ERROR** producing statement

I get an exception. Strangely, when I try the same snippet directly in Spark, I get a different error:
org.apache.hadoop.mapred.InvalidInputException: Input path does not exist: s3://snapdeal-personalization-dev-us-west-2/TNRealtime/output/2016/01/27/22/45/00/a.txt
at org.apache.hadoop.mapred.FileInputFormat.listStatus(FileInputFormat.java:251)
at org.apache.hadoop.mapred.FileInputFormat.getSplits(FileInputFormat.java:270)
at org.apache.spark.rdd.HadoopRDD.getPartitions(HadoopRDD.scala:207)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
at org.apache.spark.rdd.MapPartitionsRDD.getPartitions(MapPartitionsRDD.scala:35)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:239)
at org.apache.spark.rdd.RDD$$anonfun$partitions$2.apply(RDD.scala:237)
at scala.Option.getOrElse(Option.scala:120)
at org.apache.spark.rdd.RDD.partitions(RDD.scala:237)
at org.apache.spark.SparkContext.runJob(SparkContext.scala:1921)
at org.apache.spark.rdd.RDD$$anonfun$collect$1.apply(RDD.scala:909)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:147)
at org.apache.spark.rdd.RDDOperationScope$.withScope(RDDOperationScope.scala:108)
at org.apache.spark.rdd.RDD.withScope(RDD.scala:310)
at org.apache.spark.rdd.RDD.collect(RDD.scala:908)
at com.snapdeal.pears.trending.TrendingDecay$.load(TrendingDecay.scala:68)
Can anyone help me understand this problem?
Answer 0 (score: 1)
I'm not sure what your scenario is, but when I run Spark locally and want to access files on S3, I specify the access key and secret key in the s3 path, like this:
sc.textFile("s3://MYKEY:MYSECRET@myBucket/TNRealtime/output/2016/01/27/22/45/00/a.txt")
Perhaps that will work for you as well.
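For completeness, a minimal sketch of that approach against the asker's path (MYKEY and MYSECRET are placeholders, and sc is assumed to be an already-created SparkContext; a secret key containing characters such as / typically needs to be URL-encoded before it can be embedded in the URI):

// Embed the AWS credentials directly in the S3 URI instead of
// setting them on sc.hadoopConfiguration.
val lines = sc.textFile("s3://MYKEY:MYSECRET@myBucket/TNRealtime/output/2016/01/27/22/45/00/a.txt")
val rows = lines.map(line => line.split(","))
rows.collect() // should no longer fail here if the path and credentials are valid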
Answer 1 (score: 1)
Try replacing s3 with s3n, which is the newer protocol.
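A rough sketch of that suggestion (untested here; MYKEY/MYSECRET are placeholders and sc is assumed to be the existing SparkContext). Note that the s3n filesystem reads its credentials from the fs.s3n.* configuration keys rather than fs.s3.*:

// Use the s3n:// scheme together with its own credential keys.
sc.hadoopConfiguration.set("fs.s3n.awsAccessKeyId", "MYKEY")
sc.hadoopConfiguration.set("fs.s3n.awsSecretAccessKey", "MYSECRET")
val a = sc.textFile("s3n://myBucket/TNRealtime/output/2016/01/27/22/45/00/a.txt").map(line => line.split(","))
val b = a.collect()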