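In Java, I have a two-dimensional array representing the pixels of an image. Because of the array's size (9744×9744), I cannot hold the whole array in a single RDD. I decided to process half a row of the image at a time and then write it to a file with saveAsTextFile(). When I do this, I get Exception in thread "main" org.apache.hadoop.mapred.FileAlreadyExistsException after the first row has been processed.
Is there a way to append incrementally to the file produced from the first half of the first row's RDD? Below is an example of what I am trying to do.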
int pixs[][] = new int[4872][2];
int count = 0;
int rowI = 0;
int colJ = 0;
int colJCurrent = 0;
JavaRDD<int[]> firstHalf = null;
sparkPixels = new double[ 1 ][ 9744 ][3];
for( ; rowI < 1; rowI++ )
{
    for(int colCount=0;colCount < 2;colCount++)
    {
        for( colJ=colJCurrent; colJ < (colJCurrent+(rawWidth)); colJ++ )
        {
            pixs[count][0]=rowI;
            pixs[count][1]=colJ;
            count++;
        }
        colJCurrent=colJ;
        count=0;
        firstHalf = ctx.parallelize(Arrays.asList(pixs));
        JavaRDD<SimpleMatrix> firstResults = firstHalf.map(new Function<int[], SimpleMatrix>() {
            private static final long serialVersionUID = 1L;
            public SimpleMatrix call(int pix[])
            {
                return PixelInfoFunction(pix[0], pix[1]);
            }
        });
        JavaRDD<String> stringOutput = firstResults.map(new Function<SimpleMatrix, String>() {
            private static final long serialVersionUID = 1L;
            public String call(SimpleMatrix i)
            {
                return PixelInfoInStringFormatt;
            }
        });
        stringOutput.saveAsTextFile("/home/bielasjj/Projection_Output/test");
    }
    colJ=0;
    count = 0;
    colJCurrent=0;
}
ctx.stop();
ctx.close();
I have modified the code to process one row at a time and store the results in an array for output later, but at runtime I now get Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded
for(int rowI = 0 ; rowI < 9744; rowI++ )
{
    for(int colJ = 0 ; colJ < 9744; colJ++ )
    {
        pixs[colJ][0]=rowI;
        pixs[colJ][1]=colJ;
    }
    JavaRDD<int[]> firstHalf = ctx.parallelize(Arrays.asList(pixs));
    JavaRDD<SimpleMatrix> rowResults = firstHalf.map(new Function<int[], SimpleMatrix>() {
        private static final long serialVersionUID = 1L;
        public SimpleMatrix call(int pix[])
        {
            return PixelInfoFunction(pix[0], pix[1]);
        }
    });
    rowResults.foreach(new VoidFunction<SimpleMatrix>(){
        private static final long serialVersionUID = 1L;
        public void call(SimpleMatrix i)
        {
            sparkPixels[(int)i.get(3,0)][(int)i.get(4,0)][0] = i.get(0, 0);
            sparkPixels[(int)i.get(3,0)][(int)i.get(4,0)][1] = i.get(1, 0);
            sparkPixels[(int)i.get(3,0)][(int)i.get(4,0)][2] = i.get(2, 0);
        }
    });
}
PrintWriter csv = new PrintWriter("/home/CSV.csv");
for(int i=0; i < (2 * rawWidth)-1; i++)
{
    for(int j=0; j < (3 * rawHeight)-1; j++)
    {
        String line = i+ "," + j + ", " + (int)sparkPixels[i][j][0] +", "+(int)sparkPixels[i][j][1]+", "+(int)sparkPixels[i][j][2];
        csv.println(line);
    }
}
csv.close();
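For a rough sense of scale (an estimate, not a diagnosis of the error): if sparkPixels is sized to hold the whole image, i.e. double[9744][9744][3], the raw values alone are already very large. That allocation is an assumption, since the snippet above does not show it, and the class and method names below are purely illustrative.

```java
final class PixelMemoryEstimate {
    // Raw payload in bytes of a double[rows][cols][channels] array.
    // Object headers of the nested arrays are NOT included, so the real footprint is larger.
    static long payloadBytes(long rows, long cols, long channels) {
        return rows * cols * channels * Double.BYTES;
    }

    public static void main(String[] args) {
        // Assumed dimensions: the full 9744 x 9744 image with 3 values per pixel.
        long bytes = payloadBytes(9744, 9744, 3);
        System.out.println(bytes + " bytes, about "
                + bytes / (1024L * 1024L) + " MiB of raw double values");
    }
}
```

This works out to 2,278,692,864 bytes, roughly 2.1 GiB, before counting the per-array overhead of the nested Java arrays or anything Spark itself keeps on the heap.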
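Answer 0 (score: 0)
saveAsTextFile() writes a directory of part files and refuses to write to a path that already exists, so every call needs its own path. Add the row number to the output file name:
stringOutput.saveAsTextFile("/home/bielasjj/Projection_Output/test" + rowI);
Note that the loop in the question saves twice per row (once for each value of colCount), so the half index has to be part of the path as well, for example "test" + rowI + "_" + colCount.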
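One way to give every saveAsTextFile() call its own path and still end up with a single file is sketched below. It is only a sketch under assumptions: the class ChunkOutput, the zero-padded naming scheme, and the merge step are all hypothetical and not part of Spark; the only Spark-specific fact relied on is that each saved directory contains files named part-NNNNN. The code itself uses plain java.nio, so it runs without Spark.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ChunkOutput {
    // Hypothetical naming scheme: one directory per (row, half) pair.
    // The row is zero-padded so that sorting by name matches processing order.
    static String chunkPath(String base, int rowI, int colCount) {
        return String.format("%s_%05d_%d", base, rowI, colCount);
    }

    // Returns the entries of dir whose file name matches the glob, sorted by path.
    static List<Path> sortedChildren(Path dir, String glob) throws IOException {
        List<Path> result = new ArrayList<>();
        try (DirectoryStream<Path> stream = Files.newDirectoryStream(dir, glob)) {
            for (Path p : stream) {
                result.add(p);
            }
        }
        Collections.sort(result);
        return result;
    }

    // Concatenates the part-* files of every "<prefix>_*" directory under parent into target.
    // Marker files such as _SUCCESS do not match "part-*" and are skipped.
    static void mergeParts(Path parent, String prefix, Path target) throws IOException {
        try (OutputStream out = Files.newOutputStream(target)) {
            for (Path chunk : sortedChildren(parent, prefix + "_*")) {
                if (!Files.isDirectory(chunk)) {
                    continue;
                }
                for (Path part : sortedChildren(chunk, "part-*")) {
                    Files.copy(part, out);
                }
            }
        }
    }
}
```

Inside the loop from the question, the save call would then read stringOutput.saveAsTextFile(ChunkOutput.chunkPath("/home/bielasjj/Projection_Output/test", rowI, colCount)); and mergeParts would be called once after the loop has finished. This assumes the output is on a local file system, as the /home/... path suggests; output on HDFS would need the Hadoop file system API instead.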