I am running this Hadoop program. This is the reducer:
public static class joinsReduce extends Reducer&lt;Text, Text, Text, Text&gt; {

    @Override
    public void reduce(Text key, Iterable&lt;Text&gt; values, Context context)
            throws IOException, InterruptedException {
        Map&lt;String, String&gt; memOf = new HashMap&lt;String, String&gt;();
        Map&lt;String, String&gt; subOrg = new HashMap&lt;String, String&gt;();
        Map&lt;String, String&gt; email = new HashMap&lt;String, String&gt;();
        List&lt;String&gt; studList = new ArrayList&lt;String&gt;();

        // First pass: bucket every value for this key by its source relation.
        for (Text value : values) {
            String[] parts = value.toString().trim().split(",");
            String source = parts[0].trim();
            String sub = parts[1].trim();
            String obj = parts[2].trim();
            if (source.equals("type")) {
                studList.add(sub);
            } else if (source.equals("memberOf")) {
                memOf.put(sub, obj);
            } else if (source.equals("subOrganizationOf")) {
                subOrg.put(sub, obj);
            } else if (source.equals("emailAddress")) {
                email.put(sub, obj);
            }
        }

        // Second pass: join each student with its memberOf, email and
        // subOrganizationOf records; emit only complete joins.
        for (String x : studList) {
            String y = memOf.get(x);
            String z = email.get(x);
            String z1 = subOrg.get(y);
            if (y != null &amp;&amp; z != null &amp;&amp; z1 != null) {
                String result = x + ',' + y + ',' + z + ',' + z1;
                context.write(new Text(x), new Text(result));
            }
        }
    }
}
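For reference, the reducer's join logic can be extracted into a plain helper with no Hadoop dependency (a sketch; the class name and sample data are made up). It makes the memory behavior visible: the three `HashMap`s plus `studList` buffer every value that arrives for one reduce key before anything is emitted, so a single hot key with many records is what exhausts the heap.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class JoinSketch {

    // Mirrors the reducer: buffer all records for one key, then join
    // each "type" subject with its memberOf, emailAddress and
    // subOrganizationOf entries.
    public static List<String> join(Iterable<String> lines) {
        Map<String, String> memOf = new HashMap<String, String>();
        Map<String, String> subOrg = new HashMap<String, String>();
        Map<String, String> email = new HashMap<String, String>();
        List<String> studList = new ArrayList<String>();

        // Everything below this point is held on the heap at once --
        // memory use is proportional to the number of values per key.
        for (String line : lines) {
            String[] parts = line.trim().split(",");
            String source = parts[0].trim();
            String sub = parts[1].trim();
            String obj = parts[2].trim();
            if (source.equals("type")) {
                studList.add(sub);
            } else if (source.equals("memberOf")) {
                memOf.put(sub, obj);
            } else if (source.equals("subOrganizationOf")) {
                subOrg.put(sub, obj);
            } else if (source.equals("emailAddress")) {
                email.put(sub, obj);
            }
        }

        List<String> out = new ArrayList<String>();
        for (String x : studList) {
            String y = memOf.get(x);
            String z = email.get(x);
            String z1 = subOrg.get(y);
            if (y != null && z != null && z1 != null) {
                out.add(x + ',' + y + ',' + z + ',' + z1);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> lines = List.of(
                "type,stud1,Student",
                "memberOf,stud1,dept1",
                "emailAddress,stud1,s1@u.edu",
                "subOrganizationOf,dept1,univ1");
        System.out.println(join(lines)); // one complete join for stud1
    }
}
```

Because `HashMap` keeps both key and value `String`s alive until the method returns, the only code-side savings without restructuring the job are incidental (fewer temporary strings); the buffering itself is inherent to doing the whole join inside one `reduce()` call.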
It throws a "Java heap space" error when the reduce phase reaches 92%. I am using 9 reducers and do not want to increase that number.
Is there any way to reduce the Java memory usage in the code? Or can the Java heap size be increased beyond my main memory (4 GB)?
I am currently using three quarters of it with this setting:
<property>
<name>mapreduce.map.java.opts</name>
<value>-Xmx3072m</value>
<description>No description</description>
</property>
If it is possible to go beyond the available memory, and if I change -Xmx3072m to -Xmx4096m, what do I have to do to make it take effect?
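One detail worth noting (stated as context, not a full answer): the property shown above, `mapreduce.map.java.opts`, only sets the heap for map tasks. The reduce-side heap is controlled by the separate property `mapreduce.reduce.java.opts`, and its -Xmx value must fit inside the container size set by `mapreduce.reduce.memory.mb`. A sketch of the corresponding `mapred-site.xml` entries (the 3584 MB container value is an illustrative assumption):

```xml
<property>
  <name>mapreduce.reduce.java.opts</name>
  <value>-Xmx3072m</value>
</property>
<property>
  <name>mapreduce.reduce.memory.mb</name>
  <value>3584</value>
</property>
```

Setting -Xmx larger than physical RAM does not add usable heap; the JVM would rely on swap, which typically makes the job slower rather than fixing the OutOfMemoryError.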