Running Mahout's fpg algorithm on Hadoop in cluster mode

Date: 2015-01-27 07:16:12

Tags: hadoop mahout

I installed mahout-0.7 and hadoop-1.2.1 on Linux (CentOS). Hadoop is configured as a multi-node cluster. I created a user named hadoop and installed Mahout and Hadoop under /home/hadoop/opt/. In that user's .bashrc file I set MAHOUT_HOME, HADOOP_HOME, MAHOUT_LOCAL, and so on:

# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi

# User specific aliases and functions
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.71/jre
export HADOOP_HOME=/home/hadoop/opt/hadoop
export PATH=$PATH:$HADOOP_HOME/bin
export HADOOP_CONF_DIR=/opt/hadoop/conf
export MAHOUT_HOME=/home/hadoop/opt/mahout
export MAHOUT_CONF_DIR=$MAHOUT_HOME/conf
export PATH=$PATH:$MAHOUT_HOME/bin


I want to run Mahout on the Hadoop file system (HDFS). When I run the following command, I get an error.

Command: hadoop@master mahout$ bin/mahout fpg -i /home/hadoop/output.dat -o patterns -method mapreduce -k 50 -s 2

Error:

 MAHOUT_LOCAL is not set; adding HADOOP_CONF_DIR to classpath.
 hadoop binary is not in PATH,HADOOP_HOME/bin,HADOOP_PREFIX/bin, running locally
 Error occurred during initialization of VM
 Could not reserve enough space for object heap
 Error: Could not create the Java Virtual Machine.
 Error: A fatal exception has occurred. Program will exit.

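Please help me. I have tried, but I cannot resolve the error.

1 answer:

Answer 0 (score: 0)

There appear to be some conflicts between your configuration and the way you are running Mahout. At first glance, you can check the following. To verify how Mahout is set up to run, use this command: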

echo $MAHOUT_LOCAL

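This should not return an empty string when you intend to run Mahout locally. For a cluster run it should be empty, and HADOOP_CONF_DIR should be set to $HADOOP_HOME/conf (your .bashrc points it at /opt/hadoop/conf instead).

Here is a list of commonly used Hadoop environment variables: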

#HADOOP VARIABLES START
export JAVA_HOME=/path/to/jdk1.8.0/  #your jdk path
export HADOOP_HOME=/usr/local/hadoop #your hadoop path
export HADOOP_INSTALL=/usr/local/hadoop #your hadoop path
export PATH=$PATH:$HADOOP_INSTALL/bin
export PATH=$PATH:$HADOOP_INSTALL/sbin
export HADOOP_MAPRED_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_HOME=$HADOOP_INSTALL
export HADOOP_HDFS_HOME=$HADOOP_INSTALL
export YARN_HOME=$HADOOP_INSTALL
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_INSTALL/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_INSTALL/lib"
export HADOOP_CLASSPATH=/home/hduser/lib/* #third-party libraries to be loaded with Hadoop
#HADOOP VARIABLES END
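
The checks described above can be sketched as a small shell function. This is only a sketch: `check_mahout_env` is a hypothetical helper name, and the conditions are inferred from the launcher messages quoted in the question (it looks for the hadoop binary in PATH and HADOOP_HOME/bin), not copied from the Mahout script itself.

```shell
# Sketch: report the settings that decide whether bin/mahout runs locally or on Hadoop.
# check_mahout_env is a hypothetical helper, not part of Mahout or Hadoop.
check_mahout_env() {
    local problems=0

    # A non-empty MAHOUT_LOCAL asks Mahout to run locally instead of on the cluster.
    if [ -n "${MAHOUT_LOCAL:-}" ]; then
        echo "MAHOUT_LOCAL is set: Mahout will run locally"
    else
        echo "MAHOUT_LOCAL is not set: Mahout will try to use Hadoop"
    fi

    # The hadoop binary must be reachable through PATH or $HADOOP_HOME/bin.
    if ! command -v hadoop >/dev/null 2>&1 &&
       [ ! -x "${HADOOP_HOME:-/nonexistent}/bin/hadoop" ]; then
        echo "PROBLEM: hadoop binary not found in PATH or \$HADOOP_HOME/bin"
        problems=$((problems + 1))
    fi

    # HADOOP_CONF_DIR must point at an existing directory.
    if [ ! -d "${HADOOP_CONF_DIR:-/nonexistent}" ]; then
        echo "PROBLEM: HADOOP_CONF_DIR is not a directory: ${HADOOP_CONF_DIR:-<unset>}"
        problems=$((problems + 1))
    fi

    # The return status is the number of problems found (0 means all checks passed).
    return "$problems"
}
```

Run `check_mahout_env` in the same shell session you start Mahout from, so that it sees the variables exported by .bashrc.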

In addition, you are getting a heap error, so you should increase the heap size to a value that lets the JVM initialize.
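
One way to set the heap explicitly is sketched below, assuming the bin/mahout launcher reads the MAHOUT_HEAPSIZE variable (a value in MB), as its script header documents. The JVM prints this message when it cannot reserve the heap it was asked for, so the value has to fit into the memory actually available (and, on a 32-bit JVM, into the address space). `pick_heap_mb` is a hypothetical helper, and the one-half ratio and the 3072 MB cap are illustrative choices only.

```shell
# Sketch: choose a heap size (in MB) for Mahout that fits into physical memory.
# pick_heap_mb is a hypothetical helper; the ratio and the cap are illustrative.
pick_heap_mb() {
    local total_mb=$1                   # physical memory in MB
    local heap_mb=$((total_mb / 2))     # leave half for the OS and Hadoop daemons
    if [ "$heap_mb" -gt 3072 ]; then
        heap_mb=3072                    # cap the heap at 3 GB
    fi
    echo "$heap_mb"
}

# On Linux, MemTotal in /proc/meminfo is reported in kB.
if [ -r /proc/meminfo ]; then
    total_mb=$(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo)
    MAHOUT_HEAPSIZE=$(pick_heap_mb "$total_mb")
    export MAHOUT_HEAPSIZE
    echo "MAHOUT_HEAPSIZE=$MAHOUT_HEAPSIZE"
fi
```

After exporting the variable, start bin/mahout again from the same shell.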

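You can also help others resolve the error by adding more information about your cluster:

  1. How many machines are you using?
  2. What are the hardware specifications of these machines?
  3. Which Hadoop distribution and version are you running?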
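
The details asked for above could be collected with something like the following sketch, run on each node. `cluster_info` is a hypothetical helper name. The `hadoop dfsadmin -report` call uses Hadoop 1.x syntax and needs a running HDFS, and the text matched by `grep` is an assumption about that report's wording.

```shell
# Sketch: print basic hardware and Hadoop details for one node.
# cluster_info is a hypothetical helper, not part of Hadoop.
cluster_info() {
    echo "host: $(uname -n)"
    echo "arch: $(uname -m)"

    # CPU count, if nproc is available on this system.
    if command -v nproc >/dev/null 2>&1; then
        echo "cpus: $(nproc)"
    fi

    # Physical memory in MB (Linux only).
    if [ -r /proc/meminfo ]; then
        echo "memory: $(awk '/^MemTotal:/ {print int($2 / 1024)}' /proc/meminfo) MB"
    fi

    if command -v hadoop >/dev/null 2>&1; then
        # The first line of the version output names the Hadoop release.
        hadoop version | head -n 1
        # Assumed report wording; prints nothing if HDFS is down or the text differs.
        hadoop dfsadmin -report 2>/dev/null | grep -i 'datanodes available' || true
    else
        echo "hadoop: not found in PATH"
    fi
}

cluster_info
```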