Chef Cloudera cookbook fails with unmet dependencies

Time: 2014-04-03 03:45:36

Tags: ruby-on-rails ruby hadoop chef cloudera

I am trying to get Cloudera's Hadoop running via Chef. I am using this Cloudera recipe to make it work:

# PSEUDO INSTALL:
# https://ccp.cloudera.com/display/CDH4DOC/Installing+CDH4+on+a+Single+Linux+Node+in+Pseudo-distributed+Mode#InstallingCDH4onaSingleLinuxNodeinPseudo-distributedMode-InstallingCDH4withYARNonaSingleLinuxNodeinPseudodistributedmode

if node[:platform] == "ubuntu"
    execute "apt-get update"
end

# Install required base packages
package "curl" do
    action :install
end

package "wget" do
    action :install
end

# Install Cloudera Basic:
case node[:platform]
    when "ubuntu"
        case node[:lsb][:codename]
            when "precise"
                execute "curl -s http://archive.cloudera.com/cdh4/ubuntu/precise/amd64/cdh/archive.key | sudo apt-key add -"
                execute "wget http://archive.cloudera.com/cdh4/one-click-install/precise/amd64/cdh4-repository_1.0_all.deb"
            when "lucid"
                execute "curl -s http://archive.cloudera.com/cdh4/ubuntu/lucid/amd64/cdh/archive.key | sudo apt-key add -"
                execute "wget http://archive.cloudera.com/cdh4/one-click-install/lucid/amd64/cdh4-repository_1.0_all.deb"
            when "squeeze"
                execute "curl -s http://archive.cloudera.com/cdh4/ubuntu/squeeze/amd64/cdh/archive.key | sudo apt-key add -"
                execute "wget http://archive.cloudera.com/cdh4/one-click-install/squeeze/amd64/cdh4-repository_1.0_all.deb"
        end
        execute "dpkg -i cdh4-repository_1.0_all.deb"
        execute "apt-get update"
end




if node['cloudera']['installyarn'] == true
    package "hadoop-conf-pseudo" do
      action :install
    end
else
    package "hadoop-0.20-conf-pseudo" do
      action :install
    end
end


# copy over helper script to start hdfs
cookbook_file "/tmp/hadoop-hdfs-start.sh" do
    source "hadoop-hdfs-start.sh"
    mode "0744"
end
cookbook_file "/tmp/hadoop-hdfs-stop.sh" do
    source "hadoop-hdfs-stop.sh"
    mode "0744"
end
cookbook_file "/tmp/hadoop-0.20-mapreduce-start.sh" do
    source "hadoop-0.20-mapreduce-start.sh"
    mode "0744"
end
cookbook_file "/tmp/hadoop-0.20-mapreduce-stop.sh" do
    source "hadoop-0.20-mapreduce-stop.sh"
    mode "0744"
end
# helper to prepare folder structure for first time
cookbook_file "/tmp/prepare-yarn.sh" do
    source "prepare-yarn.sh"
    mode "0777"
end
cookbook_file "/tmp/prepare-0.20-mapreduce.sh" do
    source "prepare-0.20-mapreduce.sh"
    mode "0777"
end


# only for the first run we need to format as hdfs (we pass input "N" to answer the reformat question with No )
################
execute "format namenode" do
    command 'echo "N" | hdfs namenode -format'
    user "hdfs"
    returns [0,1]
end


# Jobtracker repeats - was the only way to get both together
%w{jobtracker tasktracker}.each { |name|
  service "hadoop-0.20-mapreduce-#{name}" do
    supports :status => true, :restart => true, :reload => true
    action [ :enable, :start ]
  end
} if !node['cloudera']['installyarn']

# now hadoop should run and this should work: http://localhost:50070
%w(datanode namenode secondarynamenode).each { |name|
  service "hadoop-hdfs-#{name}" do
    supports :status => true, :restart => true, :reload => true
    action [ :enable, :start ]
  end
}

# Prepare folders (only first run)
# TODO: only do this if "hadoop fs -ls /tmp" return "No such file or directory"
################

if node['cloudera']['installyarn'] == true
    execute "/tmp/prepare-yarn.sh" do
     user "hdfs"
     not_if 'hadoop fs -ls -R / | grep "/tmp/hadoop-yarn"'
    end
else
    execute "/tmp/prepare-0.20-mapreduce.sh" do
     user "hdfs"
     not_if 'hadoop fs -ls -R / | grep "/var/lib/hadoop-hdfs/cache/mapred"'
    end
end

So after I create my Vagrant VM, I try to install Hadoop on it:

 knife bootstrap localhost --ssh-user vagrant --ssh-password vagrant --ssh-port 2222 --run-list "recipe[cloudera]" --sudo

But the only result I get is:

localhost The following packages have unmet dependencies:
localhost  hadoop-0.20-conf-pseudo : Depends: hadoop-hdfs-namenode (= 2.0.0+1554-1.cdh4.6.0.p0.16~precise-cdh4.6.0) but it is not going to be installed
localhost                            Depends: hadoop-hdfs-datanode (= 2.0.0+1554-1.cdh4.6.0.p0.16~precise-cdh4.6.0) but it is not going to be installed
localhost                            Depends: hadoop-hdfs-secondarynamenode (= 2.0.0+1554-1.cdh4.6.0.p0.16~precise-cdh4.6.0) but it is not going to be installed
localhost                            Depends: hadoop-0.20-mapreduce-jobtracker (= 2.0.0+1554-1.cdh4.6.0.p0.16~precise-cdh4.6.0) but it is not going to be installed
localhost                            Depends: hadoop-0.20-mapreduce-tasktracker (= 2.0.0+1554-1.cdh4.6.0.p0.16~precise-cdh4.6.0) but it is not going to be installed
localhost STDERR: E: Unable to correct problems, you have held broken packages.
localhost ---- End output of apt-get -q -y install hadoop-0.20-conf-pseudo=2.0.0+1554-1.cdh4.6.0.p0.16~precise-cdh4.6.0 ----
localhost Ran apt-get -q -y install hadoop-0.20-conf-pseudo=2.0.0+1554-1.cdh4.6.0.p0.16~precise-cdh4.6.0 returned 100
localhost [2014-04-03T03:41:41+00:00] FATAL: Chef::Exceptions::ChildConvergeError: Chef run process exited unsuccessfully (exit code 1)

I have tried installing these dependencies by hand, but without success. Can anyone help me?

2 answers:

Answer 0 (score: 0):

We struggled down this path for a while and concluded that it makes more sense to use the VMs provided by Cloudera and Hortonworks. There are too many moving parts and not enough benefit compared to the vendor-supplied VMs.

There seems to be a lot of wreckage along this road of people rolling their own Chef Hadoop recipes/cookbooks. Exhibit: the Crowbar project may be excellent, but it appears to be a hopelessly complex ball of mud.

Just one point of view. If a proper recipe emerges, we would use it. Alternatively, Cloudera and/or Hortonworks could publish cookbooks instead of VMs....

Answer 1 (score: 0):

Your apt cache is out of date. Either run apt-get update manually during the bootstrap, or add the apt cookbook to the front of your run list.
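A minimal sketch of that second suggestion, assuming the community `apt` cookbook is available to your Chef server (that availability, and the unchanged Vagrant SSH settings, are assumptions; adjust to your setup):

```shell
# Fetch and upload the community apt cookbook (it runs apt-get update
# early in the Chef run, before any package resources execute):
knife cookbook site download apt
knife cookbook upload apt

# Bootstrap with apt ahead of cloudera in the run list, so the
# package index is refreshed before cdh4 packages are resolved:
knife bootstrap localhost --ssh-user vagrant --ssh-password vagrant \
  --ssh-port 2222 --run-list "recipe[apt],recipe[cloudera]" --sudo
```

Run-list order matters here: Chef converges recipes in the order given, so `recipe[apt]` must come before `recipe[cloudera]`.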