Launching a Hadoop cluster on Amazon EC2 with Whirr: Action handler not found

Time: 2012-03-12 12:58:02

Tags: hadoop amazon-ec2 amazon apache-whirr

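I want to launch a Hadoop cluster on Amazon EC2 using Whirr. But when I run the standard whirr launch command, it fails and then looks for a directory named after my Hadoop cluster, "mycluster", which does not exist. Can anyone help?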

kaustubh@hdv-Kaustubh:~/Downloads$ whirr launch-cluster --config whirrprop.properties
Unable to start the cluster. Terminating all nodes.
java.lang.IllegalArgumentException: java.lang.NullPointerException: Action handler not found
    at org.apache.whirr.actions.ScriptBasedClusterAction.safeGetActionHandler(ScriptBasedClusterAction.java:245)
    at org.apache.whirr.actions.ScriptBasedClusterAction.execute(ScriptBasedClusterAction.java:100)
    at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:106)
    at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
    at org.apache.whirr.cli.Main.run(Main.java:64)
    at org.apache.whirr.cli.Main.main(Main.java:97)
Caused by: java.lang.NullPointerException: Action handler not found
    at com.google.common.base.Preconditions.checkNotNull(Preconditions.java:204)
    at org.apache.whirr.HandlerMapFactory$ReturnHandlerByRoleOrPrefix.apply(HandlerMapFactory.java:66)
    at org.apache.whirr.HandlerMapFactory$ReturnHandlerByRoleOrPrefix.apply(HandlerMapFactory.java:45)
    at com.google.common.collect.ComputingConcurrentHashMap$ComputingValueReference.compute(ComputingConcurrentHashMap.java:355)
    at com.google.common.collect.ComputingConcurrentHashMap$ComputingSegment.compute(ComputingConcurrentHashMap.java:184)
    at com.google.common.collect.ComputingConcurrentHashMap$ComputingSegment.getOrCompute(ComputingConcurrentHashMap.java:153)
    at com.google.common.collect.ComputingConcurrentHashMap.getOrCompute(ComputingConcurrentHashMap.java:69)
    at com.google.common.collect.ComputingConcurrentHashMap$ComputingMapAdapter.get(ComputingConcurrentHashMap.java:393)
    at org.apache.whirr.actions.ScriptBasedClusterAction.safeGetActionHandler(ScriptBasedClusterAction.java:238)
    ... 5 more
Unable to load cluster state, assuming it has no running nodes.
java.io.FileNotFoundException: /home/kaustubh/.whirr/mycluster/instances (No such file or directory)
    at java.io.FileInputStream.open(Native Method)
    at java.io.FileInputStream.<init>(FileInputStream.java:120)
    at com.google.common.io.Files$1.getInput(Files.java:100)
    at com.google.common.io.Files$1.getInput(Files.java:97)
    at com.google.common.io.CharStreams$2.getInput(CharStreams.java:91)
    at com.google.common.io.CharStreams$2.getInput(CharStreams.java:88)
    at com.google.common.io.CharStreams.readLines(CharStreams.java:306)
    at com.google.common.io.Files.readLines(Files.java:580)
    at org.apache.whirr.state.FileClusterStateStore.load(FileClusterStateStore.java:54)
    at org.apache.whirr.state.ClusterStateStore.tryLoadOrEmpty(ClusterStateStore.java:58)
    at org.apache.whirr.ClusterController.destroyCluster(ClusterController.java:143)
    at org.apache.whirr.ClusterController.launchCluster(ClusterController.java:118)
    at org.apache.whirr.cli.command.LaunchClusterCommand.run(LaunchClusterCommand.java:63)
    at org.apache.whirr.cli.Main.run(Main.java:64)
    at org.apache.whirr.cli.Main.main(Main.java:97)
java.lang.NullPointerException: Action handler not found
Usage: whirr launch-cluster [OPTIONS]

Option                                  Description                            
------                                  -----------                            
--aws-ec2-spot-price             Spot instance price (aws-ec2 specific  
                                          option)                              
--blobstore-cache-container             The name of the container to be used   
                                          for caching local files. If not      
                                          specified Whirr will create a random 
                                          one and remove it at the end of the  
                                          session.                             
--blobstore-credential                  The blob store credential              
--blobstore-identity                    The blob store identity                
--blobstore-location-id                 The blob store location ID             
--blobstore-provider                    The blob store provider. E.g. aws-s3,  
                                          cloudfiles-us, cloudfiles-uk         
--client-cidrs                          A comma-separated list of CIDR blocks. 
                                          E.g. 208.128.0.0/11,108.128.0.0/11   
--cluster-name                          The name of the cluster to operate on. 
                                          E.g. hadoopcluster.                  
--cluster-user                          The name of the user that Whirr will   
                                          create on all the cluster instances. 
                                          You have to use this user to login   
                                          to nodes.                            
--config             Note that Whirr properties specified   
                                          in this file  should all have a      
                                          whirr. prefix.                       
--credential                            The cloud credential.                  
--firewall-rules                        A comma-separated list of port         
                                          numbers. E.g. 8080,8181              
--firewall-rules-role                   A comma-separated list of port         
                                          numbers. E.g. 8080,8181. Replace     
                                          'role' with an actual role name      
--hardware-id                           The type of hardware to use for the    
                                          instance. This must be compatible    
                                          with the image ID.                   
--hardware-min-ram             The minimum amount of instance memory. 
                                          E.g. 1024                            
--identity                              The cloud identity.                    
--image-id                              The ID of the image to use for         
                                          instances. If not specified then a   
                                          vanilla Linux image is chosen.       
--instance-templates                    The number of instances to launch for  
                                          each set of roles. E.g. 1 hadoop-    
                                          namenode+hadoop-jobtracker, 10       
                                          hadoop-datanode+hadoop-tasktracker   
--instance-templates-max-percent-       The percentage of successfully started 
  failures                                instances for each set of roles. E.  
                                          g. 100 hadoop-namenode+hadoop-       
                                          jobtracker,60 hadoop-datanode+hadoop-
                                          tasktracker means all instances with 
                                          the roles hadoop-namenode and hadoop-
                                          jobtracker has to be successfully    
                                          started, and 60% of instances has to 
                                          be succcessfully started each with   
                                          the roles hadoop-datanode and hadoop-
                                          tasktracker.                         
--instance-templates-minimum-number-of- The minimum numberof successfully      
  instances                               started instances for each set of    
                                          roles. E.g. 1 hadoop-namenode+hadoop-
                                          jobtracker,6 hadoop-datanode+hadoop- 
                                          tasktracker means 1 instance with    
                                          the roles hadoop-namenode and hadoop-
                                          jobtracker has to be successfully    
                                          started, and 6 instances has to be   
                                          successfully started each with the   
                                          roles hadoop-datanode and hadoop-    
                                          tasktracker.                         
--location-id                           The location to launch instances in.   
                                          If not specified then an arbitrary   
                                          location will be chosen.             
--login-user                            Override the default login user used   
                                          to bootstrap whirr. E.g. ubuntu or   
                                          myuser:mypass.                       
--max-startup-retries          The number of retries in case of       
                                          insufficient successfully started    
                                          instances. Default value is 1.       
--private-key-file                      The filename of the private RSA key    
                                          used to connect to instances.        
--provider                              The name of the cloud provider. E.g.   
                                          aws-ec2, cloudservers-uk             
--public-key-file                       The filename of the public key used to 
                                          connect to instances.                
--run-url-base                          The base URL for forming run urls      
                                          from. Change this to host your own   
                                          set of launch scripts.               
--service-name                          (optional) The name of the service to  
                                          use. E.g. hadoop.                    
--state-store                           What kind of store to use for state    
                                          (local, blob or none). Defaults to   
                                          local.                               
--state-store-blob                      Blob name for state storage. Valid     
                                          only for the blob state store.       
                                          Defaults to whirr-     
--state-store-container                 Container where to store state. Valid  
                                          only for the blob state store.       
--terminate-all-on-launch-failure       Whether or not to automatically        
                                          terminate all nodes when cluster     
                                          launch fails for some reason.        
--version   

1 Answer:

Answer 0 (score: 1)

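There is no role called hadoop-jobtracer (note the missing k); the correct name is hadoop-jobtracker. Whirr cannot find a handler for the misspelled role, which is why it reports "Action handler not found".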

This is what a Hadoop cluster needs:

whirr.instance-templates=1 hadoop-namenode+hadoop-jobtracker, 1 hadoop-datanode+hadoop-tasktracker
# m1.small or larger; t1.micro is not a good choice
whirr.hardware-id=m1.small

Also, make sure to look at recipes/hadoop-ec2.properties for more examples.
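
Since a single misspelled role name is enough to trigger this error, it can help to check the whirr.instance-templates value before launching. The following is a minimal sketch in Python, not part of Whirr itself: the function name find_unknown_roles is hypothetical, and KNOWN_ROLES contains only the four role names that appear in the help text quoted in the question, so it would need extending for other services. The misspelled value in the example is a reconstruction of the typo described above, not the asker's actual file.

```python
import difflib

# Role names taken from the `whirr launch-cluster` help text quoted in the
# question. Assumption: this set is incomplete; extend it for other services.
KNOWN_ROLES = {
    "hadoop-namenode",
    "hadoop-jobtracker",
    "hadoop-datanode",
    "hadoop-tasktracker",
}


def find_unknown_roles(instance_templates, known_roles=KNOWN_ROLES):
    """Return (role, suggestion) pairs for every role not in known_roles.

    instance_templates is the value of whirr.instance-templates, e.g.
    "1 hadoop-namenode+hadoop-jobtracker, 1 hadoop-datanode+hadoop-tasktracker".
    suggestion is the closest known role name, or None if nothing is close.
    """
    problems = []
    for template in instance_templates.split(","):
        parts = template.split()
        if len(parts) != 2:
            # Expected "<count> <role>+<role>"; report the whole fragment.
            problems.append((template.strip(), None))
            continue
        _count, roles = parts
        for role in roles.split("+"):
            if role not in known_roles:
                close = difflib.get_close_matches(role, sorted(known_roles), n=1)
                problems.append((role, close[0] if close else None))
    return problems


if __name__ == "__main__":
    # Hypothetical misspelled value of the kind the answer describes.
    value = "1 hadoop-namenode+hadoop-jobtracer, 1 hadoop-datanode+hadoop-tasktracker"
    for role, suggestion in find_unknown_roles(value):
        hint = f" (did you mean {suggestion}?)" if suggestion else ""
        print(f"unknown role: {role}{hint}")
```

Running the check on the properties value before calling whirr launch-cluster would flag the typo locally, without waiting for Whirr to fail and fall back to its usage message.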