Nutch 2.0 with Cassandra

Date: 2012-09-17 10:44:49

Tags: eclipse cassandra nutch gora

Exception in thread "main" org.apache.gora.util.GoraException: java.io.IOException
    at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:167)
    at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:135)
    at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:75)
    at org.apache.nutch.crawl.InjectorJob.run(InjectorJob.java:214)
    at org.apache.nutch.crawl.Crawler.runTool(Crawler.java:68)
    at org.apache.nutch.crawl.Crawler.run(Crawler.java:136)
    at org.apache.nutch.crawl.Crawler.run(Crawler.java:250)
    at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
    at org.apache.nutch.crawl.Crawler.main(Crawler.java:257)
Caused by: java.io.IOException
    at org.apache.gora.cassandra.store.CassandraStore.initialize(CassandraStore.java:88)
    at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102)
    at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161)
    ... 8 more
Caused by: java.lang.NullPointerException
    at org.apache.gora.cassandra.store.CassandraMapping.<init>(CassandraMapping.java:117)
    at org.apache.gora.cassandra.store.CassandraMappingManager.get(CassandraMappingManager.java:84)
    at org.apache.gora.cassandra.store.CassandraClient.initialize(CassandraClient.java:84)
    at org.apache.gora.cassandra.store.CassandraStore.initialize(CassandraStore.java:85)
    ... 10 more

I am just trying to run Nutch 2.0 on Cassandra. The trace above is the output of the crawl; the output of TestGoraStorage is as follows:

Starting!
Exception in thread "main" org.apache.gora.util.GoraException: java.io.IOException
    at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:167)
    at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:135)
    at org.apache.nutch.storage.StorageUtils.createWebStore(StorageUtils.java:75)
    at org.apache.nutch.storage.TestGoraStorage.main(TestGoraStorage.java:204)
Caused by: java.io.IOException
    at org.apache.gora.cassandra.store.CassandraStore.initialize(CassandraStore.java:88)
    at org.apache.gora.store.DataStoreFactory.initializeDataStore(DataStoreFactory.java:102)
    at org.apache.gora.store.DataStoreFactory.createDataStore(DataStoreFactory.java:161)
    ... 3 more
Caused by: java.lang.NullPointerException
    at org.apache.gora.cassandra.store.CassandraMapping.<init>(CassandraMapping.java:117)
    at org.apache.gora.cassandra.store.CassandraMappingManager.get(CassandraMappingManager.java:84)
    at org.apache.gora.cassandra.store.CassandraClient.initialize(CassandraClient.java:84)
    at org.apache.gora.cassandra.store.CassandraStore.initialize(CassandraStore.java:85)
    ... 5 more

I can connect to Cassandra with cassandra-cli, and I checked Nutch out from svn. Here is the effective configuration in gora.properties:

    gora.datastore.default=org.apache.gora.cassandra.store.CassandraStore
    gora.sqlstore.jdbc.driver=org.hsqldb.jdbc.JDBCDriver
    gora.sqlstore.jdbc.url=jdbc:hsqldb:hsql://210.44.138.8/nutchtest
    gora.sqlstore.jdbc.user=sa
    gora.sqlstore.jdbc.password=
    gora.cassandrastore.servers=210.44.138.8:9160

And the configuration in gora-cassandra-mapping.xml:

<keyspace name="webpage" cluster="My Cluster" host="210.44.138.8">
    <family name="p"/>
    <family name="f"/>
    <family name="sc" type="super"/>
</keyspace>

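210.44.138.8 is a node of my cluster, and the cluster's name is "My Cluster". More information: the firewall is turned off, and I am running everything from Eclipse. I would be grateful for any help.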

1 Answer:

Answer 0 (score: 0)

I'm not sure whether I had exactly the same problem, but I found that in the gora-cassandra-mapping.xml file I had to add a keyspace attribute (keyspace="ks1") to the class element:

<keyspace name="ks1" cluster="My Cluster" host="1.2.3.4">
    ...
</keyspace>
<class keyspace="ks1" keyClass="java.lang.String" name="org.apache.nutch.storage.WebPage">
    ...
</class>
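
If it helps, here is a rough sketch of how one might check a mapping file for this condition before launching Nutch. It is not part of Gora or Nutch. The `check_mapping` helper and the `SAMPLE_MAPPING` text are hypothetical, and the `<gora-orm>` root element and overall layout are assumptions based on the snippets in this thread. It only verifies that at least one class element exists and that each one names a declared keyspace; whether that is the actual cause of the NullPointerException above is a guess.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample mapping. The <gora-orm> root and the element layout
# are assumptions based on the snippets quoted in this thread.
SAMPLE_MAPPING = """\
<gora-orm>
  <keyspace name="ks1" cluster="My Cluster" host="1.2.3.4">
    <family name="p"/>
    <family name="f"/>
    <family name="sc" type="super"/>
  </keyspace>
  <class keyspace="ks1" keyClass="java.lang.String"
         name="org.apache.nutch.storage.WebPage"/>
</gora-orm>
"""


def check_mapping(xml_text):
    """Return a list of human-readable problems found in the mapping text."""
    root = ET.fromstring(xml_text)
    # Names of all keyspaces declared anywhere in the document.
    declared = {ks.get("name") for ks in root.iter("keyspace")}
    classes = list(root.iter("class"))
    problems = []
    if not classes:
        problems.append("no <class> element found")
    for cls in classes:
        keyspace = cls.get("keyspace")
        name = cls.get("name")
        if keyspace is None:
            problems.append(f"class {name} has no keyspace attribute")
        elif keyspace not in declared:
            problems.append(
                f"class {name} refers to undeclared keyspace {keyspace!r}"
            )
    return problems


if __name__ == "__main__":
    # Print each problem, or a short confirmation when none are found.
    for line in check_mapping(SAMPLE_MAPPING) or ["mapping looks consistent"]:
        print(line)
```

To check a real file, read its contents and pass the text to `check_mapping`. A mapping that contains only a keyspace element and no class element, like the one in the question, would be reported as having no class element.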