Elasticsearch CouchDB river stale / out of sync

Time: 2013-06-30 19:26:18

Tags: couchdb elasticsearch elasticsearch-plugin

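I seem to be getting different results from ES and CouchDB: ES has only 2 older documents that CouchDB no longer has, while CouchDB has many newer documents that ES does not see at all. What could cause this, and how can I find out the state of the CouchDB river?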

Here are my requests:

#ES has Document-1...
$ curl http://localhost:9200/portal_production/portal_production/_search?pretty=true\&q=_id:Document-1

{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 1,
    "max_score": 1.0,
    "hits": [
      {
        "_index": "portal_production",
        "_type": "portal_production",
        "_id": "Document-1",
        "_score": 1.0,
        "_source": {
          "_rev": "2-2a986416ddb8a95446b0e143739094d2",
          "text": "    FILE TYPE                      : INTERROGATION\n    FILE TITLE                     : TMJ06001.A91\n    FILE CREATED                   : 01 JANUARY 2006 AT 00:00\n\n! This file contains all detections for 2006 from the juvenile bypass outfall.\n! The tags were detected using an FS-2001F portable transceiver and flat-plate\n! antenna.  These data were compiled from the original files by Dave Marvin,\n! PTAGIS.  The original data files are listed in the data stream below, \n! followed by their contents.\n\n! TMJ06032.A1\n| 01 02/16/06 18:34:51 3D9.1BF11B4053 XX 91\n| 01 02/16/06 19:08:15 3D9.1BF1E7919A XX 91\n| 01 02/16/06 19:18:36 3D9.1BF1A998FA XX 91\n| 01 02/17/06 18:21:03 3D9.1BF20E8FE2 XX 91\n| 01 02/20/06 18:27:01 3D9.1BF11BFFF5 XX 91\n| 01 02/22/06 01:56:38 3D9.1BF23F62D4 XX 91\n| 01 02/22/06 03:56:10 3D9.1BF234346C XX 91\n| 01 02/22/06 17:59:11 3D9.1BF2342E83 XX 91\n| 01 02/22/06 19:03:37 3D9.1BF23435A4 XX 91",
          "_id": "Document-1"
        }
      }
    ]
  }
}

#~but CouchDB has no Document-1
$ curl http://localhost:5984/portal_production/Document-1
{
  "error": "not_found",
  "reason": "missing"
}

#CouchDB has Document-1000...
$ curl http://localhost:5984/portal_production/Document-1000
{
  "_id": "Document-1000",
  "_rev": "8-d7f049228abc6311a920f9f7786ab9a4",
  "text": null,
  "metadata": [],
  "data": [
    {
      "1": "07/22/08 18:09:22",
      "2": "3E7.0000001DFF"
    },
    {
      "1": "07/22/08 18:09:22",
      "2": "3E7.0000001DFF"
    },
    {
      "1": "07/22/08 18:09:22",
      "2": "3E7.0000001DFF"
    },
    {
      "1": "07/22/08 19:09:22",
      "2": "3E7.0000001DFF"
    },
    {
      "1": "07/22/08 19:09:22",
      "2": "3E7.0000001DFF"
    },
    {
      "1": "07/22/08 19:09:22",
      "2": "3E7.0000001DFF"
    },
    {
      "1": "07/22/08 19:49:45",
      "2": "3D9.1C2C42D260"
    },
    {
      "1": "07/22/08 20:09:22",
      "2": "3E7.0000001DFF"
    },
    {
      "1": "07/22/08 20:09:22",
      "2": "3E7.0000001DFF"
    },
    {
      "1": "07/22/08 20:14:38",
      "2": "3D9.1C2C54F95E"
    },
    {
      "1": "07/22/08 20:22:24",
      "2": "3D9.1BF1FDA622"
    },
    {
      "1": "07/22/08 20:49:28",
      "2": "3D9.1C2C42D260"
    },
    {
      "1": "07/22/08 21:09:22",
      "2": "3E7.0000001DFF"
    },
    {
      "1": "07/22/08 21:09:22",
      "2": "3E7.0000001DFF"
    },
    {
      "1": "07/22/08 21:09:22",
      "2": "3E7.0000001DFF"
    },
    {
      "1": "07/22/08 22:09:22",
      "2": "3E7.0000001DFF"
    },
    {
      "1": "07/22/08 22:09:22",
      "2": "3E7.0000001DFF"
    },
    {
      "1": "07/22/08 22:09:22",
      "2": "3E7.0000001DFF"
    },
    {
      "1": "07/22/08 22:49:27",
      "2": "3D9.1C2C42D260"
    },
    {
      "1": "07/22/08 23:09:22",
      "2": "3E7.0000001DFF"
    },
    {
      "1": "07/22/08 23:09:22",
      "2": "3E7.0000001DFF"
    }
  ],
  "foreign_keys": [],
  "primary_keys": [
    "1",
    "2"
  ]
}

#~but ES has no Document-1000
$ curl http://localhost:9200/portal_production/portal_production/_search?pretty=true\&q=_id:Document-1000
{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 0,
    "max_score": null,
    "hits": []
  }
}

#Everything ES has:
$ curl http://localhost:9200/portal_production/portal_production/_search?pretty=true
{
  "took": 3,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 1.0,
    "hits": [
      {
        "_index": "portal_production",
        "_type": "portal_production",
        "_id": "Document-1",
        "_score": 1.0,
        "_source": {
          "_rev": "2-2a986416ddb8a95446b0e143739094d2",
          "text": "    FILE TYPE                      : INTERROGATION\n    FILE TITLE                     : TMJ06001.A91\n    FILE CREATED                   : 01 JANUARY 2006 AT 00:00\n\n! This file contains all detections for 2006 from the juvenile bypass outfall.\n! The tags were detected using an FS-2001F portable transceiver and flat-plate\n! antenna.  These data were compiled from the original files by Dave Marvin,\n! PTAGIS.  The original data files are listed in the data stream below, \n! followed by their contents.\n\n! TMJ06032.A1\n| 01 02/16/06 18:34:51 3D9.1BF11B4053 XX 91\n| 01 02/16/06 19:08:15 3D9.1BF1E7919A XX 91\n| 01 02/16/06 19:18:36 3D9.1BF1A998FA XX 91\n| 01 02/17/06 18:21:03 3D9.1BF20E8FE2 XX 91\n| 01 02/20/06 18:27:01 3D9.1BF11BFFF5 XX 91\n| 01 02/22/06 01:56:38 3D9.1BF23F62D4 XX 91\n| 01 02/22/06 03:56:10 3D9.1BF234346C XX 91\n| 01 02/22/06 17:59:11 3D9.1BF2342E83 XX 91\n| 01 02/22/06 19:03:37 3D9.1BF23435A4 XX 91",
          "_id": "Document-1"
        }
      },
      {
        "_index": "portal_production",
        "_type": "portal_production",
        "_id": "Ifilter-1",
        "_score": 1.0,
        "_source": {
          "headers": [
            {
              "val": "[ ]*(FILE[ ]+TYPE)[ ]*:[ ]*([A-Z]+)",
              "id": "0"
            },
            {
              "val": "[ ]*(FILE[ ]+TITLE)[ ]*:[ ]*([A-Z0-9.]+)",
              "id": "1"
            },
            {
              "val": "[ ]*(FILE[ ]+CREATED)[ ]*:[ ]*([A-Z0-9: ]+)",
              "id": "2"
            }
          ],
          "_rev": "4-d9c8e771bc345d1182fbe7c2d63f5d00",
          "_id": "Ifilter-1",
          "filter_headers": {
            "2": "[ ]*(FILE[ ]+CREATED)[ ]*:[ ]*([A-Z0-9: ]+)",
            "1": "[ ]*(FILE[ ]+TITLE)[ ]*:[ ]*([A-Z0-9.]+)",
            "0": "[ ]*(FILE[ ]+TYPE)[ ]*:[ ]*([A-Z]+)"
          }
        }
      }
    ]
  }
}

Found in the logs

Sorry, I got sidetracked by a bigger monster. Anyway, I found a problem:

[2013-08-19 17:55:08,379][WARN ][river.couchdb            ] [Morning Star] [couchdb][portal_production] failed to read from _changes, throttling....
java.io.IOException: Bogus chunk size
at sun.net.www.http.ChunkedInputStream.processRaw(ChunkedInputStream.java:319)
at sun.net.www.http.ChunkedInputStream.readAheadBlocking(ChunkedInputStream.java:572)
at sun.net.www.http.ChunkedInputStream.readAhead(ChunkedInputStream.java:609)
at sun.net.www.http.ChunkedInputStream.read(ChunkedInputStream.java:696)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at sun.net.www.protocol.http.HttpURLConnection$HttpInputStream.read(HttpURLConnection.java:3052)
at sun.nio.cs.StreamDecoder.readBytes(StreamDecoder.java:283)
at sun.nio.cs.StreamDecoder.implRead(StreamDecoder.java:325)
at sun.nio.cs.StreamDecoder.read(StreamDecoder.java:177)
at java.io.InputStreamReader.read(InputStreamReader.java:184)
at java.io.BufferedReader.fill(BufferedReader.java:154)
at java.io.BufferedReader.readLine(BufferedReader.java:317)
at java.io.BufferedReader.readLine(BufferedReader.java:382)
at org.elasticsearch.river.couchdb.CouchdbRiver$Slurper.run(CouchdbRiver.java:477)
at java.lang.Thread.run(Thread.java:724)
[2013-08-19 17:55:13,392][WARN ][river.couchdb            ] [Morning Star] [couchdb][portal_production] failed to read from _changes, throttling....

1 answer:

Answer 0 (score: 0)

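That is strange, because the CouchDB river does support deletions. You should check the logs and also the _changes API in CouchDB. Can you see the delete operations there?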
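To make that check concrete, here is a minimal Python sketch that looks for deletions in a _changes response and estimates how far the river is behind. It rests on assumptions that are not confirmed in the question: that the river is named portal_production (taken from the `[couchdb][portal_production]` log prefix), that the river keeps its checkpoint in a `_seq` document under the `_river` index with the value at `couchdb.last_seq`, and that sequence numbers are plain integers as in CouchDB 1.x. The helper names are made up for illustration.

```python
import json
from urllib.request import urlopen


def fetch_json(url):
    """GET a URL and parse the response body as JSON."""
    with urlopen(url) as resp:
        return json.loads(resp.read().decode("utf-8"))


def deleted_ids(changes):
    """Return the document IDs that a parsed _changes response marks as deleted."""
    return [row["id"] for row in changes.get("results", []) if row.get("deleted")]


def river_lag(changes, river_seq_doc):
    """Return how many sequence numbers the river is behind CouchDB.

    Assumption: the river checkpoint sits at _source.couchdb.last_seq and
    sequence numbers are integers. A missing checkpoint counts as 0.
    """
    source = river_seq_doc.get("_source", {})
    river_seq = int(source.get("couchdb", {}).get("last_seq", 0))
    return int(changes["last_seq"]) - river_seq
```

Against the setup in the question this could be fed with `fetch_json("http://localhost:5984/portal_production/_changes")` and `fetch_json("http://localhost:9200/_river/portal_production/_seq")`. If Document-1 shows up in `deleted_ids(...)` and the lag is large, the river has probably stopped consuming _changes, which would match the repeated "failed to read from _changes" warnings. Note that `urlopen` raises `HTTPError` if the `_seq` document does not exist.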