I created a MongoDB collection containing documents. Here is a sample document:
{
"_id": ObjectId("53837eed557acd39628b4cdf"),
"userid": null,
"importdate": ISODate("2014-05-26T17:50:37.0Z"),
"documentnumber": "174953-2014",
"source": "ted",
"typeoftender": "public",
"categories": {
"0": ObjectId("527baa62557acd1669eb992d")
},
"data": {
"oj": "100",
"ol": "bg",
"cy": "bg",
"dt": ISODate("2014-06-30T22:00:00.0Z"),
"heading": "01302",
"ti": {
"bg": "Услуги по програмиране на системен софтуер и потребителски софтуерни средства",
"cs": "Programování systémového a uživatelského programového vybavení",
"da": "Programmeringsservice i forbindelse med systemer og brugerprogrammel",
"de": "Programmierung von System- und Anwendersoftware",
"el": "Υπηρεσίες προγραμματισμού λογισμικών συστήματος και χρήστη",
"en": "Programming services of systems and user software",
"es": "Servicios de programación de sistemas y software de usuario",
"et": "Süsteemide ja kasutajatarkvara programmeerimine",
"fi": "Varus- ja käyttäjäohjelmiston ohjelmointipalvelut",
"fr": "Services de programmation de systèmes et de logiciels utilitaires",
"ga": "Programming services of systems and user software",
"hr": "Usluge programiranja sustava i korisničke podrške",
"hu": "Rendszer- és felhasználói szoftverek programozási szolgáltatásai",
"it": "Servizi di programmazione di software di sistemi e di utente",
"lt": "Programavimo paslaugos, susijusios su sistemomis ir vartotojo programine įranga",
"lv": "Sistēmu un lietotāju programmatūras programmēšanas pakalpojumi",
"mt": "Servizzi ta' programmizzar tas-sistemi u tas-software ta' l-utenti",
"nl": "Programmering van systeem- en gebruikerssoftware",
"pl": "Usługi programowania oprogramowania systemowego i dla użytkownika",
"pt": "Serviços de programação de sistemas e de software para o utilizador",
"ro": "Servicii de programare de sisteme informatice şi software utilitare",
"sk": "Programovanie systémového a používateľského softvéru",
"sl": "Storitve programiranja sistemske in uporabniške programske opreme",
"sv": "Programmering av system- och användarprogram"
},
"tw": {
"bg": "София",
"cs": "Sofie",
"da": "Sofia",
"de": "Sofia",
"el": "Σόφια",
"en": "Sofia",
"es": "Sofía",
"et": "Sofia",
"fi": "Sofia",
"fr": "Sofia",
"ga": "Sóifia",
"hr": "Sofija",
"hu": "Szófia",
"it": "Sofia",
"lt": "Sofija",
"lv": "Sofija",
"mt": "Sofija",
"nl": "Sofia",
"pl": "Sofia",
"pt": "Sófia",
"ro": "Sofia",
"sk": "Sofia",
"sl": "Sofija",
"sv": "Sofia"
},
"rc": "BG411",
"cpv": {
"0": "72211000"
}
},
"document": {
"da": "<p>Direktiv 2004\/18\/EF<\/p><div class=\"grseq\"><p class=\"tigrseq\">Del I: Ordregivende myndighed<\/p><div class=\"mlioccur\"><span class=\"nomark\" style=\"col [...]",
"de": "<p>Richtlinie 2004\/18\/EG<\/p><div class=\"grseq\"><p class=\"tigrseq\">Abschnitt I: Öffentlicher Auftraggeber<\/p><div class=\"mlioccur\"><span class=\"nomark\" [...]",
"en": "<p>Directive 2004\/18\/EC<\/p><div class=\"grseq\"><p class=\"tigrseq\">Section I: Contracting authority<\/p><div class=\"mlioccur\"><span class=\"nomark\" style= [...]",
"es": "<p>Directiva 2004\/18\/CE<\/p><div class=\"grseq\"><p class=\"tigrseq\">Apartado I: Poder adjudicador<\/p><div class=\"mlioccur\"><span class=\"nomark\" style=\"co [...]",
"fi": "<p>Direktiivi 2004\/18\/EY<\/p><div class=\"grseq\"><p class=\"tigrseq\">I kohta: Hankintaviranomainen<\/p><div class=\"mlioccur\"><span class=\"nomark\" style=\"c [...]",
"fr": "<p>Directive 2004\/18\/CE<\/p><div class=\"grseq\"><p class=\"tigrseq\">Section I: Pouvoir adjudicateur<\/p><div class=\"mlioccur\"><span class=\"nomark\" style=\" [...]",
"el": "<p>Οδηγία 2004\/18\/ΕΚ<\/p><div class=\"grseq\"><p class=\"tigrseq\">Τμήμα I: Αναθέτουσα αρχή<\/p><div class=\"mlioccur\"><span class=\"nomark\" style=\"color:blac [...]",
"it": "<p>Direttiva 2004\/18\/CE<\/p><div class=\"grseq\"><p class=\"tigrseq\">Sezione I: Amministrazione aggiudicatrice<\/p><div class=\"mlioccur\"><span class=\"nomar [...]",
"nl": "<p>Richtlijn 2004\/18\/EG<\/p><div class=\"grseq\"><p class=\"tigrseq\">Afdeling I: Aanbestedende dienst<\/p><div class=\"mlioccur\"><span class=\"nomark\" style= [...]",
"pt": "<p>Directiva 2004\/18\/CE<\/p><div class=\"grseq\"><p class=\"tigrseq\">Secção I: Autoridade adjudicante<\/p><div class=\"mlioccur\"><span class=\"nomark\" style= [...]",
"sv": "<p>Direktiv 2004\/18\/EG<\/p><div class=\"grseq\"><p class=\"tigrseq\">Avsnitt I: Upphandlande myndighet<\/p><div class=\"mlioccur\"><span class=\"nomark\" style= [...]",
"cs": "<p>Směrnice 2004\/18\/ES<\/p><div class=\"grseq\"><p class=\"tigrseq\">Oddíl I: Veřejný zadavatel<\/p><div class=\"mlioccur\"><span class=\"nomark\" style=\"color: [...]",
"et": "<p>Direktiiv 2004\/18\/EÜ<\/p><div class=\"grseq\"><p class=\"tigrseq\">I osa: Hankija<\/p><div class=\"mlioccur\"><span class=\"nomark\" style=\"color:black\">I.1) [...]",
"hu": "<p>2004\/18\/EK irányelv<\/p><div class=\"grseq\"><p class=\"tigrseq\">I. szakasz: Ajánlatkérő<\/p><div class=\"mlioccur\"><span class=\"nomark\" style=\"color:bla [...]",
"lt": "<p>Direktyva 2004\/18\/EB<\/p><div class=\"grseq\"><p class=\"tigrseq\">I dalis: Perkančioji organizacija<\/p><div class=\"mlioccur\"><span class=\"nomark\" style [...]",
"lv": "<p>Direktīva 2004\/18\/EK<\/p><div class=\"grseq\"><p class=\"tigrseq\">I iedaļa: Līgumslēdzēja iestāde<\/p><div class=\"mlioccur\"><span class=\"nomark\" style=\" [...]",
"mt": "<p>Direttiva 2004\/18\/KE<\/p><div class=\"grseq\"><p class=\"tigrseq\">Taqsima I: Awtorità kontraenti<\/p><div class=\"mlioccur\"><span class=\"nomark\" style=\"c [...]",
"pl": "<p>Dyrektywa 2004\/18\/WE<\/p><div class=\"grseq\"><p class=\"tigrseq\">Sekcja I: Instytucja zamawiająca<\/p><div class=\"mlioccur\"><span class=\"nomark\" style= [...]",
"sk": "<p>Smernica 2004\/18\/ES<\/p><div class=\"grseq\"><p class=\"tigrseq\">Oddiel I: Verejný obstarávateľ<\/p><div class=\"mlioccur\"><span class=\"nomark\" style=\"co [...]",
"sl": "<p>Direktiva 2004\/18\/ES<\/p><div class=\"grseq\"><p class=\"tigrseq\">Oddelek I: Naročnik<\/p><div class=\"mlioccur\"><span class=\"nomark\" style=\"color:black\" [...]",
"ga": "<p>Treoir 2004\/18\/CE<\/p><div class=\"grseq\"><p class=\"tigrseq\">Alt I: Údarás conarthachta<\/p><div class=\"mlioccur\"><span class=\"nomark\" style=\"color:bl [...]",
"bg": "<p>Директива 2004\/18\/ЕО<\/p><div class=\"grseq\"><p class=\"tigrseq\">Раздел І: Възлагащ орган<\/p><div class=\"mlioccur\"><span class=\"nomark\" style=\"color:b [...]",
"ro": "<p>Directiva 2004\/18\/CE<\/p><div class=\"grseq\"><p class=\"tigrseq\">Secțiunea I: Autoritatea contractantă<\/p><div class=\"mlioccur\"><span class=\"nomark\" s [...]",
"hr": "<p>Direktiva 2004\/18\/EZ<\/p><div class=\"grseq\"><p class=\"tigrseq\">Odjeljak I.: Javni naručitelj<\/p><div class=\"mlioccur\"><span class=\"nomark\" style=\"co [...]"
}
}
Once Elasticsearch has finished indexing, it stores only:
{
_index: tendersidx
_type: page
_id: 53837eec557acd39628b4c2b
_score: 1
_source: {
document: {
da: <p>Direktiv 2004/18/EF</p><div class="grseq"><p class="tigrseq">Del I: Ordregivende myndighed</p><div class="mlioccur"><span class="nomark" style="color:black">I.1)</span><span class="timark" style="font-weight:bold;color:black;">Navn, adresser og kontaktpunkt(er)</span><div class="txtmark" style="color:black"><p><p class="addr">Turun kaupunki<br><br>Linnankatu 55 K, 2 krs. / PL 630<br>20101<br>TurkuFINLAND<br>+358 449075222<br>karolus.haarte@turku.fi</p></p></p><p><p class="ft"><b>Bud eller ansøgninger om deltagelse skal sendes til:</b></p><p class="addr">Turun kaupunki<br><br>https://tarjouspalvelu.fi/turku/?id=17775&tpk=93d33e8c-86aa-40c6-8c60-16d511c61a9a<br></p></p></div></div></span></div><div class="grseq"><p class="tigrseq">Del II: Kontraktens genstand</p><div class="mlioccur"><span class="nomark" style="color:black">II.1)</span><span class="timark" style="font-weight:bold;color:black;">Beskrivelse</span></div></span><div class="mlioccur"><span class="nomark" style="color:black">II.1.6)</span><span class="timark" style="font-weight:bold;color:black;">CPV-glossaret (common procurement vocabulary)</span><div class="txtmark" style="color:black"><p>85000000</p></div></div></span><div class="mlioccur"><span class="nomark" style="color:black"></span><span class="timark" style="font-weight:bold;color:black;">Beskrivelse</span><div class="txtmark" style="color:black"><p>Sundhedsvæsen og sociale foranstaltninger.</p></div></div></span></div><div class="grseq"><p class="tigrseq">Del IV: Procedure</p><div class="mlioccur"><span class="nomark" style="color:black">IV.3)</span><span class="timark" style="font-weight:bold;color:black;">Administrative oplysninger</span></div></span><div class="mlioccur"><span class="nomark" style="color:black">IV.3.3)</span><span class="timark" style="font-weight:bold;color:black;">Vilkår for adgang til specifikationer og yderligere dokumenter eller beskrivende dokumenter</span></div></span><div class="mlioccur"><span class="nomark" 
style="color:black">IV.3.4)</span><span class="timark" style="font-weight:bold;color:black;">Frist for modtagelse af bud eller ansøgninger om deltagelse</span><div class="txtmark" style="color:black"><p>11.8.2014 - 14:00</p></div></div></span><div class="mlioccur"><span class="nomark" style="color:black">IV.3.6)</span><span class="timark" style="font-weight:bold;color:black;">Sprog, der må benyttes ved afgivelse af bud eller ansøgninger om deltagelse</span><div class="txtmark" style="color:black"><p>finsk.</p></div></div></span></div>
}
source: ted
_id: 53837eec557acd39628b4c2b
documentnumber: 175084-2014
importdate: 2014-05-26T17:50:36.000Z
data: {
dt: 2014-08-10T22:00:00.000Z
cpv: [
85000000
]
cy: fi
td: 3
rc: FI183
ti: {
sl: Storitve na področju zdravstva in socialnega varstva
hr: Usluge u području zdravstva i socijalne skrbi
sk: Zdravotnícka a sociálna pomoc
ro: Servicii de sănătate şi servicii de asistenţă socială
da: Sundhedsvæsen og sociale foranstaltninger
it: Servizi sanitari e di assistenza sociale
mt: Servizzi dwar saħħa ta' xogħol soċjali
hu: Egészségügyi és szociális gondozási szolgáltatások
lv: Veselības un sociālie pakalpojumi
lt: Sveikatos priežiūros ir socialinio darbo paslaugos
ga: Health and social work services
cs: Zdravotní a sociální péče
de: Dienstleistungen des Gesundheits- und Sozialwesens
el: Υγειονομικές και κοινωνικές υπηρεσίες
fi: Terveyspalvelut ja sosiaalitoimen palvelut
pt: Serviços de saúde e acção social
pl: Usługi w zakresie zdrowia i opieki społecznej
sv: Hälso- och sjukvård samt socialvård
bg: Услуги на здравеопазването и социалните дейности
fr: Services de santé et services sociaux
en: Health and social work services
et: Tervishoiu ja sotsiaaltöö teenused
es: Servicios de salud y asistencia social
nl: Gezondheidszorg en maatschappelijk werk
}
ty: 1
nc: 4
tw: {
sl: Turku
hr: Turku
sk: Turku
ro: Turku
da: Turku
it: Turku
mt: Turku
hu: Turku
lv: Turku
lt: Turku
ga: Turku
cs: Turku
de: Turku
el: Turku
fi: Turku
pt: Turku
pl: Turku
sv: Åbo
bg: Турку
fr: Turku
en: Turku
et: Turu
es: Turku
nl: Turku
}
ol: fi
oj: 100
ds: 0.00000000 1400623200
pr: 1
heading: 01302
}
userid: null
categories: false
typeoftender: public
}
}
As you can see, Elasticsearch indexed only one part of "document", namely the "da" element.
The index was created with the following command:
curl -XPUT "localhost:9200/_river/tenders/_meta" -d '
{
"type": "mongodb",
"mongodb": {
"servers": [
{ "host": "127.0.0.1", "port": 27017 }
],
"options": { "secondary_read_preference": true },
"db": "tenderdb",
"collection": "tenders"
},
"index": {
"name": "tendersidx",
"type": "page"
}
}'
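The process that populates the database is:
1) Download the data from the server
2) Extract the downloaded data
3) Insert the data into the MongoDB collection
4) Download the metadata from the server (this part contains the "document" information)
5) Extract the downloaded metadata
6) Insert the extracted metadata into the MongoDB collection
The metadata is stored in several files, one per language. "da" (Danish) is the first file inserted.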
MongoDB: 2.6.1
Elasticsearch: 1.1.0
Plugins:
elasticsearch-mapper-attachments version 2.0.0
elasticsearch-river-mongodb version 2.0.0
Does anyone know why the entries in the Mongo "document" other than "da" are not available in the Elasticsearch dataset?
Answer 0 (score: 0)
Elasticsearch and Java were running out of memory. Increasing the amount of memory allocated to Java solved the problem for me.
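As a rough sketch of how the heap can be raised on Elasticsearch 1.x: the startup script reads the ES_HEAP_SIZE environment variable and uses it for both the minimum and maximum JVM heap. The value 2g below is only an assumption for illustration (the original post does not say which size was used), and the start command depends on how Elasticsearch was installed.

```shell
# Assumed heap size for illustration; choose a value that fits the machine.
export ES_HEAP_SIZE=2g
echo "ES_HEAP_SIZE=$ES_HEAP_SIZE"

# Restart Elasticsearch from this shell so it picks up the variable, e.g.:
#   ./bin/elasticsearch
# To inspect JVM heap usage afterwards (requires a running node):
#   curl -XGET "localhost:9200/_nodes/stats/jvm?pretty"
```

If Elasticsearch runs as a service, the variable usually has to be set in the service's own configuration file rather than in an interactive shell.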