MongoDB query to find whether a string field contains a substring?

Asked: 2019-10-04 21:53:45

Tags: mongodb-query aggregation-framework

I am using the following query to identify junk documents in a collection:

db.mycoll.find(
{ $and: [
         {"Text": { $regex:'Error' }},
         {"Language": { $eq: "English" }},
        ] })

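This returns every document containing the string "Error", including some clean documents that have nothing to do with HTTP / database / PHP errors. I looked for patterns manually and built a list of strings, for example: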

var junk =["MySQL Query Error",
        "Fatal error",
        "Uncaught Error",
        "Forgot your password",
        "Failed to connect to localhost",
        "RuntimeException",
        "Error message",
        "Error Sql",
        "Error 404",.....]

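Now I am trying to write a query that searches the collection for these junk strings, so that wherever any of the junk strings above appears in my Text value (as a substring, not an exact match), I simply discard that document!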
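
One way to sketch the search described above is to escape each junk string and join the results into a single alternation pattern for `$regex`. The helper names `escapeRegex` and `buildJunkFilter` are hypothetical, the junk list is shortened, and the field names follow the query earlier in the question. This is an illustration under those assumptions, not something tested against the real collection.

```javascript
// Hypothetical helper: escape regex metacharacters so a junk string is matched literally.
function escapeRegex(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: build a filter for documents whose Text contains ANY junk string.
function buildJunkFilter(junk) {
  const pattern = junk.map(escapeRegex).join("|");
  return { Text: { $regex: pattern }, Language: "English" };
}

// Shortened sample of the junk list from the question.
const junk = ["MySQL Query Error", "Fatal error", "Error Sql", "Error 404"];
const junkFilter = buildJunkFilter(junk);

// Assumed mongo shell usage: db.mycoll.find(junkFilter)
```

The match is case-sensitive as written; adding `$options: "i"` next to `$regex` would relax that. An empty junk list would produce an empty pattern that matches every document, so that case is worth guarding against before running anything destructive.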

For example

The documents are:

{ 
    "_id" : NumberInt(441868), 
    "Text" : "404 Error", 
    "newLanguage" : "English"
}
{ 
    "_id" : NumberInt(5860039), 
    "Text" : "France orders the withdrawal of infant milk due to risk of salmonellosis\nHEALTH 1 week ago Elpais 33\nThe authorities publish a list of lots of Lactalis that would affect more than a dozen countries, including Colombia, Peru or the United Kingdom\nComments\nLoading ...\nPlease, Insert your name Please, Enter your E-Mail Please write your comments Error Happened\nRelated Posts", 
    "newLanguage" : "English"
}
....

The "Text" field has values like the following:

Bad documents:

"500 Error", 
"400 Error", 
"==============================================================================\nError Sql : UPDATE mynews_art SET view_cnt = view_cnt + 1 WHERE site_id = 12265 AND art_no =\nError Msg : You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1\n==============================================================================\n==============================================================================\nError Sql : SELECT chgdate, pubdate FROM mynews_art WHERE site_id = 12265 AND  art_no =\nError Msg : You have an error in your SQL syntax; check the manual that corresponds to your MySQL server version for the right syntax to use near '' at line 1\n==============================================================================", 
"*\nError 404, Page not found\n??? ??? ?????? ??? ?????? ???????? ???? ?????  www.alayam.com  ?????? ??????? ???? ?????? ?????? ?????? ????? ?? ??????? ???    ?????? ????????\n? ???? ?? ??? ?????", 
"Server Error in '/' Application.\nRuntime Error\nDescription: An application error occurred on the server. The current custom error settings for this application prevent the details of the application error from being viewed remotely (for security reasons). It could, however, be viewed by browsers running on the local server machine.\nDetails: To enable the details of this specific error message to be viewable on remote machines, please create a <customErrors> tag within a \"web.config\" configuration file located in the root directory of the current web application. This <customErrors> tag should then have its \"mode\" attribute set to \"Off\".\n<!-- Web.Config Configuration File --> <configuration> <system.web> <customErrors mode=\"Off\"/> </system.web> </configuration>\nNotes: The current error page you are seeing can be replaced by a custom error page by modifying the \"defaultRedirect\" attribute of the application's <customErrors> configuration tag to point to a custom error page URL.\n<!-- Web.Config Configuration File --> <configuration> <system.web> <customErrors mode=\"RemoteOnly\" defaultRedirect=\"mycustompage.htm\"/> </system.web> </configuration>", 
"Error 301 Moved permanently HTTP\nMoved permanently HTTPS", 
....

Good documents:

"Quebec whooping cough: 5 things to know\nDeaths and explosion of cases in Mauricie prompt warning from health officials\nCBC News\nPosted: Nov 30, 2015 10:02 PM ET Last Updated: Dec 01, 2015 9:31 AM ET\nQuebec health officials say it's crucial for babies to follow the vaccination schedule for the whooping cough.Typo or Error Send Feedback\nTo encourage thoughtful and respectful conversations, first and last names will appear with each submission to CBC/Radio-Canada's online communities" 
"Refugee crisis: Croatia lifts border blockade with Serbia\nThomson Reuters\nPosted: Sep 25, 2015 12:02 PM ET Last Updated: Sep 25, 2015 12:07 PM ET\nMigrants wait to cross into Croatia through the Serbian border on Friday in Bapska, Croatia. More than 40,000 migrants have crossed into Croatia from Serbia since Tuesday last week, and the Croatian government has said it can't cope with the flow. 
Typo or Error Send Feedback\nTo encourage thoughtful and respectful conversations."
.....

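The $in operator searches for exact matches, and the $regex operator does not accept the list as an argument! Any suggestions on which operator would work in this situation, and how?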
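
As far as the MongoDB documentation describes it, `$in` also accepts regular expression objects (the `/pattern/` form) in its array, though not `$regex` operator expressions, so the list can be passed as patterns instead of exact strings. A minimal sketch under that assumption follows; `toLiteralRegex` is a hypothetical helper, the junk list is shortened, and the field names are taken from the query in the question.

```javascript
// Hypothetical helper: turn a plain string into a RegExp matching it literally as a substring.
function toLiteralRegex(s) {
  return new RegExp(s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&"));
}

// Shortened sample of the junk list from the question.
const junkStrings = ["Fatal error", "Uncaught Error", "RuntimeException", "Error Sql"];
const junkPatterns = junkStrings.map(toLiteralRegex);

// $in matches when Text matches at least one of the patterns.
const inFilter = { Text: { $in: junkPatterns }, Language: "English" };

// Assumed mongo shell usage:
//   db.mycoll.find(inFilter)        // review the matches first
//   db.mycoll.deleteMany(inFilter)  // then remove them
```

Running the `find` first and inspecting a sample of the results is a cheap check that no clean document, such as one containing "Typo or Error Send Feedback", is caught before anything is deleted.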

0 answers:

No answers