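I am building an application that recognizes plans written in plain human language, for example "Each 2 weeks check car before 6p.m" or "run for 2 hours".
A plan can be written in any valid form, with numbers given either as digits or as words (6 can be "6" or "six").
I have already built some dictionaries and some rules.
Part of the dictionary and rules: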
plan.rules = {
    language : "EN",
    dictionary : {
        numbers : {
            // index in "ones" equals the numeric value
            ones : [
                ["zero"],
                ["one", "first", "once"],
                ["two", "second", "twice"],
                ["three", "third", "thrice"],
                ["four", "fourth"],
                ["five", "fifth"],
                ["six", "sixth"],
                ["seven", "seventh"],
                ["eight", "eighth"],
                ["nine", "ninth"]
            ],
            teens : [
                [],
                ["ten", "tenth"],
                ["eleven", "eleventh"],
                ["twelve", "twelfth"],
                ["thirteen", "thirteenth"],
                ["fourteen", "fourteenth"],
                ["fifteen", "fifteenth"],
                ["sixteen", "sixteenth"],
                ["seventeen", "seventeenth"],
                ["eighteen", "eighteenth"],
                ["nineteen", "nineteenth"]
            ],
            tens : [
                [],
                ["ten"],
                ["twenty"],
                ["thirty"],
                ["forty"],
                ["fifty"],
                ["sixty"],
                ["seventy"],
                ["eighty"],
                ["ninety"]
            ]
        },
        peroids : {
            minute : ["min", "minute", "minutes"],
            hour : ["hour", "hours"],
            day : ["day", "days"],
            week : ["week", "weeks"],
            month : ["month", "months"],
            year : ["year", "years"]
        }
    },
    rules : {
        each : [
            "each {peroid}",
            "each {number} {peroid}",
            "every {peroid}",
            "every {number} {peroid}"
        ],
        for : [
            "for {peroid}",
            "for {number} {peroid}"
        ]
    }
}
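One possible way to use the numbers part of such a dictionary is to flatten it into a word-to-value lookup. The sketch below is only an illustration: the shortened `numbers` object, the helpers `buildNumberLookup` and `parseNumber`, and the index-to-value conventions noted in the comments are assumptions, not part of the code above.

```javascript
// Hypothetical, shortened copy of the numbers dictionary (same shape as above).
const numbers = {
  ones: [["zero"], ["one", "first", "once"], ["two", "second", "twice"], ["three", "third", "thrice"]],
  teens: [[], ["ten", "tenth"], ["eleven", "eleventh"], ["twelve", "twelfth"]],
  tens: [[], ["ten"], ["twenty"], ["thirty"]],
};

// Assumed conventions: ones[i] means i, teens[i] means 9 + i, tens[i] means 10 * i.
function buildNumberLookup(dict) {
  const lookup = new Map();
  dict.ones.forEach((words, i) => words.forEach((w) => lookup.set(w, i)));
  dict.teens.forEach((words, i) => words.forEach((w) => lookup.set(w, 9 + i)));
  dict.tens.forEach((words, i) => words.forEach((w) => lookup.set(w, 10 * i)));
  return lookup;
}

// Accepts digits ("6") or a dictionary word ("six"); returns null if unknown.
function parseNumber(token, lookup) {
  const t = token.toLowerCase();
  if (/^\d+$/.test(t)) return Number(t);
  return lookup.has(t) ? lookup.get(t) : null;
}

const numberLookup = buildNumberLookup(numbers);
console.log(parseNumber("two", numberLookup));    // 2
console.log(parseNumber("12", numberLookup));     // 12
console.log(parseNumber("Twenty", numberLookup)); // 20
```

Building the lookup once keeps each token check to a single map access, instead of scanning every array for every word.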
Based on the data above, take for example "Each two weeks check something":
"two" matches the number 2,
"weeks" matches the peroid "week",
so the sentence matches the pattern "each {number} {peroid}".
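The matching described above can be sketched as a token-by-token comparison against each rule template. This is a minimal sketch under assumptions: the trimmed-down `periods`, `numberWords`, and `rules` data, the `slots` resolvers, and the `matchPlan` function are hypothetical names introduced only for illustration.

```javascript
// Hypothetical, trimmed-down data in the same spirit as plan.rules.
const periods = { day: ["day", "days"], week: ["week", "weeks"], hour: ["hour", "hours"] };
const numberWords = new Map([["one", 1], ["two", 2], ["three", 3]]);
const rules = {
  each: ["each {peroid}", "each {number} {peroid}", "every {peroid}", "every {number} {peroid}"],
  for: ["for {peroid}", "for {number} {peroid}"],
};

// Slot resolvers: return the parsed value, or undefined if the token does not fit.
const slots = new Map([
  ["{number}", (t) => (/^\d+$/.test(t) ? Number(t) : numberWords.get(t))],
  ["{peroid}", (t) => Object.keys(periods).find((p) => periods[p].includes(t))],
]);

// Try every template at every token position and return the first one that fits.
function matchPlan(sentence) {
  const tokens = sentence.toLowerCase().split(/\s+/).filter(Boolean);
  for (const [name, templates] of Object.entries(rules)) {
    for (const template of templates) {
      const parts = template.split(" ");
      for (let start = 0; start + parts.length <= tokens.length; start++) {
        const values = {};
        const ok = parts.every((part, i) => {
          const token = tokens[start + i];
          if (!slots.has(part)) return part === token; // literal word such as "each"
          const value = slots.get(part)(token);
          if (value === undefined) return false;
          values[part.slice(1, -1)] = value; // "{number}" -> "number"
          return true;
        });
        if (ok) return { rule: name, template, values };
      }
    }
  }
  return null;
}

console.log(matchPlan("Each two weeks check something"));
// { rule: 'each', template: 'each {number} {peroid}', values: { number: 2, peroid: 'week' } }
```

Because the slots must be adjacent tokens, this sketch only handles sentences that follow a template word for word; anything looser needs extra rules or a more tolerant matcher.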
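I am trying to come up with an algorithm to analyze the input. I am thinking of running one huge loop over the dictionary and the rules, but maybe it is possible to build a regExp from all of these cases?
If I am going about this the wrong way, how should it be done?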
Answer 0 (score: 1)
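You can do it with regular expressions, but I think you will end up with some very unwieldy patterns.
Just as an example: if your text always contains the word "each", followed by some text and a number, then some more text and a period, you could try something like this (you will need many more number combinations if you decide to extend it):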
[Ee]ach.*(one|first|1|two|second|2).*(minute|hour|day|week|month|year)s?
"Each two weeks check something" matches "two" and "week", and "Each first day check something else" matches "first" and "day".
But "Each first day of the week do something" or "Each 3rd week of the month do something" will not work.
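To see these cases concretely, here is a small sketch that runs a pattern of this kind in JavaScript. The period group is written as `(...)s?`, which is an assumption about the intended plural handling, and the `capture` helper is a hypothetical name used only for this illustration.

```javascript
// Pattern of the kind shown above; only the numbers 1 and 2 are covered.
const pattern = /[Ee]ach.*(one|first|1|two|second|2).*(minute|hour|day|week|month|year)s?/;

// Returns the captured number word and period word, or null if nothing matches.
function capture(sentence) {
  const m = sentence.match(pattern);
  return m ? { number: m[1], period: m[2] } : null;
}

console.log(capture("Each two weeks check something"));
// { number: 'two', period: 'week' }
console.log(capture("Each first day check something else"));
// { number: 'first', period: 'day' }
console.log(capture("Each first day of the week do something"));
// { number: 'first', period: 'week' }  (greedy .* skips past "day")
console.log(capture("Each 3rd week of the month do something"));
// null  ("3rd" is not in the number group)
```

The third case matches but captures the wrong period, because the greedy `.*` runs to the last period word in the sentence; the fourth fails outright because the number group is incomplete.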
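Natural language offers many possible ways to say each {number} {period}.
If you want to capture all of them, regular expressions will become very hard to work with.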