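I have a simple Django search form set up in my web application, where users can search for specific words in my Arabic corpus. Users can search in one of three ways: 'Exact' (exactly as typed), 'Stem' (which returns all inflected forms of the lemma), and 'RegEx' (which lets them run more complex searches using regular expressions).
The problem I'm running into is that if a user submits an invalid regular expression, it triggers a 500 server error instead of giving a validation error or empty results, which I imagine is confusing. Below is the traceback from one such error, caused by searching for a regex with unbalanced parentheses: ha((.*(?!al))
Is there any way to catch this kind of error, or make it more user-friendly? (I've also included the code for my form below.)
Thanks.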
class ConcordanceForm(forms.Form):
    searchterm = forms.CharField(max_length=100, required=True)
    search_type = forms.ChoiceField(widget=RadioSelect(),
        choices=([('string', 'Exact'), ('lemma', 'Stem'), ('regex', 'Regex')]),
        required=True)
def concord_test(request):
    if request.method == 'POST':
        form = ConcordanceForm(request.POST)
        if form.is_valid():
            searchterm = form.cleaned_data['searchterm'].encode('utf-8')
            search_type = form.cleaned_data['search_type']
            context, texts_len, results_len = make_concordance(searchterm, search_type)
            return render_to_response('corpus/concord.html', locals())
    else:
        form = ConcordanceForm()
    return render_to_response('corpus/search_test.html',
        {'form': form}, context_instance=RequestContext(request))
<p style=" font-weight:bold;">Search for any word in the corpus:</p>
<form action="/search_test/" method="post">{% csrf_token %}
{{ form.as_p }}
<input type="submit" value="Submit" />
</form>
Traceback (most recent call last):
  File "/home/larapsodia/webapps/django/lib/python2.6/django/core/handlers/base.py", line 100, in get_response
    response = callback(request, *callback_args, **callback_kwargs)
  File "/home/larapsodia/webapps/django/tunisiya2/corpus/views.py", line 154, in concord_test
    context, texts_len, results_len = make_concordance(searchterm, search_type)
  File "/home/larapsodia/webapps/django/tunisiya2/corpus/views.py", line 91, in make_concordance
    p = re.compile(r'\b' + searchterm + r'__') # initial position in word_pos_lemma string
  File "/usr/local/lib/python2.6/re.py", line 190, in compile
    return _compile(pattern, flags)
  File "/usr/local/lib/python2.6/re.py", line 245, in _compile
    raise error, v # invalid expression
error: unbalanced parenthesis
<WSGIRequest
GET:<QueryDict: {}>,
POST:<QueryDict: {u'searchterm': [u'ha((.*(?!al))'], u'search_type': [u'regex'], u'csrfmiddlewaretoken': [u'c9a6cad4a0761580f5e351e9e534e028']}>,
COOKIES:{'__utma': '58037167.1544119768.1401037185.1401381302.1401384825.14',
'__utmb': '58037167.10.10.1401384825',
'__utmc': '58037167',
'__utmz': '58037167.1401037185.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)',
'csrftoken': 'c9a6cad4a0761580f5e351e9e534e028',
'sessionid': '8d5b0b8730ccce0860b687b4c7ec1fdb'},
META:{'CONTENT_LENGTH': '109',
'CONTENT_TYPE': 'application/x-www-form-urlencoded',
'CSRF_COOKIE': 'c9a6cad4a0761580f5e351e9e534e028',
'DOCUMENT_ROOT': '/usr/local/apache2/htdocs',
'GATEWAY_INTERFACE': 'CGI/1.1',
'HTTP_ACCEPT': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8',
'HTTP_ACCEPT_ENCODING': 'gzip,deflate,sdch',
'HTTP_ACCEPT_LANGUAGE': 'en-US,en;q=0.8,ar;q=0.6',
'HTTP_CACHE_CONTROL': 'max-age=0',
'HTTP_CONNECTION': 'close',
'HTTP_COOKIE': 'sessionid=8d5b0b8730ccce0860b687b4c7ec1fdb; csrftoken=c9a6cad4a0761580f5e351e9e534e028; __utma=58037167.1544119768.1401037185.1401381302.1401384825.14; __utmb=58037167.10.10.1401384825; __utmc=58037167; __utmz=58037167.1401037185.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none)',
'HTTP_FORWARDED_REQUEST_URI': '/search_test/',
'HTTP_HOST': 'www.tunisiya.org',
'HTTP_HTTPS': 'off',
'HTTP_HTTP_X_FORWARDED_PROTO': 'http',
'HTTP_ORIGIN': 'http://www.tunisiya.org',
'HTTP_REFERER': 'http://www.tunisiya.org/search_test/',
'HTTP_USER_AGENT': 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.114 Safari/537.36',
'HTTP_X_FORWARDED_FOR': '68.9.41.110',
'HTTP_X_FORWARDED_HOST': 'www.tunisiya.org',
'HTTP_X_FORWARDED_PROTO': 'http',
'HTTP_X_FORWARDED_SERVER': 'www.tunisiya.org',
'HTTP_X_FORWARDED_SSL': 'off',
'PATH_INFO': u'/search_test/',
'PATH_TRANSLATED': '/home/larapsodia/webapps/django/tunisiya2.wsgi/search_test/',
'QUERY_STRING': '',
'REMOTE_ADDR': '127.0.0.1',
'REMOTE_PORT': '37086',
'REQUEST_METHOD': 'POST',
'REQUEST_URI': '/search_test/',
'SCRIPT_FILENAME': '/home/larapsodia/webapps/django/tunisiya2.wsgi',
'SCRIPT_NAME': u'',
'SERVER_ADDR': '127.0.0.1',
'SERVER_ADMIN': '[no address given]',
'SERVER_NAME': 'www.tunisiya.org',
'SERVER_PORT': '80',
'SERVER_PROTOCOL': 'HTTP/1.0',
'SERVER_SIGNATURE': '',
'SERVER_SOFTWARE': 'Apache/2.2.15 (Unix) mod_wsgi/3.2 Python/2.6.8',
'mod_wsgi.application_group': 'tunisiya2.com|',
'mod_wsgi.callable_object': 'application',
'mod_wsgi.handler_script': '',
'mod_wsgi.input_chunked': '0',
'mod_wsgi.listener_host': '',
'mod_wsgi.listener_port': '39877',
'mod_wsgi.process_group': '',
'mod_wsgi.request_handler': 'wsgi-script',
'mod_wsgi.script_reloading': '1',
'mod_wsgi.version': (3, 2),
'wsgi.errors': <mod_wsgi.Log object at 0xd69b570>,
'wsgi.file_wrapper': <built-in method file_wrapper of mod_wsgi.Adapter object at 0xa7efda0>,
'wsgi.input': <mod_wsgi.Input object at 0xd69b598>,
'wsgi.multiprocess': False,
'wsgi.multithread': True,
'wsgi.run_once': False,
'wsgi.url_scheme': 'http',
'wsgi.version': (1, 1)}>
Answer 0 (score: 1)
Wrap the call to make_concordance in try-except; if an exception occurs, render the original form template for the user along with the error message.
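import re

try:
    context, texts_len, results_len = make_concordance(searchterm, search_type)
except re.error as e:
    # The form field is named 'searchterm'; attach the message to it
    # using the form's own error list class.
    form._errors['searchterm'] = form.error_class([str(e)])
    # Drop the invalid value so it is no longer treated as cleaned data.
    del form.cleaned_data['searchterm']
    return render_to_response('corpus/search_test.html',
        {'form': form}, context_instance=RequestContext(request))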
A better approach would be to write a custom cleaner, but that seems a bit more complicated, and I don't have Django on hand to test it.
Answer 1 (score: 0)
Building on @Sam's comment, here is how to catch the specific error raised when a regular expression fails to compile:
import re

err_message = None
try:
    re.compile('(unbalanced')
except re.error as exc:
    # '{0}' rather than '{}' so this also runs on Python 2.6
    err_message = 'Uhoh: {0}'.format(exc)
print err_message
Output:
Uhoh: unbalanced parenthesis
Answer 2 (score: 0)
I ended up building a custom cleaner, as Antti suggested. This is what finally worked:
def clean(self):
    cleaned_data = self.cleaned_data
    searchterm = cleaned_data.get('searchterm')
    search_type = cleaned_data.get('search_type')
    # searchterm is None if the field itself failed validation, so skip the check then
    if search_type == 'regex' and searchterm:
        try:
            re.search(searchterm, 'randomdatastring')  # this is just to test if the regex is valid
        except re.error:
            raise forms.ValidationError("Invalid regular expression.")
    return cleaned_data
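The validity check inside clean() can also be pulled out into a small Django-free function, which makes it easy to test on its own. The following is only a sketch: regex_error is a hypothetical helper name that does not appear in the original code, and it uses re.compile instead of searching a dummy string.

```python
import re


def regex_error(pattern):
    """Return None if `pattern` compiles, otherwise a short error message.

    Hypothetical helper for illustration; not part of the original code.
    """
    try:
        # Compiling is enough to detect a malformed pattern;
        # no dummy string to search is needed.
        re.compile(pattern)
    except re.error as exc:
        # The wording of `exc` differs between Python versions.
        return 'Invalid regular expression: {0}'.format(exc)
    return None


# The pattern from the question has unbalanced parentheses.
bad = regex_error('ha((.*(?!al))')
# A well-formed pattern yields no message.
good = regex_error('ha.*al')
```

Inside clean(), the message returned by such a helper could be passed to forms.ValidationError when it is not None, so the user sees why the pattern was rejected.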