Question

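I'm trying to build a regex that matches multiple strings in an HTML response. It is used by a load balancer to monitor a web page: if the regex matches, the load balancer considers the server UP and sends traffic to it.

Example of the expected HTML response: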

HTTP/1.1 200 
X-AREQUESTID: *1KIRCWLx688x71065x0
X-XSS-Protection: 1; mode=block
X-FRAME-OPTIONS: SAMEORIGIN
X-Content-Type-Options: nosniff
Access-Control-Allow-Origin: *
Content-Type: application/json
Transfer-Encoding: chunked
Content-Encoding: gzip
Vary: Accept-Encoding
Date: Wed, 31 Oct 2018 11:28:14 GMT
{"state":"RUNNING"}

What I want to achieve is to match {"state":"RUNNING"} or {"state":"MAINTENANCE"}, together with HTTP/1.1 200.

I have the following working regexes, but I can't figure out how to tie them together ;-)

\{\"state\":\"RUNNING\"\}|\{\"state\":\"MAINTENANCE\"\}

matches {"state":"RUNNING"} or {"state":"MAINTENANCE"}

HTTP\/1\.(0|1) (200|301|302)

matches the HTTP response codes 200, 301, or 302 (e.g. HTTP/1.1 200)

So now, how do I build one big regex that covers all of the conditions?

HTTP\/1\.(0|1) (200|301|302) AND \{\"state\":\"RUNNING\"\}|\{\"state\":\"MAINTENANCE\"\}?

Is that possible?

Thanks in advance

Answer 1

This should do the trick:

/HTTP\/1\.(0|1) (200|301|302).*?(\{\"state\":\"RUNNING\"\}|\{\"state\":\"MAINTENANCE\"\})/s

The key in this version is the /s flag, which allows the . character to match newlines. Demo on regex101.
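As a rough illustration of the dot-matches-newline approach, here is a minimal Python sketch. It assumes Python's `re.DOTALL` as the counterpart of the `/s` flag; the load balancer's own regex engine may spell the flag differently. The `RESPONSE` string and the `is_up` helper are hypothetical names made up for this example.

```python
import re

# Hypothetical sample modeled on the response in the question (headers shortened).
RESPONSE = (
    "HTTP/1.1 200 \r\n"
    "Content-Type: application/json\r\n"
    "Date: Wed, 31 Oct 2018 11:28:14 GMT\r\n"
    "\r\n"
    '{"state":"RUNNING"}'
)

PATTERN_TEXT = (
    r'HTTP/1\.(0|1) (200|301|302).*?'
    r'(\{"state":"RUNNING"\}|\{"state":"MAINTENANCE"\})'
)

# re.DOTALL makes "." match newline characters as well.
HEALTH_RE = re.compile(PATTERN_TEXT, re.DOTALL)

# Without the flag, ".*?" stops at the first "\n" and never reaches the body.
NO_FLAG_RE = re.compile(PATTERN_TEXT)


def is_up(response: str) -> bool:
    """Return True when the status line and the state JSON are both present."""
    return HEALTH_RE.search(response) is not None


print(is_up(RESPONSE))
print(NO_FLAG_RE.search(RESPONSE) is not None)
```

The second pattern is included only to show why the flag matters for a multi-line response.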

Another option, without using a flag:

HTTP\/1\.(0|1) (200|301|302)[\s\S]*?(\{\"state\":\"RUNNING\"\}|\{\"state\":\"MAINTENANCE\"\})

The key is [\s\S]*?, which non-greedily matches any characters, including newlines. Demo on regex101.
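The flag-free variant can be tried the same way. The sketch below uses Python's `re` module purely for illustration, with a hypothetical `SAMPLE` response; it also reads back the capture groups, which may be handy when debugging a health check.

```python
import re

# [\s\S] matches whitespace or non-whitespace, i.e. any character at all,
# so no dot-matches-newline flag is required.
PATTERN = re.compile(
    r'HTTP/1\.(0|1) (200|301|302)[\s\S]*?'
    r'(\{"state":"RUNNING"\}|\{"state":"MAINTENANCE"\})'
)

# Hypothetical response used only for this example.
SAMPLE = (
    "HTTP/1.1 200 \r\n"
    "Content-Type: application/json\r\n"
    "\r\n"
    '{"state":"MAINTENANCE"}'
)

match = PATTERN.search(SAMPLE)
if match:
    status_code = match.group(2)  # the status code alternative that matched
    state_json = match.group(3)   # the state object that matched
    print(status_code, state_json)
```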

Answer 2

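You can combine the two regexes simply by concatenating them. If you want to allow arbitrary characters between them, put .* in between.

Assuming your regex is matched against the entire response, headers and body included, rather than against a single line, the following should be enough.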

HTTP\/1\.(0|1) (200|301|302)\r\n(.*?)(\{\"state\":\"RUNNING\"\}|\{\"state\":\"MAINTENANCE\"\})$
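One detail worth checking when concatenating: alternation (`|`) has the lowest precedence, so the second regex needs to be wrapped in a group, as the combined pattern above does. The Python sketch below illustrates the difference. The variable names and the `BAD`/`GOOD` responses are made up for the example, and `[\s\S]*?` is used as the filler so that no flag is needed.

```python
import re

STATUS = r'HTTP/1\.(0|1) (200|301|302)'
# The body regex from the question, with no surrounding group.
STATE = r'\{"state":"RUNNING"\}|\{"state":"MAINTENANCE"\}'
ANYTHING = r'[\s\S]*?'

# Naive concatenation: the top-level "|" splits the whole pattern in two,
# so the MAINTENANCE branch can match on its own, without any status line.
naive = re.compile(STATUS + ANYTHING + STATE)

# Grouping the alternation keeps both states tied to the status line.
grouped = re.compile(STATUS + ANYTHING + '(?:' + STATE + ')')

# Hypothetical responses: one with a failing status, one healthy.
BAD = "HTTP/1.1 500 \r\n\r\n" + '{"state":"MAINTENANCE"}'
GOOD = "HTTP/1.1 200 \r\n\r\n" + '{"state":"MAINTENANCE"}'

naive_on_bad = naive.search(BAD) is not None
grouped_on_bad = grouped.search(BAD) is not None
grouped_on_good = grouped.search(GOOD) is not None

print(naive_on_bad, grouped_on_bad, grouped_on_good)
```

With the ungrouped version, a server returning a 500 with a MAINTENANCE body would still be treated as UP, which is probably not what a health check wants.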

Regex needed to match multiple strings in an HTML response
