
When crawling with Import.io, there is an advanced option to set a URL Pattern that determines which pages should have data extracted.

I'm used to regex, so I'm having a hard time with the Import.io URL Patterns.

The pattern in Regex would be

http://www.site.com/.[0-9]+.html.

How can I do that with an Import.io pattern?

I tried the following, but it didn't work:

www.site.com/{any}{num}.html

Some examples that should be extracted:

This is the Import.io notation:

{any} - anything (including nothing)
{num} - a number, e.g. 8767
{alpha} - a-z characters, e.g. Dog
{alpha-num} - either alpha or num, e.g. 435h5k
{words-num} - words containing numbers separated by -, _ or +, e.g. this-is_a+2nd example
{not-slash} - anything apart from a slash
{uuid} - a UUID, e.g. 439a110f-bba1-46a5-befd-1f32cfb63dc8
{query-string} - a query string, e.g. ?a=1&b=2%c=3
{query-params} - a partial query string, e.g. a=1&b=2
{ref} - a reference, otherwise known as an anchor, e.g. #foo
$ - match the end of the URL
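
To make the comparison with regex concrete, here is a rough Python sketch that translates a few of the tokens listed above into regex fragments. The mapping is only my reading of the descriptions in that list, not Import.io's actual implementation; the helper name pattern_to_regex and the sample URLs are made up for illustration, and tokens missing from the table are simply treated as literal text.

```python
import re

# Assumed regex equivalents, inferred from the token descriptions only.
# This is NOT how Import.io is known to evaluate patterns.
TOKEN_REGEX = {
    "{any}": r".*",                 # anything, including nothing
    "{num}": r"[0-9]+",             # a number
    "{alpha}": r"[A-Za-z]+",        # letters
    "{alpha-num}": r"[A-Za-z0-9]+", # letters or digits
    "{not-slash}": r"[^/]+",        # assumed: one or more non-slash characters
}


def pattern_to_regex(pattern: str) -> re.Pattern:
    """Translate an Import.io-style pattern into a compiled regex (hypothetical helper)."""
    # Split on {token} placeholders and on "$", keeping the separators.
    parts = re.split(r"(\{[a-z-]+\}|\$)", pattern)
    out = []
    for part in parts:
        if part in TOKEN_REGEX:
            out.append(TOKEN_REGEX[part])
        elif part == "$":
            out.append("$")  # end of the URL
        else:
            out.append(re.escape(part))  # literal text; unknown tokens land here too
    return re.compile("".join(out))


if __name__ == "__main__":
    rx = pattern_to_regex("www.site.com/{any}{num}.html")
    # Hypothetical URLs, used only to exercise the sketch.
    for url in ("http://www.site.com/page-123.html", "http://www.site.com/about.html"):
        print(url, bool(rx.search(url)))
```

Under these assumptions, the pattern I tried corresponds to what I want in regex terms, so this only checks the intended semantics locally; it says nothing about how Import.io itself evaluates the pattern (for example, whether it expects the scheme or matches the whole URL).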

Thanks!