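The following code only prints "Good job!". How do I get the actual URL out of it? I followed the tutorial on the linked site, but I am still having trouble wrapping my head around it. Also, I don't think regex is the best approach here (mixing regex with HTML). Is there a simple way to capture text based on its CSS class?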
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Threading.Tasks;
using System.Net;
using System.IO;
using System.Text.RegularExpressions;

namespace Scraper
{
    class Program
    {
        static void Main(string[] args)
        {
            string target = @"http://www.omegacoder.com/?p=58";
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(target);
            HttpWebResponse response = (HttpWebResponse)request.GetResponse();
            Regex URL = new Regex("(?:href=)(?<link>.*?)");
            string line;
            using (Stream responseStream = response.GetResponseStream())
            using (StreamReader htmlStream = new StreamReader(responseStream))
                while ((line = htmlStream.ReadLine()) != null)
                {
                    Match m = URL.Match(line);
                    if (m.Success)
                    {
                        Console.WriteLine("Good job! " + URL.Match(line) + m.Groups[0].Value + m.Groups[1].Value + m.Groups["link"]);
                        Console.ReadLine();
                    }
                    else
                    {
                    }
                }
            /* if (Regex.IsMatch(line, "XXXXX"))
                   Console.WriteLine(line);
               } */
            Console.ReadLine();
        }
    }
}
Answer 0 (score: 0)
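You should use (?:href=)(?<link>\S*)
\S matches any non-whitespace character, so the link group captures everything after href= up to the next whitespace. In your pattern the lazy .*? is the last element of the regex, so it is satisfied by matching zero characters and the link group is always empty.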
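To make the difference concrete, here is a minimal sketch that runs both patterns against one line of HTML. The sample line, the HrefDemo class, and the Capture helper are hypothetical and only for illustration. The third pattern, which stops at the closing double quote, is an assumed variant that is not part of the answer above. It only handles double-quoted attribute values.

```csharp
using System;
using System.Text.RegularExpressions;

public static class HrefDemo
{
    // Hypothetical helper: returns the "link" group of the first match,
    // or null when the pattern does not match at all.
    public static string Capture(string line, string pattern)
    {
        Match m = Regex.Match(line, pattern);
        return m.Success ? m.Groups["link"].Value : null;
    }

    public static void Main()
    {
        // Made-up sample line, not taken from the page in the question.
        string line = "<a class=\"nav\" href=\"http://example.com/page\">Home</a>";

        // Pattern from the question: the trailing lazy .*? matches zero characters.
        Console.WriteLine("question: [" + Capture(line, "(?:href=)(?<link>.*?)") + "]");

        // Pattern from the answer: \S* runs until the next whitespace (or end of line),
        // so the quotes and any markup that follows without a space are included.
        Console.WriteLine("answer:   [" + Capture(line, @"(?:href=)(?<link>\S*)") + "]");

        // Assumed variant: capture only what sits between the double quotes.
        Console.WriteLine("quoted:   [" + Capture(line, "href=\"(?<link>[^\"]*)\"") + "]");
    }
}
```

On this sample the question's pattern yields an empty capture, the answer's pattern yields "http://example.com/page">Home</a> including the quotes and trailing markup (there is no whitespace after the URL), and the quoted variant yields http://example.com/page. If the \S* pattern is used as is, the surrounding quotes still need to be trimmed from the result.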