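I am trying to extract email addresses and links from the company page http://customercarecontacts.com/contact-infosys-phone-address-of-infosys-offices/
I can get the links, but no email addresses are returned. I have tried several approaches without success. Here is the code I am using: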
import java.io.IOException;
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import org.jsoup.nodes.Document;
import org.jsoup.Jsoup;
import org.jsoup.nodes.Element;
import org.jsoup.select.Elements;

public class JSoupTest {
    public static void main(String[] args) throws IOException {
        Document doc = Jsoup.connect("http://customercarecontacts.com/contact-infosys-phone-address-of-infosys-offices/").userAgent("Mozilla/5.0").timeout(5000).get();
        Pattern p = Pattern.compile("[a-zA-Z0-9_.+-]+@[a-zA-Z0-9-]+\\.[a-zA-Z0-9-.]+");
        Matcher matcher = p.matcher(doc.text());
        Set<String> emails = new HashSet<String>();
        while (matcher.find()) {
            emails.add(matcher.group());
        }
        Set<String> links = new HashSet<String>();
        Elements elements = doc.select("a[href]");
        for (Element e : elements) {
            links.add(e.attr("href"));
        }
        System.out.println("emails : " + emails);
        System.out.println("links : " + links);
    }
}
Can anyone suggest a way to extract the email addresses?
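Answer 0: (score: 0)
The page appears to write at least one address in obfuscated form, such as "askus (at) infosys.com", so a pattern that requires a literal @ will not match it. You can try this pattern, which accepts both forms (shown with Java string escaping):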
[a-zA-Z0-9_.+-]+(@\\w+|\\s*\\(at\\)\\s*\\w+)\\.[a-zA-Z]+
Sample Java code:
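import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmailRegexDemo {
    public static void main(String[] args) {
        // Accepts both "user@host.tld" and the obfuscated "user (at) host.tld".
        final String regex = "[a-zA-Z0-9_.+-]+(@\\w+|\\s*\\(at\\)\\s*\\w+)\\.[a-zA-Z]+";
        final String string = "df\n"
                + "askus (at) infosys.com (queries)<br />\n"
                + "asdfasdf\n"
                + "asdfasdf\n"
                + "asdf abc@yahoo.com asdfadsf\n"
                + "asdf pqr@google.com a sdfasfd\n\n\n";
        final Pattern pattern = Pattern.compile(regex);
        final Matcher matcher = pattern.matcher(string);
        while (matcher.find()) {
            // group(0) is the whole address; group(1) is the "@host" or "(at) host" part.
            System.out.println("Full match: " + matcher.group(0));
            for (int i = 1; i <= matcher.groupCount(); i++) {
                System.out.println("Group " + i + ": " + matcher.group(i));
            }
        }
    }
}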
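To use this on the text returned by doc.text() in the question, it may help to normalize the "(at)" form to "@" while collecting matches, so the resulting set holds ordinary addresses. The following is only a minimal sketch under that assumption: the class name EmailExtractor and the method extractEmails are hypothetical helpers introduced here (not part of jsoup), and the pattern is a regrouped variant of the one above that captures the local part and the domain separately.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class EmailExtractor {
    // Hypothetical helper pattern: group 1 is the local part, group 2 is the domain.
    // The separator is either a literal "@" or "(at)" with optional surrounding whitespace.
    private static final Pattern EMAIL = Pattern.compile(
            "([a-zA-Z0-9_.+-]+)(?:@|\\s*\\(at\\)\\s*)(\\w+\\.[a-zA-Z]+)");

    // Returns the addresses found in the text, rewritten to the "local@domain" form.
    public static Set<String> extractEmails(String text) {
        Set<String> emails = new LinkedHashSet<>(); // keeps first-seen order, drops duplicates
        Matcher m = EMAIL.matcher(text);
        while (m.find()) {
            emails.add(m.group(1) + "@" + m.group(2));
        }
        return emails;
    }

    public static void main(String[] args) {
        String sample = "askus (at) infosys.com (queries) abc@yahoo.com";
        System.out.println(extractEmails(sample)); // prints [askus@infosys.com, abc@yahoo.com]
    }
}
```

In the question's program this would be called as extractEmails(doc.text()). Like the pattern above, the sketch only recognizes a domain of the form name.tld, so an address on a multi-level domain would be cut off after the second label.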