Question

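What would be a good way in Java to implement the following, so that I only fetch new web data for my database? Would it come down to comparing a bunch of array elements? Some ideas would be great.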

Crawler imdbCrawler = new Crawler(files.getLocalTitles("C:\\Movies"));
//add these titles to the database
//query to get existing DB titles, get directory titles and crawl negated union of these titles

Answer 1

You do know that IMDB offers their database for free... with some caveats for commercial use, of course.

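Secondly, using some kind of structure/Collection is best: if an object in the collection has data, that means you have already crawled it. If it has none, it still needs crawling. If you find new links, just add them to the collection (without data), and your data collection thread will pick them up later.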
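A minimal sketch of that idea, assuming each title maps to a plain string of crawled data. `CrawlRegistry` and its method names are made up for illustration and are not part of the `Crawler` class from the question; the `ConcurrentHashMap` is only there because a separate data collection thread is mentioned.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of the Crawler class from the question.
class CrawlRegistry {
    // title -> crawled data; Optional.empty() marks "known but not crawled yet"
    private final Map<String, Optional<String>> entries = new ConcurrentHashMap<>();

    // Register a title or link without data; existing entries are left untouched.
    void addIfUnknown(String title) {
        entries.putIfAbsent(title, Optional.empty());
    }

    // Record the data fetched for a title, marking it as crawled.
    void store(String title, String data) {
        entries.put(title, Optional.of(data));
    }

    // Titles that still have no data, sorted so the result is deterministic.
    List<String> pending() {
        List<String> result = new ArrayList<>();
        for (Map.Entry<String, Optional<String>> e : entries.entrySet()) {
            if (!e.getValue().isPresent()) {
                result.add(e.getKey());
            }
        }
        Collections.sort(result);
        return result;
    }
}
```

One way to use it: call `store` for every title already in the database, call `addIfUnknown` for every title found in the directory, and then crawl whatever `pending()` returns. That avoids comparing arrays element by element, since the map lookup does the "already have it?" check.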

德克尔

Efficient web crawling
