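I want to scrape a news website with Scrapy. The code retrieves the relevant news from the current link, but it does not follow the next-page link. The news site has the following link attributes:
The code I am following: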
#include <stdio.h>
double integrate(double low, double hi, int trap) {
...
}
int flush_line(void) {
// Consume the pending input and return '\n' or EOF
int c;
while ((c = getchar()) != EOF && c != '\n')
continue;
return c;
}
int main() {
// Main program loop
for (;;) {
int trap, test;
double low, hi;
char repeat;
//Gather End Points
for (;;) {
printf("Enter endpoints of interval to be integrated (low hi): ");
test = scanf("%lf %lf", &low, &hi);
if (test == EOF)
return 1;
if (test != 2) {
printf("Error: Improperly formatted input\n");
if (flush_line() == EOF)
return 1;
continue; // ask again
}
if (low > hi) {
printf("Error: low must be <= hi\n");
continue;
}
break; // input is valid
}
//Gather number of trapezoids
for (;;) {
printf("Enter number of trapezoids to be used: ");
test = scanf("%d", &trap);
if (test == EOF)
return 1;
if (test != 1) {
printf("Error: Improperly formatted input\n");
if (flush_line() == EOF)
return 1;
continue;
}
if (trap < 1) {
printf("Error: number of trapezoids must be >= 1\n");
continue;
}
break;
}
//Output integrate
printf("Using %d trapezoids, integral between %lf and %lf is %lf\n",
trap, low, hi, integrate(low, hi, trap));
//Prompt user for another time
for (;;) {
printf("\nEvaluate another interval (Y/N)? ");
if (scanf(" %c", &repeat) != 1)
return 1; // unexpected end of file
switch (repeat) {
case 'Y':
case 'y':
break;
case 'N':
case 'n':
return 0;
default:
printf("Error: must enter Y or N\n");
if (flush_line() == EOF)
return 1;
continue;
}
break;
}
}
}
Although it returns the information from the current page, it also shows an error.
The input I entered was: NASA
Answer 0 (score: 1)
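The main error is that you pass an XPath expression to the css function when selecting next_page:
next_page = response.css("//a[@class='btn-next btn']/@href").get()
The next problem is that you yield the request for the next page inside the for loop. This causes a large number of duplicate requests.
So I suggest these changes: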
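A minimal sketch of a callback that avoids both problems: the XPath expression goes through response.xpath() rather than response.css(), and the next-page request is yielded once, after the loop over items. Only the next-page XPath comes from the question. The item selector ("//article"), the "title" field, and the names NEXT_PAGE_XPATH and parse are hypothetical placeholders. It is written as a plain function so it stays self-contained; inside a spider it would be the parse method, with callback=self.parse.

```python
# Sketch only: "//article" and the "title" field are hypothetical
# placeholders; only the next-page XPath is taken from the question.
NEXT_PAGE_XPATH = "//a[@class='btn-next btn']/@href"


def parse(response):
    """Yield the items found on this page, then at most one next-page request."""
    # Hypothetical item extraction; replace with the site's real selectors.
    for article in response.xpath("//article"):
        yield {"title": article.xpath(".//h2//text()").get()}

    # An XPath expression must be passed to response.xpath(), not response.css().
    next_page = response.xpath(NEXT_PAGE_XPATH).get()

    # This sits outside the for loop, so each page schedules the next page once.
    if next_page is not None:
        # response.follow() resolves a relative href against the current URL.
        yield response.follow(next_page, callback=parse)
```

With the request moved out of the loop, a page holding N items produces N items plus a single request instead of N copies of the same request.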