Why does printing with scrapy shell exhibit this inconsistent behaviour?

Date: 2016-03-04 07:19:03

Tags: python python-2.7 scrapy scrapy-spider scrapy-shell

Load the scrapy shell:

 scrapy shell "http://www.worldfootball.net/all_matches/eng-premier-league-2015-2016/"

Try a selector:

 response.xpath('(//table[@class="standard_tabelle"])[1]/tr[not(th)]')

Note that it prints the results.

But now use that selector inside a for statement:

 for row in response.xpath('(//table[@class="standard_tabelle"])[1]/tr[not(th)]'):
     row.xpath(".//a[contains(@href, 'report')]/@href").extract_first()

Hit return twice and nothing is printed. To get the results printed inside the for loop, you have to wrap the selector in print(), like this:

 print(row.xpath(".//a[contains(@href, 'report')]/@href").extract_first())

Why?

EDIT

If I do exactly what Liam's post shows, my output is the session below, with nothing printed. But add the print?

rmp:www rmp$ scrapy shell "http://www.worldfootball.net/all_matches/eng-premier-league-2015-2016/"
2016-03-05 06:13:28 [scrapy] INFO: Scrapy 1.0.5 started (bot: scrapybot)
2016-03-05 06:13:28 [scrapy] INFO: Optional features available: ssl, http11
2016-03-05 06:13:28 [scrapy] INFO: Overridden settings: {'LOGSTATS_INTERVAL': 0, 'DUPEFILTER_CLASS': 'scrapy.dupefilters.BaseDupeFilter'}
2016-03-05 06:13:28 [scrapy] INFO: Enabled extensions: CloseSpider, TelnetConsole, CoreStats, SpiderState
2016-03-05 06:13:28 [scrapy] INFO: Enabled downloader middlewares: HttpAuthMiddleware, DownloadTimeoutMiddleware, UserAgentMiddleware, RetryMiddleware, DefaultHeadersMiddleware, MetaRefreshMiddleware, HttpCompressionMiddleware, RedirectMiddleware, CookiesMiddleware, ChunkedTransferMiddleware, DownloaderStats
2016-03-05 06:13:28 [scrapy] INFO: Enabled spider middlewares: HttpErrorMiddleware, OffsiteMiddleware, RefererMiddleware, UrlLengthMiddleware, DepthMiddleware
2016-03-05 06:13:28 [scrapy] INFO: Enabled item pipelines: 
2016-03-05 06:13:28 [scrapy] DEBUG: Telnet console listening on 127.0.0.1:6023
2016-03-05 06:13:28 [scrapy] INFO: Spider opened
2016-03-05 06:13:29 [scrapy] DEBUG: Crawled (200) <GET http://www.worldfootball.net/all_matches/eng-premier-league-2015-2016/> (referer: None)
[s] Available Scrapy objects:
[s]   crawler    <scrapy.crawler.Crawler object at 0x108c89c10>
[s]   item       {}
[s]   request    <GET http://www.worldfootball.net/all_matches/eng-premier-league-2015-2016/>
[s]   response   <200 http://www.worldfootball.net/all_matches/eng-premier-league-2015-2016/>
[s]   settings   <scrapy.settings.Settings object at 0x10a25bb10>
[s]   spider     <DefaultSpider 'default' at 0x10c1201d0>
[s] Useful shortcuts:
[s]   shelp()           Shell help (print this help)
[s]   fetch(req_or_url) Fetch request (or URL) and update local objects
[s]   view(response)    View response in a browser
2016-03-05 06:13:29 [root] DEBUG: Using default logger
2016-03-05 06:13:29 [root] DEBUG: Using default logger


In [1]: for row in response.xpath('(//table[@class="standard_tabelle"])[1]/tr[not(th)]'):
...:        row.xpath(".//a[contains(@href, 'report')]/@href").extract_first()
...:     
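
The In [1]: prompt in that session shows scrapy shell has started IPython, which by default only displays the value of the last top-level expression in the input, so a bare expression inside a for body is never echoed. One way to see the values without print is to make the whole extraction a single top-level expression; a sketch using the same XPath (untested):

 # typed as one expression, the resulting list is echoed by IPython
 # as well as by the plain Python console
 [row.xpath(".//a[contains(@href, 'report')]/@href").extract_first()
  for row in response.xpath('(//table[@class="standard_tabelle"])[1]/tr[not(th)]')]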

1 Answer:

Answer 0 (score: 1)

This works for me.

$ scrapy shell "http://www.worldfootball.net/all_matches/eng-premier-league-2015-2016/"



>>> for row in response.xpath('(//table[@class="standard_tabelle"])[1]/tr[not(th)]'):
...     row.xpath(".//a[contains(@href, 'report')]/@href").extract_first()
...

u'/report/premier-league-2015-2016-manchester-united-tottenham-hotspur/'
u'/report/premier-league-2015-2016-afc-bournemouth-aston-villa/'
u'/report/premier-league-2015-2016-everton-fc-watford-fc/'
u'/report/premier-league-2015-2016-leicester-city-sunderland-afc/'
u'/report/premier-league-2015-2016-norwich-city-crystal-palace/'
u'/report/premier-league-2015-2016-chelsea-fc-swansea-city/'
u'/report/premier-league-2015-2016-arsenal-fc-west-ham-united/'
u'/report/premier-league-2015-2016-newcastle-united-southampton-fc/'
u'/report/premier-league-2015-2016-stoke-city-liverpool-fc/'
u'/report/premier-league-2015-2016-west-bromwich-albion-manchester-city/'

Doesn't this show the same results for you?
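
The inconsistency is most likely down to which interactive console scrapy shell started, not to Scrapy itself: the >>> prompt above is the plain Python console, which echoes every bare expression it compiles interactively, even inside the for body, whereas the In [1]: prompt in the question is IPython, which by default only displays the last top-level expression and therefore shows nothing for the loop. A minimal sketch of the difference, independent of Scrapy:

 # At the plain ">>>" prompt this echoes 0, 1 and 2; under IPython's default
 # settings it prints nothing, because the for statement is not itself an
 # expression whose value can be displayed.
 for i in range(3):
     i

Wrapping the extraction in print(), as the question notes, sidesteps the echo rules entirely and behaves the same in both consoles.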