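I am completely new to Scrapy. I installed it and worked through the tutorial without trouble. I can also start the Scrapy shell with:
scrapy shell "website"
The table below is nested inside other higher-level elements such as divs. Using Scrapy, how do I extract this table? Do I need to walk through the divs, or can I jump straight to the table and extract the information?
What I am looking for is code along these lines that returns every item in the table as a dict, ideally as a fully self-contained example that I can run and learn from: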
def parse(self, response):
    hxs = HtmlXPathSelector(response)
    divs = hxs.select('//tr[@class="someclass"]')
    for div in divs:
        item = {}  # start a fresh dict for each matched row
        item['var1'] = div.select('//table/tbody/tr[*]/td[2]/p/span[2]').extract()
        yield item
Note: I removed the repetitive content from the end of the HTML.
<div class="full arrangeable" data-id="calendar"> <div class="full row" data-row="0"> <div class="full column" data-column="0"> <div class="cell" data-cell="0" data-compid="Calendar"> <a name="Calendar" class="anchor"></a> <div class="flexShell"> <div class="flexBox calendar" id="flexBox_flex_calendar_mainCal" data-more="0" data-checkstate="0" data-initcallback="calendar" data-updatecallback="calendar" data-visiblejs="[]" data-disablejs="[]"> <form action="flex.php" method="post" onsubmit="return Flex.prepareSubmit(this);" data-submit="options"> <input name="s" value="" type="hidden"> <input name="securitytoken" value="guest" type="hidden"> <input name="do" value="saveoptions" type="hidden"> <input name="setdefault" value="no" type="hidden"> <input name="ignoreinput" value="no" type="hidden"> <input name="flex[Calendar_mainCal][idSuffix]" value="" type="hidden"> <input name="flex[Calendar_mainCal][_flexForm_]" value="flexForm" type="hidden"> <input name="flex[Calendar_mainCal][modelData]" value="YToxMDp7czoxMToicGFfY29udHJvbHMiO3M6MTc6ImNhbGVuZGFyfENhbGVuZGFyIjtzOjE2OiJwYV9pbmplY3RyZXZlcnNlIjtiOjA7czoxNDoidmlld2luZ0RlZmF1bHQiO3M6OToiVGhpcyBXZWVrIjtzOjExOiJwcmV2Q2FsTGluayI7czoxNDoiZGF5PW5vdjMwLjIwMTEiO3M6MTE6Im5leHRDYWxMaW5rIjtzOjEzOiJkYXk9ZGVjMi4yMDExIjtzOjc6InByZXZBbHQiO3M6MjY6Ik5vdiAzMCwgMjAxMSAtIERlYyAxLCAyMDExIjtzOjc6Im5leHRBbHQiO3M6MjU6IkRlYyAyLCAyMDExIC0gRGVjIDMsIDIwMTEiO3M6MTA6Im5leHRIaWRkZW4iO2I6MDtzOjEwOiJwcmV2SGlkZGVuIjtiOjA7czo5OiJyaWdodExpbmsiO047fQ==" type="hidden"> <div class="head"> <ul> <li class="left pagination"><a title="Nov 30, 2011 - Dec 1, 2011" class="prev" href="calendar.php?day=nov30.2011"><span><</span></a></li> <li class="left"><a class="highlight light options flexTitle"><span><strong>Dec 1, 2011</strong></span></a></li> <li class="left pagination shadow"><a title="Dec 2, 2011 - Dec 3, 2011" class="next" href="calendar.php?day=dec2.2011"><span><</span></a></li> <li class="loader"></li> <li class="right imagefade noborder"><a 
class="highlight noborder filters flexFilter"><div class="fade"></div><span>Filter</span></a></li> <li class="right"> <a class="highlight noborder menu"> <span>This Week</span> <span class="dropdown"></span> <div> <div class="title">Default View:</div> <div data-value="yesterday">Yesterday</div> <div data-value="today">Today</div> <div data-value="tomorrow">Tomorrow</div> <div data-value="thisweek">This Week</div> </div> </a> </li> <li class="right shadow"><a class="highlight noborder upnext"><span>Up Next</span></a></li> <li class="layoutcontrols"><div class="pagearrange_homepage_controls"> </div> <div class="pagearrange_controls"> <span data-registered="1" class="onHomepage" title="Copy Block to Your Homepage"></span> </div></li></ul> </div> <div class="options sidebyside"> <div class="half"> <div class="shell flexoptions"> <div class="frame"> <input name="flex[Calendar_mainCal][calendardefault]" id="flex[Calendar_mainCal][calendardefault]" value="thisweek" type="hidden"> <div class="half"> <div class="pad"> <p class="title"><strong>Begin Date</strong></p> <input data-enterhandled="1" data-pickerid="flexDatePicker_1" name="flex[Calendar_mainCal][begindate]" data-container="Calendar_mainCal_begindate" class="bginput flexDatePicker" value="December 1, 2011" data-range="2007,2015" type="text"> <div class="minicalendar" id="flexDatePicker_Calendar_mainCal_begindate"><div class="pickerheader"><table class="menu"><tbody><tr><td><a class="calJump year back">«</a></td><td><a class="calJump month back">‹</a></td><td class="current">December 2011</td><td><a class="calJump month forward">›</a></td><td><a class="calJump year forward">»</a></td></tr></tbody></table></div><div class="pickercontainer"><div class="table"><div class="row header"><div class="day header">Sun</div><div class="day header">Mon</div><div class="day header">Tue</div><div class="day header">Wed</div><div class="day header">Thu</div><div class="day header">Fri</div><div class="day 
header">Sat</div></div><div class="row"><div data-date="November 27, 2011" class="day other"><a>27</a></div><div data-date="November 28, 2011" class="day other"><a>28</a></div><div data-date="November 29, 2011" class="day other"><a>29</a></div><div data-date="November 30, 2011" class="day other"><a>30</a></div><div data-date="December 1, 2011" class="day active"><a>1</a></div><div data-date="December 2, 2011" class="day"><a>2</a></div><div data-date="December 3, 2011" class="day"><a>3</a></div></div><div class="row"><div data-date="December 4, 2011" class="day"><a>4</a></div><div data-date="December 5, 2011" class="day"><a>5</a></div><div data-date="December 6, 2011" class="day"><a>6</a></div><div data-date="December 7, 2011" class="day"><a>7</a></div><div data-date="December 8, 2011" class="day"><a>8</a></div><div data-date="December 9, 2011" class="day"><a>9</a></div><div data-date="December 10, 2011" class="day"><a>10</a></div></div><div class="row"><div data-date="December 11, 2011" class="day"><a>11</a></div><div data-date="December 12, 2011" class="day"><a>12</a></div><div data-date="December 13, 2011" class="day"><a>13</a></div><div data-date="December 14, 2011" class="day"><a>14</a></div><div data-date="December 15, 2011" class="day"><a>15</a></div><div data-date="December 16, 2011" class="day"><a>16</a></div><div data-date="December 17, 2011" class="day"><a>17</a></div></div><div class="row"><div data-date="December 18, 2011" class="day"><a>18</a></div><div data-date="December 19, 2011" class="day"><a>19</a></div><div data-date="December 20, 2011" class="day"><a>20</a></div><div data-date="December 21, 2011" class="day"><a>21</a></div><div data-date="December 22, 2011" class="day"><a>22</a></div><div data-date="December 23, 2011" class="day"><a>23</a></div><div data-date="December 24, 2011" class="day"><a>24</a></div></div><div class="row"><div data-date="December 25, 2011" class="day"><a>25</a></div><div data-date="December 26, 2011" 
class="day"><a>26</a></div><div data-date="December 27, 2011" class="day"><a>27</a></div><div data-date="December 28, 2011" class="day"><a>28</a></div><div data-date="December 29, 2011" class="day"><a>29</a></div><div data-date="December 30, 2011" class="day"><a>30</a></div><div data-date="December 31, 2011" class="day"><a>31</a></div></div></div></div></div> </div> </div> <div class="half last"> <div class="pad"> <p class="title"><strong>End Date</strong> (<a data-pickerid="flexDatePicker_2" class="internal noneLink_Calendar_mainCal_enddate">none</a>)</p> <input data-enterhandled="1" data-pickerid="flexDatePicker_2" name="flex[Calendar_mainCal][enddate]" data-container="Calendar_mainCal_enddate" class="bginput flexDatePicker" value="December 1, 2011" data-range="2007,2015" data-nonehandler="noneLink_Calendar_mainCal_enddate" type="text"> <div class="minicalendar" id="flexDatePicker_Calendar_mainCal_enddate"><div class="pickerheader"><table class="menu"><tbody><tr><td><a class="calJump year back">«</a></td><td><a class="calJump month back">‹</a></td><td class="current">December 2011</td><td><a class="calJump month forward">›</a></td><td><a class="calJump year forward">»</a></td></tr></tbody></table></div><div class="pickercontainer"><div class="table"><div class="row header"><div class="day header">Sun</div><div class="day header">Mon</div><div class="day header">Tue</div><div class="day header">Wed</div><div class="day header">Thu</div><div class="day header">Fri</div><div class="day header">Sat</div></div><div class="row"><div data-date="November 27, 2011" class="day other"><a>27</a></div><div data-date="November 28, 2011" class="day other"><a>28</a></div><div data-date="November 29, 2011" class="day other"><a>29</a></div><div data-date="November 30, 2011" class="day other"><a>30</a></div><div data-date="December 1, 2011" class="day active"><a>1</a></div><div data-date="December 2, 2011" class="day"><a>2</a></div><div data-date="December 3, 2011" 
class="day"><a>3</a></div></div><div class="row"><div data-date="December 4, 2011" class="day"><a>4</a></div><div data-date="December 5, 2011" class="day"><a>5</a></div><div data-date="December 6, 2011" class="day"><a>6</a></div><div data-date="December 7, 2011" class="day"><a>7</a></div><div data-date="December 8, 2011" class="day"><a>8</a></div><div data-date="December 9, 2011" class="day"><a>9</a></div><div data-date="December 10, 2011" class="day"><a>10</a></div></div><div class="row"><div data-date="December 11, 2011" class="day"><a>11</a></div><div data-date="December 12, 2011" class="day"><a>12</a></div><div data-date="December 13, 2011" class="day"><a>13</a></div><div data-date="December 14, 2011" class="day"><a>14</a></div><div data-date="December 15, 2011" class="day"><a>15</a></div><div data-date="December 16, 2011" class="day"><a>16</a></div><div data-date="December 17, 2011" class="day"><a>17</a></div></div><div class="row"><div data-date="December 18, 2011" class="day"><a>18</a></div><div data-date="December 19, 2011" class="day"><a>19</a></div><div data-date="December 20, 2011" class="day"><a>20</a></div><div data-date="December 21, 2011" class="day"><a>21</a></div><div data-date="December 22, 2011" class="day"><a>22</a></div><div data-date="December 23, 2011" class="day"><a>23</a></div><div data-date="December 24, 2011" class="day"><a>24</a></div></div><div class="row"><div data-date="December 25, 2011" class="day"><a>25</a></div><div data-date="December 26, 2011" class="day"><a>26</a></div><div data-date="December 27, 2011" class="day"><a>27</a></div><div data-date="December 28, 2011" class="day"><a>28</a></div><div data-date="December 29, 2011" class="day"><a>29</a></div><div data-date="December 30, 2011" class="day"><a>30</a></div><div data-date="December 31, 2011" class="day"><a>31</a></div></div></div></div></div> </div> </div> <div class="full"> <div class="pad"> <ul class="periodshortcuts"> <li><a class="internal" data-range="April 12, 
2015|April 18, 2015">This Week</a></li> <li><a class="internal" data-range="April 19, 2015|April 25, 2015">Next Week</a></li> <li><a class="internal" data-range="April 5, 2015|April 11, 2015">Last Week</a></li> <li><a class="internal" data-range="April 1, 2015|April 30, 2015">This Month</a></li> <li><a class="internal" data-range="May 1, 2015|May 31, 2015">Next Month</a></li> <li><a class="internal" data-range="March 1, 2015|March 31, 2015">Last Month</a></li> </ul> </div> </div> </div> <table> <tbody><tr> <td class="flexOptionsError"></td> <td class="flexSubmitButtons"> <input class="button flexOptionsSubmit" name="flexSettings" value="Apply Settings" type="submit"> <input class="button flexCancelOptions" value="Cancel" type="button"> </td> <td class="flexDefaults"></td> </tr> </tbody></table> </div> </div> <div class="half last"> <div class="shell flexfilters"> <div class="frame"> <table class="pad"> <tbody><tr> <td class="pad expectedimpact" width="65%"> <div class="flexErrorCheck" data-type="requireOneCheck" data-error="No Impact Selected"> <p class="title"> <strong>Expected Impact</strong>
(<a class="toggleOptions internal" data-target="flex[Calendar_mainCal][impacts]" data-toggle="all">all</a>, <a class="toggleOptions internal" data-target="flex[Calendar_mainCal][impacts]" data-toggle="none">none</a>)
</p> <table class="arrayCheckbox requireOneCheck impacts"> <tbody><tr> <td> <table class="items"> <tbody><tr> <td><input name="flex[Calendar_mainCal][impacts][high]" id="flex[Calendar_mainCal][impacts]_high" value="high" checked="checked" data-isdefault="true" class="requireOneCheck" type="checkbox"></td> <td class="full"><label for="flex[Calendar_mainCal][impacts]_high"><span class="impact high" title="High Impact Expected"></span></label></td> </tr> </tbody></table> </td> <td valign="top"> <table class="items"> <tbody><tr> <td><input name="flex[Calendar_mainCal][impacts][medium]" id="flex[Calendar_mainCal][impacts]_medium" value="medium" checked="checked" data-isdefault="true" class="requireOneCheck" type="checkbox"></td> <td class="full"><label for="flex[Calendar_mainCal][impacts]_medium"><span class="impact medium" title="Medium Impact Expected"></span></label></td> </tr> </tbody></table> </td> <td valign="top"> <table class="items"> <tbody><tr> <td><input name="flex[Calendar_mainCal][impacts][low]" id="flex[Calendar_mainCal][impacts]_low" value="low" checked="checked" data-isdefault="true" class="requireOneCheck" type="checkbox"></td> <td class="full"><label for="flex[Calendar_mainCal][impacts]_low"><span class="impact low" title="Low Impact Expected"></span></label></td> </tr> </tbody></table> </td> <td valign="top"> <table class="items"> <tbody><tr> <td><input name="flex[Calendar_mainCal][impacts][holiday]" id="flex[Calendar_mainCal][impacts]_holiday" value="holiday" checked="checked" data-isdefault="true" class="requireOneCheck" type="checkbox"></td> <td class="full"><label for="flex[Calendar_mainCal][impacts]_holiday"><span class="impact holiday" title="Non-Economic"></span></label></td> </tr> </tbody></table> </td> <td valign="top"> <table class="items"> </table> </td> </tr> </tbody></table> <input name="flex[Calendar_mainCal][_cbarray_]" value="1" type="hidden"> </div> </td> <td class="pad currencies" rowspan="2" width="35%"> <div class="flexErrorCheck" 
data-type="requireOneCheck" data-error="No Currencies Selected"> <p class="title"> <strong>Currencies</strong>
(<a class="toggleOptions internal" data-target="flex[Calendar_mainCal][currencies]" data-toggle="all">all</a>, <a class="toggleOptions internal" data-target="flex[Calendar_mainCal][currencies]" data-toggle="none">none</a>)
</p> <table class="arrayCheckbox requireOneCheck"> <tbody><tr> <td> <table class="items"> <tbody><tr> <td><input name="flex[Calendar_mainCal][currencies][aud]" id="flex[Calendar_mainCal][currencies]_aud" value="aud" checked="checked" data-isdefault="true" class="requireOneCheck" type="checkbox"></td> <td class="full"><label for="flex[Calendar_mainCal][currencies]_aud">AUD</label></td> </tr> <tr> <td><input name="flex[Calendar_mainCal][currencies][cad]" id="flex[Calendar_mainCal][currencies]_cad" value="cad" checked="checked" data-isdefault="true" class="requireOneCheck" type="checkbox"></td> <td class="full"><label for="flex[Calendar_mainCal][currencies]_cad">CAD</label></td> </tr> <tr> <td><input name="flex[Calendar_mainCal][currencies][chf]" id="flex[Calendar_mainCal][currencies]_chf" value="chf" checked="checked" data-isdefault="true" class="requireOneCheck" type="checkbox"></td> <td class="full"><label for="flex[Calendar_mainCal][currencies]_chf">CHF</label></td> </tr> <tr> <td><input name="flex[Calendar_mainCal][currencies][cny]" id="flex[Calendar_mainCal][currencies]_cny" value="cny" checked="checked" data-isdefault="true" class="requireOneCheck" type="checkbox"></td> <td class="full"><label for="flex[Calendar_mainCal][currencies]_cny">CNY</label></td> </tr> <tr> <td><input name="flex[Calendar_mainCal][currencies][eur]" id="flex[Calendar_mainCal][currencies]_eur" value="eur" checked="checked" data-isdefault="true" class="requireOneCheck" type="checkbox"></td> <td class="full"><label for="flex[Calendar_mainCal][currencies]_eur">EUR</label></td> </tr> <tr> <td><input name="flex[Calendar_mainCal][currencies][gbp]" id="flex[Calendar_mainCal][currencies]_gbp" value="gbp" checked="checked" data-isdefault="true" class="requireOneCheck" type="checkbox"></td> <td class="full"><label for="flex[Calendar_mainCal][currencies]_gbp">GBP</label></td> </tr> <tr> <td><input name="flex[Calendar_mainCal][currencies][jpy]" id="flex[Calendar_mainCal][currencies]_jpy" value="jpy" 
checked="checked" data-isdefault="true" class="requireOneCheck" type="checkbox"></td> <td class="full"><label for="flex[Calendar_mainCal][currencies]_jpy">JPY</label></td> </tr> <tr> <td><input name="flex[Calendar_mainCal][currencies][nzd]" id="flex[Calendar_mainCal][currencies]_nzd" value="nzd" checked="checked" data-isdefault="true" class="requireOneCheck" type="checkbox"></td> <td class="full"><label for="flex[Calendar_mainCal][currencies]_nzd">NZD</label></td> </tr> <tr> <td><input name="flex[Calendar_mainCal][currencies][usd]" id="flex[Calendar_mainCal][currencies]_usd" value="usd" checked="checked" data-isdefault="true" class="requireOneCheck" type="checkbox"></td> <td class="full"><label for="flex[Calendar_mainCal][currencies]_usd">USD</label></td> </tr> </tbody></table> </td> </tr> </tbody></table> <input name="flex[Calendar_mainCal][_cbarray_]" value="1" type="hidden"> </div> </td> </tr> <tr> <td class="pad"> <div class="flexErrorCheck" data-type="requireOneCheck" data-error="No Types Selected"> <p class="title"> <strong>Event Types</strong>
(<a class="toggleOptions internal" data-target="flex[Calendar_mainCal][eventtypes]" data-toggle="all">all</a>, <a class="toggleOptions internal" data-target="flex[Calendar_mainCal][eventtypes]" data-toggle="none">none</a>)
</p> <table class="arrayCheckbox requireOneCheck"> <tbody><tr> <td> <table class="items"> <tbody><tr> <td><input name="flex[Calendar_mainCal][eventtypes][growth]" id="flex[Calendar_mainCal][eventtypes]_growth" value="growth" checked="checked" data-isdefault="true" class="requireOneCheck" type="checkbox"></td> <td class="full"><label for="flex[Calendar_mainCal][eventtypes]_growth">Growth</label></td> </tr> <tr> <td><input name="flex[Calendar_mainCal][eventtypes][inflation]" id="flex[Calendar_mainCal][eventtypes]_inflation" value="inflation" checked="checked" data-isdefault="true" class="requireOneCheck" type="checkbox"></td> <td class="full"><label for="flex[Calendar_mainCal][eventtypes]_inflation">Inflation</label></td> </tr> <tr> <td><input name="flex[Calendar_mainCal][eventtypes][employment]" id="flex[Calendar_mainCal][eventtypes]_employment" value="employment" checked="checked" data-isdefault="true" class="requireOneCheck" type="checkbox"></td> <td class="full"><label for="flex[Calendar_mainCal][eventtypes]_employment">Employment</label></td> </tr> <tr> <td><input name="flex[Calendar_mainCal][eventtypes][centralbank]" id="flex[Calendar_mainCal][eventtypes]_centralbank" value="centralbank" checked="checked" data-isdefault="true" class="requireOneCheck" type="checkbox"></td> <td class="full"><label for="flex[Calendar_mainCal][eventtypes]_centralbank">Central Bank</label></td> </tr> <tr> <td><input name="flex[Calendar_mainCal][eventtypes][bonds]" id="flex[Calendar_mainCal][eventtypes]_bonds" value="bonds" checked="checked" data-isdefault="true" class="requireOneCheck" type="checkbox"></td> <td class="full"><label for="flex[Calendar_mainCal][eventtypes]_bonds">Bonds</label></td> </tr> </tbody></table> </td> <td valign="top"> <table class="items"> <tbody><tr> <td><input name="flex[Calendar_mainCal][eventtypes][housing]" id="flex[Calendar_mainCal][eventtypes]_housing" value="housing" checked="checked" data-isdefault="true" class="requireOneCheck" type="checkbox"></td> 
<td class="full"><label for="flex[Calendar_mainCal][eventtypes]_housing">Housing</label></td> </tr> <tr> <td><input name="flex[Calendar_mainCal][eventtypes][sentiment]" id="flex[Calendar_mainCal][eventtypes]_sentiment" value="sentiment" checked="checked" data-isdefault="true" class="requireOneCheck" type="checkbox"></td> <td class="full"><label for="flex[Calendar_mainCal][eventtypes]_sentiment">Consumer Surveys</label></td> </tr> <tr> <td><input name="flex[Calendar_mainCal][eventtypes][pmi]" id="flex[Calendar_mainCal][eventtypes]_pmi" value="pmi" checked="checked" data-isdefault="true" class="requireOneCheck" type="checkbox"></td> <td class="full"><label for="flex[Calendar_mainCal][eventtypes]_pmi">Business Surveys</label></td> </tr> <tr> <td><input name="flex[Calendar_mainCal][eventtypes][speeches]" id="flex[Calendar_mainCal][eventtypes]_speeches" value="speeches" checked="checked" data-isdefault="true" class="requireOneCheck" type="checkbox"></td> <td class="full"><label for="flex[Calendar_mainCal][eventtypes]_speeches">Speeches</label></td> </tr> <tr> <td><input name="flex[Calendar_mainCal][eventtypes][misc]" id="flex[Calendar_mainCal][eventtypes]_misc" value="misc" checked="checked" data-isdefault="true" class="requireOneCheck" type="checkbox"></td> <td class="full"><label for="flex[Calendar_mainCal][eventtypes]_misc">Misc</label></td> </tr> </tbody></table> </td> </tr> </tbody></table> <input name="flex[Calendar_mainCal][_cbarray_]" value="1" type="hidden"> </div> </td> </tr> </tbody></table> </div> <table> <tbody><tr> <td class="flexFilterError"></td> <td class="flexSubmitButtons"> <input class="button flexFilterSubmit" name="flexFilters" value="Apply Filter" type="submit"> <input class="button flexCancelFilters" value="Cancel" type="button"> </td> <td class="flexDefaults"></td> </tr> </tbody></table> </div> </div> </div> </form> <table> <thead> <tr> <th class="col1">Date</th> <th class="col2"><a href="timezone.php" title="Time Options">9:34pm</a></th> <th 
class="col3">Currency</th> <th class="col4">Impact</th> <th class="col5"> </th> <th class="col6">Detail</th> <th class="col7">Actual</th> <th class="col8">Forecast</th> <th class="col9">Previous</th> <th class="col10">Graph</th> </tr> </thead> <tbody><tr class="borderfix"><td></td></tr> <tr class="calendar_row newday" data-eventid="36121"> <td class="date"><span class="date">Thu<span>Dec 1</span></span></td> <td class="time">1:30am</td> <td class="currency">AUD</td> <td class="impact"> <span title="Medium Impact Expected" class="medium"></span> </td> <td class="event"><span>Commodity Prices y/y</span></td> <td class="detail"><a title="Open Detail" class="calendar_detail level1" data-level="1"></a></td> <td class="actual">
18.1%
</td> <td class="forecast"></td> <td class="previous"><span class="revised" title="Revised From 19.4%">19.6%</span></td> <td class="graph"><a title="Open Graph" class="calendar_chart"></a></td> </tr> <tr class="calendar_row" data-eventid="35311"> <td class="date"></td> <td class="time">2:45am</td> <td class="currency">CHF</td> <td class="impact"> <span title="Medium Impact Expected" class="medium"></span> </td> <td class="event"><span>GDP q/q</span></td> <td class="detail"><a title="Open Detail" class="calendar_detail level1" data-level="1"></a></td> <td class="actual">
0.2%
</td> <td class="forecast">0.2%</td> <td class="previous"><span class="revised better" title="Revised From 0.4%">0.5%</span></td> <td class="graph"><a title="Open Graph" class="calendar_chart"></a></td> </tr><tr class="details " data-eventid="35311"><td align="center"></td><td colspan="8" class="calendar_detail_cell details nest" align="center"></td><td align="center"></td></tr> <tr class="calendar_row" data-eventid="41782"> <td class="date"></td> <td class="time">4:00am</td> <td class="currency">EUR</td> <td class="impact"> <span title="Medium Impact Expected" class="medium"></span> </td> <td class="event"><span>ECB President Draghi Speaks</span></td> <td class="detail"><a title="Open Detail" class="calendar_detail level1" data-level="1"></a></td> <td class="actual"> </td> <td class="forecast"></td> <td class="previous"></td> <td class="graph"></td> </tr> <tr class="calendar_row" data-eventid="43848"> <td class="date"></td> <td class="time">4:15am</td> <td class="currency">EUR</td> <td class="impact"> <span title="Medium Impact Expected" class="medium"></span> </td> <td class="event"><span>Spanish Manufacturing PMI</span></td> <td class="detail"><a title="Open Detail" class="calendar_detail level0" data-level="0"></a></td> <td class="actual">
43.8
</td> <td class="forecast"></td> <td class="previous">43.9</td> <td class="graph"><a title="Open Graph" class="calendar_chart"></a></td> </tr> <tr class="calendar_row" data-eventid="35078"> <td class="date"></td> <td class="time">4:30am</td> <td class="currency">CHF</td> <td class="impact"> <span title="Medium Impact Expected" class="medium"></span> </td> <td class="event"><span>Manufacturing PMI</span></td> <td class="detail"><a title="Open Detail" class="calendar_detail level1" data-level="1"></a></td> <td class="actual"> <span class="worse">44.8</span> </td> <td class="forecast">46.6</td> <td class="previous">46.9</td> <td class="graph"><a title="Open Graph" class="calendar_chart"></a></td> </tr><tr class="details " data-eventid="35078"><td align="center"></td><td colspan="8" class="calendar_detail_cell details nest" align="center"></td><td align="center"></td></tr> <tr class="calendar_row" data-eventid="43502"> <td class="date"></td> <td class="time">4:45am</td> <td class="currency">EUR</td> <td class="impact"> <span title="Medium Impact Expected" class="medium"></span> </td> <td class="event"><span>Italian Manufacturing PMI</span></td> <td class="detail"><a title="Open Detail" class="calendar_detail level0" data-level="0"></a></td> <td class="actual"> <span class="better">44.0</span> </td> <td class="forecast">42.8</td> <td class="previous">43.3</td> <td class="graph"><a title="Open Graph" class="calendar_chart"></a></td> </tr> <tr class="calendar_row" data-eventid="58942"> <td class="date"></td> <td class="time">4:50am</td> <td class="currency">EUR</td> <td class="impact"> <span title="Low Impact Expected" class="low"></span> </td> <td class="event"><span>French Final Manufacturing PMI</span></td> <td class="detail"><a title="Open Detail" class="calendar_detail level0" data-level="0"></a></td> <td class="actual">
47.3
</td> <td class="forecast">47.6</td> <td class="previous">47.6</td> <td class="graph"><a title="Open Graph" class="calendar_chart"></a></td> </tr> <tr class="calendar_row" data-eventid="59001"> <td class="date"></td> <td class="time">4:55am</td> <td class="currency">EUR</td> <td class="impact"> <span title="Low Impact Expected" class="low"></span> </td> <td class="event"><span>German Final Manufacturing PMI</span></td> <td class="detail"><a title="Open Detail" class="calendar_detail level0" data-level="0"></a></td> <td class="actual">
47.9
</td> <td class="forecast">48.0</td> <td class="previous">47.9</td> <td class="graph"><a title="Open Graph" class="calendar_chart"></a></td> </tr> <tr class="calendar_row" data-eventid="33221"> <td class="date"></td> <td class="time">5:00am</td> <td class="currency">EUR</td> <td class="impact"> <span title="Low Impact Expected" class="low"></span> </td> <td class="event"><span>Final Manufacturing PMI</span></td> <td class="detail"><a title="Open Detail" class="calendar_detail level1" data-level="1"></a></td> <td class="actual">
46.4
</td> <td class="forecast">46.4</td> <td class="previous">46.4</td> <td class="graph"><a title="Open Graph" class="calendar_chart"></a></td> </tr> <tr class="calendar_row" data-eventid="33165"> <td class="date"></td> <td class="time">5:30am</td> <td class="currency">GBP</td> <td class="impact"> <span title="High Impact Expected" class="high"></span> </td> <td class="event"><span>Manufacturing PMI</span></td> <td class="detail"><a title="Open Detail" class="calendar_detail level1" data-level="1"></a></td> <td class="actual"> <span class="better">47.6</span> </td> <td class="forecast">47.1</td> <td class="previous"><span class="revised better" title="Revised From 47.4">47.8</span></td> <td class="graph"><a title="Open Graph" class="calendar_chart"></a></td> </tr> <tr class="calendar_row nogrid" data-eventid="57061"> <td class="date"></td> <td class="time"></td> <td class="currency">GBP</td> <td class="impact"> <span title="Low Impact Expected" class="low"></span> </td> <td class="event"><span>FPC Statement</span></td> <td class="detail"><a title="Open Detail" class="calendar_detail level0" data-level="0"></a></td> <td class="actual"> </td> <td class="forecast"></td> <td class="previous"></td> <td class="graph"></td> </tr> <tr class="calendar_row" data-eventid="42399"> <td class="date"></td> <td class="time">6:01am</td> <td class="currency">EUR</td> <td class="impact"> <span title="Medium Impact Expected" class="medium"></span> </td> <td class="event"><span>French 10-y Bond Auction</span></td> <td class="detail"><a title="Open Detail" class="calendar_detail level0" data-level="0"></a></td> <td class="actual">
3.18|3.1
</td> <td class="forecast"></td> <td class="previous">3.22|2.2</td> <td class="graph"></td> </tr> <tr class="calendar_row" data-eventid="35087"> <td class="date"></td> <td class="time">6:30am</td> <td class="currency">GBP</td> <td class="impact"> <span title="Medium Impact Expected" class="medium"></span> </td> <td class="event"><span>BOE Financial Stability Report</span></td> <td class="detail"><a title="Open Detail" class="calendar_detail level1" data-level="1"></a></td> <td class="actual"> </td> <td class="forecast"></td> <td class="previous"></td> <td class="graph"></td> </tr> <tr class="calendar_row" data-eventid="42468"> <td class="date"></td> </tbody></table> <div class="foot"> <ul> <li class="more"> <a href="#" class="flexMore"><span>More</span><span class="loader"></span></a> </li> </ul> </div> </div> </div> </div> </div> </div> </div>
Answer 0 (score: 1)
You can locate the table by a th element inside it, for example:
response.xpath("//table[.//th[. = 'Date']]")
You can also locate the table by checking its parent:
response.css("div#flexBox_flex_calendar_mainCal > table")
A working example from the Scrapy shell (printing the time values in the table):
In [1]: for row in response.css("div#flexBox_flex_calendar_mainCal table tr.calendar_row"):
            print row.xpath(".//td[@class='time']/text()").extract()
[u'1:30am']
[u'2:45am']
[u'4:00am']
[u'4:15am']
[u'4:30am']
[u'4:45am']
[u'4:50am']
[u'4:55am']
[u'5:00am']
[u'5:30am']
[]
[u'6:01am']
[u'6:30am']
[u'6:36am']