Removing items from a dict while looping

Time: 2018-02-06 13:18:22

Tags: python csv scrapy

I am using scrapy to get some data from a website. Here is some of my spider's code:

for item in response.css('div.project-content  table tr'):
    var["installment"] = item.css(' td::text').extract_first()
    if var['installment'] is None:
        del var['installment']
    print(var)
It looks simple enough, right? Here is my output:

{}
{'installment': '1st Installment'}
{'installment': '2nd Installment'}
{'installment': '3rd Installment'}
{'installment': '4th Installment'}
{'installment': '5th Installment'}
{'installment': '6th Installment'}
{'installment': '7th Installment'}

The problem is that when I save the data to a CSV file, the first {} produces an empty row, which messes things up.

How can I get rid of the first {} so that I can get rid of the empty row?

2 answers:

Answer 0: (score: 0)

You can use the default value of the dict get function:

Here is an example for your case:

var = {}
for item in response.css('div.project-content  table tr'):
    var["installment"] = item.css(' td::text').extract_first()
    if var.get('installment', None) is not None:
        print(var)
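
Since the goal is a CSV file without the blank row, the same check can be applied at the point where rows are written. The following is only a minimal sketch: the function name write_installments and the sample list are hypothetical, and the list stands in for the values that extract_first() would return for each table row (None for a row with no td text).

```python
import csv
import io


def write_installments(values, out):
    """Write each non-None installment value to `out` as one CSV row."""
    writer = csv.DictWriter(out, fieldnames=["installment"])
    writer.writeheader()
    for text in values:
        if text is None:
            continue  # nothing was extracted for this row, so skip it
        # Build a fresh dict per row instead of reusing and deleting keys.
        writer.writerow({"installment": text})


# Hypothetical stand-in for the per-row results of extract_first().
sample = [None, "1st Installment", "2nd Installment"]
buffer = io.StringIO()
write_installments(sample, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

Building a new dict for each row avoids the add-then-delete step from the question, so an empty dict never reaches the writer.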

Answer 1: (score: -1)

Skip printing empty elements

Add a condition to your code so that it only prints elements that have content, for example:

if var:
  print(var)

(An empty dict evaluates to False in a Python condition.)
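
The same truthiness check can filter a whole list of collected dicts before they are saved. This is a small sketch; drop_empty and collected are illustrative names that do not appear in the question.

```python
def drop_empty(items):
    """Return only the dicts that contain at least one key."""
    return [d for d in items if d]


# Hypothetical list of dicts gathered in the loop, including the empty one.
collected = [
    {},
    {"installment": "1st Installment"},
    {"installment": "2nd Installment"},
]
non_empty = drop_empty(collected)
print(non_empty)
```

Note that this relies on the key being deleted when the value is None, as in the question's code. A dict such as {'installment': None} still has one key, so it is truthy and would not be filtered out.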