Struggling with regex in pandas

Time: 2016-06-22 09:58:09

Tags: regex pandas

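I'm trying to tidy up some addresses by removing characters from the start of a string column in pandas, but I'm struggling to find the best regex for the job.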

The text generally looks like this:

1 / BAA Temporary Building, Land Opposite Park
3 / BAC Methodist Church Hall, Park Drive, Bar
4 / BSA St Annes Church Hall , Lynton Avenue

My attempt:

df.address.str.replace(r"\d+ / [A-Z]{3}", "")

This works in most cases, but returns blank for the following:

2 / BAB, BAD Barlaston Village Hall, Longton R

6 / BSC, BSD Holy Trinity Church Hall 

How can I make the pattern optionally match the additional three-character element?

1 answer:

Answer 0 (score: 2)

Try this:

<div id="option">

  <input type="radio" id="all" name="datesize" value="731" checked="checked">
  <label for="all">all</label>
  <input type="radio" id="year" name="datesize" value="365">
  <label for="year">year</label>
  <input type="radio" id="season" name="datesize" value="91">
  <label for="season">quarter</label>
  <input type="radio" id="month" name="datesize" value="30">
  <label for="month">month</label>
  <div id="chooseall"></div>
  <div id="chooseyear" style="visibility:hidden">
    <input type="radio" id="yearpick0" name="yearpick" value="0" checked="checked">2011
    <input type="radio" id="yearpick1" name="yearpick" value="1">2012
  </div>
  <div id="chooseseason" style="visibility:hidden">
    <input type="radio" id="quarterpick0" name="quarterpick" value="0" checked="checked">jan-mar
    <input type="radio" id="quarterpick1" name="quarterpick" value="1">apr-jun
    <input type="radio" id="quarterpick2" name="quarterpick" value="2">jul-sep
    <input type="radio" id="quarterpick3" name="quarterpick" value="3">oct-dec
  </div>
  <div id="choosemonth" style="visibility:hidden">
    <input type="radio" id="monthpick0" name="monthpick" value="0" checked="checked">jan
    <input type="radio" id="monthpick1" name="monthpick" value="1">feb
    <input type="radio" id="monthpick2" name="monthpick" value="2">mar
    <input type="radio" id="monthpick3" name="monthpick" value="3">apr
    <input type="radio" id="monthpick4" name="monthpick" value="4">may
    <input type="radio" id="monthpick5" name="monthpick" value="5">jun
    <input type="radio" id="monthpick6" name="monthpick" value="6">jul
    <input type="radio" id="monthpick7" name="monthpick" value="7">aug
    <input type="radio" id="monthpick8" name="monthpick" value="8">sep
    <input type="radio" id="monthpick9" name="monthpick" value="9">oct
    <input type="radio" id="monthpick10" name="monthpick" value="10">nov
    <input type="radio" id="monthpick11" name="monthpick" value="11">dec
  </div>

</div>
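
Returning to the regex in the question: one possible way to handle the optional extra three-letter code is a non-capturing group that may repeat zero or more times. The sketch below is an assumption about the intended cleanup, not a confirmed solution from the thread. The names `PREFIX`, `strip_prefix` and the `cleaned` column are hypothetical, and the pattern assumes every prefix is a number, ` / `, then one or more comma-separated three-capital-letter codes.

```python
import pandas as pd

# Hypothetical pattern: number, " / ", a three-letter code, then any number of
# additional ", XXX" codes, followed by optional whitespace.
PREFIX = r"^\d+ / [A-Z]{3}\b(?:,\s*[A-Z]{3}\b)*\s*"


def strip_prefix(addresses: pd.Series) -> pd.Series:
    """Remove the leading 'N / ABC[, DEF...]' prefix and trim whitespace."""
    # regex=True is passed explicitly because newer pandas versions do not
    # treat the pattern as a regular expression by default.
    return addresses.str.replace(PREFIX, "", regex=True).str.strip()


# Sample rows copied from the question.
df = pd.DataFrame({
    "address": [
        "1 / BAA Temporary Building, Land Opposite Park",
        "2 / BAB, BAD Barlaston Village Hall, Longton R",
        "4 / BSA St Annes Church Hall , Lynton Avenue",
        "6 / BSC, BSD Holy Trinity Church Hall ",
    ]
})
df["cleaned"] = strip_prefix(df["address"])
print(df["cleaned"].tolist())
```

The `\b` word boundary keeps the pattern from consuming the first three letters of a longer all-caps word, and the `^` anchor limits the replacement to the start of each string.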