Drop columns based on a threshold of string values

Time: 2019-06-07 13:52:56

Tags: python python-3.x pandas dataframe

My dataframe is as follows:

lst =[['', '2014', '2014', '2014', '2014', '2015', '2015', '2015', '2015', '2016', '2016', '2016','2016'],
      ['Stmnt of Oper:', '', '', '', '', '', '', '', '', '', '', '',''],
      ['Net sale', '', '$', '88,988', '', '', '$', '107,006', '', '', '$', '135,987', ''],
      ['Oper inc', '', '$', '178', '', '', '$', '2,233', '', '', '$', '4,186', ''],
      ['Net inc', '', '$', '(241', ')', '', '$', '596', '', '', '$', '2,371', ''],
      ['EPS', '', '$', '(0.52', ')', '', '$', '1.28', '', '', '$', '5.01', ''],
      ['', '2014', '2014', '2014', '2014', '2015', '2015', '2015', '2015', '2016', '2016', '2016','2016'],
      ['Bal Shts:', '', '', '', '', '', '', '', '', '', '', '',''],
      ['Tot asts', '', '$', '53,618', '', '', '$', '64,747', '', '', '$', '83,402', ''],
      ['Tot oblig', '', '$', '14,794', '', '', '$', '17,477', '', '', '$', '20,301', '']]

import pandas as pd

df = pd.DataFrame(lst)

original df

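I only want to select the columns of the dataframe that actually hold numeric/string values, i.e. columns 0, 3, 7 and 11, so my output should look like the image below.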

result df

Is there a simpler way to get this result? Here is what I have tried:

 df.replace(to_replace=['$', ')', ')%', '%'],value='',inplace=True)
 mask = df.apply(pd.Series.value_counts,normalize=True).loc[''] > 0.5
 df = df.loc[:,~mask]

1 answer:

Answer 0 (score: 1)

In this case, you can check with isin.
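The answer's own code is not preserved here, so the following is only a minimal sketch of how an isin check could be applied to the dataframe from the question. The `placeholders` list and the 0.5 cutoff are assumptions borrowed from the asker's attempt, not something the answer confirms.

```python
import pandas as pd

lst = [['', '2014', '2014', '2014', '2014', '2015', '2015', '2015', '2015', '2016', '2016', '2016', '2016'],
       ['Stmnt of Oper:', '', '', '', '', '', '', '', '', '', '', '', ''],
       ['Net sale', '', '$', '88,988', '', '', '$', '107,006', '', '', '$', '135,987', ''],
       ['Oper inc', '', '$', '178', '', '', '$', '2,233', '', '', '$', '4,186', ''],
       ['Net inc', '', '$', '(241', ')', '', '$', '596', '', '', '$', '2,371', ''],
       ['EPS', '', '$', '(0.52', ')', '', '$', '1.28', '', '', '$', '5.01', ''],
       ['', '2014', '2014', '2014', '2014', '2015', '2015', '2015', '2015', '2016', '2016', '2016', '2016'],
       ['Bal Shts:', '', '', '', '', '', '', '', '', '', '', '', ''],
       ['Tot asts', '', '$', '53,618', '', '', '$', '64,747', '', '', '$', '83,402', ''],
       ['Tot oblig', '', '$', '14,794', '', '', '$', '17,477', '', '', '$', '20,301', '']]
df = pd.DataFrame(lst)

# Assumption: these cell values are filler rather than data
# (same symbols the question's replace() call targets, plus the empty string).
placeholders = ['', '$', ')', ')%', '%']

# isin() yields a boolean frame; the column-wise mean is the share of filler cells.
filler_share = df.isin(placeholders).mean()

# Keep only columns where at most half of the cells are filler (assumed cutoff).
result = df.loc[:, filler_share <= 0.5]

print(result.columns.tolist())
```

Compared with the replace/value_counts approach in the question, this leaves the original cells untouched and computes the mask in a single step. The cutoff may need adjusting for tables with a different layout.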