Subsetting a DataFrame for an even split of column values

Time: 2017-11-08 03:05:10

Tags: python pandas

My DataFrame is "bus_rev". I want to subset it so that I have an equal number of records where good_reviews == True and good_reviews == False. Can anyone suggest a slick way to do this?

Sample Data:

print(bus_rev[1:3])

                  user_id             business_id  stars_x  \
1  CxDOIDnH8gp9KXzpBHJYXw  XSiqtcVEsP6dLOL7ZA9OxA        4   
2  CxDOIDnH8gp9KXzpBHJYXw  v95ot_TNwTk1iJ5n56dR0g        3   

               address                                         attributes  \
1     522 Yonge Street  {u'BusinessParking': {u'garage': False, u'stre...   
2  1661 Denison Street  {u'BusinessParking': {u'garage': False, u'stre...   

                        categories     city  \
1   [Restaurants, Ramen, Japanese]  Toronto   
2  [Chinese, Seafood, Restaurants]  Markham   

                                               hours  is_open   latitude  \
1  {u'Monday': u'11:00-22:00', u'Tuesday': u'11:0...        1  43.663689   
2                                                 {}        0  43.834295   

   longitude                            name   neighborhood postal_code  \
1 -79.384200                     Kenzo Ramen  Downtown Core     M4Y 1X9   
2 -79.305282  Vince Seafood Restaurant & BBQ       Milliken     L3R 6E4   

   review_count  stars_y state good_reviews  
1            76      3.5    ON         True  
2            23      3.5    ON        False  


Code:

bus_rev['good_reviews'].value_counts()

Output:

False    482
True     168
Name: good_reviews, dtype: int64

2 Answers:

Answer 0 (score: 1)

To create a DataFrame with an equal number of each value, you can use:

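import pandas as pd

# Keep only the first 168 False rows, matching the number of True rows
bus_revs_false = bus_rev[bus_rev['good_reviews'] == False]
bus_revs_false = bus_revs_false.iloc[:168, :]
bus_revs_true = bus_rev[bus_rev['good_reviews'] == True]

# DataFrame.append was removed in pandas 2.0; pd.concat is the equivalent
bus_revs_new = pd.concat([bus_revs_true, bus_revs_false])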

In this case, bus_revs_new will be your new DataFrame, with the same number of Trues and Falses.
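A variation on the same idea, as a rough sketch: instead of hard-coding 168 and always keeping the first rows, the size of the smaller class can be computed from the data and the rows drawn at random. The toy `bus_rev` frame, the `balanced` name, and the `random_state` value below are assumptions for illustration only, and the sketch assumes `good_reviews` has a boolean dtype.

```python
import pandas as pd

# Toy stand-in for bus_rev (hypothetical data): 5 False rows, 3 True rows.
bus_rev = pd.DataFrame({
    "business_id": list("abcdefgh"),
    "good_reviews": [False, False, True, False, True, False, True, False],
})

# Size of the smaller class, computed rather than hard-coded.
n = int(bus_rev["good_reviews"].value_counts().min())

# Split by class; ~ negates the boolean mask.
true_rows = bus_rev[bus_rev["good_reviews"]]
false_rows = bus_rev[~bus_rev["good_reviews"]]

# Draw n rows at random from each class; random_state makes the draw repeatable.
balanced = pd.concat([
    true_rows.sample(n=n, random_state=0),
    false_rows.sample(n=n, random_state=0),
])

print(balanced["good_reviews"].value_counts())
```

Random sampling avoids any bias from the original row order, which may matter if the frame happens to be sorted by user or business.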

Answer 1 (score: 1)

To get the same number of Trues and Falses, you can do this:
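One possible way to do this, offered as a minimal sketch and not necessarily the approach this answer had in mind, is to group on `good_reviews` and sample the same number of rows from each group. It assumes pandas 1.1 or later (where `GroupBy.sample` is available); the toy `bus_rev` frame and the `balanced` name are made up for illustration.

```python
import pandas as pd

# Toy stand-in for bus_rev (hypothetical data): 5 False rows, 3 True rows.
bus_rev = pd.DataFrame({
    "business_id": list("abcdefgh"),
    "good_reviews": [False, False, True, False, True, False, True, False],
})

# Number of rows in the smaller group.
n = int(bus_rev["good_reviews"].value_counts().min())

# Sample n rows from each good_reviews group; the result keeps the original index.
balanced = bus_rev.groupby("good_reviews").sample(n=n, random_state=0)

print(balanced["good_reviews"].value_counts())
```

This keeps the whole operation in a single expression and extends unchanged to columns with more than two distinct values.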
