Filtering CSV data with Python without pandas

Time: 2019-12-21 06:38:32

Tags: python numpy csv

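I want to filter by flat type and take the mean of the resale price (for 5-room flats); that is, I am trying to filter the resale price values for 5-room flats. However, when I print room5, I get an empty list. Which part am I missing?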

[![Code with printed data][1]][1]

1 answer:

Answer 0: (score: 0)

To get the 5-room rows:

room5 = roomprice2013[ roomprice2013['flat_type'] == '5-room' ]
print(room5)

To get the sum:

room5_sum = roomprice2013[ roomprice2013['flat_type'] == '5-room' ]['price'].sum()
print(room5_sum)

Working example.

I use io.StringIO to simulate a file, so everyone can copy and run the example without having to download a file with the data.

import numpy as np
import io

s = io.StringIO('''quarter,town,flat_type,price
2013-Q2,Bedok,1-room,na
2013-Q2,Bedok,2-room,-
2013-Q2,Bedok,3-room,172000
2013-Q2,Bedok,4-room,224500
2013-Q2,Bedok,5-room,332000
2013-Q2,Bedok,Executive,420000''')

# 'na' and '-' mark missing prices; they are read as 0 in the integer price column
data = np.genfromtxt(s, skip_header=1, dtype=[('quarter','U10'), ('town','U20'), ('flat_type','U10'), ('price','i8')], delimiter=',', missing_values='na,-', filling_values=0)

# keep only the rows whose quarter falls in 2013
data_2013 = data[np.isin(data['quarter'], ['2013-Q1','2013-Q2','2013-Q3','2013-Q4'])]
print(data_2013)

roomprice2013 = data_2013[['flat_type','price']]
print(roomprice2013)

room5 = roomprice2013[roomprice2013['flat_type'] == '5-room']
print(room5)

room5_sum = roomprice2013[roomprice2013['flat_type'] == '5-room']['price'].sum()
print(room5_sum)
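
The question asks for the average rather than the sum, so here is a minimal sketch of that step. The rows in `csv_text` are hypothetical sample data, not taken from the question's file, and the variable names are only illustrative. It assumes every 5-room row has a real price.

```python
import io
import numpy as np

# Hypothetical sample data, made up for illustration only
csv_text = io.StringIO('''quarter,town,flat_type,price
2013-Q1,Bedok,5-room,330000
2013-Q2,Bedok,5-room,332000
2013-Q2,Bedok,4-room,224500
2013-Q3,Bedok,5-room,340000''')

dtype = [('quarter', 'U10'), ('town', 'U20'), ('flat_type', 'U10'), ('price', 'i8')]
data = np.genfromtxt(csv_text, skip_header=1, dtype=dtype, delimiter=',')

# Boolean indexing keeps only the 5-room rows
room5 = data[data['flat_type'] == '5-room']

# .mean() works on the price field the same way .sum() does
room5_mean = room5['price'].mean()
print(room5_mean)
```

If some prices are missing and were filled with 0, they would pull the mean down, so they should probably be filtered out before averaging.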

EDIT: Internally, roomprice2013['flat_type'] == '5-room' just gives an array of True/False values, which you can use (even multiple times) to keep only the rows you need.

mask = (roomprice2013['flat_type'] == '5-room')  # it works without () but it is more readable with ()
print(mask) 
# [False False False False  True False]

room5 = roomprice2013[mask]
print(room5)

room5_sum = roomprice2013[mask]['price'].sum()
print(room5_sum)
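
Since a mask is just a boolean array, several masks can be combined with `&` before indexing. The following is only a sketch under assumptions: the sample rows are hypothetical, and `is_room5`, `is_2013` and `has_price` are illustrative names. It treats a price of 0 as "missing" because that is the filling value used when reading the file.

```python
import io
import numpy as np

# Hypothetical sample data, made up for illustration only
csv_text = io.StringIO('''quarter,town,flat_type,price
2013-Q1,Bedok,5-room,na
2013-Q2,Bedok,5-room,332000
2013-Q2,Bedok,4-room,224500
2013-Q3,Bedok,5-room,340000
2014-Q1,Bedok,5-room,350000''')

dtype = [('quarter', 'U10'), ('town', 'U20'), ('flat_type', 'U10'), ('price', 'i8')]
# 'na' cannot be parsed as an integer, so it is replaced by the filling value 0
data = np.genfromtxt(csv_text, skip_header=1, dtype=dtype, delimiter=',',
                     missing_values='na', filling_values=0)

is_room5 = (data['flat_type'] == '5-room')
is_2013 = np.isin(data['quarter'], ['2013-Q1', '2013-Q2', '2013-Q3', '2013-Q4'])
has_price = (data['price'] > 0)  # assumption: 0 means the price was missing

# Element-wise AND: a row is kept only if all three conditions hold
mask = is_room5 & is_2013 & has_price
selected = data[mask]
print(selected)

mean_price = selected['price'].mean()
print(mean_price)
```

The parentheses around each comparison matter once `&` is involved, because `&` binds more tightly than `==` and `>` in Python.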