Question

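We just learned how to filter data with pandas in Python, so I thought I'd try it out on a public dataset (http://data.wa.aemo.com.au/#stem-bids-and-offers).

I used the August data.

The challenge I set myself was to keep only the rows where $/MWh > 0 and the row is a bid. We have learned how to filter with np.logical_and, but the problem I ran into is that I can filter on EITHER the numeric values or the logical condition, not both.

I have an approach that runs and gives me the data and visualization I'm after, but I'm sure there is a more efficient way to filter on text and numeric fields together. The problem with my approach is that it only works when the strings have different lengths. For example, if the field said Bid or Fib, I would pick up both. I only want to pick up the bids. Can anyone point me in the right direction?

Here is my code: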

#Task: I want to filter out ONLY positive $/MWh bids
#This requires 2 filters - 1 to filter out the $MWh > 0 and 1 to filter by Bids

# Try converting this to a numpy array and using the filtering mechanisms there
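import numpy as np
import pandas as pd    # needed for pd.read_csv below
import seaborn as sns  # needed for sns.kdeplot below

df = pd.read_csv('stem-bids-and-offers-2017-08.csv')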
df.head(5)
#I don't know how to filter by 'text' just yet so I will have to use another way which is using the len function
#This will reduce the bid/offer field to characters

df['boLength'] = df['Bid or Offer'].apply(len)
df.head(5)
filtByPriceBid = np.logical_and(df['Price ($/MWh)'] > 0, df['boLength'] == 3)
filtByPriceBid.head(5)

df2 = df[filtByPriceBid]
df2.head(10)

sns.kdeplot(df2['Price ($/MWh)'], shade=True)

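PS: I've attached the KDE plot. If anyone would like to offer an interpretation of it, please feel free! I was expecting a normal distribution, but unfortunately that is not what I got.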

Answer 1

I hope this is what you're looking for.

You can combine multiple filters in a single expression with sns.kdeplot(df[(df['Price ($/MWh)'] > 0) & (df['Bid or Offer']=='Bid')]['Price ($/MWh)'], shade=True)
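
To make the idea easier to try without the full dataset, here is a minimal sketch of the same combined filter. The sample rows are made up for illustration (including a "Fib" row to mirror the concern in the question); only the column names 'Bid or Offer' and 'Price ($/MWh)' come from the question's code.

```python
import numpy as np
import pandas as pd

# Hypothetical sample rows; only the column names come from the question.
df = pd.DataFrame({
    "Bid or Offer": ["Bid", "Offer", "Bid", "Fib", "Bid"],
    "Price ($/MWh)": [25.0, 40.0, -5.0, 10.0, 0.0],
})

# Each comparison yields a boolean Series, one True/False per row.
is_positive = df["Price ($/MWh)"] > 0
is_bid = df["Bid or Offer"] == "Bid"  # text columns can be compared directly

# & combines the two masks element-wise.
positive_bids = df.loc[is_positive & is_bid]

# np.logical_and accepts the same two masks and selects the same rows.
same_rows = df.loc[np.logical_and(is_positive, is_bid)]

# The length-based filter from the question also matches the 3-letter "Fib".
# Inline comparisons need parentheses because & binds tighter than ==.
by_length = df.loc[is_positive & (df["Bid or Offer"].str.len() == 3)]

print(positive_bids)
```

Naming the masks first is only a readability choice; writing both comparisons inline, each wrapped in parentheses, as in the one-liner above, selects the same rows.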

