Question

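I badly need help. I'm trying to conditionally iterate over the rows of a Google Play Store csv file. For some reason I keep running into a problem where pandas doesn't recognize the '>=' sign in some loops. That is, the condition `if price == "9.00"` works fine, but other operators (i.e. '<=' and '>=') return an error message.

In addition, I'm trying to count the number of apps priced at $9.00 or more. I want to strip the '$' sign from the Price column and then iterate over it. I tried the str.lstrip function, but without success. Any help would be appreciated.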


import pandas as pd

df = pd.read_csv("googleplaystore.csv")

df['Rating'].fillna(value = '0.0', inplace = True)

# Calculating how many apps have a price of $9.00 or greater

apps_morethan9 = 0

for i, row in df.iterrows():
    rating = float(row.Rating)
    price = float(row.Price)
    if price >= 9:
        apps_morethan9 += 1

print(apps_morethan9)

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-103-66171ce7efb6> in <module>
      5 for i, row in df.iterrows():
      6     rating = float(row.Rating)
----> 7     price = float(row.Price)
      8     if price >= 9:
      9         apps_morethan9 += 1

ValueError: could not convert string to float: '$4.99'

Answer 1

You can use the string's replace() method like this:

apps_morethan9 = 0
for i, row in df.iterrows():
    rating = float(row.Rating)
    # row.Price is a plain Python str here, so call replace() on it directly
    price = float(row.Price.replace('$', ''))
    if price >= 9:
        apps_morethan9 += 1
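
As a minimal sketch of the per-row conversion, the stripping step can be pulled into a small helper. The name `parse_price` and the sample values are hypothetical, chosen only to mirror the '$4.99' format shown in the traceback:

```python
def parse_price(raw):
    """Convert a price string such as '$4.99' or '0' to a float."""
    # str() guards against values that were already parsed as numbers;
    # lstrip("$") drops a leading dollar sign if one is present.
    return float(str(raw).lstrip("$"))


# Hypothetical sample values in the style of the Price column
sample_prices = ["0", "$4.99", "$9.00", "$14.99"]

# Count the entries priced at 9.00 or more
apps_morethan9 = sum(1 for p in sample_prices if parse_price(p) >= 9)
print(apps_morethan9)
```

Inside the `iterrows()` loop the same helper could be called as `parse_price(row.Price)`, since each cell of a string column is an ordinary Python string.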

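However, your implementation can be improved in terms of speed and complexity:

# regex=False treats '$' as a literal character, not the regex end-of-string anchor
print(df[df.Price.str.replace('$', '', regex=False).astype(float) >= 9].count().values[0])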
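
A self-contained sketch of the vectorized approach follows. The small DataFrame is a hypothetical stand-in for `googleplaystore.csv`, and the 'Everyone' entry is an invented example of a malformed value. Using `pd.to_numeric` with `errors="coerce"` instead of `astype(float)` is an assumption on my part, chosen so that such entries do not raise:

```python
import pandas as pd

# Hypothetical stand-in for the Price column of the real file
df = pd.DataFrame({"Price": ["0", "$4.99", "$9.00", "$14.99", "Everyone"]})

# regex=False makes '$' a literal character rather than a regex anchor
cleaned = df["Price"].str.replace("$", "", regex=False)

# errors="coerce" turns unparseable entries into NaN instead of raising
prices = pd.to_numeric(cleaned, errors="coerce")

# Comparisons against NaN are False, so malformed rows are not counted
apps_morethan9 = int((prices >= 9).sum())
print(apps_morethan9)
```

Summing the boolean mask counts the matching rows directly, without building a filtered DataFrame first.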

Answer 2

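You can apply str.replace to the whole Series before iterating over it, like this:

# Assign the result back; str.replace returns a new Series
df["Price"] = df["Price"].str.replace("$", "", regex=False)
for i, row in df.iterrows():
    ...  # rest of your routine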

I recommend using @gustavz's solution for better performance.
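
One possible reason the `str.lstrip` attempt in the question appeared not to work is that pandas string methods return a new Series and leave the column untouched unless the result is assigned back. This is a guess, since the question does not show that attempt. A minimal sketch with hypothetical sample data:

```python
import pandas as pd

# Hypothetical sample data in the style of the Price column
df = pd.DataFrame({"Price": ["0", "$4.99", "$9.00"]})

# Calling the method without assignment leaves the column unchanged
df["Price"].str.lstrip("$")
unchanged = df["Price"].iloc[1]  # still '$4.99'

# Assigning back stores the cleaned values, here converted to float
df["Price"] = df["Price"].str.lstrip("$").astype(float)

# The original loop now works because Price already holds floats
apps_morethan9 = 0
for i, row in df.iterrows():
    if row.Price >= 9:
        apps_morethan9 += 1

print(apps_morethan9)
```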

How to strip the "$" sign from strings in a pandas Series?
