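I have my data stored in a pandas.groupby object and am trying to iterate over the groups based on a condition on the column titled 'Amount'. However, I get an error message showing that something is trying to convert the 'Reference' code from a string to a float, and I'm not sure where that instruction is.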
for data in row:
    if float(data) in ['Amount'] > 0:
        {'buy_currency' : ['Currency'],
         'buy_quantity' : ['Amount'],
         'order_id' : str(data)['Reference']}
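I think I've misunderstood the second line: if float(data) in ['Amount'] > 0
I only want to convert the 'Amount' field to a float; the other fields can stay as strings.
I'd appreciate any guidance or a push in the right direction!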
import pandas as pd
import numpy as np

df2 = pd.read_excel('sample.xlsx')
mask = df2.groupby(['Reference'])
groups = mask.groups
for ref, trades in mask:
    for index, row in trades.iterrows():
        for data in row:
            if float(data) in ['Amount'] > 0:
                {'buy_currency' : ['Currency'],
                 'buy_quantity' : ['Amount'],
                 'order_id' : str(data)['Reference']}
            else:
                {'sell_currency' : ['Currency'],
                 'buy_quantity' : ['Amount'],
                 'order_id': str(data)['Reference']}
The dataset contains the following sample: Dataset Sample
The error message I get is:
ValueError: could not convert string to float: 'bbb8ee04-053c-4174-b5b1-281c10618d52'
Answer 0 (score: 1)
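I'm not sure what you are trying to do. Did you know that you can use masks in pandas to check the values of several columns at once? Here is some sample code that may be useful to you: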
import pandas as pd
import numpy as np

df2 = pd.read_excel('sample.xlsx')
# Convert only the columns that need a specific type
df2['Amount'] = df2['Amount'].astype(float)
df2['Reference'] = df2['Reference'].astype(str)

# no need to group: a boolean mask selects the rows directly
mask = df2['Amount'] > 0
df2['buy_currency'] = np.where(mask, df2['Currency'], np.nan)
df2['buy_quantity'] = np.where(mask, df2['Amount'], np.nan)

mask2 = df2['Amount'] < 0
df2['sell_currency'] = np.where(mask2, df2['Currency'], np.nan)
df2['sell_quantity'] = np.where(mask2, df2['Amount'], np.nan)

# Set order_id once for both sides; assigning it again with mask2 alone
# would overwrite the order_id values of the buy rows with NaN
df2['order_id'] = np.where(mask | mask2, df2['Reference'], np.nan)
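If you would rather keep the loop from the question, the ValueError most likely comes from `for data in row`: that inner loop visits every cell in the row, including the 'Reference' string, so `float(data)` is eventually called on the UUID. Below is a minimal sketch of a loop that reads cells by column label instead. The small in-memory DataFrame is a hypothetical stand-in for sample.xlsx, and the `orders` list and the `side` variable are names I made up for illustration. I am also assuming the else branch was meant to fill 'sell_quantity', and that zero amounts count as sells as in your original if/else.

```python
import pandas as pd

# Hypothetical stand-in for sample.xlsx; column names follow the question.
df2 = pd.DataFrame({
    "Reference": ["ref-a", "ref-a", "ref-b"],
    "Currency": ["USD", "EUR", "GBP"],
    "Amount": ["100.5", "-20", "7"],
})

# Convert only the Amount column; the other columns stay as strings.
df2["Amount"] = pd.to_numeric(df2["Amount"])

orders = []  # hypothetical container for the per-row dictionaries
for ref, trades in df2.groupby("Reference"):
    for _, row in trades.iterrows():
        # Read cells by column label instead of looping over every cell.
        side = "buy" if row["Amount"] > 0 else "sell"
        orders.append({
            f"{side}_currency": row["Currency"],
            f"{side}_quantity": row["Amount"],
            "order_id": str(ref),
        })

print(orders)
```

Grouping by the plain string "Reference" rather than the list ['Reference'] keeps `ref` a scalar; with a list key, recent pandas versions yield a one-element tuple instead.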