Pandas: multiply one column's value by another column's value if the column value is numeric?

Time: 2018-09-30 16:22:12

Tags: python python-3.x pandas

I can't figure out how to solve this. Given the following DataFrame:

c_df

   lotsize strike_lots amt_cvr
0  75.0     1.0         1.0   
1  2500.0   N           943845
2  100.0    N           742350
3  600.0    2.0         2.0   
4  8000.0   N           585214
5  3500.0   N           838704
6  6000.0   2.0         2.0   
7  4000.0   N           709020
8  1500.0   N           263610

c_df.loc[c_df['strike_lots'] != 'N', 'amt_cvr'] = float(c_df['strike_lots'])*c_df['lotsize']

I have already checked the dtypes:

lotsize        float64
strike_lots    object 
amt_cvr        object 
dtype: object

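I assume the problem is the dtype of strike_lots, which I don't want to change because I need to keep the N values. The statement above fails with: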

Traceback (most recent call last):
  File "/media/sid1/sid/lib/python3.6/site-packages/pandas/core/ops.py", line 1012, in na_op
    result = expressions.evaluate(op, str_rep, x, y, **eval_kwargs)
  File "/media/sid1/sid/lib/python3.6/site-packages/pandas/core/computation/expressions.py", line 205, in evaluate
    return _evaluate(op, op_str, a, b, **eval_kwargs)
  File "/media/sid1/sid/lib/python3.6/site-packages/pandas/core/computation/expressions.py", line 65, in _evaluate_standard
    return op(a, b)
TypeError: can't multiply sequence by non-int of type 'float'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/media/sid1/sid/lib/python3.6/site-packages/pandas/core/ops.py", line 1033, in safe_na_op
    return na_op(lvalues, rvalues)
  File "/media/sid1/sid/lib/python3.6/site-packages/pandas/core/ops.py", line 1018, in na_op
    result[mask] = op(x[mask], com._values_from_object(y[mask]))
TypeError: can't multiply sequence by non-int of type 'float'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "/media/sid1/sid/lib/python3.6/site-packages/pandas/core/ops.py", line 1069, in wrapper
    result = safe_na_op(lvalues, rvalues)
  File "/media/sid1/sid/lib/python3.6/site-packages/pandas/core/ops.py", line 1037, in safe_na_op
    lambda x: op(x, rvalues))
  File "pandas/_libs/algos_common_helper.pxi", line 1212, in pandas._libs.algos.arrmap_object
  File "/media/sid1/sid/lib/python3.6/site-packages/pandas/core/ops.py", line 1037, in <lambda>
    lambda x: op(x, rvalues))
TypeError: ufunc 'multiply' did not contain a loop with signature matching types dtype('<U32') dtype('<U32') dtype('<U32')

Expected output

c_df

   lotsize strike_lots amt_cvr
0  75.0     1.0         75.0    #this value changes  
1  2500.0   N           943845
2  100.0    N           742350
3  600.0    2.0         1200.0   #This value changes   
4  8000.0   N           585214
5  3500.0   N           838704
6  6000.0   2.0         12000.0 #This value changes  
7  4000.0   N           709020
8  1500.0   N           263610

Thanks.

2 answers:

Answer 0 (score: 3)

You can use to_numeric and fillna:

df['amt_cvr'] = (df['lotsize']*pd.to_numeric(df['strike_lots'],errors='coerce')).fillna(df['amt_cvr'])

   lotsize strike_lots   amt_cvr
0     75.0         1.0      75.0
1   2500.0           N  943845.0
2    100.0           N  742350.0
3    600.0         2.0    1200.0
4   8000.0           N  585214.0
5   3500.0           N  838704.0
6   6000.0         2.0   12000.0
7   4000.0           N  709020.0
8   1500.0           N  263610.0

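pd.to_numeric(df['strike_lots'], errors='coerce') converts non-numeric values to NaN. When you multiply the result by a numeric column, the product is also NaN in those rows.

We can then use fillna to fill those NaN values with the original amt_cvr values. Hope this helps.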
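The two steps above can be sketched separately to make the intermediate NaN values visible. This is a minimal illustration, not the asker's real data: the frame holds only the first four rows from the question, strike_lots is assumed to contain strings, and amt_cvr is assumed to hold plain floats for simplicity. The names lots and product are introduced here for illustration only.

```python
import pandas as pd

# Sample frame rebuilt from the first four rows of the question.
# Assumption: strike_lots holds strings, amt_cvr holds floats.
df = pd.DataFrame({
    "lotsize": [75.0, 2500.0, 100.0, 600.0],
    "strike_lots": ["1.0", "N", "N", "2.0"],
    "amt_cvr": [1.0, 943845.0, 742350.0, 2.0],
})

# Step 1: non-numeric entries such as 'N' become NaN.
lots = pd.to_numeric(df["strike_lots"], errors="coerce")

# Step 2: NaN propagates through the element-wise multiplication.
product = df["lotsize"] * lots

# Step 3: wherever the product is NaN, fall back to the existing amt_cvr.
df["amt_cvr"] = product.fillna(df["amt_cvr"])

print(df)
```

Note that strike_lots itself is never modified, so the N markers are preserved.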

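Answer 1 (score: 2)

Use a mask. This fixes your own one-liner: call astype(float) on the masked rows instead of passing the whole Series to float().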

mask=df['strike_lots'] != 'N'
df.loc[mask, 'amt_cvr'] = df.loc[mask,'strike_lots'].astype(float)*df['lotsize']
df
Out[80]: 
   lotsize strike_lots   amt_cvr
0     75.0         1.0      75.0
1   2500.0           N  943845.0
2    100.0           N  742350.0
3    600.0         2.0    1200.0
4   8000.0           N  585214.0
5   3500.0           N  838704.0
6   6000.0         2.0   12000.0
7   4000.0           N  709020.0
8   1500.0           N  263610.0
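For reference, a self-contained version of the mask approach might look like the sketch below. It assumes the same simplified sample as in the question (first four rows only, strike_lots stored as strings, amt_cvr stored as floats), and it also masks lotsize explicitly. The original line relies on index alignment instead, which should give the same result.

```python
import pandas as pd

# Sample frame rebuilt from the first four rows of the question.
# Assumption: strike_lots holds strings, amt_cvr holds floats.
df = pd.DataFrame({
    "lotsize": [75.0, 2500.0, 100.0, 600.0],
    "strike_lots": ["1.0", "N", "N", "2.0"],
    "amt_cvr": [1.0, 943845.0, 742350.0, 2.0],
})

# True for rows whose strike_lots is not the 'N' marker.
mask = df["strike_lots"] != "N"

# Convert only the selected rows to float, then multiply row by row.
df.loc[mask, "amt_cvr"] = (
    df.loc[mask, "strike_lots"].astype(float) * df.loc[mask, "lotsize"]
)

print(df)
```

Rows where the mask is False are left untouched, so their amt_cvr values and the N markers in strike_lots stay as they were.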