I sorted a CSV file so that I can run some calculations on it. I'm using Python 2.7.
import pandas as pd
df = pd.read_csv('Cliente_x_Pais_Sitio.csv', sep=',')
df1 = df.sort_values(by=['Cliente','Auth_domain','Sitio',"Country"])
df1.to_csv('test.csv')
CSV data (test.csv):
Cliente,Fecha,Auth_domain,Sitio,Country,ECPM_medio
FF,15/12/2017,@ff,ff_Color,Afganistán,0.53
FF,15/01/2018,@ff,ff_Color,Afganistán,0.5
FF,15/01/2017,@ff,ff_Color,Alemania,0.34
FF,15/12/2017,@ff,ff_Color,Alemania,0.38
FF,15/01/2018,@ff,ff_Color,Alemania,0.37
What I need:
if (15/12/2017 ECPM) ≤ (15/01/2018 ECPM):
    if ((15/12/2017 ECPM)*0.8) ≥ (15/01/2017 ECPM):
        r = (15/01/2017 ECPM)
    else:
        r = ((15/12/2017 ECPM)*0.8)
else:
    if (15/01/2018 ECPM) ≥ (15/01/2017 ECPM):
        r = (15/01/2017 ECPM)
    else:
        r = (15/01/2018 ECPM)
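To make the rule concrete, here is a minimal sketch of it as a plain Python function for one group. The function name `recommended_ecpm` and its parameter names are made up for illustration, and the sketch assumes all three values are present.

```python
def recommended_ecpm(dec_2017, jan_2018, jan_2017):
    """Apply the rule above to the three ECPM values of one group.

    Hypothetical helper: the name and the argument order are assumptions.
    """
    if dec_2017 <= jan_2018:
        discounted = dec_2017 * 0.8
        if discounted >= jan_2017:
            return jan_2017
        return discounted
    if jan_2018 >= jan_2017:
        return jan_2017
    return jan_2018


# Alemania rows from test.csv:
# 0.38 (15/12/2017), 0.37 (15/01/2018), 0.34 (15/01/2017)
r = recommended_ecpm(0.38, 0.37, 0.34)
```

For the Alemania rows this takes the outer else branch (0.38 is not ≤ 0.37) and, since 0.37 ≥ 0.34, returns the 15/01/2017 value.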
Filling in the actual data for the first two rows, this becomes:
if 0.53 ≤ 0.5:
if 0.5 ≥ 0: # if the cell value is missing, I would like to use 0 in its place (True)
r = 0.5
Keep in mind that I have more than 10,000 rows, so I need this to work across the whole table.
The new CSV should look like this:
Cliente,Auth_domain,Sitio,Country,Recomendation_ECPM
FF,@ff,ff_Color,Afganistán,0.5
FF,@ff,ff_Color,Alemania,0.34
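One possible way to get from test.csv to this output is to pivot the dates into columns and apply the rule with `numpy.where`. This is only a sketch under assumptions: the data is typed in by hand instead of read from the file, the variable names are illustrative, and a missing date is left as NaN instead of being replaced by 0. Because any comparison against NaN is False, the Afganistán group falls through to the 15/01/2018 value, which matches the 0.5 shown above.

```python
# -*- coding: utf-8 -*-
import numpy as np
import pandas as pd

# Rows copied from test.csv above.
df = pd.DataFrame({
    'Cliente': ['FF'] * 5,
    'Fecha': ['15/12/2017', '15/01/2018',
              '15/01/2017', '15/12/2017', '15/01/2018'],
    'Auth_domain': ['@ff'] * 5,
    'Sitio': ['ff_Color'] * 5,
    'Country': ['Afganistán', 'Afganistán',
                'Alemania', 'Alemania', 'Alemania'],
    'ECPM_medio': [0.53, 0.5, 0.34, 0.38, 0.37],
})

keys = ['Cliente', 'Auth_domain', 'Sitio', 'Country']

# One row per group, one column per date; absent dates become NaN.
wide = df.pivot_table(index=keys, columns='Fecha', values='ECPM_medio')
wide = wide.reindex(columns=['15/01/2017', '15/12/2017', '15/01/2018'])

jan17 = wide['15/01/2017']
dec17 = wide['15/12/2017']
jan18 = wide['15/01/2018']

# Assumption: NaN is kept as-is. Comparisons against NaN are False,
# so a missing 15/01/2017 value always selects the else branch.
rising = np.where(dec17 * 0.8 >= jan17, jan17, dec17 * 0.8)
falling = np.where(jan18 >= jan17, jan17, jan18)
wide['Recomendation_ECPM'] = np.where(dec17 <= jan18, rising, falling)

# Back to a flat table with the key columns plus the recommendation.
result = wide[['Recomendation_ECPM']].reset_index()
result.columns.name = None
```

Calling `result.to_csv(..., index=False)` with a file name of your choice would then write the table without the extra index column. The whole computation is vectorized, so 10,000 rows should not be a problem.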
Answer 0 (score: 1)
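I'm not sure I have the right compare_val or setval, but the pipeline is the same either way: sort, group by, and transform. Because the comparison pairs each row with a shifted one, using shift(-1) or shift(1), the row at the edge of each group comes out as nan, and we have to drop it afterwards.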
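The shift step can be sketched as follows. This is not the code from the answer, only an illustration of sorting, grouping, and shifting within each group; the column name `next_ECPM` is made up, and the data is typed in by hand from the question.

```python
# -*- coding: utf-8 -*-
import pandas as pd

# Rows copied from the question's test.csv.
df = pd.DataFrame({
    'Cliente': ['FF'] * 5,
    'Fecha': ['15/12/2017', '15/01/2018',
              '15/01/2017', '15/12/2017', '15/01/2018'],
    'Auth_domain': ['@ff'] * 5,
    'Sitio': ['ff_Color'] * 5,
    'Country': ['Afganistán', 'Afganistán',
                'Alemania', 'Alemania', 'Alemania'],
    'ECPM_medio': [0.53, 0.5, 0.34, 0.38, 0.37],
})

keys = ['Cliente', 'Auth_domain', 'Sitio', 'Country']

# Parse the dates so that sorting is chronological, not alphabetical.
df['Fecha'] = pd.to_datetime(df['Fecha'], format='%d/%m/%Y')
df = df.sort_values(keys + ['Fecha'])

# Within each group, pair every row with the next row's value.
# 'next_ECPM' is a hypothetical column name.
df['next_ECPM'] = df.groupby(keys)['ECPM_medio'].shift(-1)

# The last row of each group has no successor, so it gets NaN
# and is dropped here.
paired = df.dropna(subset=['next_ECPM'])
```

With shift(1) the NaN lands on the first row of each group instead of the last, so which one to use depends on the direction of the comparison.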