Python / Pandas: How to create a results table with new variables and values computed from an existing DataFrame

Time: 2018-02-11 09:47:45

Tags: python function pandas crosstab

I would like to be able to create a crosstab/table/DataFrame (whatever the name) like this:

____________________
Performance  "value" (this value must come from the x vector, which applies a formula to the dataset and returns the result)
____________________
LTFU         "value" (this value must come from the y vector, which applies a formula to the dataset and returns the result)
____________________

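Note that the Performance and LTFU values are produced by functions applied to the .csv dataset in Python. Performance and LTFU do not exist in the .csv dataset; both should be created only so that I can summarize the performance.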

What I have so far is the following:

import pandas as pd
performance=pd.read_csv("https://www.dropbox.com/s/08kuxi50d0xqnfc/demo.csv?dl=1")

x=performance["idade"].sum()
y=performance["idade"].mean()

l = "Performance"
k = "LTFU"

def test(y):
    return pd.DataFrame({'a':y, 'b':x})

test([l,k])

         a        b
0   Performance   x vector value here (it shows 1300, it is correct)
1   LTFU          y vector value here (it shows 1300, it is wrong, it should show 14.130434782608695 instead, according to the instruction of y vector)

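You can copy and paste the code above into a Python IDE and test it, then get back to me with a solution. Please give me an example that produces the table I want.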

2 Answers:

Answer 0 (score: 0)

Your requirement does not fit the definition of a pandas DataFrame. You already have the values, so you can use the output in some other way.
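One possible reading of "some other way" is a labelled pandas Series, which maps each name directly to its value. This is only a sketch, not something the answer itself spells out; the two numbers are the values reported in the question, hard-coded here so the snippet runs without the CSV file.

```python
import pandas as pd

# Values reported in the question for sum() and mean() of the "idade" column,
# hard-coded here (an assumption) so the example does not need the CSV file.
x = 1300
y = 14.130434782608695

# A Series built from a dict keeps one value per label.
summary = pd.Series({"Performance": x, "LTFU": y}, name="value")
print(summary)
```

A Series can be turned into a one-column DataFrame later with summary.to_frame() if a table is still needed.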

Answer 1 (score: 0)

You need to assign the output to a DataFrame, then write it to a file with DataFrame.to_csv:

l = "Performance"
k = "LTFU"

#changed input to 2 scalar values
def test(l1,k1):
    #changed a to list [l1, k1]
    #changed b to list [x, y]
    return pd.DataFrame({'a':[l1, k1], 'b':[x, y]})

df1 = test(l,k)
print(df1)
#              a            b
# 0  Performance  1300.000000
# 1         LTFU    14.130435

#write the table to a space-separated file without index or header
df1.to_csv('file.csv', index=False, header=None, sep=' ')
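
The original attempt showed 1300 in both rows because a scalar passed as a column value is broadcast to every row of the DataFrame, so 'b' needs a list with one entry per label. As a self-contained sketch of the same idea, the version below passes the data in explicitly instead of relying on the global x and y. The function name summarize and the small stand-in dataset are assumptions made for illustration; only the "idade" column name comes from the question.

```python
import pandas as pd

# Hypothetical stand-in for demo.csv; only the column name "idade"
# is taken from the question.
performance = pd.DataFrame({"idade": [10, 20, 30, 40]})

def summarize(df, column):
    """Build a two-row summary table: one label and one computed value per row."""
    return pd.DataFrame({
        "a": ["Performance", "LTFU"],
        # One list entry per label, so each row gets its own value.
        "b": [df[column].sum(), df[column].mean()],
    })

summary = summarize(performance, "idade")
print(summary)
```

With the real data, the call would be summarize(pd.read_csv(...), "idade"), and the result can be written out with to_csv exactly as above.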