I would like to be able to create a crosstab/table/dataframe (whatever it is called) like this:
____________________
Performance "value" (this value must come from the x vector, whose formula goes to the dataset, calculates the value, and returns it)
____________________
LTFU "value" (this value must come from the y vector, whose formula goes to the dataset, calculates the value, and returns it)
____________________
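Note that the Performance and LTFU values are produced by functions applied to a .csv dataset in Python. Performance and LTFU do not exist in the .csv dataset; both should be created only so that I can summarize the performance.
What I have so far is the following: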
import pandas as pd
performance=pd.read_csv("https://www.dropbox.com/s/08kuxi50d0xqnfc/demo.csv?dl=1")
x=performance["idade"].sum()
y=performance["idade"].mean()
l = "Performance"
k = "LTFU"
def test(y):
    return pd.DataFrame({'a':y, 'b':x})
test([l,k])
a b
0 Performance x vector value here (it shows 1300, it is correct)
1 LTFU y vector value here (it shows 1300, it is wrong, it should show 14.130434782608695 instead, according to the instruction of y vector)
You can copy and paste the code above into a Python IDE, test it, and send the solution back to me. Please give me an example that produces the table I want.
Answer 0: (score: 0)
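What you are asking for does not fit the definition of a pandas DataFrame. You already have the values, so you can output them some other way.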
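One possible reading of "some other way" is to keep the two values in a labeled pandas Series instead of a DataFrame. This is only a sketch: the small `idade` column below is made-up data standing in for demo.csv, and the name `summary` is chosen for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the CSV; the real data comes from demo.csv.
performance = pd.DataFrame({"idade": [10, 20, 30]})

# One labeled value per row: the index holds the labels, the Series holds the numbers.
summary = pd.Series(
    {
        "Performance": performance["idade"].sum(),
        "LTFU": performance["idade"].mean(),
    },
    name="value",
)

print(summary.to_string())
```

A Series keeps each label next to its value without needing separate 'a' and 'b' columns, and it can still be written out with Series.to_csv if a file is needed.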
Answer 1: (score: 0)
You need to assign the output to a DataFrame and then write it to a file with DataFrame.to_csv:
l = "Performance"
k = "LTFU"
# changed input to 2 scalar values
def test(l1,k1):
    # changed a to list [l1, k1]
    # changed b to list [x, y]
    return pd.DataFrame({'a':[l1, k1], 'b':[x, y]})
df1 = test(l,k)
print (df1)
a b
0 Performance 1300.000000
1 LTFU 14.130435
df1.to_csv('file.csv', index=False, header=None, sep=' ')
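
The reason the original attempt showed 1300 in both rows is that when the dict passed to pd.DataFrame mixes a list with a single scalar, pandas repeats the scalar on every row. The sketch below reproduces that and contrasts it with passing one value per row. It assumes a small made-up `idade` column in place of demo.csv, and the helper name `summarize` is hypothetical; unlike `test` above, it computes x and y inside the function rather than reading them from globals.

```python
import pandas as pd

# Hypothetical stand-in for the CSV; the real data comes from demo.csv.
performance = pd.DataFrame({"idade": [10, 20, 30]})

def summarize(df):
    """Return a two-row summary table with one label and one value per row."""
    x = df["idade"].sum()   # value for the Performance row
    y = df["idade"].mean()  # value for the LTFU row
    return pd.DataFrame({"a": ["Performance", "LTFU"], "b": [x, y]})

# A scalar for 'b' is repeated on every row, which reproduces the original problem.
broadcast = pd.DataFrame(
    {"a": ["Performance", "LTFU"], "b": performance["idade"].sum()}
)

summary = summarize(performance)
print(broadcast)
print(summary)
```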