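I want to create a table that shows the mean and standard deviation and counts the missing values for two variables in an imported CSV data file. The CSV file looks like this: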
Group Var1 Var2
1 10 100
1 NA 200
2 30 NA
2 40 NA
3 50 500
3 60 600
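My program imports this CSV file and then uses pandas to generate a table showing the mean, the standard deviation, and the number of missing values, summarized by group number. I am looking for output that looks like this: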
Variables  Missing Values  Group 1     Group 2     Group 3
Var1       1               mean1(sd1)  mean2(sd2)  mean3(sd3)
Var2       2               mean1(sd1)  mean2(sd2)  mean3(sd3)
Answer 0: (score: 3)
Answer 1: (score: 1)
You can do this with the following code, which relies only on DataFrame methods:
>>> import numpy as np
>>> import pandas as pd
>>>
>>> df = pd.DataFrame([
... [1, 10, 100],
... [1, np.nan, 200],
... [2, 30, np.nan],
... [2, 40, np.nan],
... [3, 50, 500],
... [3, 60, 600]])
>>>
>>> df.columns = ["Group", "Var1", "Var2"]
>>>
>>> groupCol = "Group"
>>> # count missing values per variable across all groups
>>> nan_df = df.drop(columns=groupCol).isna().sum().to_frame('Missing Values')
>>> std_df = df.groupby(groupCol).std().round(3).transpose()
>>> mean_df = df.groupby(groupCol).mean().round(3).transpose()
>>> # get mean and standard deviation into one column
>>> for col in mean_df.columns:
...     mean_df[col] = mean_df[col].astype(str) + '(' + std_df[col].astype(str) + ')'
...
>>> # change the column names
>>> mean_df.columns = ["Group "+ str(each_group) for each_group in mean_df.columns]
>>> # add missing value data
>>> mean_df = mean_df.join(nan_df)
>>> mean_df
            Group 1      Group 2        Group 3  Missing Values
Var1      10.0(nan)  35.0(7.071)    55.0(7.071)               1
Var2  150.0(70.711)     nan(nan)  550.0(70.711)               2
>>>
With a little manipulation, you can easily get the data into the format you need.
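The question starts from a CSV file and lists the missing-value column before the group columns, whereas the session above builds the frame by hand. The following is a minimal sketch of that variant, not a definitive implementation. It assumes the file is whitespace-separated as in the sample; the `CSV_TEXT` string stands in for the real file, and `summarize` is a hypothetical helper name, not part of pandas.

```python
import io

import pandas as pd

# Stand-in for the CSV file on disk. The whitespace-separated layout is an
# assumption based on the sample shown in the question.
CSV_TEXT = """Group Var1 Var2
1 10 100
1 NA 200
2 30 NA
2 40 NA
3 50 500
3 60 600
"""


def summarize(df, group_col="Group"):
    """Hypothetical helper: one row per variable with the missing-value
    count followed by a 'mean(sd)' string for each group."""
    grouped = df.groupby(group_col)
    mean_df = grouped.mean().round(3).transpose()
    std_df = grouped.std().round(3).transpose()  # sample standard deviation

    table = pd.DataFrame(index=mean_df.index)
    # Missing values are counted per variable over all groups.
    table["Missing Values"] = df.drop(columns=group_col).isna().sum()
    for group in mean_df.columns:
        table[f"Group {group}"] = [
            f"{m}({s})" for m, s in zip(mean_df[group], std_df[group])
        ]
    table.index.name = "Variables"
    return table


# "NA" is one of the strings read_csv treats as missing by default.
df = pd.read_csv(io.StringIO(CSV_TEXT), sep=r"\s+")
summary = summarize(df)
print(summary)
```

To read an actual file, pass its path to `pd.read_csv` in place of the `io.StringIO` object, and drop the `sep` argument if the file is comma-separated.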