I have a data frame.
+------------+------------+------------+------+
| Item Type | Year_Month | Total Cost | Diff |
+------------+------------+------------+------+
| Baby Food | Jul-2017 | 3000 | 100 |
+------------+------------+------------+------+
| Baby Food | Jun-2017 | 2900 | 100 |
+------------+------------+------------+------+
| Cereal | Jul-2017 | 6000 | 1000 |
+------------+------------+------------+------+
| Cereal | Jun-2017 | 5000 | 1000 |
+------------+------------+------------+------+
| Snacks | Jul-2017 | 4500 | Nan |
+------------+------------+------------+------+
| Chocolates | Jul-2017 | 3000 | Nan |
+------------+------------+------------+------+
| Ice Cream | Jul-2017 | 4000 | Nan |
+------------+------------+------------+------+
I want to sort the data frame by Diff, but rows where Diff is NaN should be sorted by Total Cost instead. So my final output would look like this:
+------------+------------+------------+------+
| Item Type | Year_Month | Total Cost | Diff |
+------------+------------+------------+------+
| Cereal | Jul-2017 | 6000 | 1000 |
+------------+------------+------------+------+
| Cereal | Jun-2017 | 5000 | 1000 |
+------------+------------+------------+------+
| Baby Food | Jul-2017 | 3000 | 100 |
+------------+------------+------------+------+
| Baby Food | Jun-2017 | 2900 | 100 |
+------------+------------+------------+------+
| Snacks | Jul-2017 | 4500 | Nan |
+------------+------------+------------+------+
| Ice Cream | Jul-2017 | 4000 | Nan |
+------------+------------+------------+------+
| Chocolates | Jul-2017 | 3000 | Nan |
+------------+------------+------------+------+
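One way to do this is to split the data frame into two data frames (one containing all rows where Diff is not NaN, the other containing the rows where Diff is NaN), sort each of them by Diff and Total Cost, and then merge them back together.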
+-----------+------------+------------+------+
| Item Type | Year_Month | Total Cost | Diff |
+-----------+------------+------------+------+
| Baby Food | Jul-2017 | 3000 | 100 |
+-----------+------------+------------+------+
| Baby Food | Jun-2017 | 2900 | 100 |
+-----------+------------+------------+------+
| Cereal | Jul-2017 | 6000 | 1000 |
+-----------+------------+------------+------+
| Cereal | Jun-2017 | 5000 | 1000 |
+-----------+------------+------------+------+
+------------+------------+------------+------+
| Item Type | Year_Month | Total Cost | Diff |
+------------+------------+------------+------+
| Snacks | Jul-2017 | 4500 | Nan |
+------------+------------+------------+------+
| Ice Cream | Jul-2017 | 4000 | Nan |
+------------+------------+------------+------+
| Chocolates | Jul-2017 | 3000 | Nan |
+------------+------------+------------+------+
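The split-and-merge approach described above can be sketched roughly as follows. This assumes the "Nan" cells are real missing values (`np.nan`) and that both sorts are descending, as the expected output suggests; the variable names `with_diff` and `without_diff` are only illustrative.

```python
import numpy as np
import pandas as pd

# Sample data from the question; "Nan" is assumed to be a real missing value.
df = pd.DataFrame({
    "Item Type": ["Baby Food", "Baby Food", "Cereal", "Cereal",
                  "Snacks", "Chocolates", "Ice Cream"],
    "Year_Month": ["Jul-2017", "Jun-2017", "Jul-2017", "Jun-2017",
                   "Jul-2017", "Jul-2017", "Jul-2017"],
    "Total Cost": [3000, 2900, 6000, 5000, 4500, 3000, 4000],
    "Diff": [100, 100, 1000, 1000, np.nan, np.nan, np.nan],
})

# Split on whether Diff is present.
has_diff = df["Diff"].notna()

# Rows with a Diff: sort by Diff, then Total Cost, both descending.
with_diff = df[has_diff].sort_values(by=["Diff", "Total Cost"], ascending=False)

# Rows without a Diff: sort by Total Cost only, descending.
without_diff = df[~has_diff].sort_values(by="Total Cost", ascending=False)

# Stack the two parts back together, rows with a Diff first.
result = pd.concat([with_diff, without_diff], ignore_index=True)
print(result)
```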
Is there a more efficient way to do this, since this approach involves a lot of computation?
Answer 0: (score: 1)
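When a data frame (df) is sorted by a column (here 'Diff'), NaN values are moved to the end of the data frame by default. So by sorting the data frame on two columns ('Diff' and 'Total Cost'), we get the desired result.
Here is the code for this: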
df=df.sort_values(by=['Diff','Total Cost'],ascending=False)
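For reference, here is a small self-contained sketch of that one-line sort on the question's sample data. It assumes Diff may arrive as text (with the literal string "Nan"), so it is first converted with `pd.to_numeric(..., errors="coerce")`; if the column already holds real NaN values, that step is unnecessary. `na_position="last"` is the default and is spelled out only for clarity.

```python
import pandas as pd

# Sample data from the question; Diff is assumed to be stored as text here.
df = pd.DataFrame({
    "Item Type": ["Baby Food", "Baby Food", "Cereal", "Cereal",
                  "Snacks", "Chocolates", "Ice Cream"],
    "Year_Month": ["Jul-2017", "Jun-2017", "Jul-2017", "Jun-2017",
                   "Jul-2017", "Jul-2017", "Jul-2017"],
    "Total Cost": [3000, 2900, 6000, 5000, 4500, 3000, 4000],
    "Diff": ["100", "100", "1000", "1000", "Nan", "Nan", "Nan"],
})

# Turn the text column into numbers; anything unparseable becomes NaN.
df["Diff"] = pd.to_numeric(df["Diff"], errors="coerce")

# One sort on both columns: NaN Diff rows go last and are ordered
# among themselves by Total Cost.
result = df.sort_values(
    by=["Diff", "Total Cost"], ascending=False, na_position="last"
).reset_index(drop=True)
print(result)
```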
Answer 1: (score: 0)
You can simply use the sort function with a key function:
In:
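import json

# Sample records; "Diff" is stored as a string and may be the literal "Nan".
jsonv = [
    {
        "Item Type": "Snacks",
        "Year_Month": "Jul-2017",
        "Total Cost": 4500,
        "Diff": "5"
    },
    {
        "Item Type": "Ice Cream",
        "Year_Month": "Jul-2017",
        "Total Cost": 4000,
        "Diff": "Nan"
    },
    {
        "Item Type": "Chocolates",
        "Year_Month": "Jul-2017",
        "Total Cost": 3000,
        "Diff": "4"
    }
]


def extract_diff(record):
    # Return Diff as an int, treating "Nan" or a missing key as 0.
    # (The parameter is not called `json` so it does not shadow the module.)
    try:
        jdiff = record['Diff']
        return int(jdiff) if jdiff != 'Nan' else 0
    except KeyError:
        return 0


# Sort by Diff, largest first; "Nan" rows are treated as Diff 0.
jsonv.sort(key=extract_diff, reverse=True)
print(json.dumps(jsonv, indent=4))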
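Note that the key above orders by Diff only, so rows without a Diff are not ordered by Total Cost. One possible variation, not part of the original answer, is to return a tuple from the key function so that missing Diff values sort last and ties fall back to Total Cost. The sketch below assumes missing values are represented as `None` or float NaN; `sort_key` and `rows` are illustrative names.

```python
import math

# Rows from the question as plain dicts; a missing Diff is float NaN here.
nan = float("nan")
rows = [
    {"Item Type": "Baby Food", "Year_Month": "Jul-2017", "Total Cost": 3000, "Diff": 100},
    {"Item Type": "Baby Food", "Year_Month": "Jun-2017", "Total Cost": 2900, "Diff": 100},
    {"Item Type": "Cereal", "Year_Month": "Jul-2017", "Total Cost": 6000, "Diff": 1000},
    {"Item Type": "Cereal", "Year_Month": "Jun-2017", "Total Cost": 5000, "Diff": 1000},
    {"Item Type": "Snacks", "Year_Month": "Jul-2017", "Total Cost": 4500, "Diff": nan},
    {"Item Type": "Chocolates", "Year_Month": "Jul-2017", "Total Cost": 3000, "Diff": nan},
    {"Item Type": "Ice Cream", "Year_Month": "Jul-2017", "Total Cost": 4000, "Diff": nan},
]


def sort_key(row):
    diff = row.get("Diff")
    missing = diff is None or (isinstance(diff, float) and math.isnan(diff))
    # Tuples compare element by element:
    #   1. rows with a Diff come first (False sorts before True),
    #   2. then larger Diff first (negated for descending order),
    #   3. then larger Total Cost first.
    return (missing, 0 if missing else -diff, -row["Total Cost"])


ordered = sorted(rows, key=sort_key)
for row in ordered:
    print(row["Item Type"], row["Year_Month"], row["Total Cost"], row["Diff"])
```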