Python: renaming columns to dictionary keys when merging DataFrames

Time: 2018-08-28 16:52:02

Tags: python pandas dictionary dataframe

Initially I have an empty DataFrame with a date field, which I then try to merge with new DataFrames inside a for loop.

import json

import pandas as pd
import requests

com_df = pd.DataFrame(columns=['date'])
for i in data_dict.values():
    response = requests.get('www.example.com/' + i + '?format=json')
    data = json.loads(response.content.decode('utf-8'))
    df = dataframe_format(data[1])  # convert list of dict to dataframe
    com_df = pd.merge(com_df, df, on='date', how='outer')

So the output now looks like this:

    date       value_x       value_y  value_x     value_y       value
0   2017  1.722333e+13  8.711267e+12   3485.0  197.713256   46.030025
1   2016  1.829506e+13  7.320738e+12   3052.0  249.907289   -2.024998
2   2015  3.932602e+13  8.188019e+12   2827.0  480.287296   -6.007182

But I want the column names to be the keys of the dictionary below:

data_dict = {'A': '1','B': '2','C': '3','D': '4','E': '5'}

    date           A              B        C            D       E 
0   2017  1.722333e+13  8.711267e+12   3485.0  197.713256   46.030025
1   2016  1.829506e+13  7.320738e+12   3052.0  249.907289   -2.024998
2   2015  3.932602e+13  8.188019e+12   2827.0  480.287296   -6.007182

2 answers:

Answer 0: (score: 0)

If you intend to apply the dictionary keys sorted by their values, you can assign them as the column names of the merged DataFrame once the loop has finished, keeping 'date' as the first column.
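The idea of applying the dictionary keys in the order of their values could be sketched in pandas roughly as follows. This is only an illustration under some assumptions: the sample `com_df` is a made-up stand-in for the merged frame (its numbers are not from the question), `date` is assumed to be the first column, and the value columns are assumed to appear in the same order as the sorted dictionary values.

```python
import pandas as pd

data_dict = {'A': '1', 'B': '2', 'C': '3', 'D': '4', 'E': '5'}

# Stand-in for the merged frame from the question (values are illustrative).
com_df = pd.DataFrame(
    [[2017, 1.0, 2.0, 3.0, 4.0, 5.0],
     [2016, 6.0, 7.0, 8.0, 9.0, 10.0]],
    columns=['date', 'value_x', 'value_y', 'value_x', 'value_y', 'value'],
)

# Order the keys by their numeric value: '1' -> 'A', '2' -> 'B', ...
ordered_keys = sorted(data_dict, key=lambda k: int(data_dict[k]))

# Keep 'date' first, then label the value columns with the ordered keys.
com_df.columns = ['date'] + ordered_keys
```

Assigning a full list to `com_df.columns` sidesteps the duplicated `value_x` / `value_y` labels, which a name-based `rename` could not tell apart.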

Answer 1: (score: 0)

I would convert your input dictionary into a mapping from positional index to column name:

data_dict = {'A': '1','B': '2','C': '3','D': '4','E': '5'}
pos_col_dict = {int(v): k for k, v in data_dict.items()}

Then assign to the columns via NumPy. You should work on a copy to avoid side effects:

arr = df.columns.values.copy()
arr[list(pos_col_dict)] = list(pos_col_dict.values())
df.columns = arr
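To try the positional approach end to end, here is a small self-contained sketch. The helper name `rename_by_position` and the sample frame are hypothetical, introduced only for illustration; the sketch assumes every dictionary value is a valid column position in the frame.

```python
import pandas as pd


def rename_by_position(df, data_dict):
    """Return a copy of df in which the column at each position given by a
    dictionary value is renamed to the corresponding dictionary key."""
    pos_col_dict = {int(v): k for k, v in data_dict.items()}
    # Copy the labels so the original Index is never modified in place.
    arr = df.columns.to_numpy().copy()
    arr[list(pos_col_dict)] = list(pos_col_dict.values())
    out = df.copy()
    out.columns = arr
    return out


data_dict = {'A': '1', 'B': '2', 'C': '3', 'D': '4', 'E': '5'}

# Made-up frame with the same column layout as the merged output.
merged = pd.DataFrame(
    [[2017, 1.0, 2.0, 3.0, 4.0, 5.0]],
    columns=['date', 'value_x', 'value_y', 'value_x', 'value_y', 'value'],
)

renamed = rename_by_position(merged, data_dict)
```

Returning a new frame leaves `merged` untouched, so the rename can be repeated or compared against the original labels.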