Question

I have a DataFrame like this:

example_df =

country  id  metric_name  metric_value account_id
US       1   clicks       111          000
UK       2   clicks       222          000
DE       3   clicks       333          000
RU       4   clicks       444          000

and a variable holding the dimensions:

breakdowns = 'country'

I need to replace the header of the 'country' column in example_df with 'metric_key_1'.

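The breakdowns variable may hold other dimension names, not only country. It can contain at most 2 names, e.g. breakdowns = 'age,gender'. If breakdowns = '0', nothing should happen.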

So if breakdowns = 'country', my target result would look like this:

metric_key_1   id  metric_name  metric_value account_id
US             1   clicks       111          000
UK             2   clicks       222          000
DE             3   clicks       333          000
RU             4   clicks       444          000

And if breakdowns = 'age,gender':

metric_key_1  metric_key_2    id  metric_name  metric_value account_id
18-24         female          1   clicks       111          000
25-44         male            2   clicks       222          000
45-65         male            3   clicks       333          000
65-100        female          4   clicks       444          000

What I have done so far:

#  got the columns headers in a list
columns_list = list(example_df)
#  check if breakdowns is empty
if breakdowns == '0':
    pass
else:
    #  split them to a list
    breakdowns = breakdowns.split(',')
    #  substitute names with metric_key_n
    breakdowns1 = ['metric_key_{}'.format(i) for i in range(1, len(breakdowns) + 1)]
    #  here I get an error
    for x in breakdowns1:
        if x not in columns_list:
            for x,y in zip(breakdowns1, columns_list):
                columns_list[x] = y

Error:

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-78-a3a989b94f92> in <module>
      2         if x not in columns_list:
      3             for x,y in zip(breakdowns1, columns_list):
----> 4                 columns_list[x] = y

TypeError: list indices must be integers or slices, not str

I know the solution isn't hard, but I can't figure it out. Any help is appreciated.

Answer 1

I suggest first building a dictionary with a dict comprehension, using split and enumerate, and then renaming the columns with rename:

breakdowns = 'country'
d = {c: 'metric_key_{}'.format(i) for i, c in enumerate(breakdowns.split(','), 1)}
print (d)
{'country': 'metric_key_1'}

example_df = example_df.rename(columns=d)
print (example_df)

  metric_key_1  id metric_name  metric_value  account_id
0           US   1      clicks           111           0
1           UK   2      clicks           222           0
2           DE   3      clicks           333           0
3           RU   4      clicks           444           0
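
The snippet above covers a single breakdown. A minimal sketch of how the same mapping could also handle the '0' case and two breakdowns from the question follows; the helper name breakdown_mapping is hypothetical, and stripping whitespace around the names is an assumption, not something the question asks for.

```python
def breakdown_mapping(breakdowns):
    """Map each breakdown name to 'metric_key_<n>', numbering from 1.

    Hypothetical helper, not part of pandas.
    """
    # '0' is the question's marker for "no breakdowns": rename nothing.
    if breakdowns == '0':
        return {}
    # Assumption: tolerate spaces after commas, e.g. 'age, gender'.
    names = [name.strip() for name in breakdowns.split(',')]
    return {name: 'metric_key_{}'.format(i) for i, name in enumerate(names, 1)}


print(breakdown_mapping('country'))      # {'country': 'metric_key_1'}
print(breakdown_mapping('age, gender'))  # {'age': 'metric_key_1', 'gender': 'metric_key_2'}
print(breakdown_mapping('0'))            # {}
```

The returned dictionary can be passed to rename(columns=...) exactly as above. An empty dictionary leaves the DataFrame unchanged, and by default rename ignores keys that are not column names.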

Dynamically replace column headers in a pandas DataFrame
