How to match column names with dictionary keys and add values to a counter

Time: 2020-06-05 21:29:00

Tags: python pandas loops dataframe dictionary

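I created a dataframe with a binary value in each cell, where each row is a user and each column is a company that the user may or may not have selected, for example: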

company1 company2 company3
1        0        0
0        0        1
0        1        1

I then created a dictionary that classifies each company as a high, medium, or low value company:

{'company1': 'high',
'company2': 'low',
'company3': 'low'}

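Some companies are currently in the dataframe but not in the dictionary, but that should be resolved soon. I want to create variables counting how many times each user selected a high, medium, or low value company. The end result should look like this: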

company1 company2 company3 total_low total_mid total_high
1        0        0        0         0         1
0        0        1        1         0         0
0        1        1        2         0         0

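I started writing a loop to do this, but I am not sure how to match the column names to the dictionary keys/values, or whether a loop is even the most efficient approach (there are roughly 18,000 rows/users and about 100 columns/companies in total).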

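The loop described in the question could look roughly like the sketch below. This is a hypothetical illustration, not the asker's own code: the names `tiers`, `counts`, and the `total_` column prefix are assumptions, and the example data is the three-user dataframe from the question. The loop runs once per company column (about 100 iterations) rather than once per cell, so each step adds a whole 0/1 column at a time.

```python
import pandas as pd

# Example data from the question: rows are users, columns are companies.
df = pd.DataFrame({"company1": [1, 0, 0],
                   "company2": [0, 0, 1],
                   "company3": [0, 1, 1]})

# Company -> tier mapping from the question (variable name is hypothetical).
tiers = {"company1": "high", "company2": "low", "company3": "low"}

# One counter column per tier, starting at zero for every user.
counts = pd.DataFrame(0, index=df.index, columns=["low", "mid", "high"])

for company in df.columns:
    tier = tiers.get(company)      # None when the company is not in the dictionary
    if tier is None:
        continue                   # skip companies that have no tier yet
    counts[tier] += df[company]    # add the whole 0/1 column at once

result = df.join(counts.add_prefix("total_"))
print(result)
```

Using `dict.get` is what matches a column name to a dictionary key here, and it also lets the loop quietly ignore companies that have not been classified yet.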

2 answers:

Answer 0: (score: 2)

One possible approach:

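d = {'company1': 'high',
     'company2': 'low',
     'company3': 'low'}

# Relabel each company column with its tier, sum the columns that share a tier,
# make sure all three tiers are present, and prefix the new columns with 'total_'.
df.join(df.rename(columns=d)
         .T.groupby(level=0).sum().T   # equivalent to groupby(level=0, axis=1), which is deprecated since pandas 2.1
         .reindex(['low','mid','high'], axis=1, fill_value=0)
         .add_prefix('total_')
       )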

Output:

   company1  company2  company3  total_low  total_mid  total_high
0         1         0         0          0          0           1
1         0         0         1          1          0           0
2         0         1         1          2          0           0
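The question mentions that some companies are in the dataframe but not yet in the dictionary. Below is a minimal, self-contained sketch of how the rename-and-group approach treats such a column. The extra `company4` column and the variable names `missing`, `totals`, and `result` are assumptions made for illustration. An unmapped column keeps its own name after `rename`, forms its own group, and is then dropped by `reindex`, so it does not contribute to any total.

```python
import pandas as pd

df = pd.DataFrame({"company1": [1, 0, 0],
                   "company2": [0, 0, 1],
                   "company3": [0, 1, 1],
                   "company4": [1, 1, 1]})   # hypothetical company with no tier yet

d = {"company1": "high", "company2": "low", "company3": "low"}

# Columns that have no entry in the dictionary, worth checking before summing.
missing = df.columns.difference(list(d))

totals = (df.rename(columns=d)            # company names -> tier names
            .T.groupby(level=0).sum().T   # sum the columns that share a label
            .reindex(["low", "mid", "high"], axis=1, fill_value=0)
            .add_prefix("total_"))

result = df.join(totals)
print(list(missing))
print(result)
```

Printing `missing` first makes it easy to see which companies still need a tier before trusting the totals.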

Answer 1: (score: 1)

Not as short as @Quang Hoang's, but another way:

Melt the dataframe

df2=pd.melt(df,  value_vars=['company1', 'company2', 'company3'])

Map the dictionary to create another column, total

df2['total']=df2.variable.map(d)

Pivot on the high/low tier, add the missing medium tier, and join to df

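compa=['low','medium','high']
user=df2.index % len(df)  # melt stacks the companies one after another, so this recovers each value's original row (assumes df has a default 0..n-1 index)
df.join(df2.groupby([user,'total'])['value'].sum().unstack('total', fill_value=0).reindex(compa,axis=1, fill_value=0).add_prefix('total_'))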
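A self-contained sketch of the melt approach, under the assumption that the row label is carried through the melt as a hypothetical `user` column so that the totals are grouped per user rather than per company. The names `long`, `tier`, `chosen`, `totals`, and `result` are illustrative and not from the answer.

```python
import pandas as pd

df = pd.DataFrame({"company1": [1, 0, 0],
                   "company2": [0, 0, 1],
                   "company3": [0, 1, 1]})

d = {"company1": "high", "company2": "low", "company3": "low"}

# Turn the index into a 'user' column, then reshape to one row per (user, company).
long = (df.rename_axis("user").reset_index()
          .melt(id_vars="user", var_name="company", value_name="chosen"))

# Look up each company's tier; companies missing from d get NaN and are
# dropped by groupby below.
long["tier"] = long["company"].map(d)

totals = (long.groupby(["user", "tier"])["chosen"].sum()
              .unstack("tier", fill_value=0)
              .reindex(["low", "medium", "high"], axis=1, fill_value=0)
              .add_prefix("total_"))

result = df.join(totals)   # aligns on the user index
print(result)
```

Grouping on `user` keeps one output row per user, which matters once the number of users and the number of companies differ.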
