Question

I created a dataframe of binary values in which each row is a user and each column is a company that the user may or may not have selected, for example:

company1 company2 company3
1        0        0
0        0        1
0        1        1

I then created a dictionary that classifies each company as high, medium, or low value:

{'company1': 'high',
'company2': 'low',
'company3': 'low'}

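Some companies are currently in the dataframe but not in the dictionary, but that should be resolved soon. I want to create variables that count how many times each user selected a high-, medium-, or low-value company. The end result should look like this: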

company1 company2 company3 total_low total_mid total_high
1        0        0        0         0         1
0        0        1        1         0         0
0        1        1        2         0         0

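I started writing a loop to do this, but I am not sure how to match the column names against the dictionary keys/values, or whether a loop is even the most efficient approach (there are roughly 18,000 rows/users and about 100 columns/companies in total).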

Answer 1

One possible approach:

d = {'company1': 'high',
     'company2': 'low',
     'company3': 'low'}

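df.join(df.rename(columns=d)            # company names -> 'high' / 'low' labels
         # sum columns that share a label; transposing avoids
         # groupby(axis=1), which is deprecated since pandas 2.1
         .T.groupby(level=0).sum().T
         .reindex(['low','mid','high'], axis=1, fill_value=0)
         .add_prefix('total_')
       )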

Output:

   company1  company2  company3  total_low  total_mid  total_high
0         1         0         0          0          0           1
1         0         0         1          1          0           0
2         0         1         1          2          0           0
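To check the approach end to end, here is a minimal self-contained sketch. It assumes the three-user example frame from the question; `company4` is a hypothetical extra column, added only to illustrate that a company missing from the dictionary is left out of the totals. The transpose-based `groupby` is a stand-in for grouping along the columns.

```python
import pandas as pd

# Example data from the question; company4 is a hypothetical unmapped column.
df = pd.DataFrame({
    "company1": [1, 0, 0],
    "company2": [0, 0, 1],
    "company3": [0, 1, 1],
    "company4": [1, 1, 0],
})
d = {"company1": "high", "company2": "low", "company3": "low"}

totals = (
    df.rename(columns=d)              # mapped columns become 'high' / 'low'
      .T.groupby(level=0).sum().T     # sum columns that share a label
      .reindex(["low", "mid", "high"], axis=1, fill_value=0)  # drops company4, adds mid
      .add_prefix("total_")
)
result = df.join(totals)
print(result)
```

Because `reindex` keeps only the three tier labels, any column that was not renamed by the dictionary drops out of `totals` while still appearing in `result` as an original column.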

Answer 2

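Not as short as @Quang Hoang's answer, but here is another way.

Melt the dataframe, keeping each user's row index (ignore_index requires pandas 1.1 or later):

df2 = pd.melt(df, value_vars=['company1', 'company2', 'company3'], ignore_index=False)

Map the dictionary to create another column, total:

df2['total'] = df2['variable'].map(d)

Group by user and tier, pivot the tiers into columns, add the missing medium tier, and join back to df:

compa = ['low', 'medium', 'high']
df.join(df2.groupby([df2.index, 'total'])['value'].sum()
           # grouping by the user index (not by 'variable') gives one row per user
           .unstack('total', fill_value=0)
           .reindex(compa, axis=1, fill_value=0)
           .add_prefix('total_'))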
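For comparison, the column-to-dictionary matching the question asks about can also be written as an explicit loop. This is only a sketch under the question's example data; the tier labels `low`, `mid`, `high` and the `total_` prefix are assumptions. The loop runs once per tier rather than once per user, so it stays cheap even with around 18,000 rows.

```python
import pandas as pd

# Example data from the question.
df = pd.DataFrame({
    "company1": [1, 0, 0],
    "company2": [0, 0, 1],
    "company3": [0, 1, 1],
})
d = {"company1": "high", "company2": "low", "company3": "low"}

result = df.copy()
for tier in ["low", "mid", "high"]:
    # Columns whose dictionary value is this tier; d.get() returns None
    # for companies that are not in the dictionary, so they never match.
    cols = [c for c in df.columns if d.get(c) == tier]
    # Row-wise sum over the matching columns (all zeros when cols is empty).
    result[f"total_{tier}"] = df[cols].sum(axis=1).astype(int)

print(result)
```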

How to match column names with dictionary keys and add values to a counter