I created a DataFrame with a binary value in each cell, where each row is a user and each column is a company that the user may or may not have selected, for example:
company1  company2  company3
       1         0         0
       0         0         1
       0         1         1
Then I created a dictionary that classifies each company as a high-, medium-, or low-value company:
{'company1': 'high',
 'company2': 'low',
 'company3': 'low'}
Some companies are currently in the DataFrame but not in the dictionary, but that should be resolved soon. I want to create variables that count how many times each user selected high-, medium-, or low-value companies.
I started writing a loop to do this, but I'm not sure how to match the column names to the dictionary keys/values, or whether that is even the most efficient approach (there are about 18,000 rows/users and roughly 100 columns/companies in total).
Answer 0 (score: 2)
One possible approach:
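d = {'company1': 'high',
     'company2': 'low',
     'company3': 'low'}
# Rename each company column to its tier, then sum the same-named columns per user.
# Grouping the transposed frame replaces groupby(level=0, axis=1), which is deprecated in recent pandas.
df.join(df.rename(columns=d)
          .T.groupby(level=0).sum().T
          # keep exactly these three tiers, creating any missing one as 0
          .reindex(['low','mid','high'], axis=1, fill_value=0)
          .add_prefix('total_')
       )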
Output:
   company1  company2  company3  total_low  total_mid  total_high
0         1         0         0          0          0           1
1         0         0         1          1          0           0
2         0         1         1          2          0           0
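For reference, the same rename-and-sum idea can be wrapped in a small self-contained function. This is only a sketch under assumptions: the helper name add_tier_totals, the extra company4 column, and the sample frame are made up for illustration and are not part of the original answer. It groups on the transposed frame so that it does not rely on the deprecated axis=1 argument of groupby.

```python
import pandas as pd

def add_tier_totals(df, tiers, order=("low", "mid", "high")):
    """Hypothetical helper: append one total_<tier> column per value tier."""
    # Look up the tier of every company column; companies missing from
    # the dictionary map to NaN and are left out of the totals.
    tier_of = df.columns.to_series().map(tiers)
    # Group the transposed frame by tier (rows are now companies),
    # sum within each tier, then transpose back to one row per user.
    totals = df.T.groupby(tier_of).sum().T
    # Guarantee a column for every tier, even ones no company belongs to.
    totals = totals.reindex(columns=list(order), fill_value=0).add_prefix("total_")
    return df.join(totals)

# Sample data; company4 is an assumed extra column with no dictionary entry.
df = pd.DataFrame({
    "company1": [1, 0, 0],
    "company2": [0, 0, 1],
    "company3": [0, 1, 1],
    "company4": [1, 1, 1],
})
d = {"company1": "high", "company2": "low", "company3": "low"}

result = add_tier_totals(df, d)
print(result)
```

Because unmapped companies are simply skipped, the totals stay valid while the dictionary is still incomplete; whether silently skipping them is the right behaviour depends on the use case.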
Answer 1 (score: 1)
Not as short as @Quang Hoang's, but here is another way:
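Melt the DataFrame, keeping the original row index so that every value stays tied to its user
df2=pd.melt(df, value_vars=['company1', 'company2', 'company3'], ignore_index=False)
Map the dictionary to create another column, total
df2['total']=df2.variable.map(d)
Sum per user and tier, pivot the tiers (high, low) into columns, add the missing medium column, and join to df
compa=['low','medium','high']
df.join(df2.groupby([df2.index, 'total'])['value'].sum().unstack('total', fill_value=0).reindex(compa, axis=1, fill_value=0).add_prefix('total_'))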
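As a cross-check for the melt-based result, the per-user totals can also be computed with a plain loop over the tiers, which is close in spirit to the loop mentioned in the question. This is a minimal sketch, assuming the three-company sample frame and dictionary from the question; the function name totals_by_direct_sum is hypothetical.

```python
import pandas as pd

def totals_by_direct_sum(df, tiers, order=("low", "medium", "high")):
    """Hypothetical helper: per-user totals, one tier at a time."""
    out = pd.DataFrame(index=df.index)
    for tier in order:
        # Columns whose dictionary entry equals this tier; companies
        # missing from the dictionary never match and are ignored.
        cols = [c for c in df.columns if tiers.get(c) == tier]
        # Row-wise sum over those columns (0 when the tier has no columns).
        out[f"total_{tier}"] = df[cols].sum(axis=1).astype(int)
    return out

df = pd.DataFrame({
    "company1": [1, 0, 0],
    "company2": [0, 0, 1],
    "company3": [0, 1, 1],
})
d = {"company1": "high", "company2": "low", "company3": "low"}

check = totals_by_direct_sum(df, d)
print(df.join(check))
```

The loop runs once per tier rather than once per user, so it should remain cheap at roughly 18,000 rows and 100 columns.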