I'm a beginner in Python, and I have a large DataFrame that looks like this:
import pandas as pd
df = pd.DataFrame({'Total': [10, 10, 10, 10, 10, 10, 10, 10, 10, 10], \
'Type': ['Child', 'Boy', 'Girl', 'Senior', '', '', '', '', '', ''], \
'Count': [4, 5, 1, 0, '', '', '', '', '', '']})
df = df[["Total", "Type", "Count"]]  # reorder the columns
df
Output:
   Total    Type  Count
0     10   Child      4
1     10     Boy      5
2     10    Girl      1
3     10  Senior      0
4     10
5     10
6     10
7     10
8     10
9     10
I would like to get something like this:
   Total    Type  Count    New
0     10   Child      4  Child
1     10     Boy      5  Child
2     10    Girl      1  Child
3     10  Senior      0  Child
4     10                   Boy
5     10                   Boy
6     10                   Boy
7     10                   Boy
8     10                   Boy
9     10                  Girl
I don't know how to create a new column that repeats each Type value n times, where n is the corresponding Count.
Thanks!
Answer 0 (score: 6)
Use repeat, after using replace to turn the blanks in Count into 0:
# astype(int) ensures the repeat counts are integers after the blanks are replaced
df['New'] = df.Type.repeat(df.Count.replace('', 0).astype(int)).values
df
Out[657]:
   Count  Total    Type    New
0      4     10   Child  Child
1      5     10     Boy  Child
2      1     10    Girl  Child
3      0     10  Senior  Child
4            10            Boy
5            10            Boy
6            10            Boy
7            10            Boy
8            10            Boy
9            10           Girl
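For anyone wondering why the .values at the end is needed, here is a minimal sketch of what Series.repeat does on its own. The variable names (types, repeated) are just for illustration and the counts are passed as a plain list; the point is that the index labels get repeated along with the values, so assigning the result without .values would align on those duplicated labels instead of filling rows 0 to 9 in order.

```python
import pandas as pd

# Illustrative data only: the four non-blank Type values from the question
types = pd.Series(['Child', 'Boy', 'Girl', 'Senior'])

# Each element is repeated by the matching count; a count of 0 drops it
repeated = types.repeat([4, 5, 1, 0])

print(repeated.tolist())
# ['Child', 'Child', 'Child', 'Child', 'Boy', 'Boy', 'Boy', 'Boy', 'Boy', 'Girl']

# The original index labels are repeated too, hence the duplicates
print(repeated.index.tolist())
# [0, 0, 0, 0, 1, 1, 1, 1, 1, 2]
```

Note that this only lines up with the DataFrame because the counts add up to the number of rows (4 + 5 + 1 + 0 = 10); with any other total the assignment would fail on a length mismatch.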
Answer 1 (score: 2)
Not sure if this is the fastest way, but it is simple:
from itertools import chain
import pandas as pd
df = pd.DataFrame({'Total': [10, 10, 10, 10, 10, 10, 10, 10, 10, 10], \
'Type': ['Child', 'Boy', 'Girl', 'Senior', '', '', '', '', '', ''], \
'Count': [4, 5, 1, 0, '', '', '', '', '', '']})
df['New'] = list(chain.from_iterable([t] * c for t, c in zip(df.Type, df.Count) if c))
print(df)
Output:
   Count  Total    Type    New
0      4     10   Child  Child
1      5     10     Boy  Child
2      1     10    Girl  Child
3      0     10  Senior  Child
4            10            Boy
5            10            Boy
6            10            Boy
7            10            Boy
8            10            Boy
9            10           Girl
Answer 2 (score: 2)
Try the code below. I multiply df['Type'] by df['Count'], then flatten the resulting list, and then create a new column from the flat list:
import numpy as np
import pandas as pd
df = pd.DataFrame({'Total': [10, 10, 10, 10, 10, 10, 10, 10, 10, 10], \
'Type': ['Child', 'Boy', 'Girl', 'Senior', '', '', '', '', '', ''], \
'Count': [4, 5, 1, 0, '', '', '', '', '', '']})
# skip the blank rows, whose Count is the empty string
dropped = [((x + ' ') * y).split() for x, y in zip(df['Type'], df['Count']) if not isinstance(y, str)]
df['New'] = sum(dropped, [])
print(df)
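One caveat with the string-multiply-and-split trick above: it relies on the Type values containing no whitespace. A small sketch of the difference, using a made-up value ('Young Adult' is hypothetical and does not appear in the question's data), next to plain list multiplication, which keeps such values intact:

```python
# 'Young Adult' is a hypothetical value used only to show the whitespace issue
types = ['Child', 'Young Adult']
counts = [2, 1]

# String multiplication followed by split() breaks multi-word values apart
via_split = sum([((t + ' ') * c).split() for t, c in zip(types, counts)], [])

# List multiplication repeats the value as a whole
via_list = sum([[t] * c for t, c in zip(types, counts)], [])

print(via_split)  # ['Child', 'Child', 'Young', 'Adult']
print(via_list)   # ['Child', 'Child', 'Young Adult']
```

For the data in the question both variants give the same result, so this only matters if the real DataFrame has multi-word types.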
Answer 3 (score: 1)
Try this:
df['New']= sum((df[df['Type']!=''].apply(lambda x: x['Count']*[x['Type']],axis=1)).values,[])
Output:
   Count  Total    Type    New
0      4     10   Child  Child
1      5     10     Boy  Child
2      1     10    Girl  Child
3      0     10  Senior  Child
4            10            Boy
5            10            Boy
6            10            Boy
7            10            Boy
8            10            Boy
9            10           Girl
Answer 4 (score: 1)
Here is one way using itertools.chain and itertools.repeat:
from itertools import chain, repeat
# calculate number of non-blank rows
n = (df['Type'] != '').sum()
# extract values for these rows
vals = df[['Type', 'Count']].iloc[:n].values
# iterate and repeat values
df['New'] = list(chain.from_iterable(repeat(*row) for row in vals))
print(df)
   Count  Total    Type    New
0      4     10   Child  Child
1      5     10     Boy  Child
2      1     10    Girl  Child
3      0     10  Senior  Child
4            10            Boy
5            10            Boy
6            10            Boy
7            10            Boy
8            10            Boy
9            10           Girl
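In case the repeat(*row) part looks cryptic: each row is a (Type, Count) pair, and unpacking it calls itertools.repeat(value, times). A stripped-down sketch with a hand-written list of pairs (rows and flat are illustrative names, not part of the answer's code):

```python
from itertools import chain, repeat

# Hand-written (value, times) pairs standing in for the Type/Count rows
rows = [('Child', 2), ('Boy', 3), ('Senior', 0)]

# repeat(*row) is repeat('Child', 2), repeat('Boy', 3), and so on;
# chain.from_iterable then concatenates the iterators into one sequence
flat = list(chain.from_iterable(repeat(*row) for row in rows))

print(flat)  # ['Child', 'Child', 'Boy', 'Boy', 'Boy']
```

A pair with a count of 0 simply contributes nothing, which is why 'Senior' never shows up in the New column.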