Question

I'm a Python beginner, and I have a large DataFrame that looks like this:

import pandas as pd
df = pd.DataFrame({'Total': [10, 10, 10, 10, 10, 10, 10, 10, 10, 10], \
                    'Type': ['Child', 'Boy', 'Girl', 'Senior', '', '', '', '', '', ''], \
                    'Count': [4, 5, 1, 0, '', '', '', '', '', '']})
df = df[["Total", "Type", "Count"]]
df

Output:

   Total    Type Count
0     10   Child     4
1     10     Boy     5
2     10    Girl     1
3     10  Senior     0
4     10
5     10
6     10
7     10
8     10
9     10

I would like to end up with something like this:

    Total   Type    Count   New
0   10     Child       4    Child
1   10       Boy       5    Child
2   10      Girl       1    Child
3   10    Senior       0    Child
4   10                      Boy
5   10                      Boy
6   10                      Boy
7   10                      Boy
8   10                      Boy
9   10                      Girl

I don't know how to create a new column that repeats each Type value n times, where n is the value in Count for that row.

Thanks!

Answer 1

Use repeat, after using replace to turn the blanks in Count into 0:

df['New'] = df['Type'].repeat(df['Count'].replace('', 0).astype(int)).values
df
Out[657]: 
  Count  Total    Type    New
0     4     10   Child  Child
1     5     10     Boy  Child
2     1     10    Girl  Child
3     0     10  Senior  Child
4           10            Boy
5           10            Boy
6           10            Boy
7           10            Boy
8           10            Boy
9           10           Girl
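
If Count might hold other non-numeric placeholders besides the empty string, a variant of the same idea is to convert the column with pd.to_numeric before repeating. This is only a sketch: the 'n/a' entry and the variable name counts are assumptions added for illustration, and it still relies on the counts summing to the number of rows.

```python
import pandas as pd

df = pd.DataFrame({
    'Total': [10] * 10,
    'Type': ['Child', 'Boy', 'Girl', 'Senior', '', '', '', '', '', ''],
    # 'n/a' is a made-up placeholder to show that any non-numeric entry is handled
    'Count': [4, 5, 1, 0, '', '', 'n/a', '', '', ''],
})

# Non-numeric entries become NaN, then 0, so they contribute no repetitions.
counts = pd.to_numeric(df['Count'], errors='coerce').fillna(0).astype(int)

# Repeat each Type by its count; to_numpy() discards the repeated index so
# the values are assigned by position rather than aligned by label.
df['New'] = df['Type'].repeat(counts.to_numpy()).to_numpy()
print(df)
```

Going through to_numpy() matters here because repeat keeps the original index labels, and assigning the Series directly would align on those duplicated labels.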

Answer 2

Not sure whether this is the fastest way, but it's simple:

from itertools import chain
import pandas as pd

df = pd.DataFrame({'Total': [10, 10, 10, 10, 10, 10, 10, 10, 10, 10], \
                    'Type': ['Child', 'Boy', 'Girl', 'Senior', '', '', '', '', '', ''], \
                    'Count': [4, 5, 1, 0, '', '', '', '', '', '']})
df['New'] = list(chain.from_iterable([t] * c for t, c in zip(df.Type, df.Count) if c))
print(df)

Output:

  Count  Total    Type    New
0     4     10   Child  Child
1     5     10     Boy  Child
2     1     10    Girl  Child
3     0     10  Senior  Child
4           10            Boy
5           10            Boy
6           10            Boy
7           10            Boy
8           10            Boy
9           10           Girl
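
Note that assigning a plain list only works when the counts add up to exactly the number of rows (here 4 + 5 + 1 + 0 = 10); otherwise pandas raises a length mismatch error. A minimal sketch of one way to guard against that, assuming you want short results padded with blanks and long results truncated. The helper name expand_to_length and the small example frame are hypothetical.

```python
from itertools import chain, islice

import pandas as pd


def expand_to_length(types, counts, length, fill=''):
    """Repeat each type by its count, then pad or truncate to `length`."""
    # `if c` skips both 0 and the blank string ''
    pairs = ((t, c) for t, c in zip(types, counts) if c)
    repeated = chain.from_iterable([t] * c for t, c in pairs)
    values = list(islice(repeated, length))  # truncate if too long
    return values + [fill] * (length - len(values))  # pad if too short


# Counts sum to 3, but the frame has 4 rows, so one blank is appended.
df = pd.DataFrame({'Type': ['Child', 'Boy', '', ''],
                   'Count': [1, 2, '', '']})
df['New'] = expand_to_length(df['Type'], df['Count'], len(df))
print(df)
```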

Answer 3

Try the code below. I multiply df['Type'] by df['Count'], flatten the resulting list, and then create a new column from the flat list:

import pandas as pd

df = pd.DataFrame({'Total': [10, 10, 10, 10, 10, 10, 10, 10, 10, 10],
                   'Type': ['Child', 'Boy', 'Girl', 'Senior', '', '', '', '', '', ''],
                   'Count': [4, 5, 1, 0, '', '', '', '', '', '']})
# Repeat each Type Count times, skipping rows whose Count is a blank string
dropped = [[x] * y for x, y in zip(df['Type'], df['Count']) if not isinstance(y, str)]
# Flatten the list of lists and assign it as the new column
df['New'] = sum(dropped, [])
print(df)

Answer 4

Try this:

df['New']= sum((df[df['Type']!=''].apply(lambda x: x['Count']*[x['Type']],axis=1)).values,[])

Output:

  Count  Total    Type    New
0     4     10   Child  Child
1     5     10     Boy  Child
2     1     10    Girl  Child
3     0     10  Senior  Child
4           10            Boy
5           10            Boy
6           10            Boy
7           10            Boy
8           10            Boy
9           10           Girl

Answer 5

Here is one way using itertools.chain and itertools.repeat:

from itertools import chain, repeat

# calculate number of non-blank rows
n = (df['Type'] != '').sum()

# extract values for these rows
vals = df[['Type', 'Count']].iloc[:n].values

# iterate and repeat values
df['New'] = list(chain.from_iterable(repeat(*row) for row in vals))

print(df)

  Count  Total    Type    New
0     4     10   Child  Child
1     5     10     Boy  Child
2     1     10    Girl  Child
3     0     10  Senior  Child
4           10            Boy
5           10            Boy
6           10            Boy
7           10            Boy
8           10            Boy
9           10           Girl
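
The iloc[:n] slice above relies on all non-blank rows sitting at the top of the frame. If blanks could be interleaved, a boolean mask picks the same rows regardless of position. A rough sketch under that assumption; the name mask and the small example frame are made up for illustration.

```python
from itertools import chain, repeat

import pandas as pd

# Blank rows are interleaved here, so slicing the first n rows would not work.
df = pd.DataFrame({'Type': ['Child', '', 'Boy', ''],
                   'Count': [1, '', 3, '']})

# Select the non-blank rows by condition instead of by position
mask = df['Type'] != ''
vals = df.loc[mask, ['Type', 'Count']].to_numpy()

# repeat(t, n) yields t exactly n times; chain.from_iterable flattens the result
df['New'] = list(chain.from_iterable(repeat(t, int(c)) for t, c in vals))
print(df)
```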

How do I create a new column in Pandas that conditionally repeats the values of another column?