Pandas - Replace text in a column with dummy values

Time: 2020-08-10 14:07:42

Tags: pandas

I have a DataFrame that contains some customer information, as shown below:

Customer Name, Purchase Date
Kevin, 2020-01-10
Scott, 2020-02-01
Mark, 2020-04-01
Peter, 2020-06-12

I want to replace the Customer Name column with dummy values such as "Customer 1", "Customer 2", and so on. Expected output:

Customer Name, Purchase Date
Customer 1, 2020-01-10
Customer 2, 2020-02-01
Customer 3, 2020-04-01
Customer 4, 2020-06-12

I would like the numbering to be based on the DataFrame's shape.

3 answers:

Answer 0 (score: 3)

If all values are unique, use the index values converted to strings:

df['Customer Name'] = 'Customer ' + (df.index + 1).astype(str)
print (df)
  Customer Name Purchase Date
0    Customer 1    2020-01-10
1    Customer 2    2020-02-01
2    Customer 3    2020-04-01
3    Customer 4    2020-06-12

If you need to number the unique values of the Customer Name column instead, use factorize. The two approaches give different results when the column contains duplicated values:

s = pd.Series((pd.factorize(df['Customer Name'])[0] + 1), index=df.index).astype(str)
df['Customer Name'] = 'Customer ' + s
print (df)
  Customer Name Purchase Date
0    Customer 1    2020-01-10
1    Customer 2    2020-02-01
2    Customer 3    2020-04-01
3    Customer 4    2020-06-12
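
To make the difference with duplicated values concrete, here is a minimal sketch. The sample data with a repeated customer is hypothetical and not taken from the question, and the variable names by_index and by_name are illustrative only.

```python
import pandas as pd

# Hypothetical sample data in which "Kevin" appears twice (not from the question).
df = pd.DataFrame({
    "Customer Name": ["Kevin", "Scott", "Kevin", "Peter"],
    "Purchase Date": ["2020-01-10", "2020-02-01", "2020-04-01", "2020-06-12"],
})

# Index-based numbering: every row gets its own number,
# so the two Kevin rows receive different labels.
by_index = "Customer " + pd.Series(df.index + 1, index=df.index).astype(str)

# factorize-based numbering: rows that share a name share a number,
# assigned in order of first appearance.
codes = pd.factorize(df["Customer Name"])[0] + 1
by_name = "Customer " + pd.Series(codes, index=df.index).astype(str)

print(by_index.tolist())  # ['Customer 1', 'Customer 2', 'Customer 3', 'Customer 4']
print(by_name.tolist())   # ['Customer 1', 'Customer 2', 'Customer 1', 'Customer 3']
```

Which one is appropriate depends on whether a repeated customer should keep the same placeholder across rows (factorize) or whether each row simply needs a distinct label (index).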

Answer 1 (score: 2)

Try factorize:

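# index=df.index keeps the new labels aligned with df even when its index is not 0..n-1
df['Customer Name'] = 'Customer ' + pd.Series(df['Customer Name'].factorize()[0] + 1, index=df.index).astype(str)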
df
Out[11]: 
  Customer Name  Purchase Date
0    Customer 1     2020-01-10
1    Customer 2     2020-02-01
2    Customer 3     2020-04-01
3    Customer 4     2020-06-12
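
One thing to watch when wrapping the factorize codes in a new Series: pandas aligns on index labels when assigning a Series to a column. The sketch below assumes a hypothetical frame whose index is not 0..n-1 (for example after filtering rows), and the column names unaligned and aligned exist only for illustration.

```python
import pandas as pd

# Hypothetical frame with a non-default index (not from the question).
df = pd.DataFrame(
    {"Customer Name": ["Kevin", "Scott", "Mark", "Peter"]},
    index=[10, 20, 30, 40],
)

codes = df["Customer Name"].factorize()[0] + 1

# Without index=, the new Series is labelled 0..3 and shares no labels with df.
unaligned = "Customer " + pd.Series(codes).astype(str)

# Passing index=df.index keeps each label on its own row.
aligned = "Customer " + pd.Series(codes, index=df.index).astype(str)

df["unaligned"] = unaligned  # no matching index labels, so every row is missing
df["aligned"] = aligned
print(df)
```

With the default RangeIndex used in the question both forms behave the same, so this only matters for frames that have been filtered, sorted, or re-indexed.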

Answer 2 (score: 2)

Try using sklearn's LabelEncoder:

from sklearn.preprocessing import LabelEncoder
customer = LabelEncoder().fit_transform(df['Customer Name'].values)
# LabelEncoder codes start at 0 and follow the sorted order of the names, so add 1 for 1-based labels
df['Customer Name'] = customer + 1
df['Customer Name'] = 'Customer ' + df['Customer Name'].astype(str)
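
Note that LabelEncoder numbers the distinct names by their sorted order, starting from 0, rather than by order of appearance, so the labels may not follow the row order shown in the expected output. The sketch below illustrates that ordering with pd.factorize(..., sort=True) as a pandas-only stand-in; using it in place of LabelEncoder is an assumption made here to avoid the sklearn dependency.

```python
import pandas as pd

# The names from the question, in their original row order.
names = pd.Series(["Kevin", "Scott", "Mark", "Peter"])

# Codes in order of first appearance: Kevin=0, Scott=1, Mark=2, Peter=3.
appearance_codes = pd.factorize(names)[0]

# Codes by sorted order of the names: Kevin=0, Mark=1, Peter=2, Scott=3.
sorted_codes = pd.factorize(names, sort=True)[0]

by_appearance = "Customer " + pd.Series(appearance_codes + 1).astype(str)
by_sorted = "Customer " + pd.Series(sorted_codes + 1).astype(str)

print(by_appearance.tolist())  # ['Customer 1', 'Customer 2', 'Customer 3', 'Customer 4']
print(by_sorted.tolist())      # ['Customer 1', 'Customer 4', 'Customer 2', 'Customer 3']
```

If the placeholders only need to hide the real names, either ordering works; if they must match the row order, numbering by first appearance is the safer choice.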