Question

Hi everyone, I have the following DataFrame:

  WM               WH          WP            LC_REF
0 Tesla        League       Test            DT 17 1C
1 Merc         Fandom       Tundra          DT 17 1C
2 Fellaine      Fark           ''           DT 17 1C
3 SeaWorld        ''           ''           DT 17 1C
4 Rectigy         ''           ''           DT 17 1C
5 Donfae          ''           ''           DT 17 1C

My code is:

for num in range(len(df)):
    df = df.groupby('LC_REF',sort=False).agg(lambda x: ','.join(x.astype(str).str.upper()).replace(' ','')).stack().rename_axis(('LC_REF','a')).reset_index(name='vals')

which produces this:

  LC_REF            a            vals
0 DT 17 1C         WM            Tesla,Merc,Fellaine,Seaworld,Rectigy,Donfae
1 DT 17 1C         WH            League, Fandom, Fark,,,
2 DT 17 1C         WP            Test,Tundra,,,,

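Is there a way to remove the extra commas at the end? Ideally this would happen somewhere in my code: since it is a groupby, I would like it to drop the empty string values, so that the result looks like this: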

  LC_REF            a            vals
0 DT 17 1C         WM            Tesla,Merc,Fellaine,Seaworld,Rectigy,Donfae
1 DT 17 1C         WH            League, Fandom, Fark
2 DT 17 1C         WP            Test,Tundra

Answer 1

Try this:

df.vals.apply(lambda x: x[:x.find(',,')])

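This finds the first occurrence of ',,' and keeps only the text before it. It also works when there is just a single comma at the end: find then returns -1, and x[:-1] drops that last character. Note that, for the same reason, a value with no trailing comma at all (such as the WM row) loses its last character, so apply it only to values that actually end in a comma. The expression also returns a new Series, so assign the result back to df['vals'] to keep it.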
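As an alternative, here is a minimal sketch that skips the empty strings during the aggregation itself, so no trailing commas are produced in the first place. The sample data is reconstructed from the question, and it assumes the blank cells are empty strings rather than NaN. The helper name join_non_empty is hypothetical, and the upper-casing from the question's code is left out to match the desired output shown there.

```python
import pandas as pd

# Sample data reconstructed from the question; blank cells are assumed to be "".
df = pd.DataFrame({
    "WM": ["Tesla", "Merc", "Fellaine", "SeaWorld", "Rectigy", "Donfae"],
    "WH": ["League", "Fandom", "Fark", "", "", ""],
    "WP": ["Test", "Tundra", "", "", "", ""],
    "LC_REF": ["DT 17 1C"] * 6,
})

def join_non_empty(col):
    # Hypothetical helper: drop empty/whitespace-only values, then join with commas.
    return ",".join(v for v in col.astype(str) if v.strip())

out = (
    df.groupby("LC_REF", sort=False)
      .agg(join_non_empty)            # one joined string per column per group
      .stack()                        # columns WM/WH/WP become rows
      .rename_axis(["LC_REF", "a"])
      .reset_index(name="vals")
)

# If the joined strings already exist, str.rstrip(",") removes only the
# trailing commas and leaves values without trailing commas untouched.
trimmed = pd.Series(["League,Fandom,Fark,,,", "Tesla,Merc"]).str.rstrip(",")
```

Filtering before the join also handles empty values in the middle of a group, which trimming the end of the string would not.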
