Question

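I have a pandas DataFrame containing spam emails.

I want to create an additional column next to the message column that shows the word count of each message.

For example: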

Index      Content           Amount of words
0          Hi I am cool      4
1          What up?          2
2          Are you happy?    3

I can calculate the word count of each message:

count = data['INHALT'].str.split().str.len()
count.index = count.index.astype(str) + ' words:'

However, when I try to add it to my DataFrame as a column, only NaN values show up. Why? How can I fix this?

Answer 1

You can use the append method to add a new row:

df = df.append(new_row, ignore_index=True)
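`DataFrame.append` was deprecated in pandas 1.4 and removed in pandas 2.0, so on current versions the same row append could be sketched with `pd.concat` instead. The sample data and the `new_row` dict below are made up for illustration. Note that this adds a row, not a column.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"Content": ["Hi I am cool", "What up?"]})
new_row = {"Content": "Are you happy?"}

# Wrap the row in a one-row DataFrame and concatenate it onto the original.
# ignore_index=True renumbers the result 0..n-1, like append(..., ignore_index=True) did.
df = pd.concat([df, pd.DataFrame([new_row])], ignore_index=True)
```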

Answer 2

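Ah, I think I did it wrong.

Because I used the code

my_dataframe.append(row_to_append, ignore_index=True)

I added each number as a new row. What I actually want is a column named "Amount of words" that holds the number for each message in that message's row.

The problem is that the numbers still show up as NaN values.

This is the column I want to add: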

DOC_ID
0 words:     1125
1 words:      745
2 words:     1874
3 words:     1129
4 words:     1614
             ... 
78 words:    1649
79 words:     872
80 words:    1624
81 words:     866
82 words:    1327
Name: INHALT, Length: 83, dtype: int64

I just want the numbers added as an "Amount of words" column next to the "Content" column.

Here is the example again:

Index      Content           Amount of words
0          Hi I am cool       4
1          What up?           2
2          Are you happy?     3

I hope this is clearer now.
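A likely cause of the NaN values, going by the printed Series: pandas aligns a Series to the DataFrame by index label when assigning it as a column. After the index is relabelled to strings such as "0 words:", none of the labels match the DataFrame's index, so every row ends up NaN. The sketch below reproduces this on made-up data (only the column name INHALT is taken from the question) and shows that assigning the Series with its original index works.

```python
import pandas as pd

# Hypothetical sample data; only the column name INHALT comes from the question.
data = pd.DataFrame({"INHALT": ["Hi I am cool", "What up?", "Are you happy?"]})

# Word count per message. This Series shares the DataFrame's index (0, 1, 2).
count = data["INHALT"].str.split().str.len()

# Relabelling the index as in the question breaks the alignment:
# the labels "0 words:", "1 words:", ... do not exist in data.index.
relabelled = count.copy()
relabelled.index = relabelled.index.astype(str) + " words:"
misaligned = data.assign(**{"Amount of words": relabelled})  # column is all NaN

# Assigning the Series with its original index aligns row by row.
data["Amount of words"] = count
```

If the relabelled index is still wanted for display, keeping it on a separate copy (as `relabelled` does here) leaves `count` usable for the column assignment.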
