Question

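After using groupby on a DataFrame to group the column data and sum it into a Series, I use the .to_frame method to convert the result back to a DataFrame, and then convert it to HTML for output to a file. This seems to work fine, except for a zero in the last column of the header row that I cannot remove. Any ideas? See here:

0 Board Type NE Type Hardware Version Software Version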


   NE_3 = NE_2.groupby(NE_2.columns.tolist(), as_index=False).size()
   NE_3 = NE_3.to_frame()

   NE_2 = NE_2.drop_duplicates()
   NE_3 = NE_3.drop(columns='NE Type') # This doesn't work due to the '0' corrupting the header row
   html_txt = NE_3.to_html()
   tfile.write(html_txt)
   tfile.write('<br/>')
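
A minimal reproduction may make the symptom easier to see. The sketch below assumes a small made-up table (only the column names come from the header above; the values are hypothetical) and groups without as_index=False, so that size() returns a Series regardless of pandas version.

```python
import pandas as pd

# Hypothetical data; only the column names come from the question.
NE_2 = pd.DataFrame({
    "Board Type": ["A", "A", "B"],
    "NE Type": ["X", "X", "Y"],
})

# size() on a plain groupby returns a Series whose index holds the group keys.
counts = NE_2.groupby(NE_2.columns.tolist()).size()

# An unnamed Series becomes a one-column frame whose column label is the integer 0.
NE_3 = counts.to_frame()

print(NE_3.columns.tolist())  # the single column label
print(NE_3.index.names)       # the group keys live in the index, not in columns

html_txt = NE_3.to_html()
```

If this matches the real data, the group keys are index levels rather than columns, which would explain both the 0 sitting on its own header row in the HTML and why drop(columns='NE Type') cannot find a column by that name.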

Answer 1

Try dropping the column by its label, written exactly as it is stored (the string '0' or the integer 0).

If the last column's name is the integer 0, you can try -

NE_2 = NE_2.drop([0], axis=1)
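
The difference between the two label types can be sketched as follows. The two frames here are hypothetical stand-ins, not the asker's data; the point is only that the label passed to drop has to match the type of the stored label.

```python
import pandas as pd

# Hypothetical frames: one with the integer label 0, one with the string label '0'.
int_label = pd.DataFrame({"NE Type": ["X", "Y"], 0: [2, 1]})
str_label = pd.DataFrame({"NE Type": ["X", "Y"], "0": [2, 1]})

# The label passed to drop must match the stored label, including its type.
without_int = int_label.drop([0], axis=1)
without_str = str_label.drop(["0"], axis=1)

print(without_int.columns.tolist())
print(without_str.columns.tolist())
```

Note that dropping this column also discards the group sizes it holds, so this only helps if the counts are not needed in the output.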

Answer 2

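The simplest way is to write the DataFrame back out as a CSV file and then re-read it - this resolves the displacement in the header row. The '0' column can then simply be renamed -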

NE_3 = NE_3.rename(columns={'0': 'Total'})
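
A rough sketch of that round trip, assuming hypothetical data and using an in-memory buffer in place of a real CSV file. The last line shows an alternative that is not part of this answer: Series.reset_index(name=...) names the count column directly and should avoid the round trip altogether.

```python
import io
import pandas as pd

# Hypothetical data; only the column names come from the question.
NE_2 = pd.DataFrame({
    "Board Type": ["A", "A", "B"],
    "NE Type": ["X", "X", "Y"],
})
NE_3 = NE_2.groupby(NE_2.columns.tolist()).size().to_frame()

# Round trip through CSV text; the buffer stands in for a file on disk.
buffer = io.StringIO(NE_3.to_csv())
NE_3 = pd.read_csv(buffer)

# After re-reading, the group keys are ordinary columns and the label is the string '0'.
NE_3 = NE_3.rename(columns={"0": "Total"})
print(NE_3.columns.tolist())

# Alternative (not from the answer): name the count column while resetting the index.
direct = NE_2.groupby(NE_2.columns.tolist()).size().reset_index(name="Total")
print(direct.columns.tolist())
```

Either way the result has a single header row, so to_html should no longer show a stray 0.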

Series to DataFrame conversion problem

2 answers: