Question

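I have a CSV file that I've loaded into a DataFrame and am using to create an XlsxWriter file, as part of a training I'm putting together. Well, there are actually two CSV files. The first works perfectly; the other is odd. Many of its columns report a "length" of more than 20, but when you look at the CSV, the values are only about 7-10 characters long.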

Original code:

import pandas as pd


def get_col_widths(df: pd.DataFrame):
    """
    Take a DataFrame and return a list of column widths for autofit.
    """
    cols = df.columns
    # Length of each column header.
    title_len_list = [len(col) for col in cols]
    # Length of the longest stringified value in each column.
    col_len_list = []
    for col in cols:
        col_len_list.append(df[col].astype(str).map(len).max())
    # https://stackoverflow.com/a/40948355/10474024
    # Take the larger of header and value length, plus 4 characters of padding.
    final_list = [max(val) + 4 for val in zip(title_len_list, col_len_list)]
    return final_list

Code I tested a different way, in case the above was wrong, but it gave the same answer:

col_widths = []
for col in df2.columns:
    col_widths.append(df2[col].apply(str).apply(lambda x: x.strip()).apply(len).max())
print(col_widths)
# [1, 1, 21, 23, 20, 22, 20, 21, 21, 22, 21, 21, 22, 22, 20, 22, 21, 21, 21, 22, 23, 22, 23, 22, 23, 21, 23, 23, 22, 23, 23, 23, 21, 23, 1]

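There is no error message; it just makes the columns in the generated XLSX extremely wide. I tried to search for whether there might be hidden characters, but I don't know how to tell.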
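One possible way to check for hidden characters is to print the `repr()` of the longest value in each column together with its Python type. This is only a minimal diagnostic sketch: `longest_values` and the sample data are hypothetical names made up for illustration, not part of the original code. `repr()` escapes non-printable characters such as zero-width spaces (which `strip()` does not remove), and the type column shows whether a long "string" is really a float being written out at full precision.

```python
import pandas as pd


def longest_values(df: pd.DataFrame) -> pd.DataFrame:
    """Report, per column, the longest stringified value, its repr, and its type.

    Hypothetical helper for illustration only.
    """
    rows = []
    for col in df.columns:
        values = df[col].tolist()          # plain Python objects
        texts = [str(v) for v in values]   # what len() is measured on
        if not texts:                      # skip empty columns
            continue
        # Position of the longest stringified value in this column.
        pos = max(range(len(texts)), key=lambda i: len(texts[i]))
        rows.append({
            "column": col,
            "max_len": len(texts[pos]),
            "repr": repr(texts[pos]),      # escapes invisible characters
            "type": type(values[pos]).__name__,
        })
    return pd.DataFrame(rows)


# Made-up sample data: one column with zero-width spaces, one with a long float.
sample = pd.DataFrame({
    "padded": ["abc", "abc\u200b\u200b\u200b"],
    "value": [0.1, 1 / 3],
})
report = longest_values(sample)
print(report.to_string(index=False))
```

Running the same helper on `df2` should show whether the long values contain escaped characters or are simply numbers with many decimal places, assuming `df2` is the DataFrame from the second CSV.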

Pandas column lengths do not match the original file

0 answers: