Question

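I want to read a CSV file with Python pandas in which all of the data frame's data is stored in a single column. Within that column, the values for each field are separated by commas.

However, commas also appear as thousands separators, so splitting on commas breaks whenever a field in a row contains a number of 1,000 or more. How can I get rid of the thousands commas?

For example: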

AlarmName: "Blah"

Answer 1

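As @A.Kot suggested, you can read each row from the xlsx file, strip out the thousands commas, and then write the result back into a pandas DataFrame. Something like: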

from __future__ import print_function
import xlrd

fname = '_xlsx_path_\\data.xlsx'

# Open the workbook (note: xlrd 2.0+ reads only .xls; .xlsx needs xlrd < 2.0)
xl_workbook = xlrd.open_workbook(fname)
xl_sheet = xl_workbook.sheet_by_name('Sheet1')

# Print all values, iterating through rows and columns
num_cols = xl_sheet.ncols   # Number of columns
for row_idx in range(0, xl_sheet.nrows):    # Iterate through rows
    for col_idx in range(0, num_cols):  # Iterate through columns
        cell_obj = xl_sheet.cell(row_idx, col_idx)  # Get cell object by row, col
        if row_idx == 0:
            columns = [c.encode("ascii") for c in cell_obj.value.split(',')]
            print(columns)
            print(' ')
        else:
            data_row = [d.encode("ascii") for d in cell_obj.value.split(',')]
            print(data_row)
            print(' ')
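The loop above only prints the split fields, so the last step (stripping the thousands commas and building the DataFrame) still has to be written. A minimal sketch of that step follows, under a strong assumption that is not stated in the question: only the last column contains thousands commas. The `raw_rows` strings and the `Severity` and `Count` column names are hypothetical stand-ins for the values read from `cell_obj.value`.

```python
import pandas as pd

# Hypothetical cell strings, standing in for cell_obj.value from each row.
raw_rows = [
    "AlarmName,Severity,Count",
    "Blah,high,1,200",
    "Other,low,15",
    "Third,low,3,450,000",
]

# The header row has no thousands commas, so a plain split is safe.
header = raw_rows[0].split(",")
n_cols = len(header)

records = []
for line in raw_rows[1:]:
    # ASSUMPTION: only the last column can contain thousands commas.
    # Limiting the number of splits leaves any surplus commas in the last field.
    fields = line.split(",", n_cols - 1)
    fields[-1] = fields[-1].replace(",", "")
    records.append(fields)

df = pd.DataFrame(records, columns=header)
df["Count"] = pd.to_numeric(df["Count"])
print(df)
```

If numbers with thousands commas can appear in other columns, this approach does not apply: without a second cue such as a space after each separator, a string like `15,900` is ambiguous between one value and two.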

Answer 2

If you can count on a space following every comma that separates columns, you can have pandas skip the commas that have digits on both sides.

pandas.read_csv(..., sep=', ', ...)
#                         ^^         note the space after the comma
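As a rough illustration of this approach, the sketch below parses a small in-memory sample and then converts the numeric columns. The sample text and the column names `name`, `count`, and `total` are made up for the example. A separator longer than one character is interpreted by pandas as a regular expression and needs the Python parsing engine, so `engine="python"` is passed explicitly.

```python
import io
import pandas as pd

# Hypothetical sample: columns are separated by ", " (comma + space),
# while thousands commas have digits on both sides and no space.
raw = (
    "name, count, total\n"
    "alpha, 1,200, 3,450,000\n"
    "beta, 15, 900\n"
)

# Read every field as a string first so nothing is silently misparsed.
df = pd.read_csv(io.StringIO(raw), sep=", ", engine="python", dtype=str)

# Strip the thousands commas and convert the numeric columns.
for col in ["count", "total"]:
    df[col] = pd.to_numeric(df[col].str.replace(",", "", regex=False))

print(df)
```

Reading the fields as strings and cleaning them explicitly keeps each step visible; it relies only on the assumption already stated above, namely that separator commas are always followed by a space.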

Reading a CSV file with all columns merged into one: the thousands-comma problem
