Question

I know there are already quite a few questions about space delimiters in CSV files.

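I have a CSV file that appears to be space-separated. When importing it into Python, I tried all of the code from those questions to get a space recognized as the delimiter, but I keep getting an error. For example: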

    import codecs
    import pandas as pd

    test_filepath = 'test_data.csv'

    with codecs.open(test_filepath, "r", "Shift-JIS", "ignore") as file:  # import UTF8 based csv file
        test_df = pd.read_table(file, delim_whitespace=True)

This produces the following error:

    EmptyDataError: No columns to parse from file

When I try this:

    test_filepath = 'test_data.csv'

    with codecs.open(test_filepath, "r", "Shift-JIS", "ignore") as file:  # import UTF8 based csv file
        test_df = pd.read_table(file, delimiter=" ")

it gives the same error.

When I try this:

    test_filepath = 'test_data.csv'

    with codecs.open(test_filepath, "r", "Shift-JIS", "ignore") as file:  # import UTF8 based csv file
        test_df = pd.read_table(file, sep="/s+")

I get the same error.

When I try this:

    test_filepath = 'test_data.csv'

    with codecs.open(test_filepath, "r", "Shift-JIS", "ignore") as file:  # import UTF8 based csv file
        test_df = pd.read_table(file, delimiter='\t')

I get the same error again.

The only variant that does not raise an error is:

    test_filepath = 'test_data.csv'

    with codecs.open(test_filepath, "r", "Shift-JIS", "ignore") as file:  # import UTF8 based csv file
        test_df = pd.read_table(file, delimiter=',')

But the result looks completely wrong, and test_df.info() shows that only one column was created (there should be 100 columns).

Answer 1

I think pandas can handle this; one of these should work.

    import pandas as pd

    df = pd.read_csv('file.csv', delim_whitespace=True)  # split on any run of whitespace
    df = pd.read_csv('file.csv', delimiter=' ')          # split on each single space
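
As a minimal sketch of the first variant: `sep=r"\s+"` is a regular expression that treats any run of whitespace as one delimiter. Note the backslash; `"/s+"` would instead look for a literal slash followed by `s` characters. The encoding can also be passed to `read_csv` directly rather than going through `codecs.open`. The sample text and the helper name `read_whitespace_table` are made up for illustration and are not part of pandas.

```python
import io

import pandas as pd


def read_whitespace_table(source, encoding=None):
    """Read a table whose columns are separated by runs of whitespace.

    `source` may be a path or a file-like object. This helper is
    hypothetical; it only wraps pandas.read_csv.
    """
    # r"\s+" is a regex: one or more whitespace characters (spaces or tabs).
    return pd.read_csv(source, sep=r"\s+", encoding=encoding)


# Made-up sample with uneven spacing between columns.
sample = "a b   c\n1 2   3\n4 5   6\n"
df = read_whitespace_table(io.StringIO(sample))
print(df.shape)
print(list(df.columns))
```

For a file on disk, something like `read_whitespace_table('test_data.csv', encoding='shift_jis')` would be the equivalent call, assuming the file really is Shift-JIS encoded. Recent pandas releases deprecate `delim_whitespace=True` in favor of `sep=r"\s+"`, so this form is likely the safer one going forward.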

Answer 2

Using the csv module, you may be able to do what you need:

    import csv
    import pandas as pd

    with open("test_data.csv", "r", newline="") as file:
        # Materialize the rows: csv.reader is a one-shot iterator
        rows = list(csv.reader(file, delimiter=" "))

    # Perform what you need to do on the rows
    for row in rows:
        print(row)

    # Can then load into a DataFrame if needed
    df = pd.DataFrame.from_records(rows)
    print(df)
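
Since `EmptyDataError` suggests that pandas found nothing it could parse, it may also help to look at what the file actually contains before settling on a delimiter. The sketch below reads the first raw bytes and counts a few candidate delimiters in the first line. The helpers `peek_bytes` and `guess_delimiter`, the candidate list, and the demo file are all assumptions made for illustration; for the real file you would pass the path to `test_data.csv` instead.

```python
import os
import tempfile


def peek_bytes(path, n=200):
    """Return the first n raw bytes of the file, without any decoding."""
    with open(path, "rb") as f:
        return f.read(n)


def guess_delimiter(line, candidates=(",", "\t", ";", " ")):
    """Return the candidate occurring most often in `line`, or None."""
    counts = {c: line.count(c) for c in candidates}
    best = max(counts, key=counts.get)
    return best if counts[best] > 0 else None


# Made-up demo file standing in for test_data.csv.
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False,
                                 encoding="utf-8", newline="") as tmp:
    tmp.write("id name score\n1 foo 3.5\n")
    demo_path = tmp.name

raw = peek_bytes(demo_path)
print(raw)  # an empty result would mean the file itself is empty

# Decode leniently just to inspect the first line.
first_line = raw.decode("utf-8", errors="replace").splitlines()[0]
delimiter = guess_delimiter(first_line)
print(repr(delimiter))

os.remove(demo_path)
```

If the raw bytes look nothing like the encoding being passed to `codecs.open`, the encoding rather than the delimiter may be the thing to revisit.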

Space delimiter when importing a CSV into Python
