How to convert a string to a DataFrame when one column contains spaces

Time: 2018-10-05 05:07:07

Tags: python pandas dataframe

Below is a sample string. How can I convert it into a pandas DataFrame?

    str1 = """
    Feature Id & Feature Desc                             Status   Failed Total 
    ---------------------------------------------------   -------- ------ -----
    RKSPACE (RackSpace Test In)                           Passed   0      1     
    D1 (Drum 1 Test)                                      Passed   0      1     
    D2 (Drum 2 Test)                                      Passed   0      1     
    D3 (Drum 3 Test)                                      Passed   0      1     
    PRIMUS (PRIMUS Ink Test)                              Not-run  0      0     
    RGB (RGB Color Test)                                  Passed   0      1     
    YONO (App Test)                                       Not-run  0      0     
    PSENSE (Paper Sensor Test)                            Not-run  0      0     
    TFlag (Flag Test)                                     Not-run  0      0     
    MEMT (Memory Test)                                    Passed   0      1     
    CRG (CARRIAGE Test)                                   Not-run  0      0    
    """

I tried the following code:

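    import pandas as pd
    from io import StringIO  # Python 3 location; the original post imported the Python 2 StringIO module

    def get_dataframe(str1):
        # Treat every run of whitespace as a column separator
        test_data = StringIO(str1)
        df = pd.read_csv(test_data, sep=r'\s+', comment='--', engine='python')
        return df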

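The result I get is ugly and incorrect. I checked other posts but could not find one that deals with spaces inside the values of a column. Normally, when the first column contains no spaces, getting a DataFrame is easy, but how can I convert this string into a DataFrame that keeps the same layout as str1? Any help would be greatly appreciated. Thanks.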
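As a small illustration of why a whitespace separator cannot work here (a sketch that is not part of the original post; the two shortened lines and the variable names are assumptions), splitting on `\s+` breaks the first column into several tokens, so the header and the data rows end up with different field counts:

```python
import re

# Shortened copies of the header and of one data row from str1.
header = "Feature Id & Feature Desc     Status   Failed Total"
row = "RKSPACE (RackSpace Test In)   Passed   0      1"

# Split on runs of whitespace, as sep=r'\s+' effectively does.
header_fields = re.split(r"\s+", header.strip())
row_fields = re.split(r"\s+", row.strip())

print(len(header_fields), header_fields)
print(len(row_fields), row_fields)
```

The header yields eight tokens and the row seven, so there is no consistent way to line the fields up as four columns.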

1 Answer:

Answer 0 (score: 3):

You can use read_fwf, which reads a table of fixed-width formatted lines:

str1 = """
Feature Id & Feature Desc                             Status   Failed Total 
---------------------------------------------------   -------- ------ -----
RKSPACE (RackSpace Test In)                           Passed   0      1     
D1 (Drum 1 Test)                                      Passed   0      1     
D2 (Drum 2 Test)                                      Passed   0      1     
D3 (Drum 3 Test)                                      Passed   0      1     
PRIMUS (PRIMUS Ink Test)                              Not-run  0      0     
RGB (RGB Color Test)                                  Passed   0      1     
YONO (App Test)                                       Not-run  0      0     
PSENSE (Paper Sensor Test)                            Not-run  0      0     
TFlag (Flag Test)                                     Not-run  0      0     
MEMT (Memory Test)                                    Passed   0      1     
CRG (CARRIAGE Test)                                   Not-run  0      0    
"""

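import pandas as pd
from io import StringIO  # pd.compat.StringIO is not available in current pandas

# Explicit (start, end) character spans for the four columns
df = pd.read_fwf(StringIO(str1),
                 colspecs=[(0, 50), (51, 62), (63, 69), (70, 76)],
                 skiprows=[2],
                 header=[1])
print(df)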
      Feature Id & Feature Desc   Status  Failed  Total
0   RKSPACE (RackSpace Test In)   Passed       0      1
1              D1 (Drum 1 Test)   Passed       0      1
2              D2 (Drum 2 Test)   Passed       0      1
3              D3 (Drum 3 Test)   Passed       0      1
4      PRIMUS (PRIMUS Ink Test)  Not-run       0      0
5          RGB (RGB Color Test)   Passed       0      1
6               YONO (App Test)  Not-run       0      0
7    PSENSE (Paper Sensor Test)  Not-run       0      0
8             TFlag (Flag Test)  Not-run       0      0
9            MEMT (Memory Test)   Passed       0      1
10          CRG (CARRIAGE Test)  Not-run       0      0
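
The colspecs above were typed in by hand. Because the dashed rule line already marks where each column starts and ends, one possible variation derives the spans from it. This is only a sketch and not part of the original answer: the shortened sample text, the variable names, and the choice to pass the column names separately through `names` are all assumptions made for illustration.

```python
import re
from io import StringIO

import pandas as pd

# Shortened sample in the same layout as str1 (assumed for illustration).
sample = """\
Feature Id & Feature Desc     Status   Failed Total
---------------------------   -------- ------ -----
RKSPACE (RackSpace Test In)   Passed   0      1
PRIMUS (PRIMUS Ink Test)      Not-run  0      0
"""

header_line, rule_line, *data_lines = sample.splitlines()

# Each run of dashes gives the (start, end) span of one column.
colspecs = [m.span() for m in re.finditer(r"-+", rule_line)]

# Slice the header text with the same spans to get the column names.
names = [header_line[start:end].strip() for start, end in colspecs]

# Parse only the data lines, supplying spans and names explicitly.
df = pd.read_fwf(StringIO("\n".join(data_lines)),
                 colspecs=colspecs, header=None, names=names)
print(df)
```

This only works when every value fits inside the extent of its dashes, which holds for the sample shown here.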

Thanks to @gyoza for a simplified solution that omits colspecs and lets read_fwf infer the column boundaries:

from io import StringIO  # pd.compat.StringIO is not available in current pandas

df = pd.read_fwf(StringIO(str1),
                 skiprows=[2],
                 header=[1])
print(df)
      Feature Id & Feature Desc   Status  Failed  Total
0   RKSPACE (RackSpace Test In)   Passed       0      1
1              D1 (Drum 1 Test)   Passed       0      1
2              D2 (Drum 2 Test)   Passed       0      1
3              D3 (Drum 3 Test)   Passed       0      1
4      PRIMUS (PRIMUS Ink Test)  Not-run       0      0
5          RGB (RGB Color Test)   Passed       0      1
6               YONO (App Test)  Not-run       0      0
7    PSENSE (Paper Sensor Test)  Not-run       0      0
8             TFlag (Flag Test)  Not-run       0      0
9            MEMT (Memory Test)   Passed       0      1
10          CRG (CARRIAGE Test)  Not-run       0      0
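
Both calls rely on counting rows (`header=[1]`, `skiprows=[2]`) relative to the blank first line of str1. A hedged alternative is to clean the text first so that the default header handling applies. The sketch below assumes the rule line consists only of dashes and spaces; the helper name `report_to_frame` and the shortened sample are hypothetical and not from the original answer.

```python
from io import StringIO

import pandas as pd

# Shortened sample in the same layout as str1 (assumed for illustration).
report = """
Feature Id & Feature Desc     Status   Failed Total
---------------------------   -------- ------ -----
RKSPACE (RackSpace Test In)   Passed   0      1
D1 (Drum 1 Test)              Passed   0      1
PRIMUS (PRIMUS Ink Test)      Not-run  0      0
"""


def report_to_frame(text):
    """Drop blank lines and the dashed rule, then let read_fwf infer columns."""
    lines = [line.rstrip() for line in text.strip().splitlines()]
    # Keep only lines that contain something other than dashes and spaces.
    kept = [line for line in lines if set(line) - {"-", " "}]
    # The first remaining line is the header, which is the read_fwf default.
    return pd.read_fwf(StringIO("\n".join(kept)))


df = report_to_frame(report)
print(df)
```

Inference works here because every character position of the first column is filled in at least one line, so read_fwf sees it as a single column; a table where all lines share a gap at the same position would still need explicit colspecs.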