Below is a sample string. How can I convert this string into a pandas DataFrame?
str1 = """
Feature Id & Feature Desc Status Failed Total
--------------------------------------------------- -------- ------ -----
RKSPACE (RackSpace Test In) Passed 0 1
D1 (Drum 1 Test) Passed 0 1
D2 (Drum 2 Test) Passed 0 1
D3 (Drum 3 Test) Passed 0 1
PRIMUS (PRIMUS Ink Test) Not-run 0 0
RGB (RGB Color Test) Passed 0 1
YONO (App Test) Not-run 0 0
PSENSE (Paper Sensor Test) Not-run 0 0
TFlag (Flag Test) Not-run 0 0
MEMT (Memory Test) Passed 0 1
CRG (CARRIAGE Test) Not-run 0 0
"""
I tried the following code:
import pandas as pd
from io import StringIO  # Python 3; the separate StringIO module exists only in Python 2

def get_dataframe(str1):
    test_data = StringIO(str1)
    df = pd.read_csv(test_data, sep=r'\s+', comment='--', engine='python')
    return df
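The result I get is ugly and incorrect. I checked other posts but could not find one that deals with spaces inside the column values. Normally, when the first column contains no spaces, getting a DataFrame is easy, but how can I convert this string into a DataFrame that keeps the same layout as str1? Any help would be appreciated. Thanks.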
Answer 0 (score: 3)
You can use read_fwf, which reads fixed-width formatted lines:
str1 = """
Feature Id & Feature Desc Status Failed Total
--------------------------------------------------- -------- ------ -----
RKSPACE (RackSpace Test In) Passed 0 1
D1 (Drum 1 Test) Passed 0 1
D2 (Drum 2 Test) Passed 0 1
D3 (Drum 3 Test) Passed 0 1
PRIMUS (PRIMUS Ink Test) Not-run 0 0
RGB (RGB Color Test) Passed 0 1
YONO (App Test) Not-run 0 0
PSENSE (Paper Sensor Test) Not-run 0 0
TFlag (Flag Test) Not-run 0 0
MEMT (Memory Test) Passed 0 1
CRG (CARRIAGE Test) Not-run 0 0
"""
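import io
import pandas as pd
# io.StringIO replaces pd.compat.StringIO, which newer pandas releases no longer provide.
# header/skiprows count the blank first line of str1; if your pandas version drops
# blank lines first, pass str1.strip() with header=0 and skiprows=[1] instead.
df = pd.read_fwf(io.StringIO(str1),
                 colspecs=[(0, 50), (51, 62), (63, 69), (70, 76)],
                 skiprows=[2],
                 header=[1])
print(df)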
      Feature Id & Feature Desc   Status  Failed  Total
0   RKSPACE (RackSpace Test In)   Passed       0      1
1              D1 (Drum 1 Test)   Passed       0      1
2              D2 (Drum 2 Test)   Passed       0      1
3              D3 (Drum 3 Test)   Passed       0      1
4      PRIMUS (PRIMUS Ink Test)  Not-run       0      0
5          RGB (RGB Color Test)   Passed       0      1
6               YONO (App Test)  Not-run       0      0
7    PSENSE (Paper Sensor Test)  Not-run       0      0
8             TFlag (Flag Test)  Not-run       0      0
9            MEMT (Memory Test)   Passed       0      1
10          CRG (CARRIAGE Test)  Not-run       0      0
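Counting the colspecs offsets by hand is easy to get wrong. One option is to derive them from the dashed ruler line, since each run of dashes spans exactly one column. The following is only a minimal sketch, assuming the ruler is the second line of the stripped text. The sample report is rebuilt here with explicit padding (widths 51, 8, 6, 5 are an assumption), and parse_report is a hypothetical helper name, not a pandas function:

```python
import io
import re

import pandas as pd

# Hypothetical sample, padded explicitly so every column lines up with the ruler.
WIDTHS = (51, 8, 6, 5)
rows = [
    ("RKSPACE (RackSpace Test In)", "Passed", 0, 1),
    ("D1 (Drum 1 Test)", "Passed", 0, 1),
    ("PRIMUS (PRIMUS Ink Test)", "Not-run", 0, 0),
]
header = f"{'Feature Id & Feature Desc':<51} {'Status':<8} {'Failed':>6} {'Total':>5}"
ruler = " ".join("-" * width for width in WIDTHS)
body = [f"{desc:<51} {status:<8} {failed:>6} {total:>5}"
        for desc, status, failed, total in rows]
report = "\n" + "\n".join([header, ruler, *body]) + "\n"


def parse_report(text):
    """Parse a fixed-width report whose second line is a dashed ruler."""
    lines = text.strip().splitlines()
    # Each run of dashes gives the half-open (start, end) span of one column.
    colspecs = [match.span() for match in re.finditer(r"-+", lines[1])]
    return pd.read_fwf(io.StringIO("\n".join(lines)),
                       colspecs=colspecs,
                       header=0,       # first line holds the column names
                       skiprows=[1])   # drop the ruler itself


df = parse_report(report)
print(df)
```

Stripping the text first keeps the header at line 0 and the ruler at line 1, so the row numbers do not depend on how a leading blank line is handled.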
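Thanks to @gyoza for simplifying the solution:
import io
# colspecs is omitted here, so read_fwf infers the column boundaries from the data.
df = pd.read_fwf(io.StringIO(str1),
                 skiprows=[2],
                 header=[1])
print(df)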
      Feature Id & Feature Desc   Status  Failed  Total
0   RKSPACE (RackSpace Test In)   Passed       0      1
1              D1 (Drum 1 Test)   Passed       0      1
2              D2 (Drum 2 Test)   Passed       0      1
3              D3 (Drum 3 Test)   Passed       0      1
4      PRIMUS (PRIMUS Ink Test)  Not-run       0      0
5          RGB (RGB Color Test)   Passed       0      1
6               YONO (App Test)  Not-run       0      0
7    PSENSE (Paper Sensor Test)  Not-run       0      0
8             TFlag (Flag Test)  Not-run       0      0
9            MEMT (Memory Test)   Passed       0      1
10          CRG (CARRIAGE Test)  Not-run       0      0
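read_fwf depends on the original column padding. If that padding has been lost (for example, runs of spaces were collapsed when the text was copied), a possible fallback is to split each line from the right. This is only a sketch under the assumption that the first column is the only one containing spaces and that the last three columns never do. parse_collapsed is a hypothetical helper name:

```python
import pandas as pd

# Hypothetical sample in which every run of spaces has collapsed to one space.
collapsed = """
Feature Id & Feature Desc Status Failed Total
------------------------- -------- ------ -----
RKSPACE (RackSpace Test In) Passed 0 1
D1 (Drum 1 Test) Passed 0 1
PRIMUS (PRIMUS Ink Test) Not-run 0 0
"""


def parse_collapsed(text, trailing_columns=3):
    """Split each line from the right so spaces in the first column survive."""
    lines = [line for line in text.strip().splitlines() if line.strip()]
    header, _ruler, *body = lines
    # rsplit with maxsplit keeps everything left of the last N fields together.
    columns = header.rsplit(None, trailing_columns)
    records = [line.rsplit(None, trailing_columns) for line in body]
    frame = pd.DataFrame(records, columns=columns)
    # The two right-most columns are counts, so convert them from str to int.
    return frame.astype({name: int for name in columns[-2:]})


df = parse_collapsed(collapsed)
print(df)
```

Splitting from the right works here because the free-text column comes first; a table with spaces in a middle column would need a different approach.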