How to transfer text output to a dataframe

Time: 2017-12-06 07:55:00

Tags: python csv dataframe web-scraping beautifulsoup

I did some web scraping with Beautiful Soup, and this is the text output I got:

TCGA-KK-A7B3-01A Male  Stage not reported  Alive FPKM 5.5 Living days 899 (2.5 years)
TCGA-G9-6347-01A Male  Stage not reported  Alive FPKM 14.2 Living days 2089 (5.7 years)
TCGA-KC-A4BL-01A Male  Stage not reported  Alive FPKM 3.8 Living days 934 (2.6 years)
TCGA-KK-A7AQ-01A Male  Stage not reported  Alive FPKM 2.6 Living days 1610 (4.4 years)
TCGA-G9-6373-01A Male  Stage not reported  Alive FPKM 4.7 Living days 811 (2.2 years)
....

How can I save this result to a dataframe?

How can I save this information to a CSV file?

I need the CSV file for further analysis.

1 answer:

Answer 0: (score: 0)

Try this code:

import sys
if sys.version_info[0] < 3:
    from StringIO import StringIO  # Python 2
else:
    from io import StringIO  # Python 3

import pandas as pd

# Sample of the scraped text, wrapped in a file-like object for read_csv.
TESTDATA = StringIO("""
TCGA-KK-A7B3-01A Male  Stage not reported  Alive FPKM 5.5 Living days 899 (2.5 years)
TCGA-G9-6347-01A Male  Stage not reported  Alive FPKM 14.2 Living days 2089 (5.7 years)
TCGA-KC-A4BL-01A Male  Stage not reported  Alive FPKM 3.8 Living days 934 (2.6 years)
TCGA-KK-A7AQ-01A Male  Stage not reported  Alive FPKM 2.6 Living days 1610 (4.4 years)
TCGA-G9-6373-01A Male  Stage not reported  Alive FPKM 4.7 Living days 811 (2.2 years)
""")

# Fields are separated by one or two spaces, so split on runs of whitespace;
# splitting on a single space would create empty columns.
df = pd.read_csv(TESTDATA, sep=r"\s+", header=None)
df.to_csv('output.csv', sep=',')
print(df)

This code prints the dataframe and generates the output.csv file.
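Splitting on whitespace leaves every word in its own unnamed column. If named columns are more useful for the analysis, one option is to parse each line with a regular expression. The following is only a sketch: the column names (`sample_id`, `sex`, `stage`, `status`, `fpkm`, `living_days`) are made up for illustration, and the pattern assumes every line follows the layout of the sample rows shown in the question.

```python
import re

import pandas as pd

raw_text = """\
TCGA-KK-A7B3-01A Male  Stage not reported  Alive FPKM 5.5 Living days 899 (2.5 years)
TCGA-G9-6347-01A Male  Stage not reported  Alive FPKM 14.2 Living days 2089 (5.7 years)
"""

# Assumed layout: id, sex, free-text stage, status, "FPKM <number>",
# "Living days <integer>". Column names are hypothetical.
LINE_PATTERN = re.compile(
    r"^(?P<sample_id>\S+)\s+"
    r"(?P<sex>\S+)\s+"
    r"(?P<stage>.+?)\s+"
    r"(?P<status>\S+)\s+"
    r"FPKM\s+(?P<fpkm>[\d.]+)\s+"
    r"Living days\s+(?P<living_days>\d+)"
)

records = []
for line in raw_text.splitlines():
    match = LINE_PATTERN.match(line.strip())
    if match:  # lines that do not fit the assumed layout are skipped
        records.append(match.groupdict())

parsed = pd.DataFrame(records)
# The regex captures strings, so convert the numeric columns explicitly.
parsed["fpkm"] = parsed["fpkm"].astype(float)
parsed["living_days"] = parsed["living_days"].astype(int)
print(parsed)
```

Lines that do not match are dropped silently here; if the real output contains other layouts, it may be worth collecting the unmatched lines and inspecting them.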

Update 1:

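TESTDATA.seek(0)  # rewind: read_csv above already consumed the buffer
list_raw = []
for i in TESTDATA:
    fields = i.split()  # split on any whitespace; drops the trailing newline
    if fields:  # skip blank lines
        list_raw.append(fields)

df = pd.DataFrame(list_raw)
print(df)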

If you are reading the lines in a loop, use the code above to build the dataframe. Then convert df to CSV as in the first snippet.
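Putting the loop and the CSV step together might look like the sketch below. `scraped_lines` is a hypothetical stand-in for whatever strings your scraping loop produces, and the file is written to a temporary directory only so the example cleans up after itself. Passing `index=False` and `header=False` to `to_csv` keeps the row index and the numeric column labels out of the file.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for strings collected one at a time while scraping.
scraped_lines = [
    "TCGA-KC-A4BL-01A Male  Stage not reported  Alive FPKM 3.8 Living days 934 (2.6 years)",
    "",
    "TCGA-G9-6373-01A Male  Stage not reported  Alive FPKM 4.7 Living days 811 (2.2 years)",
]

rows = []
for line in scraped_lines:
    tokens = line.split()  # splits on whitespace runs, so no empty tokens
    if tokens:  # skip blank lines
        rows.append(tokens)

loop_df = pd.DataFrame(rows)

# Write only the data: no row index, no numeric column labels.
csv_path = os.path.join(tempfile.mkdtemp(), "output.csv")
loop_df.to_csv(csv_path, index=False, header=False)

# Read the file back to check that it can be used for further analysis.
round_trip = pd.read_csv(csv_path, header=None)
print(round_trip.shape)
```

When the file is read back, pandas infers numeric types for columns such as the FPKM value and the day count, so they are ready for calculations without further conversion.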