I have a text file that looks like this:
# Pearson correlation [n=344 #col=2]
# Name Name Value BiasCorr 2.50% 97.50% N: 2.50% N:97.50%
# --------------- --------------- -------- -------- -------- -------- -------- --------
101_DGCA3.1D[0] 101_LEC.1D[0] +0.85189 +0.85071 +0.81783 +0.87777 +0.82001 +0.87849
I loaded it into pandas with the following code:
import pandas as pd
data = pd.read_table('test.txt')
print data
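However, I can't seem to access the individual columns. I tried passing sep=' ' and also copying the exact run of spaces between columns from the text file, but I still don't get any column names, and trying to print data[0] gives this error: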
Traceback (most recent call last):
File "cut_afni_output.py", line 3, in <module>
print data[0]
File "/home/user/anaconda2/lib/python2.7/site-packages/pandas/core/frame.py", line 1969, in __getitem__
return self._getitem_column(key)
File "/home/user/anaconda2/lib/python2.7/site-packages/pandas/core/frame.py", line 1976, in _getitem_column
return self._get_item_cache(key)
File "/home/user/anaconda2/lib/python2.7/site-packages/pandas/core/generic.py", line 1091, in _get_item_cache
values = self._data.get(item)
File "/home/user/anaconda2/lib/python2.7/site-packages/pandas/core/internals.py", line 3211, in get
loc = self.items.get_loc(item)
File "/home/user/anaconda2/lib/python2.7/site-packages/pandas/core/index.py", line 1759, in get_loc
return self._engine.get_loc(key)
File "pandas/index.pyx", line 137, in pandas.index.IndexEngine.get_loc (pandas/index.c:3979)
File "pandas/index.pyx", line 157, in pandas.index.IndexEngine.get_loc (pandas/index.c:3843)
File "pandas/hashtable.pyx", line 668, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12265)
File "pandas/hashtable.pyx", line 676, in pandas.hashtable.PyObjectHashTable.get_item (pandas/hashtable.c:12216)
KeyError: 0
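I can't set the header row manually, because pandas seems to treat each entire line as a single column. How can I read the text file into separate columns that I can access individually?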
Answer 0 (score: 4)
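Try this, which skips the lines starting with #, does not infer a header row, and splits on runs of whitespace: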
In [33]: df = pd.read_csv(filename, comment='#', header=None, delim_whitespace=True)
In [34]: df
Out[34]:
0 1 2 3 4 5 6 7
0 101_DGCA3.1D[0] 101_LEC.1D[0] 0.85189 0.85071 0.81783 0.87777 0.82001 0.87849
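With header=None the columns are labelled with the integers 0 to 7, so df[0] and df[2] now work. If named columns are more convenient, the following is a minimal sketch of the same approach with a names= list added. The column labels (name_a, value, ci_low, and so on) are hypothetical, chosen to match the commented header in the file, and the file content is inlined through io.StringIO only so the example is self-contained. It uses sep=r"\s+" in place of delim_whitespace=True, since recent pandas releases deprecate the latter.

```python
import io

import pandas as pd

# Inline copy of the file from the question: three '#' comment lines, one data row.
sample = """\
# Pearson correlation [n=344 #col=2]
# Name Name Value BiasCorr 2.50% 97.50% N: 2.50% N:97.50%
# --------------- --------------- -------- -------- -------- -------- -------- --------
101_DGCA3.1D[0] 101_LEC.1D[0] +0.85189 +0.85071 +0.81783 +0.87777 +0.82001 +0.87849
"""

# Hypothetical labels (not part of the file format), one per column.
columns = [
    "name_a", "name_b", "value", "bias_corr",
    "ci_low", "ci_high", "n_ci_low", "n_ci_high",
]

df = pd.read_csv(
    io.StringIO(sample),  # replace with the real path, e.g. 'test.txt'
    comment="#",          # skip the commented header lines
    header=None,          # no header row is left after skipping comments
    sep=r"\s+",           # split on any run of whitespace
    names=columns,        # assign the labels defined above
)

# Columns can now be accessed individually by label.
print(df["value"])
print(df[["name_a", "name_b"]])
```

Each column is then available as df["value"], df["ci_low"], and so on, and the numeric columns are parsed as floats despite the leading + sign.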