I have a question about Python:
How can I pretty-print a matrix with headers, like this:
T C G C A
[0 -2 -4 -6 -8 -10]
T [-2 1 -1 -3 -5 -7]
C [-4 -1 2 0 -2 -4]
C [-6 -3 0 1 1 -1]
A [-8 -5 -2 -1 0 2]
I tried printing the matrix with numpy.matrix(mat), but all I get is:
[[ 0 -2 -4 -6 -8 -10]
[ -2 1 -1 -3 -5 -7]
[ -4 -1 2 0 -2 -4]
[ -6 -3 0 1 1 -1]
[ -8 -5 -2 -1 0 2]]
I also didn't manage to add the headers.
Thanks!!!
Thanks, everyone. I managed to install pandas, but now I have two new problems. Here is my code:
import pandas as pd
col1 = [' ', 'T', 'C', 'G', 'C', 'A']
col2 = [' ', 'T', 'C', 'C', 'A']
df = pd.DataFrame(mat,index = col2, columns = col1)
print df
But I get this error:
df = pd.DataFrame(mat,index = col2, columns = col1)
File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 163, in __init__
copy=copy)
File "C:\Python27\lib\site-packages\pandas\core\frame.py", line 224, in _init_ndarray
return BlockManager([block], [columns, index])
File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 237, in __init__
self._verify_integrity()
File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 313, in _verify_integrity
union_items = _union_block_items(self.blocks)
File "C:\Python27\lib\site-packages\pandas\core\internals.py", line 906, in _union_block_items
raise Exception('item names overlap')
Exception: item names overlap
When I change the letters so that none of them repeat, it works:
T B G C A
0 -2 -4 -6 -8 -10
T -2 1 -1 -3 -5 -7
C -4 -1 2 0 -2 -4
C -6 -3 0 1 1 -1
A -8 -5 -2 -1 0 2
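But as you can see, the matrix layout isn't very good. How can I fix these problems?
Answer 0 (score: 3)
Numpy does not provide this functionality out of the box.
You could take a look at pandas. A printed pandas.DataFrame usually looks quite nice.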
import numpy as np
import pandas as pd
cols = ["T", "C", "S", "W", "Q"]
a = np.random.randint(0,11,size=(5,5))
df = pd.DataFrame(a, columns=cols, index=cols)
print(df)
which produces
   T  C   S  W  Q
T  9  5  10  0  0
C  3  8   0  7  2
S  0  2   6  5  8
W  4  4  10  1  5
Q  3  8   7  1  4
If you only have plain Python (plus numpy) available, you can use the following function.
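import numpy as np
def print_array(a, cols, rows):
    # One label is needed per column and per row.
    if (len(cols) != a.shape[1]) or (len(rows) != a.shape[0]):
        print("Shapes do not match")
        return
    s = a.__repr__()
    s = s.split("array(")[1]
    # Drop the 6-space indent that repr puts on continuation lines,
    # leaving one space in front of each row's opening bracket.
    s = s.replace("      ", "")
    s = s.replace("[[", " [")
    s = s.replace("]])", "]")
    # Comma positions in the first row mark where each column ends.
    pos = [i for i, ltr in enumerate(s.splitlines()[0]) if ltr == ","]
    pos[-1] = pos[-1]-1
    empty = " " * len(s.splitlines()[0])
    s = s.replace("],", "]")
    s = s.replace(",", "")
    # Prefix every row with its label.
    lines = []
    for i, l in enumerate(s.splitlines()):
        lines.append(rows[i] + l)
    s = "\n".join(lines)
    # Build the header line by placing each column label above its column.
    empty = list(empty)
    for i, p in enumerate(pos):
        empty[p-i] = cols[i]
    s = "".join(empty) + "\n" + s
    print(s)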
c = [" ", "T", "C", "G", "C", "A"]
r = [" ", "T", "C", "C", "A" ]
a = np.random.randint(-4,15,size=(5,6))
print_array(a, c, r)
gives you
       T  C  G  C  A
  [ 2  5 -3  7  1  9]
T [-3 10  3 -4  8  3]
C [ 6 11 -2  2  5  1]
C [ 4  6 14 11 10  0]
A [11 -4 -3 -4 14 14]
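The function above depends on the exact layout of numpy's repr, so it can be fragile. As an alternative, here is a minimal sketch that pads every cell to one shared width using plain string methods. The helper name format_labeled and its arguments are hypothetical (they are not part of numpy or pandas); it assumes a rectangular matrix given as nested lists or a 2-D array, with one label per row and per column.

```python
def format_labeled(mat, row_labels, col_labels):
    """Return the matrix as text with row and column labels (hypothetical helper)."""
    rows = [[str(v) for v in row] for row in mat]
    if len(rows) != len(row_labels) or any(len(r) != len(col_labels) for r in rows):
        raise ValueError("labels do not match the matrix shape")
    # One shared width: the widest cell or column label.
    width = max(len(s) for r in rows for s in r)
    width = max(width, max(len(c) for c in col_labels))
    label_width = max(len(r) for r in row_labels)
    # The header skips the row-label column, one space, and the opening bracket.
    header = " " * (label_width + 2) + " ".join(c.rjust(width) for c in col_labels)
    lines = [header]
    for label, cells in zip(row_labels, rows):
        body = " ".join(s.rjust(width) for s in cells)
        lines.append("{} [{}]".format(label.ljust(label_width), body))
    return "\n".join(lines)

# The matrix from the question; the first row and column get a blank label.
mat = [
    [ 0, -2, -4, -6, -8, -10],
    [-2,  1, -1, -3, -5,  -7],
    [-4, -1,  2,  0, -2,  -4],
    [-6, -3,  0,  1,  1,  -1],
    [-8, -5, -2, -1,  0,   2],
]
text = format_labeled(mat, list(" TCCA"), list(" TCGCA"))
print(text)
```

Because the width is computed from the data, the -10 in the first row no longer pushes the other columns out of line, and repeated labels such as the two C's are not a problem.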
Answer 1 (score: 0)
Consider a sample array:
In [334]: arr = np.random.randint(0,25,(5,6))
In [335]: arr
Out[335]:
array([[24, 8, 6, 10, 5, 11],
[11, 5, 19, 6, 10, 5],
[ 6, 2, 0, 12, 6, 17],
[13, 20, 14, 10, 18, 9],
[ 9, 4, 4, 24, 24, 8]])
We can use a pandas DataFrame, like so:
import pandas as pd
In [336]: print(pd.DataFrame(arr,columns=list(' TCGCA'),index=list(' TCCA')))
T C G C A
24 8 6 10 5 11
T 11 5 19 6 10 5
C 6 2 0 12 6 17
C 13 20 14 10 18 9
A 9 4 4 24 24 8
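Note that a pandas DataFrame needs a header (column ID) for every column and an index label for every row. So, to leave the first row and the first column unlabeled, we used label strings that start with a space: ' TCGCA' and ' TCCA'.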
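Answer 2 (score: 0)
Here is a quick version that adds labels using plain Python and numpy.
Define a function that writes the lines. Here it just prints them, but it could be set up to write to a file, or to collect all the lines in a list and return it.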
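def pp(arr,lbl):
    # Header line with the column labels.
    print(' ',' '.join(lbl))
    # One line per row: its label followed by numpy's own formatting of that row.
    for i in range(len(arr)):
        print('%s %s'%(lbl[i], arr[i]))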
In [65]: arr=np.arange(16).reshape(4,4)
The default display of a 2-D array:
In [66]: print(arr)
[[ 0 1 2 3]
[ 4 5 6 7]
[ 8 9 10 11]
[12 13 14 15]]
In [67]: lbl=list('ABCD')
In [68]: pp(arr,lbl)
A B C D
A [0 1 2 3]
B [4 5 6 7]
C [ 8 9 10 11]
D [12 13 14 15]
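The spacing is off because numpy formats each row separately, so each row gets its own element width. But it's a start.
It looks better with a random sample: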
In [69]: arr = np.random.randint(0,25,(4,4))
In [70]: arr
Out[70]:
array([[24, 12, 12, 6],
[22, 16, 18, 6],
[21, 16, 0, 23],
[ 2, 2, 19, 6]])
In [71]: pp(arr,lbl)
A B C D
A [24 12 12 6]
B [22 16 18 6]
C [21 16 0 23]
D [ 2 2 19 6]
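The uneven spacing comes from formatting each row on its own. One possible variation, sketched below, pads every element to the width of the widest one and writes the lines to a file-like object instead of always printing them. The name pp_aligned and its out parameter are hypothetical; the sketch assumes single-character labels as in the examples above, and it works on nested lists as well as 2-D numpy arrays, since both yield rows when iterated.

```python
import io
import sys

def pp_aligned(arr, lbl, out=sys.stdout):
    """Write arr with labels to `out`, padding every element to one width."""
    # Common width for all elements, so every row lines up.
    width = max(len(str(v)) for row in arr for v in row)
    # Three leading spaces skip the row label, a space, and the bracket.
    out.write('   ' + ' '.join(l.rjust(width) for l in lbl) + '\n')
    for label, row in zip(lbl, arr):
        cells = ' '.join(str(v).rjust(width) for v in row)
        out.write('%s [%s]\n' % (label, cells))

arr = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
buf = io.StringIO()          # stands in for an open file
pp_aligned(arr, list('ABCD'), out=buf)
print(buf.getvalue(), end='')
```

Passing an open file as out writes the same lines to disk, and leaving it at the default prints them.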