Question

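I have an XML file containing some data that I am extracting and placing in a numpy record array. When I print the array, I see the data in the correct locations. I would like to know how to take the information in my numpy record array and put it in a table. I am also getting the letter b when I print a record; how do I fix that?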

XML data

<instance name="uart-0" module="uart_16550" offset="000014"/>
<instance name="uart-1"  offset="000020" module="uart_16650"/>

Code in Python

inst_rec=np.zeros(5,dtype=[('name','a20'),('module','a20'),('offset','a5')])

i = 0  # index of the next free row in inst_rec
for node in xml_file.iter():
    if node.tag=="instance":
        attribute=node.attrib.get('name')
        inst_rec[i]=  (node.attrib.get('name'),node.attrib.get('module'),node.attrib.get('offset'))
        i=i+1

for x in range (0,5):
    print(inst_rec[x])

Answer 1

To avoid printing b'xxx', try the following:

print(', '.join(y.decode() for y in inst_rec[x]))
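For anyone who wants to try this without the original file, here is a minimal sketch. It assumes the two sample elements are wrapped in a hypothetical <root> element and parsed from an inline string, and it widens the offset field to six characters; 'S20' is the same bytes type the question spells as 'a20'.

```python
import numpy as np
import xml.etree.ElementTree as ET

# Sample data from the question, wrapped in a hypothetical <root> element.
xml_text = """<root>
<instance name="uart-0" module="uart_16550" offset="000014"/>
<instance name="uart-1" offset="000020" module="uart_16650"/>
</root>"""

root = ET.fromstring(xml_text)
nodes = root.findall('instance')

# Bytes-typed fields: printing a record directly would show b'...' values.
inst_rec = np.zeros(len(nodes),
                    dtype=[('name', 'S20'), ('module', 'S20'), ('offset', 'S6')])
for i, node in enumerate(nodes):
    inst_rec[i] = (node.get('name'), node.get('module'), node.get('offset'))

# Decode every field of every record before joining it into one line.
lines = [', '.join(field.decode() for field in rec) for rec in inst_rec]
for line in lines:
    print(line)
```

Each printed line then contains plain text with no b prefix.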

Answer 2

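You are using Python 3, where ordinary strings are unicode, so byte strings are displayed with a b prefix. The XML file may also be bytes, e.g. encoding='UTF-8'.

You can get rid of the b by passing the strings through decode() before printing.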
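If the array already exists with bytes fields, one way to decode everything at once is to cast it to a matching unicode dtype. This is only a sketch: the array contents are typed in by hand from the sample data, the names raw, text_dtype and decoded are made up for illustration, and the cast assumes the bytes are plain ASCII.

```python
import numpy as np

# A bytes-typed record array like the one in the question (values assumed).
raw = np.array([(b'uart-0', b'uart_16550', b'000014'),
                (b'uart-1', b'uart_16650', b'000020')],
               dtype=[('name', 'S20'), ('module', 'S20'), ('offset', 'S6')])

# Same field names, but unicode ('U') instead of bytes ('S').
text_dtype = [('name', 'U20'), ('module', 'U20'), ('offset', 'U6')]

# astype casts field by field; ASCII bytes become ordinary str values.
decoded = raw.astype(text_dtype)

print(raw[0])      # fields shown with the b prefix
print(decoded[0])  # fields shown as plain strings
```

After the cast, printing or formatting the records needs no per-field decode() call.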

More on writing csv files in Py3: Numpy recarray writes byte literals tags to my csv file?

In my tests, I can simplify the display by making the inst_rec array use unicode strings ('U20'):

import numpy as np
import xml.etree.ElementTree as ET

tree = ET.parse('test.xml')
root = tree.getroot()

# inst_rec=np.zeros(2,dtype=[('name','a20'),('module','a20'),('offset','a5')])
inst_rec = np.zeros(2,dtype=[('name','U20'),('module','U20'),('offset','U5')])

i = 0
for node in root.iter():
    if node.tag=="instance":
        attribute=node.attrib.get('name')
        rec =  (node.attrib.get('name'),node.attrib.get('module'),node.attrib.get('offset'))
        inst_rec[i] = rec
        # no need to decode
        i=i+1

# simple print of the array
print(inst_rec)

# row by row print
for x in range(inst_rec.shape[0]):
    print(inst_rec[x])

# formatted row by row print
for rec in inst_rec:
    print('%20s,%20s, %5s'%tuple(rec))

# write a csv file
np.savetxt('test.out', inst_rec, fmt=['%20s','%20s','%5s'], delimiter=',')

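This produces the following (the offsets are cut to five characters because the field is declared as 'U5'; 'U6' would keep all six digits):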

[('uart-0', 'uart_16550', '00001') ('uart-1', 'uart_16650', '00002')]

('uart-0', 'uart_16550', '00001')
('uart-1', 'uart_16650', '00002')

          uart-0,          uart_16550, 00001
          uart-1,          uart_16650, 00002

and

1703:~/mypy$ cat test.out
          uart-0,          uart_16550,00001
          uart-1,          uart_16650,00002
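If the padding in the file is unwanted, or a header row would help, the standard csv module is an alternative to np.savetxt. A rough sketch, assuming a unicode record array like the one above (rebuilt by hand here, with a six-character offset field) and a made-up helper name write_csv:

```python
import csv
import io
import numpy as np

# Unicode record array with the sample values (typed in by hand for the sketch).
inst_rec = np.array([('uart-0', 'uart_16550', '000014'),
                     ('uart-1', 'uart_16650', '000020')],
                    dtype=[('name', 'U20'), ('module', 'U20'), ('offset', 'U6')])

def write_csv(records, fileobj):
    """Write the field names as a header row, then one row per record."""
    writer = csv.writer(fileobj, lineterminator='\n')
    writer.writerow(records.dtype.names)
    writer.writerows(records.tolist())  # tolist() yields tuples of plain str

# Written to an in-memory buffer here; a real file opened with
# open('test.csv', 'w', newline='') would work the same way.
buf = io.StringIO()
write_csv(inst_rec, buf)
csv_text = buf.getvalue()
print(csv_text, end='')
```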

To display it as an ASCII table:

# formatted row by row print, with a rule between rows
rule = '-' * 55  # same width as each formatted row
print(rule)
for rec in inst_rec:
    print('| %20s | %20s | %5s |'%tuple(rec))
    print(rule)
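The fixed widths above can also be computed from the data, so the rules always line up with the rows. This is just one possible sketch; ascii_table is a hypothetical helper, and the array is rebuilt by hand with a six-character offset field.

```python
import numpy as np

inst_rec = np.array([('uart-0', 'uart_16550', '000014'),
                     ('uart-1', 'uart_16650', '000020')],
                    dtype=[('name', 'U20'), ('module', 'U20'), ('offset', 'U6')])

def ascii_table(records):
    """Return the records as a bordered text table with a header row."""
    names = records.dtype.names
    rows = [tuple(str(v) for v in rec) for rec in records.tolist()]
    # Each column is as wide as its longest cell, header included.
    widths = [max([len(n)] + [len(r[k]) for r in rows])
              for k, n in enumerate(names)]
    rule = '+' + '+'.join('-' * (w + 2) for w in widths) + '+'

    def fmt(cells):
        return '| ' + ' | '.join(c.ljust(w) for c, w in zip(cells, widths)) + ' |'

    lines = [rule, fmt(names), rule]
    lines += [fmt(r) for r in rows]
    lines.append(rule)
    return '\n'.join(lines)

table = ascii_table(inst_rec)
print(table)
```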

If you want something fancier, you need to specify the display tool: HTML, rich text, etc.
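For HTML, for instance, a small table can be put together with nothing but the standard library. The sketch below is only an illustration: html_table is a made-up helper, the array is rebuilt by hand, and the markup is deliberately bare.

```python
import html
import numpy as np

inst_rec = np.array([('uart-0', 'uart_16550', '000014'),
                     ('uart-1', 'uart_16650', '000020')],
                    dtype=[('name', 'U20'), ('module', 'U20'), ('offset', 'U6')])

def html_table(records):
    """Return a minimal <table> with one header row and one row per record."""
    head = ''.join('<th>%s</th>' % html.escape(n) for n in records.dtype.names)
    body = []
    for rec in records.tolist():
        # Escape each value so characters like < or & cannot break the markup.
        cells = ''.join('<td>%s</td>' % html.escape(str(v)) for v in rec)
        body.append('<tr>%s</tr>' % cells)
    return '<table>\n<tr>%s</tr>\n%s\n</table>' % (head, '\n'.join(body))

page = html_table(inst_rec)
print(page)
```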

Added: using the prettytable package:

import prettytable
pp = prettytable.PrettyTable()
pp.field_names = inst_rec.dtype.names
for rec in inst_rec:
    pp.add_row(rec)
print(pp)

which produces

+--------+------------+--------+
|  name  |   module   | offset |
+--------+------------+--------+
| uart-0 | uart_16550 | 00001  |
| uart-1 | uart_16650 | 00002  |
+--------+------------+--------+

In Python 3 I am still using the unicode dtype here. If any of the strings are bytes, prettytable will show the b prefix.

How do I get records from an array into a table in Python?

2 answers: