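I want to read a specific txt file, extract data from it, and write that data into a tuple. The problem is that I don't need all of the data in the file, only certain parts. The text file looks like this: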
HHSDMSDN1-pool 1.02T 141G 39 22 2.62M 940K
**c5t600507680C800000001CBd0 834G 118G** 32 16 2.19M 734K
**c5t600507680C00352d0 216G 22.3G** 7 5 434K 206K
HHSDMSDN2-pool 1.09T 308G 12 6 744K 83.8K
**c5t600507680C800001CDd0 790G 162G** 10 1 617K 12.5K
**c5t600507680C8000000037Dd0 203G 34.8G** 1 0 123K 10.2K
**c5t600507680C800000387d0 126G 112G** 0 5 5.36K 80.5K
HHSDMSDN3-pool 1.13T 33.4G 24 19 1.39M 623K
**c5t600507680C80002E6000001CFd0 921G 30.8G** 18 11 1.10M 465K
**c5t600507680C80002E600000203d0 235G 2.63G** 5 8 293K 158K
The bold text needs to go into the tuple. Ideally the first value would be a string, followed by two doubles/floats.
So the output would be
((c5t600507680C800000001CBd0, 834, 118), (c5t600507680C00352d0, 216, 22.3), .....))
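Any ideas?
Answer 0 (score: 0)
You can simply iterate over the file line by line and keep track of what you have already seen.
New solution, modified as requested: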
import pprint
data = """HHSDMSDN1-pool 1.02T 141G 39 22 2.62M 940K
c5t600507680C800000001CBd0 834G 118G 32 16 2.19M 734K
c5t600507680C00352d0 216G 22.3G 7 5 434K 206K
HHSDMSDN2-pool 1.09T 308G 12 6 744K 83.8K
c5t600507680C800001CDd0 790G 162G 10 1 617K 12.5K
c5t600507680C8000000037Dd0 203G 34.8G 1 0 123K 10.2K
c5t600507680C800000387d0 126G 112G 0 5 5.36K 80.5K
HHSDMSDN3-pool 1.13T 33.4G 24 19 1.39M 623K
c5t600507680C80002E6000001CFd0 921G 30.8G 18 11 1.10M 465K
c5t600507680C80002E600000203d0 235G 2.63G 5 8 293K 158K"""
# collect all records by key
d = {}
# current key "HHSDM..."
k = None
# current records
r = []
for line in data.splitlines():
    # tolerate indented lines and skip blank ones
    line = line.strip()
    if line.startswith("c"):
        # this is a record, append it to the current collection of records
        fields = line.split()
        r.append((fields[0], fields[1], fields[2]))
    elif line.startswith("H"):
        # this is a new key; store the records of the previous key first
        if k is not None:
            d[k] = r
            r = []
        # remember the key without its "-pool" suffix
        k = line.split("-")[0]
# store the records of the last key at the end of the input
if k is not None:
    d[k] = r
pprint.pprint(d)
Output:
{'HHSDMSDN1': [('c5t600507680C800000001CBd0', '834G', '118G'),
               ('c5t600507680C00352d0', '216G', '22.3G')],
 'HHSDMSDN2': [('c5t600507680C800001CDd0', '790G', '162G'),
               ('c5t600507680C8000000037Dd0', '203G', '34.8G'),
               ('c5t600507680C800000387d0', '126G', '112G')],
 'HHSDMSDN3': [('c5t600507680C80002E6000001CFd0', '921G', '30.8G'),
               ('c5t600507680C80002E600000203d0', '235G', '2.63G')]}
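The question asked for a flat tuple of tuples with the two sizes as floats, whereas the output above groups the records by pool and keeps the values as strings. A minimal sketch of that variant follows. The function name `parse_disks` is hypothetical, and the code assumes that pool summary lines are exactly those whose first field ends in "-pool" and that every size ends in a single unit letter. It only strips that letter and does not convert between units such as T and G.

```python
def parse_disks(text):
    """Return a tuple of (name, float, float) for every non-pool line."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        # Assumption: skip blank or short lines and the "...-pool" summary lines.
        if len(fields) < 3 or fields[0].endswith("-pool"):
            continue
        # Assumption: each size ends in exactly one unit letter (e.g. "834G").
        rows.append((fields[0], float(fields[1][:-1]), float(fields[2][:-1])))
    return tuple(rows)


sample = """HHSDMSDN1-pool 1.02T 141G 39 22 2.62M 940K
c5t600507680C800000001CBd0 834G 118G 32 16 2.19M 734K
c5t600507680C00352d0 216G 22.3G 7 5 434K 206K"""

result = parse_disks(sample)
print(result)
# (('c5t600507680C800000001CBd0', 834.0, 118.0), ('c5t600507680C00352d0', 216.0, 22.3))
```

To read from a real file instead of the inline sample, the text could be loaded with `open(path).read()` inside a `with` block and passed to the same function, where `path` stands for wherever the txt file lives.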