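I was given a large text file containing all sorts of data. The first thing I did was iterate over it and add each line to a list, so that every line is one element. Then I split each line so that its fields can be indexed. See the code below.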
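def main():
    full = list()
    companies = list()
    # "with" closes the file even if reading fails
    with open("/usr/local/doc/FEguide.txt", "r") as f:
        for line in f:
            # drop the trailing newline, split into fields, skip the first field
            line = line.strip().split(",")
            full.append(line[1:])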
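Each line in the opened file is comma-separated. (The [1:] slice drops the useless first element of each line, which is not shown below.)
Now I need to let the user enter a search term for a car manufacturer or a type (e.g. standard SUV). My hunch is that I need to build one list containing only the car manufacturers (probably with some kind of slicing), another with all the types, and then return the whole line when there is a match. I am having trouble actually doing this.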
Answer 0 (score: 0)
You can use zip and dict.
Suppose you have this file:
General Motors,Chevrolet,K1500 TAHOE 4WD,2900
General Motors,Chevrolet,TRAVERSE AWD,2750
Chrysler Group LLC,Dodge,Durango AWD,2750
Chrysler Group LLC,Dodge,Durango AWD,3400
Ford Motor Company,Ford,Expedition 4WD,3100
Ford Motor Company,Ford,EXPLORER AWD,275
First define what the headers look like:
# cars.py
cars_list = []
# header list; the keys must match the field names used in format() below
headers = ['company', 'line', 'type', 'annual cost']
with open('/home/ajava/tst.txt') as infile:
    # zip() stops at the shorter input, so a line with the wrong number of
    # fields would silently give an incomplete dict; check the lengths if needed
    for line in infile:
        zipped_list = zip(headers, line.strip().split(','))
        # create a dictionary from the zipped tuples and append it to cars_list
        cars_list.append(dict(zipped_list))
# printing results
print("\t".join(headers))
for item in cars_list:
    print("{company}\t{line}\t{type}\t{annual cost}".format(**item))
Your output should look like this:
company line type annual cost
General Motors Chevrolet K1500 TAHOE 4WD 2900
General Motors Chevrolet TRAVERSE AWD 2750
Chrysler Group LLC Dodge Durango AWD 2750
Chrysler Group LLC Dodge Durango AWD 3400
Ford Motor Company Ford Expedition 4WD 3100
Ford Motor Company Ford EXPLORER AWD 275
This is just a simple example, and you can do it without any extra libraries.
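To get from a list of dicts to the search the question asks for, one option is a case-insensitive substring match over the chosen fields. This is only a minimal sketch: the `search` helper, its `fields` parameter and the inline sample rows are hypothetical names introduced here for illustration, and the column names are assumed to be the four used above.

```python
# Hypothetical inline sample standing in for the contents of the file.
SAMPLE = """General Motors,Chevrolet,K1500 TAHOE 4WD,2900
Chrysler Group LLC,Dodge,Durango AWD,2750
Ford Motor Company,Ford,Expedition 4WD,3100"""

headers = ['company', 'line', 'type', 'annual cost']

# One dict per line, keyed by header name.
cars_list = [dict(zip(headers, row.strip().split(',')))
             for row in SAMPLE.splitlines()]


def search(cars, term, fields=('company', 'line', 'type')):
    """Return the cars where `term` occurs in any of `fields`, ignoring case."""
    term = term.lower()
    return [car for car in cars
            if any(term in car[field].lower() for field in fields)]


# Print every row that mentions "dodge" in one of the searched fields.
for car in search(cars_list, 'dodge'):
    print("\t".join(car[h] for h in headers))
```

In an interactive script the term would probably come from `input()`, and an empty result list can be reported as "no match".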
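Answer 1 (score: 0)
I don't think there is any need to reinvent the wheel unless this is a programming assignment. I will leave the command-line interaction to you, but the basic functionality can be done with pandas: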
import pandas as pd
df = pd.read_csv('FEguide.txt')
print('----------------------------')
print('All companies sorted:')
# DataFrame.sort() was removed from pandas; sort_values() replaces it
print(df.sort_values('Company').Company)
print('----------------------------')
print('All Dodge models:')
print(df[df['Line'] == 'Dodge'])
print('----------------------------')
print('Mean MPG and annual cost per company')
# numeric_only=True skips text columns such as Line and Category
print(df.groupby('Company').mean(numeric_only=True))
print('----------------------------')
print('Mean MPG and annual cost per car type')
print(df.groupby('Type').mean(numeric_only=True))
Output:
----------------------------
All companies sorted:
2 Chrysler Group LLC
3 Chrysler Group LLC
4 Ford Motor Company
5 Ford Motor Company
1 General Motors
0 General Motors
Name: Company, dtype: object
----------------------------
All Dodge models:
Company Line Type MPG Annual Cost Category
2 Chrysler Group LLC Dodge Durango AWD 19 2750 Standard SUV 4WD
3 Chrysler Group LLC Dodge Durango AWD 16 3400 Standard SUV 4WD
----------------------------
Mean MPG and annual cost per company
MPG Annual Cost
Company
Chrysler Group LLC 17.5 3075
Ford Motor Company 18.0 2925
General Motors 18.5 2825
----------------------------
Mean MPG and annual cost per car type
MPG Annual Cost
Type
Durango AWD 17.5 3075
EXPLORER AWD 19.0 2750
Expedition 4WD 17.0 3100
K1500 TAHOE 4WD 18.0 2900
TRAVERSE AWD 19.0 2750
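For the user-entered search term from the question, a pandas version could combine `Series.str.contains` masks over several columns. The following is a rough sketch under assumptions: the `search` helper is a hypothetical name, the column names are taken from the output above, and the inline rows are illustrative sample data (the Category values for the Ford and General Motors rows are made up), not the real file.

```python
import io

import pandas as pd

# Illustrative inline sample; the real data would come from pd.read_csv(path).
CSV = """Company,Line,Type,MPG,Annual Cost,Category
Chrysler Group LLC,Dodge,Durango AWD,19,2750,Standard SUV 4WD
Ford Motor Company,Ford,Expedition 4WD,17,3100,Standard SUV 4WD
General Motors,Chevrolet,TRAVERSE AWD,19,2750,Standard SUV 4WD
"""
df = pd.read_csv(io.StringIO(CSV))


def search(frame, term, columns=('Company', 'Line', 'Category')):
    """Return the rows where `term` occurs in any of `columns`, ignoring case."""
    mask = pd.Series(False, index=frame.index)
    for column in columns:
        # regex=False treats the term as plain text; na=False skips missing values
        mask |= frame[column].str.contains(term, case=False, regex=False, na=False)
    return frame[mask]


print(search(df, 'dodge'))
```

Passing `regex=False` keeps characters such as `(` or `+` in a user-typed term from being interpreted as a regular expression.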