Question

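So I've been given a large text file containing all sorts of data. The first thing I did was iterate over it and add it to a list so that each line is an element. Then I split each line so that its fields can be indexed. See the code below.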

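def main():
    full = []
    companies = []  # not used yet
    # Read the file line by line; each line becomes a list of fields
    with open("/usr/local/doc/FEguide.txt", "r") as f:
        for line in f:
            fields = line.strip().split(",")
            # Drop the useless first field
            full.append(fields[1:])
    return full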

A line in the opened file has this format. (The [1:] slice is there to drop the useless first element of each line, which is not shown below.)

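Now I need to make it so that the user can enter a search term for a car manufacturer or a type (e.g. Standard SUV). My hunch is that I need to build a list of only the car manufacturers (which could be done with some kind of slicing), then a list of all the types, and then return the whole line if the term matches. I'm having trouble actually implementing this.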

Answer 1

You can use zip and dict.
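
As a quick illustration of the idea (a minimal sketch; the `row_to_dict` helper and the single sample row are made up for this example), `zip` pairs each header with the value at the same position, and `dict` turns those pairs into a record. Since `zip` silently stops at the shorter input, the sketch checks the field count first.

```python
headers = ['company', 'line', 'type', 'annual cost']

def row_to_dict(line, headers=headers):
    """Hypothetical helper: turn one comma-separated line into a dict."""
    values = line.strip().split(',')
    # zip() stops at the shorter sequence, so reject malformed rows explicitly
    if len(values) != len(headers):
        raise ValueError(f"expected {len(headers)} fields, got {len(values)}")
    return dict(zip(headers, values))

record = row_to_dict("General Motors,Chevrolet,TRAVERSE AWD,2750\n")
print(record['line'])
```

Whether a malformed row should raise an error or simply be skipped depends on how clean the file is; raising is just the stricter choice.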

Suppose you have this file:

General Motors,Chevrolet,K1500 TAHOE 4WD,2900
General Motors,Chevrolet,TRAVERSE AWD,2750
Chrysler Group LLC,Dodge,Durango AWD,2750   
Chrysler Group LLC,Dodge,Durango AWD,3400
Ford Motor Company,Ford,Expedition 4WD,3100
Ford Motor Company,Ford,EXPLORER AWD,275

First define what the headers look like:

... ... cars.py

cars_list = []
# header list; the names must match the fields used in the format string below
headers = ['company', 'line', 'type', 'annual cost']

with open('/home/ajava/tst.txt') as file:
    # zip() stops at the shorter input, so consider checking that each line
    # has as many fields as there are headers
    for line in file:
        zipped_list = zip(headers, line.strip().split(','))

        # create a dictionary from the zipped tuples and append it to cars_list
        cars_list.append(dict(zipped_list))

# printing results
print("\t".join(headers))
for item in cars_list:
    print("{company}\t{line}\t{type}\t{annual cost}".format(**item))

Your output should look like this:

  company   line    type    annual cost
  General Motors    Chevrolet   K1500 TAHOE 4WD 2900
  General Motors    Chevrolet   TRAVERSE AWD    2750
  Chrysler Group LLC    Dodge   Durango AWD 2750
  Chrysler Group LLC    Dodge   Durango AWD 3400
  Ford Motor Company    Ford    Expedition 4WD  3100
  Ford Motor Company    Ford    EXPLORER AWD    275

This is just a simple example; you can do this without any extra libraries.
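
To get from there to the search the question asks for, one possible approach (a sketch, not the only way; the `search_cars` function is a hypothetical helper, and the sample records are simply keyed by the column names shown in the output header) is a case-insensitive substring match against the `line` and `type` fields of each dict:

```python
# Sample records keyed by the column names from the output header
cars_list = [
    {'company': 'General Motors', 'line': 'Chevrolet',
     'type': 'K1500 TAHOE 4WD', 'annual cost': '2900'},
    {'company': 'Chrysler Group LLC', 'line': 'Dodge',
     'type': 'Durango AWD', 'annual cost': '2750'},
    {'company': 'Ford Motor Company', 'line': 'Ford',
     'type': 'Expedition 4WD', 'annual cost': '3100'},
]

def search_cars(cars, term, fields=('line', 'type')):
    """Hypothetical helper: records where term occurs in any of the fields."""
    term = term.strip().lower()
    return [car for car in cars
            if any(term in car[field].lower() for field in fields)]

# Print every record whose make or type contains the search term
for car in search_cars(cars_list, 'dodge'):
    print(car['company'], car['line'], car['type'], car['annual cost'], sep='\t')
```

In a real script the term would come from `input()`, and the `fields` tuple could be extended if other columns should be searchable too.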

Answer 2

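I don't think there's any need to reinvent the wheel, unless this is a programming assignment. I'll leave the command-line interaction to you, but here is the basic functionality using pandas: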

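import pandas as pd

df = pd.read_csv('FEguid.txt')
numeric = ['MPG', 'Annual Cost']  # numeric columns to average

print('----------------------------')
print('All companies sorted:')
# DataFrame.sort was removed from pandas; sort_values replaces it
print(df.sort_values('Company')['Company'])
print('----------------------------')
print('All Dodge models:')
print(df[df['Line'] == 'Dodge'])
print('----------------------------')
print('Mean MPG and annual cost per company')
# Select the numeric columns explicitly so text columns are not averaged
print(df.groupby('Company')[numeric].mean())
print('----------------------------')
print('Mean MPG and annual cost per car type')
print(df.groupby('Type')[numeric].mean())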

Output:

----------------------------
All companies sorted:
2    Chrysler Group LLC
3    Chrysler Group LLC
4    Ford Motor Company
5    Ford Motor Company
1        General Motors
0        General Motors
Name: Company, dtype: object
----------------------------
All Dodge models:
              Company   Line         Type  MPG  Annual Cost          Category
2  Chrysler Group LLC  Dodge  Durango AWD   19         2750  Standard SUV 4WD
3  Chrysler Group LLC  Dodge  Durango AWD   16         3400  Standard SUV 4WD
----------------------------
Mean MPG and annual cost per company
                     MPG  Annual Cost
Company
Chrysler Group LLC  17.5         3075
Ford Motor Company  18.0         2925
General Motors      18.5         2825
----------------------------
Mean MPG and annual cost per car type
                  MPG  Annual Cost
Type
Durango AWD      17.5         3075
EXPLORER AWD     19.0         2750
Expedition 4WD   17.0         3100
K1500 TAHOE 4WD  18.0         2900
TRAVERSE AWD     19.0         2750
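
For the user-entered search term itself, a sketch along the same lines could use `Series.str.contains`. This assumes pandas is installed and that the file has the `Line` and `Category` columns shown in the output above; the `search` function name and the inline sample data are made up for illustration.

```python
import io
import pandas as pd

# Inline stand-in for the file; values are taken from the output above.
# Category is only shown there for the Dodge rows, so it is left blank elsewhere.
sample = io.StringIO(
    "Company,Line,Type,MPG,Annual Cost,Category\n"
    "General Motors,Chevrolet,K1500 TAHOE 4WD,18,2900,\n"
    "Chrysler Group LLC,Dodge,Durango AWD,19,2750,Standard SUV 4WD\n"
    "Chrysler Group LLC,Dodge,Durango AWD,16,3400,Standard SUV 4WD\n"
    "Ford Motor Company,Ford,Expedition 4WD,17,3100,\n"
)
df = pd.read_csv(sample)

def search(df, term, columns=('Line', 'Category')):
    """Hypothetical helper: rows where term occurs in any of the given columns."""
    mask = pd.Series(False, index=df.index)
    for column in columns:
        # regex=False treats term as plain text; na=False skips missing values
        mask |= df[column].str.contains(term, case=False, regex=False, na=False)
    return df[mask]

print(search(df, 'standard suv'))
print(search(df, 'ford'))
```

Matching on substrings rather than exact equality means a term like "standard suv" finds "Standard SUV 4WD" as well, which is probably what a user typing a search term expects.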

Searching a nested list