Web Scraping Python: not getting the required data from a web page's drop-down menu

Time: 2019-03-19 13:30:19

Tags: python-3.x web-scraping beautifulsoup request

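I am trying to get data from a web page. Here is the link: https://www.cardekho.com/compare-cars. On this page, once the car models and their variants have been chosen in the drop-down menus, I need to scrape the comparison table of car data and specifications from the resulting URL. Here is my sample code.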

---
- name: test
  hosts: localhost
  tasks:
    - name: Installation of postgresql-9.6
      apt:
        name: postgresql-9.6

    - name: start postgresql service
      service: name=postgresql state=restarted enabled=yes

    - name: create a database
      postgresql_db:
        name: managys
        encoding: UTF-8
        template: template0
        state: present
      become_user: postgres
      become: yes

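The problem is that, because of the URL, I am not getting the exact data I need. That is, if I give four car models and their variants to compare, it returns data for an arbitrary variant of each model from the drop-down menu.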

Can anyone explain how I can solve this and get the exact data I need from that URL?

Any help would be much appreciated.

1 Answer:

Answer 0: (score: 0)

You are doing a lot of work to parse those tables. pandas can do that work for you with .read_html().

It returns a list of DataFrames. Just select the DataFrame you want and write it to CSV with pandas' .to_csv().
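Selecting a DataFrame by position works, but it depends on the order of the tables on the page. As a rough sketch, a table could instead be picked by the header of its first column. The helper name `pick_table` is hypothetical (it is not part of pandas), and the sketch assumes each table has plain, single-level column headers such as "Overview" or "Engine".

```python
import pandas as pd


def pick_table(tables, first_header):
    """Return the first DataFrame whose first column header equals first_header.

    Hypothetical helper, not a pandas function. It assumes single-level
    column headers, which may not hold for every page.
    """
    for table in tables:
        if len(table.columns) > 0 and str(table.columns[0]) == first_header:
            return table
    raise KeyError(f"no table whose first column is {first_header!r}")
```

With the list returned by `pd.read_html(url)`, a call such as `pick_table(tables, "Engine")` would then stand in for an index like `tables[1]`.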

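If it were me, I would condense the code below into a loop that iterates over the tables, but I have written it out in full so you can see it broken down (if that helps).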

import pandas as pd

url = 'https://www.cardekho.com/compare/maruti-gypsy-and-maruti-omni.htm'
# read_html() returns one DataFrame per <table> found on the page
tables = pd.read_html(url)
# car names come from the header of the first table (every column but the first)
compare_cols = list(tables[0].columns[1:])

overview = tables[0]

# the remaining tables reuse the car names as column headers
engine = tables[1]
engine.columns = [engine.columns[0]] + compare_cols

transmission = tables[2]
transmission.columns = [transmission.columns[0]] + compare_cols

steering = tables[3]
steering.columns = [steering.columns[0]] + compare_cols

brakes_system = tables[4]
brakes_system.columns = [brakes_system.columns[0]] + compare_cols


# write each table to its own CSV file
overview.to_csv('D:/CarDekho_Data/maruti/maruti_2/overview.csv', index=False)
engine.to_csv('D:/CarDekho_Data/maruti/maruti_2/engine.csv', index=False)
transmission.to_csv('D:/CarDekho_Data/maruti/maruti_2/transmission.csv', index=False)
steering.to_csv('D:/CarDekho_Data/maruti/maruti_2/steering.csv', index=False)
brakes_system.to_csv('D:/CarDekho_Data/maruti/maruti_2/brakes_system.csv', index=False)

Output:

print (overview)
                                      Overview                        ...                                                                       Omni
0                                On Road Price                        ...                                                               Rs.3,36,883*
1                                    Fuel Type                        ...                                                                     Petrol
2                     Engine Displacement (cc)                        ...                                                                        796
3                             Available Colors                        ...                          Fantasy BlackMetallic silky silverMetallic Pea...
4                                    Body Type                        ...                                                                    Minivan
5                                    Max Power                        ...                                                            34.2bhp@5000rpm
6                                 User Reviews                        ...                                                     4.5Based on 45 Reviews
7                               Mileage (ARAI)                        ...                                                                  16.8 kmpl
8                                 Cargo Volume                        ...                                                                 210-litres
9                           Fuel Tank Capacity                        ...                                                                   35Litres
10                            Seating Capacity                        ...                                                                          5
11                           Transmission Type                        ...                                                                     Manual
12                           Offers & Discount                        ...                                                            1 OfferView now
13                     Finance Available (EMI)                        ...                                                         Rs.6,510 Check Now
14                           Insurance SaveBig                        ...                                                          Rs.17,146Know how
15                                Service Cost                        ...                                                                   Rs.2,996
16                                         NaN                        ...                                                                        NaN
17                             Air Conditioner                        ...                                                                         No
18                                   Cd Player                        ...                                                                         No
19                    Anti Lock Braking System                        ...                                                                         No
20                              Power Steering                        ...                                                                         No
21                         Power Windows Front                        ...                                                                         No
22                          Power Windows Rear                        ...                                                                         No
23                               Leather Seats                        ...                                                                         No
24                Speed Sensing Auto Door Lock                        ...                                                                         No
25             Impact Sensing Auto Door Unlock                        ...                                                                          -
26                             Air Conditioner                        ...                                                                         No
27                                      Heater                        ...                                                                         No
28                         Adjustable Steering                        ...                                                                         No
29                                  Tachometer                        ...                                                                         No
..                                         ...                        ...                                                                        ...
47                       Adjustable Headlights                        ...                                                                        Yes
48                            Fog Lights Front                        ...                                                                         No
49                             Fog Lights Rear                        ...                                                                         No
50  Power Adjustable Exterior Rear View Mirror                        ...                                                                         No
51    Manually Adjustable Ext Rear View Mirror                        ...                                                                        Yes
52           Electric Folding Rear View Mirror                        ...                                                                         No
53                          Rain Sensing Wiper                        ...                                                                         No
54                           Rear Window Wiper                        ...                                                                         No
55                          Rear Window Washer                        ...                                                                         No
56                        Rear Window Defogger                        ...                                                                         No
57                                Wheel Covers                        ...                                                                         No
58                                Alloy Wheels                        ...                                                                         No
59                               Power Antenna                        ...                                                                         No
60                                Tinted Glass                        ...                                                                         No
61                                Rear Spoiler                        ...                                                                         No
62                Removable Or Convertible Top                        ...                                                                         No
63                                Roof Carrier                        ...                                                                         No
64                                    Sun Roof                        ...                                                                         No
65                                   Moon Roof                        ...                                                                         No
66                                Side Stepper                        ...                                                                         No
67    Outside Rear View Mirror Turn Indicators                        ...                                                                         No
68                          Integrated Antenna                        ...                                                                         No
69                               Chrome Grille                        ...                                                                         No
70                              Chrome Garnish                        ...                                                                         No
71                             Smoke Headlamps                        ...                                                                         No
72                                   Roof Rail                        ...                                                                         No
73                                    Lighting                        ...                                                                         No
74                                Trunk Opener                        ...                                                                      Lever
75                         Additional Features                        ...                          2 Speed Windshield WiperFront And Rear Thermop...
76                          Heated Wing Mirror                        ...                                                                         No

[77 rows x 3 columns]

...

print (engine)
                 Engine                 ...                                                  Omni
0                  Type                 ...                                        In-Line Engine
1          Displacement                 ...                                                   796
2             Max Power                 ...                                       34.2bhp@5000rpm
3                  Year                 ...                                                  2010
4            Max Torque                 ...                                          59Nm@2500rpm
5           Description                 ...                   0.8-litre 34.2bhp 6V In-Line Engine
6        No Of Cylinder                 ...                                                     3
7   Valves Per Cylinder                 ...                                                     2
8   Valve Configuration                 ...                                                  SOHC
9    Fuel Supply System                 ...                                                  MPFI
10         Bore XStroke                 ...                                                    No
11    Compression Ratio                 ...                                                    No
12        Turbo Charger                 ...                                                    No
13        Super Charger                 ...                                                    No

[14 rows x 3 columns]

etc.
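For reference, the loop version mentioned in the answer might look roughly like the sketch below. The function name `label_and_save`, the `names` list, and the output directory are assumptions made for illustration. The sketch also assumes, as the expanded code does, that every table has the same number of columns as the first one.

```python
from pathlib import Path

import pandas as pd


def label_and_save(tables, names, out_dir):
    """Give each table the car-name headers of the first table, then write it to CSV.

    Hypothetical helper: `names` supplies one file name per table, in page order.
    Returns a dict mapping each name to the path that was written.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    # Car names come from the first table's header (every column but the first).
    compare_cols = list(tables[0].columns[1:])
    saved = {}
    for name, table in zip(names, tables):
        table = table.copy()  # leave the caller's DataFrame untouched
        # Keep the table's own first header, reuse the car names for the rest.
        table.columns = [table.columns[0]] + compare_cols
        path = out_dir / f"{name}.csv"
        table.to_csv(path, index=False)
        saved[name] = path
    return saved
```

It could be called as `label_and_save(pd.read_html(url), ['overview', 'engine', 'transmission', 'steering', 'brakes_system'], 'cardekho_data')`. Because `zip` stops at the shorter sequence, any tables beyond the listed names are simply skipped.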