How to encode only the categorical data in a DataFrame
Income Length of Residence Median House Value Number of Vehicles Percentage Asian Percentage Black Percentage English Speaking Percentage Hispanic Percentage White MakeDescr SeriesDescr Msrp
1 90000 15.0 F 4 1 1 71 6 81 HYUNDAI Sonata-4 Cyl. 19395.0
2 125000 7.0 H 1 11 1 91 1 81 JEEP Grand Cherokee-V6 29135.0
3 90000 8.0 F 1 1 1 71 6 86 JEEP Liberty 20700.0
4 125000 8.0 F 3 1 1 86 6 86 VOLKSWAGEN Passat-V6 28750.0
5 90000 8.0 F 1 1 1 71 6 81 JEEP Wrangler 20210.0
6 110000 7.0 G 5 6 6 71 6 76 HYUNDAI Santa Fe-V6 25645.0
7 110000 7.0 G 3 11 6 71 6 71 HYUNDAI Sonata-4 Cyl. 15999.0
8 125000 8.0 G 1 1 11 81 6 76 HYUNDAI Santa Fe-V6 23645.0
9 125000 9.0 G 1 6 1 91 1 86 CHEVROLET TRUCK Trailblazer EXT 32040.0
10 110000 8.0 E 2 6 46 81 16 26 JEEP Wrangler-V6 18660.0
11 125000 11.0 G 3 6 1 76 1 86 CHEVROLET TRUCK Silverado 2500 HD 31775.0
12 125000 12.0 G 2 11 6 66 1 71 CHEVROLET Cobalt 13675.0
13 125000 13.0 G 2 1 16 95 6 71 HYUNDAI Veracruz-V6 28600.0
15 110000 11.0 F 5 6 41 61 11 41 HYUNDAI Santa Fe 22499.0
16 125000 9.0 F 2 1 6 91 1 81 HYUNDAI Santa Fe 22499.0
17 125000 8.0 G 2 11 11 66 1 66 MITSUBISHI Endeavor-V6 32602.0
18 110000 12.0 E 1 6 46 81 16 26 HYUNDAI Accent-4 Cyl. 10899.0
19 90000 9.0 F 4 1 6 71 6 81 JEEP Grand Cherokee-6 Cyl. 29080.0
21 125000 8.0 G 1 6 1 76 1 86 MITSUBISHI Endeavor-V6 29302.0
22 110000 12.0 F 2 6 26 66 11 51 HYUNDAI Santa Fe 22499.0
23 90000 9.0 F 1 6 6 66 6 76 HYUNDAI Santa Fe-V6 20995.0
24 125000 9.0 H 1 6 1 91 1 81 HYUNDAI Sonata-V6 18799.0
25 90000 14.0 F 2 1 6 71 11 81 HYUNDAI Elantra-4 Cyl. 13299.0
26 125000 9.0 G 3 1 11 81 6 76 JEEP Grand Cherokee-6 Cyl. 29080.0
27 125000 8.0 H 5 6 1 91 1 81 CHEVROLET TRUCK Trailblazer 29395.0
28 110000 12.0 E 4 6 41 61 11 36 HYUNDAI Sonata-4 Cyl. 15999.0
29 110000 10.0 E 1 6 41 61 11 36 HYUNDAI Santa Fe-V6 20995.0
30 125000 10.0 F 2 6 1 71 6 86 CHEVROLET TRUCK Tahoe 37000.0
32 90000 10.0 F 1 1 1 71 6 86 MITSUBISHI Galant-V6 19997.0
33 125000 12.0 F 1 1 1 86 6 86 CHEVROLET TRUCK Trailblazer 28175.0
... ... ... ... ... ... ... ... ... ... ... ... ...
4451 110000 9.0 F 3 6 41 61 11 36 NISSAN Sentra-4 Cyl. 17990.0
4452 125000 11.0 G 2 1 11 81 6 76 CHEVROLET TRUCK Tahoe 39515.0
4453 125000 8.0 H 1 6 1 91 1 81 HYUNDAI Elantra-4 Cyl. 15195.0
4454 110000 10.0 F 3 6 41 61 11 41 HYUNDAI Genesis-4 Cyl. 26750.0
4455 125000 7.0 H 4 11 1 76 1 76 HYUNDAI Sonata-4 Cyl. 19695.0
4456 125000 9.0 G 5 6 1 76 1 86 NISSAN Altima 22500.0
4457 110000 11.0 E 1 6 46 81 16 26 GMC LIGHT DUTY Denali 51935.0
4458 125000 6.0 H 1 11 1 76 1 76 JEEP Liberty-V6 24865.0
4459 125000 12.0 G 3 1 16 95 6 71 HONDA Accord-V6 26700.0
4460 125000 7.0 F 1 1 1 86 6 86 HYUNDAI Veloster-4 Cyl. 17300.0
4461 90000 10.0 F 2 6 11 66 6 71 CADILLAC SRX-V6 42210.0
4463 110000 8.0 F 3 6 26 61 11 56 GMC LIGHT DUTY Acadia 42390.0
4468 125000 8.0 G 1 1 1 91 1 86 HONDA Pilot-V6 40820.0
4469 125000 10.0 H 5 11 1 91 1 81 TOYOTA Highlander-V6 30695.0
4470 110000 12.0 F 1 6 41 61 11 41 HYUNDAI Elantra-4 Cyl. 15195.0
4473 110000 13.0 F 1 6 21 66 6 61 ACURA TSX 32910.0
4476 125000 9.0 G 1 6 1 76 1 86 BMW X3 36750.0
4482 125000 10.0 H 1 6 1 91 1 81 SUBARU Forester-4 Cyl. 21195.0
4486 125000 11.0 H 2 6 1 91 1 81 GMC LIGHT DUTY Yukon XL 44315.0
4492 125000 10.0 H 2 6 1 91 1 81 BMW 5 Series 53400.0
4493 110000 12.0 G 2 6 6 71 6 76 ACURA TL 33725.0
4494 125000 12.0 F 3 1 1 86 6 86 ACURA TL 33725.0
4495 125000 12.0 F 3 1 1 86 6 86 ACURA TL 33725.0
4496 125000 7.0 G 5 1 11 81 6 76 ACURA TL 33325.0
4497 125000 9.0 G 1 6 1 76 1 86 ACURA TL 33725.0
4498 125000 12.0 G 3 1 11 81 6 76 ACURA TL 33725.0
4499 110000 14.0 G 8 11 6 71 6 71 ACURA TL 33725.0
4501 125000 9.0 G 3 11 6 66 1 71 FORD Taurus-V6 20050.0
4502 110000 2.0 G 4 11 6 71 6 71 DODGE Stratus-4 Cyl. 15910.0
4503 125000 8.0 F 1 1 1 86 6 86 DODGE Stratus-4 Cyl. 19145.0
Answer 0 (score: 0)
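# Using the standard scikit-learn label encoder.
from sklearn.preprocessing import LabelEncoder

# Encode all string columns, assuming every categorical column has dtype object (str).
encoders = {}
for c in df.select_dtypes(include=['object']).columns:
    print("Encoding column " + c)
    # Fit a fresh encoder per column and keep it so the codes can be mapped back later.
    encoders[c] = LabelEncoder()
    df[c] = encoders[c].fit_transform(df[c])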
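As a possible alternative to the loop above, the same idea can be sketched with pandas alone by converting each non-numeric column to the category dtype and taking its integer codes. The three-row frame and the helper name encode_categoricals are made up for illustration (only the column names come from the question), and the sketch assumes that every non-numeric column should be treated as categorical.

```python
import pandas as pd

# Hypothetical three-row sample that reuses a few of the question's column names.
df = pd.DataFrame({
    "Income": [90000, 125000, 90000],
    "Median House Value": ["F", "H", "F"],
    "MakeDescr": ["HYUNDAI", "JEEP", "JEEP"],
    "Msrp": [19395.0, 29135.0, 20700.0],
})

def encode_categoricals(frame):
    """Return a copy with non-numeric columns replaced by integer codes,
    together with a dict mapping each encoded column to its category labels."""
    encoded = frame.copy()
    mappings = {}
    # Assumption: any column that is not numeric is a categorical column.
    for col in encoded.select_dtypes(exclude=["number"]).columns:
        as_cat = encoded[col].astype("category")
        # The position of a label in this list is the integer code assigned to it.
        mappings[col] = list(as_cat.cat.categories)
        encoded[col] = as_cat.cat.codes
    return encoded, mappings

encoded_df, mappings = encode_categoricals(df)
print(encoded_df)
print(mappings)
```

Numeric columns such as Income and Msrp pass through unchanged, and the returned mappings make it possible to translate the codes back into the original labels. Note that integer codes imply an ordering that may not be meaningful for columns like MakeDescr; if the downstream model is sensitive to that, one-hot encoding (for example with pd.get_dummies) may be a better fit.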