Minimum values of a DataFrame without duplicates

Date: 2018-12-08 13:16:51

Tags: python pandas

I have a CSV file named mycsv.csv that looks like this:

Cargo,Hartford Plastics Incartford,Beyond Landscape & Design Llcilsonville,Empire Of Dirt Llcquality,James Haas Al Haas Shelly Haasairfield,Ibrahim Chimandalpharetta,John Bianchiake Havasu City,Macomb Iron Llchesterfield,Robert Robertsonairhope,Viking Products Of Austin Incustin,Arachus Incashville,Dawna L Zanderppleton,Michael J Geenenaukauna,Peet'S Tree Serviceinterport,Filippo Lumaroallston,Jorge L Denisollywood,Ramiro Castilloucson,Paul J Krez Companyorton Grove,Sullivans' Homestead Inclympia,Jeffrey A Shepardypsum,Wayne E Bollinauvoo,Gary Lee Wilcoxpencer,Jacques N Faucherochester,Sidhu Trucking Incarrisburg,Edmon'S Unique Furniture & Stone Gallery Inc.Os Angeles,Ricardo Juradoacramento,Turenne Auto Body Llchorp,Allen R Pruittrown City,Kjellberg'S Carpet Oneuffalo,Dupree Testing Services Incutchinson,Vincent Rodriguezansas City,Loren Martinrand Junction,I N H Relocation Services Incoodbridge,Arrow Towing Llcent,Scott Cassidylanchardville,Como Construction Llcottstown,High Pines Farm Llcontello,Jwj Interests Incealy,Efrain Morales Diazorwalk,Rubye Hunterincolnton,Jevin Q Watsonillsboro,Martinez Transport Llcdaho Falls,Jeffery Allan Luiacine,Fish-Bones Towingew York,Wisebuys Stores Incouverneur
"City: Sikeston, Origin Lat/Lng: 36.876719,-89.5878579",288.16518178529924,2856.8366613586554,1946.2322551107845,1228.7434107157815,743.5177395236359,2909.33297790036,852.3043025072789,860.2416439860219,189.52901019707753,1300.0936820850939,2640.8430727593295,1172.415369821843,1146.3280101778155,947.4337222262991,1634.4349663994628,1983.122282932816,714.6851307436431,1477.1034889590278,477.7801455911509,1981.2631525746076,1005.9742445494613,504.68607570727,700.5054148429766,552.0350230026464,967.8909355757775,1146.3280101778155,947.4337222262991,730.2392566358949,956.2574234749868,1988.789934814995,2940.692951990481,1171.4798330278643,1350.557792572815,1377.6156311463164,1399.181008586136,581.1723144043883,1162.9211582598814,298.87161451448185,1396.3045898980015,624.5449226336435,581.1723144043883,447.3476037320306,636.7387700671857,741.9527088004397
"City: Christiansburg, Origin Lat/Lng: 37.1298517,-80.40893890000001",703.5514437626452,3646.7240547910224,1287.2084294531364,476.66667363219426,572.1520448507418,3577.5718158504587,498.42670119536973,1041.0612952927277,630.7463607947917,584.2444420355631,3325.6155567650253,1980.748881777687,1857.1281805600845,1587.7746854896682,2423.9127425154784,2732.1005263428588,238.6749202210914,1202.9080347940371,487.3460761286618,2729.7043206859803,1221.7460376077022,1263.76568052344,1439.7852953143308,830.9810543573747,172.51084998596346,1857.1281805600845,1587.7746854896682,405.20236695702386,1609.4944269930504,2801.118901401817,3640.817394557307,1883.0565213079258,635.5904239078004,2152.022664531193,683.5864159350897,813.7026734983782,400.8171247760424,1106.246368376282,664.8702291979142,531.8245049567981,813.7026734983782,783.1829715964839,994.6024025408867,259.56563195212976
"City: Columbus, Origin Lat/Lng: 39.9611755,-82.99879419999998",707.2651879191679,3365.7323351556092,1277.6185237536404,604.4488086045949,195.10312858490906,3224.4350677141797,203.3647448594909,653.9307094789133,535.8888964616413,651.184697488255,2979.6169916093795,1819.3140228573561,1808.6121466380644,1583.3491585697025,2153.458353721382,2422.7773595061117,524.259319521918,1541.059466802884,603.1147416605652,2420.0553726935846,834.6055929242748,1005.2149731399678,1365.455323769065,468.2949696761961,437.10925521373713,1808.6121466380644,1583.3491585697025,64.90416879206225,1597.5947617088123,2568.9801062784295,3299.0608085713484,1834.1127509895468,698.7699871262757,1870.608048759259,745.6029824569731,443.71249946107474,553.6312285326358,908.1974560461448,749.1370577886718,731.7103094305381,443.71249946107474,854.4055024538463,627.6209816438272,139.41862842287634
"City: Hebron, Origin Lat/Lng: 39.0661472,-84.70318879999998",543.1571809872214,3235.8655676957537,1454.0632660728259,755.8453951819669,304.88260453492325,3144.5104626988773,372.0108246469072,665.8331374357574,361.57883701393706,815.2407173452839,2892.571671221449,1650.592335949941,1631.752432596107,1408.830952940175,2017.2468207132285,2307.4828608731386,485.27222589324697,1481.3740204310964,478.4706491728169,2304.961992976113,847.8264861551796,859.3305642977896,1188.5495047616587,408.63315749508143,538.3344477730933,1631.752432596107,1408.830952940175,240.35368311648756,1422.4819792731755,2418.882252874563,3208.1796781501776,1657.2411361425955,864.4838384320662,1738.9484637837286,912.3717609741619,398.8811384621948,697.0422049825474,742.0839140363088,912.6647386349476,626.9640546291491,398.8811384621948,700.3154509586359,571.7401380515648,262.8939005965172
"City: Hickory, Origin Lat/Lng: 35.7344538,-81.3444573",584.3259105925222,3604.7782919285783,1457.0410227603552,641.0095286279536,686.8181483851599,3592.393831351412,641.9439930023347,1129.1378356640935,559.996155010706,753.8473200329004,3333.9710358127886,1888.3950267117884,1724.8267634834451,1445.0961578717026,2381.4445564511193,2712.8295220612413,64.84434046117352,1054.0113528730114,346.348873385156,2710.678863270488,1311.8171731829068,1230.8687591110142,1321.4234800772506,882.9333807688221,335.7520570225346,1724.8267634834451,1445.0961578717026,527.7482415828692,1468.7073468257295,2735.2117696530613,3644.24712640324,1750.6527710586943,804.3781070356566,2118.333634654824,851.4184243771355,873.4028682922321,564.735804395547,1047.9529798860856,829.8262537785849,361.3028967876256,873.4028682922321,635.949358799685,1045.3431322395745,391.4759589703787
"City: Northfield, Origin Lat/Lng: 44.458298299999996,-93.161604",1171.8450397951387,2523.7272281295554,1919.4882999135712,1518.3084638568828,839.6274550881894,2257.7415153062366,988.4669352858685,407.9892461251587,990.5677733060153,1515.5518691739817,2021.7766279982875,1422.5400230650566,1718.038406370119,1653.4863761024228,1390.5369314236416,1542.8563681118137,1397.1318666901464,2332.8776446458683,1284.5416857231658,1539.4624001620568,347.2709830787805,616.6010940338771,1332.0652057065838,515.0489975764948,1410.710722878088,1718.038406370119,1653.4863761024228,985.1922944361746,1646.3595192536486,1878.7470146399098,2345.256126446018,1737.1134839519866,1548.6049807104196,1103.5236866274627,1583.4548932325508,532.5055065492132,1488.706497529174,792.204172829517,1604.8206686040967,1449.8269074867078,532.5055065492132,1340.8590085161086,351.2239563914008,1113.6745122398531
"City: Fort Madison, Origin Lat/Lng: 40.629763399999995,-91.314535",722.9629150536373,2656.652498939045,1887.2630949009604,1311.0189423131728,659.7733763168603,2565.761845683486,812.1310769164796,509.78235599737604,552.8139820890664,1344.1252046700492,2308.963526586199,1218.8942471633943,1384.0707677796825,1264.8688850520796,1447.1561768369882,1718.0418742354614,1015.5448567154159,1898.035108400335,857.742654960854,1715.4558763445561,611.4882281357937,336.4291630874743,959.2044829573553,296.65842544459656,1126.793612649013,1384.0707677796825,1264.8688850520796,746.760091496324,1263.5283446961243,1878.9328171782238,2621.9046658896564,1406.2504336394966,1387.4704894491035,1162.3790171261805,1430.8490527364384,337.9270856870211,1262.6452355653091,391.0895498963794,1441.1874623832375,1020.5941059939145,337.9270856870211,889.7436293553498,259.8082227191765,836.9019063783317

For each row (that is, each city), I need to find which column holds the minimum value. I managed to do this with the following code:

import pandas as pd
df_truck_distance = pd.read_csv('mycsv.csv')
print(df_truck_distance.set_index('Cargo').idxmin(axis=1))

Which returned:

Cargo
City: Sikeston, Origin Lat/Lng: 36.876719,-89.5878579                  Viking Products Of Austin Incustin
City: Christiansburg, Origin Lat/Lng: 37.1298517,-80.40893890000001               Ricardo Juradoacramento
City: Columbus, Origin Lat/Lng: 39.9611755,-82.99879419999998                Kjellberg'S Carpet Oneuffalo
City: Hebron, Origin Lat/Lng: 39.0661472,-84.70318879999998                  Kjellberg'S Carpet Oneuffalo
City: Hickory, Origin Lat/Lng: 35.7344538,-81.3444573                      Paul J Krez Companyorton Grove
City: Northfield, Origin Lat/Lng: 44.458298299999996,-93.161604                     Gary Lee Wilcoxpencer
City: Fort Madison, Origin Lat/Lng: 40.629763399999995,-91.314535                Fish-Bones Towingew York

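As you can see, the final result contains a duplicated value (Kjellberg'S Carpet Oneuffalo is returned for both Columbus and Hebron). How can I get rid of the duplicates and show the column with the second-smallest value for the repeated row instead?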

I think the output should be:

Cargo
City: Sikeston, Origin Lat/Lng: 36.876719,-89.5878579                  Viking Products Of Austin Incustin
City: Christiansburg, Origin Lat/Lng: 37.1298517,-80.40893890000001               Ricardo Juradoacramento
City: Columbus, Origin Lat/Lng: 39.9611755,-82.99879419999998                Kjellberg'S Carpet Oneuffalo
City: Hebron, Origin Lat/Lng: 39.0661472,-84.70318879999998                  Wisebuys Stores Incouverneur
City: Hickory, Origin Lat/Lng: 35.7344538,-81.3444573                      Paul J Krez Companyorton Grove
City: Northfield, Origin Lat/Lng: 44.458298299999996,-93.161604                     Gary Lee Wilcoxpencer
City: Fort Madison, Origin Lat/Lng: 40.629763399999995,-91.314535                Fish-Bones Towingew York
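
To make the request concrete, here is a minimal sketch on a small made-up frame (the city and carrier names are hypothetical, not taken from mycsv.csv). It flags rows whose nearest column was already claimed by an earlier row with `Series.duplicated()`, and looks up each row's runner-up with `Series.nsmallest(2)`.

```python
import pandas as pd

# Hypothetical toy data: rows are origin cities, columns are carriers.
df = pd.DataFrame(
    {"Carrier A": [10.0, 50.0, 40.0],
     "Carrier B": [20.0, 5.0, 7.0],
     "Carrier C": [30.0, 6.0, 9.0]},
    index=pd.Index(["City X", "City Y", "City Z"], name="Cargo"),
)

nearest = df.idxmin(axis=1)        # column label of each row's minimum
is_repeat = nearest.duplicated()   # True where an earlier row already has that label

# Column label of the second-smallest value in every row.
second = df.apply(lambda row: row.nsmallest(2).index[1], axis=1)

# Keep the first occurrence; fall back to the runner-up for repeated rows.
result = nearest.where(~is_repeat, second)
print(result)
```

Note that this only swaps in the runner-up once; it does not check whether that runner-up is itself already used by another row.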

1 Answer:

Answer 0: (score: 0)

In the end, I got what I wanted with this code:

import pandas as pd

df_truck_distance = pd.read_csv('mycsv.csv')

# Column label of the smallest value in each row (one label per city)
final_dataframe = df_truck_distance.set_index('Cargo').idxmin(axis=1)

# Rows whose label already appeared in an earlier row
is_duplicated = final_dataframe.duplicated()
duplicated = final_dataframe[is_duplicated].values
duplicated_index = final_dataframe[is_duplicated].index

# Drop the already-used columns and recompute the per-row minimum
df_truck_distance = df_truck_distance.drop(duplicated, axis=1)
final_dataframe2 = df_truck_distance.set_index('Cargo').idxmin(axis=1)

# Override the duplicated values with the recomputed ones
for i in duplicated_index:
    final_dataframe[i] = final_dataframe2[i]

print(final_dataframe)

I don't know whether this is the best way to do it, but I'm open to new ideas.
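
One possible alternative, offered as a sketch rather than a definitive fix: the code above makes a single correction pass, so two rows that collide again after the columns are dropped would still share a label. A greedy loop that walks the rows in order and gives each one its closest column not yet taken avoids that. The helper name `assign_unique_nearest` and the toy data are hypothetical, and the result depends on row order, so it is not guaranteed to minimise the total distance.

```python
import pandas as pd

def assign_unique_nearest(distances: pd.DataFrame) -> pd.Series:
    """Greedy sketch: walk the rows in order and give each one its
    closest column that no earlier row has taken."""
    taken = set()
    chosen = {}
    for row_label, row in distances.iterrows():
        # Ignore columns already claimed by an earlier row.
        available = row.drop(labels=list(taken))
        if available.empty:
            raise ValueError("more rows than columns; labels cannot stay unique")
        best = available.idxmin()
        chosen[row_label] = best
        taken.add(best)
    return pd.Series(chosen, name="nearest").rename_axis(distances.index.name)

# Hypothetical toy data: three rows all prefer "Carrier B".
toy = pd.DataFrame(
    {"Carrier A": [10.0, 50.0, 40.0, 45.0],
     "Carrier B": [20.0, 5.0, 7.0, 6.0],
     "Carrier C": [30.0, 6.0, 9.0, 8.0],
     "Carrier D": [35.0, 60.0, 70.0, 80.0]},
    index=pd.Index(["City W", "City X", "City Y", "City Z"], name="Cargo"),
)
unique_nearest = assign_unique_nearest(toy)
print(unique_nearest)
```

With the real file, the call would be `assign_unique_nearest(df_truck_distance.set_index('Cargo'))`, assuming the frame has been loaded as in the code above.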