Answer repeated in for-loop

Time: 2017-07-06 01:16:26

Tags: python pandas for-loop

I have a data frame covering one month, with a reading every 10 minutes of each day:

        Date     Time   Temp 
0   31/05/2006  09:00   9.3
1   31/05/2006  09:10   10.1
2   31/05/2006  09:20   10.7

I want to get the time (hh:mm) of Max(Temp), so I use the argmax function to find the index of Max(Temp):

maxTime = data.iloc[data[data['Date'] == '31/05/2006']['Outside Temperature'].argmax()]['Time']

That works fine, but now I need to compute this for every day of the month, so I put it in a loop. First, I created a MaxTempTime list to hold the loop results:

MaxTempTime = []
for i in data['Date']:
    maxTime = data.iloc[data[data['Date'] == i ]['Outside Temperature'].argmax()]['Time']
    MaxTempTime.extend(maxTime)
    print maxTime

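But I get as many answers as there are rows for each day; I only need one per day before moving on to the next date.

(The readings are 10 minutes apart, and there are 144 ten-minute periods in the 1440 minutes of a day, so I get the same answer 144 times per day.)

Can someone help me solve this? Thanks!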

4 Answers:

Answer 0: (score: 1)

You can make the following slight modification to your initial attempt:

MaxTempTime = []
for i in data['Date'].unique():
    maxTime = data.iloc[data[data['Date'] == i ]['Outside Temperature'].argmax()]['Time']
    MaxTempTime.append(maxTime)

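This way you loop over all the dates in the DataFrame, but visit each date only once. It gets the job done without many changes to your code, but an approach using groupby() will probably be faster, which may matter if your DataFrame is large.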
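A minimal sketch of that loop on the ten sample rows shown in Answer 2. Two things here are assumptions rather than part of the original answer: the temperature column is called Temp as in the sample table (the question's code calls it 'Outside Temperature'), and idxmax() with loc replaces argmax() with iloc, because in recent pandas versions argmax() returns a position inside the filtered Series rather than an index label.

```python
import pandas as pd

# Sample rows copied from the table in Answer 2.
data = pd.DataFrame({
    "Date": ["31/05/2006"] * 5 + ["01/06/2006"] * 5,
    "Time": ["09:00", "09:10", "09:20", "09:30", "09:40"] * 2,
    "Temp": [9.3, 10.1, 10.7, 10.5, 10.9, 9.0, 9.3, 9.2, 9.7, 9.5],
})

max_temp_time = []
# unique() yields each date once, in order of first appearance.
for day in data["Date"].unique():
    day_rows = data[data["Date"] == day]
    # idxmax() returns the index label of the hottest reading of that day.
    hottest_label = day_rows["Temp"].idxmax()
    max_temp_time.append(data.loc[hottest_label, "Time"])

print(max_temp_time)  # ['09:40', '09:30']
```

The list ends up with one entry per date instead of one per row.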

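As a side note, you should use append() rather than extend() to add an element to the list. Here, extend() splits the time string into individual characters and adds each character as its own element. For an explanation of the difference between the two methods, see here.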
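The difference is easy to see with a single time string; the variable names below are only illustrative:

```python
times_extend = []
# extend() iterates over its argument, and a string iterates character by character.
times_extend.extend("09:40")

times_append = []
# append() adds its argument as one element.
times_append.append("09:40")

print(times_extend)  # ['0', '9', ':', '4', '0']
print(times_append)  # ['09:40']
```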

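Answer 1: (score: 0)

I guess this has to do with taking the max over the whole array, so you get a whole array back and then add it to your list. I would try append instead of extend, or, since they are all the same value, you could set maxTime = maxTime[0].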

Answer 2: (score: 0)

You can use groupby() by month and day.

Your data is in df:

    >>> df
             Date  Temp   Time
    0  31/05/2006   9.3  09:00
    1  31/05/2006  10.1  09:10
    2  31/05/2006  10.7  09:20
    3  31/05/2006  10.5  09:30
    4  31/05/2006  10.9  09:40
    5  01/06/2006   9.0  09:00
    6  01/06/2006   9.3  09:10
    7  01/06/2006   9.2  09:20
    8  01/06/2006   9.7  09:30
    9  01/06/2006   9.5  09:40

  1. Create month and day columns.

    >>> df2 = df.assign(Date = pd.to_datetime(df.Date, dayfirst=True))
    >>> df2 = df2.assign(mon = df2.Date.dt.month, day = df2.Date.dt.day)

  2. Using groupby() by month and day, get the index of the maximum Temp.

    >>> df2.groupby(['mon', 'day'])['Temp'].idxmax()
    mon  day
    5    31     4
    6    1      8
    Name: Temp, dtype: int64

  3. Select those indices from df2.

  4. See also: Keep other columns when using min() with groupby
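The steps above can be put together as one runnable sketch. The code for the final selection step is not preserved in this copy of the answer, so the loc lookup below is an assumption about one way to do it, not necessarily what the answerer wrote.

```python
import pandas as pd

# Sample rows copied from the table above.
df = pd.DataFrame({
    "Date": ["31/05/2006"] * 5 + ["01/06/2006"] * 5,
    "Temp": [9.3, 10.1, 10.7, 10.5, 10.9, 9.0, 9.3, 9.2, 9.7, 9.5],
    "Time": ["09:00", "09:10", "09:20", "09:30", "09:40"] * 2,
})

# Step 1: parse the dates (day first) and derive month and day columns.
df2 = df.assign(Date=pd.to_datetime(df["Date"], dayfirst=True))
df2 = df2.assign(mon=df2["Date"].dt.month, day=df2["Date"].dt.day)

# Step 2: index label of the maximum Temp within each (month, day) group.
idx = df2.groupby(["mon", "day"])["Temp"].idxmax()

# Step 3 (assumed): look those labels up in df2 to recover the other columns.
result = df2.loc[idx.tolist(), ["Date", "Time", "Temp"]]
print(result)
```

On the sample rows this selects the rows labelled 4 and 8, that is, 09:40 on 31 May and 09:30 on 1 June.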

Answer 3: (score: 0)

To get the index of the maximum Temp in each group, I think you need groupby with idxmax, then select from the original df with loc:

df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)
df = df.loc[df.groupby('Date')['Temp'].idxmax()]
print (df)
        Date  Temp   Time
4 2006-05-31  10.9  09:40
8 2006-06-01   9.7  09:30

An alternative solution using sort_values with groupby and aggregating with last:

df['Date'] = pd.to_datetime(df['Date'], dayfirst=True)
df = df.sort_values('Temp').groupby('Date', as_index=False).last()
print (df)
        Date  Temp   Time
0 2006-05-31  10.9  09:40
1 2006-06-01   9.7  09:30
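The two solutions agree on the sample data, but they can differ when a day has two readings tied for the maximum. The sketch below uses a hypothetical tie that is not in the original data, and passes kind="mergesort" (a stable sort, added here as an assumption) so that the order of tied rows is predictable: idxmax() picks the first tied row, while sorting and taking last() picks the latest one.

```python
import pandas as pd

# Hypothetical data: 09:00 and 09:20 tie for the daily maximum.
df = pd.DataFrame({
    "Date": ["31/05/2006"] * 3,
    "Time": ["09:00", "09:10", "09:20"],
    "Temp": [10.7, 9.3, 10.7],
})
df["Date"] = pd.to_datetime(df["Date"], dayfirst=True)

# idxmax() returns the first label at which the maximum occurs.
first_tie = df.loc[df.groupby("Date")["Temp"].idxmax()]

# A stable sort keeps tied rows in their original order, so last() takes the later one.
last_tie = (
    df.sort_values("Temp", kind="mergesort")
      .groupby("Date", as_index=False)
      .last()
)

print(first_tie["Time"].tolist())  # ['09:00']
print(last_tie["Time"].tolist())   # ['09:20']
```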