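Replacing loops with pandas

Time: 2019-11-20 10:49:58

Tags: python python-3.x pandas

I am processing a large amount of data for a machine learning model. Because the dataset is large, one of my functions takes too long to run. Is there a pandas function that can replace the following code: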

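  import pandas as pd

  df = pd.DataFrame({'Weight':[45, 88, 45, 88, 45, 88, 54, 45, 88], 
               'Name':['Sam', 'Sia', 'Sam', 'Sia', 'Sam', 'Sia', 'Ryan', 'Sam', 'Sia'], 
               'Age':[100, 95, 93, 90, 10, 95, 92, 110, 33]}) 

  my_group = df.groupby(['Name'])

  difference_df = df.copy()   # original columns plus the difference columns
  col_names = []
  diff_range = 5
  for pair in my_group:
      # the body never uses `pair`, so each group recomputes the same columns
      for i in range(1, diff_range + 1):
          if str(i) not in col_names:
              col_names.append(str(i))
          # Age of the row i positions below minus Age of the current row
          difference_df[str(i)] = df['Age'].diff(i).shift(periods=-i)
  # label of the column holding the largest difference in each row
  difference_df['d_id_max'] = difference_df[col_names].idxmax(axis=1)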

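For each group, the code above takes every row of my DataFrame, computes the difference between that row and the next 3 rows in the 'model_prediction' column, and finally returns the index of the row that differs most from it.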

   Weight   Name    Age
 0  45      Sam     100
 1  88      Sia     95
 2  45      Sam     93
 3  88      Sia     90
 4  45      Sam     10
 5  88      Sia     95
 6  54      Ryan    92
 7  45      Sam     110
 8  88      Sia     33

Expected output:

   Weight   Name    Age     1     2       3      4     5    d_id_max
 0  45      Sam     100   -5.0  -7.0  -10.0   -90.0  -5.0   1
 1  88      Sia     95    -2.0  -5.0  -85.0     0.0  -3.0   4
 2  45      Sam     93    -3.0 -83.0    2.0    -1.0  17.0   5
 3  88      Sia     90   -80.0   5.0    2.0    20.0 -57.0   4
 4  45      Sam     10    85.0  82.0  100.0    23.0   NaN   3
 5  88      Sia     95    -3.0  15.0  -62.0     NaN   NaN   2
 6  54      Ryan    92    18.0 -59.0    NaN     NaN   NaN   1
 7  45      Sam     110   -77.0  NaN    NaN     NaN   NaN   1
 8  88      Sia     33      NaN  NaN    NaN     NaN   NaN  NaN

1 answer:

Answer 0 (score: 1)

Use df.shift() to compute the differences between rows, then use df.idxmax() to get the column that holds the largest value.
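The shift-then-idxmax idea can be sketched roughly as follows. This is not the answerer's original code, only a minimal illustration built on the sample data from the question; the names `diffs`, `has_value` and `result` are my own, and it assumes the differences are taken over the whole frame (ignoring `Name`), which is what the expected output shows.

```python
import pandas as pd

df = pd.DataFrame({
    'Weight': [45, 88, 45, 88, 45, 88, 54, 45, 88],
    'Name': ['Sam', 'Sia', 'Sam', 'Sia', 'Sam', 'Sia', 'Ryan', 'Sam', 'Sia'],
    'Age': [100, 95, 93, 90, 10, 95, 92, 110, 33],
})

diff_range = 5  # how many following rows each row is compared against

# Column str(i) holds Age[row + i] - Age[row]. shift(-i) pulls the value from
# i rows below up to the current row; rows with no such neighbour get NaN.
diffs = pd.DataFrame(
    {str(i): df['Age'].shift(-i) - df['Age'] for i in range(1, diff_range + 1)}
)

result = pd.concat([df, diffs], axis=1)

# Calling idxmax on an all-NaN row is deprecated in recent pandas versions,
# so only rows with at least one valid difference are passed in. The
# assignment aligns on the index, leaving the remaining rows as NaN.
has_value = diffs.notna().any(axis=1)
result['d_id_max'] = diffs[has_value].idxmax(axis=1)

print(result)
```

If the differences should stay within each `Name` group instead, `df.groupby('Name')['Age'].shift(-i)` could probably stand in for `df['Age'].shift(-i)`, though the numbers would then no longer match the expected output in the question.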
