Get the column index number from the value in another column

Time: 2020-03-17 10:34:01

Tags: python pandas numpy dataframe

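I am new to Python and pandas, so I may not have a full overview of all the possibilities, and I would appreciate hints on how to solve the following problem: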

I have a DataFrame df like this:

   Jan  Feb  Mar  Apr  i     j
a  100  200  250  100  1   0.3
b  120  130   90  100  3   0.7
c   10   30   10   20  2  0.25

I want to construct a column that, for each row, takes the column whose position is given by df['i'] and then multiplies the value in that selected column by the value in df['j']. I want to build a table like this (with the column df['k'] constructed):

   Jan  Feb  Mar  Apr  i     j    k
a  100  200  250  100  1   0.3   60
b  120  130   90  100  3   0.7   70
c   10   30   10   20  2  0.25  2.5

(Row a: df['k'] = 200*0.3, i.e. df['Feb']*df['j']; row b: df['k'] = 100*0.7, i.e. df['Apr']*df['j']; row c: df['k'] = 10*0.25, i.e. df['Mar']*df['j'].)

The values in df['i'] will always be integers, so I would like to use them as zero-based column positions.
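For anyone who wants to reproduce the problem, the example frame can be rebuilt roughly as follows. This is only a sketch: the values are copied from the tables in the question, and the explicit dictionary layout is an assumption about how the frame was created.

```python
import pandas as pd

# Values copied from the question; the construction itself is an assumption.
df = pd.DataFrame(
    {
        "Jan": [100, 120, 10],
        "Feb": [200, 130, 30],
        "Mar": [250, 90, 10],
        "Apr": [100, 100, 20],
        "i": [1, 3, 2],          # zero-based position of the column to pick
        "j": [0.3, 0.7, 0.25],   # factor to multiply the picked value by
    },
    index=["a", "b", "c"],
)
print(df)
```

With this layout, position 1 is Feb, position 2 is Mar and position 3 is Apr, which matches the products listed in the question.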

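3 answers:

Answer 0: (score: 1)

IIUC, we can use DataFrame.rename to relabel the columns with their integer positions, and then DataFrame.lookup to pick one value per row. Finally, we use Series.mul to multiply by column j: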

df['k'] = df['j'].mul(df.rename(columns = dict(zip(df.columns,
                                                   range(len(df.columns)))))
                        .lookup(df.index, df['i']))
print(df)

Output

   Jan  Feb  Mar  Apr  i     j     k
a  100  200  250  100  1  0.30  60.0
b  120  130   90  100  3  0.70  70.0
c   10   30   10   20  2  0.25   2.5

Alternative, using Series.map to turn the positions in i into column labels:

df['j'].mul(df.iloc[:,df['i']].lookup(df.index, 
                                      df['i'].map(dict(zip(range(len(df.columns)),
                                                           df.columns)))))
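DataFrame.lookup was deprecated in pandas 1.2 and removed in pandas 2.0, so the two snippets above only run on older versions. A minimal sketch of the same positional idea with plain NumPy indexing, assuming the example frame from the question (rebuilt here so the block is self-contained):

```python
import numpy as np
import pandas as pd

# Example frame from the question (construction is an assumption).
df = pd.DataFrame(
    {
        "Jan": [100, 120, 10],
        "Feb": [200, 130, 30],
        "Mar": [250, 90, 10],
        "Apr": [100, 100, 20],
        "i": [1, 3, 2],
        "j": [0.3, 0.7, 0.25],
    },
    index=["a", "b", "c"],
)

values = df.to_numpy()        # mixed int/float columns become one float array
rows = np.arange(len(df))     # 0, 1, 2: one entry per row
cols = df["i"].to_numpy()     # column position to pick in each row

# Pairwise (row, column) indexing picks exactly one value per row.
df["k"] = values[rows, cols] * df["j"].to_numpy()
print(df)
```

Because the positions in i refer to the whole frame, the array is taken from the full DataFrame rather than from the month columns only.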

Answer 1: (score: 0)

I feel there must be a better way, but you can use itertuples like this:

list_k = []
for row in df.itertuples():
    # row[0] is the index label, so column position p is stored at row[p + 1]
    month = row[int(row.i) + 1]
    list_k.append(month * row.j)

df['k'] = list_k
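The offset of one in the tuple indexing comes from the layout of the tuples that itertuples yields. A small sketch that inspects the first tuple, assuming the example frame from the question:

```python
import pandas as pd

# Example frame from the question (construction is an assumption).
df = pd.DataFrame(
    {
        "Jan": [100, 120, 10],
        "Feb": [200, 130, 30],
        "Mar": [250, 90, 10],
        "Apr": [100, 100, 20],
        "i": [1, 3, 2],
        "j": [0.3, 0.7, 0.25],
    },
    index=["a", "b", "c"],
)

first = next(df.itertuples())
print(first)  # a named tuple: index label first, then one field per column

# With the default index=True the index label sits at position 0,
# so the column at position p is found at first[p + 1].
picked_with_index = first[int(first.i) + 1]

# With index=False the index is dropped and no offset is needed.
first_no_index = next(df.itertuples(index=False))
picked_without_index = first_no_index[int(first_no_index.i)]
```

Both variants pick the Feb value for row a; passing index=False is simply a way to avoid the off-by-one arithmetic.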

Answer 2: (score: 0)

Another alternative to the solutions already provided:

# convert column i to a list
vals = df.i.tolist()

# get the integer positions of the rows
num_indices = [df.index.get_loc(ind) for ind in df.index]
# or df.index.get_indexer(df.index)

# pair each row position with its column position
paired = list(zip(num_indices, vals))

# extract one value per (row, column) pair and multiply by column j
df['k'] = [df.iloc[entry] for entry in paired] * df.j

   Jan  Feb  Mar  Apr  i     j     k
a  100  200  250  100  1  0.30  60.0
b  120  130   90  100  3  0.70  70.0
c   10   30   10   20  2  0.25   2.5

Update: @ansev is right, a loop is not needed:

# get the column labels that correspond to the values in column i
col_labels = df.columns[df.i]
# get one value per row from those columns using pandas' lookup
result = df.lookup(df.index, col_labels)
# multiply the array by column j
df['k'] = result * df.j
# You can compress this into one line, but I believe breaking it down
# is more readable:
# df = df.assign(k=df.lookup(df.index, df.columns[df.i]) * df.j)

   Jan  Feb  Mar  Apr  i     j     k
a  100  200  250  100  1  0.30  60.0
b  120  130   90  100  3  0.70  70.0
c   10   30   10   20  2  0.25   2.5
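On pandas 2.0 and later, where DataFrame.lookup no longer exists, the label-based version above can be approximated with the factorize-and-reindex pattern that the pandas user guide suggests as a replacement. This is a sketch under the assumption that the frame matches the example in the question; the names idx, uniques and picked are illustrative only.

```python
import numpy as np
import pandas as pd

# Example frame from the question (construction is an assumption).
df = pd.DataFrame(
    {
        "Jan": [100, 120, 10],
        "Feb": [200, 130, 30],
        "Mar": [250, 90, 10],
        "Apr": [100, 100, 20],
        "i": [1, 3, 2],
        "j": [0.3, 0.7, 0.25],
    },
    index=["a", "b", "c"],
)

# Column label for each row, derived from the positions in column i.
col_labels = df.columns[df["i"].to_numpy()]

# Encode the labels as integer codes plus the distinct labels they refer to.
idx, uniques = pd.factorize(col_labels)

# Keep only the needed columns, then pick one value per row by position.
picked = df.reindex(uniques, axis=1).to_numpy()[np.arange(len(df)), idx]

df["k"] = picked * df["j"]
print(df)
```

Compared with indexing the whole frame by position, this variant only materialises the columns that are actually referenced, which keeps their integer dtype intact here.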