Pandas: find the last non-null value for each distinct value of a variable

Time: 2018-06-19 04:42:37

Tags: python pandas

I have a data frame like the one below. For each group of a1 values, I want the last non-null value of l1 seen in the earlier rows of that group:

    a1  l1
0   a   NaN
1   a   kl
2   a   NaN
3   a   NaN
4   a   er
5   b   ye
6   b   NaN
7   b   fk
8   b   NaN

I tried using shift, but I don't know how to skip the missing values.

1 Answer:

Answer 0: (score: 2)

You need groupby + apply:

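# group_keys=False keeps the original row index so the result aligns with df (needed on pandas 2.x)
df['ex'] = df.groupby('a1', group_keys=False).l1.apply(lambda x: x.ffill().shift())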
df

  a1   l1   ex
0  a  NaN  NaN
1  a   kl  NaN
2  a  NaN   kl
3  a  NaN   kl
4  a   er   kl
5  b   ye  NaN
6  b  NaN   ye
7  b   fk   ye
8  b  NaN   fk

Alternatively, two consecutive groupby calls:

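# ffill within each group, then regroup by the a1 column (recent pandas drops a1 from the ffill result)
df['ex'] = df.groupby('a1').l1.ffill().groupby(df.a1).shift()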
df

  a1   l1   ex
0  a  NaN  NaN
1  a   kl  NaN
2  a  NaN   kl
3  a  NaN   kl
4  a   er   kl
5  b   ye  NaN
6  b  NaN   ye
7  b   fk   ye
8  b  NaN   fk
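
For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. The data frame is rebuilt from the tables printed in this thread, and it uses transform rather than apply. That is a substitution, not the answer's original call, chosen because transform always returns a result indexed like the input, so it can be assigned back as a column directly.

```python
import numpy as np
import pandas as pd

# Sample data reconstructed from the tables shown in this thread.
df = pd.DataFrame({
    "a1": list("aaaaabbbb"),
    "l1": [np.nan, "kl", np.nan, np.nan, "er", "ye", np.nan, "fk", np.nan],
})

# Within each a1 group: carry the last non-null l1 forward, then shift down
# one row so each row only sees values from the rows before it.
df["ex"] = df.groupby("a1")["l1"].transform(lambda s: s.ffill().shift())

print(df)
```

The shift happens inside each group, so the first row of group b does not pick up a value from the end of group a.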