Question

I have the following data frame:

import pandas as pd
df = pd.DataFrame({'probegene' : ['1431492_at Lipn', '1448678_at Fam118a','1452580_a_at Mrpl21'],
                   '(5)foo.ID.LN.x2' : [130, 150,173],
                   '(5)foo.ID.LN.x1' : [20.3, 25.3,3.1]})

It looks like this:

In [21]: df
Out[21]:
   (5)foo.ID.LN.x1  (5)foo.ID.LN.x2            probegene
0             20.3              130      1431492_at Lipn
1             25.3              150   1448678_at Fam118a
2              3.1              173  1452580_a_at Mrpl21

What I want to do is split the probegene column into two columns, resulting in:

probe           gene    (5)foo.ID.LN.x1  (5)foo.ID.LN.x2
1431492_at      Lipn           20.3              130
1448678_at      Fam118a        25.3              150
1452580_a_at    Mrpl21          3.1              173

How can I do that?

I am stuck on this.

Answer 1

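I'm still not sure this is the best approach, but if you call .apply(pd.Series) on the result of split, you get a correctly indexed frame. After that you can join it back: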

>>> new_cols = df.pop("probegene").str.split().apply(pd.Series)
>>> new_cols.columns = ["probe","gene"]
>>> df = df.join(new_cols)
>>> df
   (5)foo.ID.LN.x1  (5)foo.ID.LN.x2         probe     gene
0             20.3              130    1431492_at     Lipn
1             25.3              150    1448678_at  Fam118a
2              3.1              173  1452580_a_at   Mrpl21

The reason I'm not sure this is the best approach is that apply tends to be slow. If this is a bottleneck, something like

pd.DataFrame.from_records(df["probegene"].str.split().tolist(), index=df.index)

might be faster.
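To make the from_records variant concrete, here is a minimal sketch that builds the two new columns and joins them back onto the frame. The column names probe and gene are taken from the desired output in the question; the assumption that every probegene value holds exactly two whitespace-separated tokens is mine, and the index is assigned after construction simply to keep the row alignment explicit.

```python
import pandas as pd

df = pd.DataFrame({
    "probegene": ["1431492_at Lipn", "1448678_at Fam118a", "1452580_a_at Mrpl21"],
    "(5)foo.ID.LN.x2": [130, 150, 173],
    "(5)foo.ID.LN.x1": [20.3, 25.3, 3.1],
})

# Split each string on whitespace into a plain Python list: [probe, gene].
# Assumption: every value contains exactly two whitespace-separated tokens.
parts = df["probegene"].str.split().tolist()

# Build the new frame in a single call instead of creating one pd.Series per row.
new_cols = pd.DataFrame.from_records(parts, columns=["probe", "gene"])

# Reuse the original index so the join below lines up row by row.
new_cols.index = df.index

# Drop the combined column and attach the two new ones.
result = df.drop(columns="probegene").join(new_cols)
print(result)
```

Whether this is actually faster than .apply(pd.Series) depends on the data size, so it is worth timing both on a realistic frame before switching.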

Answer 2

One-liner solution

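One way to write the split as a single statement, offered as a sketch and not necessarily the code this answer originally proposed, is Series.str.split with expand=True. It assumes a pandas version that supports the expand argument and that each probegene value contains at least one space.

```python
import pandas as pd

df = pd.DataFrame({
    "probegene": ["1431492_at Lipn", "1448678_at Fam118a", "1452580_a_at Mrpl21"],
    "(5)foo.ID.LN.x2": [130, 150, 173],
    "(5)foo.ID.LN.x1": [20.3, 25.3, 3.1],
})

# n=1 splits only at the first run of whitespace; expand=True returns a
# DataFrame with one column per piece, assigned here to two new columns.
df[["probe", "gene"]] = df["probegene"].str.split(n=1, expand=True)

# Optional follow-up: drop the original combined column.
df = df.drop(columns="probegene")
print(df)
```

Using n=1 keeps any further spaces inside the second piece, which matters only if a gene name could itself contain whitespace.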

How to split a column into two columns in Pandas

2 answers: