Pandas merge gives two different results with the same code and input data

Asked: 2018-08-14 10:01:55

Tags: python-3.x pandas dataframe statistics

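I have two DataFrames to merge. When I run the program with the same input data and the same code, I get one of two outcomes. In the first, the merge succeeds. In the second, every column of the merged data that comes from "annotate" is NaN.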

raw_df2 = pd.merge(annotate,raw_df,on='gene',how='right').fillna("unkown")
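One way to see which rows actually found a partner is the `indicator` parameter of `pd.merge`. The sketch below is only an illustration: the two small frames are hypothetical stand-ins for "annotate" and "raw_df", built from a few values shown in the result tables, and the names `checked`, `merge_counts` and `unmatched_genes` are not from the original script.

```python
import pandas as pd

# Hypothetical stand-ins for the real "annotate" and "raw_df" frames.
annotate = pd.DataFrame({
    "type": ["misc_RNA", "misc_RNA"],
    "gene": ["St18", "Arid5a"],
})
raw_df = pd.DataFrame({
    "gene": ["St18", "Arid5a", "0610040J01Rik"],
    "locus": [
        "chr1:6487230-6860940",
        "chr1:36307732-36324029",
        "chr5:63812494-63899619",
    ],
    "value_1": [1.90988, 1.33796, 2.02125],
})

# indicator=True adds a "_merge" column: "both" for keys found in both
# frames, "right_only" for raw_df rows with no match in annotate.
checked = pd.merge(annotate, raw_df, on="gene", how="right", indicator=True)
merge_counts = checked["_merge"].value_counts()
print(merge_counts)

# List the genes that did not match, before fillna hides the NaN values.
unmatched_genes = checked.loc[checked["_merge"] == "right_only", "gene"].tolist()
print(unmatched_genes)
```

If a failing run reports every row as "right_only", the keys themselves did not match in that run, which points at the loaded data rather than at `fillna`.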

Then I ran a test:

count = 10001
while (count > 10000):
    raw_df2 = pd.merge(annotate,raw_df,on='gene',how='right').fillna("unkown")
    count = len(raw_df2[raw_df2["type"]=="unkown"])
    print(count)

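If the merge fails, it keeps failing for the rest of that run, no matter how many times the loop repeats it. I have to resubmit the script, and then the result may be successful.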
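Because retrying inside the same run never changes the outcome, it may be worth inspecting the key column of both frames as they were loaded in that run. The following is a minimal sketch under that assumption, not a confirmed diagnosis: `key_report` is a hypothetical helper, and the toy frames (one key with a trailing space) only illustrate one possible way for keys to stop matching.

```python
import pandas as pd

def key_report(df, key="gene"):
    """Summarise the merge key column of one frame (hypothetical helper)."""
    col = df[key]
    as_text = col.dropna().astype(str)
    return {
        "rows": len(df),
        "dtype": str(col.dtype),
        "nulls": int(col.isna().sum()),
        "unique": int(col.nunique()),
        # Keys that carry leading or trailing whitespace.
        "padded": int((as_text != as_text.str.strip()).sum()),
    }

# Toy frames for illustration only; "St18 " has a trailing space.
annotate = pd.DataFrame({"type": ["misc_RNA", "misc_RNA"],
                         "gene": ["St18 ", "Arid5a"]})
raw_df = pd.DataFrame({"gene": ["St18", "Arid5a"],
                       "status": ["OK", "OK"]})

annotate_report = key_report(annotate)
raw_report = key_report(raw_df)

# Keys present in both frames; an empty set means no row can match.
shared = set(annotate["gene"]) & set(raw_df["gene"])
print(annotate_report)
print(raw_report)
print(len(shared), "shared keys")
```

Printing these reports in both a successful and a failing run and comparing them would show whether the inputs really are identical at merge time.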

[The first two columns come from "annotate"; the other columns come from "raw_df".]
Failed result:

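+--------+---------------+--------------------------+----------+----------+--------+---------+----------+
|  type  |     gene      |          locus           | sample_1 | sample_2 | status | value_1 | value_2  |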
+--------+---------------+--------------------------+----------+----------+--------+---------+----------+
| unknow | 0610040J01Rik | chr5:63812494-63899619   | Ctrl     | SPION10  | OK     | 2.02125 | 0.652688 |
| unknow | 1110008F13Rik | chr2:156863121-156887078 | Ctrl     | SPION10  | OK     | 87.7115 |  49.8795 |
+--------+---------------+--------------------------+----------+----------+--------+---------+----------+

Successful result:

+--------+----------+------------------------+----------+----------+--------+----------+---------+
|  gene  |   type   |         locus          | sample_1 | sample_2 | status | value_1  | value_2 |
+--------+----------+------------------------+----------+----------+--------+----------+---------+
| St18   | misc_RNA | chr1:6487230-6860940   | Ctrl     | SPION10  | OK     |  1.90988 | 3.91643 |
| Arid5a | misc_RNA | chr1:36307732-36324029 | Ctrl     | SPION10  | OK     |  1.33796 | 2.21057 |
| Carf   | misc_RNA | chr1:60076867-60153953 | Ctrl     | SPION10  | OK     | 0.846988 | 1.47619 |
+--------+----------+------------------------+----------+----------+--------+----------+---------+

0 answers:

No answers yet.