Merge tab-separated printed output into a dataframe alongside the IDs that generated it

Time: 2019-02-28 09:25:34

Tags: python pandas

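I am using a Python function that returns tab-separated output for each i in a dataframe. Here is an example.

This is the code I use to generate the tab-separated output for each print: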

for i in df1['col1']:
    print(u.search(i, frmt="tab", columns="lineage-id,id,go, go(biological process), go(molecular function),go(cellular component), go-id,reviewed"))

The result is:

Taxonomic lineage IDs   Entry   Gene ontology (GO)  Gene ontology (biological process)  Gene ontology (molecular function)  Gene ontology (cellular component)  Gene ontology IDs   Status
    619591  Q8V552  extracellular space [GO:0005615]            extracellular space [GO:0005615]    GO:0005615  unreviewed

Taxonomic lineage IDs   Entry   Gene ontology (GO)  Gene ontology (biological process)  Gene ontology (molecular function)  Gene ontology (cellular component)  Gene ontology IDs   Status
878992  Q8G553  extracellular space [GO:0005616]        golgi   extracellular space [GO:0005615]    GO:0005616  reviewed

Taxonomic lineage IDs   Entry   Gene ontology (GO)  Gene ontology (biological process)  Gene ontology (molecular function)  Gene ontology (cellular component)  Gene ontology IDs   Status
5672    Q89554  extracellular space [GO:0005617]        golgi   extracellular space [GO:0005615]    GO:0005617  reviewed

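(As you can see, there are 8 column names, some of which contain spaces, and some columns hold no information. You can also see that Num_009418726.1 produces no output, because there are no results for it.)

The new column names are: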
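To illustrate how one of these printed blocks maps onto columns, here is a minimal sketch that parses the second block with `pandas.read_csv`. The exact tab positions are an assumption, because the listing above does not show them unambiguously; the `sample` string is reconstructed by hand so that the empty field lands in "Gene ontology (biological process)", matching the desired output.

```python
import io

import pandas as pd

# Hand-reconstructed copy of the second printed block (tab positions assumed).
header = [
    "Taxonomic lineage IDs", "Entry", "Gene ontology (GO)",
    "Gene ontology (biological process)", "Gene ontology (molecular function)",
    "Gene ontology (cellular component)", "Gene ontology IDs", "Status",
]
row = [
    "878992", "Q8G553", "extracellular space [GO:0005616]", "", "golgi",
    "extracellular space [GO:0005615]", "GO:0005616", "reviewed",
]
sample = "\t".join(header) + "\n" + "\t".join(row) + "\n"

# sep="\t" keeps column names that contain spaces intact;
# empty fields are read as NaN.
parsed = pd.read_csv(io.StringIO(sample), sep="\t")
print(parsed.columns.tolist())
print(parsed.iloc[0])
```

Splitting on tabs rather than on whitespace is what keeps names such as "Gene ontology (GO)" in a single column.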

Taxonomic lineage IDs
Entry
Gene ontology (GO)
Gene ontology (biological process)
Gene ontology (molecular function)
Gene ontology (cellular component)
Gene ontology IDs
Status

df1['col1'] consists of IDs, for example:

Num_009468701.1
Num_009418725.1
Num_009418726.1
Num_009429300.1

The idea is to merge these three tab-separated outputs into df1, next to the corresponding IDs in df1['col1'],

and end up with:

col1    Taxonomic lineage IDs   Entry   Gene ontology (GO)  Gene ontology (biological process)  Gene ontology (molecular function)  Gene ontology (cellular component)  Gene ontology IDs   Status
Num_009468701.1 619591  Q8V552  extracellular space [GO:0005615]    NA  NA  extracellular space [GO:0005615]    GO:0005615  unreviewed
Num_009418725.1 878992  Q8G553  extracellular space [GO:0005616]    NA  golgi   extracellular space [GO:0005615]    GO:0005616  reviewed
Num_009418726.1 NA  NA  NA  NA  NA  NA  NA  NA
Num_009429300.1 5672    Q89554  extracellular space [GO:0005617]    NA  golgi   extracellular space [GO:0005615]    GO:0005617  reviewed

Thank you for your time.

1 answer:

Answer 0: (score: 1)

You can collect the function's output into a list of lists


and then create a pandas DataFrame from it. The list can be built like this:

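base_list = []
# "..." stands for "etc."; it is not part of the syntax
for i in df1['col1']:
    result = u.search(...)  # call once and reuse the output
    if result:  # empty output means no hit for this ID
        # the first line is the header, so only split the data rows
        for row in result.splitlines()[1:]:
            base_list.append([i, *row.split("\t")])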
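The remaining step, turning the list of lists into a DataFrame and attaching it to df1, might look like the sketch below. It is only an illustration under stated assumptions: `fake_search` is a hypothetical stand-in for `u.search` that returns canned text (a header line plus one data row, or an empty string when there is no hit), and the tab layout of the rows is reconstructed from the desired output in the question.

```python
import pandas as pd

columns = [
    "col1", "Taxonomic lineage IDs", "Entry", "Gene ontology (GO)",
    "Gene ontology (biological process)", "Gene ontology (molecular function)",
    "Gene ontology (cellular component)", "Gene ontology IDs", "Status",
]
header = "\t".join(columns[1:])

# Canned rows keyed by ID (hypothetical data mirroring the question).
canned = {
    "Num_009468701.1": ["619591", "Q8V552", "extracellular space [GO:0005615]",
                        "", "", "extracellular space [GO:0005615]",
                        "GO:0005615", "unreviewed"],
    "Num_009418725.1": ["878992", "Q8G553", "extracellular space [GO:0005616]",
                        "", "golgi", "extracellular space [GO:0005615]",
                        "GO:0005616", "reviewed"],
    "Num_009429300.1": ["5672", "Q89554", "extracellular space [GO:0005617]",
                        "", "golgi", "extracellular space [GO:0005615]",
                        "GO:0005617", "reviewed"],
}


def fake_search(query):
    """Hypothetical stand-in for u.search: header + row, or '' if no hit."""
    if query not in canned:
        return ""
    return header + "\n" + "\t".join(canned[query]) + "\n"


df1 = pd.DataFrame({"col1": ["Num_009468701.1", "Num_009418725.1",
                             "Num_009418726.1", "Num_009429300.1"]})

base_list = []
for i in df1["col1"]:
    result = fake_search(i)
    if result:
        for row in result.splitlines()[1:]:  # skip the header line
            base_list.append([i, *row.split("\t")])

hits = pd.DataFrame(base_list, columns=columns)
hits = hits.mask(hits == "")  # treat empty fields as missing values

# A left merge keeps every ID from df1, including those without a hit.
merged = df1.merge(hits, on="col1", how="left")
print(merged)
```

With `how="left"`, the row for Num_009418726.1 is kept and filled with missing values, which corresponds to the NA row in the desired output.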