Writing the output of a for loop to a pandas DataFrame

Asked: 2019-12-04 10:59:58

Tags: python pandas dataframe

How can I write the output of a for loop to a pandas DataFrame?

The input data is a list of DataFrames (df_elements).

[                          seq  score    status
1652  TGGCTTCGATTTTGTTATCGATG  -0.22  negative
1277  GTACTGTGGAATCTCGGCAGGCT   4.87  negative
302   CCAAAGTCTCACTTGTTGAGAAC  -4.66  negative
1756  TGGCGGTGGTGGCGGCGCAGAGC   1.55  negative
5043  TGACGAAACATCTTATAAAGGAA   1.96  negative
3859  CAGAGCTCTTCAAACTTAAGAAC  -0.39  negative
1937  GTATGCTTGTGCTTCTCCAAAAA  -0.91  negative
2805  GGCCGGCCTGTGGTCGACGGGGA  -3.26  negative
3353                CCGATGGGC  -1.97  negative
5352  ACTTACTATTTACTGATCAGCAC   3.53  negative
5901  TTGAGGCTCTCCTTATCCAGATT   6.37  negative
5790  AAGGAAACGTGTAATGATAGGCG  -2.69  negative,                           seq  score    status
2197  CTTCCATTGAGCTGCTCCAGCAC  -0.97  negative
1336  CCAAATGCAACAATTCAAAGCCC  -0.44  negative
4825                CAATTTTGT  -6.44  negative
4991  ATACTGTTTGCTCACAAAAGGAG   2.15  negative
1652  TGGCTTCGATTTTGTTATCGATG  -0.22  negative
1964  ACCACTTTGTGGACGAATACGAC  -4.51  negative
4443  TTCCTCGTCTAGCCTTTCAGTGC   3.05  negative
4208  TGGCTGTGAACCCCTATCAGCTG   2.70  negative
212   CTGTCGTTTCAATGTTTAAGATA   6.43  negative
775                 GCTTTAAGT   0.06  negative
3899                GAGCAAAGC  -6.61  negative

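I am trying to write the output of the following for loop to a DataFrame. I tried creating an empty list (data) and appending the output row by row with data.append. I get an error like: cannot concatenate object of type "";

The code below prints the output to the console: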


cut_off = [0,1,2]

for co in cut_off:
    for df in df_elements:
        print(co, "\t", (df['score'] > co).sum())

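The code should compare each cut_off value against the score column and, for each DataFrame in the list, print the number of rows whose score is greater than cut_off.

The output should look like this: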

cutoff number
0   5  #for first dataframe element
0   5  #for second dataframe element

1 Answer:

Answer 0 (score: 0)

import pandas as pd

# create empty lists for cutoff and number
cutoff_list = []
number_list = []

# loop through cutoff values and dataframes, to populate your lists
for co in cut_off:
    for df in df_elements:
        cutoff_list.append(co)
        number_list.append((df['score'] > co).sum())

# create dataframe from your lists
df = pd.DataFrame(list(zip(cutoff_list , number_list)), 
           columns =['cutoff', 'number']) 

# get your desired output
print(df)
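
As a variation on the same idea, here is a minimal self-contained sketch that collects one dict per row instead of two parallel lists. The two small DataFrames are stand-ins built only from the score values visible in the question, and the extra "element" column (the position of each DataFrame in df_elements) and the name result are my own additions for illustration, not part of the original code.

```python
import pandas as pd

# Stand-in data: only the score values shown in the question (an assumption;
# the real df_elements also has 'seq' and 'status' columns).
df_elements = [
    pd.DataFrame({"score": [-0.22, 4.87, -4.66, 1.55, 1.96, -0.39,
                            -0.91, -3.26, -1.97, 3.53, 6.37, -2.69]}),
    pd.DataFrame({"score": [-0.97, -0.44, -6.44, 2.15, -0.22, -4.51,
                            3.05, 2.70, 6.43, 0.06, -6.61]}),
]
cut_off = [0, 1, 2]

# One dict per (cutoff, DataFrame) pair; 'element' is a hypothetical helper
# column recording which DataFrame the count belongs to.
rows = [
    {"cutoff": co, "element": i, "number": int((df["score"] > co).sum())}
    for co in cut_off
    for i, df in enumerate(df_elements)
]

# Build the result once, after the loop, from the collected rows.
result = pd.DataFrame(rows, columns=["cutoff", "element", "number"])
print(result)
```

The error quoted in the question is consistent with passing a list of plain numbers to pd.concat, which only accepts Series and DataFrame objects, although that is a guess since the failing code is not shown. Collecting plain Python values and calling pd.DataFrame once at the end sidesteps that.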