Question

I have two files, as follows:

File 1:

A1  A2   description

1   10   apo_descriptorX_0001
4   52   apo_descriptorY_0001
30  1    apo_descriptorZ_0001
20  10   apo_descriptorX_0002
1   30   apo_descriptorX_0003
2   4    apo_descriptorY_0002

File 2:

A1  A2   description

1   10   holo_descriptorX_0001
4   52   holo_descriptorY_0001
30  1    holo_descriptorZ_0001
20  10   holo_descriptorX_0002
1   30   holo_descriptorX_0003
2   4    holo_descriptorY_0002

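I want to plot the frequency of the values A1 and A2 for each descriptor type. So every A1 value for descriptor X should appear in one frequency plot, whatever its trailing number (0001, 0002, etc.).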
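For anyone reproducing this, the tables can be loaded into pandas roughly like this. It is only a sketch: it assumes the columns are separated by whitespace, and `load_table` and `FILE1_TEXT` are names made up for the example (the file contents are inlined so the snippet runs on its own).

```python
import io

import pandas as pd

# Inline copy of File 1 so the sketch runs without touching the disk.
FILE1_TEXT = """\
A1  A2   description

1   10   apo_descriptorX_0001
4   52   apo_descriptorY_0001
30  1    apo_descriptorZ_0001
20  10   apo_descriptorX_0002
1   30   apo_descriptorX_0003
2   4    apo_descriptorY_0002
"""


def load_table(source):
    """Read a whitespace-delimited table; pandas skips blank lines by default."""
    return pd.read_csv(source, sep=r"\s+")


apo_data = load_table(io.StringIO(FILE1_TEXT))
print(apo_data)
```

For a real file, a path can be passed instead of the `StringIO` object, e.g. `load_table("file1.txt")`, where the file name is a placeholder.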

How a friend and I solved it:

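import matplotlib.pyplot as plt
import seaborn as sns

# "names" holds the portion of the description you want to compare: the descriptor type.
# In this case, everything after the "holo_" prefix and before the final 5
# characters ("_0001") in the holo_data dataset, e.g. "descriptorX".
names = set(i[len('holo_'):-5] for i in holo_data['description'])

for i in sorted(names):
    apo_i = 'apo_' + i
    holo_i = 'holo_' + i
    fig1, ax1 = plt.subplots(1, figsize=(10, 5))
    # Plot the same column ('A1') for both datasets; swap in 'A2' for the other value.
    # histplot is used because distplot is deprecated in recent seaborn releases.
    sns.histplot(apo_data[apo_data['description'].str.contains(apo_i)]['A1'], kde=True, stat='density', ax=ax1, label='Apo')
    sns.histplot(holo_data[holo_data['description'].str.contains(holo_i)]['A1'], kde=True, stat='density', ax=ax1, label='Holo')
    ax1.legend()
    plt.title(i)
    ax1.set_ylabel('y', fontsize=12)
    ax1.set_xlabel('x', fontsize=20)
    plt.show()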
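A possible alternative to fixed slice positions, sketched here under assumptions: split each description with a regular expression and count values per descriptor type with `groupby`. The pattern `<state>_<descriptor>_<number>` is inferred from the sample rows only, and `PATTERN`, `add_descriptor_columns` and `value_frequencies` are hypothetical helper names, not part of our original solution.

```python
import pandas as pd

# Assumed layout, inferred from the sample rows: <state>_<descriptor>_<number>
PATTERN = r"^(?P<state>apo|holo)_(?P<descriptor>.+)_(?P<number>\d+)$"

# Sample rows from File 1, typed in by hand for the sketch.
apo_data = pd.DataFrame({
    "A1": [1, 4, 30, 20, 1, 2],
    "A2": [10, 52, 1, 10, 30, 4],
    "description": [
        "apo_descriptorX_0001",
        "apo_descriptorY_0001",
        "apo_descriptorZ_0001",
        "apo_descriptorX_0002",
        "apo_descriptorX_0003",
        "apo_descriptorY_0002",
    ],
})


def add_descriptor_columns(df):
    """Return a copy of df with 'state', 'descriptor' and 'number' columns added."""
    parts = df["description"].str.extract(PATTERN)
    return pd.concat([df, parts], axis=1)


def value_frequencies(df, column):
    """Count how often each value of `column` occurs within each descriptor type."""
    return df.groupby("descriptor")[column].value_counts()


parsed = add_descriptor_columns(apo_data)
a1_freq = value_frequencies(parsed, "A1")
print(a1_freq)
```

The resulting Series is indexed by descriptor type and value, so the counts for one type can be selected with something like `a1_freq.loc["descriptorX"]` and passed to any plotting routine.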

;)

Parsing a list with different descriptors

0 answers: