pandas ValueError: cannot reindex from a duplicate axis

Time: 2017-04-25 22:18:35

Tags: python pandas

I have 20 DataFrames (named sample1 ... sample20), each loaded with:

sample1 = pd.read_table('pathtosample1.csv', sep='\t', index_col=0)["score"]

For each file, I then load the metadata columns into a separate variable:

meta1 = pd.read_table('pathtosample1.csv', sep='\t', index_col=0).loc[:,['junction_id','splice_site','intron_size', 'anchor','genes','transcripts', 'exons_skipped']]

sample1 df

Unique  junction_id score   splice_site anchor  intron_size exons_skipped   genes   transcripts
3:107915006-107915391(-)    ENSMUSG00000000001:E001 1017    GT-AG   DA  386 0   Gnai3   ENSMUST00000000001
3:107912225-107912321(-)    ENSMUSG00000000001:E002 10  GT-AG   D   97  0   Gnai3   ENSMUST00000000001
3:107912234-107912321(-)    ENSMUSG00000000001:E003 979 GT-AG   DA  88  0   Gnai3   ENSMUST00000000001
3:107912530-107914853(-)    ENSMUSG00000000001:E004 996 GT-AG   DA  2324    0   Gnai3   ENSMUST00000000001
3:107912530-107915391(-)    ENSMUSG00000000001:E005 3   GT-AG   NDA 2862    1   Gnai3   ENSMUST00000000001
3:107915520-107918681(-)    ENSMUSG00000000001:E006 1113    GT-AG   DA  3162    0   Gnai3   ENSMUST00000000001
3:107915520-107921219(-)    ENSMUSG00000000001:E007 1   GT-AG   NDA 5700    1   Gnai3   ENSMUST00000000001
3:107915520-107915944(-)    ENSMUSG00000000001:E008 1   GT-AG   A   425 0   Gnai3   ENSMUST00000000001
3:107918809-107921219(-)    ENSMUSG00000000001:E009 1141    GT-AG   DA  2411    0   Gnai3   ENSMUST00000000001

To illustrate, here are the commands I use, shown for only 6 samples:

concat =  pd.concat([sample1,sample2,sample3,sample4,sample5,sample6], axis=1).fillna(0)
concat.columns = ["score_1", "score_2", "score_3","score_4", "score_5", "score_6"]
meta = pd.concat([meta1,meta2,meta3,meta4,meta5,meta6], ignore_index=True)

meta = meta[~meta.index.duplicated(keep='first')]
concat = pd.concat([concat, meta], axis=1)
concat.to_csv('data.csv')

The error I get is:

ValueError: cannot reindex from a duplicate axis

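My expected output takes every element found in the first column of any file, adds each sample's score as its own column, and then appends the remaining metadata columns for each row. Expected output: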

Junction_id score1  score2  score3  score4 score5 score6    Unique  splice_site intron_size anchor  genes   transcripts exons_skipped
ENSMUSG00000000001:E001 1017    1   1651    6   3   1   3:107915006-107915391(-)    GT-AG   386 DA  Gnai3   ENSMUST00000000001  0
ENSMUSG00000000001:E002 10  7   3   1144    1193    895 3:107912225-107912321(-)    GT-AG   97  D   Gnai3   ENSMUST00000000001  0
ENSMUSG00000000001:E003 979 1075    1588    923 1223    1017    3:107912234-107912321(-)    GT-AG   88  DA  Gnai3   ENSMUST00000000001  0
ENSMUSG00000000001:E004 996 3   1522    1   1   2   3:107912530-107914853(-)    GT-AG   2324    DA  Gnai3   ENSMUST00000000001  0
ENSMUSG00000000001:E005 3   1759    14  1127    4   1112    3:107912530-107915391(-)    GT-AG   2862    NDA Gnai3   ENSMUST00000000001  1

I'm not sure which step causes this error.
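
One possible cause, offered as an assumption rather than a confirmed diagnosis: pd.concat(..., axis=1) aligns rows by index label, and that alignment fails when a label (here the Unique column) occurs more than once within one input. Separately, ignore_index=True replaces the meta index with 0..n-1, so the later duplicated() filter has nothing to remove and the final concat no longer aligns on Unique. The sketch below uses small made-up frames and a hypothetical helper combine_samples; keeping only the first row per duplicated label is an illustrative choice and may not be the right one for real data.

```python
import pandas as pd

def combine_samples(frames, meta_cols):
    """Hypothetical helper: one score column per sample plus shared metadata.

    Assumes every frame is indexed by the Unique column and contains a
    'score' column and all of meta_cols.
    """
    scores, metas = [], []
    for i, df in enumerate(frames, start=1):
        # Assumption: when a label repeats inside one file, keep its first row.
        df = df[~df.index.duplicated(keep="first")]
        scores.append(df["score"].rename(f"score_{i}"))
        metas.append(df[meta_cols])
    # With unique labels in every input, column-wise alignment can succeed.
    score_table = pd.concat(scores, axis=1).fillna(0)
    # Keep the original index (no ignore_index) so duplicates can be dropped by label.
    meta = pd.concat(metas)
    meta = meta[~meta.index.duplicated(keep="first")]
    return score_table.join(meta)

# Made-up data: label "b" appears twice in the first frame.
f1 = pd.DataFrame(
    {"score": [1, 2, 9], "junction_id": ["J1", "J2", "J2"], "genes": ["G", "G", "G"]},
    index=["a", "b", "b"],
)
f2 = pd.DataFrame(
    {"score": [5, 7], "junction_id": ["J2", "J3"], "genes": ["G", "G"]},
    index=["b", "c"],
)
result = combine_samples([f1, f2], ["junction_id", "genes"])
```

To find out which input actually contains repeated labels, sample1.index.is_unique and sample1.index[sample1.index.duplicated()] can be checked for each sample before concatenating.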

0 answers:

No answers yet.