I have CSV data:
index username
1 ailee
2 yura
3 sony
4 lily
5 alex
6 eunji
7 hyun
8 jingo
9 kim
10 min
and a DataFrame of clustering results:
index cluster
1 1
3 1
5 1
7 1
8 1
9 2
4 2
2 2
10 2
6 2
Is it possible to add a username column to the pd.DataFrame based on the CSV data?
Answer 0 (score: 1)
I would use 'DataFrame.merge'. Here is the code:
>>> import io as sio  # Python 3: StringIO lives in the io module
>>> import pandas as pd
>>> s1='''index username
1 ailee
2 yura
3 sony
4 lily
5 alex
6 eunji
7 hyun
8 jingo
9 kim
10 min'''
>>> s2 = '''index cluster
1 1
3 1
5 1
7 1
8 1
9 2
4 2
2 2
10 2
6 2'''
>>> df1 = pd.read_csv(sio.StringIO(s1), index_col=0, sep=r'\s+')  # sep=r'\s+' replaces the deprecated delim_whitespace=True
>>> df2 = pd.read_csv(sio.StringIO(s2), index_col=0, sep=r'\s+')
>>> df1
      username
index
1        ailee
2         yura
3         sony
4         lily
5         alex
6        eunji
7         hyun
8        jingo
9          kim
10         min
>>> df2
       cluster
index
1            1
3            1
5            1
7            1
8            1
9            2
4            2
2            2
10           2
6            2
>>> df1.merge(df2, left_index=True, right_index=True)
      username  cluster
index
1        ailee        1
3         sony        1
5         alex        1
7         hyun        1
8        jingo        1
9          kim        2
4         lily        2
2         yura        2
10         min        2
6        eunji        2
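If the two tables are loaded with `index` as an ordinary column rather than as the DataFrame index, the same result can be reached by merging on that column. The following is a minimal Python 3 sketch, not the answerer's code: the names `users_text`, `clusters_text`, `users`, `clusters`, and `merged` are made up for illustration, and the inline strings stand in for the real CSV files.

```python
import io

import pandas as pd

# Inline stand-ins for the two whitespace-separated files from the question.
users_text = """index username
1 ailee
2 yura
3 sony
4 lily
5 alex
6 eunji
7 hyun
8 jingo
9 kim
10 min"""

clusters_text = """index cluster
1 1
3 1
5 1
7 1
8 1
9 2
4 2
2 2
10 2
6 2"""

# No index_col here, so `index` stays a regular column in both frames.
users = pd.read_csv(io.StringIO(users_text), sep=r"\s+")
clusters = pd.read_csv(io.StringIO(clusters_text), sep=r"\s+")

# A left merge starting from the cluster table keeps one row per cluster row,
# in the cluster table's order. validate raises if `index` is duplicated in users.
merged = clusters.merge(users, on="index", how="left", validate="many_to_one")
print(merged)
```

Starting the merge from the cluster table is a design choice: the cluster results are the frame being extended, so its rows and their order are the ones preserved.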
Answer 1 (score: 0)
You can use join:
print(df2.join(df1))
       cluster username
index
1            1    ailee
3            1     sony
5            1     alex
7            1     hyun
8            1    jingo
9            2      kim
4            2     lily
2            2     yura
10           2      min
6            2    eunji
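`DataFrame.join` performs a left join on the index by default, so a cluster row whose index has no entry in the username table is kept and gets NaN as its username. Here is a small sketch of that behaviour, under the assumption that such a gap could exist; the extra row with index 11 and the names `joined` and `missing` are invented for illustration and are not part of the question's data.

```python
import pandas as pd

# Username table, indexed 1..10 as in the question.
df1 = pd.DataFrame(
    {"username": ["ailee", "yura", "sony", "lily", "alex",
                  "eunji", "hyun", "jingo", "kim", "min"]},
    index=pd.Index(range(1, 11), name="index"),
)

# Cluster table with one hypothetical extra row (index 11) that has no username.
df2 = pd.DataFrame(
    {"cluster": [1, 1, 2, 2]},
    index=pd.Index([1, 3, 9, 11], name="index"),
)

# Default how="left": every row of df2 is kept, unmatched rows get NaN.
joined = df2.join(df1)

# Index labels that did not find a username.
missing = joined.index[joined["username"].isna()].tolist()
print(joined)
print(missing)
```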
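Or map:
# map by the cluster column: each cluster value (1 or 2) is looked up in df1's index,
# so every row gets ailee or yura, not the username that belongs to that row
df2['username'] = df2.cluster.map(df1.username)
# map by index: each row's own index label is looked up, giving the matching username
df2['username1'] = df2.index.to_series().map(df1.username)
print(df2)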
       cluster username username1
index
1            1    ailee     ailee
3            1    ailee      sony
5            1    ailee      alex
7            1    ailee      hyun
8            1    ailee     jingo
9            2     yura       kim
4            2     yura      lily
2            2     yura      yura
10           2     yura       min
6            2     yura     eunji
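Of the two map variants, only the index-based one answers the question. A compact sketch of just that variant follows, using `Index.map` with the username Series as the lookup table. The frames are rebuilt inline so the snippet runs on its own, and the name `result` is an illustrative choice.

```python
import pandas as pd

# Username table, indexed 1..10 as in the question.
df1 = pd.DataFrame(
    {"username": ["ailee", "yura", "sony", "lily", "alex",
                  "eunji", "hyun", "jingo", "kim", "min"]},
    index=pd.Index(range(1, 11), name="index"),
)

# Cluster table in the question's row order.
df2 = pd.DataFrame(
    {"cluster": [1, 1, 1, 1, 1, 2, 2, 2, 2, 2]},
    index=pd.Index([1, 3, 5, 7, 8, 9, 4, 2, 10, 6], name="index"),
)

# Work on a copy so the original cluster frame is left untouched.
result = df2.copy()

# Look up each index label of the cluster frame in the username Series.
result["username"] = result.index.map(df1["username"])
print(result)
```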