I have 3 dataframes that contain information about the same groups. I am trying to concatenate these dataframes by group, using the group column as the group name, but because df1 contains non-unique index values, I cannot set_index on them. Is there a way around this?
Sample input:
df1:
group A B
cat 1 0
cat 2 7
cat 5 5
dog 0.4 1
dog 2 4
dog 8 7
seal 7 5
seal 1 8
seal 7 9
df2:
group C D
cat 1 3
seal 0 5
dog 3 4
df3:
group E F
cat 1 5
dog 0 3
seal 5 9
Desired output:
group A B C D E F
cat 1 0 1 3 1 5
cat 2 7 1 3 1 5
cat 5 5 1 3 1 5
dog 0.4 1 3 4 0 3
dog 2 4 3 4 0 3
dog 8 7 3 4 0 3
seal 7 5 0 5 5 9
seal 1 8 0 5 5 9
seal 7 9 0 5 5 9
Thanks!
Answer 0 (score: 1)
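I think you can first use concat on df2 and df3 (after calling set_index on each), provided they have the same size, and then use join to attach the result to df1: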
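# group is unique in df2 and df3, so it can be used as their index
df = pd.concat([df2.set_index('group'), df3.set_index('group')], axis=1)
# df1 keeps its own index; its group column is matched against df's index
all_data = df1.join(df, on='group')
print(all_data)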
group A B C D E F
0 cat 1.0 0 1 3 1 5
1 cat 2.0 7 1 3 1 5
2 cat 5.0 5 1 3 1 5
3 dog 0.4 1 3 4 0 3
4 dog 2.0 4 3 4 0 3
5 dog 8.0 7 3 4 0 3
6 seal 7.0 5 0 5 5 9
7 seal 1.0 8 0 5 5 9
8 seal 7.0 9 0 5 5 9
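For anyone who wants to run this end to end, here is a minimal self-contained sketch. The DataFrame construction is my reconstruction of the sample tables from the question, and the names lookup and merged are only illustrative. The chained merge at the end is an alternative I am adding for comparison, not part of the approach above; validate="many_to_one" is there on the assumption that group must stay unique in df2 and df3.

```python
import pandas as pd

# Sample frames rebuilt from the tables in the question
df1 = pd.DataFrame({
    "group": ["cat", "cat", "cat", "dog", "dog", "dog", "seal", "seal", "seal"],
    "A": [1, 2, 5, 0.4, 2, 8, 7, 1, 7],
    "B": [0, 7, 5, 1, 4, 7, 5, 8, 9],
})
df2 = pd.DataFrame({"group": ["cat", "seal", "dog"], "C": [1, 0, 3], "D": [3, 5, 4]})
df3 = pd.DataFrame({"group": ["cat", "dog", "seal"], "E": [1, 0, 5], "F": [5, 3, 9]})

# group is unique in df2 and df3, so it can become their index;
# concat with axis=1 aligns the two frames on that index
lookup = pd.concat([df2.set_index("group"), df3.set_index("group")], axis=1)

# df1 keeps its default integer index; join matches df1["group"]
# against the index of lookup (a left join, so every df1 row is kept)
all_data = df1.join(lookup, on="group")

# Alternative: chained left merges on the group column.
# validate raises MergeError if group is duplicated on the right side.
merged = (
    df1.merge(df2, on="group", how="left", validate="many_to_one")
       .merge(df3, on="group", how="left", validate="many_to_one")
)

print(all_data)
```

Neither variant needs set_index on df1, so the duplicated group values there are not a problem.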
It is also possible to use the index_col parameter of read_csv instead of set_index:
# file stands for each frame's own CSV path
df1 = pd.read_csv(file)
df2 = pd.read_csv(file, index_col='group')
df3 = pd.read_csv(file, index_col='group')
df = pd.concat([df2, df3], axis=1)
all_data = df1.join(df, on='group')
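A rough runnable sketch of the index_col variant follows. Since the actual files are not shown, the in-memory strings csv1, csv2 and csv3 are hypothetical stand-ins for them, holding a shortened version of the sample data.

```python
import io
import pandas as pd

# Hypothetical CSV contents standing in for the real files
csv1 = "group,A,B\ncat,1,0\ncat,2,7\ndog,0.4,1\nseal,7,5\n"
csv2 = "group,C,D\ncat,1,3\nseal,0,5\ndog,3,4\n"
csv3 = "group,E,F\ncat,1,5\ndog,0,3\nseal,5,9\n"

# df1 keeps group as an ordinary column because its values repeat
df1 = pd.read_csv(io.StringIO(csv1))

# index_col makes group the index at read time, so no set_index call is needed
df2 = pd.read_csv(io.StringIO(csv2), index_col="group")
df3 = pd.read_csv(io.StringIO(csv3), index_col="group")

# Align df2 and df3 on their shared index, then attach them to df1
df = pd.concat([df2, df3], axis=1)
all_data = df1.join(df, on="group")

print(all_data)
```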