Fill a pandas DataFrame with values from a second DataFrame that shares some of its rows and columns

Time: 2017-07-28 15:55:58

Tags: python pandas dataframe

Suppose I have two pandas DataFrames, df1 and df2:

df1
         s1   s2   s3
    bob  nan  nan  nan
    john nan  nan  nan
    matt nan  nan  nan

df2
         s1   s3   s4
    bob  32   11   22
    matt 1   nan    2

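I want to fill df1 with the values from df2 wherever a row and a column of df1 also exist in df2, so that the output is

         s1   s2   s3
    bob  32   nan  11
    john nan  nan  nan
    matt 1    nan  nan

This means that, in this toy case, I am not interested in filling df1 with the s4 column of df2. All my attempts using merge have sadly failed, and I always end up with a DataFrame that contains only nan.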
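For anyone who wants to reproduce the problem, the two frames can be built as in the minimal sketch below. The float dtype for df1 and the variable names common_rows and common_cols are assumptions for illustration; the question itself only shows the printed frames.

```python
import numpy as np
import pandas as pd

# df1: the frame to fill; every cell starts out missing (assumed float dtype).
df1 = pd.DataFrame(
    np.nan,
    index=["bob", "john", "matt"],
    columns=["s1", "s2", "s3"],
)

# df2: the source of values. It has no row "john" and no column "s2",
# and it carries an extra column "s4" that should be ignored.
df2 = pd.DataFrame(
    {"s1": [32, 1], "s3": [11, np.nan], "s4": [22, 2]},
    index=["bob", "matt"],
)

# Labels shared by both frames: only these cells can receive a value.
common_rows = df1.index.intersection(df2.index)
common_cols = df1.columns.intersection(df2.columns)
print(list(common_rows), list(common_cols))
```

The shared labels are the rows bob and matt and the columns s1 and s3, which matches the cells filled in the desired output.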

1 answer:

Answer 0: (score: 4)

In place
Use pd.DataFrame.update.
This overwrites every position in df1 where df2 has a non-null value.

    df1.update(df2)
    df1

            s1   s2    s3
    bob   32.0  NaN  11.0
    john   NaN  NaN   NaN
    matt   1.0  NaN   NaN
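The in-place approach can be tried end to end with the sketch below. The frame construction is an assumption based on the tables in the question (df1 is taken to be float-typed); the call to update is the one the answer recommends. Note that update modifies df1 and returns None, so its result should not be assigned back to df1.

```python
import numpy as np
import pandas as pd

# Frames rebuilt from the question (construction details are assumed).
df1 = pd.DataFrame(
    np.nan,
    index=["bob", "john", "matt"],
    columns=["s1", "s2", "s3"],
)
df2 = pd.DataFrame(
    {"s1": [32, 1], "s3": [11, np.nan], "s4": [22, 2]},
    index=["bob", "matt"],
)

# update aligns df2 to df1's labels and copies over only non-null values.
# It works in place and returns None.
result = df1.update(df2)

print(result)  # None
print(df1)     # s4 is ignored because df1 has no such column
```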

Produce a copy 1
Use pd.DataFrame.align, pd.DataFrame.fillna and pd.DataFrame.reindex_like.
pd.DataFrame.fillna will not work correctly unless the index and columns are aligned.

    pd.DataFrame.fillna(*df1.align(df2)).reindex_like(df1)

            s1   s2    s3
    bob   32.0  NaN  11.0
    john   NaN  NaN   NaN
    matt   1.0  NaN   NaN
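The one-liner in this approach unpacks the pair returned by align straight into fillna. A more explicit sketch of the same three steps follows; the names left, right and filled are hypothetical, and the frame construction is assumed from the question's tables.

```python
import numpy as np
import pandas as pd

# Frames rebuilt from the question (construction details are assumed).
df1 = pd.DataFrame(
    np.nan,
    index=["bob", "john", "matt"],
    columns=["s1", "s2", "s3"],
)
df2 = pd.DataFrame(
    {"s1": [32, 1], "s3": [11, np.nan], "s4": [22, 2]},
    index=["bob", "matt"],
)

# Step 1: align both frames on the union of their row and column labels
# (the default outer join), so they end up with identical shapes.
left, right = df1.align(df2)

# Step 2: fill the holes in the aligned df1 with the aligned df2.
# Step 3: trim back to df1's original rows and columns (drops s4).
filled = left.fillna(right).reindex_like(df1)

print(filled)
print(df1.isna().all().all())  # df1 itself is left untouched
```

Because a new frame is returned, this variant suits pipelines where df1 must stay unchanged.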

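Produce a copy 2
Use pd.DataFrame.combine_first and pd.DataFrame.reindex_like.
Which frame you put first is debatable. Since df1 is entirely nan here, it does not matter. This version, however, keeps any pre-existing non-null values in df1. Otherwise, you can switch the positions and use df2.combine_first(df1).

    df1.combine_first(df2).reindex_like(df1)

            s1   s2    s3
    bob   32.0  NaN  11.0
    john   NaN  NaN   NaN
    matt   1.0  NaN   NaN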
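The effect of the call order only shows up when df1 already holds a value. The sketch below illustrates it under an assumption that is not in the original question: a hypothetical copy df1_pre in which bob's s1 is already 99. The names keep_df1 and prefer_df2 are likewise made up for illustration.

```python
import numpy as np
import pandas as pd

# Frames rebuilt from the question (construction details are assumed).
df1 = pd.DataFrame(
    np.nan,
    index=["bob", "john", "matt"],
    columns=["s1", "s2", "s3"],
)
df2 = pd.DataFrame(
    {"s1": [32, 1], "s3": [11, np.nan], "s4": [22, 2]},
    index=["bob", "matt"],
)

# Hypothetical variant of df1 with one pre-existing value.
df1_pre = df1.copy()
df1_pre.loc["bob", "s1"] = 99.0

# df1 first: values already present in df1 win, df2 only fills the gaps.
keep_df1 = df1_pre.combine_first(df2).reindex_like(df1_pre)

# df2 first: df2's non-null values win, df1 only fills the gaps.
prefer_df2 = df2.combine_first(df1_pre).reindex_like(df1_pre)

print(keep_df1.loc["bob", "s1"])    # value kept from df1_pre
print(prefer_df2.loc["bob", "s1"])  # value taken from df2
```

In both cases reindex_like restricts the result to the rows and columns of the first frame of interest, so the extra s4 column never appears.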
