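I have a DataFrame like the one below, with one column of non-unique values (addresses, in this case) and several other columns holding information about each address.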
df = pd.DataFrame({'address': {0:'11 Star Street', 1:'22 Milky Way', 2:'88 Dark Drive', 3:'33 Planet Place', 4:'22 Milky Way', 5:'22 Milky Way'}, 'val': {0:10, 1:'', 2:'', 3:20, 4: 20, 5:''}, 'val2': {0:20, 1:'', 2:'', 3:40, 4:10, 5:''}})
           address val val2
0   11 Star Street  10   20
1     22 Milky Way
2    88 Dark Drive
3  33 Planet Place  20   40
4     22 Milky Way  20   10
5     22 Milky Way
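Some addresses appear more than once in the DataFrame, and some of those repeated rows are missing information. If a row is missing values but the same address appears in another row of the DataFrame, I want to replace the NaN values with the values from the row that has the same address, to get the following result: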
           address val val2
0   11 Star Street  10   20
1     22 Milky Way  20   10
2    88 Dark Drive
3  33 Planet Place  20   40
4     22 Milky Way  20   10
5     22 Milky Way  20   10
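Because the DataFrame contains thousands of different addresses, I cannot use something like a dictionary.
Edit: It is safe to assume that both values are either missing or present. In other words, there will never be a row that has val but not val2, or vice versa. That said, answers that also handle that case would be even better!
Answer 0 (score: 1)
There are several ways to do this; the simplest is to groupby the address and ffill/bfill within each group.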
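import numpy as np
import pandas as pd

# Treat empty strings as missing, then fill forward and backward within each address group.
cols = ['val', 'val2']
df[cols] = (df[cols].replace('', np.nan).astype(float)
            .groupby(df['address'])
            .transform(lambda s: s.ffill().bfill()))
print(df)

           address   val  val2
0   11 Star Street  10.0  20.0
1     22 Milky Way  20.0  10.0
2    88 Dark Drive   NaN   NaN
3  33 Planet Place  20.0  40.0
4     22 Milky Way  20.0  10.0
5     22 Milky Way  20.0  10.0

Another, more efficient approach is to use […] along the axis.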
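One way to avoid a Python-level lambda per group is sketched below. It is offered as an assumption, not necessarily the approach this answer had in mind: it takes the first non-null value per address with `transform("first")` and uses it only to fill the gaps. The helper name `fill_from_same_address` and its parameters are hypothetical and exist only for illustration. Because the fill is done column by column, it also covers the case from the question's edit where a row has val but not val2.

```python
import pandas as pd


def fill_from_same_address(df, key="address", cols=("val", "val2")):
    """Fill missing values in `cols` from other rows sharing the same `key`.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    cols = list(cols)
    out = df.copy()
    # Empty strings mark missing data in the question's frame; coerce them to NaN.
    out[cols] = out[cols].apply(pd.to_numeric, errors="coerce")
    # "first" picks the first non-null value per group, column by column,
    # and transform broadcasts it back onto every row of that group.
    firsts = out.groupby(key)[cols].transform("first")
    # Fill only the gaps; values that are already present are left untouched.
    out[cols] = out[cols].fillna(firsts)
    return out


df = pd.DataFrame({
    "address": ["11 Star Street", "22 Milky Way", "88 Dark Drive",
                "33 Planet Place", "22 Milky Way", "22 Milky Way"],
    "val": [10, "", "", 20, 20, ""],
    "val2": [20, "", "", 40, 10, ""],
})
filled = fill_from_same_address(df)
print(filled)
```

Unlike ffill/bfill, which copies the nearest value within the group, this always uses the first non-null value found for an address, so the two approaches can differ when the same address carries conflicting values. Addresses with no values at all, such as 88 Dark Drive, stay NaN.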
update