pandas: cannot reindex from a duplicate axis

Time: 2019-06-18 12:38:20

Tags: pandas

I have this code:

missing_columns = list(set(model_header) - set(combined_data.columns))
if missing_columns:
    combined_data = combined_data.reindex(columns=np.append(combined_data.columns.values, missing_columns))

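It sometimes produces this error:

cannot reindex from a duplicate axis

From other posts I understand that this happens when there are duplicate columns, but I don't know how to add the missing columns.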

Here is the traceback:

Traceback:

File "/home/henry/Documents/Sites/Development/web-cdi/env/local/lib/python2.7/site-packages/django/core/handlers/exception.py" in inner
  41.             response = get_response(request)

File "/home/henry/Documents/Sites/Development/web-cdi/env/local/lib/python2.7/site-packages/django/core/handlers/base.py" in _legacy_get_response
  249.             response = self._get_response(request)

File "/home/henry/Documents/Sites/Development/web-cdi/env/local/lib/python2.7/site-packages/django/core/handlers/base.py" in _get_response
  187.                 response = self.process_exception_by_middleware(e, request)

File "/home/henry/Documents/Sites/Development/web-cdi/env/local/lib/python2.7/site-packages/django/core/handlers/base.py" in _get_response
  185.                 response = wrapped_callback(request, *callback_args, **callback_kwargs)

File "/home/henry/Documents/Sites/Development/web-cdi/env/local/lib/python2.7/site-packages/django/contrib/admin/options.py" in wrapper
  552.                 return self.admin_site.admin_view(view)(*args, **kwargs)

File "/home/henry/Documents/Sites/Development/web-cdi/env/local/lib/python2.7/site-packages/django/utils/decorators.py" in _wrapped_view
  149.                     response = view_func(request, *args, **kwargs)

File "/home/henry/Documents/Sites/Development/web-cdi/env/local/lib/python2.7/site-packages/django/views/decorators/cache.py" in _wrapped_view_func
  57.         response = view_func(request, *args, **kwargs)

File "/home/henry/Documents/Sites/Development/web-cdi/env/local/lib/python2.7/site-packages/django/contrib/admin/sites.py" in inner
  224.             return view(request, *args, **kwargs)

File "/home/henry/Documents/Sites/Development/web-cdi/env/local/lib/python2.7/site-packages/django/utils/decorators.py" in _wrapper
  67.             return bound_func(*args, **kwargs)

File "/home/henry/Documents/Sites/Development/web-cdi/env/local/lib/python2.7/site-packages/django/utils/decorators.py" in _wrapped_view
  149.                     response = view_func(request, *args, **kwargs)

File "/home/henry/Documents/Sites/Development/web-cdi/env/local/lib/python2.7/site-packages/django/utils/decorators.py" in bound_func
  63.                 return func.__get__(self, type(self))(*args2, **kwargs2)

File "/home/henry/Documents/Sites/Development/web-cdi/env/local/lib/python2.7/site-packages/django/contrib/admin/options.py" in changelist_view
  1590.                 response = self.response_action(request, queryset=cl.get_queryset(request))

File "/home/henry/Documents/Sites/Development/web-cdi/env/local/lib/python2.7/site-packages/django/contrib/admin/options.py" in response_action
  1287.             response = func(self, request, queryset)

File "/home/henry/Documents/Sites/Development/web-cdi/webcdi/researcher_UI/admin_actions.py" in scoring_data
  20.     return download_data(request, study_obj, administrations)

File "/home/henry/Documents/Sites/Development/web-cdi/env/local/lib/python2.7/site-packages/django/contrib/auth/decorators.py" in _wrapped_view
  23.                 return view_func(request, *args, **kwargs)

File "/home/henry/Documents/Sites/Development/web-cdi/webcdi/researcher_UI/views.py" in download_data
  145.         combined_data = combined_data.reindex(columns=np.append(combined_data.columns.values, missing_columns))

File "/home/henry/Documents/Sites/Development/web-cdi/env/local/lib/python2.7/site-packages/pandas/core/frame.py" in reindex
  2733.                                               **kwargs)

File "/home/henry/Documents/Sites/Development/web-cdi/env/local/lib/python2.7/site-packages/pandas/core/generic.py" in reindex
  2515.                                   fill_value, copy).__finalize__(self)

File "/home/henry/Documents/Sites/Development/web-cdi/env/local/lib/python2.7/site-packages/pandas/core/frame.py" in _reindex_axes
  2674.                                            fill_value, limit, tolerance)

File "/home/henry/Documents/Sites/Development/web-cdi/env/local/lib/python2.7/site-packages/pandas/core/frame.py" in _reindex_columns
  2699.                                            allow_dups=False)

File "/home/henry/Documents/Sites/Development/web-cdi/env/local/lib/python2.7/site-packages/pandas/core/generic.py" in _reindex_with_indexers
  2627.                                                 copy=copy)

File "/home/henry/Documents/Sites/Development/web-cdi/env/local/lib/python2.7/site-packages/pandas/core/internals.py" in reindex_indexer
  3886.             self.axes[axis]._can_reindex(indexer)

File "/home/henry/Documents/Sites/Development/web-cdi/env/local/lib/python2.7/site-packages/pandas/core/indexes/base.py" in _can_reindex
  2836.             raise ValueError("cannot reindex from a duplicate axis")

Exception Type: ValueError at /wcadmin/researcher_UI/study/
Exception Value: cannot reindex from a duplicate axis

I have looked through the columns and don't see any overlap. Could something else be causing it?
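Repeated column labels are easy to miss by eye, so it can help to ask pandas for them directly. The following is a minimal sketch, assuming a small stand-in frame (the data and the added label 'D' are hypothetical, not taken from the question). It lists the repeated labels with `Index.duplicated()` and then shows that reindexing such a frame raises a `ValueError`. The exact wording of the message differs between pandas versions.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for combined_data: the label 'A' appears twice.
combined_data = pd.DataFrame([[1, 2, 3]], columns=['A', 'B', 'A'])

# Index.duplicated() marks every repeat of a label after its first occurrence.
dup_mask = combined_data.columns.duplicated()
duplicated_labels = combined_data.columns[dup_mask].tolist()
print(duplicated_labels)

# Reindexing the columns of a frame with repeated labels raises ValueError.
try:
    combined_data.reindex(
        columns=np.append(combined_data.columns.values, ['D'])
    )
    raised = False
except ValueError as exc:
    raised = True
    print(exc)
```

If `duplicated_labels` comes back empty for both inputs, the duplicates are not in the column labels and the cause lies elsewhere.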

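1 answer:

Answer 0 (score: 2)

My guess is that one or both of the DataFrames contain duplicated column names. The fix is to deduplicate them before reindexing, either manually or with the following code: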

import numpy as np
import pandas as pd

model_header = pd.DataFrame(columns=list('ABDB'))
combined_data = pd.DataFrame(columns=list('ABCA'))
print (model_header)
Empty DataFrame
Columns: [A, B, D, B]
Index: []

print (combined_data)
Empty DataFrame
Columns: [A, B, C, A]
Index: []

s1 = model_header.columns.to_series()
model_header.columns = (model_header.columns + 
                        s1.groupby(s1).cumcount().astype(str).radd('_').str.replace('_0',''))

s2 = combined_data.columns.to_series()
combined_data.columns = (combined_data.columns + 
                         s2.groupby(s2).cumcount().astype(str).radd('_').str.replace('_0',''))

print (model_header)
Empty DataFrame
Columns: [A, B, D, B_1]
Index: []

print (combined_data)
Empty DataFrame
Columns: [A, B, C, A_1]
Index: []

missing_columns = list(set(model_header) - set(combined_data.columns))
print (missing_columns)
['D', 'B_1']

if missing_columns:
    combined_data = combined_data.reindex(columns=np.append(combined_data.columns.values, missing_columns))
    print (combined_data)
Empty DataFrame
Columns: [A, B, C, A_1, D, B_1]
Index: []
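
The renaming step is applied twice above, so it may be convenient to wrap it in a function. This is only a sketch of the same idea: the name `dedupe_columns` is hypothetical, and it builds the suffixes in a list comprehension instead of string replacement. It does not check whether a generated label such as 'A_1' collides with a label that already exists.

```python
import pandas as pd

def dedupe_columns(df):
    """Return a copy of df whose repeated column labels get _1, _2, ... suffixes."""
    labels = pd.Series(df.columns)
    # cumcount() numbers the occurrences of each label: 0 for the first, 1 for the next, ...
    counts = labels.groupby(labels).cumcount()
    new_labels = [
        label if n == 0 else "{}_{}".format(label, n)
        for label, n in zip(labels, counts)
    ]
    out = df.copy()
    out.columns = new_labels
    return out

# Usage with the same example frames as above.
model_header = dedupe_columns(pd.DataFrame(columns=list('ABDB')))
combined_data = dedupe_columns(pd.DataFrame(columns=list('ABCA')))
print(list(model_header.columns))
print(list(combined_data.columns))
```

The first occurrence of each label keeps its original name, so code that already refers to 'A' or 'B' continues to select the first of the repeated columns.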