ValueError in a Pandas DataFrame

Time: 2017-11-03 06:46:56

Tags: python pandas

My goal is:

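  • If the DataFrame is empty, I need to insert a row with index -> the value of the variable URL, and columns -> the value of URL along with the sorted_list
  • If it is not empty, I need to insert a row with index -> the value of the variable URL, and columns -> sorted_list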

What I did: I initialized a DataFrame, self.df, and then for each row with the values above I create a local DataFrame variable, df1, and append it to self.df.

My code:

import pandas as pd

class Reward_Matrix:
    def __init__(self):
        self.df = pd.DataFrame()

    def add(self, URL, webpage_list):
        sorted_list = []
        check_list = list(self.df.columns.values)
        print('check_list: ',check_list)
        for i in webpage_list:     #to ensure no duplication columns
            if i not in check_list:
                sorted_list.append(i)
        if self.df.empty:
            sorted_list.insert(0, URL)
            df1 = pd.DataFrame(0,index=[URL], columns=[sorted_list])
        else:
            df1 = pd.DataFrame(0,index=[URL], columns=[sorted_list])
        print(df1)
        print('sorted_list: ',sorted_list)
        print("length: ",len(df1.columns))
        self.df.append(df1)

But I get the following error:

Traceback (most recent call last):
  File "...Continuum\anaconda3\lib\site-packages\pandas\core\internals.py", line 4294, in create_block_manager_from_blocks
placement=slice(0, len(axes[0])))]
  File "...Continuum\anaconda3\lib\site-packages\pandas\core\internals.py", line 2719, in make_block
return klass(values, ndim=ndim, fastpath=fastpath, placement=placement)
  File "...Continuum\anaconda3\lib\site-packages\pandas\core\internals.py", line 115, in __init__
len(self.mgr_locs)))
ValueError: Wrong number of items passed 1, placement implies 450

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "...eclipse-workspace\Crawler\crawl_core\src_main\run.py", line 23, in test_start
test.crawl_run(self.URL)
  File "...eclipse-workspace\Crawler\crawl_core\src_main\test_crawl.py", line 42, in crawl_run
self.reward.add(URL, webpage_list)
  File "...eclipse-workspace\Crawler\crawl_core\src_main\dynamic_matrix.py", line 21, in add
df1 = pd.DataFrame(0,index=[URL], columns=[sorted_list])
  File "...Continuum\anaconda3\lib\site-packages\pandas\core\frame.py", line 352, in __init__
copy=False)
  File "...Continuum\anaconda3\lib\site-packages\pandas\core\frame.py", line 483, in _init_ndarray
return create_block_manager_from_blocks([values], [columns, index])
  File "...Continuum\anaconda3\lib\site-packages\pandas\core\internals.py", line 4303, in create_block_manager_from_blocks
construction_error(tot_items, blocks[0].shape[1:], axes, e)
  File "...Continuum\anaconda3\lib\site-packages\pandas\core\internals.py", line 4280, in construction_error
passed, implied))
ValueError: Shape of passed values is (1, 1), indices imply (450, 1)

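I am new to DataFrame and Pandas. I have been getting this error for a while, and going through similar questions on StackOverflow only left me confused, because I cannot work out where I am going wrong!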

Can anyone help me?

1 Answer:

Answer 0: (score: 1)

I think you need to remove the [], because otherwise you get a nested list:

df1 = pd.DataFrame(0,index=[URL], columns=sorted_list)
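
The effect of the extra brackets can be seen without pandas at all. Here is a minimal sketch in plain Python (the names `nested` and `flat` are only for illustration). Wrapping `sorted_list` in `[]` produces a one-element list whose single item is the whole list, so its length no longer matches the number of intended columns.

```python
sorted_list = ['a', 'b', 'c']

nested = [sorted_list]  # what columns=[sorted_list] hands to the constructor
flat = sorted_list      # what columns=sorted_list hands to the constructor

print(len(nested))               # 1: a single item, which is itself a list
print(len(flat))                 # 3: one item per intended column
print(nested[0] is sorted_list)  # True: the only item is the original list
```

How pandas treats the nested form can differ between versions. The traceback in the question comes from the asker's installation.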

Sample:

sorted_list = ['a','b','c']
URL = 'url1'
df1 = pd.DataFrame(0,index=[URL], columns=sorted_list)
print (df1)
      a  b  c
url1  0  0  0

df1 = pd.DataFrame(0,index=[URL], columns=[sorted_list])
print (df1)

>ValueError: Shape of passed values is (1, 1), indices imply (3, 1)
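
Separately from the nested list, the question's last line, `self.df.append(df1)`, discards its result: `DataFrame.append` returned a new DataFrame rather than modifying `self.df` in place, and the method was removed in pandas 2.0. The sketch below shows one possible way to combine both fixes using `pd.concat`. It is not necessarily what the asker intended, and the names `RewardMatrix`, `url` and `new_columns` are illustrative.

```python
import pandas as pd


class RewardMatrix:
    def __init__(self):
        self.df = pd.DataFrame()

    def add(self, url, webpage_list):
        existing = set(self.df.columns)
        new_columns = []
        for page in webpage_list:
            # Skip pages that are already columns or already queued.
            if page not in existing and page not in new_columns:
                new_columns.append(page)

        if self.df.empty:
            # First row: the URL itself also becomes a column, as in the question.
            new_columns.insert(0, url)
            self.df = pd.DataFrame(0, index=[url], columns=new_columns)
        else:
            # Pass the flat list (no extra brackets) and keep the concat result.
            row = pd.DataFrame(0, index=[url], columns=new_columns)
            self.df = pd.concat([self.df, row])
        return self.df


matrix = RewardMatrix()
matrix.add('url1', ['a', 'b'])
matrix.add('url2', ['b', 'c'])
print(matrix.df)
```

Cells that a given row does not cover come out as NaN after the concatenation. If zeros are wanted there instead, calling `fillna(0)` on the result is one option.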