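My goal is to build a DataFrame whose rows have
index -> the value of the variable URL
and columns -> the value of URL along with sorted_list (for the first row, while the DataFrame is still empty),
index -> the value of the variable URL
and columns -> sorted_list (for every later row).
What I did: I initialized a DataFrame, self.df, and then, for each row with the values above, I created a local DataFrame variable, df1, and appended it to self.df.
My code: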
import pandas as pd

class Reward_Matrix:
    def __init__(self):
        self.df = pd.DataFrame()

    def add(self, URL, webpage_list):
        sorted_list = []
        check_list = list(self.df.columns.values)
        print('check_list: ', check_list)
        for i in webpage_list:  # to ensure no duplicate columns
            if i not in check_list:
                sorted_list.append(i)
        if self.df.empty:
            sorted_list.insert(0, URL)
            df1 = pd.DataFrame(0, index=[URL], columns=[sorted_list])
        else:
            df1 = pd.DataFrame(0, index=[URL], columns=[sorted_list])
        print(df1)
        print('sorted_list: ', sorted_list)
        print("length: ", len(df1.columns))
        self.df.append(df1)
But I get the following error:
Traceback (most recent call last):
  File "...Continuum\anaconda3\lib\site-packages\pandas\core\internals.py", line 4294, in create_block_manager_from_blocks
    placement=slice(0, len(axes[0])))]
  File "...Continuum\anaconda3\lib\site-packages\pandas\core\internals.py", line 2719, in make_block
    return klass(values, ndim=ndim, fastpath=fastpath, placement=placement)
  File "...Continuum\anaconda3\lib\site-packages\pandas\core\internals.py", line 115, in __init__
    len(self.mgr_locs)))
ValueError: Wrong number of items passed 1, placement implies 450

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "...eclipse-workspace\Crawler\crawl_core\src_main\run.py", line 23, in test_start
    test.crawl_run(self.URL)
  File "...eclipse-workspace\Crawler\crawl_core\src_main\test_crawl.py", line 42, in crawl_run
    self.reward.add(URL, webpage_list)
  File "...eclipse-workspace\Crawler\crawl_core\src_main\dynamic_matrix.py", line 21, in add
    df1 = pd.DataFrame(0,index=[URL], columns=[sorted_list])
  File "...Continuum\anaconda3\lib\site-packages\pandas\core\frame.py", line 352, in __init__
    copy=False)
  File "...Continuum\anaconda3\lib\site-packages\pandas\core\frame.py", line 483, in _init_ndarray
    return create_block_manager_from_blocks([values], [columns, index])
  File "...Continuum\anaconda3\lib\site-packages\pandas\core\internals.py", line 4303, in create_block_manager_from_blocks
    construction_error(tot_items, blocks[0].shape[1:], axes, e)
  File "...Continuum\anaconda3\lib\site-packages\pandas\core\internals.py", line 4280, in construction_error
    passed, implied))
ValueError: Shape of passed values is (1, 1), indices imply (450, 1)
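I am new to DataFrame and pandas. I have been getting this error for a while, and going through similar questions on StackOverflow only confused me, because I cannot see where I am going wrong!
Can someone help me?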
Answer 0 (score: 1)
I think you need to remove the [], because otherwise you get a nested list:
df1 = pd.DataFrame(0,index=[URL], columns=sorted_list)
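To see what "nested list" means here, the plain-Python sketch below compares the two arguments without involving pandas at all. The names flat and nested are illustrative only and do not appear in the question.

```python
sorted_list = ['a', 'b', 'c']   # illustrative labels, not the real page list

flat = sorted_list      # a list of three column labels
nested = [sorted_list]  # a list with ONE element, which is itself a list

print(len(flat))    # 3
print(len(nested))  # 1
```

Wrapping sorted_list in another pair of brackets therefore hands the DataFrame constructor a single outer item instead of one item per column label.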
Sample:
sorted_list = ['a','b','c']
URL = 'url1'
df1 = pd.DataFrame(0,index=[URL], columns=sorted_list)
print (df1)
      a  b  c
url1  0  0  0
df1 = pd.DataFrame(0,index=[URL], columns=[sorted_list])
print (df1)
>ValueError: Shape of passed values is (1, 1), indices imply (3, 1)
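Putting the fix back into the class from the question, a minimal sketch might look like the following. It is not the asker's final code: the names RewardMatrix, url, page and new_columns are hypothetical, and it goes one step beyond the answer by storing the combined result, because DataFrame.append returned a new DataFrame instead of modifying self.df in place (and was removed in pandas 2.0). pd.concat is used here as the replacement.

```python
import pandas as pd


class RewardMatrix:
    """Hypothetical rework of the question's Reward_Matrix class."""

    def __init__(self):
        self.df = pd.DataFrame()

    def add(self, url, webpage_list):
        # Keep only pages that are not already columns (and not repeated).
        existing = set(self.df.columns)
        new_columns = []
        for page in webpage_list:
            if page not in existing and page not in new_columns:
                new_columns.append(page)

        first_row = self.df.empty
        if first_row and url not in new_columns:
            new_columns.insert(0, url)

        # Flat list of labels: columns=new_columns, not columns=[new_columns].
        df1 = pd.DataFrame(0, index=[url], columns=new_columns)

        # Store the result; neither append nor concat works in place.
        self.df = df1 if first_row else pd.concat([self.df, df1])


matrix = RewardMatrix()
matrix.add('url1', ['a', 'b'])
matrix.add('url2', ['b', 'c'])
print(matrix.df)
```

Cells for columns that a given row did not supply come out as NaN after the concatenation. If zeros are wanted everywhere, calling fillna(0) on the result is one option, assuming that matches the intended meaning of the matrix.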