ValueError: Must pass DataFrame with boolean values only when trying to use to_dict

Asked: 2018-05-07 16:40:37

Tags: python pandas prediction

import pandas as pd
dataset = "C:/Users/ashik swaroop/Desktop/anaconda/Gene Dataset/acancergenecensus.csv"
datacan = pd.read_csv(dataset)
datacan = datacan.fillna(0)
cols_to_retain = datacan[[ "Tumour_Types_Somatic","Tumour_Types_Germline","Mutation_Types","Tissue_Type"]]
cat_dict = datacan[ cols_to_retain ].to_dict( orient = 'records' )

Running this raises the error below. Please help or suggest a fix:

cat_dict = datacan[ cols_to_retain ].to_dict( orient = 'records' )
Traceback (most recent call last):

  File "<ipython-input-47-dde9a2c1af34>", line 1, in <module>
    cat_dict = datacan[ cols_to_retain ].to_dict( orient = 'records' )

  File "C:\Users\ashik swaroop\Anaconda2\lib\site-packages\pandas\core\frame.py", line 2055, in __getitem__
    return self._getitem_frame(key)

  File "C:\Users\ashik swaroop\Anaconda2\lib\site-packages\pandas\core\frame.py", line 2130, in _getitem_frame
    raise ValueError('Must pass DataFrame with boolean values only')

ValueError: Must pass DataFrame with boolean values only

1 answer:

Answer 0 (score: 0)

You need to change:

cols_to_retain = datacan[[ "Tumour_Types_Somatic","Tumour_Types_Germline","Mutation_Types","Tissue_Type"]]
cat_dict = datacan[ cols_to_retain ].to_dict( orient = 'records' )

to:

cols_to_retain = [ "Tumour_Types_Somatic","Tumour_Types_Germline","Mutation_Types","Tissue_Type"]
cat_dict = datacan[ cols_to_retain ].to_dict( orient = 'records' )

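This is needed because selecting with double brackets [[...]] is a subsetting operation: it returns a filtered DataFrame, not a list of column names. When that DataFrame is then used as the key in datacan[ cols_to_retain ], pandas treats it as a boolean mask, and since its values are not booleans it raises the ValueError.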
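A minimal sketch of the difference between the two objects, using a small made-up frame in place of the CSV (the Gene column and all values here are assumptions for illustration only):

```python
import pandas as pd

# Hypothetical stand-in for the CSV contents; the values are invented.
datacan = pd.DataFrame({
    "Gene": ["G1", "G2"],
    "Mutation_Types": ["Mis", "N"],
    "Tissue_Type": ["E", "L"],
})

# A plain list of column names: this is a valid key for column selection.
cols_to_retain = ["Mutation_Types", "Tissue_Type"]

# Indexing with the list returns a DataFrame holding only those columns.
subset = datacan[cols_to_retain]

print(type(cols_to_retain).__name__)  # list
print(type(subset).__name__)          # DataFrame
```

In the question, cols_to_retain was the second kind of object (a DataFrame), so it could not be used again as a list of column names.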

Another possible solution is:

df = datacan[[ "Tumour_Types_Somatic","Tumour_Types_Germline","Mutation_Types","Tissue_Type"]]

cat_dict = df.to_dict( orient = 'records' )
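
For reference, a small sketch of what to_dict( orient = 'records' ) produces once the selection works. The frame below is a made-up example, not the real dataset:

```python
import pandas as pd

# Hypothetical two-row frame; the values are invented for illustration.
df = pd.DataFrame({
    "Mutation_Types": ["Mis", "N"],
    "Tissue_Type": ["E", "L"],
})

# orient='records' gives one dict per row, keyed by column name.
cat_dict = df.to_dict(orient="records")
print(cat_dict)
# [{'Mutation_Types': 'Mis', 'Tissue_Type': 'E'}, {'Mutation_Types': 'N', 'Tissue_Type': 'L'}]
```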