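This may be a duplicate question, but I could not find an exact solution to my problem.
I have a pandas DataFrame named 'data1', and I want to get the number of unique categories in each column whose dtype is 'object'. Here is the code I used: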
for col in data1.columns:
    if data1[col].dtypes =='object':
        unique_category = len(data1[col].unique())
        print("feature '{col}' has '{unique_category}' unique catogories".format(col=col,unique_category=unique_category))
This code worked fine in other programs, but this time it raises the following error:
ValueError                                Traceback (most recent call last)
<ipython-input-178-03999268fffa> in <module>()
      1 for col in data1.columns:
----> 2     if data1[col].dtypes =='object':
      3         unique_category = len(data1[col].unique())
      4         print("feature '{col}' has '{unique_category}' unique catogories".format(col=col,unique_category=unique_category))
      5
~\AppData\Local\Continuum\anaconda3\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
   1571         raise ValueError("The truth value of a {0} is ambiguous. "
   1572                          "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
-> 1573                          .format(self.__class__.__name__))
   1574
   1575     __bool__ = __nonzero__
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Why does it raise this error?
Answer 0 (score: 1)
Here is an example:
import pandas as pd

# Create data set
d = {'foo': [100, 111, 222],
     'bar': ['333', '444', '555']}
df = pd.DataFrame(d)
df
# bar foo
# 0 333 100
# 1 444 111
# 2 555 222
df.info()
# <class 'pandas.core.frame.DataFrame'>
# RangeIndex: 3 entries, 0 to 2
# Data columns (total 2 columns):
# bar 3 non-null object # <- object type column
# foo 3 non-null int64
# dtypes: int64(1), object(1)
# memory usage: 128.0+ bytes
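for col in range(len(df.dtypes)):
    # .iloc makes the positional lookup explicit; plain df.dtypes[col] with an
    # integer on a string-labelled index is deprecated in recent pandas
    if df.dtypes.iloc[col] == 'O':  # <- can also use `O`
        unique_category = len(df.loc[:, df.columns[col]].unique())
        print("feature '{col}' has '{unique_category}' unique categories".format(col=df.columns[col], unique_category=unique_category))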
# feature 'bar' has '3' unique categories
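The positional loop above can also be written by iterating over the dtypes directly. This is only a sketch of the same idea, not a different method: the helper name count_unique_object_categories is made up for illustration, and the 'bar' column is created with an explicit object dtype so the check behaves the same on pandas versions that infer a dedicated string dtype by default. Note that nunique() skips NaN by default, whereas len(col.unique()) counts it as a value.

```python
import pandas as pd

# Same toy data as above; 'bar' is forced to object dtype (an assumption made
# so the example does not depend on the pandas version's string inference).
df = pd.DataFrame({
    'foo': [100, 111, 222],
    'bar': pd.Series(['333', '444', '555'], dtype=object),
})


def count_unique_object_categories(frame):
    """Return {column name: number of distinct non-null values} for object columns.

    Hypothetical helper, written only for this example.
    """
    counts = {}
    for name, dtype in frame.dtypes.items():
        if dtype == object:
            counts[name] = frame[name].nunique()
    return counts


counts = count_unique_object_categories(df)
for name, n in counts.items():
    print(f"feature '{name}' has '{n}' unique categories")
```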
Answer 1 (score: 1)
You can just use select_dtypes:
for col in data1.select_dtypes('object'):
    print(f'feature {col} has {data1[col].nunique()} unique categories')
It will automatically select the object columns for you.
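As for why the original loop fails: one likely cause (an assumption, since the question does not show data1) is that data1 has duplicate column names. In that case data1[col] returns a DataFrame rather than a Series, its .dtypes is a Series, and comparing that with 'object' gives a boolean Series that `if` cannot reduce to a single True/False. The frame below is invented purely to reproduce the message:

```python
import pandas as pd

# Hypothetical frame with the label 'a' used for two columns
dup = pd.DataFrame([['x', 'y', 1], ['z', 'y', 2]], columns=['a', 'a', 'b'])

selected = dup['a']                   # a DataFrame, because 'a' matches two columns
dtypes_of_selected = selected.dtypes  # therefore a Series of dtypes, not a single dtype

raised = False
message = ''
try:
    # Same comparison as in the question; the result is a two-element boolean Series
    if dtypes_of_selected == 'object':
        pass
except ValueError as exc:
    raised = True
    message = str(exc)

print(raised, message)

# Listing the duplicated labels is a quick way to check for this situation
duplicated_labels = dup.columns[dup.columns.duplicated()].tolist()
print(duplicated_labels)
```

If data1.columns.duplicated().any() is True for your data, renaming or dropping the repeated columns should let the original loop run again.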