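I am working on a machine learning assignment in which I go through a bug database, perform multi-class classification, and then insert a new column containing the category for each headline. While debugging, I re-ran that particular cell and it complained that the column already exists. Is there a way around this, other than the usual exception handling?
The code I wrote is as follows: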
import re
import pandas as pd

# Map each trigger category to the keywords that indicate it.
trigger_dict = {
    'Config-Change': ['change', 'changing', 'changed'],
    'Upgrade-Downgrade': ['Upgrade', 'Downgrade', 'ISSU'],
    'VPC-Related': ['MCT', 'MCEC', 'VPC'],
    'CLI-Related': ['CC', 'Consistency', 'Checker', 'Show', 'Debug', 'Clear'],
    'Interface-Flap': ['Flap', 'Shut'],
    'Reload-Related': ['reload', 'reboot', 'ASCII', 'Replay'],
    'Process-Related': ['Restart', 'Kill', 'Process'],
    'ACL-Related': ['RACL', 'PACL', 'IFACL'],
    'Config-Unconfig': ['config', 'remove', 'removal', 'Unconfig', 'reconfig'],
    'HA-Related': ['SSO', 'LC', 'Switchover'],
}

# Category per row, keyed by row position; rows with no match stay unset.
cat_1 = pd.Series(dtype=object)
flag = 0
for index in range(df['Headline'].shape[0]):
    text = df['Headline'][index]
    for key, value in trigger_dict.items():
        for val in value:
            if re.search(val, text, re.I):
                # Keep only the first matching category for this headline.
                if not flag:
                    cat_1[index] = key
                    flag = 1
    flag = 0  # reset before the next headline

df.insert(len(df.columns), "Trigger_Type", cat_1)
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-45-d23348f7bbac> in <module>
12 flag = 0
13
---> 14 df.insert(len(df.columns),"Trigger_Type", cat_1)
~/Desktop/Anaconda/anaconda3/envs/nlp_course/lib/python3.7/site-packages/pandas/core/frame.py in insert(self, loc, column, value, allow_duplicates)
3220 value = self._sanitize_column(column, value, broadcast=False)
3221 self._data.insert(loc, column, value,
-> 3222 allow_duplicates=allow_duplicates)
3223
3224 def assign(self, **kwargs):
~/Desktop/Anaconda/anaconda3/envs/nlp_course/lib/python3.7/site-packages/pandas/core/internals.py in insert(self, loc, item, value, allow_duplicates)
4336 if not allow_duplicates and item in self.items:
4337 # Should this be a different kind of error??
-> 4338 raise ValueError('cannot insert {}, already exists'.format(item))
4339
4340 if not isinstance(loc, int):
ValueError: cannot insert Trigger_Type, already exists
Answer 0 (score: 1)
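It fails because the DataFrame already has a column with that name: the first run of the cell inserted it, so the second run collides with it. If you are fine with duplicate columns, you can pass allow_duplicates=True.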
df.insert(len(df.columns),"Trigger_Type", cat_1, allow_duplicates=True)
Otherwise, you have to give the column a different name.
If you want to replace the column entirely, you can also use:
df['Trigger_Type'] = cat_1
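To make the cell safe to re-run while still controlling where the column goes, one option is to drop any earlier copy before calling insert. The sketch below is only an illustration: the two-row df, the cat_1 values, and the helper add_trigger_column are hypothetical stand-ins, not part of the original code. It also shows what allow_duplicates=True leaves behind when the cell runs twice.

```python
import pandas as pd

# Hypothetical stand-in data; the real df comes from the bug database.
df = pd.DataFrame({"Headline": ["Reload after config change", "VPC peer flap"]})
cat_1 = pd.Series(["Config-Change", "VPC-Related"])


def add_trigger_column(frame, values, name="Trigger_Type"):
    """Insert `name` as the last column, replacing any earlier copy."""
    if name in frame.columns:
        frame = frame.drop(columns=name)
    frame.insert(len(frame.columns), name, values)
    return frame


# Simulate running the notebook cell twice: the second call does not raise.
df = add_trigger_column(df, cat_1)
df = add_trigger_column(df, cat_1)
print(list(df.columns))

# With allow_duplicates=True a re-run adds a second column of the same name,
# and selecting that name then returns a DataFrame instead of a Series.
dup = df.copy()
dup.insert(len(dup.columns), "Trigger_Type", cat_1, allow_duplicates=True)
print(list(dup.columns))
print(type(dup["Trigger_Type"]).__name__)
```

Plain assignment (df['Trigger_Type'] = cat_1) is the shorter route when the column position does not matter, since it overwrites an existing column in place and appends a new one at the end otherwise.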