I have the following code to process a DataFrame:
header_updated = self.schemata.get_header_from_schema_id(df.at[0, 'schema_id'])
for col in (col for col in df.columns.values if df[col].dtype == 'object' and col not in [col_name[0] for col_name in self.header + header_updated if col_name[1] == 'string']):
    df.loc[:, col] = df.loc[:, col].apply(lambda x: is_nan(x))
df.columns = ['timestamp_create', 'schema_id'] + [item[0] for item in header_updated] + df.columns.tolist()[len(header_updated)+2:]
for column in [column for column in df.columns.values if column not in [elem[0] for elem in self.header] and column not in ('timestamp_create', 'schema_id')]:
    del df[column]
for elem in [elem for elem in self.header if elem[0] not in df.columns.values]:
    df[elem[0]] = 0 if elem[1] == 'uint64' else ''
df = df[['timestamp_create', 'schema_id'] + [elem[0] for elem in self.header]]
if 'rat' in df.columns.values:
    df['rat'].fillna('unclassified', inplace=True)
for el in (col_name[0] for col_name in self.header if col_name[1] == 'string' and col_name[0] not in self.numeric_columns):
    df[el].fillna('', inplace=True)
df.fillna(0, inplace=True, downcast='infer')
df[self.numeric_columns].astype('int64')
If I put it inside a function like
def process_frame_obsolete_schema(self, df):...
I get this warning:
/usr/lib64/python2.7/site-packages/pandas/core/generic.py:2862: SettingWithCopyWarning:
A value is trying to be set on a copy of a slice from a DataFrame
See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
self._update_inplace(new_data)
Why does pandas complain when the code runs inside a function?
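
For reference, here is a minimal sketch of one possible cause. It rests on an assumption, since the calling code is not shown above: that the frame passed into the function was itself produced by slicing a larger frame, so the in-place `fillna` calls on individual columns act on what pandas treats as a copy. The data, the `count` column, and the `fill_missing` function are hypothetical stand-ins, not the real code. The sketch takes an explicit copy and assigns results back instead of using `inplace=True`.

```python
import pandas as pd


def fill_missing(df):
    """Hypothetical stand-in for process_frame_obsolete_schema."""
    # Take an explicit copy so later writes cannot touch the caller's frame.
    df = df.copy()
    # Assign the result back instead of calling fillna(..., inplace=True) on a column.
    df['rat'] = df['rat'].fillna('unclassified')
    # astype returns a new object, so its result has to be assigned as well.
    df['count'] = df['count'].fillna(0).astype('int64')
    return df


# Hypothetical data: the frame handed to the function is a slice of a larger one.
full = pd.DataFrame({
    'schema_id': [1, 1, 2],
    'rat': ['lte', None, None],
    'count': [5.0, None, 7.0],
})
subset = full[full['schema_id'] == 1]

result = fill_missing(subset)
print(result)
```

With this pattern the original `full` frame keeps its missing values and only the returned frame is filled. Whether this matches the actual situation depends on how `df` is created before the function is called.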