How to check the split values of a column without changing the dataframe?

Time: 2019-04-23 15:10:15

Tags: python dataframe sklearn-pandas strsplit

I am trying to find rows that meet specific conditions and put a value in the column col.

My current implementation is:

df.loc[~(df['myCol'].isin(myInfo)), 'col'] = 'ok'

In the future, myCol will hold multiple pieces of info per row. So I need to split the value in myCol without changing the dataframe and check whether any of the split values are in myInfo. If one of them is, the current row should get the value 'ok' in the column col. Is there an elegant way to do this without actually splitting and saving the result in an extra variable? I do not know yet how the multiple pieces of info will be represented: either separated by a character, or simply concatenated one after another, each consisting of 4 alphanumeric characters.

1 answer:

Answer 0 (score: 0)

Assuming you need to split the myCol column on '-':

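sep = '-'  # assumed separator
# Split myCol into one column per piece; df itself is not modified
deconcat = df['myCol'].str.split(sep, expand=True)
# Join on the shared index: original columns plus the split columns
new_df = df.join(deconcat)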

The new_df DataFrame will have the same index as df, so you can do whatever you need with new_df and then join the result back to df to filter it.

You can run your .isin code from above on each of the new split columns to get the result you want.
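As a minimal sketch of that idea, the check can be done on the temporary split columns in one step, so nothing is joined back into df. The sample data and the '-' separator are assumptions for illustration; myCol, col and myInfo are the names from the question.

```python
import pandas as pd

# Hypothetical sample data; only the names come from the question.
df = pd.DataFrame({"myCol": ["AB12-CD34", "EF56", "GH78-ZZ99", None]})
myInfo = ["AB12", "GH78"]

sep = "-"  # assumed separator

# Split into temporary columns; df itself is left unchanged.
parts = df["myCol"].str.split(sep, expand=True)

# True where at least one split value is in myInfo.
# Missing pieces (None/NaN) simply do not match.
mask = parts.isin(myInfo).any(axis=1)

# Only now write the result into the target column.
df.loc[mask, "col"] = "ok"
```

Note that the snippet in the question negates the condition with ~; if rows with no match should get 'ok' instead, use ~mask in the last line.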

Source: the code is from the pyjanitor documentation; pyjanitor has a built-in function, deconcatenate_column, that does this.

Source code of deconcatenate_column
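The question also mentions the case where the values are simply concatenated in blocks of 4 alphanumeric characters, with no separator. One possible approach, assuming every value really is a multiple of 4 characters long, is to cut the string with a regular expression instead of str.split. The sample data is made up for illustration.

```python
import pandas as pd

# Hypothetical sample data; only the names come from the question.
df = pd.DataFrame({"myCol": ["AB12CD34", "EF56", "GH78ZZ99"]})
myInfo = {"AB12", "GH78"}

# Cut each string into consecutive 4-character pieces (assumed format).
chunks = df["myCol"].str.findall(r".{4}")

# Check each row's list of pieces against myInfo.
# The isinstance guard skips missing values, which are not lists.
mask = chunks.apply(
    lambda pieces: isinstance(pieces, list) and any(p in myInfo for p in pieces)
)

df.loc[mask, "col"] = "ok"
```

As in the separator case, df is only touched in the final assignment; chunks and mask are temporary objects.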