Question

I have a DataFrame in which one column holds credit card numbers in this format:

123456 ****** 1234

I want to create two new columns, 'First' and 'Last', using "******" as the separator.

I tried:

df[['First','Last']] = df['credit_card'].str.split("******",expand=True)

and got:

re.error: nothing to repeat at position 0

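Note: all values in the series have the same length, and none of them are NaN.

I worked around it like this, but I am interested in a more practical, faster approach: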

for x in range(len(df)):
    df.loc[x,'bin'] = str(df.loc[x,6]).split("******")[0]
    df.loc[x,'last_four'] = str(df.loc[x,6]).split("******")[1]

Answer 1

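The delimiter is treated as a regular expression, and * has a special meaning in a regex: it repeats whatever comes before it. At the start of the pattern there is nothing before it, hence "nothing to repeat at position 0". To match it literally, it needs to be escaped. You can write: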

df[['First','Last']] = df['credit_card'].str.split(r"\*{6}",expand=True)

{6} means the preceding pattern is repeated six times, which is shorter than writing \*\*\*\*\*\*.
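
One detail worth checking: the sample value in the question has a space on each side of the asterisks, so splitting on the asterisks alone would probably leave a trailing space in 'First' and a leading space in 'Last'. Below is a minimal sketch of two ways to deal with that. The sample rows and the column names First2 and Last2 are made up for illustration. As far as I know, pandas 1.4 and later also accept regex=False in str.split to treat the separator as plain text, but the sketch avoids it so it does not depend on the pandas version.

```python
import re
import pandas as pd

# Hypothetical sample data in the format shown in the question.
df = pd.DataFrame({"credit_card": ["123456 ****** 1234", "654321 ****** 4321"]})

# Option 1: let re.escape build the escaped pattern from the literal separator,
# then strip the spaces that surround the asterisks in the sample.
sep = re.escape("******")
parts = df["credit_card"].str.split(sep, expand=True)
df["First"] = parts[0].str.strip()
df["Last"] = parts[1].str.strip()

# Option 2: let the pattern itself consume any whitespace around the asterisks.
parts2 = df["credit_card"].str.split(r"\s*\*{6}\s*", expand=True)
df["First2"] = parts2[0]
df["Last2"] = parts2[1]

print(df)
```

Both options are vectorized, so they avoid the row-by-row loop from the question.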