Question

Hi, I have a DataFrame like df1 below. The values are strings.

    eye         nose       mouse       ear
  34_35_a      45_66_b    45_64_a     78_87_a
  35_38_a      75_76_b    95_37_a     38_79_a
  64_43_a      85_66_b    65_45_a     87_45_a

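I want to get a DataFrame like the one below. The eye column is split into eye_x and eye_y, the other columns are split the same way, and the data type is float.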

 eye_x   eye_y    nose_x   nose_y     mouse_x  mouse_y     ear_x   ear_y        
    34       35       45       66         45        64        78       87
    35       38       75       76         95        37        38       79
    64       43       85       66         65        45        87       45

So far, I only know how to get the (x, y) values together, like this, using the code below:

 eye           nose       mouse       ear
  (34, 35)      (45,66)    (45,64)     (78,87)
  (35, 38)      (75,76)    (95,37)     (38,79)
  (64, 43)      (85,66)    (65,45)     (87,45)

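import numpy as np

def process_xy(val_str):
    # Split "34_35_a" into its parts and keep only the x and y coordinates.
    # The third part is a letter label, so it cannot be passed to int().
    s = val_str.split('_')
    x = float(s[0])
    y = float(s[1])
    return np.array([x, y])

keypoint_cols = list(df.columns)
for col in keypoint_cols:
    # Store each (x, y) pair in a new "<col>_xy" column.
    df[col + '_xy'] = df[col].apply(process_xy)

# Drop the original string columns, keeping only the "_xy" columns.
df2 = df.drop(keypoint_cols, axis=1)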

Answer 1

You can do this with stack and unstack.

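# Stack into one long Series, split on '_', and drop the trailing label part.
v = df.stack().str.split('_', expand=True).iloc[:, :-1]
v.columns = ['x', 'y']

# Unstack back to wide form and put the original column name first.
v = v.unstack().swaplevel(0, 1, axis=1)
v.columns = v.columns.map('_'.join)

# The values are still strings at this point; append .astype(float) if numbers are needed.
v.sort_index(axis=1)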

  ear_x ear_y eye_x eye_y mouse_x mouse_y nose_x nose_y
0    78    87    34    35      45      64     45     66
1    38    79    35    38      95      37     75     76
2    87    45    64    43      65      45     85     66
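
For a self-contained check, here is a minimal sketch of the same stack/split/unstack approach that also casts to float (as the question asks) and restores the question's column order. The sample frame is reconstructed from the question, and the names `result` and `ordered_cols` are assumptions made for illustration.

```python
import pandas as pd

# Sample data reconstructed from the question (assumed to match df1).
df = pd.DataFrame({
    'eye':   ['34_35_a', '35_38_a', '64_43_a'],
    'nose':  ['45_66_b', '75_76_b', '85_66_b'],
    'mouse': ['45_64_a', '95_37_a', '65_45_a'],
    'ear':   ['78_87_a', '38_79_a', '87_45_a'],
})

# Stack to one long Series, split on '_', and drop the trailing label part.
v = df.stack().str.split('_', expand=True).iloc[:, :-1]
v.columns = ['x', 'y']

# Back to wide form, with the original column name as the outer level.
v = v.unstack().swaplevel(0, 1, axis=1)
v.columns = v.columns.map('_'.join)

# Restore the question's column order and convert the strings to float.
ordered_cols = [f'{col}_{axis}' for col in df.columns for axis in ('x', 'y')]
result = v[ordered_cols].astype(float)
print(result)
```

Selecting by `ordered_cols` instead of calling `sort_index(axis=1)` keeps eye, nose, mouse, ear in their original order rather than alphabetical order.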

Answer 2

I would use str.split and stack, followed by apply(pd.Series).

s = df.apply(lambda x: x.str.split('_')).stack().apply(pd.Series)  # split into lists, then expand each list into columns
s = s.apply(pd.to_numeric, errors='coerce').dropna(axis=1).rename(columns={0: 'x', 1: 'y'}).unstack()  # coerce to numeric and drop the non-numeric label column (now NaN)
s.columns = s.columns.map('{0[1]}_{0[0]}'.format)  # flatten the MultiIndex columns to names like eye_x
s
Out[1274]: 
   eye_x  nose_x  mouse_x  ear_x  eye_y  nose_y  mouse_y  ear_y
0     34      45       45     78     35      66       64     87
1     35      75       95     38     38      76       37     79
2     64      85       65     87     43      66       45     45

Answer 3

Here is one approach using a list comprehension and pd.concat.

res = pd.concat([df[col].str.split('_', expand=True).iloc[:, :2].add_prefix(col) \
                for col in df], axis=1).astype(int)

I leave renaming the column suffixes as an exercise.

Result

  eye0 eye1 nose0 nose1 mouse0 mouse1 ear0 ear1
0   34   35    45    66     45     64   78   87
1   35   38    75    76     95     37   38   79
2   64   43    85    66     65     45   87   45

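Explanation

Use pd.concat with axis=1 to join the per-column results side by side.
Split the values on _ with expand=True and keep only the first 2 components.
Convert to int with pd.DataFrame.astype.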
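
The suffix renaming left as an exercise could be sketched as follows. This is one possible way, not necessarily what the answer's author had in mind: it names the two split components with `set_axis` before adding the prefix, and it casts to float because the question asks for that type. The sample frame and the helper name `split_xy` are assumptions made for illustration.

```python
import pandas as pd

# Sample data reconstructed from the question (assumed to match df1).
df = pd.DataFrame({
    'eye':   ['34_35_a', '35_38_a', '64_43_a'],
    'nose':  ['45_66_b', '75_76_b', '85_66_b'],
    'mouse': ['45_64_a', '95_37_a', '65_45_a'],
    'ear':   ['78_87_a', '38_79_a', '87_45_a'],
})

def split_xy(frame, col):
    # Hypothetical helper: split one column on '_' and keep the first two parts.
    parts = frame[col].str.split('_', expand=True).iloc[:, :2]
    # Name the parts 'x' and 'y', then prefix them with the source column name.
    return parts.set_axis(['x', 'y'], axis=1).add_prefix(col + '_')

# Join the per-column frames side by side and convert the strings to float.
res = pd.concat([split_xy(df, col) for col in df], axis=1).astype(float)
print(res)
```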

Answer 4

You can use a nested list comprehension with concat:

df1 = pd.concat([pd.DataFrame([dict(zip([i + '_x',i + '_y'], y.split('_')[:2])) for y in x]) 
                               for i, x in df.items()], axis=1).astype(int)
print (df1)
   eye_x  eye_y  nose_x  nose_y  mouse_x  mouse_y  ear_x  ear_y
0     34     35      45      66       45       64     78     87
1     35     38      75      76       95       37     38     79
2     64     43      85      66       65       45     87     45

Python pandas: how to quickly process the values in a column
