Question

I have a DataFrame with 9,000 columns and 100 rows. I want to insert a new column after every third column, with the value 50 in every row.

Existing DataFrame

  0 1 2 3 4 5 6 7 8 9....9000
0 a b c d e f g h i j ....x
1 k l m n o p q r s t ....x
.
.

100 u v w x y z aa bb cc .... x

Desired DataFrame

  0 1 2 3 4 5 6 7 8 9....12000
0 a b c 50 d e f  50 g h i j ....x
1 k l m 50 n o p  50 q r s t ....x
.
.
100 u v w 50 x y z 50 aa bb cc....x

Answer 1

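Create a new DataFrame labeled by every third column of the original, add .5 to those labels so they sort into the right place, and use concat to join it to the original: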

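import numpy as np
import pandas as pd

# Use plain integer positions as column labels
df.columns = np.arange(len(df.columns))

# Constant columns labeled 2.5, 5.5, 8.5, ... so they sort right after every third column
df1 = pd.DataFrame(50, index=df.index, columns=df.columns[2::3] + .5)

# Combine, sort the columns by label, then renumber them
df2 = pd.concat([df, df1], axis=1).sort_index(axis=1)
df2.columns = np.arange(len(df2.columns))
print(df2)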
  0  1  2   3  4  5  6   7  8  9  10  11 12
0  a  b  c  50  d  e  f  50  g  h  i  50  j
1  k  l  m  50  n  o  p  50  q  r  s  50  t
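
The same idea can be wrapped in a small helper that works for any group size. This is only a sketch: the function name insert_constant_every and its parameters n and value are hypothetical, not part of pandas, and the sample frame is the 2 x 10 one from the Setup shown under Answer 3.

```python
import numpy as np
import pandas as pd

def insert_constant_every(df, n=3, value=50):
    """Hypothetical helper: insert a constant column after every n-th column."""
    out = df.copy()
    # Use integer positions as labels so the .5 offset is meaningful
    out.columns = np.arange(out.shape[1])
    # Labels n-1+.5, 2n-1+.5, ... sort directly after every n-th column
    filler = pd.DataFrame(value, index=out.index,
                          columns=out.columns[n - 1::n] + 0.5)
    out = pd.concat([out, filler], axis=1).sort_index(axis=1)
    # Renumber the columns 0..k-1
    out.columns = np.arange(out.shape[1])
    return out

df = pd.DataFrame(np.reshape([*'abcdefghijklmnopqrst'], (2, -1)))
result = insert_constant_every(df, n=3, value=50)
print(result)
```

Sorting on fractional labels keeps the original columns in order, because every new label falls strictly between two existing integer labels.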

Answer 2

Here is one way to solve it:

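# Split the columns into groups of 3, append a constant column to each group, then concatenate.
# Note: a trailing group with fewer than 3 columns also gets a 50 column.
s = pd.concat([y.assign(new=50) for x, y in df.groupby(np.arange(df.shape[1]) // 3, axis=1)], axis=1)
s.columns = np.arange(s.shape[1])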
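
Recent pandas releases deprecate groupby with axis=1, so a variant that slices the columns with iloc may be more durable. The sketch below is an assumption about how one might rewrite this answer, not the answerer's code. It also skips the extra column for a trailing group with fewer than n columns, which matches the output shown in the other answers.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.reshape([*'abcdefghijklmnopqrst'], (2, -1)))

n = 3  # group size
pieces = []
for start in range(0, df.shape[1], n):
    block = df.iloc[:, start:start + n]
    # Only complete groups of n columns get the constant column
    if block.shape[1] == n:
        block = block.assign(new=50)
    pieces.append(block)

s = pd.concat(pieces, axis=1)
# Replace the mixed labels (integers and 'new') with 0..k-1
s.columns = np.arange(s.shape[1])
print(s)
```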

Answer 3

NumPy

# How many columns to group
x    = 3
# Get the shape of things
a    = df.to_numpy()
m, n = a.shape
k    = n // x
# Get only a multiple of x columns and reshape
b    = a[:, :k * x].reshape(m, k, x)
# Get the other columns missed by b
c    = a[:, k * x:]
# array of 50's that we'll append to the last dimension
_50  = np.ones((m, k, 1), np.int64) * 50
# append 50's and reshape back to 2D
d    = np.append(b, _50, axis=2).reshape(m, k * (x + 1))
# Create DataFrame while appending the missing bit
pd.DataFrame(np.append(d, c, axis=1))

   0  1  2   3  4  5  6   7  8  9 10  11 12
0  a  b  c  50  d  e  f  50  g  h  i  50  j
1  k  l  m  50  n  o  p  50  q  r  s  50  t

Setup

df = pd.DataFrame(np.reshape([*'abcdefghijklmnopqrst'], (2, -1)))
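
As a shorter alternative to the reshape approach, np.insert can place the constant columns directly. This is a different technique from the one in this answer, offered only as a sketch; the variable names are illustrative, and the array is converted to object dtype so that strings and the integer 50 can coexist.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.reshape([*'abcdefghijklmnopqrst'], (2, -1)))

x = 3  # how many columns to group
a = df.to_numpy(dtype=object)
# Positions (in the original array) before which a 50 column is inserted:
# 3, 6, 9, ... An index equal to the column count appends at the end.
positions = np.arange(x, a.shape[1] + 1, x)
inserted = pd.DataFrame(np.insert(a, positions, 50, axis=1))
print(inserted)
```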
