Question

I have a table with 14 columns, and I want to pull selected columns into a new DataFrame. Say I want column 0 and then columns 8-14

  dfnow = pd.Series([df.iloc[row_count,0], \
                    df.iloc[row_count,8], \
                    df.iloc[row_count,9], \
                    ....

This works but seems clumsy.

I would like to write

  dfnow = pd.Series([df.iloc[row_count,0], \
          df.iloc[row_count, range (8, 14)]])

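but this throws ValueError: Wrong number of items passed

Now, from the answer below, I know I could create two separate Series and concatenate them, but that seems a bit suboptimal.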

Adding pandas Series with different indices without getting NaNs

Answer 1

Is this what you want?

In [52]: df = pd.DataFrame(np.arange(30).reshape(5,6), columns=list('abcdef'))

In [53]: df
Out[53]:
    a   b   c   d   e   f
0   0   1   2   3   4   5
1   6   7   8   9  10  11
2  12  13  14  15  16  17
3  18  19  20  21  22  23
4  24  25  26  27  28  29

In [54]: df.iloc[:, [0,2,4]]
Out[54]:
    a   c   e
0   0   2   4
1   6   8  10
2  12  14  16
3  18  20  22
4  24  26  28

Concatenating (reshaping) columns 0, 2, 4 into a single Series:

In [68]: df.iloc[:, [0,2,4]].values.T.reshape(-1,)
Out[68]: array([ 0,  6, 12, 18, 24,  2,  8, 14, 20, 26,  4, 10, 16, 22, 28])

In [69]: pd.Series(df.iloc[:, [0,2,4]].values.T.reshape(-1,))
Out[69]:
0      0
1      6
2     12
3     18
4     24
5      2
6      8
7     14
8     20
9     26
10     4
11    10
12    16
13    22
14    28
dtype: int32
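
The transpose-and-reshape step above stacks the selected columns one after another. As a minimal sketch of the same idea (an alternative, not the answerer's code), a column-major `ravel` gives the same ordering; the frame is rebuilt here so the snippet runs on its own, and the name `stacked` is only illustrative:

```python
import numpy as np
import pandas as pd

# Same 5x6 example frame as in the answer above.
df = pd.DataFrame(np.arange(30).reshape(5, 6), columns=list('abcdef'))

# Select columns by position, then flatten column by column
# (order='F' walks down each column before moving to the next).
stacked = pd.Series(df.iloc[:, [0, 2, 4]].to_numpy().ravel(order='F'))

print(stacked.tolist())
```

Like the original approach, this discards the row and column labels and leaves a default integer index.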

Answer 2

Consider the DataFrame df

from string import ascii_uppercase
import pandas as pd
import numpy as np

df = pd.DataFrame(np.arange(150).reshape(-1, 15),
                  columns=list(ascii_uppercase[:15]))
df

Use np.r_ to build the array of positions needed for the slice you want

np.r_[0, 8:14]

array([ 0,  8,  9, 10, 11, 12, 13])

Then slice

df.iloc[:, np.r_[0, 8:14]]
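
Since the question works on one row at a time, the same position array can presumably be combined with a row position to get a single Series. A minimal sketch, assuming the 15-column frame from this answer and a hypothetical `row_count = 1` chosen only for illustration:

```python
from string import ascii_uppercase

import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(150).reshape(-1, 15),
                  columns=list(ascii_uppercase[:15]))

row_count = 1  # hypothetical row position, not taken from the question

# Position 0 followed by positions 8..13 (the end of the slice is exclusive).
cols = np.r_[0, 8:14]

# One row, selected columns: the result is a Series indexed by column name.
dfnow = df.iloc[row_count, cols]
print(dfnow)
```

The resulting Series keeps the column names as its index, so no separate concatenation step is needed.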

Answer 3

I think you can convert all the values to lists and then create a Series, but then the index is lost:

df = pd.DataFrame({'A':[1,2,3],
                   'B':[4,5,6],
                   'C':[7,8,9],
                   'D':[1,3,5],
                   'E':[5,3,6],
                   'F':[7,4,3]})

print (df)
   A  B  C  D  E  F
0  1  4  7  1  5  7
1  2  5  8  3  3  4
2  3  6  9  5  6  3

row_count = 1

print (df.iloc[row_count, range (2, 4)])
C    8
D    3
Name: 1, dtype: int64

dfnow = pd.Series([df.iloc[row_count,0]]  + df.iloc[row_count, range (2, 4)].tolist())
print (dfnow)
0    2
1    8
2    3
dtype: int64

Or you can use concat, in which case the index consists of the column names:

row_count = 1

a = df.iloc[row_count, range (2, 4)]
b = df.iloc[row_count, range (4, 6)]

print (a)
C    8
D    3
Name: 1, dtype: int64

print (b)
E    3
F    4
Name: 1, dtype: int64

print (pd.concat([a,b]))
C    8
D    3
E    3
F    4
Name: 1, dtype: int64

But if you need to add a scalar (a), it is a bit more complicated, because the scalar has to be wrapped in a Series first:

row_count = 1

a = pd.Series(df.iloc[row_count, 0], index=[df.columns[0]])
b = df.iloc[row_count, range (2, 4)]
c = df.iloc[row_count, range (4, 6)]

print (a)
A    2
dtype: int64

print (b)
C    8
D    3
Name: 1, dtype: int64

print (c)
E    3
F    4
Name: 1, dtype: int64

print (pd.concat([a,b,c]))
A    2
C    8
D    3
E    3
F    4
dtype: int64
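
If the goal is only to combine column 0 with a range of columns from one row, a single `iloc` call with a combined list of positions may be simpler than wrapping the scalar and concatenating. A minimal sketch, assuming the same six-column example frame; the name `positions` is introduced here purely for illustration:

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 5, 6],
                   'C': [7, 8, 9],
                   'D': [1, 3, 5],
                   'E': [5, 3, 6],
                   'F': [7, 4, 3]})

row_count = 1

# Column 0 plus columns 2..5, as one flat list of integer positions.
positions = [0] + list(range(2, 6))

# A single positional lookup returns a Series labelled by column name.
dfnow = df.iloc[row_count, positions]
print(dfnow)
```

This yields the same values and labels as the three-way concat above, except that the Series keeps the row label as its name.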

Embedding a range in a pandas Series
