Embedding a range in a pandas Series

Time: 2016-10-13 20:27:31

Tags: python pandas

I have a table with 14 columns, and I want to pull selected columns into a new dataframe. Say I want column 0, followed by columns 8-14.

  dfnow = pd.Series([df.iloc[row_count,0], \
                    df.iloc[row_count,8], \
                    df.iloc[row_count,9], \
                    ....

This works, but it looks clumsy.

I would like to write:

  dfnow = pd.Series([df.iloc[row_count,0], \
          df.iloc[row_count, range (8, 14)]])

But this throws a ValueError: Wrong number of items passed

From the answer linked below, I know I could create two separate Series and concatenate them, but that seems a bit suboptimal.

Adding pandas Series with different indices without getting NaNs

3 Answers:

Answer 0: (score: 1)

Is this what you want?

In [51]: import numpy as np, pandas as pd

In [52]: df = pd.DataFrame(np.arange(30).reshape(5,6), columns=list('abcdef'))

In [53]: df
Out[53]:
    a   b   c   d   e   f
0   0   1   2   3   4   5
1   6   7   8   9  10  11
2  12  13  14  15  16  17
3  18  19  20  21  22  23
4  24  25  26  27  28  29

In [54]: df.iloc[:, [0,2,4]]
Out[54]:
    a   c   e
0   0   2   4
1   6   8  10
2  12  14  16
3  18  20  22
4  24  26  28

To concatenate (reshape) columns 0, 2 and 4 into a single Series:

In [68]: df.iloc[:, [0,2,4]].values.T.reshape(-1,)
Out[68]: array([ 0,  6, 12, 18, 24,  2,  8, 14, 20, 26,  4, 10, 16, 22, 28])

In [69]: pd.Series(df.iloc[:, [0,2,4]].values.T.reshape(-1,))
Out[69]:
0      0
1      6
2     12
3     18
4     24
5      2
6      8
7     14
8     20
9     26
10     4
11    10
12    16
13    22
14    28
dtype: int32
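The column-by-column flattening above can also be written without the explicit transpose. The following is only a sketch, assuming a pandas version that provides `DataFrame.to_numpy` (0.24 or later); the name `flat` is introduced here for illustration and is not part of the original answer.

```python
import numpy as np
import pandas as pd

# Same toy frame as in the answer above.
df = pd.DataFrame(np.arange(30).reshape(5, 6), columns=list("abcdef"))

# Pick columns by position, then flatten in column-major ("F") order so that
# all of column 0 comes first, then column 2, then column 4.
flat = pd.Series(df.iloc[:, [0, 2, 4]].to_numpy().ravel(order="F"))
print(flat)
```

Note that the resulting Series has a fresh integer index; the original row and column labels are not carried over.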

Answer 1: (score: 1)

Consider df:

from string import ascii_uppercase
import pandas as pd
import numpy as np

df = pd.DataFrame(np.arange(150).reshape(-1, 15),
                  columns=list(ascii_uppercase[:15]))
df


Use np.r_ to build the array of positions for the slice you want:

np.r_[0, 8:14]

array([ 0,  8,  9, 10, 11, 12, 13])

Then slice:

df.iloc[:, np.r_[0, 8:14]]

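Since the question asks for a Series built from a single row, the same np.r_ positions can be applied to one row instead of all rows. This is a minimal sketch under that reading of the question; the value `row_count = 2` and the name `cols` are chosen here only for illustration.

```python
from string import ascii_uppercase

import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(150).reshape(-1, 15),
                  columns=list(ascii_uppercase[:15]))

row_count = 2            # hypothetical row position, for illustration only
cols = np.r_[0, 8:14]    # positions 0, 8, 9, 10, 11, 12, 13

# An integer row position plus an array of column positions yields a Series
# indexed by the selected column labels.
dfnow = df.iloc[row_count, cols]
print(dfnow)
```

Here the slice `8:14` stops before position 14, exactly as `range(8, 14)` does in the question.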

Answer 2: (score: 0)

I think you can convert all the values to lists and then create the Series, but then you lose the index:

import pandas as pd

df = pd.DataFrame({'A':[1,2,3],
                   'B':[4,5,6],
                   'C':[7,8,9],
                   'D':[1,3,5],
                   'E':[5,3,6],
                   'F':[7,4,3]})

print (df)
   A  B  C  D  E  F
0  1  4  7  1  5  7
1  2  5  8  3  3  4
2  3  6  9  5  6  3

row_count = 1

print (df.iloc[row_count, range (2, 4)])
C    8
D    3
Name: 1, dtype: int64

dfnow = pd.Series([df.iloc[row_count,0]]  + df.iloc[row_count, range (2, 4)].tolist())
print (dfnow)
0    2
1    8
2    3
dtype: int64

Or you can use concat, in which case the index consists of the column names:

row_count = 1

a = df.iloc[row_count, range (2, 4)]
b = df.iloc[row_count, range (4, 6)]

print (a)
C    8
D    3
Name: 1, dtype: int64

print (b)
E    3
F    4
Name: 1, dtype: int64

print (pd.concat([a,b]))
C    8
D    3
E    3
F    4
Name: 1, dtype: int64

But if you need to add a scalar (a), it is a bit more complicated, because the scalar has to be wrapped in a Series first:

row_count = 1

a = pd.Series(df.iloc[row_count, 0], index=[df.columns[0]])
b = df.iloc[row_count, range (2, 4)]
c = df.iloc[row_count, range (4, 6)]

print (a)
A    2
dtype: int64

print (b)
C    8
D    3
Name: 1, dtype: int64

print (c)
E    3
F    4
Name: 1, dtype: int64

print (pd.concat([a,b,c]))
A    2
C    8
D    3
E    3
F    4
dtype: int64
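For comparison, the same values and labels can be obtained in one step by passing all the wanted column positions to iloc as a single list. This is a sketch rather than part of the original answer; the name `via_concat` is introduced here only to check that both routes agree.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 5, 6],
                   'C': [7, 8, 9],
                   'D': [1, 3, 5],
                   'E': [5, 3, 6],
                   'F': [7, 4, 3]})

row_count = 1

# One positional selection: column 0 plus columns 2 through 5.
dfnow = df.iloc[row_count, [0] + list(range(2, 6))]
print(dfnow)

# The concat route from the answer above, for comparison.
via_concat = pd.concat([
    pd.Series(df.iloc[row_count, 0], index=[df.columns[0]]),
    df.iloc[row_count, range(2, 4)],
    df.iloc[row_count, range(4, 6)],
])
print(dfnow.tolist() == via_concat.tolist())
```

One difference is that the single selection keeps the row label as the Series name, while the concat result has no name because its pieces do not share one.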