I have a table with 14 columns and I want to pull selected columns into a new DataFrame. Say I want column 0 and then columns 8-14.
dfnow = pd.Series([df.iloc[row_count,0], \
df.iloc[row_count,8], \
df.iloc[row_count,9], \
....
This works, but it looks clunky.
I would like to write:
dfnow = pd.Series([df.iloc[row_count,0], \
df.iloc[row_count, range (8, 14)]])
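But this throws a ValueError: Wrong number of items passed.
Now, from the answer below, I know I can create two separate Series and concatenate them, but that seems a bit suboptimal.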
Adding pandas Series with different indices without getting NaNs
Answer 0 (score: 1)
Is this what you want?
In [52]: df = pd.DataFrame(np.arange(30).reshape(5,6), columns=list('abcdef'))
In [53]: df
Out[53]:
a b c d e f
0 0 1 2 3 4 5
1 6 7 8 9 10 11
2 12 13 14 15 16 17
3 18 19 20 21 22 23
4 24 25 26 27 28 29
In [54]: df.iloc[:, [0,2,4]]
Out[54]:
a c e
0 0 2 4
1 6 8 10
2 12 14 16
3 18 20 22
4 24 26 28
Concatenating (reshaping) columns 0, 2 and 4 into a single Series:
In [68]: df.iloc[:, [0,2,4]].values.T.reshape(-1,)
Out[68]: array([ 0, 6, 12, 18, 24, 2, 8, 14, 20, 26, 4, 10, 16, 22, 28])
In [69]: pd.Series(df.iloc[:, [0,2,4]].values.T.reshape(-1,))
Out[69]:
0 0
1 6
2 12
3 18
4 24
5 2
6 8
7 14
8 20
9 26
10 4
11 10
12 16
13 22
14 28
dtype: int32
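As a minimal sketch of the same reshaping, assuming a reasonably recent pandas where to_numpy() is available: selecting the columns by position with iloc and flattening in column-major order should give the same sequence as the transpose-and-reshape above. The variable name flat is only illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(30).reshape(5, 6), columns=list("abcdef"))

# Select columns 0, 2 and 4 by position, then flatten column by column.
# order="F" walks the 2-D array in column-major order, so all of column
# 'a' comes first, then all of 'c', then all of 'e'.
flat = pd.Series(df.iloc[:, [0, 2, 4]].to_numpy().ravel(order="F"))

print(flat.tolist())
# [0, 6, 12, 18, 24, 2, 8, 14, 20, 26, 4, 10, 16, 22, 28]
```

Note that the resulting Series has a fresh 0..14 index, so the original row and column labels are lost, as in the output above.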
Answer 1 (score: 1)
Consider the DataFrame df:
from string import ascii_uppercase
import pandas as pd
import numpy as np
df = pd.DataFrame(np.arange(150).reshape(-1, 15),
columns=list(ascii_uppercase[:15]))
df
Use np.r_ to build the array of positions for the slices you want:
np.r_[0, 8:14]
array([ 0, 8, 9, 10, 11, 12, 13])
Then slice:
df.iloc[:, np.r_[0, 8:14]]
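The question selects from a single row rather than from all rows. A hedged sketch of how the same np.r_ positions might be applied to one row follows; the value of row_count here is just an assumed example. Indexing a single row with an array of column positions returns a Series whose index is the column names.

```python
from string import ascii_uppercase

import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(150).reshape(-1, 15),
                  columns=list(ascii_uppercase[:15]))

row_count = 2            # assumed row position, for illustration only
cols = np.r_[0, 8:14]    # positions 0, 8, 9, ..., 13 (the slice stop is exclusive)

# One row, several column positions -> a Series labelled by column name
dfnow = df.iloc[row_count, cols]

print(dfnow.tolist())       # [30, 38, 39, 40, 41, 42, 43]
print(list(dfnow.index))    # ['A', 'I', 'J', 'K', 'L', 'M', 'N']
```

Because the stop of a slice inside np.r_ is exclusive, 8:14 ends at position 13; to include position 14 as well, the slice would need to be 8:15.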
Answer 2 (score: 0)
I think you can convert all the values to lists and then create a Series, but then you lose the index:
df = pd.DataFrame({'A':[1,2,3],
'B':[4,5,6],
'C':[7,8,9],
'D':[1,3,5],
'E':[5,3,6],
'F':[7,4,3]})
print (df)
A B C D E F
0 1 4 7 1 5 7
1 2 5 8 3 3 4
2 3 6 9 5 6 3
row_count = 1
print (df.iloc[row_count, range (2, 4)])
C 8
D 3
Name: 1, dtype: int64
dfnow = pd.Series([df.iloc[row_count,0]] + df.iloc[row_count, range (2, 4)].tolist())
print (dfnow)
0 2
1 8
2 3
dtype: int64
Or you can use concat, in which case the index is the column names:
row_count = 1
a = df.iloc[row_count, range (2, 4)]
b = df.iloc[row_count, range (4, 6)]
print (a)
C 8
D 3
Name: 1, dtype: int64
print (b)
E 3
F 4
Name: 1, dtype: int64
print (pd.concat([a,b]))
C 8
D 3
E 3
F 4
Name: 1, dtype: int64
But if you need to add a scalar (a), it is a bit more complicated, because the scalar first has to be wrapped in a Series:
row_count = 1
a = pd.Series(df.iloc[row_count, 0], index=[df.columns[0]])
b = df.iloc[row_count, range (2, 4)]
c = df.iloc[row_count, range (4, 6)]
print (a)
A 2
dtype: int64
print (b)
C 8
D 3
Name: 1, dtype: int64
print (c)
E 3
F 4
Name: 1, dtype: int64
print (pd.concat([a,b,c]))
A 2
C 8
D 3
E 3
F 4
dtype: int64
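One possible way to avoid wrapping the scalar by hand, sketched under the assumption that the same df and row_count as above are used: passing a one-element list of positions to iloc returns a one-element Series instead of a scalar, so the column label is kept and the piece can go straight into concat. The name result is only illustrative.

```python
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3],
                   'B': [4, 5, 6],
                   'C': [7, 8, 9],
                   'D': [1, 3, 5],
                   'E': [5, 3, 6],
                   'F': [7, 4, 3]})
row_count = 1

# [0] (a list) instead of 0 (a scalar) keeps the result a Series labelled 'A'
a = df.iloc[row_count, [0]]
b = df.iloc[row_count, range(2, 4)]
c = df.iloc[row_count, range(4, 6)]

result = pd.concat([a, b, c])

print(result.tolist())       # [2, 8, 3, 3, 4]
print(list(result.index))    # ['A', 'C', 'D', 'E', 'F']
```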