If I have an R data.frame df with
colnames(df)
[1] "a" "b" "c" "d" "e"
I can select columns "a", "c", "d", and "e" like this:
df[ , c(1, 3:5)]
Is there a simple equivalent in pandas? I know I can use
df.loc[:, ['a', 'c', 'd', 'e']]
This works for a handful of columns.
With many column ranges, though, the R code stays simple:
df2[ , c(1:10, 25:30, 40, 50:100)]
Answer 0 (score: 7)
Update: there is no need for numpy.hstack; you can call numpy.r_ directly, as shown here.
Use iloc + numpy.r_:
In [19]: from numpy import r_; from numpy.random import randn; from pandas import DataFrame
In [20]: df = DataFrame(randn(10, 3), columns=list('abc'))
In [21]: df
Out[21]:
          a         b         c
0  0.228163 -1.311485 -1.335604
1  0.292547 -1.636901  0.001765
2  0.744605 -0.325580  0.205003
3 -0.580471 -0.531553 -0.740697
4  0.250574  1.076019 -0.594915
5 -0.148449  0.076951 -0.653595
6 -1.065314 -0.166018 -1.471532
7  1.133336 -0.529738 -1.213841
8 -1.715281 -2.058831  0.113237
9 -0.382412 -0.072540  0.294853
[10 rows x 3 columns]
In [22]: df.iloc[:, r_[:2]]
Out[22]:
          a         b
0  0.228163 -1.311485
1  0.292547 -1.636901
2  0.744605 -0.325580
3 -0.580471 -0.531553
4  0.250574  1.076019
5 -0.148449  0.076951
6 -1.065314 -0.166018
7  1.133336 -0.529738
8 -1.715281 -2.058831
9 -0.382412 -0.072540
[10 rows x 2 columns]
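To make the remark about numpy.hstack concrete, here is a minimal sketch of what numpy.r_ returns and how it compares with stacking ranges by hand. The variable names are only illustrative, and the particular ranges are an arbitrary choice for the example.

```python
import numpy as np

# np.r_ turns slice notation into one flat array of integer positions.
via_r = np.r_[:2, 4:6]

# The longer spelling: build each range separately, then stack them.
via_hstack = np.hstack([np.arange(0, 2), np.arange(4, 6)])

print(via_r.tolist())                      # [0, 1, 4, 5]
print(np.array_equal(via_r, via_hstack))   # True
```

Either array can be passed to iloc as the column indexer; r_ just saves the explicit arange calls.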
To concatenate several integer ranges, use numpy.r_:
In [35]: df = DataFrame(randn(10, 6), columns=list('abcdef'))
In [36]: df.iloc[:, r_[:2, 2:df.columns.size:2]]
Out[36]:
          a         b         c         e
0 -1.358623 -0.622909  0.025609 -1.166303
1  0.527027  0.310530  2.892384  0.190451
2 -0.251138 -1.246113  0.738264  0.062078
3 -1.716028  0.419139  0.060225 -1.191527
4 -1.308635  0.045396 -0.599367 -0.202491
5 -0.620343  0.796364 -0.008802  0.160020
6  0.199739  0.111816 -0.278119  1.051317
7 -0.311206  0.090348 -0.237887  0.958215
8  0.363161  2.449031  1.023352  0.743853
9  0.039451 -0.855733 -0.836921 -0.835078
[10 rows x 4 columns]
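Applied to the longer R example from the question, the same idea might look like the sketch below. The 100-column frame and its column names c1..c100 are made up for illustration; only the index arithmetic matters. R positions are 1-based with inclusive ends, while Python slices are 0-based with exclusive ends, so each start drops by one and each end stays the same.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df2: 100 columns named c1..c100, placeholder values.
df2 = pd.DataFrame(np.zeros((3, 100)), columns=[f"c{i}" for i in range(1, 101)])

# R:      df2[ , c(1:10, 25:30, 40, 50:100)]   (1-based, end-inclusive)
# pandas: subtract 1 from each start; keep each end, since Python excludes it.
subset = df2.iloc[:, np.r_[0:10, 24:30, 39, 49:100]]

print(subset.shape)  # (3, 68)
```

The selected columns are c1..c10, c25..c30, c40, and c50..c100, which is 10 + 6 + 1 + 51 = 68 columns.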
Answer 1 (score: 0)
You can now use a similar syntax in Python:
>>> from datar.all import c, f, select
>>> from datar.datasets import starwars
>>>
>>> starwars
name height mass hair_color skin_color eye_color birth_year sex gender homeworld species
<object> <float64> <float64> <object> <object> <object> <float64> <object> <object> <object> <object>
0 Luke Skywalker 172.0 77.0 blond fair blue 19.0 male masculine Tatooine Human
1 C-3PO 167.0 75.0 NaN gold yellow 112.0 none masculine Tatooine Droid
2 R2-D2 96.0 32.0 NaN white, blue red 33.0 none masculine Naboo Droid
3 Darth Vader 202.0 136.0 none white yellow 41.9 male masculine Tatooine Human
.. ... ... ... ... ... ... ... ... ... ... ...
4 Leia Organa 150.0 49.0 brown light brown 19.0 female feminine Alderaan Human
82 Rey NaN NaN brown light hazel NaN female feminine NaN Human
83 Poe Dameron NaN NaN brown light brown NaN male masculine NaN Human
84 BB8 NaN NaN none none black NaN none masculine NaN Droid
85 Captain Phasma NaN NaN unknown unknown unknown NaN NaN NaN NaN NaN
86 Padmé Amidala 165.0 45.0 brown light brown 46.0 female feminine Naboo Human
[87 rows x 11 columns]
>>>
>>> starwars >> select(c(1, f[3:5], 7))
name mass hair_color skin_color birth_year
<object> <float64> <object> <object> <float64>
0 Luke Skywalker 77.0 blond fair 19.0
1 C-3PO 75.0 NaN gold 112.0
2 R2-D2 32.0 NaN white, blue 33.0
3 Darth Vader 136.0 none white 41.9
.. ... ... ... ... ...
4 Leia Organa 49.0 brown light 19.0
82 Rey NaN brown light NaN
83 Poe Dameron NaN brown light NaN
84 BB8 NaN none none NaN
85 Captain Phasma NaN unknown unknown NaN
86 Padmé Amidala 45.0 brown light 46.0
[87 rows x 5 columns]
>>>
>>> # even with column names
>>> starwars >> select(c(f.name, f[f.mass:f.skin_color], f.birth_year))
name mass hair_color skin_color birth_year
<object> <float64> <object> <object> <float64>
0 Luke Skywalker 77.0 blond fair 19.0
1 C-3PO 75.0 NaN gold 112.0
2 R2-D2 32.0 NaN white, blue 33.0
3 Darth Vader 136.0 none white 41.9
.. ... ... ... ... ...
4 Leia Organa 49.0 brown light 19.0
82 Rey NaN brown light NaN
83 Poe Dameron NaN brown light NaN
84 BB8 NaN none none NaN
85 Captain Phasma NaN unknown unknown NaN
86 Padmé Amidala 45.0 brown light 46.0
[87 rows x 5 columns]
I am the author of the datar package. If you have any questions, feel free to open an issue.
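For comparison, the name-based range selection can also be approximated in plain pandas, without datar, by resolving labels to positions and passing them to numpy.r_. This is a sketch of a different technique, not how datar works internally. The frame below reuses a few of the starwars column names, but its values are placeholders.

```python
import numpy as np
import pandas as pd

# Toy frame: only the column names mirror the starwars output; values are placeholders.
cols = ["name", "height", "mass", "hair_color", "skin_color", "eye_color", "birth_year"]
df = pd.DataFrame(np.zeros((2, len(cols))), columns=cols)

# Resolve labels to integer positions. The end label should be included,
# so add 1 to its position before using it as a slice stop.
start = df.columns.get_loc("mass")
stop = df.columns.get_loc("skin_color") + 1
positions = np.r_[df.columns.get_loc("name"), start:stop, df.columns.get_loc("birth_year")]

subset = df.iloc[:, positions]
print(list(subset.columns))
# ['name', 'mass', 'hair_color', 'skin_color', 'birth_year']
```

This assumes the column labels are unique, so that get_loc returns a single integer for each label.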