Question

I have defined a dataset:

df=pd.DataFrame(list(xx))

Then I filtered the data by sex:

df=df[df["sex"]=="1"]

Next I want to iterate over all the remaining rows:

row,col=df.shape
for i in range(row):
    print(df["name"][i])  # error

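While debugging, I found that df still carries the old row index, because many non-matching rows were removed. For example, the row at df["sex"][1] was removed, so the loop fails when it reaches that index.

How can I renumber the DataFrame's rows? Thanks very much!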

Answer 1

Never use this construct:

for i in range(nrows):
    do_stuff(df[column][i])

It is inefficient. You almost never want to use a for loop on a DataFrame, but if you must, use pd.DataFrame.itertuples:

>>> df = pd.DataFrame({'a':[1,2,3],'b':[3,4,5]})
>>> for row in df.itertuples():
...     print("the index", row.Index)
...     print("sum of row", row.a + row.b)
...
the index 0
sum of row 4
the index 1
sum of row 6
the index 2
sum of row 8
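
The point that a for loop is rarely needed can be illustrated with the same frame. This is only a minimal sketch and not part of the original answer; the names row_sums and loop_sums are made up for illustration.

```python
import pandas as pd

df = pd.DataFrame({'a': [1, 2, 3], 'b': [3, 4, 5]})

# Vectorized: add the two columns in one operation, with no Python-level loop.
row_sums = df['a'] + df['b']

# The same values computed row by row with itertuples, for comparison.
loop_sums = [row.a + row.b for row in df.itertuples()]

print(row_sums.tolist())               # [4, 6, 8]
print(row_sums.tolist() == loop_sums)  # True
```

Both approaches give the same numbers; the vectorized form is usually the one to reach for first.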

Note that it no longer matters whether the index has changed:

>>> df = df.iloc[[2,0,1]]
>>> df
   a  b
2  3  5
0  1  3
1  2  4
>>> for row in df.itertuples():
...     print("the index", row.Index)
...     print("sum of row", row.a + row.b)
...
the index 2
sum of row 8
the index 0
sum of row 4
the index 1
sum of row 6
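
To see why the loop in the question breaks, it may help to compare label-based and positional access on a filtered frame. The data below is hypothetical (the question does not show its rows), and the variable names are assumptions made for this sketch.

```python
import pandas as pd

# Hypothetical data standing in for the question's frame.
df = pd.DataFrame({'name': ['Ann', 'Bob', 'Cat'], 'sex': ['1', '2', '1']})
filtered = df[df['sex'] == '1']

# Filtering keeps the original row labels, so label 1 is now missing.
print(filtered.index.tolist())  # [0, 2]

# filtered['name'][1] is a label lookup, and label 1 was filtered out.
try:
    filtered['name'][1]
    lookup_failed = False
except KeyError:
    lookup_failed = True
print(lookup_failed)  # True

# Positional access with .iloc ignores the labels entirely.
names_by_position = [filtered['name'].iloc[i] for i in range(len(filtered))]
print(names_by_position)  # ['Ann', 'Cat']
```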

Finally, you can always just reset the index. Suppose a row has been dropped:

>>> df.drop(0, axis=0, inplace=True)
>>> df
   a  b
2  3  5
1  2  4

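To renumber the rows, just use reset_index:

>>> df.reset_index()
   index  a  b
0      2  3  5
1      1  2  4

And use the drop parameter to keep the old index from being added as a column:

>>> df.reset_index(drop=True)
   a  b
0  3  5
1  2  4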
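
Applied to the question's own code, resetting the index might look like the sketch below. The records are hypothetical, since the question only shows list(xx), and the loop is kept in the question's style purely to show that it no longer fails.

```python
import pandas as pd

# Hypothetical records; the question builds its frame from list(xx).
records = [
    {'name': 'Ann', 'sex': '1'},
    {'name': 'Bob', 'sex': '2'},
    {'name': 'Cat', 'sex': '1'},
]
df = pd.DataFrame(records)

# Filter, then renumber the rows 0..n-1 and discard the old labels.
df = df[df['sex'] == '1'].reset_index(drop=True)

# The question's loop now works because the labels match the positions again.
names = [df['name'][i] for i in range(df.shape[0])]
print(names)  # ['Ann', 'Cat']
```

If the loop is only there to read values, itertuples or a vectorized operation avoids the dependence on the index altogether.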
