Given a DataFrame with many columns, is there a way to loop through only two of them?

Time: 2017-03-01 06:07:09

Tags: python loops pandas dataframe

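Given a DataFrame with the columns id, name, rating, purchase, and spending, suppose I want to loop over the rating and spending columns at the same time, that is, within the same loop. How would I do that? One idea I had was to create a smaller DataFrame containing only those two columns and loop through it, but I think it would be good to know how to take the DataFrame as a whole and loop over specific columns.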

3 answers:

Answer 0 (score: 3)

Use itertuples on the two-column subset:

for r, s in df[['rating','spending']].itertuples(index=False):
    print(r, s)

7 5
8 3
9 6

Borrowing @jezrael's setup:

import pandas as pd

df = pd.DataFrame({'id':[1,2,3],
                   'name':[4,5,6],
                   'rating':[7,8,9],
                   'purchase':[1,3,5],
                   'spending':[5,3,6]})

print (df)
   id  name  purchase  rating  spending
0   1     4         1       7         5
1   2     5         3       8         3
2   3     6         5       9         6
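
As a self-contained sketch of the same idea (the `pairs` list is only there for illustration and is not part of the original answer), passing `name=None` makes `itertuples` yield plain tuples instead of named tuples, which is enough when each row is unpacked straight away:

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3],
                   'name': [4, 5, 6],
                   'rating': [7, 8, 9],
                   'purchase': [1, 3, 5],
                   'spending': [5, 3, 6]})

# Select the two columns first, then iterate over plain tuples.
# index=False leaves the index out of each tuple;
# name=None returns regular tuples rather than namedtuples.
pairs = []
for r, s in df[['rating', 'spending']].itertuples(index=False, name=None):
    pairs.append((r, s))

print(pairs)
```

Keeping the default `name` is the better choice when the loop body reads fields by attribute, such as `row.rating`.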

Answer 1 (score: 2)

You can use iterrows, either on the whole DataFrame or on a subset of its columns.

With a Series as output:

import pandas as pd

df = pd.DataFrame({'id':[1,2,3],
                   'name':[4,5,6],
                   'rating':[7,8,9],
                   'purchase':[1,3,5],
                   'spending':[5,3,6]})

print (df)
   id  name  purchase  rating  spending
0   1     4         1       7         5
1   2     5         3       8         3
2   3     6         5       9         6

for idx, row in df.iterrows():
    print (row[['rating','spending']])

rating      7
spending    5
Name: 0, dtype: int64
rating      8
spending    3
Name: 1, dtype: int64
rating      9
spending    6
Name: 2, dtype: int64

for idx, row in df[['rating','spending']].iterrows():
    print (row)

rating      7
spending    5
Name: 0, dtype: int64
rating      8
spending    3
Name: 1, dtype: int64
rating      9
spending    6
Name: 2, dtype: int64

For scalar output, use iterrows or itertuples:

for idx, row in df.iterrows():
    print (row["rating"])
    print (row["spending"])

7
5
8
3
9
6

for row in df.itertuples():
    print (row.rating)
    print (row.spending)

7
5
8
3
9
6
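
One difference between the two that does not show up with the all-integer data above: `iterrows` builds a Series for every row, so values from columns of different dtypes are upcast to a common dtype, while `itertuples` keeps each column's own type. A minimal sketch, assuming a hypothetical DataFrame whose spending column holds floats:

```python
import pandas as pd

# Hypothetical data: an integer column next to a float column.
df = pd.DataFrame({'rating': [7, 8, 9],
                   'spending': [5.5, 3.0, 6.25]})

# iterrows packs each row into one Series, so both values share a
# single dtype and the integer rating is upcast to a float.
_, first_row = next(df.iterrows())
rating_from_iterrows = first_row['rating']

# itertuples reads each column separately, so rating stays an integer.
first_tuple = next(df.itertuples(index=False))
rating_from_itertuples = first_tuple.rating

print(type(rating_from_iterrows).__name__, rating_from_iterrows)
print(type(rating_from_itertuples).__name__, rating_from_itertuples)
```

If the loop depends on the values keeping their original types, `itertuples` is probably the safer of the two.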

Answer 2 (score: 1)

This is the Pythonic way to loop through two columns:

for rating, spending in zip(df["rating"],df["spending"]):
    print (rating, spending)

If you are using Python 2, use izip, which is lazy like Python 3's zip:

from itertools import izip
for rating, spending in izip(df["rating"],df["spending"]):
    print (rating, spending)

This is the pandas way of looping:

for _,row in df.iterrows():
    print (row["rating"],row["spending"])
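
The zip approach also extends to a list of column names, which may be handy when the columns to walk are chosen at run time. A rough sketch, where `cols` and `totals` are illustrative names rather than anything from the answers above:

```python
import pandas as pd

df = pd.DataFrame({'id': [1, 2, 3],
                   'name': [4, 5, 6],
                   'rating': [7, 8, 9],
                   'purchase': [1, 3, 5],
                   'spending': [5, 3, 6]})

cols = ['rating', 'spending']  # hypothetical list of columns to walk together

# zip(*...) lines up the values of each selected column row by row,
# so each iteration yields one value per column in cols.
totals = []
for rating, spending in zip(*(df[c] for c in cols)):
    totals.append(rating + spending)

print(totals)
```

The number of loop variables has to match the number of names in `cols`; with a variable-length list, iterate with a single variable and treat it as a tuple.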