Is there a way to iterate over a list and return variables named after its contents?

Time: 2017-11-10 02:44:33

Tags: python pandas

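I have a pandas DataFrame organized by date that I'm trying to split by year (the year is in a column named 'year'). I'd like to return one DataFrame per year, with a name like "df19XX".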

I was hoping to write a "for" loop that can handle this... something like...

...that would return three DataFrames named df1980, df1981 and df1982.

Thanks!

2 answers:

Answer 0 (score: 2)

You can iterate over the groupby:

In [11]: df = pd.DataFrame({"date": pd.date_range("2012-12-28", "2013-01-03"), "A": np.random.rand(7)})

In [12]: df
Out[12]:
          A       date
0  0.434715 2012-12-28
1  0.208877 2012-12-29
2  0.912897 2012-12-30
3  0.226368 2012-12-31
4  0.100489 2013-01-01
5  0.474088 2013-01-02
6  0.348368 2013-01-03

In [13]: g = df.groupby(df.date.dt.year)

In [14]: for k, v in g:
    ...:     print(k)
    ...:     print(v)
    ...:     print()
    ...:
2012
          A       date
0  0.434715 2012-12-28
1  0.208877 2012-12-29
2  0.912897 2012-12-30
3  0.226368 2012-12-31

2013
          A       date
4  0.100489 2013-01-01
5  0.474088 2013-01-02
6  0.348368 2013-01-03

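I would strongly argue that it is preferable to simply keep the groups in a dict rather than create variables by mucking about with the locals() dictionary (I claim that using locals() this way is not "Pythonic"):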

In [14]: {k: grp for k, grp in g}
Out[14]:
{2012:           A       date
 0  0.434715 2012-12-28
 1  0.208877 2012-12-29
 2  0.912897 2012-12-30
 3  0.226368 2012-12-31, 2013:           A       date
 4  0.100489 2013-01-01
 5  0.474088 2013-01-02
 6  0.348368 2013-01-03}
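
As a minimal sketch of how the dict approach maps onto the original question: the frame below is a made-up stand-in for the asker's data (only the 'year' column name comes from the question; the 'value' column and its numbers are assumptions), and the keys are built to look like the variable names that were asked for.

```python
import pandas as pd

# Hypothetical stand-in for the asker's data: a 'year' column plus one value column.
df = pd.DataFrame({
    "year": [1980, 1980, 1981, 1982, 1982],
    "value": [1.0, 2.0, 3.0, 4.0, 5.0],
})

# One entry per year, keyed by the name the asker wanted (e.g. "df1980").
frames = {"df{0}".format(year): group for year, group in df.groupby("year")}

print(sorted(frames))     # ['df1980', 'df1981', 'df1982']
print(frames["df1981"])   # the rows where year == 1981
```

You then write frames["df1980"] where you would have written df1980, and the set of names can be looped over or extended without touching any namespace.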

Though you might consider computing this on the fly (rather than storing it in a dict or a variable). You can use get_group:

In [15]: g.get_group(2012)
Out[15]:
          A       date
0  0.865239 2012-12-28
1  0.019071 2012-12-29
2  0.362088 2012-12-30
3  0.031861 2012-12-31
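
A small sketch of the on-the-fly idea, assuming you also want a graceful result for a year that is not in the data (get_group raises KeyError in that case). The helper name frame_for is hypothetical, and the values in column A are fixed here instead of random so the result is reproducible.

```python
import pandas as pd

df = pd.DataFrame({
    "date": pd.date_range("2012-12-28", "2013-01-03"),
    "A": [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7],  # fixed values instead of random ones
})

grouped = df.groupby(df["date"].dt.year)

def frame_for(year):
    """Return the rows for `year`, or an empty frame if that year is absent."""
    if year in grouped.groups:
        return grouped.get_group(year)
    return df.iloc[0:0]  # same columns, zero rows

print(len(frame_for(2012)))  # 4
print(len(frame_for(2013)))  # 3
print(len(frame_for(1999)))  # 0
```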

Answer 1 (score: 2)

Something like this? Also using @Andy's df

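variables = locals()  # at module level / in an interactive session this is the global namespace
for i in [2012, 2013]:
    # writing into the namespace dict creates the names df2012 and df2013
    variables["df{0}".format(i)] = df.loc[df.date.dt.year == i]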
df2012
Out[118]: 
          A       date
0  0.881468 2012-12-28
1  0.237672 2012-12-29
2  0.992287 2012-12-30
3  0.194288 2012-12-31
df2013
Out[119]: 
          A       date
4  0.151854 2013-01-01
5  0.855312 2013-01-02
6  0.534075 2013-01-03
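
One caveat worth illustrating: as far as I know, writing into locals() only creates usable names at module level (or in an interactive session), where it is the same dict as globals(). Inside a function the write does not create a local variable. The sketch below assumes a frame shaped like @Andy's df, with fixed integers in column A; both function names are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "date": pd.date_range("2012-12-28", "2013-01-03"),
    "A": range(7),  # fixed values instead of random ones
})

def split_inside_function(frame):
    variables = locals()
    for year in [2012, 2013]:
        variables["df{0}".format(year)] = frame.loc[frame.date.dt.year == year]
    try:
        return df2012  # looked up as a global; the locals() write did not create it
    except NameError:
        return None

def split_to_dict(frame):
    # The dict version behaves the same in any scope.
    return {year: frame.loc[frame.date.dt.year == year] for year in [2012, 2013]}

print(split_inside_function(df))     # None
print(len(split_to_dict(df)[2012]))  # 4
```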