Importing a CSV and analyzing it in Python

Time: 2012-11-20 21:34:45

Tags: python csv

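I'm looking for help with a small project. I'm trying to learn Python and I'm completely lost on one problem. Let me explain.

I have a CSV file containing Apple share prices. So far I can import it into Python with the csv module, but I need to analyze the data, generate monthly averages, and determine the best and worst 6 months. My CSV columns are Date, Price.

Any help is much appreciated.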

"Date","Open","High","Low","Close","Volume","Adj Close"
"2012-11-14",660.66,662.18,650.5,652.55,1668400,652.55
"2012-11-13",663,667.6,658.23,659.05,1594200,659.05
"2012-11-12",663.75,669.8,660.87,665.9,1405900,665.9
"2012-11-09",654.65,668.34,650.3,663.03,3114100,663.03
"2012-11-08",670.2,671.49,651.23,652.29,2597000,652.29
"2012-11-07",675,678.23,666.49,667.12,2232300,667.12
"2012-11-06",685.48,686.5,677.55,681.72,1582800,681.72
"2012-11-05",684.5,686.86,675.56,682.96,1635900,682.96
"2012-11-02",694.79,695.55,687.37,687.92,2324400,687.92
"2012-11-01",679.5,690.9,678.72,687.59,2050100,687.59
"2012-10-31",679.86,681,675,680.3,1537000,680.3
"2012-10-26",676.5,683.03,671.2,675.15,1950800,675.15
"2012-10-25",680,682,673.51,677.76,2401100,677.76
"2012-10-24",686.8,687,675.27,677.3,2496500,677.3

etc.

2 answers:

Answer 0 (score: 2)

With pandas, this would be:

In [28]: df = pd.read_csv('my_data.csv', parse_dates=True, index_col=0, sep=',')

In [29]: df
Out[29]: 
              Open    High     Low   Close   Volume  Adj Close
Date                                                          
2012-11-14  660.66  662.18  650.50  652.55  1668400     652.55
2012-11-13  663.00  667.60  658.23  659.05  1594200     659.05
2012-11-12  663.75  669.80  660.87  665.90  1405900     665.90
2012-11-09  654.65  668.34  650.30  663.03  3114100     663.03
2012-11-08  670.20  671.49  651.23  652.29  2597000     652.29
2012-11-07  675.00  678.23  666.49  667.12  2232300     667.12
2012-11-06  685.48  686.50  677.55  681.72  1582800     681.72
2012-11-05  684.50  686.86  675.56  682.96  1635900     682.96
2012-11-02  694.79  695.55  687.37  687.92  2324400     687.92
2012-11-01  679.50  690.90  678.72  687.59  2050100     687.59
2012-10-31  679.86  681.00  675.00  680.30  1537000     680.30
2012-10-26  676.50  683.03  671.20  675.15  1950800     675.15
2012-10-25  680.00  682.00  673.51  677.76  2401100     677.76
2012-10-24  686.80  687.00  675.27  677.30  2496500     677.30

In [30]: monthly = df.resample('1M').mean()

In [31]: monthly
Out[31]: 
               Open      High      Low     Close   Volume  Adj Close
Date                                                                
2012-10-31  680.790  683.2575  673.745  677.6275  2096350   677.6275
2012-11-30  673.153  677.7450  665.682  670.0130  2020510   670.0130

You can sort by whichever column you need:

In [33]: monthly.sort_values('Close')
Out[33]: 
               Open      High      Low     Close   Volume  Adj Close
Date                                                                
2012-11-30  673.153  677.7450  665.682  670.0130  2020510   670.0130
2012-10-31  680.790  683.2575  673.745  677.6275  2096350   677.6275
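
For readers on a current pandas release, the same monthly-average-and-sort steps can be sketched as a standalone script. This is only a sketch: it inlines five rows from the question's CSV through io.StringIO (an assumption, made so it runs without a file), and it groups by calendar month with groupby and to_period rather than the resample call used in the session above.

```python
import io

import pandas as pd

# A few rows from the question's CSV, inlined so the sketch runs on its own.
# In practice, pass the file path (e.g. 'my_data.csv') to read_csv instead.
csv_text = '''"Date","Open","High","Low","Close","Volume","Adj Close"
"2012-11-14",660.66,662.18,650.5,652.55,1668400,652.55
"2012-11-13",663,667.6,658.23,659.05,1594200,659.05
"2012-11-01",679.5,690.9,678.72,687.59,2050100,687.59
"2012-10-31",679.86,681,675,680.3,1537000,680.3
"2012-10-26",676.5,683.03,671.2,675.15,1950800,675.15
'''

# Parse the Date column as datetimes and use it as the index.
df = pd.read_csv(io.StringIO(csv_text), parse_dates=["Date"], index_col="Date")

# Group rows by calendar month and average every column.
monthly = df.groupby(df.index.to_period("M")).mean()

# Order the months from lowest to highest average closing price.
ranked = monthly.sort_values("Close")
print(ranked["Close"])
```

Grouping on a monthly period labels each row by its month (2012-10, 2012-11) instead of by the month-end date that resample produces.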

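You can even fetch the data straight from Yahoo Finance (in later pandas releases, pandas.io.data was moved out into the separate pandas-datareader package):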

In [37]: from pandas.io import data as pddata

In [40]: df = pddata.DataReader('AAPL', data_source='yahoo', start='2012-01-01')

In [41]: df.resample('1M').mean().sort_values('Close')
Out[41]: 
                  Open        High         Low       Close           Volume   Adj Close
Date                                                                                   
2012-01-31  428.760000  431.008500  425.810500  428.578000  12249740.000000  424.804500
2012-02-29  494.803000  500.849000  491.437500  497.571000  20300990.000000  493.191000
2012-11-30  560.365385  566.118462  548.523846  555.789231  24861884.615385  554.970769
2012-05-31  565.785000  572.141364  558.397273  564.673182  18029781.818182  559.702273
2012-06-30  574.660952  578.889048  569.213333  574.562381  13360247.619048  569.504762
2012-03-31  576.858182  582.064545  570.245909  577.507727  25299250.000000  572.424545
2012-07-31  599.610000  604.920952  594.680476  601.068095  15152466.666667  595.776667
2012-04-30  609.607500  615.487500  598.650000  606.003000  27855340.000000  600.668500
2012-10-31  638.667143  643.650476  628.213810  634.714286  20651071.428571  631.828571
2012-08-31  641.527826  646.655217  637.138261  642.696087  12851252.173913  639.090870
2012-09-30  682.118421  687.007895  676.095263  681.568421  17291363.157895  678.470526
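
To pick out the best and worst six months that the question asks for, one option is Series.nlargest and Series.nsmallest. The sketch below assumes the monthly mean closing prices from the table above, typed in by hand as a Series named monthly_close (a name chosen here for illustration only).

```python
import pandas as pd

# Monthly mean closing prices copied from the "Close" column of the table above.
monthly_close = pd.Series({
    "2012-01": 428.578000,
    "2012-02": 497.571000,
    "2012-03": 577.507727,
    "2012-04": 606.003000,
    "2012-05": 564.673182,
    "2012-06": 574.562381,
    "2012-07": 601.068095,
    "2012-08": 642.696087,
    "2012-09": 681.568421,
    "2012-10": 634.714286,
    "2012-11": 555.789231,
})

best_six = monthly_close.nlargest(6)    # highest averages first
worst_six = monthly_close.nsmallest(6)  # lowest averages first

print("Best 6 months:")
print(best_six)
print("Worst 6 months:")
print(worst_six)
```

With only eleven months of data, 2012-03 ends up in both lists.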

Answer 1 (score: 1)

Once you have read the rows and stored [month, mean_price] pairs in a list, you can sort that list:

import operator
values_list.sort(key=operator.itemgetter(1))

This sorts the values by price, lowest first. To get the top n values:

print(values_list[-n:])

Or the bottom n:

print(values_list[:n])
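
The answer assumes values_list already exists. One way to build it with only the standard library is sketched below. It assumes the column names from the question's sample ("Date" and "Close"), averages the closing price, and inlines a few sample rows so it runs without a file; the name closes_by_month is made up for the example.

```python
import csv
import io
import operator
from collections import defaultdict

# A few rows from the question's CSV, inlined so the sketch runs on its own.
# With a real file, use: with open('my_data.csv', newline='') as f: ...
csv_text = '''"Date","Open","High","Low","Close","Volume","Adj Close"
"2012-11-14",660.66,662.18,650.5,652.55,1668400,652.55
"2012-11-13",663,667.6,658.23,659.05,1594200,659.05
"2012-11-01",679.5,690.9,678.72,687.59,2050100,687.59
"2012-10-31",679.86,681,675,680.3,1537000,680.3
"2012-10-26",676.5,683.03,671.2,675.15,1950800,675.15
'''

# Collect every closing price under its "YYYY-MM" month key.
closes_by_month = defaultdict(list)
for row in csv.DictReader(io.StringIO(csv_text)):
    month = row["Date"][:7]  # "2012-11-14" -> "2012-11"
    closes_by_month[month].append(float(row["Close"]))

# Build the [month, mean_price] pairs the answer refers to.
values_list = [[month, sum(prices) / len(prices)]
               for month, prices in closes_by_month.items()]

# Sort by mean price, lowest first.
values_list.sort(key=operator.itemgetter(1))

n = 1
print(values_list[-n:])  # the n months with the highest mean
print(values_list[:n])   # the n months with the lowest mean
```

For the question's task, set n = 6 once the list covers at least six months.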