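Python: Reading only certain columns and rows from an Excel CSV file

Asked: 2013-03-08 04:11:43

Tags: python excel file

I can read the CSV file, but instead of reading the whole file, how do I print only certain rows and columns?

Imagine this is the Excel sheet: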

  A              B              C                  D                    E
State  |Heart Disease Rate| Stroke Death Rate | HIV Diagnosis Rate |Teen Birth Rate

Alabama     235.5             54.5                 16.7                 18.01

Alaska      147.9             44.3                  3.2                  N/A    

Arizona     152.5             32.7                 11.9                  N/A    

Arkansas    221.8             57.4                 10.2                  N/A    

California  177.9             42.2                  N/A                  N/A    

Colorado    145.3             39                    8.4                 9.25    

Here is what I have:

import csv

try:
    risk = open('riskfactors.csv', 'r', encoding="windows-1252").read() #find the file

except:
    while risk != "riskfactors.csv":  # if the file can't be found or there is an error
        print("Could not open", risk, "file")
        risk = input("\nPlease try to open file again: ")
else:
    with open("riskfactors.csv") as f:
        reader = csv.reader(f, delimiter=' ', quotechar='|')

        data = []
        for row in reader:# Number of rows including the death rates 
            for col in (2,4): # The columns I want read   B and D
                data.append(row)
                data.append(col)
        for item in data:
            print(item) #print the rows and columns

I only need to read columns B and D, with all of their statistics, so that the output reads like this:

  A              B                D                    
 State  |Heart Disease Rate| HIV Diagnosis Rate |

 Alabama       235.5             16.7                

  Alaska       147.9             3.2                     

  Arizona      152.5             11.9                     

  Arkansas     221.8             10.2                    

 California    177.9             N/A                     

 Colorado      145.3             8.4                

Edited

There are no errors.

Any ideas on how to solve this? Nothing I have tried works. Any help or advice is greatly appreciated.

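3 Answers:

Answer 0: (score: 10)

I hope you have heard of Pandas for data analysis.

The following code does the job of reading the columns; as for reading the rows, you may need to explain more about what you want.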

import pandas
# usecols takes 0-based positions, so (0, 1, 3) reads the 1st, 2nd and 4th columns
io = pandas.read_csv('test.csv', sep=",", usecols=(0, 1, 3))
print(io)

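Answer 1: (score: 3)

If you are still stuck, you don't need the CSV module to read the file, since a simple CSV file is just lines of comma-separated strings (this shortcut breaks if a field contains a quoted comma). So, for something simple, you could try this, which gives you a list of tuples of the form (state, heart disease rate, HIV diagnosis rate):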

output = []

# open the file for reading; the with block closes it when done
with open('riskfactors.csv', 'r') as f:
    for line in f:
        cells = line.rstrip('\n').split(',')
        # we want the first, second and fourth columns (A, B and D)
        output.append((cells[0], cells[1], cells[3]))

print(output)

Note that if you want to do any kind of data analysis, you will have to skip the header row first.
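As a minimal sketch of that header-skipping step, the following uses the standard `csv` module on a few rows copied from the question. It assumes the file really is comma-separated and that missing values appear as `N/A`; `to_number` and `read_rates` are hypothetical helper names, not part of any library.

```python
import csv
import io

# Sample rows copied from the question; a real script would open riskfactors.csv instead.
SAMPLE = """State,Heart Disease Rate,Stroke Death Rate,HIV Diagnosis Rate,Teen Birth Rate
Alabama,235.5,54.5,16.7,18.01
Alaska,147.9,44.3,3.2,N/A
California,177.9,42.2,N/A,N/A
"""

def to_number(text):
    """Return a float, or None for placeholders such as 'N/A' (hypothetical helper)."""
    try:
        return float(text)
    except ValueError:
        return None

def read_rates(lines):
    """Return (state, heart disease rate, HIV diagnosis rate) tuples, header excluded."""
    reader = csv.reader(lines)
    next(reader, None)  # skip the header row
    return [(row[0], to_number(row[1]), to_number(row[3])) for row in reader]

rates = read_rates(io.StringIO(SAMPLE))
for state, heart, hiv in rates:
    print(state, heart, hiv)
```

With a real file, `read_rates(open('riskfactors.csv', newline=''))` should behave the same way, provided the layout matches the sample.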

Answer 2: (score: 2)

Try this:

data = []
for row in reader:  # every row, including the header
    data.append([row[1], row[3]])  # the columns I want to read: B and D
for item in data:
    print(item)  # print the selected columns of each row
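The snippet above relies on a `reader` that already exists. A self-contained sketch of the same idea, extended to pick a range of rows as well, might look like this. It assumes a comma-separated file (the default `csv.reader` delimiter, rather than the `delimiter=' '` used in the question); `select_cells` and its parameters are hypothetical names chosen for illustration.

```python
import csv
import io
from itertools import islice

def select_cells(lines, columns, start=0, stop=None):
    """Return the chosen columns for data rows start..stop-1.

    Column and row positions are 0-based; the header row is dropped
    and is not counted as a data row.
    """
    reader = csv.reader(lines)
    next(reader, None)  # drop the header row
    return [[row[i] for i in columns] for row in islice(reader, start, stop)]

# Sample rows copied from the question, standing in for riskfactors.csv.
sample = io.StringIO(
    "State,Heart Disease Rate,Stroke Death Rate,HIV Diagnosis Rate,Teen Birth Rate\n"
    "Alabama,235.5,54.5,16.7,18.01\n"
    "Alaska,147.9,44.3,3.2,N/A\n"
    "Arizona,152.5,32.7,11.9,N/A\n"
    "Arkansas,221.8,57.4,10.2,N/A\n"
)

# Columns B and D (positions 1 and 3), second and third data rows only.
selected = select_cells(sample, columns=(1, 3), start=1, stop=3)
for item in selected:
    print(item)
```

`islice` stops reading once the requested rows have been consumed, so the rest of a large file is never parsed.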