I can read the CSV file, but instead of reading the whole file, how can I print only certain rows and columns?
Imagine it laid out as in Excel:
A            B                    C                   D                    E
State      | Heart Disease Rate | Stroke Death Rate | HIV Diagnosis Rate | Teen Birth Rate
Alabama      235.5                54.5                16.7                 18.01
Alaska       147.9                44.3                3.2                  N/A
Arizona      152.5                32.7                11.9                 N/A
Arkansas     221.8                57.4                10.2                 N/A
California   177.9                42.2                N/A                  N/A
Colorado     145.3                39                  8.4                  9.25
Here is what I have:
import csv
try:
    risk = open('riskfactors.csv', 'r', encoding="windows-1252").read() #find the file
except:
    while risk != "riskfactors.csv": # if the file cant be found if there is an error
        print("Could not open", risk, "file")
        risk = input("\nPlease try to open file again: ")
else:
    with open("riskfactors.csv") as f:
        reader = csv.reader(f, delimiter=' ', quotechar='|')
        data = []
        for row in reader:# Number of rows including the death rates
            for col in (2,4): # The columns I want read B and D
                data.append(row)
                data.append(col)
    for item in data:
        print(item) #print the rows and columns
I only need to read columns B and D, with all of their statistics, so that the output reads like this:
A            B                    D
State      | Heart Disease Rate | HIV Diagnosis Rate |
Alabama      235.5                16.7
Alaska       147.9                3.2
Arizona      152.5                11.9
Arkansas     221.8                10.2
California   177.9                N/A
Colorado     145.3                8.4
The code runs without errors.
Any ideas on how to tackle this? Nothing I have tried works. Any help or advice is greatly appreciated.
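Answer 0 (score: 10)
I hope you have heard of Pandas for data analysis.
The following code does the job of reading only certain columns; as for reading only certain rows, you may need to explain more about which rows you want.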
import pandas
# usecols is zero-based: 0, 1 and 3 select the 1st, 2nd and 4th columns (A, B and D)
df = pandas.read_csv('test.csv', sep=",", usecols=[0, 1, 3])
print(df)
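Answer 1 (score: 3)
If you are still stuck, you do not strictly need the csv module to read the file, since a simple CSV file is just lines of comma-separated strings (this breaks down if a field contains a quoted comma). So, for something simple, you could try this, which gives you a list of tuples of the form (State, Heart Disease Rate, HIV Diagnosis Rate):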
output = []
# text mode in Python 3 already handles universal newlines (the old 'rU' mode was removed in 3.11)
with open('riskfactors.csv', 'r') as f:
    for line in f:
        cells = line.rstrip('\n').split(',')
        # cells 0, 1 and 3 are the first, second and fourth columns
        output.append((cells[0], cells[1], cells[3]))
print(output)
Note that if you want to do any kind of data analysis, you will have to skip the header row as you go through the file.
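One way to skip the header row might look like the sketch below. It assumes the file is plainly comma-separated with exactly one header line. The sample rows are copied from the question's table and inlined through `io.StringIO` so the snippet runs without the file, and `read_columns` is a hypothetical helper name, not something from the original answer.

```python
import io

# Sample rows copied from the question; a real script would pass
# open('riskfactors.csv') here instead of io.StringIO (assumption:
# the file is comma-separated with a single header line).
sample = io.StringIO(
    "State,Heart Disease Rate,Stroke Death Rate,HIV Diagnosis Rate,Teen Birth Rate\n"
    "Alabama,235.5,54.5,16.7,18.01\n"
    "Alaska,147.9,44.3,3.2,N/A\n"
)

def read_columns(f):
    """Return (state, heart disease rate, HIV diagnosis rate) tuples, header skipped."""
    next(f, None)  # consume the header line; the default avoids StopIteration on an empty file
    output = []
    for line in f:
        cells = line.rstrip("\n").split(",")
        output.append((cells[0], cells[1], cells[3]))
    return output

rows = read_columns(sample)
print(rows)
```

The values are still strings at this point; entries such as `N/A` would need to be handled before converting them to numbers.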
Answer 2 (score: 2)
Try this:
data = []
for row in reader:  # reader is the csv.reader from the question
    data.append([row[1], row[3]])  # the columns I want to read, B and D
for item in data:
    print(item)  # print the selected columns of each row
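For completeness, here is a self-contained sketch of the same idea that also shows one possible way to restrict the rows. It assumes the file is comma-delimited (the question's `delimiter=' '` would split on spaces instead), uses inline sample data from the question in place of `riskfactors.csv`, and introduces a hypothetical helper, `select`, purely for illustration.

```python
import csv
import io
from itertools import islice

# Inline sample standing in for riskfactors.csv (assumed comma-delimited).
sample = io.StringIO(
    "State,Heart Disease Rate,Stroke Death Rate,HIV Diagnosis Rate,Teen Birth Rate\n"
    "Alabama,235.5,54.5,16.7,18.01\n"
    "Alaska,147.9,44.3,3.2,N/A\n"
    "Arizona,152.5,32.7,11.9,N/A\n"
    "Arkansas,221.8,57.4,10.2,N/A\n"
)

def select(f, cols, start=0, stop=None):
    """Yield the columns in `cols` for rows start..stop-1 (0-based; the header is row 0)."""
    reader = csv.reader(f)
    for row in islice(reader, start, stop):
        yield [row[c] for c in cols]

# Columns A, B and D for the header plus the first two states.
data = list(select(sample, cols=(0, 1, 3), stop=3))
for item in data:
    print(item)
```

Passing `start=1` would drop the header row, and `cols=(1, 3)` would keep only B and D as in the snippet above.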