Question

I have a CSV file with 5 columns and 13 rows, which looks like this:

site  experiment  length  width  height
1     1           2.2     1.3    9.6
1     2           2.1     2.2    7.6
1     3           2.7     1.5    2.2
2     1           3       4.5    1.5
2     2           3.1     3.1    4
2     3           2.5     2.8    3
3     1           1.9     1.8    4.5
3     2           1.1     0.5    2.3
3     3           3.5     2      7.5
4     1           2.9     2.7    3.2
4     2           4.5     4.8    6.5
4     3           1.2     1.8    2.7

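Length, width and height are the dimensions of the plant.

For each row in the dataset, I want to write conditional code that checks whether the plant is tall (height > 5), medium (2 <= height < 5) or short (height < 2), and then determine the total carbon for each plant.

Total carbon in the plant = 1.8 + 2 * log(volume), where volume = length x width x height.

I then want to store this information as a table in a nested list, where the first column contains the experiment number, the second column contains the string 'tall', 'medium' or 'short' depending on the plant's height, and the third column contains the plant's carbon content.

This is my code so far: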

from __future__ import division
import math
import numpy
shrub_exp=numpy.loadtxt("/Users/louisestevens/Downloads/shrub_volume_experiment.csv",dtype=float,delimiter=',',skiprows=1,usecols=(2,3,4))
for rows in shrub_exp:
    print(rows)

height=(shrub_exp,4)
def height_test(height):
    if height > 5:
        return 'Tall'
    elif 2 <= height < 5:
        return 'Medium'
    else:
        return 'Short'
for x in height:
    print(height_test(x))

for x,y,z in shrub_exp:
    volume=(x*y*z)
    total_carbon=1.8 + 2 * math.log(volume)
    print(total_carbon)

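I am not sure whether I am selecting the height column (the last column) correctly, or how to store this information in a nested list.

I would appreciate some pointers on how to write this script concisely and efficiently.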

Answer 1

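shrub_exp is effectively a list of lists (strictly, a 2-D NumPy array whose rows behave like lists), where each inner list is one row of the CSV. This line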

height=(shrub_exp,4)

creates a new tuple with two elements: the first is shrub_exp and the second is the number 4. That is not useful to you.
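To see the difference concretely, here is a small sketch that uses a plain list of lists as a stand-in for the loaded data (the name `rows` and its two sample rows are only for illustration). Putting two things in parentheses builds a tuple; it does not select a column.

```python
# Stand-in for the loaded data: each inner list is (length, width, height).
rows = [[2.2, 1.3, 9.6],
        [2.1, 2.2, 7.6]]

# This builds a 2-element tuple; it does not pick out any column.
height = (rows, 4)
print(type(height).__name__, len(height))   # tuple 2

# Selecting the last column means indexing each row instead.
heights = [row[2] for row in rows]
print(heights)                              # [9.6, 7.6]
```

With the NumPy array from the question, the equivalent column selection would be shrub_exp[:, 2].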

If you want to process the height from each row:

for row in shrub_exp:
    print( height_test(row[2]) )

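Why 2? Because you skipped columns 0 and 1 when loading the file (usecols=(2,3,4)). So column 4 of the file is now column 2 in each row of the loaded data.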
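The index shift can be illustrated without reading a file. The sketch below mimics, in plain Python, what keeping only columns 2, 3 and 4 does to a single row. It is an analogy for usecols, not NumPy's actual implementation, and `full_row` and `kept` are illustrative names.

```python
# One full CSV row: site, experiment, length, width, height.
full_row = [1.0, 1.0, 2.2, 1.3, 9.6]

# Keep only the columns requested via usecols=(2, 3, 4).
usecols = (2, 3, 4)
kept = [full_row[i] for i in usecols]

print(kept)       # [2.2, 1.3, 9.6]
print(kept[2])    # 9.6, the height: column 4 of the file is now index 2
```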

Your final for loop unpacks each row into x,y,z, so z is the height. To capture the output in a similar list of lists, you could do this:

results = [] # start with empty list
for length,width,height in shrub_exp:
    volume=(length*width*height)
    total_carbon=1.8 + 2 * math.log(volume)
    results.append( [height_test(height) , volume, total_carbon] )  # add new row to the result
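The question also asks for the experiment number in the first column, which the three loaded columns no longer contain. One possible variation, assuming the file is loaded with usecols=(1, 2, 3, 4) so the experiment number comes along, is sketched below. The `rows` list is a hand-typed stand-in for that loaded array (four rows taken from the data in the question), and `build_table` is a hypothetical helper name.

```python
import math

def height_test(height):
    # Ranges as written in the question. A height of exactly 5 is not
    # covered by those ranges and falls through to 'Short' here.
    if height > 5:
        return 'Tall'
    elif 2 <= height < 5:
        return 'Medium'
    else:
        return 'Short'

def build_table(rows):
    """Return [experiment, size label, total carbon] for each row."""
    table = []
    for experiment, length, width, height in rows:
        volume = length * width * height
        total_carbon = 1.8 + 2 * math.log(volume)
        table.append([int(experiment), height_test(height), total_carbon])
    return table

# Stand-in for the loaded data: experiment, length, width, height.
rows = [[1, 2.2, 1.3, 9.6],
        [2, 2.1, 2.2, 7.6],
        [3, 2.7, 1.5, 2.2],
        [1, 3.0, 4.5, 1.5]]

for entry in build_table(rows):
    print(entry)
```

Returning the table from a function rather than filling a global list keeps the calculation easy to reuse on a different file.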

Answer 2

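Without using numpy, the following code is one way to get the result. It assumes the CSV file is named shrub.csv, sits in the local directory, and looks like this: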

1,1,2.2,1.3,9.6
1,2,2.1,2.2,7.6
1,3,2.7,1.5,2.2
2,1,3,4.5,1.5
2,2,3.1,3.1,4
2,3,2.5,2.8,3
3,1,1.9,1.8,4.5
3,2,1.1,0.5,2.3
3,3,3.5,2,7.5
4,1,2.9,2.7,3.2
4,2,4.5,4.8,6.5
4,3,1.2,1.8,2.7

import math

# Read all lines of the CSV (no header row, as in the sample above)
with open('shrub.csv') as f:
    shrub_exp = f.readlines()

def height_test(height):
    # Classify a plant by its height
    if height > 5:
        return 'Tall'
    elif height >= 2:
        return 'Medium'
    else:
        return 'Short'

res = []
for row in shrub_exp:
    # Each row is: site, experiment, length, width, height
    site, exp, leng, wid, hgt = row.split(',')
    volume = float(leng) * float(wid) * float(hgt)
    total_carbon = 1.8 + 2 * math.log(volume)
    res.append([exp, height_test(float(hgt)), total_carbon])

for r in res:
    print(r)

Note that there is no error checking on the data.

['1', 'Tall', 8.425169446611104]
['2', 'Tall', 8.917085904771866]
['3', 'Medium', 6.174348482965436]
['1', 'Short', 7.8163095871050965]
['2', 'Medium', 9.098197168204184]
['3', 'Medium', 7.889044875446846]
['1', 'Medium', 7.267435895701576]
['2', 'Medium', 2.270144244358967]
['3', 'Tall', 9.721626339195156]
['1', 'Medium', 8.242226639616785]
['2', 'Tall', 11.688990983183421]
['3', 'Medium', 5.326719989412714]
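Since the code above does no error checking, a slightly more defensive variant might look like the sketch below. It uses the standard csv module, skips blank or malformed rows, and rejects non-positive volumes, whose logarithm is undefined. The function name `summarize_shrubs`, the decision to skip bad rows silently, and the in-memory sample text are all assumptions made for illustration.

```python
import csv
import io
import math

def height_test(height):
    if height > 5:
        return 'Tall'
    elif height >= 2:
        return 'Medium'
    else:
        return 'Short'

def summarize_shrubs(lines):
    """Build [experiment, size label, total carbon] rows from CSV lines."""
    res = []
    for row in csv.reader(lines):
        if len(row) != 5:
            continue  # blank line or wrong number of fields
        try:
            leng, wid, hgt = (float(v) for v in row[2:])
        except ValueError:
            continue  # non-numeric measurement (e.g. a header row)
        volume = leng * wid * hgt
        if volume <= 0:
            continue  # log is undefined for non-positive volumes
        res.append([row[1], height_test(hgt), 1.8 + 2 * math.log(volume)])
    return res

# In-memory sample with one blank line and one malformed row.
sample = io.StringIO("1,1,2.2,1.3,9.6\n\nbad,row\n2,1,3,4.5,1.5\n")
for r in summarize_shrubs(sample):
    print(r)
```

To run it on the real file, an open file object can be passed in place of the sample, for example summarize_shrubs(open('shrub.csv', newline='')).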

Conditional statements in a loop
