Question

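I have two columns of data: one holds temperature values, and the other holds the frequency with which each temperature was observed. I have been trying to write Python code that takes this two-column frequency data and creates an expanded array of temperature data. Essentially, I want to reverse the "counting" process and expose all of the original data values in a single array.

This is how I am currently reading the data: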

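import numpy as np

# Columns: 0 = temperature, 1 = frequency (number of observations)
f = np.genfromtxt('playground_sum.txt', usecols=(0, 1))

temp = f[:, 0]
freq = f[:, 1].astype(int)  # np.repeat needs integer counts

# A single vectorized call expands the whole table; no loop is needed
new = np.repeat(temp, freq)
print(new)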

This works! Any other approaches are welcome.
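For readers without the data file, here is a minimal sketch of the same `np.repeat` idea on small in-memory arrays. The sample values are made up for illustration and are not taken from `playground_sum.txt`.

```python
import numpy as np

# Hypothetical sample data: temperatures and how often each was observed.
temp = np.array([10.0, 20.0, 30.0])
freq = np.array([1, 2, 3])

# np.repeat repeats each element of temp by the matching count in freq.
expanded = np.repeat(temp, freq)
print(expanded)
```

A temperature with a frequency of 0 simply does not appear in the result.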

Answer 1

Try this:

array = []
with open('myfile.txt') as f:
    for line in f:
        line = line.strip()[1: -1] # gives ex: '10, 0'
        try:
            temp, freq = [int(i) for i in line.split(',')] # a list comprehension
        except ValueError:
            continue
        array.extend([temp] * freq)

This assumes that every line in the file looks like this: [10, 0]

The code produces a list like the following: [10, 20, 20, 30, 30, 30]
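To try this without creating a file, here is a sketch that runs the same logic over an in-memory `io.StringIO`. The helper name `expand_lines` and the sample text are hypothetical, chosen only for illustration.

```python
import io


def expand_lines(lines):
    """Expand '[temp, freq]' lines into a flat list of repeated temps."""
    result = []
    for line in lines:
        line = line.strip()[1:-1]  # drop the surrounding brackets
        try:
            temp, freq = [int(i) for i in line.split(',')]
        except ValueError:
            continue  # skip blank or malformed lines
        result.extend([temp] * freq)
    return result


# Hypothetical sample input, including one blank line that gets skipped.
sample = io.StringIO("[10, 1]\n[20, 2]\n\n[30, 3]\n")
expanded = expand_lines(sample)
print(expanded)
```

Because it calls `int` on both fields, this version would skip any row whose temperature is not a whole number.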

Answer 2

Similar to @Totem's solution, but I think you should convert the frequency to an integer.

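array = []
with open('test.csv') as f:
    for line in f:
        try:
            temp, freq = line.split(',')
            freq = int(freq)  # list repetition needs an integer count
        except ValueError:
            continue  # skip headers, blank lines, and malformed rows

        array.extend([temp.strip()] * freq)  # temp is kept as a string

print(array)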
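A short sketch of why the integer conversion matters: values read from a text file are strings, and repeating a list by a string count raises a `TypeError`. The sample row `"20,2"` is made up for illustration.

```python
# Hypothetical row as it would be read from a CSV file.
temp, freq = "20,2".split(',')

try:
    [temp] * freq  # freq is still the string "2" here
except TypeError as exc:
    print("cannot repeat by a string:", exc)

# After converting the count to int, repetition works as intended.
expanded = [temp] * int(freq)
print(expanded)
```

Note that `temp` is still a string at this point; wrap it in `float(...)` if numeric temperatures are needed.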

Expanding a frequency dataset for a single variable
