Question

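I am working with three small datasets, and for reproducibility I am sharing the data here.

Starting from column 2, I want to read the current row and compare it with the value in the previous row. If it is larger, I keep comparing. If the current value is smaller than the previous row's value, I want to divide the current (smaller) value by the previous (larger) value. Hence the following code: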

import numpy as np
import matplotlib.pyplot as plt

protocols = {}

types = {"data_c": "data_c.csv", "data_r": "data_r.csv", "data_v": "data_v.csv"}

for protname, fname in types.items():
    col_time,col_window = np.loadtxt(fname,delimiter=',').T
    trailing_window = col_window[:-1] # "past" values at a given index
    leading_window  = col_window[1:]  # "current" values at a given index
    decreasing_inds = np.where(leading_window < trailing_window)[0]
    quotient = leading_window[decreasing_inds]/trailing_window[decreasing_inds]
    quotient_times = col_time[decreasing_inds]

    protocols[protname] = {
        "col_time": col_time,
        "col_window": col_window,
        "quotient_times": quotient_times,
        "quotient": quotient,
    }
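
To make the shifted-slice comparison concrete, here is a minimal sketch on a made-up array. The numbers are hypothetical and are not taken from the shared CSV files; only the variable names follow the code above.

```python
import numpy as np

# Hypothetical values, not taken from the shared CSV files.
col_window = np.array([10.0, 20.0, 14.0, 28.0, 14.0])

trailing_window = col_window[:-1]  # previous value for each comparison
leading_window = col_window[1:]    # current value for each comparison

# Positions (within the shifted arrays) where the current value
# dropped below the previous one.
decreasing_inds = np.where(leading_window < trailing_window)[0]

# Smaller (current) value divided by larger (previous) value.
quotient = leading_window[decreasing_inds] / trailing_window[decreasing_inds]

print(decreasing_inds)
print(quotient)
```

Here the two drops are 20 to 14 and 28 to 14, so the quotients are 14/20 and 14/28.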

data_c is a numpy.array with only one unique quotient value, 0.7, and data_r likewise has a single unique quotient value, 0.5. However, data_v has two unique quotient values (0.5 or 0.8).

I want to loop over the quotient values of these CSV files and classify them with simple if-else statements, but I get this error:

if quotient==0.7: print("data_c")

ValueError: The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

Update: I found that this error can be resolved by using the .all() function, as shown below.

if (quotient==0.7).all():
    print("data_c")
elif (quotient>=0.5).all() and (quotient <=0.8).all():
    print("data_v")
elif (quotient==0.5).all():
    print("data_r")

However, this does not print the expected labels. How can we solve this?

Answer 1

If I understand correctly, you want to classify the data using the unique values of the quotient array. If that is the case, you can easily leverage numpy.unique to help:

import numpy as np
unique_quotient = np.unique(quotient)
# For data_c this is just a single value
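
One caveat: the quotients come from floating-point division, so values that should be identical can differ by tiny amounts and show up as separate entries in np.unique. A rough sketch of one way around that, assuming rounding to six decimals is acceptable for your data (the array below is hypothetical):

```python
import numpy as np

# Hypothetical quotient array with a little floating-point noise.
quotient = np.array([0.5, 0.8, 0.5000000001, 0.8, 0.5])

# Round first so values that differ only by noise collapse together.
# The number of decimals is an assumption; pick what suits your data.
unique_quotient = np.unique(np.round(quotient, decimals=6))

print(unique_quotient)
```

Without the rounding step, np.unique would report three distinct values for this array instead of two.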

If you store the unique_quotient array in protocol_dictionary, it gives you something to compare against (for example, using numpy.array_equal):

unique_data_c_quotient = np.r_[ 0.7 ]
if np.array_equal( unique_quotient, unique_data_c_quotient ): 
    print('data_c')
...
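
Extending that idea to all three datasets, a possible sketch is a lookup table of expected unique values. The EXPECTED mapping and the classify helper are names I made up for illustration; the values come from the question. I use np.allclose rather than np.array_equal here to tolerate floating-point noise, which is a deliberate swap.

```python
import numpy as np

# Assumed mapping from label to expected unique quotient values,
# taken from the description in the question.
EXPECTED = {
    "data_c": np.array([0.7]),
    "data_r": np.array([0.5]),
    "data_v": np.array([0.5, 0.8]),
}

def classify(quotient, decimals=6):
    """Return the label whose expected unique values match, or None."""
    unique_quotient = np.unique(np.round(quotient, decimals))
    for label, expected in EXPECTED.items():
        # Compare shapes first: np.allclose would otherwise broadcast
        # a one-element array against a two-element one.
        if unique_quotient.shape == expected.shape and np.allclose(
            unique_quotient, expected
        ):
            return label
    return None

print(classify(np.array([0.5, 0.5, 0.5])))
print(classify(np.array([0.8, 0.5, 0.8])))
```

Because the match is on the full set of unique values, an array containing only 0.5 is not mistaken for data_v, which is what happens with the range test in the if/elif chain from the question.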

Answer 2

I am trying to repeat the process until I get the quotient. Here I am only processing one file:

SELECT cl.MerchantSequenceKey, c.ChainOID
FROM hpstChainList cl
JOIN hpstChains c ON cl.ChainOID = c.ChainOID
WHERE c.ChainTypeOID = 2
GROUP BY cl.MerchantSequenceKey, c.ChainOID
HAVING COUNT(*) > 1

Looping through multiple CSV files
