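Pandas: building a price difference matrix?

Time: 2019-06-23 11:04:37

Tags: python python-3.x pandas

I am trying to build the price differences between coins and exchanges. For example, I have a dataframe: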

    Exchange coin           lastUpdate    price   volume
0   Bitfinex  BTC  2019-06-23 06:23:27    10646  24299.4
1   Bitfinex  ETH  2019-06-23 06:23:13   308.47   225945
2   Bitfinex  LTC  2019-06-23 06:23:18   140.41   215698
3   Bitstamp  BTC  2019-06-23 06:23:21  10546.4  9620.04
4   Bitstamp  ETH  2019-06-23 06:22:48   305.15  46062.6
5   Bitstamp  LTC  2019-06-23 06:22:46   139.22  85160.5
6     CCCAGG  BTC  2019-06-23 06:23:23  10580.4  79049.8
7     CCCAGG  ETH  2019-06-23 06:23:20   306.74   681056
8     CCCAGG  LTC  2019-06-23 06:23:24   139.71   752875
9   Coinbase  BTC  2019-06-23 06:23:17  10557.5  23731.2
10  Coinbase  ETH  2019-06-23 06:23:11   306.09   247213
11  Coinbase  LTC  2019-06-23 06:23:13   139.49   381421

I am trying to work out all the price differences for a coin across every exchange it is traded on.

I would like it to look like

price_combos                        diff
Price Diff: BTC - Bitfinex-Bitstamp 14.06
Price Diff: BTC - Bitfinex-CCCAGG   14.32
Price Diff: BTC - Bitstamp-CCCAGG   0.26
Price Diff: BTC - Coinbase-Bitfinex -17.99
Price Diff: BTC - Coinbase-Bitstamp -3.93
Price Diff: BTC - Coinbase-CCCAGG   -3.67

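and then repeat this for each coin.

Edit: added the price to the combos. Note that the diffs come from a different dataset, so they do not match the actual differences in the first dataframe.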

2 answers:

Answer 0: (score: 1)

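We can solve this as follows:

  1. We outer merge the dataframe with itself on coin, which gives us every combination of exchanges for each coin.
  2. We filter out the rows where both exchanges are the same (we do not want to compare an exchange with itself) using ne (not equal).
  3. We subtract the prices to create our Price Diff column.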
# Step 1 outer merge
df2 = df[['Exchange', 'coin', 'price']].merge(df[['Exchange', 'coin', 'price']], 
                                              on='coin', 
                                              how='outer', 
                                              suffixes=['', '_2'])

# Step 2 filter out same exchange
df2 = df2[df2['Exchange'].ne(df2['Exchange_2'])]

# Step 3 create Price Diff column (subtract, do not assign, price_2)
df2['Price Diff'] = df2['price'] - df2['price_2']

    Exchange coin     price Exchange_2   price_2  Price Diff
1   Bitfinex  BTC  10646.00   Bitstamp  10546.40       99.60
2   Bitfinex  BTC  10646.00     CCCAGG  10580.40       65.60
3   Bitfinex  BTC  10646.00   Coinbase  10557.50       88.50
4   Bitstamp  BTC  10546.40   Bitfinex  10646.00      -99.60
6   Bitstamp  BTC  10546.40     CCCAGG  10580.40      -34.00
7   Bitstamp  BTC  10546.40   Coinbase  10557.50      -11.10
8     CCCAGG  BTC  10580.40   Bitfinex  10646.00      -65.60
9     CCCAGG  BTC  10580.40   Bitstamp  10546.40       34.00
11    CCCAGG  BTC  10580.40   Coinbase  10557.50       22.90
12  Coinbase  BTC  10557.50   Bitfinex  10646.00      -88.50
13  Coinbase  BTC  10557.50   Bitstamp  10546.40       11.10
14  Coinbase  BTC  10557.50     CCCAGG  10580.40      -22.90
17  Bitfinex  ETH    308.47   Bitstamp    305.15        3.32
18  Bitfinex  ETH    308.47     CCCAGG    306.74        1.73
19  Bitfinex  ETH    308.47   Coinbase    306.09        2.38
20  Bitstamp  ETH    305.15   Bitfinex    308.47       -3.32
22  Bitstamp  ETH    305.15     CCCAGG    306.74       -1.59
23  Bitstamp  ETH    305.15   Coinbase    306.09       -0.94
24    CCCAGG  ETH    306.74   Bitfinex    308.47       -1.73
25    CCCAGG  ETH    306.74   Bitstamp    305.15        1.59
27    CCCAGG  ETH    306.74   Coinbase    306.09        0.65
28  Coinbase  ETH    306.09   Bitfinex    308.47       -2.38
29  Coinbase  ETH    306.09   Bitstamp    305.15        0.94
30  Coinbase  ETH    306.09     CCCAGG    306.74       -0.65
33  Bitfinex  LTC    140.41   Bitstamp    139.22        1.19
34  Bitfinex  LTC    140.41     CCCAGG    139.71        0.70
35  Bitfinex  LTC    140.41   Coinbase    139.49        0.92
36  Bitstamp  LTC    139.22   Bitfinex    140.41       -1.19
38  Bitstamp  LTC    139.22     CCCAGG    139.71       -0.49
39  Bitstamp  LTC    139.22   Coinbase    139.49       -0.27
40    CCCAGG  LTC    139.71   Bitfinex    140.41       -0.70
41    CCCAGG  LTC    139.71   Bitstamp    139.22        0.49
43    CCCAGG  LTC    139.71   Coinbase    139.49        0.22
44  Coinbase  LTC    139.49   Bitfinex    140.41       -0.92
45  Coinbase  LTC    139.49   Bitstamp    139.22        0.27
46  Coinbase  LTC    139.49     CCCAGG    139.71       -0.22
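
The merge result lists every pair twice (Bitfinex-Bitstamp and Bitstamp-Bitfinex), while the question asks for one labelled row per pair. A minimal sketch of one way to get there, assuming it is acceptable to keep the pair whose first exchange sorts alphabetically before the second. The names `pairs` and `result` are illustrative only, and the prices are copied from the question's dataframe:

```python
import pandas as pd

# Prices copied from the question's dataframe (lastUpdate and volume omitted)
df = pd.DataFrame(
    [
        ["Bitfinex", "BTC", 10646.0], ["Bitfinex", "ETH", 308.47], ["Bitfinex", "LTC", 140.41],
        ["Bitstamp", "BTC", 10546.4], ["Bitstamp", "ETH", 305.15], ["Bitstamp", "LTC", 139.22],
        ["CCCAGG", "BTC", 10580.4], ["CCCAGG", "ETH", 306.74], ["CCCAGG", "LTC", 139.71],
        ["Coinbase", "BTC", 10557.5], ["Coinbase", "ETH", 306.09], ["Coinbase", "LTC", 139.49],
    ],
    columns=["Exchange", "coin", "price"],
)

# Self-merge on coin: every exchange paired with every exchange, per coin
pairs = df.merge(df, on="coin", suffixes=("", "_2"))

# A strict "<" keeps each unordered pair once and also drops same-exchange rows
pairs = pairs[pairs["Exchange"] < pairs["Exchange_2"]].copy()
pairs = pairs.sort_values(["coin", "Exchange", "Exchange_2"])

# Build the label in the format the question asks for
pairs["price_combos"] = (
    "Price Diff: " + pairs["coin"] + " - " + pairs["Exchange"] + "-" + pairs["Exchange_2"]
)
# Round to two decimals to hide floating-point noise in the subtraction
pairs["diff"] = (pairs["price"] - pairs["price_2"]).round(2)

result = pairs[["price_combos", "diff"]].reset_index(drop=True)
print(result)
```

With four exchanges and three coins this leaves 6 pairs per coin, 18 rows in total. The sign of each diff depends on which exchange comes first in the pair, so a different ordering rule would flip some signs.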

Answer 1: (score: 0)

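You should have a look at the itertools module (doc). It has a lot of nice iteration helpers.

Here, you are looking for the combinations function.

Once you have the combinations, the rest is straightforward: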

# Import modules
import pandas as pd
import itertools as iter  # note: this alias shadows the built-in iter()

# Your data
df = pd.DataFrame([
    ["Bitfinex",  "BTC", "2019-06-23 06:23:27",  10646, 24299.4],
    ["Bitfinex",  "ETH", "2019-06-23 06:23:13",  308.47,  225945],
    ["Bitfinex",  "LTC", "2019-06-23 06:23:18",  140.41,  215698],
    ["Bitstamp",  "BTC", "2019-06-23 06:23:21", 10546.4, 9620.04],
    ["Bitstamp",  "ETH", "2019-06-23 06:22:48",  305.15, 46062.6],
    ["Bitstamp", "LTC", "2019-06-23 06:22:46", 139.22, 85160.5],
    ["CCCAGG",  "BTC", "2019-06-23 06:23:23", 10580.4, 79049.8],
    ["CCCAGG",  "ETH", "2019-06-23 06:23:20", 306.74,  681056],
    ["CCCAGG",  "LTC", "2019-06-23 06:23:24", 139.71, 752875],
    ["Coinbase",  "BTC", "2019-06-23 06:23:17", 10557.5, 23731.2],
    ["Coinbase", "ETH", "2019-06-23 06:23:11", 306.09, 247213],
    ["Coinbase", "LTC", "2019-06-23 06:23:13", 139.49,  381421],
], columns=["Exchange", "coin", "lastUpdate", "price", "volume"])


# Print all combinations for one coin
def print_combi(df, coin):
    # subset dataframe with matching rows
    sub_df = df[df["coin"] == coin]
    # Create all combinations for the exchange columns
    list_combi = [cb for cb in iter.combinations(sub_df.Exchange, 2)]

    # Print the expected output
    for combi in list_combi:
        print("Price diff: {0} - {1}-{2}".format(coin, combi[0], combi[1]))

print_combi(df, 'BTC')
# Price diff: BTC - Bitfinex-Bitstamp
# Price diff: BTC - Bitfinex-CCCAGG
# Price diff: BTC - Bitfinex-Coinbase
# Price diff: BTC - Bitstamp-CCCAGG
# Price diff: BTC - Bitstamp-Coinbase
# Price diff: BTC - CCCAGG-Coinbase

EDIT1:

Returning a dataframe. The diff column is computed from the data used in the snippet above.

def combo_money_df(df, coin):
    # subset the dataframe
    sub_df = df[df["coin"] == coin]

    new_data = []
    # For each subset
    for combi in iter.combinations(sub_df.index, 2):
        # Select corresponding row
        row_1 = sub_df.loc[combi[0]]
        row_2 = sub_df.loc[combi[1]]
        # Create new rows
        new_data.append([row_1.Exchange + "-" + row_2.Exchange, row_1.price - row_2.price])
    # Return a dataframe object
    return pd.DataFrame(new_data, columns=["price_combo", "diff"])

print(combo_money_df(df, "BTC"))
#          price_combo  diff
# 0  Bitfinex-Bitstamp  99.6
# 1    Bitfinex-CCCAGG  65.6
# 2  Bitfinex-Coinbase  88.5
# 3    Bitstamp-CCCAGG -34.0
# 4  Bitstamp-Coinbase -11.1
# 5    CCCAGG-Coinbase  22.9
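
The question also asks to repeat this for each coin. A minimal sketch of one way to do that, grouping by coin and applying `itertools.combinations` within each group. The helper name `all_coin_diffs` is hypothetical, and the prices are copied from the question's dataframe:

```python
import itertools

import pandas as pd

# Prices copied from the question's dataframe (lastUpdate and volume omitted)
df = pd.DataFrame(
    [
        ["Bitfinex", "BTC", 10646.0], ["Bitfinex", "ETH", 308.47], ["Bitfinex", "LTC", 140.41],
        ["Bitstamp", "BTC", 10546.4], ["Bitstamp", "ETH", 305.15], ["Bitstamp", "LTC", 139.22],
        ["CCCAGG", "BTC", 10580.4], ["CCCAGG", "ETH", 306.74], ["CCCAGG", "LTC", 139.71],
        ["Coinbase", "BTC", 10557.5], ["Coinbase", "ETH", 306.09], ["Coinbase", "LTC", 139.49],
    ],
    columns=["Exchange", "coin", "price"],
)


def all_coin_diffs(df):
    """Hypothetical helper: pairwise price differences for every coin."""
    rows = []
    # groupby yields the coins in sorted order: BTC, ETH, LTC
    for coin, group in df.groupby("coin"):
        # itertuples gives one record per exchange; combinations pairs them up
        for a, b in itertools.combinations(group.itertuples(index=False), 2):
            # Round to two decimals to hide floating-point noise
            rows.append([coin, a.Exchange + "-" + b.Exchange, round(a.price - b.price, 2)])
    return pd.DataFrame(rows, columns=["coin", "price_combo", "diff"])


all_diffs = all_coin_diffs(df)
print(all_diffs)
```

This loops in Python, which is fine for a handful of exchanges. For a large number of rows a merge-based approach is likely to scale better.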