I'm trying to compute the price differences for each coin across exchanges. For example, I have a dataframe:
Exchange coin lastUpdate price volume
0 Bitfinex BTC 2019-06-23 06:23:27 10646 24299.4
1 Bitfinex ETH 2019-06-23 06:23:13 308.47 225945
2 Bitfinex LTC 2019-06-23 06:23:18 140.41 215698
3 Bitstamp BTC 2019-06-23 06:23:21 10546.4 9620.04
4 Bitstamp ETH 2019-06-23 06:22:48 305.15 46062.6
5 Bitstamp LTC 2019-06-23 06:22:46 139.22 85160.5
6 CCCAGG BTC 2019-06-23 06:23:23 10580.4 79049.8
7 CCCAGG ETH 2019-06-23 06:23:20 306.74 681056
8 CCCAGG LTC 2019-06-23 06:23:24 139.71 752875
9 Coinbase BTC 2019-06-23 06:23:17 10557.5 23731.2
10 Coinbase ETH 2019-06-23 06:23:11 306.09 247213
11 Coinbase LTC 2019-06-23 06:23:13 139.49 381421
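I'm trying to work out all the price differences for each coin between all the exchanges it trades on.
I'd like the result to look like this: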
price_combos diff
Price Diff: BTC - Bitfinex-Bitstamp 14.06
Price Diff: BTC - Bitfinex-CCCAGG 14.32
Price Diff: BTC - Bitstamp-CCCAGG 0.26
Price Diff: BTC - Coinbase-Bitfinex -17.99
Price Diff: BTC - Coinbase-Bitstamp -3.93
Price Diff: BTC - Coinbase-CCCAGG -3.67
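Then repeat this for every coin.
Edit: added the prices to the combos. Note that the diffs come from a different dataset, so they do not match the actual differences in the first dataframe.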
Answer 0: (score: 1)
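We can solve this as follows:
1. Do an outer self-merge on coin, which gives us back every combination of exchanges for each coin.
2. Use ne (not equal) to filter out the rows where both exchanges are the same, since we do not want to compare an exchange with itself.
3. Create the Price Diff column.
# Step 1 outer merge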
df2 = df[['Exchange', 'coin', 'price']].merge(df[['Exchange', 'coin', 'price']],
on='coin',
how='outer',
suffixes=['', '_2'])
# Step 2 filter out same exchange
df2 = df2[df2['Exchange'].ne(df2['Exchange_2'])]
# Step 3 create Price Diff column
df2['Price Diff'] = df2['price'] - df2['price_2']
Exchange coin price Exchange_2 price_2 Price Diff
1 Bitfinex BTC 10646.00 Bitstamp 10546.40 99.60
2 Bitfinex BTC 10646.00 CCCAGG 10580.40 65.60
3 Bitfinex BTC 10646.00 Coinbase 10557.50 88.50
4 Bitstamp BTC 10546.40 Bitfinex 10646.00 -99.60
6 Bitstamp BTC 10546.40 CCCAGG 10580.40 -34.00
7 Bitstamp BTC 10546.40 Coinbase 10557.50 -11.10
8 CCCAGG BTC 10580.40 Bitfinex 10646.00 -65.60
9 CCCAGG BTC 10580.40 Bitstamp 10546.40 34.00
11 CCCAGG BTC 10580.40 Coinbase 10557.50 22.90
12 Coinbase BTC 10557.50 Bitfinex 10646.00 -88.50
13 Coinbase BTC 10557.50 Bitstamp 10546.40 11.10
14 Coinbase BTC 10557.50 CCCAGG 10580.40 -22.90
17 Bitfinex ETH 308.47 Bitstamp 305.15 3.32
18 Bitfinex ETH 308.47 CCCAGG 306.74 1.73
19 Bitfinex ETH 308.47 Coinbase 306.09 2.38
20 Bitstamp ETH 305.15 Bitfinex 308.47 -3.32
22 Bitstamp ETH 305.15 CCCAGG 306.74 -1.59
23 Bitstamp ETH 305.15 Coinbase 306.09 -0.94
24 CCCAGG ETH 306.74 Bitfinex 308.47 -1.73
25 CCCAGG ETH 306.74 Bitstamp 305.15 1.59
27 CCCAGG ETH 306.74 Coinbase 306.09 0.65
28 Coinbase ETH 306.09 Bitfinex 308.47 -2.38
29 Coinbase ETH 306.09 Bitstamp 305.15 0.94
30 Coinbase ETH 306.09 CCCAGG 306.74 -0.65
33 Bitfinex LTC 140.41 Bitstamp 139.22 1.19
34 Bitfinex LTC 140.41 CCCAGG 139.71 0.70
35 Bitfinex LTC 140.41 Coinbase 139.49 0.92
36 Bitstamp LTC 139.22 Bitfinex 140.41 -1.19
38 Bitstamp LTC 139.22 CCCAGG 139.71 -0.49
39 Bitstamp LTC 139.22 Coinbase 139.49 -0.27
40 CCCAGG LTC 139.71 Bitfinex 140.41 -0.70
41 CCCAGG LTC 139.71 Bitstamp 139.22 0.49
43 CCCAGG LTC 139.71 Coinbase 139.49 0.22
44 Coinbase LTC 139.49 Bitfinex 140.41 -0.92
45 Coinbase LTC 139.49 Bitstamp 139.22 0.27
46 Coinbase LTC 139.49 CCCAGG 139.71 -0.22
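The merged frame lists every pair twice (A-B and B-A), while the question asks for one labelled row per pair. One way to get there, as a minimal sketch: keep only rows where the first exchange name sorts before the second, then build the label. The helper name `pair_diffs` and the `sample` frame are made up for illustration; the prices are the BTC rows from the question.

```python
import pandas as pd

def pair_diffs(df):
    """Return one row per coin and unordered exchange pair with the price difference."""
    cols = ["Exchange", "coin", "price"]
    # Self-merge on coin to get every exchange pairing for each coin
    merged = df[cols].merge(df[cols], on="coin", suffixes=("", "_2"))
    # Keeping Exchange < Exchange_2 drops self-pairs and mirrored duplicates
    merged = merged[merged["Exchange"] < merged["Exchange_2"]].copy()
    merged["price_combos"] = (
        "Price Diff: " + merged["coin"] + " - "
        + merged["Exchange"] + "-" + merged["Exchange_2"]
    )
    # Round to hide floating-point noise in the subtraction
    merged["diff"] = (merged["price"] - merged["price_2"]).round(2)
    return merged[["price_combos", "diff"]].reset_index(drop=True)

# Hypothetical sample: the BTC rows from the question
sample = pd.DataFrame(
    [
        ["Bitfinex", "BTC", 10646.0],
        ["Bitstamp", "BTC", 10546.4],
        ["CCCAGG", "BTC", 10580.4],
        ["Coinbase", "BTC", 10557.5],
    ],
    columns=["Exchange", "coin", "price"],
)
result = pair_diffs(sample)
print(result)
```

The string comparison only picks a consistent direction for each pair. If you need a specific order (for example Coinbase first), you would have to swap the operands or negate the diff yourself.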
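Answer 1: (score: 0)
You should take a look at the itertools module (doc). It has a lot of useful iteration functions.
Here, you are looking for the combinations function.
Once you have the combinations, the rest is straightforward: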
# Import modules
import pandas as pd
import itertools as iter
# Your data
df = pd.DataFrame([
["Bitfinex", "BTC", "2019-06-23 06:23:27", 10646, 24299.4],
["Bitfinex", "ETH", "2019-06-23 06:23:13", 308.47, 225945],
["Bitfinex", "LTC", "2019-06-23 06:23:18", 140.41, 215698],
["Bitstamp", "BTC", "2019-06-23 06:23:21", 10546.4, 9620.04],
["Bitstamp", "ETH", "2019-06-23 06:22:48", 305.15, 46062.6],
["Bitstamp", "LTC", "2019-06-23 06:22:46", 139.22, 85160.5],
["CCCAGG", "BTC", "2019-06-23 06:23:23", 10580.4, 79049.8],
["CCCAGG", "ETH", "2019-06-23 06:23:20", 306.74, 681056],
["CCCAGG", "LTC", "2019-06-23 06:23:24", 139.71, 752875],
["Coinbase", "BTC", "2019-06-23 06:23:17", 10557.5, 23731.2],
["Coinbase", "ETH", "2019-06-23 06:23:11", 306.09, 247213],
["Coinbase", "LTC", "2019-06-23 06:23:13", 139.49, 381421],
], columns=["Exchange", "coin", "lastUpdate", "price", "volume"])
# Print all combinations for one coin
def print_combi(df, coin):
    # subset dataframe with matching rows
    sub_df = df[df["coin"] == coin]
    # Create all combinations for the exchange columns
    list_combi = list(iter.combinations(sub_df.Exchange, 2))
    # Print the expected output
    for combi in list_combi:
        print("Price diff: {0} - {1}-{2}".format(coin, combi[0], combi[1]))
print_combi(df, 'BTC')
# Price diff: BTC - Bitfinex-Bitstamp
# Price diff: BTC - Bitfinex-CCCAGG
# Price diff: BTC - Bitfinex-Coinbase
# Price diff: BTC - Bitstamp-CCCAGG
# Price diff: BTC - Bitstamp-Coinbase
# Price diff: BTC - CCCAGG-Coinbase
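A quick illustration of why combinations fits here: it yields each unordered pair once, so four exchanges give six pairs, whereas permutations would also yield the mirrored pairs. A small sketch using only the standard library, with the exchange names from the question:

```python
from itertools import combinations, permutations

exchanges = ["Bitfinex", "Bitstamp", "CCCAGG", "Coinbase"]

# Unordered pairs: (A, B) appears, (B, A) does not
pairs = list(combinations(exchanges, 2))
# Ordered pairs: both (A, B) and (B, A) appear
ordered_pairs = list(permutations(exchanges, 2))

print(len(pairs), len(ordered_pairs))  # 6 12
```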
EDIT1:
Returning a dataframe. The diff column is computed from the data used in the snippet above.
def combo_money_df(df, coin):
    # subset the dataframe
    sub_df = df[df["coin"] == coin]
    new_data = []
    # For each pair of row labels in the subset
    for combi in iter.combinations(sub_df.index, 2):
        # Select corresponding rows
        row_1 = sub_df.loc[combi[0]]
        row_2 = sub_df.loc[combi[1]]
        # Create new rows
        new_data.append([row_1.Exchange + "-" + row_2.Exchange, row_1.price - row_2.price])
    # Return a dataframe object
    return pd.DataFrame(new_data, columns=["price_combo", "diff"])
print(combo_money_df(df, "BTC"))
# price_combo diff
# 0 Bitfinex-Bitstamp 99.6
# 1 Bitfinex-CCCAGG 65.6
# 2 Bitfinex-Coinbase 88.5
# 3 Bitstamp-CCCAGG -34.0
# 4 Bitstamp-Coinbase -11.1
# 5 CCCAGG-Coinbase 22.9
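The question also asks to repeat this for every coin. One possible way, sketched here under the assumption that the columns are named as in the question, is to group by coin and apply the same combinations logic to each group. The function name `all_coin_diffs` and the small `prices` frame are made up for illustration.

```python
import itertools
import pandas as pd

def all_coin_diffs(df):
    """Build one row per coin and exchange pair, for every coin in df."""
    rows = []
    # groupby yields one sub-frame per coin
    for coin, group in df.groupby("coin"):
        # itertuples gives lightweight row objects with attribute access
        for a, b in itertools.combinations(group.itertuples(index=False), 2):
            rows.append({
                "coin": coin,
                "price_combo": f"{a.Exchange}-{b.Exchange}",
                # Round to hide floating-point noise in the subtraction
                "diff": round(a.price - b.price, 2),
            })
    return pd.DataFrame(rows, columns=["coin", "price_combo", "diff"])

# Hypothetical sample: a few BTC and ETH rows from the question
prices = pd.DataFrame(
    [
        ["Bitfinex", "BTC", 10646.0],
        ["Bitstamp", "BTC", 10546.4],
        ["Bitfinex", "ETH", 308.47],
        ["Bitstamp", "ETH", 305.15],
        ["CCCAGG", "ETH", 306.74],
    ],
    columns=["Exchange", "coin", "price"],
)
diffs = all_coin_diffs(prices)
print(diffs)
```

A coin listed on only one exchange simply produces no rows, since there is no pair to compare.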