I am trying to write code to process my data, and I want it to produce a specific output (shown at the end of this question). I import the data with the following code:
import csv
import itertools
import pandas as pd
input_file="computation.csv"
cmd=pd.read_csv(input_file)
subset = cmd[['Carbon A', 'Carbon B']]
carbon_pairs = [tuple(y) for y in subset.values]
c_pairs = carbon_pairs
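Note that for carbon 2, I want the output to repeat that it is connected to carbon 1. I was thinking that some kind of permutation could produce this, but I am not sure where to start. Basically, the code needs to output: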
1 is connected to
2
4
6
7
8
2 is connected to
1
4
5
Answer 0 (score: 1)
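You can get the desired output without the pandas dependency by using the following function (Python 2). It lets you pass in any filename and choose, by zero-based index, the two columns to query. This solution assumes the data is sorted as in the example you provided. Once the function below is defined, call it with your filename and the two column indices from your example.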
import csv

def printAdjacentNums(filename, firstIdx, secondIdx):
    with open(filename, 'rb') as csvfile:
        # handle header line
        header = next(csvfile)
        reader = csv.reader(csvfile)
        current_val = ''
        current_adj = []
        # dict of lists for lookback
        lookback = {}
        for row in reader:
            if current_val == '':
                current_val = row[firstIdx]
            if row[firstIdx] == current_val:
                current_adj.append(row[secondIdx])
            else:
                # check lookback
                for k, v in lookback.items():
                    if current_val in v:
                        current_adj.append(k)
                # print what we need to
                print current_val + ' is connected to'
                for i in current_adj:
                    print i
                # append current vals to lookback
                lookback[current_val] = current_adj
                # reassign
                current_val = row[firstIdx]
                current_adj = [row[secondIdx]]
        # print final set
        for k, v in lookback.items():
            if current_val in v:
                current_adj.append(k)
        print current_val + ' is connected to'
        for i in current_adj:
            print i
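For readers on Python 3, here is a rough sketch of the same lookback idea. It is not a line-for-line port: the names `adjacent_nums` and `print_adjacent_nums` are made up for illustration, the file is assumed to have a single header row, and unlike the function above it does not rely on sorted input and skips a neighbour that is already listed.

```python
import csv


def adjacent_nums(filename, first_idx, second_idx):
    """Map each value in column first_idx to the values it is paired with.

    A new key also picks up every earlier key whose list already
    mentions it (the "lookback" step).
    """
    adjacency = {}
    # newline='' is what the csv module recommends for files passed to csv.reader
    with open(filename, newline='') as csvfile:
        reader = csv.reader(csvfile)
        next(reader, None)  # skip the header row (assumed to be present)
        for row in reader:
            key, other = row[first_idx], row[second_idx]
            if key not in adjacency:
                # lookback: earlier keys that already listed this value
                adjacency[key] = [k for k, v in adjacency.items() if key in v]
            if other not in adjacency[key]:
                adjacency[key].append(other)
    return adjacency


def print_adjacent_nums(filename, first_idx, second_idx):
    """Print the mapping in the 'X is connected to' layout from the question."""
    for key, neighbours in adjacent_nums(filename, first_idx, second_idx).items():
        print(key + ' is connected to')
        for n in neighbours:
            print(n)
```

Assuming the two carbon columns are the first two columns of the file, a call would look like `print_adjacent_nums('computation.csv', 0, 1)`. The values stay strings, exactly as `csv.reader` yields them.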
Answer 1 (score: 0)
Starting from the end of your question:
c_pairs = [(1, 2), (1, 4), (1, 6), (1, 7), (1, 8), (2, 1), (2, 4), (2, 5)]
You probably want to end up with something more like:
groups = {1: [2, 4, 6, 7, 8], 2: [1, 4, 5]}
There are many ways to get this.
If you know your data is already sorted, a very quick way is to use itertools.groupby, for example:
import itertools

# Key function: group each pair by its first element.
first_item = lambda pair: pair[0]
for key, items in itertools.groupby(c_pairs, first_item):
    print('%s is connected to' % key)
    for (a, b) in items:
        print(' %s' % b)
If your data is not sorted, this may still be the fastest way; just sort it first:
c_pairs = sorted(c_pairs, key=first_item)
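As a minimal Python 3 sketch of the sort-then-group approach, the snippet below collects the groups into a dictionary before printing. The shuffled sample list and the helper name `group_pairs` are made up for illustration, and `operator.itemgetter(0)` is used as the key function.

```python
import itertools
from operator import itemgetter

# Deliberately unsorted sample data (hypothetical, for illustration only).
c_pairs = [(2, 4), (1, 2), (1, 4), (2, 1), (1, 6), (2, 5), (1, 7), (1, 8)]


def group_pairs(pairs):
    """Sort pairs by their first item, then group the second items under it."""
    first_item = itemgetter(0)
    groups = {}
    # groupby only merges adjacent equal keys, hence the sort beforehand.
    for key, items in itertools.groupby(sorted(pairs, key=first_item), first_item):
        groups[key] = [b for _, b in items]
    return groups


groups = group_pairs(c_pairs)
for key, connected in groups.items():
    print('%s is connected to' % key)
    for b in connected:
        print(' %s' % b)
```

Because `sorted` is stable and the key only looks at the first item, the second items keep their original relative order within each group.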
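A more do-it-yourself solution is to use a defaultdict or a standard dictionary to build a mapping from each first item to the items it is connected to.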
import collections

groups = collections.defaultdict(list)
for a, b in c_pairs:
    groups[a].append(b)
The equivalent without collections:
groups = {}
for a, b in c_pairs:
    groups.setdefault(a, [])  # many ways to do this as well
    groups[a].append(b)
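The question also asks that carbon 2 list carbon 1 even when the pair only appears as (1, 2). One possible way to get that, sketched here in Python 3 under the assumption that every connection should be treated as two-way, is to record each pair in both directions. The helper name `build_connections` and the sample list (which leaves out (2, 1) on purpose) are hypothetical.

```python
import collections


def build_connections(pairs):
    """Record every pair in both directions and return sorted neighbour lists."""
    groups = collections.defaultdict(set)  # sets drop duplicate connections
    for a, b in pairs:
        groups[a].add(b)
        groups[b].add(a)
    return {key: sorted(values) for key, values in groups.items()}


# Sample data without the (2, 1) row, to show the reverse link being added.
c_pairs = [(1, 2), (1, 4), (1, 6), (1, 7), (1, 8), (2, 4), (2, 5)]
connections = build_connections(c_pairs)

for key in sorted(connections):
    print('%s is connected to' % key)
    for b in connections[key]:
        print(' %s' % b)
```

Note that this prints an entry for every carbon that appears anywhere in the data (4, 5, 6 and so on), not only for those in the first column, so filter the keys if you only need carbons 1 and 2.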