I have a list of lists, mylist, as shown below.
mylist = [[9, ["nuts", "fruits"]], [12, ["france", "italy", "rome", "paris"]], [18, ["cat", "dog", "parrot", "rabbit", "cow"]], [19, ["ebay", "wish"]]]
I also have a TSV file, myinput, that looks like this.
ID Category
12 ["places", "locations"]
19 ["online", "customer"]
18 ["pets"]
9 ["food"]
I want to combine the two by ID, as follows.
my_output = [[9, ["nuts", "fruits"], ["food"]], [12, ["france", "italy", "rome", "paris"], ["places", "locations"]], [18, ["cat", "dog", "parrot", "rabbit", "cow"], ["pets"]], [19, ["ebay", "wish"], ["online", "customer"]]]
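Initially, I thought of converting the TSV file to a list, as shown below, and then doing ordinary list processing in Python.
import pandas as pd
input_data = pd.read_csv("myinput.tsv", header=0, delimiter="\t", quoting=3 )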
my_data = input_data.values.tolist()
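For reference, the list-based route could be sketched roughly as below. It assumes that, because of quoting=3, each Category cell arrives as a raw string such as '["pets"]' and has to be parsed. The hard-coded my_data and the categories name are illustrative stand-ins, not taken from a real file.

```python
import json

mylist = [[9, ["nuts", "fruits"]], [12, ["france", "italy", "rome", "paris"]],
          [18, ["cat", "dog", "parrot", "rabbit", "cow"]], [19, ["ebay", "wish"]]]

# Assumed shape of input_data.values.tolist(): the Category cells are
# still plain strings at this point, not Python lists.
my_data = [[12, '["places", "locations"]'], [19, '["online", "customer"]'],
           [18, '["pets"]'], [9, '["food"]']]

# Build an ID -> parsed category list lookup.
categories = {row_id: json.loads(raw) for row_id, raw in my_data}

# Append the matching categories to each entry, keeping mylist's order.
my_output = [[item_id, items, categories[item_id]] for item_id, items in mylist]
print(my_output)
```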
However, I would like to know whether the same thing can be done with pandas.
I am happy to provide more details if needed.
Answer 0 (score: 2)
Use map to look up each ID in a dict built from mylist.
For example:
import pandas as pd
mylist = [[9, ["nuts", "fruits"]], [12, ["france", "italy", "rome", "paris"]], [18, ["cat", "dog", "parrot", "rabbit", "cow"]], [19, ["ebay", "wish"]]]
d = dict(mylist)
df = pd.DataFrame({"ID": [12, 19, 18, 9],
"Category": [["places", "locations"], ["online", "customer"], ["pets"], ["food"]]})
df["D"] = df["ID"].map(d)
print(df)
Output:
ID Category D
0 12 [places, locations] [france, italy, rome, paris]
1 19 [online, customer] [ebay, wish]
2 18 [pets] [cat, dog, parrot, rabbit, cow]
3 9 [food] [nuts, fruits]
print(list(zip(df.ID, df.Category, df.D)))
Output:
[(12, ['places', 'locations'], ['france', 'italy', 'rome', 'paris']), (19, ['online', 'customer'], ['ebay', 'wish']), (18, ['pets'], ['cat', 'dog', 'parrot', 'rabbit', 'cow']), (9, ['food'], ['nuts', 'fruits'])]
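The example above builds the DataFrame by hand and returns tuples in file order. A minimal sketch of the same map approach, starting from the TSV text and ending in the layout the question asks for, might look like this. It assumes the file is tab-separated and that every Category cell is valid JSON. The tsv_text string is a stand-in for myinput.tsv.

```python
import io
import json

import pandas as pd

mylist = [[9, ["nuts", "fruits"]], [12, ["france", "italy", "rome", "paris"]],
          [18, ["cat", "dog", "parrot", "rabbit", "cow"]], [19, ["ebay", "wish"]]]

# Stand-in for myinput.tsv (assumed to be tab-separated).
tsv_text = (
    'ID\tCategory\n'
    '12\t["places", "locations"]\n'
    '19\t["online", "customer"]\n'
    '18\t["pets"]\n'
    '9\t["food"]\n'
)

df = pd.read_csv(io.StringIO(tsv_text), sep="\t", quoting=3)

# Category is read as text; json.loads turns each cell into a real list.
df["Category"] = df["Category"].map(json.loads)

# Look up each ID in a dict built from mylist.
df["D"] = df["ID"].map(dict(mylist))

# Sort by ID and reorder the columns to get [ID, items, categories] rows.
my_output = df.sort_values("ID")[["ID", "D", "Category"]].values.tolist()
print(my_output)
```

Any ID in the file that is missing from mylist would end up as NaN in column D, so a real script may want to check for that before converting back to lists.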
Answer 1 (score: 1)
I like using pd.merge(); I find it flexible and explicit.
Using your integer column as the index is optional, but I think it makes sense here, so I added it.
import pandas as pd
l = [[1, ['first', 'F']], [2, ['second', 'S']], [3, ['third']]]
dfl = pd.DataFrame(l)
dfl.set_index(0, inplace=True)
dfl.columns=['l']
r = [[1, ['first', 'F']], [4, ['fourth', 'F']], [3, ['third']]]
dfr = pd.DataFrame(r)
dfr.set_index(0, inplace=True)
dfr.columns=['r']
dfc = pd.merge(dfl, dfr, left_index=True, right_index=True, how='outer') # change how to meet your needs
print(dfc)
lc = dfc.reset_index().values.tolist()
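Applied to the data from the question, the same merge idea could be sketched as follows. This variant joins on an ID column instead of the index. The column names items and Category, and the hand-written categories list standing in for the parsed TSV, are assumptions made for illustration.

```python
import pandas as pd

mylist = [[9, ["nuts", "fruits"]], [12, ["france", "italy", "rome", "paris"]],
          [18, ["cat", "dog", "parrot", "rabbit", "cow"]], [19, ["ebay", "wish"]]]

# Stand-in for the parsed contents of myinput.tsv.
categories = [[12, ["places", "locations"]], [19, ["online", "customer"]],
              [18, ["pets"]], [9, ["food"]]]

left = pd.DataFrame(mylist, columns=["ID", "items"])
right = pd.DataFrame(categories, columns=["ID", "Category"])

# An inner join keeps only IDs present on both sides; change how to meet your needs.
merged = pd.merge(left, right, on="ID", how="inner").sort_values("ID")

my_output = merged.values.tolist()
print(my_output)
```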