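I am trying to extract values from one column based on another column in pandas. For example, suppose I have a DataFrame with two columns, as shown below: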
>>> check
  child parent
0     b      a
1     c      a
2     d      b
3     e      d
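Now I want to extract all the values in the "child" column that descend from a given value in the "parent" column. My starting value may vary; for now, suppose it is "a" in the "parent" column.
The length of the DataFrame may vary as well.
I tried the code below, but it does not work when there are more matching values and the DataFrame is longer: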
import pandas as pd

check = pd.read_csv("Book2.csv",encoding='cp1252')
# direct children of the starting value 'a'
new = (check.loc[check['parent'] == 'a', 'child']).tolist()
len(new)
a=[]
a.append(new)
for i in range(len(new)):
    new[i]
    # children of each direct child (second level)
    new1 = (check.loc[check['parent'] == new[i], 'child']).tolist()
    len(new1)
    if(len(new1)>0):
        a.append(new1)
        for i in range(len(new1)):
            # children of the second level (third level); nothing deeper is searched
            new2 = (check.loc[check['parent'] == new1[i], 'child']).tolist()
            if(len(new1)>0):
                a.append(new2)
flat_list = [item for sublist in a for item in sublist]
>>> flat_list
['b', 'c', 'd', 'e']
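Is there an efficient way to get the desired result? Any help would be much appreciated. Please advise.
Answer 0: (score: 2)
Recursion is one way to do this. Assuming check is your DataFrame, define a recursive function: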
final = [] #empty list which is used to store all results
def getchilds(df, res, value):
    where = df['parent'].isin([value]) #check rows where parent is equal to value
    newvals = list(df['child'].loc[where]) #get the corresponding child values
    if len(newvals) > 0:
        res.extend(newvals)
        for i in newvals: #recursive calls using child values
            getchilds(df, res, i)
getchilds(check, final, 'a')
print(final)
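print(final) prints ['b', 'c', 'd', 'e'] if check is your example DataFrame.
This works as long as you have no cyclic relationships, such as 'b' being a child of 'a' while 'a' is also a child of 'b'. In that case you need to add further checks to prevent infinite recursion.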
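One possible form of such a check is to remember every value that has already been visited. The sketch below is only an illustration, not the answer's own code: the function name get_descendants and the seen set are hypothetical, and it works on two plain sequences instead of a DataFrame so that it runs without pandas.

```python
def get_descendants(children, parents, start):
    """Collect every descendant of `start`, visiting each value at most once."""
    pairs = list(zip(children, parents))
    found = []      # descendants in the order they are discovered
    seen = {start}  # values already visited, so a cycle cannot recurse forever

    def visit(value):
        for child, parent in pairs:
            if parent == value and child not in seen:
                seen.add(child)
                found.append(child)
                visit(child)

    visit(start)
    return found


# The example data from the question, as two plain lists.
child = ['b', 'c', 'd', 'e']
parent = ['a', 'a', 'b', 'd']
print(get_descendants(child, parent, 'a'))

# A cyclic relationship ('a' is also a child of 'b') still terminates.
print(get_descendants(['b', 'a'], ['a', 'b'], 'a'))
```

With the DataFrame from the question, the call would presumably be get_descendants(check['child'], check['parent'], 'a'), since iterating over a pandas Series yields its values. Note that this sketch walks depth-first, so the order of the results can differ from the order getchilds produces.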
Answer 1: (score: 0)
Build a dictionary that maps each parent value to the list of its unique children:
out_dict = {}
for v in pd.unique(check['parent']):
    out_dict[v] = list(pd.unique(check['child'][check['parent']==v]))
out_dict then holds the direct children of each parent value.
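A dictionary like this holds only the direct children of each parent, so reaching grandchildren and deeper levels still needs a traversal. The following is a minimal sketch of one way to do that, assuming the mapping built for the example data looks like the literal below; the function name descendants_from_dict is hypothetical.

```python
from collections import deque

# Assumed contents of out_dict for the example data (parent -> direct children).
out_dict = {'a': ['b', 'c'], 'b': ['d'], 'd': ['e']}


def descendants_from_dict(mapping, start):
    """Breadth-first walk over a parent -> children mapping."""
    result = []
    seen = {start}          # guards against cyclic parent/child links
    queue = deque([start])
    while queue:
        current = queue.popleft()
        # values that never appear as a parent simply have no entry
        for child in mapping.get(current, []):
            if child not in seen:
                seen.add(child)
                result.append(child)
                queue.append(child)
    return result


print(descendants_from_dict(out_dict, 'a'))  # ['b', 'c', 'd', 'e']
```

Because each lookup is a dictionary access, the DataFrame is scanned only while the mapping is built, not once per visited value.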
Answer 2: (score: 0)
Let me guess that you want to get all values of the child column whose parent value is x:
import pandas as pd
def get_x_values_of_y(comparison_val, df, val_type="get_parent"):
    # column to read the results from, and column to compare against
    val_to_be_found = ["child","parent"][val_type=="get_parent"]
    val_existing = ["child","parent"][val_type != "get_parent"]
    # rows where the existing column equals comparison_val
    mask_value = df[val_existing] == comparison_val
    to_be_found_column = df[mask_value][val_to_be_found]
    unique_results = to_be_found_column.unique().tolist()
    return unique_results

check = pd.read_csv("Book2.csv",encoding='cp1252')
# to get results of all parents of child "a"
print(get_x_values_of_y("a", check))
# to get results of all children of parent "b"
print(get_x_values_of_y("b", check, val_type="get_child"))
# to get results of all parents of every child
list_of_all_children = check["child"].unique().tolist()
for each_child in list_of_all_children:
    print(get_x_values_of_y(each_child, check))
# to get results of all children of every parent
list_of_all_parents = check["parent"].unique().tolist()
for each_parent in list_of_all_parents:
    print(get_x_values_of_y(each_parent, check, val_type= "get_child"))
Hope this solves your problem.