The goal is to create new columns in a DataFrame from the value a function returns for each row of an existing column.
df = pd.DataFrame({"A": ["The quick brown fox jumps over the lazy dog", "Glib jocks quiz nymph to vex dwarf"], "B": [10, 20]})
A B
0 The quick brown fox jumps over the lazy dog 10
1 Glib jocks quiz nymph to vex dwarf 20
There is an existing method:
def returnTopic(model, query, numberOftopics):
    # strip out topics per query/row and return topics that are relevant to a query/row
    return topicDict
topicDict contains {'x': ['fox','brown'], 'y':['jumps','over','the']}
I want to create two new columns from the elements of this returned dictionary.
A B x y
0 The quick brown fox jumps over the lazy dog 10 ['fox','brown'] ['jumps','over','the']
1 Glib jocks quiz nymph to vex dwarf 20
This is my attempt:
df['x'] = df.apply(lambda x: returnTopic(tmodel['x'], x['A'], 2))
Answer 0 (score: 1)
You can build a new DataFrame from the returned records and concatenate it with the original DataFrame.
Like this:
import pandas as pd
df = pd.DataFrame({"A": ['The quick brown fox jumps over the lazy dog',
                         'Glib jocks quiz nymph to vex dwarf'], "B": [10, 20]})
def f(text, something_else):
    # Placeholder for the real function: returns one dict per row
    return {'x': len(text), 'y': text.count(' ')}
# One dict per row; each dict key becomes a column of the new DataFrame
records = df['A'].apply(lambda x: f(x, 0)).tolist()
new_df = pd.concat([df, pd.DataFrame.from_records(records)], axis=1)
print(new_df)
This returns:
A B x y
0 The quick brown fox jumps over the lazy dog 10 43 8
1 Glib jocks quiz nymph to vex dwarf 20 34 6
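The same pattern should carry over to a function that returns a dictionary of lists, as in the question. The sketch below is only an illustration: return_topic_stub and TOPIC_WORDS are hypothetical stand-ins for the asker's returnTopic and model, which are not shown in the thread, and the frame is assumed to have its default 0..n-1 index so that pd.concat lines the rows up.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["The quick brown fox jumps over the lazy dog",
          "Glib jocks quiz nymph to vex dwarf"],
    "B": [10, 20],
})

# Hypothetical vocabulary standing in for the real topic model.
TOPIC_WORDS = {"x": ["fox", "brown"], "y": ["jumps", "over", "the"]}

def return_topic_stub(text):
    """Hypothetical stand-in for returnTopic: keep the topic words found in text."""
    words = text.split()
    return {key: [w for w in vocab if w in words]
            for key, vocab in TOPIC_WORDS.items()}

# One dict per row; the dict keys become the new column names.
records = df["A"].apply(return_topic_stub).tolist()
topics = pd.DataFrame.from_records(records)

# Both frames share the default integer index, so the rows align.
result = pd.concat([df, topics], axis=1)
print(result)
```

If the original frame has a non-default index, the new frame would need the same index before concatenating, otherwise the rows would not align.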
Answer 1 (score: 1)
Have your function return a pd.Series object:
def foo(x):
    ...
    return pd.Series(topicDict)
Now call apply along the first axis (axis=1, i.e. once per row):
v = df.apply(foo, 1)
v
x y
0 [fox, brown] [jumps, over, the]
1 [fox, brown] [jumps, over, the]
Use pd.concat to join the result with the original DataFrame.
pd.concat([df, v], axis=1)
A B x \
0 The quick brown fox jumps over the lazy dog 10 [fox, brown]
1 Glib jocks quiz nymph to vex dwarf 20 [fox, brown]
y
0 [jumps, over, the]
1 [jumps, over, the]
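For reference, here is a self-contained sketch of this approach. The body of foo is a hypothetical placeholder that returns the same fixed dictionary for every row, mirroring the output above; the real function would call returnTopic on row["A"]. It uses df.join instead of pd.concat, which is an equivalent way to attach the new columns by index.

```python
import pandas as pd

df = pd.DataFrame({
    "A": ["The quick brown fox jumps over the lazy dog",
          "Glib jocks quiz nymph to vex dwarf"],
    "B": [10, 20],
})

def foo(row):
    # Hypothetical placeholder: the real code would compute this from row["A"].
    topic_dict = {"x": ["fox", "brown"], "y": ["jumps", "over", "the"]}
    # The Series index ("x", "y") becomes the column names after apply.
    return pd.Series(topic_dict)

# axis=1 passes each row to foo; Series results are stacked into a DataFrame.
v = df.apply(foo, axis=1)

# v keeps df's index, so join attaches the columns row by row.
result = df.join(v)
print(result)
```

Because v inherits the index of df, this variant also works when df does not have the default integer index.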