I have two pandas DataFrames in Python. DF A contains a column of sentence-length strings.
|---------------------|------------------|
| sentenceCol | other column |
|---------------------|------------------|
|'this is from france'| 15 |
|---------------------|------------------|
DF B contains a column listing countries.
|---------------------|------------------|
| country | other column |
|---------------------|------------------|
|'france' | 33 |
|---------------------|------------------|
|'spain' | 34 |
|---------------------|------------------|
How can I iterate over DF A and assign the country that each string contains? This is how I imagine DF A looking after the assignment:
|---------------------|------------------|-----------|
| sentenceCol | other column | country |
|---------------------|------------------|-----------|
|'this is from france'| 15 | 'france' |
|---------------------|------------------|-----------|
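An additional complication is that a sentence can contain more than one country, so ideally every matching country would be assigned to that sentence, one row per country.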
|-------------------------------|------------------|-----------|
| sentenceCol | other column | country |
|-------------------------------|------------------|-----------|
|'this is from france and spain'| 16 | 'france' |
|-------------------------------|------------------|-----------|
|'this is from france and spain'| 16 | 'spain' |
|-------------------------------|------------------|-----------|
Answer 0: (score: 0)
You can iterate over a DataFrame with the iterrows() method. You could try the following:
import pandas as pd

# DataFrame definitions
df_1 = pd.DataFrame({"sentence": ["this is from france and spain", "this is from france", "this is from germany"], "other": [15, 12, 33]})
df_2 = pd.DataFrame({"country": ["spain", "france", "germany"], "other_column": [7, 7, 8]})

# Create the new, empty DataFrame that will collect the matches
df_3 = pd.DataFrame(columns=["sentence", "other_column", "country"])
count = 0

# Iterate over the country DataFrame first, then over the sentence DataFrame inside it.
for index, row in df_2.iterrows():
    country = row.country
    for index_2, row_2 in df_1.iterrows():
        # Plain substring test: the country name appears somewhere in the sentence
        if country in row_2.sentence:
            df_3.loc[count] = (row_2.sentence, row_2.other, country)
            count += 1
So the output (the contents of df_3) is:
                        sentence  other_column  country
0  this is from france and spain            15    spain
1  this is from france and spain            15   france
2            this is from france            12   france
3           this is from germany            33  germany
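The nested iterrows() loops compare every sentence with every country, which can get slow on large frames. As a possible alternative (not part of the answer above), here is a minimal sketch that uses Series.str.findall with a single regular expression, followed by DataFrame.explode to get one row per matched country. The names df_a, df_b, pattern and result are illustrative, the data mirrors the tables in the question, and the sketch assumes whole-word matching is wanted, which is stricter than the substring test used above.

```python
import re

import pandas as pd

# Sample data modelled on the question's tables (names are illustrative).
df_a = pd.DataFrame({
    "sentenceCol": ["this is from france and spain", "this is from france"],
    "other column": [16, 15],
})
df_b = pd.DataFrame({"country": ["france", "spain"], "other column": [33, 34]})

# One alternation pattern for all countries; re.escape guards special characters,
# and \b restricts matches to whole words (an assumption, not in the original).
pattern = r"\b(?:" + "|".join(re.escape(c) for c in df_b["country"]) + r")\b"

result = (
    df_a.assign(country=df_a["sentenceCol"].str.findall(pattern))  # list of matches per row
    .explode("country")           # one row per matched country
    .dropna(subset=["country"])   # drop sentences that matched no country
    .reset_index(drop=True)
)

print(result)
```

Countries appear in the order they occur within each sentence, and sentences that mention no listed country are dropped; remove the dropna step if those rows should be kept with a missing country instead.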