I have a DataFrame like this:
Column_A
1. A lot of text inhere, but I want all words that have a comma in the middle. Like this: hello,world. A string can contain multiple relevant words, like hello,python and we have also many whit spaces in the text
2. What I want is to abstract,all words with that pattern. Not sure if it has an impact, but some parts of the strings containing "this signs". or "this,signs" thanks for helpingme greets!
Desired result:
hello,world
hello,python
abstract,all
"this,signs"
I tried to do this with the following code:
df['B'] = df['Column_A'].str.findall(r',').str.join(' ').str.strip()
But that does not give me the result I want.
Answer 0 (score: 3)
Given the specific format of the expected output, it looks like you could use:
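import pandas as pd
from itertools import chain
# findall returns one list of matches per row; chain flattens them into one sequence
l = chain.from_iterable(df.Column_A.str.findall(r'\w+,\w+').values.tolist())
pd.DataFrame(list(l), columns=['Column_A'])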
Column_A
0 hello,world
1 hello,python
2 abstract,all
3 this,signs
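For anyone who wants to reproduce this end to end, here is a minimal self-contained sketch. It assumes pandas 0.25 or later (for `Series.explode`) and uses abbreviated versions of the two sample rows from the question. The names `matches`, `result`, and `quoted` are only illustrative. The second pattern, which keeps optional surrounding double quotes, is not part of the answer above; it is one possible way to get `"this,signs"` with its quotes, as in the desired result.

```python
import pandas as pd

# Abbreviated sample data modelled on the two rows in the question.
df = pd.DataFrame({
    "Column_A": [
        "Like this: hello,world. A string can contain multiple relevant "
        "words, like hello,python and we have many spaces",
        'What I want is to abstract,all words. Some parts contain '
        '"this signs". or "this,signs" thanks',
    ]
})

# findall yields a list of matches per row; explode puts each match on its own row.
# dropna discards rows that had no match at all.
matches = (
    df["Column_A"]
    .str.findall(r"\w+,\w+")
    .explode()
    .dropna()
    .reset_index(drop=True)
)
result = matches.to_frame("Column_A")
print(result)

# Assumed variant: also capture a double quote directly before or after the match.
quoted = (
    df["Column_A"]
    .str.findall(r'"?\w+,\w+"?')
    .explode()
    .dropna()
    .tolist()
)
print(quoted)
```

The original attempt with `r','` matches only the comma characters themselves, which is why it returned nothing useful; `\w+,\w+` requires word characters on both sides of the comma, so a comma followed by a space (as in "words, like") is skipped.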