One column of my DataFrame holds values in the following format:
Row 1 :
Counter({'First': 3, 'record': 2})
Row 2 :
Counter({'Second': 2, 'record': 1})
I want to create a new column with the following values:
Row 1 :
First First First record record
Row 2 :
Second Second record
Answer 0 (score: 1)
Use apply to iterate over the items of each Counter: first repeat each key as many times as its count, then join everything with spaces:
import ast
# Parse the "Counter({...})" strings into plain dictionaries
df['col'] = df['col'].str.extract(r'\((.+)\)', expand=False).apply(ast.literal_eval)
# Repeat each key by its count, then join all of the words with spaces
df['new'] = df['col'].apply(lambda x: ' '.join(' '.join([k] * v) for k, v in x.items()))
print(df)
                          col                              new
0   {'First': 3, 'record': 2}  First First First record record
1  {'Second': 2, 'record': 1}             Second Second record
Or use a list comprehension:
df['new'] = [' '.join(' '.join([k] * v) for k, v in x.items()) for x in df['col']]
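If the column holds real collections.Counter objects rather than their string representation, the parsing step is not needed: Counter.elements() already yields each key repeated by its count. The following is a minimal sketch using only the standard library; the names rows and expand_counter are made up for illustration and are not part of the original answer.

```python
from collections import Counter

# Hypothetical sample data standing in for the DataFrame column.
rows = [
    Counter({'First': 3, 'record': 2}),
    Counter({'Second': 2, 'record': 1}),
]

def expand_counter(counter):
    """Join every key, repeated by its count, into one space-separated string."""
    # elements() yields each key as many times as its count,
    # in the order the keys were first inserted.
    return ' '.join(counter.elements())

expanded = [expand_counter(c) for c in rows]
for line in expanded:
    print(line)
```

With pandas, the same helper could presumably be applied as df['col'].apply(expand_counter), assuming the column really contains Counter objects.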
Answer 1 (score: 1)
I managed to solve the problem myself with the following code. It relies heavily on regular expressions.
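import re

def transform_word_count(text):
    # Quoted keys, e.g. 'First'
    words = re.findall(r"'(.+?)'", text)
    # Counts; \d+ (rather than [0-9]) keeps multi-digit counts such as 12 whole
    n = re.findall(r"\d+", text)
    result = []
    for word, count in zip(words, n):
        result.extend([word] * int(count))
    # Returns a list; use ' '.join(result) for a space-separated string
    return result

df['new'] = df['old'].apply(transform_word_count)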
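Matching the words and the counts in two separate passes can misalign them, for example when a key itself contains a digit. One possible variation, sketched here under the assumption that the text always looks like Counter({'key': count, ...}), captures each key together with its count in a single pattern. The names PAIR_PATTERN and expand_counter_text are illustrative only.

```python
import re

# One pattern captures a quoted key and the integer count that follows it.
PAIR_PATTERN = re.compile(r"'(.+?)':\s*(\d+)")

def expand_counter_text(text):
    """Expand "Counter({'a': 2, ...})" text into a space-separated string."""
    parts = []
    for word, count in PAIR_PATTERN.findall(text):
        # Repeat the word as many times as its count says.
        parts.extend([word] * int(count))
    return ' '.join(parts)

# The key 'mp3' contains a digit and 12 has two digits;
# both stay correctly paired with this pattern.
sample = "Counter({'First': 12, 'mp3': 2})"
expanded = expand_counter_text(sample)
print(expanded)
```

This sketch does not handle keys that contain quote characters; for such data, parsing with ast.literal_eval as in the other answer is likely the safer route.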