I have a DataFrame built from a bunch of .json files, and, as is typical for .json, the data in one column comes as lists.
Data:
'id' 'col1'
1a [student,teacher,missing data]
22 [teacher]
de6 []
34 [missing data, teacher]
de2 []
de5 []
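How can I replace 'missing data' in this column?
If it were a plain string I could remove 'missing data', but I can't remove it when it is inside the [ ] list format. I don't know why.
I tried the combinations below. I get no errors, but nothing gets replaced.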
# df['col1'].replace('[]','')
# df['col1'].replace('\[\]','')
# df['col1'].replace('\[','')
# df['col1'].replace('\]','')
df['col1'].replace('missing data','')
This is the output I'm looking for. Is it possible?
'id' 'col1'
1a [student,teacher]
22 [teacher]
de6 []
34 [teacher]
de2 []
de5 []
Answer 0 (score: 0)
You can filter out the elements equal to 'missing data':
import pandas as pd
data = [['1a', ['student', 'teacher', 'missing data']],
['22', ['teacher']],
['de6', []],
['34', ['missing data', 'teacher']],
['de2', []],
['de5', []]]
df = pd.DataFrame(data=data, columns=['id', 'col1'])
# Rebuild each list without 'missing data' and assign the result back to the column
df['col1'] = [[e for e in lst if e != 'missing data'] for lst in df.col1]
print(df)
Output
id col1
0 1a [student, teacher]
1 22 [teacher]
2 de6 []
3 34 [teacher]
4 de2 []
5 de5 []
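As for why the replace attempts in the question appear to do nothing: with a plain string argument, Series.replace compares each whole cell against that value, and here every cell is a Python list, so no cell ever equals 'missing data' or '[]'. The calls also return a new Series that is never assigned back. A minimal sketch of the same filtering written with apply is below; drop_value is a hypothetical helper name, and leaving non-list cells (for example None from a file that lacked the field) untouched is an assumption about how you would want such rows handled.

```python
import pandas as pd

def drop_value(cell, unwanted="missing data"):
    """Return a copy of a list cell without `unwanted` (hypothetical helper)."""
    # Assumption: non-list cells (e.g. None) are passed through unchanged.
    if not isinstance(cell, list):
        return cell
    return [e for e in cell if e != unwanted]

df = pd.DataFrame({
    "id": ["1a", "22", "de6", "34"],
    "col1": [
        ["student", "teacher", "missing data"],
        ["teacher"],
        [],
        ["missing data", "teacher"],
    ],
})

# apply returns a new Series, so assign it back to keep the change.
df["col1"] = df["col1"].apply(drop_value)
print(df)
```

The list comprehension in the answer and this apply version should give the same result on list-only columns; the helper may just be easier to extend if the column mixes lists with other values.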