This is my DataFrame:
Date Value
0 "date": "1999-01-01 "s1":3.0000}
1 "date": "1999-01-02 "s1":3.0000}
2 "date": "1999-01-03 "s1":3.0000}
3 "date": "1999-01-04 "s1":3.0000}
4 "date": "1999-01-05 "s1":3.0000}
I want to transform this DataFrame so that it looks like this:
Date Value
1999-01-01 3
1999-01-02 3
1999-01-03 3
1999-01-04 3
1999-01-05 3
1999-01-06 3
I tried:
cols = ['Date', 'Value']
for col in cols:
    DataAll[col] = DataAll[col].map(lambda x: str(x).lstrip('{}').rstrip('"date:")({)(:)(s1)(})'))
If anyone has a solution for this, please help. I have already spent a lot of time on it, but I have not found a clean solution.
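Answer 0 (score: 2)
You can chain the string methods: first strip the {} characters, then split on :, select the second element of each resulting list, and finally strip the surrounding " characters and spaces: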
cols = ['Date', 'Value']
f = lambda x: x.astype(str).str.strip('{}').str.split(':').str[1].str.strip(' "')
DataAll[cols] = DataAll[cols].apply(f)
print (DataAll)
Date Value
0 1999-01-01 3.0000
1 1999-01-02 3.0000
2 1999-01-03 3.0000
3 1999-01-04 3.0000
4 1999-01-05 3.0000
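The chained approach above leaves both columns as strings. A minimal self-contained sketch follows, assuming the question's cells look like the two halves of a JSON record (the sample rows here are a reconstruction, not the asker's actual data); it applies the same chain and then converts the types so that Value becomes the integer 3 shown in the desired output:

```python
import pandas as pd

# Assumed reconstruction of the question's data: each cell holds one half
# of a JSON-like record, e.g. '{"date": "1999-01-01' and '"s1":3.0000}'.
DataAll = pd.DataFrame({
    'Date': ['{"date": "1999-01-01', '{"date": "1999-01-02'],
    'Value': ['"s1":3.0000}', '"s1":3.0000}'],
})

cols = ['Date', 'Value']

# Strip braces, split on ':', keep the part after the key, trim quotes/spaces.
clean = lambda s: s.astype(str).str.strip('{}').str.split(':').str[1].str.strip(' "')
DataAll[cols] = DataAll[cols].apply(clean)

# Convert the cleaned strings to proper dtypes.
DataAll['Date'] = pd.to_datetime(DataAll['Date'])
DataAll['Value'] = pd.to_numeric(DataAll['Value']).astype(int)

print(DataAll)
```

Note that splitting on : only works here because the dates contain no time component; a value such as 1999-01-01 12:00 would be cut at the first colon inside the time.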
If the column contains JSON, first convert the values to dictionaries in a list comprehension, then pass the list to the DataFrame constructor:
print (DataAll)
json_col
0 {"date": "1999-01-01","s1":3.0000}
1 {"date": "1999-01-02","s1":3.0000}
2 {"date": "1999-01-03","s1":3.0000}
3 {"date": "1999-01-04","s1":3.0000}
4 {"date": "1999-01-05","s1":3.0000}
import ast
import pandas as pd

# Parse each string into a dict, then build a DataFrame from the list of dicts.
DataAll1 = pd.DataFrame([ast.literal_eval(x) for x in DataAll['json_col']])
print(DataAll1)
date s1
0 1999-01-01 3.0
1 1999-01-02 3.0
2 1999-01-03 3.0
3 1999-01-04 3.0
4 1999-01-05 3.0
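If the cells are strictly valid JSON, the standard-library json module may be a more natural parser than ast.literal_eval. The sketch below is an alternative under that assumption, not part of the original answer; the two sample rows and the final rename to Date and Value are illustrative:

```python
import json
import pandas as pd

# Illustrative sample with the same shape as the json_col column above.
DataAll = pd.DataFrame({
    'json_col': [
        '{"date": "1999-01-01","s1":3.0000}',
        '{"date": "1999-01-02","s1":3.0000}',
    ]
})

# json.loads turns each JSON string into a dict.
records = [json.loads(x) for x in DataAll['json_col']]

# Build the frame and rename the keys to the column names the question asks for.
DataAll1 = pd.DataFrame(records).rename(columns={'date': 'Date', 's1': 'Value'})
print(DataAll1)
```

json.loads raises json.JSONDecodeError on malformed input, so it will not work on cells that hold only half of a record, as in the question's original Date and Value columns.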
Answer 1 (score: 1)
You can simply find the string between ':' and '.', as follows:
import pandas as pd

pan = pd.DataFrame({'date': ["1999-01-01", "1999-01-02","1999-01-03","1999-01-04","1999-01-05"], 'Value': ['"s1":3.0000', '"s1":3.0000', '"s1":3.0000', '"s1":3.0000', '"s1":3.0000']})

def find_between(s, first, last):
    # Return the text between the first occurrence of `first` and the next `last`.
    try:
        start = s.index(first) + len(first)
        end = s.index(last, start)
        return s[start:end]
    except ValueError:
        # One of the delimiters was not found.
        return ""

for index, row in pan.iterrows():
    print(row['date'], find_between(row['Value'], ':', '.'))
The find_between function returns the substring of s that lies between first and last.
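The same "text between ':' and '.'" idea can also be written without a row-by-row loop. This is a hedged sketch using pandas' Series.str.extract with a regular expression; the pattern and the shortened two-row sample are assumptions for illustration, and it stores the result back in the column rather than only printing it:

```python
import pandas as pd

pan = pd.DataFrame({
    'date': ["1999-01-01", "1999-01-02"],
    'Value': ['"s1":3.0000', '"s1":3.0000'],
})

# Capture the digits that sit between ':' and the decimal point.
pan['Value'] = pan['Value'].str.extract(r':(\d+)\.', expand=False).astype(int)

print(pan)
```

Rows that do not match the pattern become NaN after str.extract, in which case the astype(int) call raises an error; pd.to_numeric with errors='coerce' would be one way to handle such rows.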