This is my dataset (a single column):
Apr 1 09:14:55 i have apple
Apr 2 08:10:10 i have mango
This is the result I need:
month date time message
Apr 1 09:14:55 i have apple
Apr 2 08:10:10 i have mango
This is what I did:
import pandas as pd
month = []
date = []
time = []
message = []
for line in dns_data:
    month.append(line.split()[0])
    date.append(line.split()[1])
    time.append(line.split()[2])
df = pd.DataFrame(data={'month': month, 'date':date, 'time':time})
This is the output I get:
month date time
0 Apr 1 09:14:55
1 Apr 2 08:10:10
How do I display the message column?
Answer 0 (score: 2)
Use the n parameter of Series.str.split to split only on the first three runs of whitespace, and expand=True to return a DataFrame:
print (df)
col
0 Apr 1 09:14:55 i have apple
1 Apr 2 08:10:10 i have mango
df1 = df['col'].str.split(n=3, expand=True)
df1.columns=['month','date','time','message']
print (df1)
month date time message
0 Apr 1 09:14:55 i have apple
1 Apr 2 08:10:10 i have mango
Another solution, using a list comprehension:
c = ['month','date','time','message']
df1 = pd.DataFrame([x.split(maxsplit=3) for x in df['col']], columns=c)
print (df1)
month date time message
0 Apr 1 09:14:55 i have apple
1 Apr 2 08:10:10 i have mango
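The same idea can be applied directly to the loop from the question. This is a minimal sketch, assuming dns_data is an iterable of strings (the question never shows how it is built, so the list below is a stand-in) and that every line has at least four whitespace-separated fields:

```python
import pandas as pd

# Stand-in for dns_data from the question (assumed to be an iterable of strings).
dns_data = [
    "Apr 1 09:14:55 i have apple",
    "Apr 2 08:10:10 i have mango",
]

month, date, time, message = [], [], [], []
for line in dns_data:
    # maxsplit=3 stops after the third split, so the rest of the line
    # (the message, spaces included) stays in one piece.
    m, d, t, msg = line.split(maxsplit=3)
    month.append(m)
    date.append(d)
    time.append(t)
    message.append(msg)

df = pd.DataFrame({"month": month, "date": date, "time": time, "message": message})
print(df)
```

Note that the tuple unpacking raises ValueError for a line with fewer than four fields, so malformed lines would need to be skipped or handled separately.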
Answer 1 (score: 2)
You can use Series.str.extractall with a regular expression:
df = pd.DataFrame({'text': {0: 'Apr 1 09:14:55 i have apple', 1: 'Apr 2 08:10:10 i have mango'}})
df_new = (df.text.str
.extractall(r'^(?P<month>\w{3})\s?(?P<date>\d{1,2})\s?(?P<time>\d{2}:\d{2}:\d{2})\s?(?P<message>.*)$')
.reset_index(drop=True))
print(df_new)
month date time message
0 Apr 1 09:14:55 i have apple
1 Apr 2 08:10:10 i have mango
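Since each row contains exactly one match, Series.str.extract may be a simpler fit here: it returns one row per input row and keeps the original index, so no reset_index is needed. The sketch below is a variation on the answer's code, not a replacement for it; it also uses \s+ instead of \s?, on the assumption that fields are always separated by at least one whitespace character:

```python
import pandas as pd

df = pd.DataFrame({"text": ["Apr 1 09:14:55 i have apple",
                            "Apr 2 08:10:10 i have mango"]})

# Named groups become the column names of the resulting DataFrame.
pattern = (r"^(?P<month>\w{3})\s+(?P<date>\d{1,2})\s+"
           r"(?P<time>\d{2}:\d{2}:\d{2})\s+(?P<message>.*)$")

# extract returns the first match per row; rows that do not match become NaN.
df_new = df["text"].str.extract(pattern)
print(df_new)
```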
Answer 2 (score: 0)
This may help you:
(?<Month>\w+)\s(?<Date>\d+)\s(?<Time>[\w:]+)\s(?<Message>.*)
Match 1
Month Apr
Date 1
Time 09:14:55
Message i have apple
Match 2
Month Apr
Date 2
Time 08:10:10
Message i have mango
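The pattern above uses the (?<name>...) form for named groups, which regex flavors such as .NET and PCRE accept. Python's re module spells named groups as (?P<name>...), so the pattern needs that small change before it can be used there. A minimal sketch with the standard library only (the lines list is sample data copied from the question):

```python
import re

# Same pattern as above, with named groups written in Python's (?P<name>...) form.
pattern = re.compile(
    r"(?P<Month>\w+)\s(?P<Date>\d+)\s(?P<Time>[\w:]+)\s(?P<Message>.*)"
)

lines = [
    "Apr 1 09:14:55 i have apple",
    "Apr 2 08:10:10 i have mango",
]

rows = []
for line in lines:
    match = pattern.match(line)
    if match:  # skip lines that do not fit the pattern
        rows.append(match.groupdict())

for row in rows:
    print(row)
```

The resulting list of dicts can be passed to pd.DataFrame(rows) to get the four columns.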