Is it possible to retrieve substrings from a complex string in Go?

Time: 2019-04-17 02:09:46

Tags: regex go

I ran into a problem using regular expressions, and there are two questions to solve, from simple to complex. The first is to use a regular expression to match a string; the second is to retrieve some substrings from the matched message.

For example, I have strings like these:

"In current chatting room: what do you eat for today? (This message is edited by Sharon, the message is sent on 2018-11-10 21:00:00 from Leon)"

"In current chatting room: Hey mate, do you like golang? (This message is edited by Leon, the message is sent on 2018-01-10 10:00:59 from Mike)"

In the messages above, some parts never change, such as "In current chatting room:" and "This message is edited by ..., the message is sent on ... from ...".

A message of this kind is considered an "Editing Notice", and I need to filter out all the messages that comply with this structure.

What I wrote is:

var testRgx = regexp.MustCompile(`^In current chatting room: .* \(This message is edited by .*, the message is sent on .* from .*\)$`)

I know it is a little crude, but at least it works.

When I run it, the result is true.

sample := "In current chatting room: what do you eat for today? I input some shit (sdfhjskdfjksljhfdsjkdf) can you detect this? (This message is edited by Sharon, the message is sent on 2018-11-10 21:00:00 from Leon)"
fmt.Println(testRgx.MatchString(sample)) // true
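
The filtering step could be wrapped in a small, self-contained program along these lines. This is only a sketch: the pattern is the one shown above, while isEditingNotice and the second test message are hypothetical names and data added for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// testRgx is the pattern from the question, compiled once at package level.
var testRgx = regexp.MustCompile(`^In current chatting room: .* \(This message is edited by .*, the message is sent on .* from .*\)$`)

// isEditingNotice (hypothetical helper name) reports whether msg has the
// fixed "Editing Notice" structure.
func isEditingNotice(msg string) bool {
	return testRgx.MatchString(msg)
}

func main() {
	messages := []string{
		"In current chatting room: Hey mate, do you like golang? (This message is edited by Leon, the message is sent on 2018-01-10 10:00:59 from Mike)",
		"In current chatting room: just a normal message", // made-up non-notice message
	}
	for _, m := range messages {
		fmt.Println(isEditingNotice(m))
	}
}
```

Note that `.` does not match a newline by default in Go's regexp package, so this sketch assumes each message is a single line.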

So far, this works fine.

The second step is to retrieve the content, the editor, the time, and the original sender.

What I did was remove the first part, "In current chatting room: ", so that the string becomes

changedString := "what do you eat for today? I input some shit (sdfhjskdfjksljhfdsjkdf) can you detect this? (This message is edited by Sharon, the message is sent on 2018-11-10 21:00:00 from Leon)"

Then, working from the end of the string, I cut off everything after the last "from", which lets me fetch "Leon".

// after cutting at the last "from"
cutString := "what do you eat for today? I input some shit (sdfhjskdfjksljhfdsjkdf) can you detect this? (This message is edited by Sharon, the message is sent on 2018-11-10 21:00:00 "

Then I cut the string after the last "on" to get the time.

// after cutting at the last "on"
cutString = "what do you eat for today? I input some shit (sdfhjskdfjksljhfdsjkdf) can you detect this? (This message is edited by Sharon, the message is sent "

The last step is then to retrieve the editor.
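
The cutting steps described above can be sketched with strings.LastIndex. This is a minimal sketch rather than the exact code I used: splitLast and parseByCutting are hypothetical helper names, and the sketch assumes that the fixed markers (" from ", ", the message is sent on ", " (This message is edited by ") never appear inside the editor or sender names.

```go
package main

import (
	"fmt"
	"strings"
)

// splitLast cuts s at the last occurrence of sep and returns both sides.
// ok is false when sep does not occur in s.
func splitLast(s, sep string) (before, after string, ok bool) {
	i := strings.LastIndex(s, sep)
	if i < 0 {
		return s, "", false
	}
	return s[:i], s[i+len(sep):], true
}

// parseByCutting (hypothetical name) peels the fields off the end of the
// message one marker at a time, working from right to left.
func parseByCutting(msg string) (content, editor, sentAt, sender string, ok bool) {
	rest := strings.TrimPrefix(msg, "In current chatting room: ")
	rest = strings.TrimSuffix(rest, ")")

	if rest, sender, ok = splitLast(rest, " from "); !ok {
		return "", "", "", "", false
	}
	if rest, sentAt, ok = splitLast(rest, ", the message is sent on "); !ok {
		return "", "", "", "", false
	}
	if content, editor, ok = splitLast(rest, " (This message is edited by "); !ok {
		return "", "", "", "", false
	}
	return content, editor, sentAt, sender, true
}

func main() {
	msg := "In current chatting room: what do you eat for today? (This message is edited by Sharon, the message is sent on 2018-11-10 21:00:00 from Leon)"
	content, editor, sentAt, sender, ok := parseByCutting(msg)
	fmt.Printf("%q %q %q %q %v\n", content, editor, sentAt, sender, ok)
}
```

Cutting on the full marker text instead of the bare words "from" and "on" makes the sketch less likely to split inside the message content.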

I think this method is quite clumsy. I have searched for examples of retrieving components with regexp, such as "Golang: extract data with Regex",

but my case is a little more complex, and the extraction code I have written is quite ugly.

Is there a way to fetch the components directly with a regular expression?

For the notice message,

"In current chatting room: " will not change, while the content of the edited message will. Inside the brackets, only the editor (Sharon), the time (2018-11-10 21:00:00), and the sender (Leon) change; the rest of the bracketed text stays fixed, like

(This message is edited by xxxxx, the message is sent on xxxx from xxxx)

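2 answers:

Answer 0: (score: 1)

Let me try to understand your problem: in the given input string, you want to find the editor and sender names, and you also want to extract the date and time.

First, you can use two regular expressions, one to match the names and another for the date and time. You can do something like this: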

// Requires the fmt, regexp, and strings packages.
// Group 1 captures the editor and group 3 captures the original sender.
namesRegex := regexp.MustCompile(`by\s(.*?),(.*?)\s*from\s*(.*?)\)`)
// Matches a timestamp of the form YYYY-MM-DD hh:mm:ss.
dateTimeRegex := regexp.MustCompile(`(\d{4})-(\d{2})-(\d{2}) (\d{2}):(\d{2}):(\d{2})`)
input := "In current chatting room: what do you eat for today? (This message is edited by Sharon, the message is sent on 2018-11-10 21:00:00 from Leon)"
// FindStringSubmatch returns nil when there is no match, so no separate MatchString call is needed.
if res := namesRegex.FindStringSubmatch(input); res != nil {
    fmt.Println("Edited by = ", strings.TrimSpace(res[1]))
    fmt.Println("From = ", strings.TrimSpace(res[3]))
}
// FindString returns the first match, or "" when there is none.
if res := dateTimeRegex.FindString(input); res != "" {
    fmt.Println(res)
}

Output

Edited by =  Sharon
From =  Leon
2018-11-10 21:00:00

Answer 1: (score: 0)

I couldn't post a comment, so I had to put this here... Have you researched regex capture groups?

e.g. "How to get capturing group functionality in Golang regular expressions?"
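
To illustrate the capture-group idea, here is a minimal sketch that pulls all four components out with a single anchored pattern and named groups. The group names (content, editor, time, sender), noticeRgx, and extractNotice are hypothetical names chosen for this example, and the sketch assumes the fixed notice wording from the question.

```go
package main

import (
	"fmt"
	"regexp"
)

// noticeRgx anchors on the fixed wording and names each variable part.
var noticeRgx = regexp.MustCompile(`^In current chatting room: (?P<content>.*) \(This message is edited by (?P<editor>.*), the message is sent on (?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) from (?P<sender>.*)\)$`)

// extractNotice (hypothetical name) returns the captured fields keyed by
// group name, or nil when msg is not an editing notice.
func extractNotice(msg string) map[string]string {
	m := noticeRgx.FindStringSubmatch(msg)
	if m == nil {
		return nil
	}
	fields := make(map[string]string)
	// SubexpNames()[i] is the name of group i; unnamed groups have "".
	for i, name := range noticeRgx.SubexpNames() {
		if name != "" {
			fields[name] = m[i]
		}
	}
	return fields
}

func main() {
	msg := "In current chatting room: Hey mate, do you like golang? (This message is edited by Leon, the message is sent on 2018-01-10 10:00:59 from Mike)"
	f := extractNotice(msg)
	fmt.Println(f["content"])
	fmt.Println(f["editor"])
	fmt.Println(f["time"])
	fmt.Println(f["sender"])
}
```

Because the content group is greedy, parentheses inside the message body should end up in the content, and only the final bracketed notice is split into editor, time, and sender.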
