Golang regular expression always returns false?

Asked: 2018-03-18 18:45:00

Tags: regex go

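I am taking user input (a regular expression) and checking whether a given line of a file matches it. If it matches, I return an ID (the line's ID), and that's all. However, my match seems to always return false. Interestingly, if I pass in the wildcard .*, the program takes longer to run than it does with a specific regular expression, so something must be happening. Why does it always return false?

Sample code: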

func main() {

    // User input from command line
    reader := bufio.NewReader(os.Stdin)
    fmt.Print("Enter regexp: ")
    userRegexp, _ := reader.ReadString('\n')

    // List all .html files in static dir
    files, err := filepath.Glob("static/*.html")
    if err != nil {
        log.Fatal(err)
    }

    // Empty array of int64's to be returned with matching results
    var lineIdArr []int64

    for _, file := range files {
        htmlFile, _ := os.Open(file)
        fscanner := bufio.NewScanner(htmlFile)

        // Loop over each line
        for fscanner.Scan() {

            line := fscanner.Text()

            match := matchLineByValue(userRegexp, line) // This is always false?

            // ID is always the first item. Split on ":" and parse it as an int64.
            lineIdStr := line[:strings.IndexByte(line, ':')]
            lineIdInt, err := strconv.ParseInt(lineIdStr, 10, 64)

            if err != nil {
                panic(err)
            }

            // If matched, append ID to lineIdArr
            if match {
                lineIdArr = append(lineIdArr, lineIdInt)
            }
        }
    }
    fmt.Println("Return array: ", lineIdArr)
    fmt.Println("Using regular expression: ", userRegexp)
}

func matchLineByValue(re string, s string) bool {
    return regexp.MustCompile(re).MatchString(s)
}

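Is regexp.MustCompile(re).MatchString(s) the right way to build a regular expression from user input and match it against a whole line?

The string it matches against is fairly long (it is basically an entire HTML file). Could that be a problem?

1 Answer:

Answer 0: (score: 2)

The call userRegexp, _ := reader.ReadString('\n') returns a string with a trailing newline, so the newline becomes part of the pattern. The scanner strips the line terminator from every line it returns, which leaves no newline in the line for the pattern to match. Trim the newline: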

 userRegexp, err := reader.ReadString('\n')
 if err != nil {
    // handle error
 }
 // Strip the trailing "\n", and the "\r" that precedes it on Windows
 userRegexp = strings.TrimRight(userRegexp, "\r\n")

Here is the code with a few other improvements (the regexp is compiled once, and the scanner's Bytes method is used):

// User input from command line
reader := bufio.NewReader(os.Stdin)
fmt.Print("Enter regexp: ")
userRegexp, err := reader.ReadString('\n')
if err != nil {
    log.Fatal(err)
}
// Strip the trailing "\n", and the "\r" that precedes it on Windows
userRegexp = strings.TrimRight(userRegexp, "\r\n")
re, err := regexp.Compile(userRegexp)
if err != nil {
    log.Fatal(err)
}

// List all .html files in static dir
files, err := filepath.Glob("static/*.html")
if err != nil {
    log.Fatal(err)
}

// Empty array of int64's to be returned with matching results
var lineIdArr []int64

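for _, file := range files {
    htmlFile, err := os.Open(file)
    if err != nil {
        log.Fatal(err)
    }
    fscanner := bufio.NewScanner(htmlFile)
    // Loop over each line
    for fscanner.Scan() {
        line := fscanner.Bytes()
        if !re.Match(line) {
            continue
        }
        // The ID is the text before the first ':'; skip lines that have none
        i := bytes.IndexByte(line, ':')
        if i < 0 {
            continue
        }
        lineIdInt, err := strconv.ParseInt(string(line[:i]), 10, 64)
        if err != nil {
            log.Fatal(err)
        }
        lineIdArr = append(lineIdArr, lineIdInt)
    }
    // Scan also stops on errors such as an over-long line, so check for one
    if err := fscanner.Err(); err != nil {
        log.Fatal(err)
    }
    htmlFile.Close()
}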
fmt.Println("Return array: ", lineIdArr)
fmt.Println("Using regular expression: ", userRegexp)