Question

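I have two arrays. The first is large, with several thousand elements. The second holds a list of about thirty words. I want to select the lines in the first array that start with one of the words in the second array.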

I was thinking of a regular expression, but I'm not sure how to do it with one.

Example from first_array:

array[0] = [ 'jsmith88:*:4185:208:jsmith113:/students/jsmith88:/usr/bin/bash' ]

Example from second_array:

 array[5] = [ 'jsmith88' ]

Answer 1

You can try the select method from the Array class, like this:

# lines is the first array, word_list is the second array
words = word_list.join '|'
result = lines.select { |line| line =~ /^(#{words})/ }

result should then contain every line that starts with one of the words in the second array.

As @Sabuj Hassan explains below, ^ marks the start of a line and the | character means OR.

Edit: using Regexp.union, as suggested by @oro2k (it also escapes any regex metacharacters in the words):

words = Regexp.union word_list
result = lines.select { |line| line =~ /^(#{words})/ }
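If you only need a literal prefix test, a regex may not be necessary at all. Here is a minimal sketch using String#start_with?, which accepts several prefixes at once. The sample data below is made up for illustration and only modeled on the line in the question.

```ruby
# Hypothetical sample data modeled on the passwd-style line in the question.
lines = [
  'jsmith88:*:4185:208:jsmith113:/students/jsmith88:/usr/bin/bash',
  'alex:*:4186:208:alex:/students/alex:/usr/bin/bash',
  'root:*:0:0:root:/root:/usr/bin/bash'
]
word_list = ['jsmith88', 'alex']

# String#start_with? returns true if the string begins with any of the
# given prefixes, so the word list can be splatted straight into it.
result = lines.select { |line| line.start_with?(*word_list) }

puts result
```

Like the regex versions, this is a plain prefix match, so 'alex' would also select a line beginning with 'alexander:'.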

Answer 2

Assuming your words don't contain any special characters, join them with a pipe (|):

words = [ 'jsmith88', 'alex' ]
word_list = words.join("|")

Now use the joined string in a regular expression to test each line from the other array:

lines = [ 'jsmith88:*:4185:208:jsmith113:/students/jsmith88:/usr/bin/bash' ]
if lines[0] =~ /^(#{word_list})/
  print "ok"
end

Here ^ marks the start of the line. The parentheses (..) group the words so that they are matched as OR alternatives.
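If the words might contain special characters after all, one option is to escape each word before joining. The following is only a sketch: the words and lines are hypothetical, chosen to show what goes wrong without escaping.

```ruby
# Hypothetical words that contain regex metacharacters ('.' and '+').
words = ['j.smith', 'alex+1']

# Regexp.escape turns each word into a literal pattern; \A anchors the
# match at the start of the string.
pattern = /\A(?:#{words.map { |w| Regexp.escape(w) }.join('|')})/

# Hypothetical lines: the second would match an unescaped 'j.smith'.
lines = [
  'j.smith:*:1:1::/home/j.smith:/bin/sh',
  'jxsmith:*:2:2::/home/jxsmith:/bin/sh'
]

# Enumerable#grep keeps the elements that match the pattern.
matches = lines.grep(pattern)

puts matches
```

Without the escaping, the '.' in 'j.smith' would match any character, so the 'jxsmith' line would be selected too.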

Answer 3

Try this:

lines = [ 'jsmith88:*:4185:208:jsmith113:/students/jsmith88:/usr/bin/bash' ]
words = [ 'jsmith88' ]

lines.each_with_object([]){|line, array_obj| array_obj << line if words.include?(line.scan(/\b\w+\b/)[0])}
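With thousands of lines, a variation on the same idea is to compare the first field of each line against a Set, which avoids scanning the whole line and the whole word array for every element. This sketch assumes the lines are colon-separated, as in the question's example; the second line is made up to show a near miss.

```ruby
require 'set'

lines = [
  'jsmith88:*:4185:208:jsmith113:/students/jsmith88:/usr/bin/bash',
  'jsmith880:*:4190:208:jsmith880:/students/jsmith880:/usr/bin/bash' # hypothetical
]
words = Set['jsmith88']

# Split off only the first colon-separated field and look it up in the set.
# This matches the whole field, so 'jsmith880' is not selected.
result = lines.select { |line| words.include?(line.split(':', 2).first) }

puts result
```

Unlike the prefix-based answers, this requires the entire first field to equal one of the words.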

How do I search for lines that start with a word from an array?
