Explain what these regular expressions do in this code

Date: 2017-11-26 21:16:15

Tags: javascript regex

I would like to know what these four lines do; two of them contain regular expressions:

text.replace(/\W/g, " ")
text.split(/\s+/);
text.filter(v => !!v)
text.reduce((dict, v) => {dict[v] = v in dict ? dict[v] + 1 : 1; return dict}, {});

I have a rough idea of what the result will look like. However, could someone explain in detail what each line is for?

1 answer:

Answer 0 (score: 0)

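This code counts how many times each word appears in a text.

The first RegExp, .replace(/\W/g, " "), converts every non-word character (anything that is not a digit, an ASCII letter, or an underscore) into a space.

The second RegExp, .split(/\s+/), splits the text using runs of whitespace as the delimiter.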
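As a rough illustration of those two steps, here is a small sketch on a made-up string (the `sample` text is an assumption, not taken from the question). Note the empty string at the end of the split result; that is what the later `.filter(v => !!v)` step removes.

```javascript
// Hypothetical sample input, chosen so that it ends with punctuation.
const sample = "Hello, world! Hello.";

// Step 1: every character outside [A-Za-z0-9_] becomes a space.
const spaced = sample.replace(/\W/g, " ");
console.log(JSON.stringify(spaced)); // "Hello  world  Hello "

// Step 2: split on runs of whitespace. The trailing space leaves an
// empty string as the last element.
const pieces = spaced.split(/\s+/);
console.log(pieces); // [ 'Hello', 'world', 'Hello', '' ]
```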

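Because each of these operations returns a new string, array, or object instead of modifying `text`, you need to either store each result in a variable or chain the methods.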



var text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus tempor risus eu nisl pretium ultrices. Vivamus a malesuada est. Donec fringilla pharetra dolor, vitae mattis lorem pulvinar sit amet. Sed tristique tellus sit amet maximus rhoncus. Vestibulum accumsan quam in ligula finibus fermentum.";

var result = text.replace(/\W/g, " ") // convert all non-word characters to spaces
  .split(/\s+/) // split using runs of whitespace as the delimiter
  .filter(v => !!v) // drop falsy values, i.e. the empty strings that split can leave at the ends
  .reduce((dict, v) => {dict[v] = v in dict ? dict[v] + 1 : 1; return dict}, {}); // count the number of times each word appears
  
console.log(result);
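For comparison, the other option mentioned above, storing each intermediate result in a variable, might look like the sketch below. The `countWords` name and the short input string are assumptions made for illustration only.

```javascript
// countWords is a hypothetical helper name used only for this illustration.
function countWords(text) {
  const spaced = text.replace(/\W/g, " "); // new string; `text` itself is unchanged
  const pieces = spaced.split(/\s+/);      // new array of strings
  const words = pieces.filter(v => !!v);   // new array without empty strings
  return words.reduce((dict, v) => {
    dict[v] = v in dict ? dict[v] + 1 : 1; // increment an existing count or start at 1
    return dict;
  }, {});
}

console.log(countWords("sit amet, sit amet. Sed")); // { sit: 2, amet: 2, Sed: 1 }
```

Both forms do the same work; chaining simply avoids naming the intermediate values.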




You can combine the two RegExps into a single .match(/\w+/g), which returns an array of all the word sequences:



var text = "Lorem ipsum dolor sit amet, consectetur adipiscing elit. Phasellus tempor risus eu nisl pretium ultrices. Vivamus a malesuada est. Donec fringilla pharetra dolor, vitae mattis lorem pulvinar sit amet. Sed tristique tellus sit amet maximus rhoncus. Vestibulum accumsan quam in ligula finibus fermentum.";

var result = text.match(/\w+/g) // get all word sequences
  .filter(v => !!v) // redundant here, since match never yields empty strings; kept to mirror the first version
  .reduce((dict, v) => {dict[v] = v in dict ? dict[v] + 1 : 1; return dict}, {}); // count the number of times each word appears
  
console.log(result);
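One caveat with the object-based counter: `v in dict` also finds inherited properties, so a word such as `constructor` is not counted correctly, and `match` returns `null` when the text contains no word characters at all. A possible sketch that sidesteps both issues by using a `Map` (the `countWordsSafe` name is hypothetical and not part of the original answer):

```javascript
// countWordsSafe is a hypothetical name; this is one possible variant,
// not the approach used in the answer above.
function countWordsSafe(text) {
  // match returns null when nothing matches, so fall back to an empty array.
  const words = text.match(/\w+/g) || [];
  const counts = new Map();
  for (const word of words) {
    // A Map has no inherited keys, so "constructor" is treated like any other word.
    counts.set(word, (counts.get(word) || 0) + 1);
  }
  return counts;
}

const safeCounts = countWordsSafe("constructor dolor dolor");
console.log(safeCounts.get("constructor")); // 1
console.log(safeCounts.get("dolor"));       // 2
console.log(countWordsSafe("...").size);    // 0
```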