I need to split the following string into multiple tokens in a Spark Scala DataFrame. I have not used regular expressions before. Any help or pointers would be appreciated.
<c#><floating-point><type-conversion><double><decimal>
Expected output (each token on its own line):
c#
floating-point
type-conversion
double
decimal
I have already tried
<(.*?)>
but the result is not what I expect. How can I ignore the tags?
Answer 0 (score: 1)
Applying lookbehind and lookahead assertions should do the trick:
scala> val pattern = "(?<=<)(.*?)(?=>)".r
pattern: scala.util.matching.Regex = (?<=<)(.*?)(?=>)
scala> val s= "<c#><floating-point><type-conversion><double><decimal>"
s: String = <c#><floating-point><type-conversion><double><decimal>
scala> for { m <- pattern.findAllIn(s) } println(m)
c#
floating-point
type-conversion
double
decimal
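As an alternative sketch, the pattern from the question can be kept as it is, provided the capture group is read instead of the whole match. `findAllIn` iterates over full matches, which include the angle brackets, while `findAllMatchIn` yields `Regex.Match` objects whose `group(1)` holds only the text between them. The names `TagTokens` and `extract` below are hypothetical and used only for illustration.

```scala
import scala.util.matching.Regex

object TagTokens {
  // Same pattern as in the question; the parentheses form capture group 1.
  val tagPattern: Regex = "<(.*?)>".r

  // Read group 1 of every match, i.e. the text between '<' and '>'.
  def extract(s: String): List[String] =
    tagPattern.findAllMatchIn(s).map(_.group(1)).toList

  def main(args: Array[String]): Unit =
    extract("<c#><floating-point><type-conversion><double><decimal>").foreach(println)
}
```

This avoids lookaround entirely, which may be easier to read for someone new to regular expressions.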
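Answer 1 (score: 1)
Here, we could simply design an expression that matches the < and > characters and replaces each of them with a newline. Note that adjacent tags then leave blank lines between the tokens. Perhaps this works: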
(?:\<|\>)
const regex = /(?:\<|\>)/gm;
const str = `<c#><floating-point><type-conversion><double><decimal>`;
const subst = `\n`;
// The substituted value will be contained in the result variable
const result = str.replace(regex, subst);
console.log('Substitution result: ', result);
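Since the question is about Scala, here is a minimal sketch of the same delimiter-based idea in that language. Instead of substituting newlines, it splits on the brackets and drops the empty strings that appear between adjacent tags. The names `TagSplit` and `tokens` are hypothetical.

```scala
object TagSplit {
  // Split on either bracket, then discard the empty pieces
  // produced at the start and between "><" pairs.
  def tokens(s: String): List[String] =
    s.split("[<>]").filter(_.nonEmpty).toList

  def main(args: Array[String]): Unit =
    // Print one token per line.
    println(tokens("<c#><floating-point><type-conversion><double><decimal>").mkString("\n"))
}
```

In a Spark DataFrame, the built-in `split` and `explode` functions from `org.apache.spark.sql.functions` could likely play the same role, with empty strings filtered out afterwards; that variant is an assumption and is not shown or tested here.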