Question

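I have a multi-line file containing headers and values. Because the values will be inserted into a database, I want to use the headers as column names. The sample data looks like this: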

Sales-Date
Item
Sale Price
Discount
Cost of Item
Profit (loss)

I have put just the column names into an array and removed the parentheses and dashes. The result is:

Sales Date
Item
Sale Price
Discount
Cost of Item
Profit loss
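
For reference, the clean-up step described above could be sketched roughly as follows. The question does not show the code that was actually used, so `clean_header` is a hypothetical helper, and the assumption is that dashes become spaces while parentheses are simply dropped:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): normalise a raw header
# by turning dashes into spaces and dropping parentheses.
sub clean_header {
    my ($header) = @_;
    $header =~ tr/-/ /;           # "Sales-Date"    -> "Sales Date"
    $header =~ tr/()//d;          # "Profit (loss)" -> "Profit loss"
    $header =~ s/\s+/ /g;         # collapse runs of whitespace
    $header =~ s/^\s+|\s+$//g;    # trim both ends
    return $header;
}

print clean_header($_), "\n" for 'Sales-Date', 'Item', 'Profit (loss)';
```

With the three sample headers this prints `Sales Date`, `Item` and `Profit loss`, matching the list above.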

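What I need is a regular expression that looks at each line and, if the line is a single word, returns its first 4 letters; if it contains multiple words, it should return the first letter of each word. Ideally the result is upper-cased. So the desired data would look like: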

SD
ITEM
SP
DISC
COI
PL

I haven't had much luck so far. Thanks.

Answer 1

Something like this, perhaps:

#!/usr/bin/perl

use strict;
use warnings;
use 5.010;

while (<DATA>) {
  chomp;

  # If the line contains whitespace...
  if (/\s/) {
    # ... split the line into words ...
    # ... take the first letter of each word ...
    # ... join the letters together ...
    # ... and upper-case the resulting string.
    say uc join '', map { substr $_, 0, 1 } split /\s+/;
  } else {
    # ... otherwise, take the first four characters from the string ...
    # ... and upper-case them.
    say uc substr $_, 0, 4;
  }
}

__END__
Sales Date
Item
Sale Price
Discount
Cost of Item
Profit loss

Answer 2

One possible solution is to split the line into an array on whitespace and then capture only the first letter of each word. Something like:

my $line = "Sales Date";

# Split line into an array separated by whitespace
my @words = split /\s+/, $line;

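my $letters = '';
# Loop over the words and append the first character of each one
for (@words) {
    # Only use $1 when the match succeeded, so a stale capture is never reused
    $letters .= $1 if m/(.)/;
}

print "$letters\n";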

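The above prints SD. You can simply change m/(.)/ to control how many characters are captured; for example, m/(.{1,4})/ captures up to four.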
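
Putting that suggestion together with the single-word rule from the question, a minimal sketch might look like this. The `abbreviate` sub and its name are assumptions for illustration, not part of the answer above; it picks the capture pattern based on the word count and upper-cases the result:

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original answer): abbreviate one header.
sub abbreviate {
    my ($line) = @_;

    # split ' ' is the special awk-style split: it ignores leading whitespace
    my @words = split ' ', $line;

    # One word: capture up to four characters; several words: one character each
    my $pattern = @words == 1 ? qr/(.{1,4})/ : qr/(.)/;

    my $letters = '';
    for my $word (@words) {
        $letters .= $1 if $word =~ $pattern;
    }
    return uc $letters;
}

print abbreviate($_), "\n" for 'Sales Date', 'Item', 'Discount', 'Cost of Item';
```

For these four inputs it prints `SD`, `ITEM`, `DISC` and `COI`. Note that different headers can collapse to the same abbreviation, so it may be worth checking the results for duplicates before using them as column names.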

Answer 3

my @arr = map {
  # make entire string upper case
  local $_ = uc;
  # remove trailing white-spaces (sometimes chomp fails on line endings)
  s/\s+\z//;

  # more words?
  /\s/
      # take first letter of every word
      ? join("", /\b(\w)/g)
      # take first 1 to 4 letters (and be greedy at that)
      : /(\w{1,4})/;
}
<DATA>;

print $_, "\n" for @arr;

__DATA__
Sales Date
Item
Sale Price
Discount
Cost of Item
Profit loss

Output

SD
ITEM
SP
DISC
COI
PL
