Regex validation of lines in /etc/passwd

Time: 2010-11-07 17:45:51

Tags: regex perl unix

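I need to check that an /etc/passwd file is valid, and I thought a regular expression would be a good way to validate the lines that are not comments. How can I validate a line like the following: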

root:*:0:0:System Administrator:/var/root:/bin/sh 

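After some research, I found that the 5th field (System Administrator) can contain other data such as an email address and a postal address, the second field can contain anything except a colon (:), and the last 2 fields are full paths.

How would I go about creating a regex for this?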

4 answers:

Answer 0: (score: 5)

Not wishing to be facetious, but Passwd::Unix is probably your best bet.

Answer 1: (score: 1)

Does it have to be Perl? The usual way to inspect the password file is to use awk as a database query language. For example:

awk -F: '$3 ~ /pattern/'

Of course, you can use perl -F: -lane instead. But if you are using Perl, you should use the standard User::pwent module.
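As a rough illustration of the User::pwent suggestion, the sketch below walks the password database through the module's object interface and collects accounts whose home directory or shell is not an absolute path. The helper name questionable_entries is made up for this example. Note that this queries the system's password database (which may draw on more than the /etc/passwd file), so it checks field values rather than file syntax.

```perl
use strict;
use warnings;
use User::pwent;    # overrides getpwent/getpwnam/getpwuid to return objects

# Hypothetical helper: names of accounts whose home directory, or shell
# (when one is set), does not look like an absolute path.
sub questionable_entries {
    my @bad;
    setpwent();                         # rewind the password database
    while (my $pw = getpwent()) {
        my ($dir, $shell) = ($pw->dir, $pw->shell);
        push @bad, $pw->name
            if $dir !~ m{^/} or (length $shell and $shell !~ m{^/});
    }
    endpwent();
    return @bad;
}

print "check: $_\n" for questionable_entries();
```

An empty shell field is skipped here because, as the longer answer notes, it conventionally means /bin/sh.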

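Answer 2: (score: 1)

You want a regex? OK, I'll give you a regex: it is in the $is_valid_pwent_rx variable.

Enjoy.

Important: this must not be mistaken for a semantic checker of a sane passwd file. It is only a syntax checker.

It is currently configured for OpenBSD.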

#!/usr/bin/env perl

use 5.010;
use strict;
use warnings;

our $PASSWD = "/etc/passwd";

our $Errors = 0;

sub is_valid_pwent(_);
sub main();

#########################################################

main();
exit($Errors != 0);

#########################################################

sub main() {

    # three-argument open with a lexical filehandle
    open(my $fh, "<", $PASSWD)  || die "can't open $PASSWD: $!";

    while (my $line = <$fh>) {
        chomp $line;
        ## NEXT LINE IS WRONG: NO "COMMENTS" ALLOWED!!!
        next if $line =~ /^#/;
        next if is_valid_pwent($line);
        say "$0: Invalid entry at $PASSWD $.: $line";
        $Errors++;
    }

    close($fh)          || die "can't close $PASSWD: $!";

    say "$0: $PASSWD appears ok." unless $Errors;
}

#########################################################

INIT {

    state $is_valid_pwent_rx = qr{

      ^ (?&any_pwent)  $

###############################################

      (?(DEFINE)

      (?<any_pwent>     (?&yp_pwent) | (?&pwent) )

# The `+' token may also be alone in the name field, which causes all users
# from the passwd.byname and passwd.byuid YP maps to be included.
#
# If the entry contains non-empty uid or gid fields, the specified numbers
# will override the information retrieved from the YP maps.  Additionally,
# if the gecos, dir, or shell entries contain text, it will override the
# information included via YP.  On some systems, the passwd field may also
# be overridden.  It is recommended that the standard way to enable YP
# passwd support in /etc/master.passwd is:
#
#     +:*::::::::   

        # Each optional field is grouped so its alternation stays local;
        # ungrouped, the | would split the whole entry into loose branches.
        (?<yp_pwent>
                 (?&PLUS)                       # substitute in YP
         : (?: (?&EMPTY) | (?&pw_passwd) )      # user's encrypted password.
         : (?: (?&EMPTY) | (?&pw_uid)    )      # user's login user ID.
         : (?: (?&EMPTY) | (?&pw_gid)    )      # user's login group ID.
         : (?: (?&EMPTY) | (?&pw_gecos)  )      # Honeywell login info.
         : (?: (?&EMPTY) | (?&pw_dir)    )      # user's home directory.
         : (?: (?&EMPTY) | (?&pw_shell)  )      # user's login shell.
        )

# A normal password entry

        (?<pwent>

           (?&pw_name)      # user's login name.
         : (?&pw_passwd)    # user's encrypted password.
         : (?&pw_uid)       # user's login user ID.
         : (?&pw_gid)       # user's login group ID.
         : (?&pw_gecos)     # Honeywell login info.
         : (?&pw_dir)       # user's home directory.
         : (?&pw_shell)     # user's login shell.
        )

# A master password entry

        (?<master_pwent>
           (?&pw_name)      # user's login name.
         : (?&pw_passwd)    # user's encrypted password.
         : (?&pw_uid)       # user's login user ID.
         : (?&pw_gid)       # user's login group ID.
         : (?&pw_class)     # user's general classification (see login.conf(5))
         : (?&pw_change)    # password change time.
         : (?&pw_expire)    # account expiration time.
         : (?&pw_gecos)     # general information about the user.
         : (?&pw_dir)       # user's home directory.
         : (?&pw_shell)     # user's login shell.
        )

# The name field is the login used to access the computer account, and the
# uid field is the number associated with it.  They should both be unique
# across the system (and often across a group of systems) since they
# control file access.
#
# While it is possible to have multiple entries with identical login names
# and/or identical user IDs, it is usually a mistake to do so.  Routines
# that manipulate these files will often return only one of the multiple
# entries, and that one by random selection.
#
# The login name may be up to 31 characters long.  For compatibility with
# legacy software, a login name should start with a letter and consist
# solely of letters, numbers, dashes and underscores.  The login name must
# never begin with a hyphen (`-'); also, it is strongly suggested that
# neither uppercase characters nor dots (`.') be part of the name, as this
# tends to confuse mailers.  No field may contain a colon as this has been
# used historically to separate the fields in the user database.

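        (?<pw_name>

            # at most 31 characters before the field-ending colon
            (?= (?&NON_COLON){1,31} (?&COLON) )

            (?: (?&UNDERSCORE)
              | (?&LETTER)
            )

            (?: (?&LETTER)
              | (?&DIGIT)       # one digit per repetition keeps the count exact
              | (?&HYPHEN)
              | (?&UNDERSCORE)
            ){0,30}

        )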

# The password field is the *encrypted* form of the password.  If the
# password field is empty, no password will be required to gain access to 
# the machine.  This is almost invariably a mistake.  By convention,
# accounts that are not intended to be logged in to (e.g. bin, daemon, sshd)
# have a star (`*') in the password field.  Note that there is nothing
# special about `*', it is just one of many strings that is not a valid
# encrypted password (see crypt(3)).  Because master.passwd contains the
# encrypted user passwords, it should not be readable by anyone without
# appropriate privileges.
#
# Which type of cipher is used to encrypt the password information depends
# on the configuration in login.conf(5).  It can be different for local and
# YP passwords.

        (?<pw_passwd>
            (?&STAR)
          | (?&NON_COLON) +
          | (?&EMPTY)           # should not allow this!
        )

# The uid field is the numeric user ID assigned to this login name.
# It need not strictly be unique.

        (?<pw_uid>
            (?&number) +
        )

# The group (gid) field is the group that the user will be placed in
# upon login. Since this system supports multiple groups (see groups(1))
# this field currently has little special meaning.

        (?<pw_gid>
            (?&number) +
        )

        (?<pw_class>
            (?&EMPTY)   
          | (?&any_text)
        )

        (?<pw_change>
            (?&EMPTY)
          | (?&number)
        )

        (?<pw_expire>
            (?&EMPTY)
          | (?&number)
        )

        (?<pw_gecos>
            # (?&EMPTY) | (?&gecos_fields)
            (?&any_text)
        )

        # some have an extra field in them after hphone
        (?<gecos_fields>
            (?&gecos_name)   # User's full name.
            (?&COMMA)
            (?&gecos_office) # User's office location.
            (?&COMMA)
            (?&gecos_wphone) # User's work phone number.
            (?&COMMA)
            (?&gecos_hphone) # User's home phone number.
          )

        (?<gecos_name>      (?&gecos_text)  )
        (?<gecos_office>    (?&gecos_text)  )
        (?<gecos_wphone>    (?&gecos_text)  )
        (?<gecos_hphone>    (?&gecos_text)  )

        (?<pw_dir>
            (?&EMPTY)   # bad idea
          | (?&directory_name)
        )

        (?<pw_shell>
            (?&EMPTY)   # means "/bin/sh"
          | (?&filename)

        )

#########################

        (?<directory_name>      (?&pathname)    )
        (?<filename>            (?&pathname)    )

        (?<pathname>
            (?&SLASH)
            (?&any_text)
        )

        (?<LETTER>      [a-z]       )  # \p{Ll} && \p{ASCII}

        (?<DIGIT>       [0-9]       )  # \p{Nd} && \p{ASCII}
        (?<ZERO>         0          )
        (?<NON_ZERO>    [1-9]       )

        (?<PLUS>        \x2B        )  # PLUS SIGN
        (?<COMMA>       \x2C        )  # COMMA
        (?<HYPHEN>      \x2D        )  # HYPHEN-MINUS   
        (?<SLASH>       \x2F        )  # SOLIDUS
        (?<COLON>       \x3A        )  # COLON
        (?<STAR>        \x2A        )  # ASTERISK
        (?<UNDERSCORE>  \x5F        )  # LOW LINE

        (?<NON_COLON> [^\x3A]       )

        (?<EMPTY> (?# this space intentionally left blank) )

        (?<number>
            (?&ZERO)
          | (?&NON_ZERO) (?&DIGIT) *
        )

        (?<any_text>
            (?&NON_COLON) *
        )

        (?<gecos_text>  
            (?:
                (?! (?&COMMA) )
                (?! (?&COLON) )
                .
            ) *
        )

      )

    }x;

    sub is_valid_pwent(_) {
        my $pwent = shift();
        return $pwent =~ $is_valid_pwent_rx;
    }

}
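The long pattern above leans on the (?(DEFINE) ...) block and (?&name) calls available since Perl 5.10 to build a grammar out of named pieces. Here is a stripped-down sketch of the same technique, checking only a hypothetical two-field name:uid record. The $name_uid_rx variable and the is_valid_name_uid function are invented for this illustration and are not part of the script above.

```perl
use 5.010;
use strict;
use warnings;

# Hypothetical two-field grammar: a login name, a colon, a numeric uid.
my $name_uid_rx = qr{
    ^ (?&name) : (?&uid) $

    (?(DEFINE)
        # a lowercase letter or underscore, then up to 30 more characters
        (?<name>  [a-z_] [a-z0-9_-]{0,30} )

        # zero, or a number without leading zeros
        (?<uid>   0 | [1-9] [0-9]* )
    )
}x;

sub is_valid_name_uid {
    my ($record) = @_;
    return $record =~ $name_uid_rx ? 1 : 0;
}

for my $record ("root:0", "Root:0", "root:007") {
    say "$record => ", (is_valid_name_uid($record) ? "ok" : "invalid");
}
```

Nothing inside the DEFINE block matches on its own; the named groups only take part when the top-level pattern calls them, which is what lets the full script read like a field-by-field description of the file format.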

Answer 3: (score: -2)

Something like this?

^(#.*|[a-z]*:[^:]*:[0-9]*:[0-9]*:[^:]*:/[^:]*:/[^:]*)$

(This assumes user names consist of lowercase letters.)
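To see what this shorter pattern accepts and rejects, here is a small sketch that applies it unchanged to a few lines. The function name looks_like_passwd_line is an assumption for the example, and every sample except the root line from the question is made up.

```perl
use strict;
use warnings;

# The pattern from this answer, copied as-is.
my $passwd_line_rx = qr{^(#.*|[a-z]*:[^:]*:[0-9]*:[0-9]*:[^:]*:/[^:]*:/[^:]*)$};

sub looks_like_passwd_line {
    my ($line) = @_;
    return $line =~ $passwd_line_rx ? 1 : 0;
}

my @samples = (
    'root:*:0:0:System Administrator:/var/root:/bin/sh',  # from the question
    '# a comment line',                                   # made-up sample
    '_www:*:70:70:Web Server:/var/empty:/usr/bin/false',  # made-up sample
    ':::::/:/',                                           # made-up sample
);

printf "%-8s %s\n", (looks_like_passwd_line($_) ? 'match' : 'no match'), $_
    for @samples;
```

The third sample is rejected only because of the leading underscore in the name, while the last one is accepted because every quantifier is *, so empty name, uid, and gid fields pass. Changing * to + for those fields would be one way to tighten it.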