Question

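I'm trying to check whether a string is comma-separated. After checking the string I'm going to use it to help load a SQL database, so the words in the string can't be separated by anything other than a comma. I do have an approach that works, but it seems very clunky for Python. Is there a more concise/less expensive way to check whether a string is comma-separated?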

Here is my attempt, run in the Python 2.7.4 interpreter:

# List of possible Strings
comma_check_list = ['hello, world', 'hello world', 'hello,  world',
                    'hello world, good, morning']

# Dictionary of punctuation that's not a comma
punct_dict = {'@': True, '^': True, '!': True, ' ': True, '#': True, '%': True,
              '$': True, '&': True, ')': True, '(': True, '+': True, '*': True,
              '-': True, '=': True}

# Function to check the string
def string_check(comma_check_list, punct_dict):
    for string in comma_check_list:
        new_list = string.split(", ")
        if char_check(new_list, punct_dict) == False:
            print string, False
        else:
            print string, True

# Function to check each character
def char_check(new_list, punct_dict):
    for item in new_list:
        for char in item:
            if char in punct_dict:
                return False

# Usage
string_check(comma_check_list, punct_dict)

# Output
hello, world True
hello world False
hello,  world False
hello world, good, morning False

Thanks in advance for your help!

Answer 1

for currentString in comma_check_list:
    if any(True for char in currentString if char in '@^! #%$&)(+*-="'):
        print currentString, False
    else:
        print currentString, True

The characters in '@^! #%$&)(+*-="' are the ones that should not appear in the string. So if any character of currentString is in that set, we print False.
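One thing to watch with this approach: the forbidden set includes the space, so a string such as 'hello, world' is also reported as False, unlike the output shown in the question. Below is a minimal sketch of one way to package the check as functions while keeping the ', ' separator legal. It assumes the separator is always a comma followed by exactly one space, as in the question. The names FORBIDDEN, is_clean and is_comma_separated are made up for illustration, and the code is written to run on both Python 2.7 and Python 3.

```python
# Characters that must not appear inside an item (same set as above).
FORBIDDEN = set('@^! #%$&)(+*-="')


def is_clean(text, forbidden=FORBIDDEN):
    """Return True when no character of text is in forbidden."""
    return not any(char in forbidden for char in text)


def is_comma_separated(text, forbidden=FORBIDDEN):
    """Split on ', ' first so the separator's own space is not flagged."""
    return all(is_clean(item, forbidden) for item in text.split(", "))
```

With these definitions, is_clean('hello, world') is False because of the space, while is_comma_separated('hello, world') is True.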

Answer 2

I would probably reduce your code to the following.

# List of possible Strings
comma_check_list = ['hello, world', 'hello world', 'hello,  world', 'hello world, good, morning']

# Set of punctuation that's not a comma
punct = set('@^! #%$&)(+*-="')

# Function to check the string
def string_check(comma_check_list, punct):
    for string in comma_check_list:
        new_list = string.split(", ")
        print string, not any(char in punct for item in new_list for char in item)

# Usage
string_check(comma_check_list, punct)

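Changes made:

Used a set, since you were only using the dictionary for lookups.
Used any.
Printed the result directly instead of branching on an if condition.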

Output

In [6]: %run 
hello, world True
hello world False
hello,  world False
hello world, good, morning False
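The nested generator expression passed to any reads left to right like nested for loops. As a rough illustration (the function names has_punct_loop and has_punct_any are hypothetical, not part of the answer), the two forms below should give the same result for any list of items; the code runs on both Python 2.7 and Python 3.

```python
def has_punct_loop(items, punct):
    """Explicit-loop version: True as soon as a forbidden character is seen."""
    for item in items:
        for char in item:
            if char in punct:
                return True
    return False


def has_punct_any(items, punct):
    """Same check written with any() and a nested generator expression."""
    return any(char in punct for item in items for char in item)


punct = set('@^! #%$&)(+*-="')
samples = ['hello, world', 'hello world', 'hello,  world',
           'hello world, good, morning']

# Both versions agree on every sample from the question.
for sample in samples:
    parts = sample.split(", ")
    assert has_punct_loop(parts, punct) == has_punct_any(parts, punct)
```

Both versions stop at the first forbidden character, so the any form does not cost extra work compared with the explicit loops.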

Answer 3

You should whitelist valid SQL identifiers:

import re

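# First character: letter or underscore; the rest (possibly none): letters,
# digits, underscores or dollar signs. '*' lets one-character names pass.
ID_RE = re.compile(r'^[a-zA-Z_][a-zA-Z_0-9$]*$')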

def is_sql_columns(columns):
    return all(ID_RE.match(column_name.strip()) 
               for column_name in columns.split(','))

### Test cases ###

def main():
    test = [
        'hello,world',     # True
        ' hello , world ', # True
        'hello world',     # False
        '!@#$%^&*,yuti',   # False
        'hello',           # True
        'hello,',          # False
        'a!b,c@d',         # False
        ''                 # False
    ]

    for t in test:
        print '{!r:>16}{!r:>8}'.format(t, is_sql_columns(t))

if __name__ == '__main__':
    main()

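This is a conservative RE for valid identifiers in PostgreSQL; it does not handle non-ASCII letters or quoted identifiers. It also allows extra whitespace around each name, since that whitespace doesn't matter anyway.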

Also note that this will reject a valid column list from a SELECT that uses column aliases (e.g. SELECT first_name AS fname, last_name lname…).
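To see which names trip the whitelist, including the alias case just mentioned, a small helper can return the rejected names instead of a single boolean. This is only a sketch: rejected_columns and IDENT_RE are hypothetical names, and the pattern here is an assumed variant that uses '*' for the trailing characters so that one-character names also pass. It runs on both Python 2.7 and Python 3.

```python
import re

# Assumed pattern: a letter or underscore, then any number of letters,
# digits, underscores or dollar signs.
IDENT_RE = re.compile(r'^[a-zA-Z_][a-zA-Z_0-9$]*$')


def rejected_columns(columns):
    """Return the stripped column names that fail the identifier pattern."""
    names = [name.strip() for name in columns.split(',')]
    return [name for name in names if not IDENT_RE.match(name)]
```

An empty result means every name passed. For 'first_name AS fname, last_name lname' both entries come back as rejected, because the space inside each entry does not match the pattern.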

Less clunky way to check for a comma-separated string in Python?
