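This code uses the pandas_schema package to validate the data in a DataFrame loaded from a CSV file. What I need is code that does not use any validation package; it should only use functions, conditional statements, exceptions, and so on. I tried the following approach, but it does not work: df['customertype'].isin(['type1','type2'])
It gives me the error "list indices must be integers or slices, not str". Please help.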
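# Schema and Column are used below, so they have to be imported as well
from pandas_schema import Column, Schema
from pandas_schema.validation import (
    InListValidation,
    IsDtypeValidation,
    DateFormatValidation,
    MatchesPatternValidation,
)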
schema = Schema([
    # Match a string of length between 1 and 5
    Column('CompanyID', [MatchesPatternValidation(r".{1,5}")]),
    # Match a date-like string of ISO 8601 format (https://www.iso.org/iso-8601-date-and-time-format.html)
    Column('initialdate', [DateFormatValidation("%Y-%m-%d %H:%M:%S")], allow_empty=True),
    # Match only strings in the following list
    Column('customertype', [InListValidation(["type1", "type2", "type3"])]),
    # Match an IP address RegEx (https://www.oreilly.com/library/view/regular-expressions-cookbook/9780596802837/ch07s16.html)
    Column('ip', [MatchesPatternValidation(r"(?:[0-9]{1,3}\.){3}[0-9]{1,3}")]),
    # Match only strings in the following list
    Column('customersatisfied', [InListValidation(["yes", "no"])], allow_empty=True)
])
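For reference, the error "list indices must be integers or slices, not str" is what Python raises when a plain list is indexed with a string, which suggests that df was a list at that point rather than a pandas DataFrame. Below is a minimal sketch of the same checks written with only functions, conditionals and exceptions. The names is_empty, is_valid_date and validate are hypothetical helpers, not part of pandas or pandas_schema, and the rules follow the intent stated in the schema comments (length 1 to 5, full match of the IP pattern), which is an assumption and may differ in detail from how pandas_schema applies them.

```python
import re
from datetime import datetime

import pandas as pd

# Same IP pattern as in the schema; applied here as a full match (assumption).
IP_PATTERN = re.compile(r"(?:[0-9]{1,3}\.){3}[0-9]{1,3}")


def is_empty(value):
    """Treat NaN/None and empty strings as empty cells."""
    return bool(pd.isna(value)) or str(value) == ""


def is_valid_date(value, fmt="%Y-%m-%d %H:%M:%S"):
    """Return True if value parses with the given strptime format."""
    try:
        datetime.strptime(str(value), fmt)
        return True
    except ValueError:
        return False


def validate(df):
    """Return a list of (row_index, column, value) tuples for invalid cells."""
    if not isinstance(df, pd.DataFrame):
        # Indexing a list with a column name is what triggers
        # "list indices must be integers or slices, not str".
        raise TypeError(f"expected a pandas DataFrame, got {type(df).__name__}")

    # column -> (check function, allow_empty), mirroring the schema above
    checks = {
        "CompanyID": (lambda v: 1 <= len(str(v)) <= 5, False),
        "initialdate": (is_valid_date, True),
        "customertype": (lambda v: v in ("type1", "type2", "type3"), False),
        "ip": (lambda v: IP_PATTERN.fullmatch(str(v)) is not None, False),
        "customersatisfied": (lambda v: v in ("yes", "no"), True),
    }

    errors = []
    for column, (check, allow_empty) in checks.items():
        if column not in df.columns:
            raise KeyError(f"missing column: {column}")
        for idx, value in df[column].items():
            if is_empty(value):
                if not allow_empty:
                    errors.append((idx, column, value))
            elif not check(value):
                errors.append((idx, column, value))
    return errors
```

Passing the result of pd.read_csv to validate returns an empty list when every cell passes. Returning the failures instead of raising on the first one keeps all problems visible in a single run; raising an exception inside the loop would be a small change if that is preferred.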