Filtering a Python dataset by "contains value"

Time: 2020-08-28 06:26:32

Tags: python pandas

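In Python, how can I filter a column for values that contain a specific value?

For example, a dataset has a column named "City" whose values can be "Sydney", "Greater Sydney", "North Sydney", and so on. If the user inputs "Sydney", how can I make sure all of these variations are included in the filter?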

#user inputs column
input1 = input()
country_city = input1.title()

#user inputs value
input2 = input()
country_city_value = input2.title()

#filtering step (current)
filtered = dataset[dataset[country_city] == country_city_value]
print(filtered)

2 Answers:

Answer 0: (score: 1)

If you want to keep the rows whose value contains the input word, use str.contains:

import pandas as pd

data = {'City': ['Sydney', 'Greater Sydney', 'North Sydney']}

dataset = pd.DataFrame(data, columns=['City'])

#user inputs column
input1 = input()              
country_city = input1.title()        # 'City'

#user inputs value
input2 = input()
country_city_value = input2.title()  # 'Sydney'

#filtering step (substring match with str.contains)
filtered = dataset[dataset[country_city].str.contains(country_city_value)]

#              City
# 0          Sydney
# 1  Greater Sydney
# 2    North Sydney
print(filtered)
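
As a possible refinement of the filter above: by default, str.contains is case-sensitive, treats the search text as a regular expression, and returns a missing value for rows where the cell is missing. The sketch below shows one way to relax those defaults. The sample data and the name `needle` are illustrative assumptions, not part of the question.

```python
import pandas as pd

# Illustrative data (an assumption, not taken from the question).
dataset = pd.DataFrame(
    {"City": ["Sydney", "greater sydney", "North Sydney", "Melbourne", None]}
)

needle = "sydney"  # hypothetical user input, used exactly as typed

# case=False  -> ignore upper/lower case
# regex=False -> treat the input as literal text, not a regular expression
# na=False    -> rows with a missing value count as "no match"
mask = dataset["City"].str.contains(needle, case=False, regex=False, na=False)
filtered = dataset[mask]
print(filtered)
```

With `case=False`, the `.title()` call on the user's value is no longer needed for the comparison, although it may still be useful for matching the column name.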

Answer 1: (score: 1)

str.contains is a good choice, but if the input is "North Sydney", you will not get "Sydney" back, only "North Sydney". For example:

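import re
import pandas as pd

df = pd.DataFrame({
    'A': ['Sydney', 'North Sydney', 'Alaska']
})
print(df)
#               A
# 0        Sydney
# 1  North Sydney
# 2        Alaska

# named user_input so that the built-in input() is not shadowed
user_input = 'North Sydney'
filtered = df[df.A.str.contains(user_input)]

print(filtered)
#               A
# 1  North Sydney

So, use split() together with str.contains():

words = user_input.split()
print(words)
# ['North', 'Sydney']

# Join the words with '|' so the regex matches rows containing ANY of them.
# Formatting the list itself with '%s' would give "['North', 'Sydney']",
# which a regex reads as a character class (any single one of those characters).
pattern = '|'.join(re.escape(w) for w in words)
filtered = df[df.A.str.contains(pattern)]

print(filtered)
#               A
# 0        Sydney
# 1  North Sydney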

This way, you can be sure that every part of the input is taken into account.
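
The idea of matching any word of the input can be sketched as a small helper. The function name `filter_any_word` and the sample data are hypothetical. The pattern is built by joining the escaped words with `|`, which in a regular expression means "either side matches".

```python
import re
import pandas as pd


def filter_any_word(frame, column, text):
    """Return the rows of `frame` whose `column` contains any word of `text`."""
    words = text.split()
    if not words:
        # Nothing to search for: return an empty frame with the same columns.
        return frame.iloc[0:0]
    # re.escape keeps characters such as '.' or '(' literal in the pattern.
    pattern = "|".join(re.escape(word) for word in words)
    mask = frame[column].str.contains(pattern, case=False, na=False)
    return frame[mask]


df = pd.DataFrame({"A": ["Sydney", "North Sydney", "Alaska"]})
result = filter_any_word(df, "A", "North Sydney")
print(result)
```

Because a single matching word is enough, an input of "North Sydney" would also return a row such as "North Carolina" if the column contained one. Whether that is acceptable depends on how broad the filter is meant to be.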