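I'm trying to write a feature that cleans up URLs (stripping things like "www.", "http://", and so on) so that I end up with a list I can sort alphabetically.
I tried to do this by creating a class with a method that detects the term I want to remove from each URL string and removes it. The part I'm struggling with is that I want to add the modified URLs to a new list called new_strings, and then use that new list when I call the method a second time with a different term, so that step by step I can remove all the unwanted elements from the URL strings.
For some reason my current code returns an empty list, and I'm also unsure whether new_strings should be passed to __init__. I think I'm a bit confused about global and local variables; some help and an explanation would be much appreciated. :)
Thanks! The code is below.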
class URL_Cleaner(object):

    def __init__(self, old_strings, new_strings, term):
        self.old_strings = old_strings
        self.new_strings = new_strings
        self.term = term

    new_strings = []

    def delete_term(self, new_strings):
        for self.string in self.old_strings:
            if self.term in string:
                new_string = string.replace(term, "")
                self.new_strings.append(new_string)
            else:
                self.new_strings.append(string)
        return self.new_strings

    print "\n" .join(new_strings) #for checking; will be removed later

strings = ["www.google.com", "http://www.google.com", "https://www.google.com"]
new_strings = []
www = URL_Cleaner(strings, new_strings, "www.")
Answer 0 (score: 2)
Why do we need a class here at all?
# str.replace returns a new string, so the result has to be kept
strings = [string.replace("www.", "") for string in strings]
Isn't that what you're trying to accomplish?
Regardless, the problem is in your class definition. Pay attention to scope:
class URL_Cleaner(object):
    def __init__(self, old_strings, new_strings, term):
        """These are all instance attributes"""
        self.old_strings = old_strings
        self.new_strings = new_strings
        self.term = term

    new_strings = []  # this is a class attribute

    def delete_term(self, new_strings):
        """You never actually call this function! It never does anything!"""
        for self.string in self.old_strings:
            # note: bare `string` and `term` are undefined names here;
            # the loop binds self.string, and the term lives in self.term
            if self.term in string:
                new_string = string.replace(term, "")
                self.new_strings.append(new_string)
            else:
                self.new_strings.append(string)
        return self.new_strings

    print "\n" .join(new_strings) #for checking; will be removed later
    # this refers to the class attribute, and will be evaluated when
    # the class is defined, NOT when the object is created!
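To make the scope point concrete, here is a minimal sketch in Python 3 syntax. The names Demo, shared and own are made up for illustration and are not part of your code. It only shows that a list assigned in the class body is a different object from a list assigned to self inside __init__:

```python
class Demo(object):
    # Class attribute: created once, when the class body is executed.
    shared = []

    def __init__(self):
        # Instance attribute: a fresh list for every instance.
        self.own = []

    def add(self, item):
        self.own.append(item)


first = Demo()
second = Demo()
first.add("www.google.com")

print(first.own)    # ['www.google.com']
print(second.own)   # []
print(Demo.shared)  # [] (nothing was ever appended to the class-level list)
```

In your class, the print line sits in the class body, so it sees the empty class-level list at definition time, which is why the output is empty.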
I've annotated your code with comments explaining what goes wrong and why. To fix it:
class URL_Cleaner(object):
    def __init__(self, old_strings):
        """Cleans URL of 'http://www.'"""
        self.old_strings = old_strings
        # store the result on the instance so it outlives __init__
        self.cleaned_strings = self.clean_strings()

    def clean_strings(self):
        """Clean the strings"""
        accumulator = []
        for string in self.old_strings:
            string = string.replace("http://", "").replace("www.", "")
            # this might be better as string = re.sub(r"http://(?:www\.)?", "", string)
            # but I'm not going to introduce re yet.
            accumulator.append(string)
        return accumulator
        # this whole function is just:
        ## return [re.sub(r"http://(?:www\.)?", "", string, flags=re.I) for string in self.old_strings]
        # but that's not as readable imo.
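If you do decide to use re, a function-only version might look roughly like the sketch below (Python 3 syntax). The names PREFIX and clean_url are made up for this example, and the pattern is an assumption about what you want stripped: an optional http:// or https:// scheme followed by an optional www. at the start of the string. It also sorts the result, since that was the stated goal.

```python
import re

# Assumed pattern: optional scheme (http or https), then optional "www.",
# anchored at the start of the string and matched case-insensitively.
PREFIX = re.compile(r"^(?:https?://)?(?:www\.)?", flags=re.I)


def clean_url(url):
    """Return url without a leading scheme and 'www.' prefix."""
    return PREFIX.sub("", url)


strings = ["www.google.com", "http://www.google.com", "https://www.google.com"]

# Clean every URL, then sort the cleaned values alphabetically.
cleaned = sorted(clean_url(s) for s in strings)
print("\n".join(cleaned))
```

Note that plain .replace("http://", "") leaves "https://" URLs untouched, which is one reason a single pattern can be easier to maintain than a chain of replace calls.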
Answer 1 (score: 1)
You just need to define new_strings as
self.new_strings = []
and remove the new_strings parameter from the constructor.
new_strings and self.new_strings are two different lists.
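Put together, that change could look something like the following sketch (Python 3 syntax). It is one possible variation rather than the only way to do it: as an assumption, self.new_strings starts as a copy of the input instead of an empty list, and the term is passed to delete_term rather than to the constructor, so the method can be called once per term as described in the question.

```python
class URL_Cleaner(object):
    def __init__(self, old_strings):
        self.old_strings = old_strings
        # Instance attribute, not a constructor parameter. Assumption: it
        # starts as a copy of the input so delete_term can be called repeatedly.
        self.new_strings = list(old_strings)

    def delete_term(self, term):
        """Remove term from every string collected so far."""
        self.new_strings = [s.replace(term, "") for s in self.new_strings]
        return self.new_strings


strings = ["www.google.com", "http://www.google.com", "https://www.google.com"]
cleaner = URL_Cleaner(strings)

# Strip one term per call; each call works on the previous call's result.
for term in ("https://", "http://", "www."):
    cleaner.delete_term(term)

print("\n".join(sorted(cleaner.new_strings)))
```

Because the list lives on the instance, each call builds on the previous one, and the original strings list is left unchanged.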