I have a data file that lists dates (denoted by lines containing a .) and names followed by numbers:
2015.05.22
nameA 15
nameB 32
2015.05.20
nameA 2
nameC 26
This list file is long (about 97k lines and growing every day), and I would like to (quickly) list all the unique names. In bash I can already do this. But I use this data in Python, and I am wondering whether there is a way to do the same thing in Python. Obviously I could simply call the shell command from my Python script, but I would rather learn the "best practice" way of doing it.
Answer 0 (score: 1)
This does the trick; it implements the same steps as your shell command:
Filter the lines of a given file; remove any line that contains a .; get the unique set of what remains; print it.
Example:
from __future__ import print_function
lines = (line.strip() for line in open("foo.txt", "r"))
all_names = (line.split(" ", 1)[0] for line in lines if "." not in line)
unique_names = set(all_names)
print("\n".join(unique_names))
Output:
$ python foo.py
nameC
nameB
nameA
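Because a set has no defined order, the output above may come out in a different order on another run or Python version. The following is a minimal sketch that sorts the result for reproducible output. The function name `sorted_unique_names` is illustrative and not from the answer, and `io.StringIO` is only a stand-in for the real file:

```python
import io

def sorted_unique_names(lines):
    """Return the distinct names from an iterable of lines, alphabetically sorted."""
    stripped = (line.strip() for line in lines)
    # Skip blank lines and date lines (the ones containing a dot).
    names = (line.split(" ", 1)[0] for line in stripped if line and "." not in line)
    return sorted(set(names))

# Stand-in for open("foo.txt"); the contents mirror the sample data.
fake_file = io.StringIO("2015.05.22\nnameA 15\nnameB 32\n2015.05.20\nnameA 2\nnameC 26\n")
print("\n".join(sorted_unique_names(fake_file)))
```

Since the function accepts any iterable of lines, an open file object can be passed in directly, and the file is still read lazily.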
Answer 1 (score: 1)
Just use re:
>>> input_str = """
2015.05.22
nameA 15
nameB 32
2015.05.20
nameA 2
nameC 26
"""
>>> import re
>>> set(re.findall('[a-zA-Z]+', input_str))
set(['nameB', 'nameC', 'nameA'])
>>>
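The pattern above matches every run of letters, so a name such as `name_1` would be split into pieces. A slightly stricter sketch anchors the match to whole lines instead. It assumes every name line has the form "name, whitespace, integer" and that names never contain a dot; the helper name `unique_names` is hypothetical:

```python
import re

# Assumption: a name line is "<name><spaces><integer>", and names contain no dot.
NAME_LINE = re.compile(r"^([^\s.]+)[ \t]+\d+[ \t]*$", re.MULTILINE)

def unique_names(text):
    """Return the set of names that start 'name number' lines in text."""
    # With a single capture group, findall returns just the captured names.
    return set(NAME_LINE.findall(text))

sample = """
2015.05.22
nameA 15
nameB 32
2015.05.20
nameA 2
nameC 26
"""
print(sorted(unique_names(sample)))
```

Date lines fail to match because the dot after the year cannot be consumed by either the name group or the whitespace that must follow it.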
Answer 2 (score: 0)
You can do all of this with a single awk command:
$ awk 'NF && $1!~/\./ {a[$1]} END {for (i in a) print i}' file
nameC
nameA
nameB
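This checks for lines that contain some data (NF) and whose first field does not contain a dot. For those lines it stores the first field as a key of the array a[], and the keys are printed at the end.
In Python, you can use a set to store the names and prevent duplicates; the line.split() test skips blank lines:
names = {line.split()[0] for line in open('a') if line.split() and "." not in line.split()[0]}
for name in names:
    print(name)  # print(...) with a single argument works in Python 2 and 3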
Answer 3 (score: 0)
A more verbose way:
unique_results = set()
with open("my file.txt") as my_file:
    for line in my_file:
        fields = line.split()
        # Keep the first field of non-blank lines that are not dates.
        if fields and "." not in line:
            unique_results.add(fields[0])
Answer 4 (score: 0)
This can be done in a single line of code (assuming Python 2.x):
unique_names = {}.fromkeys([line.split()[0] for line in open("file.txt", "r") if "." not in line]).keys()
print unique_names
Output:
['nameB', 'nameC', 'nameA']
If you want the output to look like the shell's:
print "\n".join(unique_names)
Output:
nameB
nameC
nameA
As long as the order of the names does not matter, Python is quite elegant here too.
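If the order does matter, neither a set nor a Python 2 dict will preserve it. A minimal sketch, assuming Python 3.7 or later (where dict keeps insertion order), returns the names in the order they first appear. The function name `unique_names_in_order` is illustrative, not part of the answer:

```python
def unique_names_in_order(lines):
    """Return names in first-seen order, skipping blank lines and date lines."""
    names = (line.split()[0] for line in lines if line.strip() and "." not in line)
    # dict keys are unique and, since Python 3.7, keep insertion order.
    return list(dict.fromkeys(names))

# In-memory stand-in for the lines of the data file.
sample_lines = ["2015.05.22\n", "nameA 15\n", "nameB 32\n",
                "2015.05.20\n", "nameA 2\n", "nameC 26\n"]
print("\n".join(unique_names_in_order(sample_lines)))
```

Passing an open file object instead of the list works the same way, because the function only iterates over its argument once.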