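I want to loop over a list, grep for each item, and then use awk to extract the important information from each grep result. (This is how I thought of doing it, but awk and grep are not required if there is a better way.)
The input file contains many lines similar to this one: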
chr1 12345 . A G 3e-12 . AB=0;ABP=0;AC=0;AF=0;AN=2;AO=2;CIGAR=1X;
I have a number of locations that should match part of the second column.
locList="123, 789"
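For each matching location, I want to take the information from columns 4 and 5 and write it to an output file together with the corresponding location.
So the output for the list above should be: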
123 A G
This is what I was thinking:
# strip the commas (bash parameter expansion), then loop over each location
for i in ${locList//,/}; do
    grep "$i" inputFile.txt | awk '{print $2,$4,$5}'
done
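Answer 0 (score: 4)
Calling grep/awk once per location will be very inefficient. You want to call a single command that does all the parsing. For example, awk: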
awk -v locList="12345 789" '
BEGIN {
# parse the location list, and create an array where
# the locations are the array indexes
n = split(locList, a)
for (i=1; i<=n; i++) locations[a[i]] = 1
}
$2 in locations {print $2, $4, $5}
' file
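To see how the lookup array behaves, here is a small self-contained run of the same program. It is only a sketch: the temporary file and its second line are invented for illustration and are not part of the question's data.

```shell
# Build a tiny sample file; the second line is invented for illustration.
input=$(mktemp)
printf '%s\n' \
  'chr1 12345 . A G 3e-12 . AB=0;ABP=0;' \
  'chr1 99999 . C T 1e-05 . AB=0;ABP=0;' > "$input"

# Only positions that are exactly equal to an entry of locList are printed.
result=$(awk -v locList="12345 789" '
BEGIN {
    n = split(locList, a)
    for (i = 1; i <= n; i++) locations[a[i]] = 1
}
$2 in locations { print $2, $4, $5 }
' "$input")

echo "$result"
rm -f "$input"
```

Because `in` tests for an exact array key, a list entry such as 123 would not select position 12345 with this version.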
Revised requirements
awk -v locList="123 789" '
BEGIN { n = split(locList, patterns) }
{
for (i=1; i<=n; i++) {
if ($2 ~ "^" patterns[i]) {
print $2, $4, $5
break
}
}
}
' file
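The ~ operator is the regular-expression matching operator.
This will output 12345 A G from your sample input. If you only want to output 123 A G, print patterns[i] instead of $2.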
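The patterns[i] variant can be sketched as a self-contained run against the sample line from the question. The temporary file is an assumption made only so the snippet runs on its own.

```shell
# Write the sample line from the question to a temporary file.
input=$(mktemp)
printf '%s\n' 'chr1 12345 . A G 3e-12 . AB=0;ABP=0;AC=0;AF=0;AN=2;AO=2;CIGAR=1X;' > "$input"

# Print the matching list entry (patterns[i]) rather than the full position ($2).
result=$(awk -v locList="123 789" '
BEGIN { n = split(locList, patterns) }
{
    for (i = 1; i <= n; i++) {
        if ($2 ~ "^" patterns[i]) {
            print patterns[i], $4, $5
            break
        }
    }
}
' "$input")

echo "$result"
rm -f "$input"
```

With the sample line this prints 123 A G, which is the output requested in the question.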
Answer 1 (score: 1)
What I would do:
public static class DbContextExtensions
{
private static readonly ConcurrentDictionary< EntityType, ReadOnlyDictionary< string, NavigationProperty>> s_navPropMappings = new ConcurrentDictionary< EntityType, ReadOnlyDictionary< string, NavigationProperty>>();
public static void DeleteOrphans( this DbContext source )
{
var context = ((IObjectContextAdapter)source).ObjectContext;
foreach (var entry in context.ObjectStateManager.GetObjectStateEntries(EntityState.Modified))
{
var entityType = entry.EntitySet.ElementType as EntityType;
if (entityType == null)
continue;
var navPropMap = s_navPropMappings.GetOrAdd(entityType, CreateNavigationPropertyMap);
var props = entry.GetModifiedProperties().ToArray();
foreach (var prop in props)
{
NavigationProperty navProp;
if (!navPropMap.TryGetValue(prop, out navProp))
continue;
var related = entry.RelationshipManager.GetRelatedEnd(navProp.RelationshipType.FullName, navProp.ToEndMember.Name);
var enumerator = related.GetEnumerator();
if (enumerator.MoveNext() && enumerator.Current != null)
continue;
entry.Delete();
break;
}
}
}
private static ReadOnlyDictionary<string, NavigationProperty> CreateNavigationPropertyMap( EntityType type )
{
var result = type.NavigationProperties
.Where(v => v.FromEndMember.RelationshipMultiplicity == RelationshipMultiplicity.Many)
.Where(v => v.ToEndMember.RelationshipMultiplicity == RelationshipMultiplicity.One || (v.ToEndMember.RelationshipMultiplicity == RelationshipMultiplicity.ZeroOrOne && v.FromEndMember.GetEntityType() == v.ToEndMember.GetEntityType()))
.Select(v => new { NavigationProperty = v, DependentProperties = v.GetDependentProperties().Take(2).ToArray() })
.Where(v => v.DependentProperties.Length == 1)
.ToDictionary(v => v.DependentProperties[0].Name, v => v.NavigationProperty);
return new ReadOnlyDictionary<string, NavigationProperty>(result);
}
}
Answer 2 (score: 1)
awk -v locList='123|789' '$2~"^("locList")" {print $2,$4,$5}' file
Or, if you prefer:
locList='123, 789'
awk -v locList="^(${locList//, /|})" '$2~locList {print $2,$4,$5}' file
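Or any other permutation you like. The point is that you do not need a loop at all: just build a regular expression from the list of numbers in locList and test against that regex.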
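As one more permutation, the regex can be built with sed instead of bash parameter expansion, which may help in shells that lack the `${var//…}` syntax. This is a sketch under stated assumptions: the variable names `alt` and `re` are made up here, and the second and third sample lines are invented for illustration.

```shell
# Sample data; only the first line comes from the question.
input=$(mktemp)
printf '%s\n' \
  'chr1 12345 . A G 3e-12 . AB=0;' \
  'chr1 45678 . C T 2e-03 . AB=0;' \
  'chr1 78901 . T C 5e-07 . AB=0;' > "$input"

locList='123, 789'

# Turn "123, 789" into the alternation "123|789".
alt=$(printf '%s' "$locList" | sed 's/, */|/g')

# Anchor the alternation at the start of column 2 and test every line once.
result=$(awk -v re="^($alt)" '$2 ~ re { print $2, $4, $5 }' "$input")

echo "$result"
rm -f "$input"
```

Since the pattern is only anchored at the start, any position that begins with 123 or 789 is selected, so 12345 and 78901 both match here.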