Iterating over a list and running multiple grep commands in bash

Date: 2015-09-22 18:46:26

Tags: bash awk

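I want to iterate over a list, grep for each item, and then use awk to extract the important fields from each grep result. (That is how I imagined doing it, but awk and grep are not required if there is a better way.)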

The input file contains many lines similar to this one:

chr1    12345   .   A   G   3e-12   .   AB=0;ABP=0;AC=0;AF=0;AN=2;AO=2;CIGAR=1X;

I have a number of positions that should match part of the second column.

locList="123, 789"

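For each matching position, I want to take the values from columns 4 and 5 and write them to an output file together with the corresponding position.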

So the output for the list above should be:

123 A G

This is what I had in mind:

# strip the commas so that word splitting yields one position per iteration
for i in ${locList//,/}; do
    grep "$i" inputFile.txt | awk '{print $2,$4,$5}'
done

3 answers:

Answer 0 (score: 4)

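Calling grep/awk once per position would be very inefficient, because the input file is read again for every position. You want to call a single command that does all the parsing in one pass. For example, with awk: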

awk -v locList="12345 789" '
    BEGIN {
        # parse the location list, and create an array where
        # the locations are the array indexes
        n = split(locList, a)
        for (i=1; i<=n; i++) locations[a[i]] = 1
    }
    $2 in locations {print $2, $4, $5}
' file
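The array-lookup approach above can be tried on a small sample. The following is only a sketch: the file contents, the `sample` variable, and the `exact_positions` function name are made up for illustration, and the separator regex passed to `split` is an added assumption so that a comma-separated list like the one in the question is accepted as well.

```shell
# Hypothetical sample input (tab-separated), modelled on the line in the question.
sample=$(mktemp)
printf 'chr1\t12345\t.\tA\tG\t3e-12\t.\tAB=0;ABP=0;AC=0\n' > "$sample"
printf 'chr1\t67890\t.\tC\tT\t2e-08\t.\tAB=0;ABP=0;AC=0\n' >> "$sample"
printf 'chr2\t789\t.\tG\tC\t5e-03\t.\tAB=0;ABP=0;AC=0\n' >> "$sample"

# Print position, column 4 and column 5 for every line whose
# second column is exactly one of the listed positions.
exact_positions() {
    awk -v locList="$1" '
        BEGIN {
            # split on runs of spaces and/or commas
            n = split(locList, a, /[ ,]+/)
            for (i = 1; i <= n; i++) locations[a[i]] = 1
        }
        $2 in locations { print $2, $4, $5 }
    ' "$2"
}

exact_positions "12345, 789" "$sample"
```

With this sample, only the lines whose second column is exactly 12345 or 789 are printed; the file is read once regardless of how many positions are listed.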

For the revised requirements, where each list entry only has to match the beginning of column 2:

awk -v locList="123 789" '
    BEGIN { n = split(locList, patterns) }
    {
        for (i=1; i<=n; i++) {
            if ($2 ~ "^" patterns[i]) {
                print $2, $4, $5
                break
            }
        }
    }
' file

The ~ operator is the regular-expression match operator.

This prints 12345 A G for your sample input. If you want 123 A G instead, print patterns[i] instead of $2.
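As a sketch of the `patterns[i]` variant, assuming a made-up sample file and a hypothetical wrapper function named `match_prefixes`:

```shell
# Hypothetical sample input (tab-separated).
input=$(mktemp)
printf 'chr1\t12345\t.\tA\tG\t3e-12\t.\tAB=0;ABP=0\n' > "$input"
printf 'chr1\t99999\t.\tC\tT\t1e-05\t.\tAB=0;ABP=0\n' >> "$input"
printf 'chr2\t78901\t.\tT\tC\t4e-07\t.\tAB=0;ABP=0\n' >> "$input"

# For each line, print the first list entry that matches the start of
# column 2 (rather than the full position), followed by columns 4 and 5.
match_prefixes() {
    awk -v locList="$1" '
        BEGIN { n = split(locList, patterns) }
        {
            for (i = 1; i <= n; i++) {
                if ($2 ~ ("^" patterns[i])) {
                    print patterns[i], $4, $5
                    break
                }
            }
        }
    ' "$2"
}

match_prefixes "123 789" "$input"
```

Note that each list entry is interpreted as a regular expression, so entries containing regex metacharacters would need escaping.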

Answer 1 (score: 1)

What I would do:


Answer 2 (score: 1)

awk -v locList='123|789' '$2~"^("locList")" {print $2,$4,$5}' file

Or, if you prefer:

locList='123, 789'
awk -v locList="^(${locList//, /|})" '$2~locList {print $2,$4,$5}' file

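Or any other variation you like. The point is that you do not need a loop at all: just build a regular expression from the list of numbers in locList and test column 2 against that regular expression.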
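The `${locList//, /|}` substitution is a bash extension, so under a plain POSIX sh the same regular expression has to be built another way. One possible sketch uses sed; the helper name `locs_to_regex` and the sample file are hypothetical and not part of the original answer.

```shell
# Convert a comma-separated list such as "123, 789" into "^(123|789)".
locs_to_regex() {
    printf '^(%s)\n' "$(printf '%s' "$1" | sed 's/, */|/g')"
}

# Hypothetical sample input (tab-separated, shortened to five columns).
sample=$(mktemp)
printf 'chr1\t12345\t.\tA\tG\n' > "$sample"
printf 'chr1\t55512\t.\tC\tT\n' >> "$sample"
printf 'chr2\t78901\t.\tT\tC\n' >> "$sample"

# One pass over the file, one regex test per line, no loop.
awk -v re="$(locs_to_regex '123, 789')" '$2 ~ re { print $2, $4, $5 }' "$sample"
```

The regular expression is anchored with `^`, so a position such as 55512 is skipped even though it contains the digits 12.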