How to map the contents of one CSV file onto a second CSV file and write the result to another CSV using Unix

Time: 2011-06-15 09:35:02

Tags: shell unix csv

After writing a few Unix scripts, I managed to convert the data from several XML files into CSV format. Now I am stuck on the following problem.

file1.csv contains:

1,5,6,7,8
2,3,4,5,9
1,6,10,11,12
1,5,11,12

file2.csv contains:

1,Mango,Tuna,Webby,Through,Franky,Sam,Sumo
2,Franky
3,Sam
4,Sumo
5,Mango,Tuna,Webby
6,Tuna,Webby,Through
7,Through,Sam,Sumo
8,Nothing
9,Sam,Sumo
10,Sumo,Mango,Tuna
11,Mango,Tuna,Webby,Through
12,Mango,Tuna,Webby,Through,Franky

The output I want is:

1,5,6,7,8
Mango,Tuna,Webby,Through,Franky,Sam,Sumo
Mango,Tuna,Webby
Tuna,Webby,Through
Through,Sam,Sumo
Nothing
Common word: None

2,3,4,5,9
Franky
Sam
Sumo
Mango,Tuna,Webby
Sam,Sumo
Common word: None

1,6,10,11,12
Mango,Tuna,Webby,Through,Franky,Sam,Sumo
Tuna,Webby,Through
Sumo,Mango,Tuna
Mango,Tuna,Webby,Through
Mango,Tuna,Webby,Through,Franky
Common word: Tuna

1,5,11,12
Mango,Tuna,Webby,Through,Franky,Sam,Sumo
Mango,Tuna,Webby
Mango,Tuna,Webby,Through
Mango,Tuna,Webby,Through,Franky
Common word: Mango,Tuna,Webby

Any help is welcome.

Thanks.

I have a partial solution, but it is not finished:

#!/bin/bash
count=1
# Create a separate file for each line of file1.csv (1.txt, 2.txt, ...),
# with one index per line in each file.
while IFS= read -r line || [ -n "$line" ]
do
    printf '%s\n' "$line" | tr ',' '\n' > "$count.txt"
    count=$((count + 1))
done < file1.csv
bash file3_search.sh
##########################

file3_search.sh
================
#!/bin/bash
# Drop blank lines and trailing spaces from file2.csv.
sed -e '/^$/d' -e 's/[ ]*$//' file2.csv > trim.txt
# Convert the per-line index files to Unix line endings in place.
dos2unix -q 1.txt 2.txt 3.txt
echo "1st Combination results"
for i in $(cat 1.txt)
do
    grep -E "^$i," trim.txt    # match the index in the first field only
done > Combination1.txt
echo "2nd Combination results"
for i in $(cat 2.txt)
do
    grep -E "^$i," trim.txt
done > Combination2.txt
echo "3rd Combination results"
for i in $(cat 3.txt)
do
    grep -E "^$i," trim.txt
done > Combination3.txt

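Guys, I am not good at programming (I am a software tester). Could someone please rework my code, and also tell me how to get the common words from these Combination.txt files?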

2 answers:

Answer 0 (score: 0)

IMHO, this works:

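while IFS= read -r line ; do
    echo "$line"
    # Turn "1,5,6" into the regex "^(1,|5,|6,)" so it matches the first field of file2.csv.
    grepline=$(echo "$line" | sed 's/ //g;s/,/,|/g;s/^\(.*\)$/^(\1,)/')
    grep -E "$grepline" file2.csv
    grep -E "$grepline" file2.csv | \
    awk -F "," '
      { for (i=2;i<=NF;i++)
          {s[$i]+=1}                 # count the occurrences of each word
      }
      END { for (key in s)           # note: iteration order is unspecified
              {if (s[key]==NR) { tp = tp key "," }   # seen once per matched line
              }
            if (tp!="") {sub(/,$/,"",tp); print "Common word(s): " tp}
              else {print "Common word: None"}}'
    echo
done < file1.csv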

HTH
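
The counting idea in the awk block above also covers the last part of the question, finding the common words in the Combination*.txt files. What follows is only a minimal sketch: it assumes each line of a Combination file looks like `5,Mango,Tuna,Webby` (index first, then words) and that no word repeats within a line. The function name `common_words` and the demo file are made up for illustration.

```shell
#!/bin/sh
# Sketch: print the words that occur on every line of one Combination file.
# Assumption: field 1 is the index, the remaining fields are the words.
common_words() {
    awk -F ',' '
        NF == 0 { next }                       # skip blank lines
        {
            lines++
            for (i = 2; i <= NF; i++) {
                if (!($i in count)) order[++n] = $i   # remember first-seen order
                count[$i]++
            }
        }
        END {
            out = ""
            for (j = 1; j <= n; j++)
                if (count[order[j]] == lines)  # present on every counted line
                    out = (out == "" ? "" : out ",") order[j]
            print "Common word: " (out == "" ? "None" : out)
        }' "$1"
}

# Hypothetical demo file mirroring the 1,5,11,12 combination from the question.
demo=$(mktemp)
printf '%s\n' \
    '1,Mango,Tuna,Webby,Through,Franky,Sam,Sumo' \
    '5,Mango,Tuna,Webby' \
    '11,Mango,Tuna,Webby,Through' \
    '12,Mango,Tuna,Webby,Through,Franky' > "$demo"
common_words "$demo"
rm -f "$demo"
```

Keeping the first-seen order in a separate array avoids depending on awk's `for (key in s)` order, which is not specified.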

Answer 1 (score: 0)

Here is an answer for you. It relies on the associative array feature of bash version 4:

IFS=,
declare -a words

# read and store the words in file2
while read -r line; do
    set -- $line
    n=$1
    shift
    words[$n]="$*"
done < file2.csv

# read file1 and process
while read -r line; do
    echo "$line"

    set -- $line
    indexes=( "$@" )
    NF=${#indexes[@]}
    declare -A common

    for (( i=0; i<$NF; i++)); do
        echo "${words[${indexes[$i]}]}"

        set -- ${words[${indexes[$i]}]}
        for word; do
            common[$word]=$(( ${common[$word]} + 1))
        done
    done

    printf "Common words: "
    n=0
    for word in "${!common[@]}"; do
        if [[ ${common[$word]} -eq $NF ]]; then
            printf "%s " "$word"
            (( n++ ))
        fi
    done
    [[ $n -eq 0 ]] && printf "None"

    unset common
    printf "\n\n"
done < file1.csv
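
If bash 4 is not available, the same lookup-then-count approach can probably be done with awk alone. The sketch below is an illustration under a few assumptions rather than a tested replacement: file2.csv is non-empty, no word repeats within a line, and `map_and_compare` and `output.csv` are hypothetical names chosen here.

```shell
#!/bin/sh
# Sketch: for each line of file1.csv, print the line, the word list of every
# index on it (looked up in file2.csv), and the words shared by all of them.
# Usage: map_and_compare file1.csv file2.csv
map_and_compare() {
    awk -F ',' '
        { sub(/\r$/, "") }                 # tolerate DOS line endings
        NF == 0 { next }                   # skip blank lines
        NR == FNR {                        # first file read (file2.csv): build the lookup
            words[$1] = substr($0, length($1) + 2)
            next
        }
        {                                  # second file read (file1.csv)
            print
            split("", count); n = 0        # reset the per-line state
            for (i = 1; i <= NF; i++) {
                print words[$i]
                m = split(words[$i], w, ",")
                for (j = 1; j <= m; j++) {
                    if (!(w[j] in count)) order[++n] = w[j]
                    count[w[j]]++
                }
            }
            out = ""
            for (j = 1; j <= n; j++)
                if (count[order[j]] == NF) # word occurs for every index
                    out = (out == "" ? "" : out ",") order[j]
            print "Common word: " (out == "" ? "None" : out)
            print ""
        }' "$2" "$1"
}

# Run only when the input files from the question are present.
if [ -f file1.csv ] && [ -f file2.csv ]; then
    map_and_compare file1.csv file2.csv > output.csv
fi
```

Words are reported in the order they first appear among the looked-up lists, which is how `Mango,Tuna,Webby` comes out for the `1,5,11,12` combination in the desired output.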