I am trying to print two variables (Phaster_positions and GBKPositions), each under its own column header, with the columns separated by tabs. This is what I get:
Phaster_positions GBKPositions Phaster_positions GBKPositions
371860-418565 247..381
2947108-2988239 378..1781
4663633-4680174 1884..2987
5756724-5793879 3008..3103
5794433-5829445 3128..4405
6867447-6901202 4479..5081
5102..6229
6253..8670
complement(8742..9269)
complement(9583..10563)
complement(10560..12458)
complement(12455..13402)
complement(13973..15541)
complement(15881..16051)
16440..16814
complement(16858..18234)
complement(18254..18628)
complement(18710..20266)
complement(20317..22452)
complement(22888..23454)
complement(23474..25552)
complement(25557..26504)
26735..27631
complement(27655..29334)
29603..30559
complement(30534..31982)
complement(32016..33389)
complement(33391..34734)
complement(34736..35692)
complement(35761..36267)
36431..37459
37519..38688
What I want:
Phaster_positions GBKPositions
371860-418565 247..381
2947108-2988239 378..1781
4663633-4680174 etc
5756724-5793879 etc
5794433-5829445 etc
6867447-6901202 etc
My script:
#!/bin/bash
printf "Phaster_positions\n\n">gbk31.txt
printf "GBKPositions\n\n">gbk32.txt
PhasterPositions=`awk '$2~/[0-9]Kb/{print ($5)}' CP000155.phaster`
GBKPositions=`awk '$1~/CDS/{print ($2)}' CP000155.gbk`
echo -e "$PhasterPositions">>gbk31.txt
echo -e "$GBKPositions">>gbk32.txt
joined=`paste gbk31.txt gbk32.txt | column -s $'\t' -t`
echo -e "$joined">> gbkfinal.txt
Source file for the first variable:
gi|00000000|ref|NC_000000| Hahella chejuensis KCTC 2396, complete genome. .7215267, gc%: 53.87%
REGION REGION_LENGTH COMPLETENESS(score) SPECIFIC_KEYWORD REGION_POSITION TRNA_NUM TOTAL_PROTEIN_NUM PHAGE_HIT_PROTEIN_NUM HYPOTHETICAL_PROTEIN_NUM PHAGE+HYPO_PROTEIN_PERCENTAGE BACTERIAL_PROTEIN_NUM ATT_SITE_SHOWUP PHAGE_SPECIES_NUM MOST_COMMON_PHAGE_NAME(hit_genes_count) FIRST_MOST_COMMON_PHAGE_NUM FIRST_MOST_COMMON_PHAGE_PERCENTAGE GC_PERCENTAGE
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
1 46.7Kb questionable(80) head,terminase,tail,capsid,recombinase 371860-418565 0 38 27 8 92.1% 3 yes 10 PHAGE_Pseudo_phi3_NC_030940(17),PHAGE_Aeromo_phiO18P_NC_009542(15),PHAGE_Haemop_HP1_NC_001697(11),PHAGE_Pasteu_F108_NC_008193(9),PHAGE_Vibrio_8_NC_022747(8),PHAGE_Vibrio_K139_NC_003313(8),PHAGE_Haemop_HP2_NC_003315(7),PHAGE_Phormi_MIS_PhV1A_NC_029032(3),PHAGE_Ralsto_RSY1_NC_025115(3),PHAGE_Burkho_KS14_NC_015273(2),PHAGE_Entero_186_NC_001317(2),PHAGE_Entero_N15_NC_001901(1),PHAGE_Salmon_SEN1_NC_029003(1),PHAGE_Mannhe_vB_MhM_587AP1_NC_028898(1),PHAGE_Salmon_RE_2010_NC_019488(1),PHAGE_Vibrio_vB_VpaM_MAR_NC_019722(1),PHAGE_Klebsi_phiKO2_NC_005857(1),PHAGE_Burkho_KS5_NC_015265(1),PHAGE_Pseudo_YuA_NC_010116(1),PHAGE_Vibrio_VP882_NC_009016(1),PHAGE_Mannhe_phiMHaA1_NC_008201(1),PHAGE_Pseudo_MP1412_NC_018282(1),PHAGE_Stenot_Smp131_NC_023588(1),PHAGE_Pseudo_JG004_NC_019450(1),PHAGE_Bdello_phi1422_NC_019525(1),PHAGE_Salmon_Fels_2_NC_010463(1),PHAGE_Bacill_G_NC_023719(1),PHAGE_Pseudo_phiCTX_NC_003278(1),PHAGE_Psychr_Psymv2_NC_023734(1),PHAGE_Entero_fiAA91_ss_NC_022750(1),PHAGE_Escher_pro483_NC_028943(1),PHAGE_Burkho_KL3_NC_015266(1) 16 44.73% 55.35%
2 41.1Kb intact(120) integrase,head,recombinase,capsid,tail 2947108-2988239 1 53 23 28 96.2% 2 yes 18 PHAGE_Pseudo_phi2_NC_030931(10),PHAGE_Entero_lambda_NC_001416(4),PHAGE_Pseudo_F10_NC_007805(4),PHAGE_Escher_vB_EcoM_ECO1230_10_NC_027995(3),PHAGE_Entero_N15_NC_001901(3),PHAGE_Burkho_AH2_NC_018283(3),PHAGE_Shewan_1/44_NC_025463(2),PHAGE_Achrom_phiAxp_2_NC_029106(2),PHAGE_Vibrio_VvAW1_NC_020488(2),PHAGE_Burkho_BcepNazgul_NC_005091(2),PHAGE_Entero_Arya_NC_031048(2),PHAGE_Entero_mEp460_NC_019716(2),PHAGE_Entero_HK630_NC_019723(2),PHAGE_Vibrio_X29_NC_024369(2),PHAGE_Escher_vB_EcoM_ep3_NC_025430(1),PHAGE_Salmon_phiSG_JL2_NC_010807(1),PHAGE_Rueger_DSS3_P1_NC_025428(1),PHAGE_Shigel_SfIV_NC_022749(1),PHAGE_Klebsi_phiKO2_NC_005857(1),PHAGE_Shigel_Ss_VASD_NC_028685(1),PHAGE_Entero_SfV_NC_003444(1),PHAGE_Marino_P12026_NC_018269(1),PHAGE_Entero_HK629_NC_019711(1),PHAGE_Entero_phi80_NC_021190(1),PHAGE_Entero_BP_4795_NC_004813(1),PHAGE_Burkho_BcepIL02_NC_012743(1),PHAGE_Vibrio_VP882_NC_009016(1),PHAGE_Entero_VT2phi_272_NC_028656(1),PHAGE_Phage_Gifsy_1_NC_010392(1),PHAGE_Bdello_phi1422_NC_019525(1),PHAGE_Vibrio_VpKK5_NC_026610(1),PHAGE_Pectob_ZF40_NC_019522(1),PHAGE_Ralsto_RS138_NC_029107(1),PHAGE_Entero_mEp237_NC_019704(1),PHAGE_Salico_CGphi29_NC_020844(1),PHAGE_Entero_HK225_NC_019717(1),PHAGE_Bacill_Slash_NC_022774(1),PHAGE_Rhodob_RcapNL_NC_020489(1),PHAGE_Pseudo_F116_NC_006552(1),PHAGE_Escher_80001_NC_027387(1),PHAGE_Salmon_FSLSP088_NC_021780(1),PHAGE_Pseudo_PPpW_3_NC_023006(1),PHAGE_Vibrio_vB_VpaM_MAR_NC_019722(1),PHAGE_Synech_S_CBS1_NC_016164(1),PHAGE_Burkho_KS14_NC_015273(1),PHAGE_Stenot_S1_NC_011589(1),PHAGE_Escher_TL_2011c_NC_019442(1),PHAGE_Entero_186_NC_001317(1),PHAGE_Entero_cdtI_NC_009514(1),PHAGE_Burkho_DC1_NC_018452(1),PHAGE_Bacter_Lily_NC_028841(1),PHAGE_Burkho_BcepMigl_NC_019917(1),PHAGE_Salmon_iEPS5_NC_021783(1),PHAGE_Erwini_vB_EamP_L1_NC_019510(1),PHAGE_Escher_P13374_NC_018846(1),PHAGE_Vibrio_SIO_2_NC_016567(1) 4 18.86% 53.18%
3 16.5Kb intact(110) tail,head,capsid,terminase 4663633-4680174 0 17 12 5 100% 0 no 10 PHAGE_Salmon_ST64B_NC_004313(3),PHAGE_Entero_phiP27_NC_003356(3),PHAGE_Burkho_phi6442_NC_009235(3),PHAGE_Burkho_phiE125_NC_003309(3),PHAGE_Burkho_phi1026b_NC_005284(3),PHAGE_Entero_SfV_NC_003444(2),PHAGE_Entero_HK140_NC_019710(2),PHAGE_Salmon_118970_sal3_NC_031940(2),PHAGE_Strept_phiSASD1_NC_014229(2),PHAGE_Salmon_118970_sal3_NC_031940(2),PHAGE_Idioma_1N2_2_NC_025439(1),PHAGE_Shigel_SfIV_NC_022749(1),PHAGE_Entero_mEp235_NC_019708(1),PHAGE_Mannhe_vB_MhS_1152AP2_NC_028956(1),PHAGE_Vibrio_12B8_NC_021073(1),PHAGE_Mycoba_Lockley_NC_011021(1),PHAGE_Entero_HK022_NC_002166(1),PHAGE_Entero_mEp390_NC_019721(1),PHAGE_Entero_BP_4795_NC_004813(1),PHAGE_Marino_P12026_NC_018269(1),PHAGE_Colwel_9A_NC_018088(1),PHAGE_Vibrio_VpKK5_NC_026610(1),PHAGE_Clostr_phiCD6356_NC_015262(1),PHAGE_Entero_HK542_NC_019769(1),PHAGE_Entero_IME_EFm5_NC_028826(1),PHAGE_Geobac_E2_NC_009552(1),PHAGE_Entero_IME_EFm1_NC_024356(1),PHAGE_Burkho_KS9_NC_013055(1),PHAGE_Pseudo_Pq0_NC_029100(1),PHAGE_Rhizob_vB_RleS_L338C_NC_023502(1),PHAGE_Entero_SfI_NC_027339(1),PHAGE_Geobac_GBK2_NC_023612(1),PHAGE_Shigel_SfII_NC_021857(1),PHAGE_Rhodoc_REQ1_NC_016655(1),PHAGE_Burkho_Bcep176_NC_007497(1),PHAGE_Entero_mEpX2_NC_019705(1),PHAGE_Mycoba_MOOREtheMARYer_NC_028791(1) 3 17.64% 58.49%
4 37.1Kb questionable(90) tail,virion,capsid,portal,terminase 5756724-5793879 0 30 22 4 86.6% 4 no 15 PHAGE_Pseudo_JBD93_NC_030918(5),PHAGE_Pseudo_M6_NC_007809(5),PHAGE_Pseudo_YuA_NC_010116(4),PHAGE_Pseudo_PAE1_NC_028980(4),PHAGE_Pseudo_JBD24_NC_020203(4),PHAGE_Pseudo_vB_PaeS_PAO1_Ab30_NC_026601(3),PHAGE_Vibrio_vB_VpaM_MAR_NC_019722(3),PHAGE_Synech_S_CBS1_NC_016164(3),PHAGE_Vibrio_VHML_NC_004456(3),PHAGE_Vibrio_VP58.5_NC_027981(3),PHAGE_Pseudo_MP1412_NC_018282(3),PHAGE_Pseudo_DMS3_NC_008717(2),PHAGE_Stenot_vB_SmaS_DLP_2_NC_029019(2),PHAGE_Synech_S_CBS3_NC_015465(2),PHAGE_Pseudo_PaMx11_NC_028770(2),PHAGE_Rhizob_RR1_A_NC_021560(2),PHAGE_Pseudo_MP38_NC_011611(2),PHAGE_Pseudo_vB_PaeS_PAO1_Ab18_NC_026594(2),PHAGE_Pseudo_PaMx28_NC_028931(2),PHAGE_Vibrio_SIO_2_NC_016567(2),PHAGE_Rueger_DSS3_P1_NC_025428(1),PHAGE_Klebsi_phiKO2_NC_005857(1),PHAGE_Cellul_phi18:3_NC_021794(1),PHAGE_Shewan_1/44_NC_025463(1),PHAGE_Achrom_phiAxp_2_NC_029106(1),PHAGE_Pseudo_vB_Pae_Kakheti25_NC_017864(1),PHAGE_Vibrio_12A10_NC_029067(1),PHAGE_Pseudo_vB_PaeS_PM105_NC_028667(1),PHAGE_Pseudo_vB_PaeS_SCH_Ab26_NC_024381(1),PHAGE_Cellul_phi46:3_NC_021792(1),PHAGE_Vibrio_12B3_NC_021067(1),PHAGE_Ralsto_RS138_NC_029107(1),PHAGE_Salmon_SSU5_NC_018843(1),PHAGE_Vibrio_12B12_NC_021070(1),PHAGE_Cellul_phi39:1_NC_021804(1),PHAGE_Pseudo_phiMK_NC_031110(1),PHAGE_Pseudo_73_NC_007806(1),PHAGE_Pseudo_PaMx74_NC_028809(1),PHAGE_Pseudo_MP22_NC_009818(1),PHAGE_Rhizob_vB_RleS_L338C_NC_023502(1),PHAGE_Pseudo_PaMx42_NC_028879(1),PHAGE_Burkho_phi6442_NC_009235(1),PHAGE_Stenot_S1_NC_011589(1),PHAGE_Pseudo_B3_NC_006548(1),PHAGE_Pseudo_D3112_NC_005178(1),PHAGE_Bacter_Lily_NC_028841(1),PHAGE_Burkho_phiE125_NC_003309(1),PHAGE_Vibrio_X29_NC_024369(1),PHAGE_Burkho_AH2_NC_018283(1) 3 16.66% 56.85%
5 35Kb incomplete(50) capsid,integrase 5794433-5829445 0 18 10 3 72.2% 5 yes 9 PHAGE_Entero_JenP1_NC_029028(2),PHAGE_Entero_CAjan_NC_028776(2),PHAGE_Entero_JenP2_NC_028997(2),PHAGE_Psychr_pOW20_A_NC_020841(1),PHAGE_Idioma_1N2_2_NC_025439(1),PHAGE_Burkho_BcepGomr_NC_009447(1),PHAGE_Strept_MM1_NC_003050(1),PHAGE_Strept_EJ_1_NC_005294(1),PHAGE_Mycoba_Milly_NC_026598(1),PHAGE_Entero_JenK1_NC_029021(1),PHAGE_Mycoba_Cheetobro_NC_028979(1),PHAGE_Strept_phiARI0746_NC_031907(1),PHAGE_Salico_CGphi29_NC_020844(1),PHAGE_Gordon_Wizard_NC_030913(1),PHAGE_Entero_phiFL3A_NC_013648(1),PHAGE_Mycoba_Phelemich_NC_022063(1),PHAGE_Deep_s_D6E_NC_019544(1),PHAGE_Verruc_P8625_NC_029047(1),PHAGE_Pseudo_PPpW_3_NC_023006(1),PHAGE_Bacill_TP21_L_NC_011645(1),PHAGE_Aurant_AmM_1_NC_027334(1),PHAGE_Bacill_BM5_NC_029069(1),PHAGE_Burkho_phiE12_2_NC_009236(1),PHAGE_Bacill_phi105_NC_004167(1),PHAGE_Bacill_BMBtp2_NC_019912(1),PHAGE_Escher_slur01_NC_028831(1),PHAGE_Mycoba_ZoeJ_NC_024147(1),PHAGE_Mycoba_Acadian_NC_023701(1),PHAGE_Thermo_THSA_485A_NC_018264(1),PHAGE_Entero_phiFL1A_NC_013646(1),PHAGE_Lactob_Lj771_NC_010179(1),PHAGE_Mycoba_Baee_NC_028742(1) 2 11.11% 49.25%
6 33.7Kb questionable(80) recombinase,capsid,terminase,tail,head 6867447-6901202 0 37 26 7 89.1% 4 yes 7 PHAGE_Pseudo_phi3_NC_030940(19),PHAGE_Aeromo_phiO18P_NC_009542(17),PHAGE_Haemop_HP1_NC_001697(10),PHAGE_Pasteu_F108_NC_008193(9),PHAGE_Vibrio_8_NC_022747(9),PHAGE_Vibrio_K139_NC_003313(9),PHAGE_Haemop_HP2_NC_003315(8),PHAGE_Ralsto_RSY1_NC_025115(3),PHAGE_Burkho_KS14_NC_015273(2),PHAGE_Burkho_KS5_NC_015265(2),PHAGE_Salmon_Fels_2_NC_010463(2),PHAGE_Ralsto_RSA1_NC_009382(1),PHAGE_Phormi_MIS_PhV1A_NC_029032(1),PHAGE_Entero_N15_NC_001901(1),PHAGE_Salmon_RE_2010_NC_019488(1),PHAGE_Vibrio_vB_VpaM_MAR_NC_019722(1),PHAGE_Halomo_phiHAP_1_NC_010342(1),PHAGE_Klebsi_phiKO2_NC_005857(1),PHAGE_Vibrio_VP882_NC_009016(1),PHAGE_Bdello_phi1422_NC_019525(1),PHAGE_Entero_186_NC_001317(1),PHAGE_Pseudo_phiCTX_NC_003278(1),PHAGE_Entero_fiAA91_ss_NC_022750(1),PHAGE_Haemop_SuMu_NC_019455(1),PHAGE_Burkho_KL3_NC_015266(1) 18 51.35% 55.42%
Source file for the second variable (this is a large file):
source 1..7215267
/organism="Hahella chejuensis KCTC 2396"
/mol_type="genomic DNA"
/strain="KCTC 2396"
/db_xref="taxon:349521"
gene 247..381
/locus_tag="HCH_00001"
CDS 247..381
/locus_tag="HCH_00001"
/codon_start=1
/transl_table=11
/product="hypothetical protein"
/protein_id="ABC26924.1"
/translation="MGFGHRVLFSLKNINIRFSLYIESRRLKFAQKKSKHVRILEVWK
"
gene 378..1781
/gene="dnaA"
/locus_tag="HCH_00002"
CDS 378..1781
/gene="dnaA"
/locus_tag="HCH_00002"
/note="TIGRFAMsMatches:TIGR00362"
/codon_start=1
/transl_table=11
/product="chromosomal replication initiator protein DnaA"
/protein_id="ABC26925.1"
/translation="MTSELWHQCLGYLEDELPAQQFNTWLRPLQAKGSEEELLLFAPN
RFVLDWVNEKYIGRINEILSELTSQKAPRISLKIGSITGNSKGQQASKDSAVGATRTT
APSRPVIADVAPSGERNVTVEGAIKHESYLNPTFTFETFVEGKSNQLARAAAMQVADN
PGSAYNPLFLYGGVGLGKTHLMQAVGNAIFKKNPNAKILYLHSERFVADMVKALQLNA
FNEFKRLYRSVDALLIDDIQFFARKERSQEEFFHTFNALLEGGQQMILTCDRYPKEID
HMEERLKSRFGWGLTVMVEPPELETRVAILMKKAEQANVHLSSESAFFIAQKIRSNVR
ELEGALKLVIANAHFTGQEITPAFIRECLKDLLALHEKQVSIDNIQRTVAEYYKIRIA
DILSKRRTRSITRPRQMAMALAKELTNHSLPEIGEAFGGRDHTTVLHACKVMIELQQS
DPTLRDDYQNFMRMLTS"
gene 1884..2987
/gene="dnaN"
/locus_tag="HCH_00003"
CDS 1884..2987
/gene="dnaN"
/locus_tag="HCH_00003"
/EC_number="2.7.7.7"
/note="TIGRFAMsMatches:TIGR00663"
/codon_start=1
/transl_table=11
/product="DNA polymerase III, beta subunit"
/protein_id="ABC26926.1"
/translation="MKLTITREALVTSLQMISGVVEKRQTMPVLANVLLDARDGKLVI
TGTNMEVELVAEISDVNIEHESRITVPAKKFTDICRALPEGAAIGIELKDGRLNVRYG
SSHFILSTLPAEHFPNVEEEPESVKVTLPQRELKRLIDATAFAMAQQDVRYYLNGMLM
ELDEQGLRTVATDGHRLALANVSLQTGVSEKRQPIVPRKGILELGRLLNDTDESCTLV
FGDNHVRASVGHFTFTSKLIDGKFPDYQRVIPRSGDKVMLADRVLLKGVLSRASILSH
ESIRGVRLQFEEGLLKVFANNPDQEEAEDSLEVEYPHEALQIGFNVGYLIDVLNALDD
EQVKVTLSNANSSALVEGVDTRDAVYVVMPMRL"
gene 3008..3103
/locus_tag="HCH_00004"
CDS 3008..3103
/locus_tag="HCH_00004"
/codon_start=1
/transl_table=11
/product="hypothetical protein"
/protein_id="ABC26927.1"
/translation="MNLFELERSRRVARSGMTLGKDVSPLNADRV"
gene 3128..4405
/gene="aarF"
/locus_tag="HCH_00005"
CDS 3128..4405
/gene="aarF"
/locus_tag="HCH_00005"
/note="Predicted unusual protein kinase; COG0661"
/codon_start=1
/transl_table=11
/product="ABC1 family protein kinase"
/protein_id="ABC26928.1"
/translation="MGKIVNAVKGAARIGQTAAVISKVGLGWLKGNRAPAPRLLRQTF
EELGATYIKLGQFIASSPTFFPADYVEEFQLCLDKTKPLPYSQIEKILKEEFKRPLQS
IYSHIDTKPLASASIAQVHAARLVTGEDVVIKVQKPGVRNVLLTDLNFLYVAARVVEY
LAPKLSWTSLSGIVEEIQRTMMEECDFYQEAANLKEFREFLVSSGNDQAVVPTVYEQA
STMRVLTMERFYGVPLTDLETIRKYCSDPEKTLITAMNTWFASLTQCDFFHADVHAGN
LMVLEDGRIGFIDFGIVGRIGAGTWQAVSDFITAIMMGNFHGMADAMSRIGITKSQLS
VDDLAADIADVYKKMDAMTPDMPPIYYDQQTGDDEVNNILMDLVRIGEQHGLHFPREF
ALLLKQFLYFDRYVHVLAPELDMFMDERLSLIQ"
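Answer 0 (score: 0)
You need to separate and name every operation you want to perform on the data, then find the equivalent UNIX command for each. Most UNIX tools work on rows rather than columns, so learning to think in rows pays off ;) Please don't treat bash like an ordinary programming language; it is glue. Put the tools together, and just don't assign potentially large data to variables.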
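You want to extract one column from each of two files (awk), then paste the two extracted columns into the output (paste), separated by \t (which paste uses by default), preceded by a header line. You can create two intermediate files, or use shell process substitution.
paste \
  <( <CP000155.phaster awk '$2~/[0-9]Kb/{print ($5)}' ) \
  <( <CP000155.gbk awk '$1~/CDS/{print ($2)}' ) |
(echo -e 'Phaster_positions\tGBKPositions'; cat) \
> gbk3.txt
Edit: Having seen the source data: this will probably produce the desired file format, but the data will not be correct. You need to make sure that the third line from the first awk really corresponds to the third line from the second awk. Your data probably needs to be combined by awk using a unique identifier.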
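The intermediate-file variant mentioned above could look roughly like the sketch below. The function name paste_columns and the temporary file names phaster.col and gbk.col are assumptions made for illustration; the awk filters are the ones from the question. The row counts go to stderr only so that a length mismatch between the two columns is easy to spot. The function does not check that the rows correspond to each other.

```shell
# Hypothetical helper: paste_columns PHASTER_FILE GBK_FILE OUT_FILE
# The body runs in a subshell, so its variables do not leak into the caller.
paste_columns() (
    phaster=$1
    gbk=$2
    out=$3
    tmp=$(mktemp -d) || exit 1      # scratch directory for the intermediate files

    # One intermediate file per extracted column (filters taken from the question)
    awk '$2~/[0-9]Kb/{print $5}' "$phaster" > "$tmp/phaster.col"
    awk '$1~/CDS/{print $2}' "$gbk" > "$tmp/gbk.col"

    # Row counts on stderr, so a mismatch in length is visible
    printf 'rows: %s phaster, %s gbk\n' \
        "$(awk 'END{print NR}' "$tmp/phaster.col")" \
        "$(awk 'END{print NR}' "$tmp/gbk.col")" >&2

    # Header first, then both columns side by side (paste separates with a tab)
    {
        printf 'Phaster_positions\tGBKPositions\n'
        paste "$tmp/phaster.col" "$tmp/gbk.col"
    } > "$out"

    rm -rf "$tmp"
)

# Usage: paste_columns CP000155.phaster CP000155.gbk gbk3.txt
```

When one column is shorter than the other, paste fills the missing cells with empty fields, so the output stays rectangular.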
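Answer 1 (score: 0)
I didn't go through the files very thoroughly. I basically took the pieces of your code, merged them into a single script, and added hashing and output: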
$ awk -v OFS="\t" ' # tab as output delimiter
NR==FNR { # process the first file
if($2~/[0-9]Kb/) # keep only rows whose $2 looks like 46.7Kb
a[++i]=$5 # hash $5 to a
next # first-file lines never reach the CDS rule
}
$1~/CDS/ { # process the second file (with a condition)
b[++j]=$2 # hash $2 to b
}
END {
print "Phaster_positions","GBKPositions"
if(i>=j) # was there more is or js
n=i # take the bigger value and use it...
else
n=j
for(i=1;i<=n;i++) # ... here
print a[i],b[i] # output side by side
}' first second
Output:
Phaster_positions GBKPositions
371860-418565 247..381
2947108-2988239 378..1781
4663633-4680174 1884..2987
5756724-5793879 3008..3103
5794433-5829445 3128..4405
6867447-6901202
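Does this make sense? It stores the matches from file 1 in hash a and the matches from file 2 in hash b. With a lot of data you may run out of memory. If that happens, report back and we'll come up with another solution.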
Update:
This one stores only file 1 in a, and empties a while it outputs file 2:
awk '
BEGIN {
OFS="\t" # the output field separator
print "Phaster_positions","GBKPositions" # output the header
}
NR==FNR { # process the first file
if($2~/[0-9]Kb/) # keep only rows whose $2 looks like 46.7Kb
a[++i]=$5 # hash $5 to a
next # first-file lines never reach the CDS rule
}
$1~/CDS/ { # process the second file (with a condition)
print ((++j in a)?a[j]:"") OFS $2 # output from a if exists and $2
delete a[j] # delete after output
}
END {
for(j=1;j<=i;j++) # stupid loop
if(j in a) # if there are any left in a
print a[j] OFS # output them
}' first second
Not battle-tested, and the for loop in END is stupid.
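Since the update is described as not battle-tested, here is one way it could be tried on small inputs. This is a sketch, not the answerer's exact script: the wrapper function merge_stream is hypothetical, and the END loop starts at the first index that was never paired (j+1) instead of scanning from 1. It also assumes the first file is not empty, because the NR==FNR test cannot tell the two files apart otherwise.

```shell
# Hypothetical wrapper: merge_stream FIRST_FILE SECOND_FILE
# Prints the merged two-column table on stdout.
merge_stream() {
    awk '
    BEGIN {
        OFS = "\t"                            # tab as output delimiter
        print "Phaster_positions", "GBKPositions"
    }
    NR == FNR {                               # first file: remember matching $5 values
        if ($2 ~ /[0-9]Kb/)
            a[++i] = $5
        next                                  # never fall through to the CDS rule
    }
    $1 ~ /CDS/ {                              # second file: pair each CDS with a stored value
        ++j
        left = (j in a) ? a[j] : ""           # empty cell once a is exhausted
        print left, $2
        delete a[j]                           # free the entry after output
    }
    END {                                     # entries j+1..i were never paired
        for (k = j + 1; k <= i; k++)
            print a[k], ""
    }' "$1" "$2"
}

# Usage: merge_stream CP000155.phaster CP000155.gbk > gbk3.txt
```

Only the first file is held in memory; rows from the second file are printed as they are read.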
Answer 2 (score: -1)
printf "Phaster_positions\tGBKPositions\n\n">gbk3.txt
PhasterPositions=`awk '$2~/[0-9]Kb/{print ($5)}' CP000155.phaster`
GBKPositions=`awk '$1~/CDS/{print ($2)}' CP000155.gbk`
printf "$PhasterPositions\t$GBKPositions">>gbk3.txt
See if this works.