I tried the code below:
#!usr/local/bin/perl
open(f1, "/home/httpd/cgi-bin/LDU/list1.txt");
while ( $line = <f1> ) {
$line =~ m/(?:=")\w+/g;
print "$line";
}
I need the output to be displayed as follows:
acinetobacter_baumannii_26016_2
acinetobacter_baumannii_44839_10
acinetobacter_baumannii_45002_9
acinetobacter_baumannii_45075_6
__DATA__
<A HREF="acinetobacter_baumannii_26016_2/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="acinetobacter_baumannii_26016_2/">acinetobacter_baumannii_26016_2</A>. Mar 16 18:12
<A HREF="acinetobacter_baumannii_44839_10/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="acinetobacter_baumannii_44839_10/">acinetobacter_baumannii_44839_1></A> Mar 16 18:12
<A HREF="acinetobacter_baumannii_45002_9/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="acinetobacter_baumannii_45002_9/">acinetobacter_baumannii_45002_9</A>. Mar 16 18:11
<A HREF="acinetobacter_baumannii_45075_6/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="acinetobacter_baumannii_45075_6/">acinetobacter_baumannii_45075_6</A>. Mar 16 18:13
<A HREF="acinetobacter_baumannii_796380_1375/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="acinetobacter_baumannii_796380_1375/">acinetobacter_baumannii_796380_></A> Mar 16 18:13
<A HREF="amycolatopsis_mediterranei_gca_000700945_1/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="amycolatopsis_mediterranei_gca_000700945_1/">amycolatopsis_mediterranei_gca_></A> Mar 16 18:11
<A HREF="bacillus_subtilis_e1/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="bacillus_subtilis_e1/">bacillus_subtilis_e1</A> . . . . . . Mar 16 18:13
<A HREF="bdellovibrio_bacteriovorus/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="bdellovibrio_bacteriovorus/">bdellovibrio_bacteriovorus</A> . . . Mar 16 18:11
<A HREF="bifidobacterium_adolescentis/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="bifidobacterium_adolescentis/">bifidobacterium_adolescentis</A> . . Mar 16 18:12
<A HREF="bifidobacterium_breve_31l/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="bifidobacterium_breve_31l/">bifidobacterium_breve_31l</A>. . . . Mar 16 18:13
<A HREF="bordetella_bronchiseptica_00_p_2796/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="bordetella_bronchiseptica_00_p_2796/">bordetella_bronchiseptica_00_p_></A> Mar 16 18:12
<A HREF="bordetella_bronchiseptica_980_2/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="bordetella_bronchiseptica_980_2/">bordetella_bronchiseptica_980_2</A>. Mar 16 18:12
<A HREF="bordetella_bronchiseptica_d993/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="bordetella_bronchiseptica_d993/">bordetella_bronchiseptica_d993</A> . Mar 16 18:13
<A HREF="bordetella_bronchiseptica_mbord665/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="bordetella_bronchiseptica_mbord665/">bordetella_bronchiseptica_mbord></A> Mar 16 18:11
<A HREF="bordetella_bronchiseptica_mbord782/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="bordetella_bronchiseptica_mbord782/">bordetella_bronchiseptica_mbord></A> Mar 16 18:13
<A HREF="borrelia_garinii_sz/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="borrelia_garinii_sz/">borrelia_garinii_sz</A>. . . . . . . Mar 16 18:12
<A HREF="brucella_pinnipedialis/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="brucella_pinnipedialis/">brucella_pinnipedialis</A> . . . . . Mar 16 18:13
<A HREF="burkholderia_sp_mp_1/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="burkholderia_sp_mp_1/">burkholderia_sp_mp_1</A> . . . . . . Mar 16 18:11
<A HREF="campylobacter_jejuni_10227/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="campylobacter_jejuni_10227/">campylobacter_jejuni_10227</A> . . . Mar 16 18:13
<A HREF="campylobacter_jejuni_subsp_jejuni_81_176_drh212/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="campylobacter_jejuni_subsp_jejuni_81_176_drh212/">campylobacter_jejuni_subsp_jeju></A> Mar 16 18:13
<A HREF="candidatus_caedibacter_acanthamoebae/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="candidatus_caedibacter_acanthamoebae/">candidatus_caedibacter_acantham></A> Mar 16 18:11
<A HREF="clostridium_botulinum_d_str_16868/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="clostridium_botulinum_d_str_16868/">clostridium_botulinum_d_str_168></A> Mar 16 18:11
<A HREF="criblamydia_sequanensis_crib_18/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="criblamydia_sequanensis_crib_18/">criblamydia_sequanensis_crib_18</A>. Mar 16 18:11
<A HREF="enterococcus_faecalis_atcc_29212_gca_000742975_1/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="enterococcus_faecalis_atcc_29212_gca_000742975_1/">enterococcus_faecalis_atcc_2921></A> Mar 16 18:12
<A HREF="enterococcus_faecalis_ga2/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="enterococcus_faecalis_ga2/">enterococcus_faecalis_ga2</A>. . . . Mar 16 18:11
<A HREF="enterococcus_faecalis_gan13/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="enterococcus_faecalis_gan13/">enterococcus_faecalis_gan13</A>. . . Mar 16 18:12
<A HREF="enterococcus_faecium_t110/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="enterococcus_faecium_t110/">enterococcus_faecium_t110</A>. . . . Mar 16 18:12
<A HREF="enterococcus_faecium_uc7251/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="enterococcus_faecium_uc7251/">enterococcus_faecium_uc7251</A>. . . Mar 16 18:13
<A HREF="enterococcus_faecium_uc8668/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="enterococcus_faecium_uc8668/">enterococcus_faecium_uc8668</A>. . . Mar 16 18:12
<A HREF="enterococcus_faecium_vre1044/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="enterococcus_faecium_vre1044/">enterococcus_faecium_vre1044</A> . . Mar 16 18:12
<A HREF="erythrobacter_litoralis/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="erythrobacter_litoralis/">erythrobacter_litoralis</A>. . . . . Mar 16 18:11
<A HREF="escherichia_coli_1_110_08_s1_c1/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="escherichia_coli_1_110_08_s1_c1/">escherichia_coli_1_110_08_s1_c1</A>. Mar 16 18:11
<A HREF="escherichia_coli_2_052_05_s3_c1/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="escherichia_coli_2_052_05_s3_c1/">escherichia_coli_2_052_05_s3_c1</A>. Mar 16 18:12
<A HREF="escherichia_coli_2_177_06_s3_c2/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="escherichia_coli_2_177_06_s3_c2/">escherichia_coli_2_177_06_s3_c2</A>. Mar 16 18:12
<A HREF="escherichia_coli_2_177_06_s4_c3/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="escherichia_coli_2_177_06_s4_c3/">escherichia_coli_2_177_06_s4_c3</A>. Mar 16 18:12
<A HREF="escherichia_coli_2_222_05_s1_c2/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="escherichia_coli_2_222_05_s1_c2/">escherichia_coli_2_222_05_s1_c2</A>. Mar 16 18:13
<A HREF="escherichia_coli_3_020_07_s4_c3/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="escherichia_coli_3_020_07_s4_c3/">escherichia_coli_3_020_07_s4_c3</A>. Mar 16 18:11
<A HREF="escherichia_coli_3_073_06_s3_c2/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="escherichia_coli_3_073_06_s3_c2/">escherichia_coli_3_073_06_s3_c2</A>. Mar 16 18:11
<A HREF="escherichia_coli_3_105_05_s3_c3/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="escherichia_coli_3_105_05_s3_c3/">escherichia_coli_3_105_05_s3_c3</A>. Mar 16 18:13
<A HREF="escherichia_coli_6_537_08_s3_c2/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="escherichia_coli_6_537_08_s3_c2/">escherichia_coli_6_537_08_s3_c2</A>. Mar 16 18:12
<A HREF="escherichia_coli_6_537_08_s3_c3/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="escherichia_coli_6_537_08_s3_c3/">escherichia_coli_6_537_08_s3_c3</A>. Mar 16 18:13
<A HREF="escherichia_coli_8_415_05_s4_c1/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="escherichia_coli_8_415_05_s4_c1/">escherichia_coli_8_415_05_s4_c1</A>. Mar 16 18:13
<A HREF="escherichia_coli_bidmc_72/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="escherichia_coli_bidmc_72/">escherichia_coli_bidmc_72</A>. . . . Mar 16 18:12
<A HREF="escherichia_coli_isc56/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="escherichia_coli_isc56/">escherichia_coli_isc56</A> . . . . . Mar 16 18:13
<A HREF="escherichia_coli_o111_h8_str_f6627/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="escherichia_coli_o111_h8_str_f6627/">escherichia_coli_o111_h8_str_f6></A> Mar 16 18:12
<A HREF="escherichia_coli_o121_h19_str_2011c_3108/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="escherichia_coli_o121_h19_str_2011c_3108/">escherichia_coli_o121_h19_str_2></A> Mar 16 18:11
<A HREF="escherichia_coli_o157_h7_str_08_3527/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="escherichia_coli_o157_h7_str_08_3527/">escherichia_coli_o157_h7_str_08></A> Mar 16 18:13
<A HREF="escherichia_coli_o157_h7_str_08_4529/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="escherichia_coli_o157_h7_str_08_4529/">escherichia_coli_o157_h7_str_08></A> Mar 16 18:12
<A HREF="escherichia_coli_o157_h7_str_k4527/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="escherichia_coli_o157_h7_str_k4527/">escherichia_coli_o157_h7_str_k4></A> Mar 16 18:12
<A HREF="escherichia_coli_o6_h16_str_f5656c1/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="escherichia_coli_o6_h16_str_f5656c1/">escherichia_coli_o6_h16_str_f56></A> Mar 16 18:11
<A HREF="escherichia_coli_str_st540_gca_000599685_1/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="escherichia_coli_str_st540_gca_000599685_1/">escherichia_coli_str_st540_gca_></A> Mar 16 18:11
<A HREF="escherichia_coli_str_st540_gca_000599705_1/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="escherichia_coli_str_st540_gca_000599705_1/">escherichia_coli_str_st540_gca_></A> Mar 16 18:13
<A HREF="escherichia_coli_uci_53/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="escherichia_coli_uci_53/">escherichia_coli_uci_53</A>. . . . . Mar 16 18:13
<A HREF="flavobacterium_reichenbachii/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="flavobacterium_reichenbachii/">flavobacterium_reichenbachii</A> . . Mar 16 18:12
<A HREF="gammaproteobacteria_bacterium_mfb021/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="gammaproteobacteria_bacterium_mfb021/">gammaproteobacteria_bacterium_m></A> Mar 16 18:12
<A HREF="georgenia_sp_subg003/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="georgenia_sp_subg003/">georgenia_sp_subg003</A> . . . . . . Mar 16 18:13
<A HREF="gilliamella_apicola_scgc_ab_598_i20/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="gilliamella_apicola_scgc_ab_598_i20/">gilliamella_apicola_scgc_ab_598></A> Mar 16 18:12
<A HREF="haemophilus_parasuis_gca_000742795_1/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="haemophilus_parasuis_gca_000742795_1/">haemophilus_parasuis_gca_000742></A> Mar 16 18:12
<A HREF="haemophilus_parasuis_hps9/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="haemophilus_parasuis_hps9/">haemophilus_parasuis_hps9</A>. . . . Mar 16 18:11
<A HREF="halobacillus_karajensis/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="halobacillus_karajensis/">halobacillus_karajensis</A>. . . . . Mar 16 18:12
<A HREF="halostagnicola_sp_a56/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="halostagnicola_sp_a56/">halostagnicola_sp_a56</A>. . . . . . Mar 16 18:11
<A HREF="hyphomonas_jannaschiana_vp2/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="hyphomonas_jannaschiana_vp2/">hyphomonas_jannaschiana_vp2</A>. . . Mar 16 18:12
<A HREF="hyphomonas_sp_25b14_1/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="hyphomonas_sp_25b14_1/">hyphomonas_sp_25b14_1</A>. . . . . . Mar 16 18:11
<A HREF="klebsiella_pneumoniae_chs_43/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="klebsiella_pneumoniae_chs_43/">klebsiella_pneumoniae_chs_43</A> . . Mar 16 18:13
<A HREF="klebsiella_pneumoniae_chs_49/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="klebsiella_pneumoniae_chs_49/">klebsiella_pneumoniae_chs_49</A> . . Mar 16 18:13
<A HREF="lactobacillus_oryzae_jcm_18671/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="lactobacillus_oryzae_jcm_18671/">lactobacillus_oryzae_jcm_18671</A> . Mar 16 18:11
<A HREF="listeria_monocytogenes_fsl_f6_684_gca_000525815_1/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="listeria_monocytogenes_fsl_f6_684_gca_000525815_1/">listeria_monocytogenes_fsl_f6_6></A> Mar 16 18:12
<A HREF="listeria_monocytogenes_gca_000726305_1/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="listeria_monocytogenes_gca_000726305_1/">listeria_monocytogenes_gca_0007></A> Mar 16 18:11
<A HREF="listeria_monocytogenes_gca_000726325_1/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="listeria_monocytogenes_gca_000726325_1/">listeria_monocytogenes_gca_0007></A> Mar 16 18:12
<A HREF="listeria_monocytogenes_gca_000726695_1/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="listeria_monocytogenes_gca_000726695_1/">listeria_monocytogenes_gca_0007></A> Mar 16 18:11
<A HREF="listeria_monocytogenes_gca_000727065_1/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="listeria_monocytogenes_gca_000727065_1/">listeria_monocytogenes_gca_0007></A> Mar 16 18:11
<A HREF="listeria_monocytogenes_gca_000727735_1/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="listeria_monocytogenes_gca_000727735_1/">listeria_monocytogenes_gca_0007></A> Mar 16 18:12
<A HREF="listeria_monocytogenes_gca_000728125_1/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="listeria_monocytogenes_gca_000728125_1/">listeria_monocytogenes_gca_0007></A> Mar 16 18:12
<A HREF="listeria_monocytogenes_gca_000728365_1/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="listeria_monocytogenes_gca_000728365_1/">listeria_monocytogenes_gca_0007></A> Mar 16 18:12
<A HREF="listeria_monocytogenes_gca_000728805_1/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="listeria_monocytogenes_gca_000728805_1/">listeria_monocytogenes_gca_0007></A> Mar 16 18:13
<A HREF="listeria_monocytogenes_gca_000728845_1/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="listeria_monocytogenes_gca_000728845_1/">listeria_monocytogenes_gca_0007></A> Mar 16 18:13
<A HREF="listeria_monocytogenes_lm_1880/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="listeria_monocytogenes_lm_1880/">listeria_monocytogenes_lm_1880</A> . Mar 16 18:12
<A HREF="listeria_monocytogenes_wslc1042/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="listeria_monocytogenes_wslc1042/">listeria_monocytogenes_wslc1042</A>. Mar 16 18:13
<A HREF="listeria_riparia_fsl_s10_1204/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="listeria_riparia_fsl_s10_1204/">listeria_riparia_fsl_s10_1204</A>. . Mar 16 18:11
<A HREF="morganella_sp_egd_hp17/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="morganella_sp_egd_hp17/">morganella_sp_egd_hp17</A> . . . . . Mar 16 18:11
<A HREF="mycobacterium_africanum_gca_000666065_1/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_africanum_gca_000666065_1/">mycobacterium_africanum_gca_000></A> Mar 16 18:11
<A HREF="mycobacterium_africanum_mal010074/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_africanum_mal010074/">mycobacterium_africanum_mal0100></A> Mar 16 18:13
<A HREF="mycobacterium_africanum_mal010081/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_africanum_mal010081/">mycobacterium_africanum_mal0100></A> Mar 16 18:12
<A HREF="mycobacterium_tuberculosis_btb03_108/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_btb03_108/">mycobacterium_tuberculosis_btb0></A> Mar 16 18:13
<A HREF="mycobacterium_tuberculosis_btb04_416/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_btb04_416/">mycobacterium_tuberculosis_btb0></A> Mar 16 18:12
<A HREF="mycobacterium_tuberculosis_btb05_285/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_btb05_285/">mycobacterium_tuberculosis_btb0></A> Mar 16 18:11
<A HREF="mycobacterium_tuberculosis_btb07_323/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_btb07_323/">mycobacterium_tuberculosis_btb0></A> Mar 16 18:12
<A HREF="mycobacterium_tuberculosis_btb08_022/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_btb08_022/">mycobacterium_tuberculosis_btb0></A> Mar 16 18:11
<A HREF="mycobacterium_tuberculosis_btb08_309/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_btb08_309/">mycobacterium_tuberculosis_btb0></A> Mar 16 18:12
<A HREF="mycobacterium_tuberculosis_btb10_357/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_btb10_357/">mycobacterium_tuberculosis_btb1></A> Mar 16 18:11
<A HREF="mycobacterium_tuberculosis_btb11_027/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_btb11_027/">mycobacterium_tuberculosis_btb1></A> Mar 16 18:11
<A HREF="mycobacterium_tuberculosis_btb11_207/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_btb11_207/">mycobacterium_tuberculosis_btb1></A> Mar 16 18:11
<A HREF="mycobacterium_tuberculosis_btb12_001/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_btb12_001/">mycobacterium_tuberculosis_btb1></A> Mar 16 18:13
<A HREF="mycobacterium_tuberculosis_btb12_046/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_btb12_046/">mycobacterium_tuberculosis_btb1></A> Mar 16 18:13
<A HREF="mycobacterium_tuberculosis_gca_000736075_1/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_gca_000736075_1/">mycobacterium_tuberculosis_gca_></A> Mar 16 18:13
<A HREF="mycobacterium_tuberculosis_h2438/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_h2438/">mycobacterium_tuberculosis_h243></A> Mar 16 18:11
<A HREF="mycobacterium_tuberculosis_h2581/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_h2581/">mycobacterium_tuberculosis_h258></A> Mar 16 18:13
<A HREF="mycobacterium_tuberculosis_h3005/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_h3005/">mycobacterium_tuberculosis_h300></A> Mar 16 18:11
<A HREF="mycobacterium_tuberculosis_kt_0043/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_kt_0043/">mycobacterium_tuberculosis_kt_0></A> Mar 16 18:13
<A HREF="mycobacterium_tuberculosis_kt_0084/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_kt_0084/">mycobacterium_tuberculosis_kt_0></A> Mar 16 18:13
<A HREF="mycobacterium_tuberculosis_kzn_1435_gca_000669675_1/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_kzn_1435_gca_000669675_1/">mycobacterium_tuberculosis_kzn_></A> Mar 16 18:12
<A HREF="mycobacterium_tuberculosis_m1236/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_m1236/">mycobacterium_tuberculosis_m123></A> Mar 16 18:13
<A HREF="mycobacterium_tuberculosis_m1274/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_m1274/">mycobacterium_tuberculosis_m127></A> Mar 16 18:11
<A HREF="mycobacterium_tuberculosis_m1461/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_m1461/">mycobacterium_tuberculosis_m146></A> Mar 16 18:13
<A HREF="mycobacterium_tuberculosis_m1475/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_m1475/">mycobacterium_tuberculosis_m147></A> Mar 16 18:13
<A HREF="mycobacterium_tuberculosis_m1848/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_m1848/">mycobacterium_tuberculosis_m184></A> Mar 16 18:13
<A HREF="mycobacterium_tuberculosis_m1893/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_m1893/">mycobacterium_tuberculosis_m189></A> Mar 16 18:12
<A HREF="mycobacterium_tuberculosis_m2086/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_m2086/">mycobacterium_tuberculosis_m208></A> Mar 16 18:11
<A HREF="mycobacterium_tuberculosis_m2116/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_m2116/">mycobacterium_tuberculosis_m211></A> Mar 16 18:11
<A HREF="mycobacterium_tuberculosis_m2193/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_m2193/">mycobacterium_tuberculosis_m219></A> Mar 16 18:13
<A HREF="mycobacterium_tuberculosis_m2211/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_m2211/">mycobacterium_tuberculosis_m221></A> Mar 16 18:13
<A HREF="mycobacterium_tuberculosis_m2435/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_m2435/">mycobacterium_tuberculosis_m243></A> Mar 16 18:13
<A HREF="mycobacterium_tuberculosis_mal010078/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_mal010078/">mycobacterium_tuberculosis_mal0></A> Mar 16 18:12
<A HREF="mycobacterium_tuberculosis_mal020120/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_mal020120/">mycobacterium_tuberculosis_mal0></A> Mar 16 18:13
<A HREF="mycobacterium_tuberculosis_mal020150/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_mal020150/">mycobacterium_tuberculosis_mal0></A> Mar 16 18:12
<A HREF="mycobacterium_tuberculosis_md14844/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_md14844/">mycobacterium_tuberculosis_md14></A> Mar 16 18:12
<A HREF="mycobacterium_tuberculosis_md14847/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_md14847/">mycobacterium_tuberculosis_md14></A> Mar 16 18:13
<A HREF="mycobacterium_tuberculosis_md17647/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_md17647/">mycobacterium_tuberculosis_md17></A> Mar 16 18:12
<A HREF="mycobacterium_tuberculosis_md17902/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_md17902/">mycobacterium_tuberculosis_md17></A> Mar 16 18:12
<A HREF="mycobacterium_tuberculosis_md17973/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_md17973/">mycobacterium_tuberculosis_md17></A> Mar 16 18:12
<A HREF="mycobacterium_tuberculosis_nritld54/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_nritld54/">mycobacterium_tuberculosis_nrit></A> Mar 16 18:12
<A HREF="mycobacterium_tuberculosis_ofxr_11/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_ofxr_11/">mycobacterium_tuberculosis_ofxr></A> Mar 16 18:13
<A HREF="mycobacterium_tuberculosis_ofxr_15/"><IMG border="0" SRC="/squid-internal-static/icons/anthony-dir.gif" ALT="[DIR] "></A> <A HREF="mycobacterium_tuberculosis_ofxr_15/">mycobacterium_tuberculosis_ofxr></A> Mar 16 18:13
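Answer 0 (score: 3)
As long as the href attribute is the first attribute in each <a> tag, this program does what you ask. It also checks whether each name has been seen before, and prints it only when it is new.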
use strict;
use warnings;
use 5.010;
my %seen;
while ( <DATA> ) {
while ( m{<a\s+href="([^"]*)/"}ig ) {
say $1 unless $seen{$1}++;
}
}
Output
acinetobacter_baumannii_26016_2
acinetobacter_baumannii_44839_10
acinetobacter_baumannii_45002_9
acinetobacter_baumannii_45075_6
acinetobacter_baumannii_796380_1375
amycolatopsis_mediterranei_gca_000700945_1
bacillus_subtilis_e1
bdellovibrio_bacteriovorus
bifidobacterium_adolescentis
bifidobacterium_breve_31l
bordetella_bronchiseptica_00_p_2796
bordetella_bronchiseptica_980_2
bordetella_bronchiseptica_d993
bordetella_bronchiseptica_mbord665
bordetella_bronchiseptica_mbord782
borrelia_garinii_sz
brucella_pinnipedialis
burkholderia_sp_mp_1
campylobacter_jejuni_10227
campylobacter_jejuni_subsp_jejuni_81_176_drh212
candidatus_caedibacter_acanthamoebae
clostridium_botulinum_d_str_16868
criblamydia_sequanensis_crib_18
enterococcus_faecalis_atcc_29212_gca_000742975_1
enterococcus_faecalis_ga2
enterococcus_faecalis_gan13
enterococcus_faecium_t110
enterococcus_faecium_uc7251
enterococcus_faecium_uc8668
enterococcus_faecium_vre1044
erythrobacter_litoralis
escherichia_coli_1_110_08_s1_c1
escherichia_coli_2_052_05_s3_c1
escherichia_coli_2_177_06_s3_c2
escherichia_coli_2_177_06_s4_c3
escherichia_coli_2_222_05_s1_c2
escherichia_coli_3_020_07_s4_c3
escherichia_coli_3_073_06_s3_c2
escherichia_coli_3_105_05_s3_c3
escherichia_coli_6_537_08_s3_c2
escherichia_coli_6_537_08_s3_c3
escherichia_coli_8_415_05_s4_c1
escherichia_coli_bidmc_72
escherichia_coli_isc56
escherichia_coli_o111_h8_str_f6627
escherichia_coli_o121_h19_str_2011c_3108
escherichia_coli_o157_h7_str_08_3527
escherichia_coli_o157_h7_str_08_4529
escherichia_coli_o157_h7_str_k4527
escherichia_coli_o6_h16_str_f5656c1
escherichia_coli_str_st540_gca_000599685_1
escherichia_coli_str_st540_gca_000599705_1
escherichia_coli_uci_53
flavobacterium_reichenbachii
gammaproteobacteria_bacterium_mfb021
georgenia_sp_subg003
gilliamella_apicola_scgc_ab_598_i20
haemophilus_parasuis_gca_000742795_1
haemophilus_parasuis_hps9
halobacillus_karajensis
halostagnicola_sp_a56
hyphomonas_jannaschiana_vp2
hyphomonas_sp_25b14_1
klebsiella_pneumoniae_chs_43
klebsiella_pneumoniae_chs_49
lactobacillus_oryzae_jcm_18671
listeria_monocytogenes_fsl_f6_684_gca_000525815_1
listeria_monocytogenes_gca_000726305_1
listeria_monocytogenes_gca_000726325_1
listeria_monocytogenes_gca_000726695_1
listeria_monocytogenes_gca_000727065_1
listeria_monocytogenes_gca_000727735_1
listeria_monocytogenes_gca_000728125_1
listeria_monocytogenes_gca_000728365_1
listeria_monocytogenes_gca_000728805_1
listeria_monocytogenes_gca_000728845_1
listeria_monocytogenes_lm_1880
listeria_monocytogenes_wslc1042
listeria_riparia_fsl_s10_1204
morganella_sp_egd_hp17
mycobacterium_africanum_gca_000666065_1
mycobacterium_africanum_mal010074
mycobacterium_africanum_mal010081
mycobacterium_tuberculosis_btb03_108
mycobacterium_tuberculosis_btb04_416
mycobacterium_tuberculosis_btb05_285
mycobacterium_tuberculosis_btb07_323
mycobacterium_tuberculosis_btb08_022
mycobacterium_tuberculosis_btb08_309
mycobacterium_tuberculosis_btb10_357
mycobacterium_tuberculosis_btb11_027
mycobacterium_tuberculosis_btb11_207
mycobacterium_tuberculosis_btb12_001
mycobacterium_tuberculosis_btb12_046
mycobacterium_tuberculosis_gca_000736075_1
mycobacterium_tuberculosis_h2438
mycobacterium_tuberculosis_h2581
mycobacterium_tuberculosis_h3005
mycobacterium_tuberculosis_kt_0043
mycobacterium_tuberculosis_kt_0084
mycobacterium_tuberculosis_kzn_1435_gca_000669675_1
mycobacterium_tuberculosis_m1236
mycobacterium_tuberculosis_m1274
mycobacterium_tuberculosis_m1461
mycobacterium_tuberculosis_m1475
mycobacterium_tuberculosis_m1848
mycobacterium_tuberculosis_m1893
mycobacterium_tuberculosis_m2086
mycobacterium_tuberculosis_m2116
mycobacterium_tuberculosis_m2193
mycobacterium_tuberculosis_m2211
mycobacterium_tuberculosis_m2435
mycobacterium_tuberculosis_mal010078
mycobacterium_tuberculosis_mal020120
mycobacterium_tuberculosis_mal020150
mycobacterium_tuberculosis_md14844
mycobacterium_tuberculosis_md14847
mycobacterium_tuberculosis_md17647
mycobacterium_tuberculosis_md17902
mycobacterium_tuberculosis_md17973
mycobacterium_tuberculosis_nritld54
mycobacterium_tuberculosis_ofxr_11
mycobacterium_tuberculosis_ofxr_15
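The program above reads from the DATA section. A minimal sketch of the same idea wrapped in a sub that takes any open filehandle might look like the following. The sub name unique_hrefs and the shortened in-memory sample are hypothetical, used only for illustration; a real run would open the file from the question instead.

```perl
use strict;
use warnings;

# Hypothetical helper: returns each distinct HREF directory name, in
# first-seen order, from an already opened filehandle. Assumes HREF is the
# first attribute of every <A> tag, as in the sample data.
sub unique_hrefs {
    my ($fh) = @_;
    my ( %seen, @names );
    while ( my $line = <$fh> ) {
        while ( $line =~ m{<a\s+href="([^"]*)/"}ig ) {
            push @names, $1 unless $seen{$1}++;
        }
    }
    return @names;
}

# Illustration with an in-memory filehandle and made-up, shortened rows.
my $sample = <<'HTML';
<A HREF="bacillus_subtilis_e1/"><IMG border="0" ALT="[DIR] "></A> <A HREF="bacillus_subtilis_e1/">bacillus_subtilis_e1</A> Mar 16 18:13
<A HREF="borrelia_garinii_sz/"><IMG border="0" ALT="[DIR] "></A> <A HREF="borrelia_garinii_sz/">borrelia_garinii_sz</A> Mar 16 18:12
HTML
open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";
print "$_\n" for unique_hrefs($fh);
close $fh;
```

Each row carries the same HREF twice (once around the icon, once around the link text), which is why the %seen check matters.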
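Answer 1 (score: 2)
I would suggest using a module in the first place, because HTML cannot be parsed properly with regular expressions. A regex may work, but it tends to produce brittle code.
So, something like this (credit: http://www.perlmonks.org/?node_id=557357):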
use strict;
use warnings;
use WWW::Mechanize;
my $mech = WWW::Mechanize->new();
$mech->get( 'file://C:/path/to/your_html/file.html' );
my @links = $mech->links();
foreach my $link (@links) {
my $url = $link -> url;
$url =~ s,/$,,g;
print $url,"\n";
}
For your simple data set, this should do the trick:
local $/;
my @links = <DATA> =~ m,<A HREF=\"(.*?)/?\">,g;
print join ( "\n", @links );
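A sketch of what the slurp-mode version does, with local $/ confined to a do block so later reads are unaffected, plus an optional de-duplication step (each name appears in two HREF attributes per row). The sub name slurp_links, the %seen hash, and the shortened sample rows are assumptions made for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: slurp the whole handle, then collect every HREF value
# with one list-context match. A trailing slash, if present, is left out.
sub slurp_links {
    my ($fh) = @_;
    my $html = do { local $/; <$fh> };    # $/ is undef only inside this block
    return $html =~ m{<A HREF="(.*?)/?">}g;
}

# Made-up, shortened rows in the same shape as the listing.
my $sample = qq{<A HREF="a_1/"><IMG SRC="x.gif"></A> <A HREF="a_1/">a_1</A>\n}
           . qq{<A HREF="b_2/"><IMG SRC="x.gif"></A> <A HREF="b_2/">b_2</A>\n};
open my $fh, '<', \$sample or die "Cannot open in-memory sample: $!";

# Keep only the first occurrence of each name.
my %seen;
my @unique = grep { !$seen{$_}++ } slurp_links($fh);
print join( "\n", @unique ), "\n";
```

Unlike the loop-based approach, this holds the entire file in memory, which is fine for a directory listing of this size.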
Answer 2 (score: 0)
Running the code below against your data, I get the following result:
acinetobacter_baumannii_26016_2
acinetobacter_baumannii_44839_1
acinetobacter_baumannii_45002_9
...
The code is:
open(f1,"/home/httpd/cgi-bin/LDU/list1.txt");
while($line=<f1>){
$line=~/([0-9A-Za-z_]*)(\s*)[\.>].*/;
print $1 . "\n";
}
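Answer 3 (score: -1)
#!/usr/bin/perl
use strict;
use warnings;
# Three-argument open with a lexical filehandle; report the reason on failure.
open(my $fh, '<', "/home/httpd/cgi-bin/LDU/list1.txt") or die("Cannot open list1.txt: $!");
while (my $line = <$fh>)
{
    # The first ="..." attribute on each line is the HREF; capture its leading word characters.
    my ($match) = ($line =~ m/(?:=")(\w+)/);
    print "$match\n" if defined $match;
}
close($fh);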
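For context, the code in the question prints whole lines because a plain match only tests $line and never changes it; the matched text has to be captured to be printed on its own. A small sketch of the difference, using a shortened, made-up input line and a hypothetical first_href helper:

```perl
use strict;
use warnings;

# Hypothetical helper: return the leading word characters of the first
# ="..." value on the line, or undef when there is no such attribute.
sub first_href {
    my ($line) = @_;
    my ($name) = $line =~ m/="(\w+)/;    # list context returns the capture
    return $name;
}

my $line = qq{<A HREF="bacillus_subtilis_e1/">bacillus_subtilis_e1</A>\n};

# Matching alone leaves the string untouched, so printing it shows the full row.
my $copy = $line;
$copy =~ m/(?:=")\w+/;
print "unchanged\n" if $copy eq $line;

# Capturing extracts just the name.
my $name = first_href($line);
print "$name\n" if defined $name;
```

Because \w does not match the slash, the capture stops right before the trailing "/" of the directory name.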