Extracting cases like 7.100.023c and 13,134.057d@015_4019

Time: 2014-10-08 17:23:12

Tags: regex text-extraction

I need to strip the reference numbers from Mahabharata shlokas. Simple cases:

7.100.023c devānāṃ yasya yā yonir vānarā ṛṣka rākṣasāḥ
7.100.024a tām eva viviśuḥ sarve devān nikṣipya cāmbhasi
7.100.024c tathā svargagataṃ sarvaṃ kṛtvā lokagurur divam
7.100.025a jagāma tridaśaiḥ sārdhaṃ hṛṣṭair hṛṣṭo mahāmatiḥ

Harder cases:

13,120.009  vyāsa uvāca
13,120.009a bho bho viprarṣabha śrīman mā vyathiṣṭhāḥ kathaṃ cana
13,119.008b*0599_01 śakaṭavrajaś ca sumahān āgataś ca yadṛcchayā
13,119.008b*0599_02 cakrākrameṇa bhinnaś ca kīṭaḥ prāṇān mumoca ha
13,119.008b*0599_03 saṃbhūtaḥ kṣatriyakule prasādād amitaujasaḥ
13,134.057d@015_4018    hṛdi kāmamayaś citro mohasaṃcayasaṃbhavaḥ
13,134.057d@015_4019    ajñānarūḍhamūlas tu vidhitsāpariṣecanaḥ

18,005.053x@003_0054 dharmamaṅgalalābhaṃ ca śrutvaitat prāpnuyān naraḥ 18,005.053x@003_0055 bhāti sarveṣu vedeṣu ratiḥ sarvatra jantuṣu

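Making 7.100.023c disappear is not hard. But sometimes there is only a number, as in 13,120.009, sometimes an asterisk, as in 13,119.008b*0599_01, and sometimes even an @, as in 13,134.057d@015_4018, so my attempt at http://regex101.com/r/iM2wF9/1 failed. I am not sure how to combine all of these. Thanks.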

1 answer:

Answer 0: (score: 1)

\d+[.,]\d{3}\.\d{3}[a-z]?[*@]?\d*_?\d*|\(\S+\)

Try this. See the demo.

http://regex101.com/r/iM2wF9/9
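
For use outside regex101, here is a minimal Python sketch of how the pattern above might be applied. The pattern is copied unchanged from the answer. The names REF_PATTERN and strip_refs are hypothetical, and the whitespace cleanup after substitution is an added assumption, not part of the original answer.

```python
import re

# Pattern copied unchanged from the answer:
#   \d+[.,]\d{3}\.\d{3}   book, then "." or ",", then chapter.verse (3 digits each)
#   [a-z]?                optional pada letter (a, b, c, d, x, ...)
#   [*@]?\d*_?\d*         optional "*" or "@" suffix such as *0599_01 or @015_4018
#   |\(\S+\)              the answer's second alternative: a parenthesised token
REF_PATTERN = re.compile(r"\d+[.,]\d{3}\.\d{3}[a-z]?[*@]?\d*_?\d*|\(\S+\)")


def strip_refs(text):
    """Remove verse references, then tidy the whitespace they leave behind.

    The whitespace cleanup is an assumption added for this sketch; the
    answer itself only supplies the pattern.
    """
    cleaned = REF_PATTERN.sub("", text)
    # Collapse runs of spaces or tabs and trim each line separately,
    # so that the line structure of the input is preserved.
    return "\n".join(" ".join(line.split()) for line in cleaned.splitlines())


if __name__ == "__main__":
    sample = (
        "7.100.023c devānāṃ yasya yā yonir vānarā ṛṣka rākṣasāḥ\n"
        "13,120.009  vyāsa uvāca\n"
        "13,119.008b*0599_01 śakaṭavrajaś ca sumahān āgataś ca yadṛcchayā\n"
        "13,134.057d@015_4018    hṛdi kāmamayaś citro mohasaṃcayasaṃbhavaḥ"
    )
    print(strip_refs(sample))
```

Because the pattern is not anchored to the start of a line, it also removes a reference that appears in the middle of a line, as in the 18,005.053x@003_0054 example from the question. If references should only ever be removed at the start of a line, anchoring with ^ and the re.MULTILINE flag would be one option.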