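Finding and printing plural words in Turkish using RegEx

Time: 2018-04-02 19:33:44

Tags: python regex python-3.x

I'm fairly new to Python. In my code, I read a text file as input and store each line of the file as an element of a list.

I'm trying to use RegEx to write code that finds and prints plural words. In Turkish, plural words are formed with the suffix '-ler' or '-lar'.

My code is as follows: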

import re

f = open('C:/Users/ENE/Desktop/CSE & Kodlar/nlp/utf8textfile.txt', encoding='utf-8-sig', errors='ignore')


with f as file:
    list = file.readlines()
list = [x.strip() for x in list]

print(list)

total = 0
for i in list:
    total += len(i)
ave_size = float(total) / float(len(list))
print("Average word length = " + str(ave_size))

#p = re.compile('.*l[ae]r.*')

for element in list:
    m = re.findall(".*l[ae]r.*", element)
    if m:
        print(m)

Output:

list = ['Aliler geldiler','Selam olsun sana','Merhabalar','Javakitabınerede']

for loop: ['Aliler geldiler'] ['Merhabalar']

I'm trying to print word by word, e.g. ['Aliler'], ['geldiler'] and ['Merhabalar']. How can I do that?

3 answers:

Answer 0: (score: 1)

You can achieve what you want as follows:

import re

example = "example words Aliler Merhabalar"

words = example.split()

for word in words:
    if (re.search(r"ler$", word)):
        print (word)
    elif (re.search(r"lar$", word)):
        print (word)

This will output:

Aliler
Merhabalar
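
The two branches above test the same thing with different vowels, so they can be folded into a single character class and applied to the list from the question. This is only a minimal sketch: `plural_words` and `PLURAL_SUFFIX` are hypothetical names that do not appear in the original answer.

```python
import re

# One pattern covers both suffixes: "l", then "a" or "e", then "r" at the end of the word
PLURAL_SUFFIX = re.compile(r"l[ae]r$")

def plural_words(lines):
    """Return every whitespace-separated word that ends in -ler or -lar."""
    found = []
    for line in lines:
        for word in line.split():
            if PLURAL_SUFFIX.search(word):
                found.append(word)
    return found

lines = ['Aliler geldiler', 'Selam olsun sana', 'Merhabalar', 'Javakitabınerede']
print(plural_words(lines))  # ['Aliler', 'geldiler', 'Merhabalar']
```

Because the words come from `split()`, a word with trailing punctuation (for example `Aliler,`) would not match `$`. Stripping punctuation first, or using a word boundary instead, would be needed for real text.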

Answer 1: (score: 1)

You can use the \w*l[ea]r\b regex to find all words ending in lar or ler:

results = re.findall(r'\w*l[ea]r\b', s)

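See the regex demo. In Python 3.x, the \b word boundary is Unicode-aware by default; in Python 2.x, I'd suggest adding the re.U flag.

Here, s can be a whole line, or even the whole document.

Details

  • \w* - 0+ letters, digits, or _ (in Python 3.x it matches any Unicode letter, digit, or _; you can use [^\W\d_]* to match letters only)
  • l - the letter l
  • [ea] - e or a
  • r - the letter r
  • \b - a word boundary (note that the r'..' raw string notation is used to avoid double-escaping \b, so that the regex engine parses it as a word boundary).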
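
As a rough illustration of the pattern above, the following sketch runs it over the sample list from the question, joined into one string to stand in for "the whole document". The names `PATTERN`, `from_lines`, and `turkish` are assumptions made for this example only.

```python
import re

PATTERN = re.compile(r'\w*l[ea]r\b')

lines = ['Aliler geldiler', 'Selam olsun sana', 'Merhabalar', 'Javakitabınerede']

# s may be the whole document: join the lines and search once
s = "\n".join(lines)
from_lines = PATTERN.findall(s)
print(from_lines)  # ['Aliler', 'geldiler', 'Merhabalar']

# In Python 3, \w also covers non-ASCII Turkish letters such as ç and ğ
turkish = PATTERN.findall("çiçekler ve ağaçlar")
print(turkish)  # ['çiçekler', 'ağaçlar']
```

The match is case-sensitive and purely suffix-based, so it is a heuristic: any word that merely happens to end in -ler or -lar is reported as well.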

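Answer 2: (score: 0)

.* matches everything (except line terminators).

This means that if the input contains lar or ler, .*l[ae]r.* will match the entire input; otherwise there will be no match.

You want to match words, not whole lines.

Since the word must end with l[ae]r, you need to make sure the r is the end of the word. This can be done using \b (word boundary).

Since the word must end with l[ae]r, the suffix must be preceded by one or more (+) word (\w) characters, i.e. \w+.

Now, in some regex flavors (and in Python 2 without re.U), \w only matches ASCII letters (A-Z, a-z), so you need to enable Unicode mode (the u flag) for it to match all letters (e.g. ñ, ı). In Python 3, str patterns are Unicode-aware by default. Also note that \w also matches digits (0-9) and underscore (_), but that is usually fine.

So, your regex should be: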
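
The behavior described above can be checked directly. This is a small sketch using two of the question's sample lines; `whole_line` and `no_match` are names chosen for this illustration.

```python
import re

# .* on both sides swallows the rest of the line, so the only match is the line itself
whole_line = re.findall(r'.*l[ae]r.*', 'Aliler geldiler')
print(whole_line)  # ['Aliler geldiler']

# A line containing neither "lar" nor "ler" yields no match at all
no_match = re.findall(r'.*l[ae]r.*', 'Selam olsun sana')
print(no_match)  # []
```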

    r"\w+l[ae]r\b"

See regex101.com for a demo.
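
To tie this back to the question, here is a minimal sketch that applies the pattern \w+l[ae]r\b to each line and prints one word per list, as the asker wanted. The names `PLURAL`, `collected`, and `ascii_only` are assumptions for this example, and `re.ASCII` is used only to show what happens when Unicode matching is switched off.

```python
import re

PLURAL = re.compile(r"\w+l[ae]r\b")

lines = ['Aliler geldiler', 'Selam olsun sana', 'Merhabalar', 'Javakitabınerede']

collected = []
for element in lines:
    for word in PLURAL.findall(element):
        collected.append(word)
        print([word])  # ['Aliler'], then ['geldiler'], then ['Merhabalar']

# \w+ needs at least one character before the suffix, so a bare "lar" is not a word match
print(PLURAL.findall("lar"))  # []

# With re.ASCII, \w stops at non-ASCII letters and the word is cut short
ascii_only = re.findall(r"\w+l[ae]r\b", "çiçekler", re.ASCII)
print(ascii_only)  # ['ekler']
```

In Python 3 the default (Unicode) behavior is what you want for Turkish text, so no extra flag is needed unless the pattern is a bytes object.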