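How to use a Python regular expression to partially match words

Time: 2018-08-30 18:33:49

Tags: python regex word

I want to pick out all the "xlsx" files that have "feedback report" in their names. I want this filter to be very robust, so any partial match (e.g. "feedback_report", "feedback report", "Feedback Report") should return true.

Example file names:

  1. ZSS Project_JKIAL-SA_FEEDBACK_REPORT_Jan 29th 2015.xlsx
  2. ZL-SA_feedback report_012844.xlsx
  3. ASARanem-SA_Feedback Report_012844.xlsx

Below is my unsuccessful attempt.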

regex = re.compile(r"[a-zA-Z0-0]*[fF][eE][eE][dD][bB][aA][cC][kK]\s[rR][eE][pP][oO][rR][tT][a-zA-Z0-0]*.xlsx")

5 answers:

Answer 0: (score: 1)

First, lowercase the file name to reduce the number of possible variants.
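A minimal sketch of the lowercasing idea, assuming the pattern described in this answer ("feedback", up to 3 characters, "report", then an xls or xlsx extension). The function name `is_feedback_report` is made up for illustration.

```python
import re

# The name is lowercased before matching, so the pattern itself can stay lowercase.
LOWERCASE_PATTERN = re.compile(r"feedback.{0,3}report.*\.xlsx?$")


def is_feedback_report(filename):
    """Return True if the lowercased file name looks like a feedback report."""
    return LOWERCASE_PATTERN.search(filename.lower()) is not None
```

Lowercasing first means the pattern needs no flags, at the cost of one extra string copy per file name.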


Look for "feedback", followed by at most 3 arbitrary characters, then "report", then any further characters, ending with a dot and an xls or xlsx extension.

Or just:

regex = re.compile(r'feedback.{0,3}report.*\.xlsx?', flags=re.IGNORECASE)

You can also use Python's glob module to search for files the way a Linux shell does.
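One possible way to combine glob with a regex is sketched below. It assumes the files sit directly in a single directory; the function name `find_feedback_reports` and the constant `NAME_PATTERN` are hypothetical. The glob only narrows the candidates by extension, because glob matching is case-sensitive on Linux, and the regex does the case-insensitive name check.

```python
import glob
import os
import re

# Case-insensitive check for "feedback" + up to 3 characters + "report".
NAME_PATTERN = re.compile(r"feedback.{0,3}report", flags=re.IGNORECASE)


def find_feedback_reports(directory):
    """Return the sorted base names of matching .xlsx files in `directory`."""
    # glob.escape protects against special characters in the directory name.
    candidates = glob.glob(os.path.join(glob.escape(directory), "*.xlsx"))
    names = (os.path.basename(path) for path in candidates)
    return sorted(name for name in names if NAME_PATTERN.search(name))
```

Calling `find_feedback_reports(".")` would list the matching files in the current directory.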

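Answer 1: (score: 1)

Your regex is almost fine, but the leading and trailing parts will not match correctly because your examples contain underscores. I'm not sure how representative these samples are of your real data, but to match them you will need:

regex = re.compile(r"[a-zA-Z0-9\_\-\s]*(feedback)[\s\_\-](report)[a-zA-Z0-9\_\-\s]*\.xlsx",
    flags=re.IGNORECASE)

Another thing to watch is that you are really matching only the file name and not the file path, because in that case you would have to worry about \ and / characters. Also note that I have only added the exact characters I found you were missing. You may want to try something broader, but, again, I'm not sure what your data actually looks like. Hope this helps.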
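As a sketch of the file-name-versus-path point, the snippet below strips the directory part before matching. It assumes paths may use either \ or / as separators, which `pathlib.PureWindowsPath` handles; the helper names `file_name_only` and `path_is_feedback_report` are invented for illustration.

```python
import re
from pathlib import PureWindowsPath

# "feedback", one separator (whitespace, underscore or hyphen), then "report".
FEEDBACK_RE = re.compile(r"feedback[\s_-]report", flags=re.IGNORECASE)


def file_name_only(path):
    """Return the last path component, splitting on both "\\" and "/"."""
    return PureWindowsPath(path).name


def path_is_feedback_report(path):
    """Check only the file name, so directory names cannot cause a match."""
    name = file_name_only(path)
    return name.lower().endswith(".xlsx") and FEEDBACK_RE.search(name) is not None
```

With this, a directory called "feedback report" no longer makes every file inside it match.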

Answer 2: (score: 1)

This will work:

re.search(r"(feedback)(.*?|\s)(report)",string,re.IGNORECASE)

I tested it with the following code and input list:

import re
a=["ZSS Project_JKIAL-SA_FEEDBACK_REPORT_Jan 29th 2015.xlsx",
"ZL-SA_feedback report_012844.xlsx",
"ASARanem-SA_Feedback Report_012844.xlsx",
"some report",
"feedback-report"]

for i in a:
    print(re.search(r"(feedback)(.*?|\s)(report)",i,re.IGNORECASE))

The output, which is what the OP expects:

<_sre.SRE_Match object; span=(21, 36), match='FEEDBACK_REPORT'>
<_sre.SRE_Match object; span=(6, 21), match='feedback report'>
<_sre.SRE_Match object; span=(12, 27), match='Feedback Report'>
None
<_sre.SRE_Match object; span=(0, 15), match='feedback-report'>

Answer 3: (score: 0)

Could you just use string methods, like this?

'feedbackreport' in name.replace('_', '').replace(' ', '').lower()

and

name.endswith('.xlsx')

giving you something like:

fileList = [
    'ZSS Project_JKIAL-SA_FEEDBACK_REPORT_Jan 29th 2015.xlsx',
    'ZL-SA_feedback report_012844.xlsx',
    'ASARanem-SA_Feedback Report_012844.xlsx'
]

fileNames = [name for name in fileList
             if ('feedbackreport' in name.replace('_', '').replace(' ', '').lower()
                 and name.endswith('.xlsx'))]

If there are more characters that might cause problems, such as -, you could also write a quick function that removes the unwanted characters, and modify the appropriate part of the if statement to call it.
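A rough sketch of what such a clean-up function might look like, using only string methods. The name `strip_separators` and the last two sample file names are made up for illustration; this version keeps letters and digits only, which is an assumption about which characters count as unwanted.

```python
def strip_separators(name):
    """Lowercase the name and drop everything that is not a letter or digit."""
    return "".join(ch for ch in name.lower() if ch.isalnum())


file_list = [
    "ZSS Project_JKIAL-SA_FEEDBACK_REPORT_Jan 29th 2015.xlsx",
    "ZL-SA_feedback report_012844.xlsx",
    "ASARanem-SA_Feedback Report_012844.xlsx",
    "ZL-SA_feedback-report_012844.xlsx",  # hypothetical hyphenated variant
    "some report.xlsx",                   # hypothetical non-match
]

file_names = [
    name for name in file_list
    if "feedbackreport" in strip_separators(name)
    and name.lower().endswith(".xlsx")
]
```

Because every separator is removed before the substring test, the hyphenated variant is kept while "some report.xlsx" is dropped.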

Answer 4: (score: 0)

Based on all your suggestions, I ended up using the pattern below. It works for me in 99% of cases.

regex = re.compile(r"[a-zA-Z0-9\_\-\s]*(feedback)(\s|\_)(report)s?[a-zA-Z0-9\_\-\s]*.xlsx",flags = re.IGNORECASE)
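A quick way to check a pattern of this kind against the sample names from the question is sketched below. Two things here are assumptions rather than part of the answer: the dot before xlsx is escaped so that it only matches a literal dot, and the last two file names are invented to show what falls outside the pattern.

```python
import re

regex = re.compile(
    r"[a-zA-Z0-9_\-\s]*(feedback)(\s|_)(report)s?[a-zA-Z0-9_\-\s]*\.xlsx",
    flags=re.IGNORECASE,
)

samples = [
    "ZSS Project_JKIAL-SA_FEEDBACK_REPORT_Jan 29th 2015.xlsx",
    "ZL-SA_feedback report_012844.xlsx",
    "ASARanem-SA_Feedback Report_012844.xlsx",
    "ZL-SA_feedback-report_012844.xlsx",  # hypothetical: hyphen between the words
    "some report.xlsx",                   # hypothetical: no "feedback" at all
]

# fullmatch requires the whole file name to fit the pattern.
matches = [name for name in samples if regex.fullmatch(name)]
```

Only the three names from the question end up in `matches`; the hyphenated variant is rejected because the pattern allows just whitespace or an underscore between the two words.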