I am trying to extract a code from each string in a list of email subject lines. The list looks like this:
text_list = ['Industry / Gemany / PN M564839', 'Industry / France / PN: 575-439', 'Telecom / Gemany / P/N 26-59-29', 'Mobile / France / P/N: 88864839']
So far, what I have tried is:
import re

def get_p_number(text):
    rx = re.compile(r'[p/n:]\s+((?:\w+(?:\s+|$)){1})',
                    re.I)
    res = []
    m = rx.findall(text)
    if len(m) > 0:
        m = [p_number.replace(' ', '').upper() for p_number in m]
        # remove_duplicates is a helper that is not shown in the question
        m = remove_duplicates(m)
        res.append(m)
    else:
        res.append('no P Number found')
    return res
My problem is that I cannot extract the code that follows the preceding word (PN or P/N), especially when the code starts with a letter (e.g. 'M') or contains hyphens (e.g. 26-59-29).
The output I want is a list holding just the code from each string, e.g. M564839 for the first string and 26-59-29 for the third.
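Answer 0 (score: 1)
In your pattern, the character class [p/n:]\s+ matches one of the listed characters followed by 1+ whitespace characters. In the example data, that matches a forward slash or a colon followed by a space.
The next part, (?:\w+(?:\s+|$)), matches 1+ word characters followed by either 1+ whitespace characters or the end of the string. It does not allow for whitespace or hyphens inside the code itself.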
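To make this concrete, here is a small check of what the original pattern captures for each sample string. It reuses the pattern and text_list from the question unchanged; the names rx and captured are only illustrative.

```python
import re

text_list = ['Industry / Gemany / PN M564839',
             'Industry / France / PN: 575-439',
             'Telecom / Gemany / P/N 26-59-29',
             'Mobile / France / P/N: 88864839']

# The pattern from the question, unchanged.
rx = re.compile(r'[p/n:]\s+((?:\w+(?:\s+|$)){1})', re.I)

# Collect the captured groups per string so they can be inspected.
captured = [rx.findall(text) for text in text_list]

for text, found in zip(text_list, captured):
    print(repr(text), '->', found)
```

Only the last string yields its code. In the others, the "/ " in front of the country name or in front of PN is what gets matched, so the country or the word PN is captured instead.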
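One option is to match PN with an optional : and an optional /, instead of using the character class [p/n:], and to capture your value in a capturing group:
/ P/?N:? ([\w-]+)
For example: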
import re

text_list = ['Industry / Gemany / PN M564839', 'Industry / France / PN: 575-439', 'Telecom / Gemany / P/N 26-59-29', 'Mobile / France / P/N: 88864839']
regex = r"/ P/?N:? ([\w-]+)"
res = []
for text in text_list:
    matches = re.search(regex, text)
    if matches:
        res.append(matches.group(1))
print(res)
Result
['M564839', '575-439', '26-59-29', '88864839']
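If you want to keep the shape of your own get_p_number function, the same pattern could be dropped into it roughly as follows. This is only a sketch: returning a plain string instead of a list, adding re.I, and upper-casing the result are assumptions about what you need, and PN_RX is a made-up name.

```python
import re

# Same pattern as above; re.I is an added assumption so that
# lower-case "pn" / "p/n" would also be accepted.
PN_RX = re.compile(r"/ P/?N:? ([\w-]+)", re.I)

def get_p_number(text):
    """Return the code after PN or P/N, or a fallback message if none is found."""
    match = PN_RX.search(text)
    if match:
        return match.group(1).upper()
    return 'no P Number found'

text_list = ['Industry / Gemany / PN M564839',
             'Industry / France / PN: 575-439',
             'Telecom / Gemany / P/N 26-59-29',
             'Mobile / France / P/N: 88864839']

codes = [get_p_number(text) for text in text_list]
print(codes)
```

Because search returns at most one match per string, there is no need for a duplicate-removal step here.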
Answer 1 (score: 1)
The simple pattern M?[-\d]+ should work for you. Here is a demo:
import re

text_list = ['Industry / Gemany / PN M564839', 'Industry / France / PN: 575-439', 'Telecom / Gemany / P/N 26-59-29', 'Mobile / France / P/N: 88864839']
res = []
for elem in text_list:
    for code in re.findall(r'M?[-\d]+', elem):
        res.append(code)
print(res)
Output:
['M564839', '575-439', '26-59-29', '88864839']
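One thing to keep in mind is that this pattern is not tied to PN or P/N, so it picks up any run of digits or hyphens anywhere in the string. The comparison below is a rough sketch using a made-up subject line (sample is hypothetical, not from the question) next to the anchored pattern from answer 0.

```python
import re

loose = re.compile(r'M?[-\d]+')               # pattern from this answer
anchored = re.compile(r'/ P/?N:? ([\w-]+)')   # pattern from answer 0

# Hypothetical subject line with a hyphen and digit before the code.
sample = 'Web-2 / France / PN: 575-439'

loose_hits = loose.findall(sample)
anchored_hits = anchored.findall(sample)

print(loose_hits)     # ['-2', '575-439']
print(anchored_hits)  # ['575-439']
```

For the four strings in the question both patterns give the same result, so which one fits depends on how uniform your real subject lines are.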