我有一个包含公司创新数据的数据集,通过使用一些正则表达式,我想检索许可证数据
company licences/patents
1 UX226, licence-pp-zz, licence-zz-pp, licence-xx-tt
2 VV3346E, SS345
3 licence-dd-zz
4 UT223, licence, ss
5 XBTYU, licence-tt-kk, licence-ss-tt
6 xc, zz
7 licence-xb-xz
所需的输出:
company licences/patents licence
1 UX226, licence-pp-zz, licence-zz-pp, licence-xx-tt licence-pp-zz, licence-zz-pp, licence-xx-tt
2 VV3346E, SS345
3 licence-dd-zz licence-dd-zz
4 UT223, licence, ss
5 XBTYU, licence-tt-kk, licence-ss-tt licence-tt-kk, licence-ss-tt
6 xc, zz
7 licence-xb-xz licence-xb-xz
答案 0 :(得分:3)
您可以尝试:
df['licence'] = df['licences/patents'].str.extractall('(licence-\w{2}-\w{2})')\
.unstack().apply(lambda x: ', '.join(x.dropna()), axis=1)
输出:
company licences/patents licence
0 1 UX226, licence-pp-zz, licence-zz-pp, licence-x... licence-pp-zz, licence-zz-pp, licence-xx-tt
1 2 VV3346E, SS345 NaN
2 3 licence-dd-zz licence-dd-zz
3 4 UT223, licence, ss NaN
4 5 XBTYU, licence-tt-kk, licence-ss-tt licence-tt-kk, licence-ss-tt
5 6 xc, zz NaN
6 7 licence-xb-xz licence-xb-xz
答案 1 :(得分:2)
另一种方法,使用Series.str.findall
和Series.str.join
:
$this->db->where('amount', 'discount', false);
[出]
using ExcelDataReader;
using System;
using System.Collections.Generic;
using System.Data;
using System.IO;
using System.Linq;
namespace Driven
{
public class ExcelLib
{
public static DataTable ExcelToDataTable(string fileName)
{
//open file and returns as Stream
FileStream stream = File.Open(fileName, FileMode.Open, FileAccess.Read);
{
using (var reader = ExcelReaderFactory.CreateReader(stream))
{
var result = reader.AsDataSet(new ExcelDataSetConfiguration()
{
ConfigureDataTable = (data) => new ExcelDataTableConfiguration()
{
UseHeaderRow = true
}
});
//Get all the Tables
DataTableCollection table = result.Tables;
//Store it in DataTable
DataTable resultTable = table["Sheet1"];
//return
return resultTable;
}
}
}
static List<Datacollection> dataCol = new List<Datacollection>();
public static void PopulateInCollection(string fileName)
{
DataTable table = ExcelToDataTable(fileName);
//Iterate through the rows and columns of the Table
for (int row = 1; row <= table.Rows.Count - 1; row++)
{
for (int col = 0; col <= table.Columns.Count; col++)
{
Datacollection dtTable = new Datacollection()
{
rowNumber = row,
colName = table.Columns[col].ColumnName,
colValue = table.Rows[row - 1][col].ToString()
};
//Add all the details for each row
dataCol.Add(dtTable);
}
}
}
public static string ReadData(int rowNumber, string columnName)
{
try
{
//Retriving Data using LINQ to reduce much of iterations
string data = (from colData in dataCol
where colData.colName == columnName && colData.rowNumber == rowNumber
select colData.colValue).SingleOrDefault();
//var datas = dataCol.Where(x => x.colName == columnName && x.rowNumber == rowNumber).SingleOrDefault().colValue;
return data.ToString();
}
catch (Exception e)
{
Console.WriteLine(e.Message);
return null;
}
}
public class Datacollection
{
public int rowNumber { get; set; }
public string colName { get; set; }
public string colValue { get; set; }
}
}
}
答案 2 :(得分:1)
答案 3 :(得分:1)
function formikCustomValidation({ field, form }: FieldProps, ownerCmp: Component<any, any>) {
return {
...field,
onBlur: e => {
if (field.value !== form.initialValues[field.name]) {
ownerCmp.setState({
changedFields: {
...ownerCmp.state.changedFields,
[field.name]: true
}
});
}
field.onBlur(e);
setTimeout(form.validateForm);
},
onChange: e => {
field.onChange(e);
if (ownerCmp.state.changedFields[field.name]) {
setTimeout(form.validateForm);
}
}
};
}
它将创建新的列许可证,并从原始列中剥离许可证。
答案 4 :(得分:0)
尝试以下代码:
df['licences/patents'].str.findall('(licence-[\w\-]+)').apply(", ".join)