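I have a CSV file with a column named Message that contains input like the sample below, and I want to split it up properly. Note that the snippet below does not look like this in Excel; I still need to format it there.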
["CorrelationId: b99fb632-78cf-4910-ab23-4f69833ed2d9
Request for API: /api/acmsxdsreader/readpolicyfrompolicyassignment Caller:C2F023C52E2148C9C1D040FBFAC113D463A368B1 RequestedSchemas: {urn:schema:Microsoft.Rtc.Management.Policy.Voice.2008}VoicePolicy, {urn:schema:Microsoft.Rtc.Management.Policy.Voice.2008}OnlineVoiceRoutingPolicy, TenantId: 7a205197-8e59-487d-b9fa-3fc1b108f1e5"]
I want to split it so that it looks like this (the column name is the text before the colon, and the value is the text after it).
CorrelationID: b99fb632-78cf-4910-ab23-4f69833ed2d9
Request for API:
/api/acmsxdsreader/readpolicyfrompolicyassignment
Caller:C2F023C52E2148C9C1D040FBFAC113D463A368B1
RequestedSchemas: {urn:schema:Microsoft.Rtc.Management.Policy.Voice.2008}VoicePolicy, {urn:schema:Microsoft.Rtc.Management.Policy.Voice.2008}OnlineVoiceRoutingPolicy,
TenantId: 7a205197-8e59-487d-b9fa-3fc1b108f1e5
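I tried Text to Columns, but the result does not come out correctly in Excel.
I'd like to know the best way to do this. I'm currently writing a C# program to try to parse it properly, but what I have doesn't work.
For reference, here is my C# code. I'm open to any approach, though.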
// Requires: using System; using Microsoft.VisualBasic.FileIO;
static void Main(string[] args)
{
    using (TextFieldParser parser = new TextFieldParser(@"C:\Users\t-maucal\Desktop\MachineLearningTestSets\CSVParse.csv"))
    {
        parser.TextFieldType = FieldType.Delimited;
        parser.SetDelimiters(" ");
        while (!parser.EndOfData)
        {
            //Process row
            string[] fields = parser.ReadFields();
            foreach (string field in fields)
            {
                Console.WriteLine(field);
            }
        }
    }
}
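Answer 0 (score: 1)
Using formulas, @cybernetic.nomad has most of the method. To strip the headers out of the data, you can try the following:
Put each column's category label (CorrelationId:, Request for API:, and so on) into cells B1:G1.
In B2, use the following formula: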
=RIGHT(LEFT($A2,FIND(C$1,$A2)-1),LEN(LEFT($A2,FIND(C$1,$A2)-1))-(LEN(B1)+2))
In C2, use the following formula:
=RIGHT(MID($A2,FIND(C$1,$A2),FIND(D$1,$A2,FIND(C$1,$A2))-FIND(C$1,$A2)),LEN(MID($A2,FIND(C$1,$A2),FIND(D$1,$A2,FIND(C$1,$A2))-FIND(C$1,$A2)))-(LEN(C1)+1))
In D2, use the following formula:
=RIGHT(MID($A2,FIND(D$1,$A2),FIND(E$1,$A2,FIND(D$1,$A2))-FIND(D$1,$A2)),LEN(MID($A2,FIND(D$1,$A2),FIND(E$1,$A2,FIND(D$1,$A2))-FIND(D$1,$A2)))-(LEN(D1)+2))
In E2, use the following formula:
=RIGHT(MID($A2,FIND(E$1,$A2),FIND(F$1,$A2,FIND(E$1,$A2))-FIND(E$1,$A2,FIND(D$1,$A2))-1),LEN(MID($A2,FIND(E$1,$A2),FIND(F$1,$A2,FIND(E$1,$A2))-FIND(E$1,$A2,FIND(D$1,$A2))))-(LEN(E1)+2))
In F2, use the following formula:
=RIGHT(MID($A2,FIND(F$1,$A2),FIND(G$1,$A2,FIND(F$1,$A2))-FIND(F$1,$A2)),LEN(MID($A2,FIND(F$1,$A2),FIND(G$1,$A2,FIND(F$1,$A2))-FIND(F$1,$A2)))-(LEN(F1)+2))
In G2, use the following formula:
=RIGHT($A2,LEN($A2)-FIND(G$1,$A2)-LEN(G1))
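The formulas above all follow one idea: find where each label starts, then take the text between that label and the next one. For readers who would rather script it, here is a minimal Python sketch of the same idea. The label list and the function name `split_by_labels` are assumptions made for illustration, taken from the sample message in the question rather than from the answer itself.

```python
# Field labels as they appear in the sample message (an assumption;
# adjust the list to match the real data).
LABELS = ["CorrelationId", "Request for API", "Caller", "RequestedSchemas", "TenantId"]


def split_by_labels(message, labels=LABELS):
    """Return {label: value} by slicing the text between consecutive labels."""
    # Locate each "label:" in the message; labels that are absent are skipped.
    positions = []
    for label in labels:
        idx = message.find(label + ":")
        if idx != -1:
            positions.append((idx, label))
    positions.sort()

    result = {}
    for n, (start, label) in enumerate(positions):
        value_start = start + len(label) + 1  # skip the label and its colon
        # The value runs up to the next label, or to the end of the message.
        value_end = positions[n + 1][0] if n + 1 < len(positions) else len(message)
        result[label] = message[value_start:value_end].strip()
    return result


sample = (
    "CorrelationId: b99fb632-78cf-4910-ab23-4f69833ed2d9\n"
    "Request for API: /api/acmsxdsreader/readpolicyfrompolicyassignment "
    "Caller:C2F023C52E2148C9C1D040FBFAC113D463A368B1 "
    "RequestedSchemas: {urn:schema:Microsoft.Rtc.Management.Policy.Voice.2008}VoicePolicy, "
    "{urn:schema:Microsoft.Rtc.Management.Policy.Voice.2008}OnlineVoiceRoutingPolicy, "
    "TenantId: 7a205197-8e59-487d-b9fa-3fc1b108f1e5"
)

fields = split_by_labels(sample)
for label, value in fields.items():
    print(f"{label}: {value}")
```

Like the formulas, this relies on each label appearing exactly once; a label that also occurs inside a value would need extra handling.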
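Answer 1 (score: 1)
You can use a macro written in VBA.
I created a class module, renamed to cData, with a property for each of your column headers.
Then I used a regular expression to pull the individual properties out of the data you provided, collected them in a Dictionary, and wrote the results to a separate worksheet in the specified order.
I assume that your named column headers are the information you are looking for and that, as in your text sample, there is only one instance of each category to deal with.
I also assume that your data starts in B1.
Read the comments in the macro carefully.
Be sure to set the references as indicated in the regular module code.
Class Module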
'Rename this Module **cData**
Option Explicit

Private pCorrelationID As String
Private pRequestForApi As String
Private pCaller As String
Private pRequestedSchemas As String
Private pTenantID As String

Public Property Get CorrelationID() As String
    CorrelationID = pCorrelationID
End Property
Public Property Let CorrelationID(Value As String)
    pCorrelationID = Value
End Property

Public Property Get RequestForApi() As String
    RequestForApi = pRequestForApi
End Property
Public Property Let RequestForApi(Value As String)
    pRequestForApi = Value
End Property

Public Property Get Caller() As String
    Caller = pCaller
End Property
Public Property Let Caller(Value As String)
    pCaller = Value
End Property

Public Property Get RequestedSchemas() As String
    RequestedSchemas = pRequestedSchemas
End Property
Public Property Let RequestedSchemas(Value As String)
    pRequestedSchemas = Value
End Property

Public Property Get TenantID() As String
    TenantID = pTenantID
End Property
Public Property Let TenantID(Value As String)
    pTenantID = Value
End Property
Regular Module
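'Set Reference to Microsoft Scripting Runtime
'Set Reference to Microsoft VBScript Regular Expressions 5.5
Option Explicit
'The regex ignores case, so make Select Case ignore it too
'(the sample data has "CorrelationId" and "TenantId", not "...ID")
Option Compare Text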
Sub ttcSpecial()
    Dim wsSrc As Worksheet, wsRes As Worksheet
    Dim vSrc As Variant, vRes As Variant
    Dim rRes As Range
    Dim dD As Dictionary
    Dim RE As RegExp, MC As MatchCollection, M As Match
    Dim cD As cData
    Dim myKey, I As Long, sTemp As String

    'Source and results worksheets; results start at A1
    Set wsSrc = Worksheets("sheet1")
    Set wsRes = Worksheets("sheet2")
    Set rRes = wsRes.Cells(1, 1)

    'Read column B into an array (a single cell is wrapped into a 1x1 array)
    With wsSrc
        vSrc = .Range(.Cells(1, 2), .Cells(.Rows.Count, 2).End(xlUp))
        If Not IsArray(vSrc) Then
            sTemp = vSrc
            ReDim vSrc(1 To 1, 1 To 1)
            vSrc(1, 1) = sTemp
        End If
    End With

    Set RE = New RegExp
    With RE
        .Global = True
        .IgnoreCase = True
        .MultiLine = False
        .Pattern = "((?:CorrelationID|Request For API|Caller|RequestedSchemas|TenantID)):([\s\S]+?)(?=(?:CorrelationID|Request For API|Caller|RequestedSchemas|TenantID|$))"
    End With

    'One cData object per source row, keyed by row index
    Set dD = New Dictionary
    dD.CompareMode = TextCompare
    For I = 1 To UBound(vSrc, 1)
        Set cD = New cData
        With cD
            If RE.Test(vSrc(I, 1)) = True Then
                myKey = I
                Set MC = RE.Execute(vSrc(I, 1))
                For Each M In MC
                    Select Case M.SubMatches(0)
                        Case "CorrelationID"
                            .CorrelationID = M.SubMatches(1)
                        Case "Request for API"
                            .RequestForApi = M.SubMatches(1)
                        Case "Caller"
                            .Caller = M.SubMatches(1)
                        Case "RequestedSchemas"
                            .RequestedSchemas = M.SubMatches(1)
                        Case "TenantID"
                            .TenantID = M.SubMatches(1)
                    End Select
                Next M
                dD.Add Key:=myKey, Item:=cD
            End If
        End With
    Next I

    'Build the results array: row 0 holds the headers
    ReDim vRes(0 To dD.Count, 1 To 5)

    'Headers
    vRes(0, 1) = "Correlation ID"
    vRes(0, 2) = "Request for API"
    vRes(0, 3) = "Caller"
    vRes(0, 4) = "Requested Schemas"
    vRes(0, 5) = "Tenant ID"

    I = 0
    For Each myKey In dD.Keys
        I = I + 1
        With dD(myKey)
            vRes(I, 1) = .CorrelationID
            vRes(I, 2) = .RequestForApi
            vRes(I, 3) = .Caller
            vRes(I, 4) = .RequestedSchemas
            vRes(I, 5) = .TenantID
        End With
    Next myKey

    'Write the results and format the header row
    Set rRes = rRes.Resize(UBound(vRes, 1) + 1, UBound(vRes, 2))
    With rRes
        .EntireColumn.Clear
        .Value = vRes
        With .Rows(1)
            .Font.Bold = True
            .HorizontalAlignment = xlCenter
        End With
        .EntireColumn.AutoFit
    End With
End Sub
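Results for the text sample in the original question
Regular expression, simplified explanation: group 1 matches one of the field labels; after the colon, group 2 lazily captures everything up to the next label or the end of the text.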
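To make the pattern easier to experiment with outside Excel, here is a minimal Python sketch that uses the same pattern shape with the standard `re` module. The names `LABELS`, `FIELD_RE`, and `parse_message` are hypothetical, and the sample string is the message from the question; this is an illustration of the technique, not a port of the macro.

```python
import re

# Labels taken from the sample message (an assumption; extend as needed).
LABELS = ["CorrelationId", "Request for API", "Caller", "RequestedSchemas", "TenantId"]
ALTERNATION = "|".join(re.escape(label) for label in LABELS)

# Group 1: a label. Group 2: everything after the colon, matched lazily,
# up to (but not including) the next label or the end of the string.
FIELD_RE = re.compile(
    rf"({ALTERNATION}):([\s\S]+?)(?=(?:{ALTERNATION}|$))",
    re.IGNORECASE,
)


def parse_message(message):
    """Return {canonical label: trimmed value} for every label found."""
    # Matching ignores case, so map whatever was matched back to one spelling.
    canonical = {label.lower(): label for label in LABELS}
    return {
        canonical[m.group(1).lower()]: m.group(2).strip()
        for m in FIELD_RE.finditer(message)
    }


sample = (
    "CorrelationId: b99fb632-78cf-4910-ab23-4f69833ed2d9\n"
    "Request for API: /api/acmsxdsreader/readpolicyfrompolicyassignment "
    "Caller:C2F023C52E2148C9C1D040FBFAC113D463A368B1 "
    "RequestedSchemas: {urn:schema:Microsoft.Rtc.Management.Policy.Voice.2008}VoicePolicy, "
    "{urn:schema:Microsoft.Rtc.Management.Policy.Voice.2008}OnlineVoiceRoutingPolicy, "
    "TenantId: 7a205197-8e59-487d-b9fa-3fc1b108f1e5"
)

parsed = parse_message(sample)
for label, value in parsed.items():
    print(f"{label}: {value}")
```

The values are trimmed here because the raw captures keep the whitespace around them, including the line break after the CorrelationId value.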
Answer 2 (score: 0)
That is a lot of error-prone work. Just use Josh Close's CsvHelper. It is an excellent package, fast and easy to use.
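The point of a dedicated CSV library is that it handles quoted fields containing commas and line breaks, which is exactly what the Message column contains. CsvHelper is a .NET library; the sketch below does not show its API. It uses Python's standard `csv` module as a stand-in to illustrate the same idea. The `Id` column and the in-memory CSV text are hypothetical, built from the sample message in the question.

```python
import csv
import io

# Hypothetical CSV content: an "Id" column plus the quoted Message column.
# The Message value contains both commas and an embedded line break.
csv_text = (
    "Id,Message\n"
    '1,"CorrelationId: b99fb632-78cf-4910-ab23-4f69833ed2d9\n'
    "Request for API: /api/acmsxdsreader/readpolicyfrompolicyassignment "
    "Caller:C2F023C52E2148C9C1D040FBFAC113D463A368B1 "
    "RequestedSchemas: {urn:schema:Microsoft.Rtc.Management.Policy.Voice.2008}VoicePolicy, "
    "{urn:schema:Microsoft.Rtc.Management.Policy.Voice.2008}OnlineVoiceRoutingPolicy, "
    'TenantId: 7a205197-8e59-487d-b9fa-3fc1b108f1e5"\n'
)

# DictReader keeps the quoted field intact, so the whole message comes back
# as one value even though it spans two physical lines.
rows = list(csv.DictReader(io.StringIO(csv_text)))

for row in rows:
    print(row["Id"])
    print(row["Message"])
```

When reading from a real file instead of a string, the `csv` documentation recommends opening it with `newline=""` so that embedded line breaks are passed through unchanged. Once the Message value is read as a single field, splitting it into its labelled parts is a separate step.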