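We are working on a program to pull slide image data from a set of servers that do not share a consistent schema (I fear the JSON may not even be valid, but I am not well versed enough to make that call). As independent, unaffiliated researchers, we have no influence over the servers.
The data is (mostly) entered by hand through a series of forms (n > 50) with inconsistent fields (the data goes back to the 1990s). Here is an example response: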
{
    "form12873": [
        {
            "id": "9202075838",
            "timestamp": "2015-06-25 10:24:51",
            "user_agent": "Mozilla\/5.0 (Macintosh; Intel Mac OS X 10_10_3) AppleWebKit\/600.6.3 (KHTML, like Gecko) Version\/8.0.6 Safari\/600.6.3",
            "remote_addr": "[Re.dact.ed]",
            "processed": "1",
            "data": {
                "33885124": {
                    "field": "33885124",
                    "value": "CDat Lab",
                    "flat_value": "CDat Lab",
                    "label": "Completed by:",
                    "type": "select"
                },
                **Several more fields as above...**
                "33884660": {
                    "field": "33884660",
                    "value": {
                        "slideX": "2456123",
                        "slideY": "456632",
                        "label": "K-20150322148",
                        "approved": "1",
                        "score": "30144"
                    },
                    "flat_value": "slideX = 2456123\nslideY = 456632\nlabel = K-20150322148\napproved = 1\nscore = 30144",
                    "label": "Slide Stats:",
                    "type": "slidestats"
                },
                **Some of the fields are as above...**
                "31970564": {
                    "field": "31970564",
                    "value": [
                        "System",
                        "Crated",
                        "Mirax",
                        "NanoZoomer",
                        "ThinPrep",
                        "Aperio",
                        "Intellisite"
                    ],
                    "flat_value": "System\nCrated\nMirax\nNanoZoomer\nThinPrep\nAperio\nIntellisite",
                    "label": "System Information",
                    "type": "checkbox"
                },
                **Some of the values are arrays...**
                "33883781": {
                    "field": "33883781",
                    "selection": "Retain",
                    "label": "4. Retain\/Remove\/Review",
                    "type": "selectdrop"
                },
                **Some of the fields don't have the same children**
                "52792890": {
                    "field": "52792890",
                    "image": "'A really large byte[], removed for ease of reading'",
                    "type": "image"
                }
                **Somewhere near the end of each response is the actual image...**
            }
        },
        {
            "id": "33884681",
            **Then it continues on as above until the end:**
        }
    ],
    "total": 170,
    "pages": 5,
    "pretty_id": "478125624983"
}
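In the past, whenever I was able to build a model/class for the structure of the JSON, I knew how to handle it (create a data class with the fields, values, etc. defined).
Attempting the following solution:
var result = JsonConvert.DeserializeObject<List<Dictionary<string,
    Dictionary<string, string>>>>(content);
always results in array errors or conversion problems (even with a direct cast added). I was able to get the actual first array using: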
Public Shared Function Tabulate(json As String) As DataTable
    Dim jsonLinq = Newtonsoft.Json.Linq.JObject.Parse(json)
    ' Find the first array using Linq
    Dim srcArray = jsonLinq.Descendants().Where(Function(d) TypeOf d Is JArray).First()
    Dim trgArray = New Newtonsoft.Json.Linq.JArray()
    For Each row As JObject In srcArray.Children(Of JObject)()
        Dim cleanRow = New JObject()
        For Each column As JProperty In row.Properties()
            ' Only include JValue types
            If TypeOf column.Value Is JValue Then
                cleanRow.Add(column.Name, column.Value)
            End If
        Next
        trgArray.Add(cleanRow)
    Next
    Return JsonConvert.DeserializeObject(Of DataTable)(trgArray.ToString())
End Function
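My end goal is also to get a DataTable, and the looping and the image bytes make me nervous about stepping further down into more children. I then tried to deserialize starting from that first array, but got no further.
If there is a quick way to handle this, I would love the solution. If the problem is that I am trying to process garbage JSON, I would love a reference to where it breaks the current standard (so I can at least try to get the other institutions to change their servers). That said, I will probably have to deal with it regardless, even if it means looping.
*Note: the project was started in VB.NET, so we are keeping it that way, but I may decide to port to C#. Code in either is great.
Below is an unannotated sample of the JSON that should work for testing. My end goal is to flatten it into a DataTable: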
{
    "form12873": [
        {
            "id": "9202075838",
            "timestamp": "2015-06-25 10:24:51",
            "user_agent": "Mozilla\/5.0 (Macintosh; Intel Mac OS X 10_10_3) AppleWebKit\/600.6.3 (KHTML, like Gecko) Version\/8.0.6 Safari\/600.6.3",
            "remote_addr": "[Re.dact.ed]",
            "processed": "1",
            "data": {
                "33885124": {
                    "field": "33885124",
                    "value": "CDat Lab",
                    "flat_value": "CDat Lab",
                    "label": "Completed by:",
                    "type": "select"
                },
                "33884660": {
                    "field": "33884660",
                    "value": {
                        "slideX": "2456123",
                        "slideY": "456632",
                        "label": "K-20150322148",
                        "approved": "1",
                        "score": "30144"
                    },
                    "flat_value": "slideX = 2456123\nslideY = 456632\nlabel = K-20150322148\napproved = 1\nscore = 30144",
                    "label": "Slide Stats:",
                    "type": "slidestats"
                },
                "31970564": {
                    "field": "31970564",
                    "value": [
                        "System",
                        "Crated",
                        "Mirax",
                        "NanoZoomer",
                        "ThinPrep",
                        "Aperio",
                        "Intellisite"
                    ],
                    "flat_value": "System\nCrated\nMirax\nNanoZoomer\nThinPrep\nAperio\nIntellisite",
                    "label": "System Information",
                    "type": "checkbox"
                },
                "33883781": {
                    "field": "33883781",
                    "selection": "Retain",
                    "label": "4. Retain\/Remove\/Review",
                    "type": "select"
                }
            }
        }
    ],
    "total": 170,
    "pages": 5,
    "pretty_id": "478125624983"
}
Answer 0 (score: 1)
You can add DataColumns to a DataTable even when it already contains DataRows.
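A minimal sketch of that point, using only the standard System.Data types; the module name and the column names are made up for illustration. A column added after a row exists simply shows up as DBNull in that row until a value is assigned.

```vbnet
Imports System
Imports System.Data

Module AddColumnLaterSketch
    Sub Main()
        Dim dt As New DataTable()
        dt.Columns.Add("id", GetType(String))

        ' Put one row into the table before the second column exists.
        Dim row = dt.NewRow()
        row("id") = "9202075838"
        dt.Rows.Add(row)

        ' Adding a column afterwards is allowed; the existing row holds DBNull in it.
        dt.Columns.Add("label", GetType(String))
        Console.WriteLine(dt.Rows(0).IsNull("label"))   ' prints True

        ' The new cell can then be filled in like any other.
        dt.Rows(0)("label") = "Completed by:"
        Console.WriteLine(dt.Rows(0)("label"))
    End Sub
End Module
```

This is what makes it workable to discover field names while walking inconsistent records instead of defining the full schema up front.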
I have not done much JSON, but my general approach to dodgy XML is to break it down into a stream of key-value pairs, where the key is the XPath "address" and the value is the content of the node (excluding child nodes), and then walk the stream to build the DataTable. Perhaps a similar approach could be taken here using JSONPath.
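One way that idea might look with Json.NET is sketched below. It is an assumption-laden illustration rather than a tested solution: FlattenByPath and the inline test JSON are hypothetical, the record boundary is taken to be each element of the first array in the document, and every value is stored as a string. The path "address" comes from the JToken.Path property, made relative to the record.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.Linq
Imports Newtonsoft.Json.Linq

Module PathValueSketch
    ' Hypothetical helper: one DataRow per element of the first array,
    ' one column per JSON path relative to that element.
    Function FlattenByPath(json As String) As DataTable
        Dim root = JObject.Parse(json)
        Dim firstArray = root.Descendants().OfType(Of JArray)().First()
        Dim table As New DataTable()

        For Each record In firstArray.Children(Of JObject)()
            ' The key/value stream: path "address" -> content of the leaf.
            Dim pairs As New Dictionary(Of String, String)
            For Each leaf In record.Descendants().OfType(Of JValue)()
                Dim key = leaf.Path.Substring(record.Path.Length).TrimStart("."c)
                pairs(key) = leaf.ToString()
            Next

            ' Add any column not seen before; earlier rows get DBNull there.
            For Each columnName In pairs.Keys
                If Not table.Columns.Contains(columnName) Then
                    table.Columns.Add(columnName, GetType(String))
                End If
            Next

            Dim row = table.NewRow()
            For Each pair In pairs
                row(pair.Key) = pair.Value
            Next
            table.Rows.Add(row)
        Next
        Return table
    End Function

    Sub Main()
        ' Tiny made-up document with the same general shape as the sample.
        Dim json = "{""form"":[{""id"":""1"",""data"":{""10"":{""value"":[""a"",""b""]}}}],""total"":1}"
        Dim table = FlattenByPath(json)
        For Each column As DataColumn In table.Columns
            Console.WriteLine(column.ColumnName & " = " & table.Rows(0)(column).ToString())
        Next
    End Sub
End Module
```

Because the column name is the full relative path, two nested objects that share a child name do not overwrite each other. The cost is wide tables with long column names, and the image field would land in a string column unless it is special-cased.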
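Answer 1 (score: 1)
The ugly contraption below manages (roughly) to do what you want. Pass the JSON source string as the argument to DeserializeToDataTable and collect the resulting DataTable. It works with your sample. I cannot guarantee that it will work with the rest of your data. The aim here is to provide a working starter kit that you can study, understand, debug, and adapt to your needs.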
' Requires Imports System.Data, System.Linq and Newtonsoft.Json.Linq.
' On .NET Framework, row.SetField also needs a reference to System.Data.DataSetExtensions.
Private Function DeserializeToDataTable(ByVal jsource As String) As DataTable
    Dim JRootObject = JObject.Parse(jsource)
    ' Every child object of every "data" object becomes one row.
    Dim Children = JRootObject.SelectTokens("$..data.*").ToArray
    Dim Records = Children.OfType(Of JObject).ToArray
    Dim dicList As New List(Of Dictionary(Of String, Object))
    For Each rec In Records
        dicList.Add(DeserializeToDictionary(rec))
    Next
    ' The union of all keys found becomes the column set.
    Dim fieldnames = dicList.SelectMany(Function(d) d.Keys).Distinct.ToArray
    Dim dt As New DataTable
    For Each fieldname In fieldnames
        dt.Columns.Add(fieldname, GetType(Object))
    Next
    Dim row As DataRow
    For Each dic In dicList
        row = dt.NewRow
        For Each kvp In dic
            row.SetField(kvp.Key, kvp.Value)
        Next
        dt.Rows.Add(row)
    Next
    Return dt
End Function

Private Function DeserializeToDictionary(ByVal json_object As JObject) As Dictionary(Of String, Object)
    Dim dic = New Dictionary(Of String, Object)
    For Each field In json_object.Properties
        Select Case field.Value.Type
            Case JTokenType.Array
                ' Wrap the array items in an object keyed item0, item1, ... and flatten that.
                Dim subobject = New JObject
                Dim item = 0
                For Each token In field.Value
                    subobject("item" & item) = token
                    item += 1
                Next
                Dim subdic = DeserializeToDictionary(subobject)
                For Each kvp In subdic
                    dic(kvp.Key) = kvp.Value
                Next
            Case JTokenType.Boolean
                dic(field.Name) = field.Value.ToObject(Of Boolean)
            Case JTokenType.Bytes
                dic(field.Name) = field.Value.ToObject(Of Byte())
            Case JTokenType.Date
                dic(field.Name) = field.Value.ToObject(Of Date)
            Case JTokenType.Float
                dic(field.Name) = field.Value.ToObject(Of Double)
            Case JTokenType.Guid
                dic(field.Name) = field.Value.ToObject(Of Guid)
            Case JTokenType.Integer
                dic(field.Name) = field.Value.ToObject(Of Integer)
            Case JTokenType.Object
                ' Nested object: recurse and merge its keys into this dictionary.
                ' The explicit cast keeps this compiling under Option Strict On.
                Dim subdic = DeserializeToDictionary(DirectCast(field.Value, JObject))
                For Each kvp In subdic
                    dic(kvp.Key) = kvp.Value
                Next
            Case JTokenType.String
                Try
                    dic(field.Name) = field.Value.ToObject(Of String)
                Catch ex As Exception
                    dic(field.Name) = field.Value.ToObject(Of Object)
                End Try
            Case JTokenType.TimeSpan
                dic(field.Name) = field.Value.ToObject(Of TimeSpan)
            Case Else
                dic(field.Name) = field.Value.ToString
        End Select
    Next
    Return dic
End Function
Keep the following in mind when using the code above:
It uses recursion to flatten the multi-branch structure. So,
{
    "A":"aaaa",
    "B":"bbbb",
    "C":{
        "D":"dddd",
        "E":"eeee",
        "F":"ffff"
    }
}
becomes
A |B |D |E |F
----+----+----+----+----
aaaa|bbbb|dddd|eeee|ffff
The way I did it assumes there are no duplicate keys once everything is flattened; if there are, it keeps the last one. So,
{
    "A":"aaaa",
    "B":"bbbb",
    "C":{
        "D":"d1d1",
        "E":"e1e1",
        "F":"f1f1"
    },
    "G":{
        "D":"d2d2",
        "E":"e2e2",
        "F":"f2f2"
    }
}
becomes
A |B |D |E |F
----+----+----+----+----
aaaa|bbbb|d2d2|e2e2|f2f2
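This is obviously flawed behavior that calls for a more sophisticated approach, which I leave to you to build on top of my starting point.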
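One possible direction, sketched here as an assumption and not as part of the code above, is to qualify each flattened key with the names of its parents so that C.D and G.D stay distinct. Flatten and Qualify are hypothetical helper names, and the "item" & index naming for array elements simply mirrors the convention used earlier.

```vbnet
Imports System
Imports System.Collections.Generic
Imports Newtonsoft.Json.Linq

Module PrefixedFlattenSketch
    ' Recursively flattens a token; nested keys are qualified with their parents' names.
    Sub Flatten(token As JToken, prefix As String, result As Dictionary(Of String, Object))
        Select Case token.Type
            Case JTokenType.Object
                For Each prop In DirectCast(token, JObject).Properties()
                    Flatten(prop.Value, Qualify(prefix, prop.Name), result)
                Next
            Case JTokenType.Array
                Dim index = 0
                For Each item In DirectCast(token, JArray)
                    Flatten(item, Qualify(prefix, "item" & index), result)
                    index += 1
                Next
            Case Else
                ' Leaf: keep the underlying .NET value when there is one.
                Dim leaf = TryCast(token, JValue)
                If leaf IsNot Nothing Then
                    result(prefix) = leaf.Value
                Else
                    result(prefix) = token.ToString()
                End If
        End Select
    End Sub

    ' Joins a parent prefix and a child name with a dot.
    Function Qualify(prefix As String, name As String) As String
        Return If(prefix = "", name, prefix & "." & name)
    End Function

    Sub Main()
        Dim json = "{""A"":""aaaa"",""C"":{""D"":""d1d1""},""G"":{""D"":""d2d2""}}"
        Dim flat As New Dictionary(Of String, Object)
        Flatten(JObject.Parse(json), "", flat)
        ' The keys produced are A, C.D and G.D, so neither D value is lost.
        For Each kvp In flat
            Console.WriteLine(kvp.Key & " = " & Convert.ToString(kvp.Value))
        Next
    End Sub
End Module
```

The resulting dictionary can be fed into the same column-building loop as before; the trade-off is longer column names in exchange for not silently dropping values.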