I have some online order data as XML. I want to build a report with totals for orders, sales, returns, and so on.
<ArrayOfItem>
<Item>
<total>333.3</total>
<terminalid>1</terminalid>
<subtotal>330</subtotal>
<storeid>1000</storeid>
<itemlist>
<TransactionLine><LineNumber>1</LineNumber><Name>Moto G Turbo Edition Black</Name><ItemUPC>5479892348535</ItemUPC><Quantity>1</Quantity><SalePrice>330</SalePrice><IndividualPrice>330</IndividualPrice><CreatedDate>2017-06-13T09:42:52.1411148Z</CreatedDate><Status>0</Status><ShippingCost>0</ShippingCost><TotalTax>3.3</TotalTax><AppliedTaxes><LineTax><TaxId>0</TaxId><Amount>0</Amount><CreatedDate>0001-01-01T00:00:00</CreatedDate></LineTax></AppliedTaxes><AppliedDiscounts /><ItemCondition>SellableAsNew</ItemCondition><ReturnReason>PoorQuality</ReturnReason></TransactionLine>
</itemlist>
<transactiontenders>1</transactiontenders>
<transactiontenders>2</transactiontenders>
<transactiontenders>4</transactiontenders>
<transactiontype>1</transactiontype>
<transdate>2017-06-13T09:52:54Z</transdate>
<transtime>09:52</transtime>
</Item>
<Item>
<total>343.59</total>
<terminalid>1</terminalid>
<subtotal>340.29</subtotal>
<storeid>1000</storeid>
<itemlist>
<TransactionLine><LineNumber>1</LineNumber><Name>Moto G Turbo Edition Black</Name><ItemUPC>5479892348535</ItemUPC><Quantity>1</Quantity><SalePrice>330</SalePrice><IndividualPrice>330</IndividualPrice><CreatedDate>2017-06-13T09:53:00.8548823Z</CreatedDate><Status>0</Status><ShippingCost>0</ShippingCost><TotalTax>3.3</TotalTax><AppliedTaxes><LineTax><TaxId>0</TaxId><Amount>0</Amount><CreatedDate>0001-01-01T00:00:00</CreatedDate></LineTax></AppliedTaxes><AppliedDiscounts /><ItemCondition>SellableAsNew</ItemCondition><ReturnReason>PoorQuality</ReturnReason></TransactionLine>
<TransactionLine><LineNumber>2</LineNumber><Name>This Was A Man</Name><ItemUPC>777221028297</ItemUPC><Quantity>1</Quantity><SalePrice>4.99</SalePrice><IndividualPrice>4.99</IndividualPrice><CreatedDate>2017-06-13T09:53:07.8263895Z</CreatedDate><Status>0</Status><ShippingCost>0</ShippingCost><TotalTax>0</TotalTax><AppliedTaxes /><AppliedDiscounts /><ItemCondition>SellableAsNew</ItemCondition><ReturnReason>PoorQuality</ReturnReason></TransactionLine>
<TransactionLine><LineNumber>3</LineNumber><Name>A Prisoner of Birth</Name><ItemUPC>4000111222302</ItemUPC><Quantity>1</Quantity><SalePrice>5.3</SalePrice><IndividualPrice>5.3</IndividualPrice><CreatedDate>2017-06-13T09:53:11.124866Z</CreatedDate><Status>0</Status><ShippingCost>0</ShippingCost><TotalTax>0</TotalTax><AppliedTaxes /><AppliedDiscounts /><ItemCondition>SellableAsNew</ItemCondition><ReturnReason>PoorQuality</ReturnReason></TransactionLine>
</itemlist>
<transactiontenders>1</transactiontenders><transactiontenders>2</transactiontenders>
<transactiontype>1</transactiontype>
<transdate>2017-06-13T09:53:29Z</transdate>
<transtime>09:53</transtime>
</Item>
</ArrayOfItem>
This is what I have tried so far:
library(XML)
y <- xmlToDataFrame('C:\\App\\06122017.XML')
nrow(y) # To get total number of order
doc = xmlInternalTreeParse('C:\\App\\06122017.XML')
transactionlineItems <- xpathSApply(doc, '//TransactionLine') # list
transactionlineItems
I tried this to get the sum of the totals, but it did not work:
colSums(y[,c("total")]) # not working
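transactionlineItems is a list of XML elements. I want to derive a data frame from it, apply some logic (check whether a particular line item is a sale or a return), and produce separate totals for sales and returns. I also want the quantity sold per product, to see which product sells best. At the moment I do this in the browser by applying the logic to the same data in JSON format. I want to move it to the server side and have chosen R for that.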
Answer 0 (score: 0)
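If you really have your heart set on converting to a data frame:
You are on the right track. This answer combines your xmlToDataFrame and xpathSApply hints. Take care that numeric values are not treated as character strings, or even as factors.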
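To illustrate that pitfall, here is a minimal base R sketch of why a call such as colSums(y[, c("total")]) fails. The orders data frame is a made-up stand-in for what xmlToDataFrame returns, not the real parse result. Whether columns arrive as character or as factor depends on the R version and on the stringsAsFactors setting; factors are assumed here because they are the worse case.

```r
# Hypothetical stand-in for the data frame returned by xmlToDataFrame():
# every column arrives as text, here deliberately stored as factors.
orders <- data.frame(total = c("333.3", "343.59"),
                     storeid = c("1000", "1000"),
                     stringsAsFactors = TRUE)

# Selecting a single column drops to a one-dimensional factor, so colSums()
# has no columns to work on and signals an error.
failed <- inherits(try(colSums(orders[, c("total")]), silent = TRUE),
                   "try-error")

# as.numeric() on a factor returns the internal level codes, not the values.
codes <- as.numeric(orders$total)

# Going through character first recovers the real numbers.
values <- as.numeric(as.character(orders$total))
grand.total <- sum(values)
print(grand.total)
```

For a single column, sum() on the converted vector is enough. colSums() only becomes useful once several numeric columns are selected together.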
library(XML)
order.xml.string <- '<?xml version="1.0" encoding="UTF-8"?>
<ArrayOfItem>
<Item>
<total>333.3</total>
<terminalid>1</terminalid>
<subtotal>330</subtotal>
<storeid>1000</storeid>
<itemlist>
<TransactionLine>
<LineNumber>1</LineNumber>
<Name>Moto G Turbo Edition Black</Name>
<ItemUPC>5479892348535</ItemUPC>
<Quantity>1</Quantity>
<SalePrice>330</SalePrice>
<IndividualPrice>330</IndividualPrice>
<CreatedDate>2017-06-13T09:42:52.1411148Z</CreatedDate>
<Status>0</Status>
<ShippingCost>0</ShippingCost>
<TotalTax>3.3</TotalTax>
<AppliedTaxes>
<LineTax>
<TaxId>0</TaxId>
<Amount>0</Amount>
<CreatedDate>0001-01-01T00:00:00</CreatedDate>
</LineTax>
</AppliedTaxes>
<AppliedDiscounts/>
<ItemCondition>SellableAsNew</ItemCondition>
<ReturnReason>PoorQuality</ReturnReason>
</TransactionLine>
</itemlist>
<transactiontenders>1</transactiontenders>
<transactiontenders>2</transactiontenders>
<transactiontenders>4</transactiontenders>
<transactiontype>1</transactiontype>
<transdate>2017-06-13T09:52:54Z</transdate>
<transtime>09:52</transtime>
</Item>
<Item>
<total>343.59</total>
<terminalid>1</terminalid>
<subtotal>340.29</subtotal>
<storeid>1000</storeid>
<itemlist>
<TransactionLine>
<LineNumber>1</LineNumber>
<Name>Moto G Turbo Edition Black</Name>
<ItemUPC>5479892348535</ItemUPC>
<Quantity>1</Quantity>
<SalePrice>330</SalePrice>
<IndividualPrice>330</IndividualPrice>
<CreatedDate>2017-06-13T09:53:00.8548823Z</CreatedDate>
<Status>0</Status>
<ShippingCost>0</ShippingCost>
<TotalTax>3.3</TotalTax>
<AppliedTaxes>
<LineTax>
<TaxId>0</TaxId>
<Amount>0</Amount>
<CreatedDate>0001-01-01T00:00:00</CreatedDate>
</LineTax>
</AppliedTaxes>
<AppliedDiscounts/>
<ItemCondition>SellableAsNew</ItemCondition>
<ReturnReason>PoorQuality</ReturnReason>
</TransactionLine>
<TransactionLine>
<LineNumber>2</LineNumber>
<Name>This Was A Man</Name>
<ItemUPC>777221028297</ItemUPC>
<Quantity>1</Quantity>
<SalePrice>4.99</SalePrice>
<IndividualPrice>4.99</IndividualPrice>
<CreatedDate>2017-06-13T09:53:07.8263895Z</CreatedDate>
<Status>0</Status>
<ShippingCost>0</ShippingCost>
<TotalTax>0</TotalTax>
<AppliedTaxes/>
<AppliedDiscounts/>
<ItemCondition>SellableAsNew</ItemCondition>
<ReturnReason>PoorQuality</ReturnReason>
</TransactionLine>
<TransactionLine>
<LineNumber>3</LineNumber>
<Name>A Prisoner of Birth</Name>
<ItemUPC>4000111222302</ItemUPC>
<Quantity>1</Quantity>
<SalePrice>5.3</SalePrice>
<IndividualPrice>5.3</IndividualPrice>
<CreatedDate>2017-06-13T09:53:11.124866Z</CreatedDate>
<Status>0</Status>
<ShippingCost>0</ShippingCost>
<TotalTax>0</TotalTax>
<AppliedTaxes/>
<AppliedDiscounts/>
<ItemCondition>SellableAsNew</ItemCondition>
<ReturnReason>PoorQuality</ReturnReason>
</TransactionLine>
</itemlist>
<transactiontenders>1</transactiontenders>
<transactiontenders>2</transactiontenders>
<transactiontype>1</transactiontype>
<transdate>2017-06-13T09:53:29Z</transdate>
<transtime>09:53</transtime>
</Item>
</ArrayOfItem>'
Then:
doc <- xmlParse(order.xml.string, asText = TRUE)
y <-
xmlToDataFrame(nodes = getNodeSet(doc, "//TransactionLine"),
stringsAsFactors = FALSE)
nrow(y) # number of transaction lines (one row per TransactionLine), not orders
numeric.cols <- c("Quantity",
"SalePrice",
"IndividualPrice",
"ShippingCost",
"TotalTax")
y[, numeric.cols] <-
lapply(y[, numeric.cols], as.numeric)
colSums(y[(y$ItemCondition == "SellableAsNew" &
y$ReturnReason == "PoorQuality"), numeric.cols])
Quantity SalePrice IndividualPrice ShippingCost TotalTax
4.00 670.29 670.29 0.00 6.60
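The question also asks for quantities per product. Once the columns are numeric, one possible sketch uses aggregate from base R. The lines data frame below is a hand-built, hypothetical copy of the four parsed transaction lines, so that the example runs without the XML file. How a sale is told apart from a return is not stated in the question, so that split is left out.

```r
# Hypothetical copy of the four parsed transaction lines, already numeric.
lines <- data.frame(
  Name = c("Moto G Turbo Edition Black", "Moto G Turbo Edition Black",
           "This Was A Man", "A Prisoner of Birth"),
  Quantity = c(1, 1, 1, 1),
  SalePrice = c(330, 330, 4.99, 5.3),
  stringsAsFactors = FALSE
)

# Sum quantity and sale price for each product name.
per.product <- aggregate(cbind(Quantity, SalePrice) ~ Name,
                         data = lines, FUN = sum)

# Put the best-selling product first.
per.product <- per.product[order(-per.product$Quantity), ]
print(per.product)
```

If a column that marks returns exists, adding it to the right-hand side of the formula (for example ~ Name + SomeFlag, where SomeFlag is a placeholder name) would give separate rows for sales and returns.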
The xmlToList approach:
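I like data frames as much as anyone, but I usually do not find xmlToDataFrame to be a good solution. I do not think this XML content is strictly rectangular. For example, even within the TransactionLine path, the tax and discount elements appear to be nested rather than flat. Even if the current format happens to suit data frame conversion, it may change in the future, and then you would have to parse data structures out of data frame cells.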
Maybe consider xmlToList? Or even keep the data as XML and apply all of the logic with XPath expressions inside xmlApply-style functions.
order.xml <-
xmlTreeParse(order.xml.string,
asText = TRUE,
useInternalNodes = TRUE)
orders <- xmlRoot(order.xml)
y <- xmlToList(orders)
my.totals <- sapply(y, function(one.item) {
return(as.numeric(one.item$total))
})
total.total <- sum(my.totals)
print(total.total)
[1] 676.89
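The other option mentioned above, keeping the data as XML and querying it with XPath, could look roughly like this. It is a sketch rather than a tested solution for the real file: small.xml is a trimmed, hypothetical version of the order document, and the variable names are made up for illustration. It uses xmlParse, xpathSApply and xmlValue from the XML package.

```r
library(XML)

# Trimmed-down, hypothetical version of the order document.
small.xml <- '<ArrayOfItem>
  <Item><total>333.3</total><itemlist>
    <TransactionLine><Name>Moto G Turbo Edition Black</Name><Quantity>1</Quantity></TransactionLine>
  </itemlist></Item>
  <Item><total>343.59</total><itemlist>
    <TransactionLine><Name>Moto G Turbo Edition Black</Name><Quantity>1</Quantity></TransactionLine>
    <TransactionLine><Name>This Was A Man</Name><Quantity>1</Quantity></TransactionLine>
  </itemlist></Item>
</ArrayOfItem>'

small.doc <- xmlParse(small.xml, asText = TRUE)

# Pull values straight out of the tree; xmlValue returns text, so convert.
totals <- as.numeric(xpathSApply(small.doc, "//Item/total", xmlValue))
product.names <- xpathSApply(small.doc, "//TransactionLine/Name", xmlValue)
qty <- as.numeric(xpathSApply(small.doc, "//TransactionLine/Quantity", xmlValue))

order.count <- length(totals)                      # one total per order
grand.total <- sum(totals)                         # sum over all orders
qty.by.product <- tapply(qty, product.names, sum)  # units per product

print(order.count)
print(grand.total)
print(qty.by.product)
```

Because each value is selected by its own path, nested elements such as AppliedTaxes are never flattened into a data frame cell.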