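How can I combine vectors of different lengths into a data frame in R? An example follows. Each vector has either 7 or 9 elements; sourceVersion and device are the two extra elements. I want those columns included in the data frame and left blank or set to NA for the 7-element observations, as in the table below.
I would like the data in a data frame like this.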
type sourceName sourceVersion device unit creationDate startDate endDate value
HKQuantityTypeIdentifierFlightsClimbed Ryan Praskievicz iPhone 9.3.2 <<HKDevice: 0x15a4af3f0>, name:iPhone, manufacturer:Apple, model:iPhone, hardware:iPhone8,1, software:9.3.2> count 6/2/2016 12:27 6/2/2016 12:09 6/2/2016 12:09 1
HKQuantityTypeIdentifierStepCount Ryan Praskievicz iPhone count 10/2/2014 8:30 9/24/2014 15:07 9/24/2014 15:07 7
This is what I tried.
library(XML)
xmlstr <- '<?xml version="1.0" encoding="UTF-8"?>
<HealthData locale="en_US">
<ExportDate value="2016-06-02 14:05:23 -0400"/>
<Me HKCharacteristicTypeIdentifierDateOfBirth="" HKCharacteristicTypeIdentifierBiologicalSex="HKBiologicalSexNotSet" HKCharacteristicTypeIdentifierBloodType="HKBloodTypeNotSet" HKCharacteristicTypeIdentifierFitzpatrickSkinType="HKFitzpatrickSkinTypeNotSet"/>
<Record type="HKQuantityTypeIdentifierStepCount" sourceName="Ryan Praskievicz iPhone" unit="count" creationDate="2014-10-02 08:30:17 -0400" startDate="2014-09-24 15:07:06 -0400" endDate="2014-09-24 15:07:11 -0400" value="7"/>
<Record type="HKQuantityTypeIdentifierFlightsClimbed" sourceName="Ryan Praskievicz iPhone" sourceVersion="9.3.2" device="&lt;&lt;HKDevice: 0x15a4af3f0&gt;, name:iPhone, manufacturer:Apple, model:iPhone, hardware:iPhone8,1, software:9.3.2&gt;" unit="count" creationDate="2016-06-02 12:27:46 -0400" startDate="2016-06-02 12:09:29 -0400" endDate="2016-06-02 12:09:29 -0400" value="1"/>
</HealthData>'
xml <- xmlParse(xmlstr)
recordAttribs <- xpathSApply(doc=xml, path="//HealthData/Record", xmlAttrs)
df <- data.frame(t(recordAttribs))
df
This is the output I get in the R console:
X1
1 HKQuantityTypeIdentifierStepCount, Ryan Praskievicz iPhone, count, 2014-10-02 08:30:17 -0400, 2014-09-24 15:07:06 -0400, 2014-09-24 15:07:11 -0400, 7
X2
1 HKQuantityTypeIdentifierFlightsClimbed, Ryan Praskievicz iPhone, 9.3.2, <<HKDevice: 0x15a4af3f0>, name:iPhone, manufacturer:Apple, model:iPhone, hardware:iPhone8,1, software:9.3.2>, count, 2016-06-02 12:27:46 -0400, 2016-06-02 12:09:29 -0400, 2016-06-02 12:09:29 -0400, 1
Answer 0 (score: 2)
It adds an extra dependency, but you can do this:
library(data.table)
rbindlist(lapply(recordAttribs, function(x) data.table(t(x))), fill=TRUE)
This returns a data.table, which inherits from data.frame:
type sourceName unit
1: HKQuantityTypeIdentifierStepCount Ryan Praskievicz iPhone count
2: HKQuantityTypeIdentifierFlightsClimbed Ryan Praskievicz iPhone count
creationDate startDate endDate value
1: 2014-10-02 08:30:17 -0400 2014-09-24 15:07:06 -0400 2014-09-24 15:07:11 -0400 7
2: 2016-06-02 12:27:46 -0400 2016-06-02 12:09:29 -0400 2016-06-02 12:09:29 -0400 1
sourceVersion
1: NA
2: 9.3.2
device
1: NA
2: <<HKDevice: 0x15a4af3f0>, name:iPhone, manufacturer:Apple, model:iPhone, hardware:iPhone8,1, software:9.3.2>
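The reason I use data.table is that it has a smart rbind method whose fill=TRUE option (which also turns on matching by name, as use.names=TRUE does) allows rows of unequal length, matches columns by name rather than by position, and fills the missing values with NA.
A simple example of how rbind.data.table works: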
d1 = data.table(a="foo", b = "bar", c = "baz")
d2 = data.table(b="bar", a = "foo")
rbind(d1, d2) # throws helpful error: "If instead you need to fill missing columns, use set argument 'fill' to TRUE."
rbind(d1, d2, fill=TRUE)
# a b c
# 1: foo bar baz
# 2: foo bar NA
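The same idea can be tried without the XML file. The sketch below is only an illustration: `recs` is a hypothetical stand-in for the named character vectors that xmlAttrs returns, with shortened values, and it assumes data.table is installed.

```r
library(data.table)

# Hypothetical stand-ins for the named character vectors returned by xmlAttrs
recs <- list(
  c(type = "StepCount", sourceName = "iPhone", unit = "count", value = "7"),
  c(type = "FlightsClimbed", sourceName = "iPhone", sourceVersion = "9.3.2",
    unit = "count", value = "1")
)

# Turn each named vector into a one-row data.table
rows <- lapply(recs, function(x) as.data.table(as.list(x)))

# Bind by column name; columns missing from a row are filled with NA
combined <- rbindlist(rows, fill = TRUE)

combined$sourceVersion
```

Columns appear in the order they are first seen, so sourceVersion ends up last here, as it does in the output above.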
Answer 1 (score: 1)
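Here is a way to do this using lapply and sapply.
recordAttribs <- xpathSApply(doc=xml, path="//HealthData/Record", xmlAttrs)
recordAttribs <- t(recordAttribs)
Use sapply to get a vector of TRUE/FALSE values indicating whether each element of the list has length 7.
short.condition <- sapply(recordAttribs, function(x) length(x)==7)
Then use lapply on the subset of the list that meets this condition. This time you insert two NA values into each of those vectors, at the positions of sourceVersion and device:
# apply only to the short records so the replacement lines up with the subset
recordAttribs[short.condition] <- lapply(recordAttribs[short.condition],
function(x) c(x[1:2],NA,NA,x[3:7]))
To convert this into a data.frame in the format you want:
df <- matrix(unlist(recordAttribs),
nrow=2,ncol=9, byrow=TRUE)
df <- data.frame(df, stringsAsFactors=FALSE)
names(df) <- c("type","sourceName","sourceVersion","device","unit","creationDate","startDate","endDate","value")
The result looks like this: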
> str(df)
'data.frame': 2 obs. of 9 variables:
$ type : chr "HKQuantityTypeIdentifierStepCount" "HKQuantityTypeIdentifierFlightsClimbed"
$ sourceName : chr "Ryan Praskievicz iPhone" "Ryan Praskievicz iPhone"
$ sourceVersion: chr NA "9.3.2"
$ device : chr NA "<<HKDevice: 0x15a4af3f0>, name:iPhone, manufacturer:Apple, model:iPhone, hardware:iPhone8,1, software:9.3.2>"
$ unit : chr "count" "count"
$ creationDate : chr "2014-10-02 08:30:17 -0400" "2016-06-02 12:27:46 -0400"
$ startDate : chr "2014-09-24 15:07:06 -0400" "2016-06-02 12:09:29 -0400"
$ endDate : chr "2014-09-24 15:07:11 -0400" "2016-06-02 12:09:29 -0400"
$ value : chr "7" "1"
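If the positions of the missing attributes are not known in advance, a base R variant can pad by name instead of by position. This is a rough sketch, not part of the answer above: `records` is a hypothetical stand-in for the xmlAttrs output with shortened values, and it assumes the longest record carries every column name.

```r
# Hypothetical stand-ins for the named character vectors returned by xmlAttrs
records <- list(
  c(type = "StepCount", sourceName = "iPhone", unit = "count", value = "7"),
  c(type = "FlightsClimbed", sourceName = "iPhone", sourceVersion = "9.3.2",
    unit = "count", value = "1")
)

# Assumption: the longest record defines the full set and order of columns
all_names <- names(records[[which.max(lengths(records))]])

# Indexing a named vector by a name it lacks yields NA, which pads short records
aligned <- lapply(records, function(x) unname(x[all_names]))

# Stack the equal-length vectors row by row and restore the column names
aligned_df <- as.data.frame(do.call(rbind, aligned), stringsAsFactors = FALSE)
names(aligned_df) <- all_names

str(aligned_df)
```

This avoids hard-coding the lengths 7 and 9, at the cost of relying on every attribute being named.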