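When I run my code, I get the CIK number on one row, followed by rows of NA. I want to remove the NAs and have the corresponding CIK number appear next to each company name. The desired output is shown below.
Code: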
library(rvest)     # read_html, html_nodes, html_table
library(reshape2)  # melt

base_url <- c("https://www.sec.gov/Archives/edgar/data/1409916/000162828017002570/exhibit211nobilishealthcor.htm",
"https://www.sec.gov/Archives/edgar/data/1300317/000119312507128181/dex211.htm",
"https://www.sec.gov/Archives/edgar/data/1453814/000145381417000063/subsidiariesoftheregistran.htm")
# For each filing URL, scrape every table and prepend the CIK taken from the URL
df <- lapply(base_url, function(u) {
  html_obj <- read_html(u)
  draft_table <- html_nodes(html_obj, "table")
  cik <- substr(u, start = 41, stop = 47)  # characters 41 to 47 of the URL hold the 7-digit CIK
  draft1 <- html_table(draft_table, fill = TRUE)
  c(cik, draft1)
})
# Flatten the nested list into one long data frame and keep the first two columns
data <- melt(df)
data <- as.data.frame(data, row.names = NULL)
data <- data[, 1:2]
names(data) <- c("CIK", "Company")
I want the following output:
CIK Company
1 1409916 <NA>
2 1409916 ZYX Ltd
3 1409916 Top Gun
4 1409916 ABC Cements
5 1409916 Hyndai Motors
6 1409916 Zenith
58 1300317 <NA>
59 1300317 Chemical Stores
60 1300317 Motor Chip
61 1300317 PWC
62 1300317 Thomson
63 1300317
Answer 0 (score: 1)
library(zoo)  # provides na.locf
data2 <- transform(data, CIK = na.locf(CIK))
Try this code to replace each NA with the preceding CIK.
It gives you:
head(data2)
CIK Company
1 1409916 <NA>
2 1409916
3 1409916
4 1409916 Name of subsidiary
5 1409916 Northstar Healthcare Holdings, Inc.
6 1409916 Northstar Healthcare Acquisitions, L.L.C.
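If zoo is not installed, the same "carry the last value forward" idea can be sketched in base R. This is only an illustration under assumptions: the `toy` data frame is made up to stand in for the melted `data`, and `fill_forward` is a hypothetical helper name, not part of any package. Unlike `na.locf` with its default settings, this sketch leaves leading NAs in place rather than dropping them.

```r
# Toy stand-in for the melted data (illustrative values, not real scraper output)
toy <- data.frame(
  CIK = c("1409916", NA, NA, "1300317", NA),
  Company = c(NA, "ZYX Ltd", "Top Gun", NA, "PWC"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: carry the last non-NA value forward using base R only
fill_forward <- function(x) {
  # Position of each non-NA element, 0 where the element is NA
  pos <- ifelse(is.na(x), 0L, seq_along(x))
  # Running maximum gives the index of the most recent non-NA element
  idx <- cummax(pos)
  # Leading NAs have nothing before them to carry forward
  idx[idx == 0] <- NA
  x[idx]
}

toy$CIK <- fill_forward(toy$CIK)
print(toy)
```

After this, every row of `toy` carries the CIK of the block it belongs to, which is the shape asked for in the question.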
Answer 1 (score: 0)
Try this:
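CIK <- as.character(data$CIK)
val <- NA_character_  # last non-NA CIK seen so far
for (i in seq_along(CIK)) {
  # Fill an NA with the last seen CIK; otherwise remember the current CIK
  if (is.na(CIK[i])) CIK[i] <- val else val <- CIK[i]
}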
This works as long as the first element of CIK is an ID rather than NA.
data$CIK<-CIK
head(data)
CIK Company
1 1409916 <NA>
2 1409916
3 1409916
4 1409916 Name of subsidiary
5 1409916 Northstar Healthcare Holdings, Inc.
6 1409916 Northstar Healthcare Acquisitions, L.L.C.
Now you can remove the empty rows from the data.frame.
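Removing the leftover rows could look like the sketch below. It assumes that "empty" means a Company value that is NA or blank after trimming whitespace; the `toy` data frame and the names `is_blank` and `cleaned` are illustrative only. Header rows such as "Name of subsidiary" are not empty and would need a separate filter.

```r
# Toy stand-in for the filled data frame (illustrative values)
toy <- data.frame(
  CIK = c("1409916", "1409916", "1409916", "1409916"),
  Company = c(NA, "", "  ", "Northstar Healthcare Holdings, Inc."),
  stringsAsFactors = FALSE
)

# A row counts as blank if Company is NA or contains only whitespace
is_blank <- is.na(toy$Company) | trimws(toy$Company) == ""

# Keep the non-blank rows and renumber them
cleaned <- toy[!is_blank, , drop = FALSE]
rownames(cleaned) <- NULL
print(cleaned)
```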