Question

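When I run my code, I get the CIK number in one row, followed by NAs. I want to get rid of the NAs and have the corresponding CIK number next to each company name. The desired output is shown below.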

Code:

library(rvest)  # read_html(), html_nodes(), html_table()

base_url <- c("https://www.sec.gov/Archives/edgar/data/1409916/000162828017002570/exhibit211nobilishealthcor.htm",
              "https://www.sec.gov/Archives/edgar/data/1300317/000119312507128181/dex211.htm",
              "https://www.sec.gov/Archives/edgar/data/1453814/000145381417000063/subsidiariesoftheregistran.htm")

df <- lapply(base_url, function(u) {
  html_obj <- read_html(u)
  draft_table <- html_nodes(html_obj, 'table')
  # The CIK is the 7-digit directory name at characters 41-47 of the URL
  cik <- substr(u, start = 41, stop = 47)
  draft1 <- html_table(draft_table, fill = TRUE)
  # Prepend the CIK to the list of tables scraped from this page
  final <- c(cik, draft1)
})


library(reshape2)  # melt()
data <- melt(df)
data <- as.data.frame(data, row.names = NULL)
data <- data[, 1:2]  # keep only the first two columns
names(data) <- c("CIK", "Company")

I want the output below:
       CIK   Company
1  1409916   <NA>
2  1409916   ZYX Ltd
3  1409916   Top Gun
4  1409916   ABC Cements
5  1409916   Hyndai Motors
6  1409916   Zenith
58 1300317   <NA>
59 1300317   Chemical Stores
60 1300317   Motor Chip
61 1300317   PWC
62 1300317   Thomson
63   1300317

Answer 1

library(zoo)  # provides na.locf()
data2 <- transform(data, CIK = na.locf(CIK))

Try this code to replace each NA with the preceding CIK.

Which gives you:

head(data2)
      CIK                                   Company
1 1409916                                      <NA>
2 1409916                                          
3 1409916                                          
4 1409916                        Name of subsidiary
5 1409916       Northstar Healthcare Holdings, Inc.
6 1409916 Northstar Healthcare Acquisitions, L.L.C.
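
For readers who would rather not depend on zoo, the same "carry the last value forward" idea can be sketched in base R. This is only an illustration on a made-up vector (`cik` and its values are assumptions, not the scraped data), and it is a different technique from `na.locf()` that should give the same fill for this kind of column:

```r
# Hypothetical CIK column shaped like the melted result: an ID followed by NAs
cik <- c("1409916", NA, NA, "1300317", NA)

# For each position, count how many non-NA values have been seen so far
idx <- cumsum(!is.na(cik))

# Look up the most recent non-NA value; the leading NA slot covers
# positions that come before any ID has been seen (idx == 0)
filled <- c(NA, cik[!is.na(cik)])[idx + 1]
filled
```

Unlike `na.locf()` with its default settings, this sketch keeps leading NAs in place rather than dropping them, so the result always has the same length as the input.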

Answer 2

Try this:

CIK <- as.character(data$CIK)

# Carry the last non-NA CIK forward; val holds the most recent ID seen
for (i in seq_along(CIK)) {
  if (is.na(CIK[i])) CIK[i] <- val else val <- CIK[i]
}

This works as long as the first element of CIK holds an ID (otherwise val is still undefined when the loop first needs it).

data$CIK<-CIK
head(data)
      CIK                                   Company
1 1409916                                      <NA>
2 1409916                                          
3 1409916                                          
4 1409916                        Name of subsidiary
5 1409916       Northstar Healthcare Holdings, Inc.
6 1409916 Northstar Healthcare Acquisitions, L.L.C.

Now you can remove the empty rows from the data.frame.
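
One possible way to do that cleanup, shown as a rough sketch on a small made-up data frame (the rows below are assumptions copied from the head() output, and `keep` and `clean` are illustrative names): treat a row as empty when Company is either NA or contains only whitespace.

```r
# Hypothetical data frame shaped like the head(data) output above
data <- data.frame(
  CIK = c("1409916", "1409916", "1409916", "1409916"),
  Company = c(NA, "", "Name of subsidiary", "Northstar Healthcare Holdings, Inc."),
  stringsAsFactors = FALSE
)

# Keep rows whose Company is neither NA nor blank after trimming whitespace
keep <- !is.na(data$Company) & nzchar(trimws(data$Company))
clean <- data[keep, , drop = FALSE]

# Renumber the rows so the index is contiguous again
rownames(clean) <- NULL
clean
```

Header rows scraped from the tables, such as "Name of subsidiary", would survive this filter and need a separate check if they are unwanted.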

Handling <NA> values and replacing them with an ID in R
