I have multiple files in folders and subfolders.
library(RCurl)
library(jsonlite)
library(purrr)
library(stringr)
library(rvest)
library(dplyr)
Sys.setlocale(locale = "Russian")
vacanciesdf <- data.frame(
  Name = character(),
  Currency = character(),
  From = character(),
  Area = character(),
  Requirement = character(),
  Experience = character(),
  stringsAsFactors = FALSE
)
# First extract all data into a list
vacanciesdf.list <- list()
for (pageNum in 0:1) {
  data <- jsonlite::fromJSON(paste0("https://api.hh.ru/vacancies?text=\"machine+learning\"&page=", pageNum))
  message("Processing page: ", pageNum)
  # Here I assume that Name data is always present
  # For all other columns, fill them with missing values if they are not present (NULL)
  Name <- data$items$area$name
  Currency <- if (is.null(data$items$salary$currency)) rep(NA, length(Name)) else data$items$salary$currency
  From <- if (is.null(data$items$salary$from)) rep(NA, length(Name)) else data$items$salary$from
  Area <- if (is.null(data$items$employer$name)) rep(NA, length(Name)) else data$items$employer$name
  Requirement <- if (is.null(data$items$snippet$requirement)) rep(NA, length(Name)) else data$items$snippet$requirement
  Experience <- if (is.null(data$items$experience$name)) rep(NA, length(Name)) else data$items$experience$name
  # Add to the list
  vacanciesdf.list[[pageNum + 1]] <- data.frame(Name,
                                                Currency,
                                                From,
                                                Area,
                                                Requirement,
                                                Experience,
                                                stringsAsFactors = FALSE)
  # I assume the pause is only needed between requests, not after the last one
  if (pageNum < 1) Sys.sleep(3)
}
# Combine all elements in the list into a single data.frame
library(data.table)
vacanciesdf <- as.data.frame( rbindlist(vacanciesdf.list))
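The files are .txt files. I need to convert all of them to CSV and merge the CSV files of each folder (for example arizona_files.csv, alaska_files.csv). My attempt produces no output. Any idea what I am doing wrong?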
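Answer 0: (score: 2)
As stated in https://docs.python.org/3/library/os.html, the file names yielded by os.walk() contain no path components, and "to get a full path (which begins with top) to a file or directory in dirpath, do os.path.join(dirpath, name)." That is why you are getting this error.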
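To illustrate the point about os.walk(), here is a minimal sketch that collects the full path of every .txt file under a top-level folder. The function name list_full_paths and the throwaway arizona folder are made up for the example and are not taken from the question.

```python
import os
import tempfile


def list_full_paths(top, suffix=".txt"):
    """Return the full path of every file under `top` whose name ends with `suffix`."""
    full_paths = []
    for dirpath, dirnames, filenames in os.walk(top):
        for name in filenames:
            if name.endswith(suffix):
                # `name` has no directory part, so it must be joined with dirpath
                # before it can be opened from another working directory.
                full_paths.append(os.path.join(dirpath, name))
    return sorted(full_paths)


if __name__ == "__main__":
    # Hypothetical layout, built in a temporary directory for the demo.
    with tempfile.TemporaryDirectory() as top:
        os.makedirs(os.path.join(top, "arizona"))
        with open(os.path.join(top, "arizona", "a.txt"), "w") as fh:
            fh.write("x\ty\n")
        for path in list_full_paths(top):
            print(path)
```

Opening name directly only works when the script happens to run from inside dirpath, which is why the joined path is the safer choice.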
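Answer 1: (score: 0)
You are not running the code from the right directory. When you launch the script from the command prompt, the Python script has to sit at the top level of the paths being iterated, i.e. in the States folder or above it, and be started from there. Alternatively, you can change in_txt to the following: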
in_txt = csv.reader(open(os.path.join(path,filename), "rb"), delimiter = '\t')
This tells csv.reader exactly where to find the current file. You also have to apply the same kind of operation when writing the CSV.
out_csv.writerows(os.path.join(path,filename))
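Putting both answers together, one possible Python 3 sketch of the whole task could look like the following. It assumes the .txt files are tab-delimited (as the delimiter in the snippet above suggests), that rows can simply be concatenated without special header handling, and that the merged file should be written into the folder it summarizes. The function name merge_folder_txt_to_csv is hypothetical. Note that Python 3 expects text mode with newline="" for the csv module rather than "rb".

```python
import csv
import os


def merge_folder_txt_to_csv(top):
    """Write <folder>_files.csv into every folder under `top` that contains .txt files.

    Returns the sorted list of CSV paths that were written.
    """
    written = []
    for dirpath, dirnames, filenames in os.walk(top):
        txt_names = sorted(n for n in filenames if n.endswith(".txt"))
        if not txt_names:
            continue  # nothing to convert in this folder
        folder = os.path.basename(os.path.normpath(dirpath))
        out_path = os.path.join(dirpath, folder + "_files.csv")
        with open(out_path, "w", newline="") as out_file:
            writer = csv.writer(out_file)
            for name in txt_names:
                # Join with dirpath so the file is found from any working directory.
                in_path = os.path.join(dirpath, name)
                with open(in_path, newline="") as in_file:
                    # Assumption: the input is tab-delimited.
                    writer.writerows(csv.reader(in_file, delimiter="\t"))
        written.append(out_path)
    return sorted(written)
```

Calling merge_folder_txt_to_csv("States") would then produce files such as States/arizona/arizona_files.csv, provided the folder layout matches these assumptions.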