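I am relatively new to R programming and am struggling with a huge dataset stored as 16 (delimited) text files in one directory. All files have the same number of columns and follow the same naming convention, e.g. file_year_2000, file_year_2001, and so on. I want to create a list in R so that I can access each file individually through the list elements. Searching the web, I found some code and tried it, but the result is one huge list (16,2 MB) whose output just looks odd. I expected the list to have 16 elements, each representing one file read from the directory. This is the code I tried; it does not work properly: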
function search($db) {
$words = '';
if (isset($_POST['searchstock'])){$words = $_POST['searchstock'];}
if (isset($_GET['searchstock'])){$words = $_GET['searchstock'];}
if (!empty($words)) {
$searchQuery = ''; // search query is empty by default
$searchCondition = "(cultivar LIKE '%%' OR description LIKE '%%' OR species LIKE '%%' OR colour LIKE '%%')";
$searchFieldName = 'cultivar'; // name of the field to be searched
$searchFieldName2 = 'description';
$searchFieldName3 = 'species';
$searchFieldName4 = 'colour';
$searchQuery = trim($words); // getting rid of unnecessary white space
$searchTerms = explode(" ", $searchQuery); // Split the words
$searchCondition = "($searchFieldName LIKE '%" . implode("%' OR $searchFieldName LIKE '%", $searchTerms) . "%')"; // Forming the condition for the sql
$searchCondition .= " OR ($searchFieldName2 LIKE '%" . implode("%' OR $searchFieldName2 LIKE '%", $searchTerms) . "%')";
$searchCondition .= " OR ($searchFieldName3 LIKE '%" . implode("%' OR $searchFieldName3 LIKE '%", $searchTerms) . "%')";
$searchCondition .= " OR ($searchFieldName4 LIKE '%" . implode("%' OR $searchFieldName4 LIKE '%", $searchTerms) . "%')";
// the rest is just database connection and retrieving the results
$sql = <<<SQL
SELECT * FROM stock WHERE $searchCondition;
SQL;
if(!$result = $db->query($sql)){
die('There was an error running the query [' . $db->error . ']');
}
while($row = $result->fetch_assoc()){
$searchid = $row['id'];
$searchgenusid = $row['genusid'];
//if its work on _POST then should also work on _GET
//do what ever you want
}
}
}
Any suggestions? Thanks in advance.
Answer 0 (score: 1)
Just to give more details:
path = "~/.../.../.../Data_1999-2015"
list.files(path)
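# match only names ending in ".txt" (the pattern is a regular expression)
file.names <- dir(path, pattern = "\\.txt$")
length(file.names)
df_list = list()
for (i in seq_along(file.names)) {
  # keep only the digits of the file name, e.g. "2000"
  year = gsub('[^0-9]', '', file.names[i])
  # prepend the directory so the file is found regardless of the working directory
  df_list[[year]] = read.csv(file.path(path, file.names[i]), header=TRUE, sep=",", stringsAsFactors=FALSE)
}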
Perhaps it would be worth merging the data frames into one large data frame, with an extra column for the year?
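The idea of stacking the per-year data frames into one, with a year column, could be sketched roughly as follows. This is only an illustration: the toy `df_list` and its columns `id` and `value` are made up, and it assumes every data frame has identical columns and that the list names are the years.

```r
# Toy stand-ins for the per-year data frames (hypothetical columns `id`, `value`)
df_list <- list(
  "2000" = data.frame(id = 1:2, value = c(10.5, 11.2)),
  "2001" = data.frame(id = 1:3, value = c(9.8, 12.1, 10.0))
)

# Add a `year` column to each data frame, taken from the list names
with_year <- Map(function(df, yr) {
  df$year <- as.integer(yr)
  df
}, df_list, names(df_list))

# Stack everything into one data frame; rbind requires identical columns
big_df <- do.call(rbind, with_year)
rownames(big_df) <- NULL

nrow(big_df)
table(big_df$year)
```

Whether one big data frame or a list is more convenient depends on what you do next; the list keeps the years apart, while the single data frame makes it easier to filter or aggregate by `year`.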
Answer 1 (score: 1)
I think that by "access each file individually" you mean that you want to access the data in each file individually.
Try something like this (untested):
path = "~/.../.../.../Data_1999-2015"
file.names <- dir(path, pattern = "\\.txt$")
# create a list of the correct length to hold the data frames
df_list = vector("list", length(file.names))
# give it empty names to begin with
names(df_list) <- rep("", length(df_list))
# seq_along() gives i = 1,2,...,16; seq(along=length(file.names)) would only give 1
for (i in seq_along(file.names)) {
  file <- read.csv(file.names[i], header=TRUE, sep=",", stringsAsFactors=FALSE)
  # save the data
  df_list[[i]] = file
  # name the element after the year in the file name
  year = gsub('[^0-9]', '', file.names[i])
  names(df_list)[i] <- year
}
Now you can use df_list[[1]] or df_list[["2000"]] to get the data for the year 2000 (assuming that is the first file in the directory listing).
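I'm not sure whether you are reading the CSV files from the right directory. If not, use
file <- read.csv(file.path(path, file.names[i]), header=TRUE, sep=",", stringsAsFactors=FALSE)
when reading the files. (paste0() has no sep argument, so paste0(path, file.names[i], sep="/") would just append "/" to the end; file.path() or paste(path, file.names[i], sep="/") builds the path correctly.)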
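For what it's worth, the same result can be had without an explicit loop. The sketch below is not from the answers above; it assumes comma-separated files named like file_year_2000.txt, and it writes two small made-up files to a temporary directory purely so that it runs anywhere. With real data, `path` would point at your own directory instead.

```r
# Throwaway directory with two small files; names and contents are made up
path <- file.path(tempdir(), "example_data")
dir.create(path, showWarnings = FALSE)
write.csv(data.frame(id = 1:2, value = c(1.5, 2.5)),
          file.path(path, "file_year_2000.txt"), row.names = FALSE)
write.csv(data.frame(id = 1:3, value = c(3.5, 4.5, 5.5)),
          file.path(path, "file_year_2001.txt"), row.names = FALSE)

# full.names = TRUE returns paths that include the directory,
# so read.csv() works regardless of the working directory
file.paths <- list.files(path, pattern = "\\.txt$", full.names = TRUE)

# Read each file into a data frame; the result has one list element per file
df_list <- lapply(file.paths, read.csv, header = TRUE, sep = ",",
                  stringsAsFactors = FALSE)

# Name the elements by the year found in the file name (not the full path)
names(df_list) <- gsub("[^0-9]", "", basename(file.paths))

length(df_list)
names(df_list)
str(df_list[["2000"]])
```

Using `basename()` before stripping non-digits matters if the directory name itself contains digits, as in Data_1999-2015.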