Question

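I am parsing a CSV file with multiple columns. The number of columns in the file is not fixed; it varies from 5 to 10. I need to recreate a data.frame from these columns inside a function. Does R have any variable-argument facility like Ruby's *args? If not, how can this be achieved? I searched a bit and found that if I have names such as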

col1
col2

I can use:

list <- ls(pat="^col\\d$")

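and pass this list as an argument to a function, but that only passes the column names as character strings, not the values those names refer to.

Any suggestions?

Edit: I am parsing the file from a RoR application and using the RinRuby gem to call R functions. So the CSV is parsed in Ruby, and the contents of each column are passed to R as a separate variable. Now, in R, I need to create a data.frame. Strictly speaking, what I have on the R side is not a data frame yet. So in the cal_norm method below, I use a loop to assign variables in R named col1, col2, col3 and so on.

Here is the Rails code: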

class UploadsController < ApplicationController

  attr_accessor :calib_data, :calib_data_transpose, :inten_data, :pr_list

  def index
    @uploads = Upload.all

    @upload = Upload.new

    respond_to do |format|
      format.html
      format.json { render json: @uploads }
    end
  end

  def create
    @upload = Upload.new(params[:upload])

    directory = "public/"
    io_calib = params[:upload][:calib]
    io_inten = params[:upload][:inten]

    name_calib = io_calib.original_filename
    name_inten = io_inten.original_filename
    calib_path = File.join(directory, "calibs", name_calib)
    inten_path = File.join(directory, "intens", name_inten)

    respond_to do |format|
      if @upload.save
        @calib_data, @calib_data_transpose = import(calib_path)
        @inten_data = import_ori(inten_path)
        # probe list of the uploaded file
        @probe_list = calib_data_transpose[0]
        logger.debug @probe_list.to_s
        flash[:notice] = "Files were successfully uploaded!!"
        format.html
        #format.js #{ render json: @upload, status: :created, location: @upload }
      else
        flash[:notice] = "Error in uploading!!"
        format.html { render action: "index" }
        format.json { render json: @upload.errors, status: :unprocessable_entity }
      end
    end
  end

  def cal_norm
    # ajax request
    data = params['data'].split(',')

    for i in 0..@calib_data_transpose.length - 1
      R.assign "col#{i}", @calib_data_transpose[i]
    end

    R.assign "cells", @inten_data
    R.assign "pr", data
    R.eval <<-EOF

# make sure to convert them to character and numeric vectors

# match the selected pr in the table

# convert the found row of values from data.frame to numeric

# divide each column of the table by the respective pr values and create a new table; repeat it with different pr

# make a new table with the cell count and different probe normalization and calculate for individual pr

# finally return a data.frame with pr names and cell counts

# return individual columns as an array, not in the form of matrix/data.frame

EOF

  end

  def import(file_path)
    array = import_ori(file_path)
    array_splitted = array.map { |a| a.split(",") }
    array_transpose = array_splitted.transpose
    return array_splitted, array_transpose
  end

  def import_ori(file_path)
    string = IO.read(file_path)
    array = string.split("\n")
    array.shift
    return array
  end

end

Answer 1

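Following the updated question:

I am new to Ruby, but I found this example here: col wise data

Here the column-wise data is read into col_data; the 0 is the (column) index. (I don't have Ruby available to test this.)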

require 'csv'
col_data = []
CSV.foreach(filename) {|row| col_data << row[0]}
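
To collect every column rather than only index 0, one option is to parse the file once and transpose the rows. This is a minimal sketch, assuming well-formed CSV in which every row has the same number of fields; the sample text and the helper name columns_from_csv are illustrative and not part of the original code.

```ruby
require "csv"

# Hypothetical helper: parse CSV text, drop the header row, and return
# the data both row-wise and column-wise.
def columns_from_csv(csv_text)
  rows = CSV.parse(csv_text)  # the CSV library copes with quoted fields like "p,2"
  rows.shift                  # discard the header line
  [rows, rows.transpose]      # transpose raises IndexError if rows differ in length
end

sample = "probe,s1,s2\np1,1,2\n\"p,2\",3,4\n"
rows, cols = columns_from_csv(sample)

p rows  # => [["p1", "1", "2"], ["p,2", "3", "4"]]
p cols  # => [["p1", "p,2"], ["1", "3"], ["2", "4"]]
```

Using the CSV library instead of splitting each line on "," keeps a quoted field that contains a comma in one piece.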

Assign the column data to variables col1 ... coln, and create a counter for the number of columns (the syntax may not be 100% correct):

col_count = @calib_data_transpose.length
for i in 0..col_count - 1
  # R.assign "col#{i}", @calib_data_transpose[i]
  col = []
  CSV.foreach(filename) { |row| col << row[i] }
  R.assign "col#{i + 1}", col  # named col1..coln to match the 1-based R loop
end

R.col_count = col_count
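
The naming and counting step can be tried without R installed. The following is a minimal sketch, assuming the columns are already available as an array of arrays; the sample data and the named hash are illustrative and not from the original. It numbers the columns from 1 so that the names line up with a 1-based loop on the R side.

```ruby
# Illustrative column-wise data, e.g. the result of transposing parsed CSV rows.
columns = [["p1", "p2"], ["1", "3"], ["2", "4"]]

# Build "col1" => first column, "col2" => second column, and so on.
named = columns.each_with_index.map { |col, i| ["col#{i + 1}", col] }.to_h
col_count = named.size

named.each do |name, values|
  # With RinRuby, each pair could be handed over here via R.assign(name, values).
  puts "#{name}: #{values.inspect}"
end
puts "col_count = #{col_count}"
```

Keeping the count equal to the number of columns, rather than one less, avoids dropping the last column when R iterates over 1:col_count.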

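Once col1..coln have been created, combine the column data one index at a time, starting from i = 1. The result is a data.frame whose columns are in the order col1 ... coln.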

R.eval <<-EOF

for (i in 1:col_count) {
  if (i == 1) {
    df <- data.frame(get(paste0("col", i)))
  } else {
    df <- cbind(df, get(paste0("col", i)))
  }

  names(df)[i] <- paste0("col", i)
}

EOF

Let me know if this helps.

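The following is no longer relevant to the updated question, but is kept for posterity.

Subsetting a data.frame by a given pattern

As Roland said, read.csv will read the entire file. Since you want to control which columns are kept in the data.frame, you can do the following:

Using data(mtcars) as the example data.frame

Code:

Read in the data: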

> data(mtcars)
> head(mtcars)
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1

Subset the data on some condition, for example the columns whose names start with the letter 'c':

> head(mtcars[,grep("^c",colnames(mtcars))])
                   cyl carb
Mazda RX4           6    4
Mazda RX4 Wag       6    4
Datsun 710          4    1
Hornet 4 Drive      6    1
Hornet Sportabout   8    2
Valiant             6    1

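Here '^c' is analogous to the pattern pat="^col\\d$" in your question. You can replace '^c' with any regular expression of your choice, for example '^col'. '^c' matches any name that starts with the letter 'c'; to match names that end with it, use 'c$'.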
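
The same anchors behave the same way on the Ruby side of the application. As a rough analogue of the R grep call above (not a replacement for it), this sketch filters a plain array of column names; the names array is copied from mtcars, and the candidates array is made up for illustration.

```ruby
# Column names of mtcars, as a plain Ruby array.
names = %w[mpg cyl disp hp drat wt qsec vs am gear carb]

starts_with_c = names.grep(/^c/)  # anchored at the start of the name
ends_with_c   = names.grep(/c$/)  # anchored at the end of the name

# The pattern from the question: "col" followed by exactly one digit.
candidates = %w[col1 col2 color col10]
numbered = candidates.grep(/^col\d$/)

p starts_with_c  # => ["cyl", "carb"]
p ends_with_c    # => ["qsec"]
p numbered       # => ["col1", "col2"]
```

Note that "^col\d$" allows only a single digit, so a tenth column named col10 would need a pattern such as "^col\d+$".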

How to pass an unknown number of arguments to a function in R programming
