How do I use HTTParty to fetch content from a website?

Time: 2015-01-05 21:28:30

Tags: ruby-on-rails ruby json ruby-on-rails-4 httparty

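I want to fetch content from a website whose URL I enter into a submission form, and store that information as JSON that I can save to my database. I'm trying to use HTTParty, but I'm not quite sure how to implement it to fetch the data. Here is what I have so far.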

Controller

  class UrlsController < ApplicationController
  before_action :set_url, only: [:show, :edit, :update, :destroy]
  #require "addressable/uri"
  #Addressable::URI.parse(url)

  # GET /urls
  # GET /urls.json
  def index
    @urls = Url.all
  end

  # GET /urls/1
  # GET /urls/1.json
  def show
  end

  # GET /urls/new
  def new
    @url = Url.new
  end

  # GET /urls/1/edit
  def edit
  end

  def uri?(string)
    uri = URI.parse(string)
    %w( http https ).include?(uri.scheme)
    rescue URI::BadURIError
      false
    rescue URI::InvalidURIError
      false
  end

  # POST /urls
  # POST /urls.json
  def create
    @url = Url.new(url_params)
    @app_url = params[:url]

    respond_to do |format|
      if @url.save
        format.html { redirect_to @url, notice: 'Url was successfully created.' }
        format.json { render action: 'show', status: :created, location: @url }
        wordcount
      else
        format.html { render action: 'new' }
        format.json { render json: @url.errors, status: :unprocessable_entity }
      end
    end
  end

  def wordcount
    # Choose the URL to visit
    @app_url = @url

    @words = HTTParty.get(@app_url)

    # Trick to pretty print headers
    @wordcount = Hash[*@words]
  end

  # PATCH/PUT /urls/1
  # PATCH/PUT /urls/1.json
  def update
    respond_to do |format|
      if @url.update(url_params)
        format.html { redirect_to @url, notice: 'Url was successfully updated.' }
        format.json { head :no_content }
      else
        format.html { render action: 'edit' }
        format.json { render json: @url.errors, status: :unprocessable_entity }
      end
    end
  end

  # DELETE /urls/1
  # DELETE /urls/1.json
  def destroy
    @url.destroy
    respond_to do |format|
      format.html { redirect_to urls_url }
      format.json { head :no_content }
    end
  end

  private
    # Use callbacks to share common setup or constraints between actions.
    def set_url
      @url = Url.find(params[:id])
    end

    # Never trust parameters from the scary internet, only allow the white list through.
    def url_params
      params.require(:url).permit(:url)
    end
end

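That is my controller.rb. I get a 'bad argument (expected URI object or URI string)' error from the line @words = HTTParty.get(@app_url). I need to turn the URL from the form into a valid URL, fetch the content I want from that URL, and save that information.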

1 answer:

Answer 0 (score: 0)

Try something like this:

response = HTTParty.get('https://google.com')

puts response.body, response.code, response.message, response.headers.inspect
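Since the question asks to store the fetched information as JSON, the values printed above can be collected into a hash and serialized. This is only a minimal sketch: response_to_json is a hypothetical helper (not part of HTTParty), the key names are illustrative, and it assumes the headers have already been converted to a plain hash.

```ruby
require "json"

# Hypothetical helper (not part of HTTParty): packs the pieces of an HTTP
# response into a JSON string that could be saved in a text or JSON column.
def response_to_json(code:, headers:, body:)
  JSON.generate(
    "code"    => code,
    "headers" => headers,
    "body"    => body
  )
end

# Literal values stand in here for response.code, the response headers
# as a plain hash, and response.body.
json = response_to_json(
  code: 200,
  headers: { "content-type" => "text/html" },
  body: "<html></html>"
)
puts json
```

The resulting string can be parsed back with JSON.parse when the record is read.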

To answer your question, you can implement the following method, either in a new class or in a helper.

You may need to include HTTParty.

def url_getter(url)
  HTTParty.get(url)
end

And call it like this:

url_getter(@app_url)
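Note that the argument has to be a URL string. In the question's wordcount method, @app_url = @url assigns the Url record itself, which is the likely cause of the 'bad argument' error: it is raised when the argument is neither a URI object nor a string. The sketch below shows one way to turn the form input into a valid URL before fetching. The names normalize_url and fetch_content are hypothetical, and defaulting to http for bare hosts is an assumption, not something the question specifies.

```ruby
require "uri"

# Hypothetical helper: turns raw form input into an http(s) URL string,
# or returns nil when that is not possible.
def normalize_url(raw)
  candidate = raw.to_s.strip
  return nil if candidate.empty?

  # Assumption: use http when the user typed a bare host such as "google.com".
  candidate = "http://#{candidate}" unless candidate.match?(%r{\A[a-z][a-z0-9+.-]*://}i)

  uri = URI.parse(candidate)
  return nil unless %w[http https].include?(uri.scheme)
  return nil if uri.host.nil? || uri.host.empty?

  uri.to_s
rescue URI::InvalidURIError
  nil
end

# Hypothetical wrapper around url_getter. `client` is any object that
# responds to get(url); in the application that would be HTTParty.
def fetch_content(raw_url, client:)
  url = normalize_url(raw_url)
  return nil if url.nil?

  client.get(url)
end

# In the controller this might be called with the string attribute,
# not the record:
#   fetch_content(@url.url, client: HTTParty)
```

Passing the client in keeps the URL handling testable without a network call; a nil return means the input could not be turned into an http or https URL.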