ruby,csv,automation,web-scraping,mechanize , Ruby Mechanize form input field text

## Question:

Resolved - the "abc = list.scan(/\[([^\)]+)\]/).last.first" line was correct, but the captured text still included the double quotes, which the website's search form did not accept. I corrected it to abc = list.scan(/\"([^)]+)\"/).join, which captures only the text between the quotes.

Thanks for all the help.

I have to automate a search using a list of 100 keywords that is in a csv file.

With Mechanize, I can submit the search using this example (http://mechanize.rubyforge.org/GUIDE_rdoc.html):

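agent = Mechanize.new
page = agent.get('http://google.com/')
google_form = page.form('f')
google_form.q = 'ruby mechanize'
page = agent.submit(google_form)
pp page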


However, when I make it loop through the CSV file, it returns an error (in this example, the first CSV entry would be 'ruby mechanize'):

# I have already imported the CSV list; now it is looping through the array "raw_list"

raw_list.each do |list|
  abc = list.scan(/\[([^\)]+)\]/).last.first

  # I tested a "puts abc", which returned "ruby mechanize", so I don't understand why the rest of this doesn't work

  agent = Mechanize.new
  page = agent.get('http://google.com/')
  google_form = page.form('f')

  # Even though abc = "ruby mechanize", an error occurs.
  google_form.q = abc
  page = agent.submit(google_form)
  pp page
end


It doesn't seem to take the variable "abc", but works if you manually type in 'ruby mechanize' even though both are the same.

The error that appears is:

C:filename: in `block (2 levels) in <top (required)>': undefined method `text' for nil:NilClass (NoMethodError)
from C:/RailsInstaller/Ruby2.0.0/lib/ruby/gems/2.0.0/gems/mechanize-2.7.3/lib/mechanize.rb:442:in `get'
from C:/Users/victor/RubymineProjects/untitled/scraper.rb:23:in `block in <top (required)>'
from C:/Users/victor/RubymineProjects/untitled/scraper.rb:19:in `each'
from C:/Users/victor/RubymineProjects/untitled/scraper.rb:19:in `<top (required)>'
from -e:1:in `load'
from -e:1:in `<main>'


Any help would be appreciated.

## Answer:

Your stack trace says that something inside the loop that starts on line 19 of your script triggers the failure that surfaces at line 442 of mechanize.rb.

I tried your sample out in IRB and it seems to work fine:

2.2.2 :001 > require 'mechanize'
=> true
2.2.2 :002 > agent = Mechanize.new
=> #<Mechanize:...
2.2.2 :003 > page = agent.get('http://google.com/')
=> #<Mechanize::Page
...
2.2.2 :004 > google_form = page.form('f')
=> #<Mechanize::Form
...
=> ""
2.2.2 :006 > abc = "ruby mechanize"
=> "ruby mechanize"
2.2.2 :007 > google_form.q = abc
=> "ruby mechanize"
2.2.2 :008 > page = agent.submit(google_form)
=> #<Mechanize::Page
...


String#scan returns an empty array if nothing is found, so .last returns nil and calling .first on nil raises a NoMethodError. Your error is most likely happening here:

abc = list.scan(/\[([^\)]+)\]/).last.first


http://ruby-doc.org/stdlib-2.2.0/libdoc/strscan/rdoc/StringScanner.html
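
To make that failure mode concrete, here is a small sketch of what scan gives back when the pattern does and does not match. The sample strings are assumptions that mimic a stringified CSV row; they are not the asker's actual data.

```ruby
# Pattern from the question: capture whatever sits between [ and ].
bracketed = /\[([^\)]+)\]/

matched   = '["ruby mechanize"]'.scan(bracketed)  # one match with one capture group
unmatched = 'ruby mechanize'.scan(bracketed)      # no brackets, so nothing matches

p matched          # nested array: one inner array per match
p unmatched        # an empty array, not nil
p unmatched.last   # nil, because the array is empty

# Calling .first on that nil is what raises NoMethodError.
error = begin
  unmatched.last.first
rescue NoMethodError => e
  e
end
p error.class
```

If the input might not match, checking for an empty result before chaining .last.first avoids the exception.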

You can replace that with:

abc = list.scan(/\[([^\)]+)\]/).join


You'll always get a string although it may be only "".
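
As a rough illustration of that point, the sketch below compares the bracket pattern with the quote pattern from the asker's resolution. The row string is an assumed example of what one stringified CSV row might look like.

```ruby
row = '["ruby mechanize"]'  # assumed shape of one stringified CSV row

# The bracket pattern keeps the surrounding double quotes in the result.
with_quotes = row.scan(/\[([^\)]+)\]/).join

# The quote pattern captures only the text between the quotes.
keyword = row.scan(/"([^)]+)"/).join

# With no match, join on the empty array yields "" instead of raising.
no_match = 'plain text'.scan(/"([^)]+)"/).join

p with_quotes
p keyword
p no_match
```

One caveat: join concatenates every capture with no separator, so if a row ever yields several matches they come back as one run-together string.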

http://ruby-doc.org/core-2.2.0/Array.html#method-i-join
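
If the brackets and quotes come from converting each CSV row to a string (an assumption, since the import code is not shown), it may be simpler to skip the regex and read the field directly with Ruby's standard csv library. A minimal sketch, using an inline string as a stand-in for the keyword file:

```ruby
require 'csv'

# Stand-in for the keyword file. In a real script this might be
# CSV.read('keywords.csv'), where the file name is hypothetical.
csv_text = "ruby mechanize\nweb scraping\n"

# CSV.parse returns an array of rows; each row is an array of field strings.
# Take the first field of each row and drop blank lines.
keywords = CSV.parse(csv_text).map { |row| row.first.to_s.strip }.reject(&:empty?)

keywords.each do |keyword|
  # Each keyword is a plain String with no brackets or quotes,
  # so it can be assigned to a form field as-is.
  puts keyword
end
```

Each value here is already a bare string, so there is nothing left to strip before handing it to the form.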
