# Ruby Mechanize form input field text

## Question:

Tag: ruby,csv,automation,web-scraping,mechanize

Resolved: the line abc = list.scan(/\[([^\)]+)\]/).last.first was correct, but the captured value also included the quotes, which the website's search form did not accept. I corrected it to abc = list.scan(/\"([^)]+)\"/).join.

Thanks for all the help.
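
To illustrate the difference between the two patterns mentioned in the resolution, here is a minimal sketch. It assumes each raw entry is a string shaped like `["ruby mechanize"]` (for example, a CSV row converted to text); the sample value and the names `entry`, `with_quotes`, and `without_quotes` are illustrative assumptions, not taken from the original script.

```ruby
# Hypothetical raw entry: a bracketed, quoted keyword.
entry = '["ruby mechanize"]'

# Bracket-based pattern: captures everything between [ and ], quotes included.
with_quotes = entry.scan(/\[([^\)]+)\]/).last.first

# Quote-based pattern: captures only the text between the double quotes.
without_quotes = entry.scan(/\"([^)]+)\"/).join

p with_quotes
p without_quotes
```

Using `p` rather than `puts` makes any stray quote characters visible, since `p` prints the string's `inspect` form.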

I have to automate a search using a list of 100 keywords stored in a CSV file.
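
One possible way to load such a list is sketched below, assuming the file holds one keyword per row in the first column. The inline `csv_text` is made-up sample data standing in for the real file; with a file on disk, `CSV.read(path)` returns the same array-of-rows structure.

```ruby
require 'csv'

# Stand-in for the real file contents (hypothetical sample data).
csv_text = "ruby mechanize\nruby csv\n\nweb scraping\n"

# Each parsed row is an Array of fields, so take the first field of every row
# and drop rows that turn out to be empty.
keywords = CSV.parse(csv_text)
              .map { |row| row.first.to_s.strip }
              .reject(&:empty?)

keywords.each { |keyword| puts keyword }
```

Because every row is already an array of plain strings, this approach would not need a regular expression to pull the keyword out.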

With Mechanize, I can submit the search using this example (http://mechanize.rubyforge.org/GUIDE_rdoc.html):

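agent = Mechanize.new
page = agent.get('http://google.com/')
google_form = page.form('f')
google_form.q = 'ruby mechanize'
page = agent.submit(google_form)
pp page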


However, when I loop through the CSV file, it returns an error (in this example, the first CSV entry would be 'ruby mechanize'):

# I have already imported the CSV list; now it loops through the array "raw_list"

raw_list.each do |list|
  abc = list.scan(/\[([^\)]+)\]/).last.first

  # I tested "puts abc", which returned "ruby mechanize", so I don't understand why the rest of this doesn't work

  agent = Mechanize.new
  page = agent.get('http://google.com/')
  google_form = page.form('f')

  # Even though abc = "ruby mechanize", an error occurs.
  google_form.q = abc
  page = agent.submit(google_form)
  pp page
end


It doesn't seem to take the variable "abc", but works if you manually type in 'ruby mechanize' even though both are the same.
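
A quick way to check whether two values that print alike really are the same string is to compare them directly and look at their `inspect` output. The sketch below assumes, as the resolution at the top of the question suggests, that the extracted value carried literal quote characters; both sample strings are illustrative.

```ruby
typed     = 'ruby mechanize'
extracted = '"ruby mechanize"' # hypothetical value that still carries quotes

# Direct comparison and length difference expose hidden characters.
same        = (typed == extracted)
extra_chars = extracted.length - typed.length

p typed
p extracted
puts "equal? #{same}, extra characters: #{extra_chars}"
```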

The error that appears is:

C:filename: in `block (2 levels) in <top (required)>': undefined method `text' for nil:NilClass (NoMethodError)
from C:/RailsInstaller/Ruby2.0.0/lib/ruby/gems/2.0.0/gems/mechanize-2.7.3/lib/mechanize.rb:442:in `get'
from C:/Users/victor/RubymineProjects/untitled/scraper.rb:23:in `block in <top (required)>'
from C:/Users/victor/RubymineProjects/untitled/scraper.rb:19:in `each'
from C:/Users/victor/RubymineProjects/untitled/scraper.rb:19:in `<top (required)>'
from -e:1:in `load'
from -e:1:in `<main>'


Any help would be appreciated.

## Answer:

Your stack trace says the error is raised at line 442 of mechanize.rb, reached from line 23 of your script, inside the each block that starts on line 19.

I tried your sample out in IRB and it seems to work fine:

2.2.2 :001 > require 'mechanize'
=> true
2.2.2 :002 > agent = Mechanize.new
=> #<Mechanize:...
2.2.2 :003 > page = agent.get('http://google.com/')
=> #<Mechanize::Page
...
2.2.2 :004 > google_form = page.form('f')
=> #<Mechanize::Form
...
=> ""
2.2.2 :006 > abc = "ruby mechanize"
=> "ruby mechanize"
2.2.2 :007 > google_form.q = abc
=> "ruby mechanize"
2.2.2 :008 > page = agent.submit(google_form)
=> #<Mechanize::Page
...


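String#scan returns an empty array when nothing matches, so .last returns nil and the chained .first call then fails. That makes this line the likely source of the problem:

abc = list.scan(/\[([^\)]+)\]/).last.first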


http://ruby-doc.org/stdlib-2.2.0/libdoc/strscan/rdoc/StringScanner.html

You can replace that with:

abc = list.scan(/\[([^\)]+)\]/).join


This always yields a string, although it may be the empty string "".

http://ruby-doc.org/core-2.2.0/Array.html#method-i-join
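
The no-match behaviour can be checked directly with a short sketch. It assumes a hypothetical entry that contains no brackets; the variable names are illustrative only.

```ruby
pattern = /\[([^\)]+)\]/
entry   = 'ruby mechanize' # hypothetical entry without brackets

matches = entry.scan(pattern) # no match gives an empty array
last    = matches.last        # nil for an empty array

# Chaining .first onto that nil raises NoMethodError.
chained_error =
  begin
    matches.last.first
    nil
  rescue NoMethodError => e
    e.class
  end

joined = matches.join # join never raises here; it returns an empty string

p matches, last, chained_error, joined
```

With `join`, the caller can test the result with `empty?` and skip the search for that entry instead of crashing mid-loop.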
