Newbie: form fill from CSV, append extract to file

Postby vpataca on Mon Nov 02, 2009 9:36 pm

Hello, I'm trying to combine two examples from this forum, without success. Sorry for the newbie question... The task is to use the US Census site to look up census block data repeatedly for addresses in a CSV file, one address per row.

Data in address.csv (two columns, three row example, the number is an ID and not relevant):
"3335 NE WEDGEWOOD DR","109007986"
"895 NE EMERSON DR","109007987"
"1503 NE WALDORF CIR","109007995"

The script is:

VERSION BUILD=6801021
TAB T=1
TAB CLOSEALLOTHERS
URL GOTO=http://factfinder.census.gov/servlet/AGSGeoAddressServlet?_MapEvent=&_category=&_subcategory=&_stateSelectedFromDropDown=Florida&context=ags&programYear=50%3A420&street=3335+NE+WEDGEWOOD+DR&city=palm+bay&states=Florida&zip=&geo_id=10000US120850013027006&_programYear=50&_treeId=420&_lang=en&tree_id=420&bucket_id=
CMDLINE !DATASOURCE ADDRESSES.csv
SET !DATASOURCE_COLUMNS 2
SET !LOOP 1
SET !DATASOURCE_LINE {{!LOOP}}
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:form1 ATTR=ID:tx_address CONTENT={{!COL1}}
TAG POS=1 TYPE=IMG ATTR=ALT:Go
'New page loaded
TAG POS=1 TYPE=SELECT FORM=NAME:form1 ATTR=NAME:_geo_id EXTRACT=TXT CONTENT=5
SAVEAS TYPE=EXTRACT FOLDER=* FILE=mytable_{{!NOW:yymmdd_hhmmss}}.csv

The odd URL is there because I want to load a partially filled-in form, so there is no need to repeat the city and state name; all I need to fill in is the address line (tx_address). The script works fine (well, fine enough for me) until the last line, which bombs with error -308. My difficulty is in selecting the 5th row of the output list box (_geo_id, according to the EXTRACT WIZARD) and writing it to a file (appending, actually). I'd even be happy getting the whole list box and doing some post-processing.

Two problems: Without the CONTENT=5 (the row number of the block data in the list box), nothing gets written out to the file(s). With CONTENT=5, the script dies. Second, a new file is created for each LOOP, and it would be nice to just have records appended to the same file.

My apologies in advance if I've missed something in the forum - any help gratefully accepted... Thanks!
Yes, I know LOOP is set to 1: the first data file had headers, and it was simpler to set this to 1 and leave the statement in to remind me what to do for other input files. And yes, the URL is a bit clunky (I just used a saved query so the servlet won't choke).
vpataca
 
Posts: 2
Joined: Mon Nov 02, 2009 9:16 pm

Re: Newbie: form fill from CSV, append extract to file

Postby Hannes, iOpus on Tue Nov 03, 2009 2:26 am

You can't use CONTENT and EXTRACT in the same TAG command.

If I understand correctly, you want to know how to extract a certain element on the page. Can you tell us a sample URL we can visit and explain what element you are interested in (screenshot welcome)?
Hannes, iOpus Support
Hannes, iOpus
 
Posts: 1838
Joined: Thu Sep 21, 2006 6:27 am

Re: Newbie: form fill from CSV, append extract to file

Postby vpataca on Tue Nov 03, 2009 6:41 am

Sorry about that, the URL seems to have been truncated. Try this:

http://factfinder.census.gov/servlet/AG ... 20&_sse=on

(see screen shot one attached)

Type "3335 NE WEDGEWOOD DR" in the street address field, "Palm Bay" in the city field, and select "Florida" from the drop-down menu. Leave the zip code blank; it is optional. Click the Go button. The result is screen shot two: a drop-down list box showing the geographical entities (counties, tracts, blocks and so forth) in which this address occurs. I'd like to grab the fifth line in that list (the block), write it to a file, and then repeat for the next address.

Just in case anyone else in the forum wants to know why this is useful: the census site offers a great deal of public information, and this particular function, resolving an address to a census block, is what makes it possible to link demographic information to a particular address. Unfortunately, the publicly available method of getting the census block is one-at-a-time, which is not useful for a batch of addresses, and that is just the thing iOpus is made to solve. Service firms offer this translation, but they can be extremely expensive, and since the base data is paid for by our tax dollars, I prefer to try a tool first.

Thank you again for responding so quickly - much appreciated!
Attachments
Screen One.jpg: Screen shot one - fill in the form
Screen Two.jpg: Screen shot two - scrape the list box result
vpataca
 
Posts: 2
Joined: Mon Nov 02, 2009 9:16 pm

Re: Newbie: form fill from CSV, append extract to file

Postby Hannes, iOpus on Wed Nov 04, 2009 1:37 am

Thanks for the additional information.

Duplicating the selection TAG does the trick.
The first TAG uses CONTENT to select the 5th line, the second TAG then extracts the selected entry:
TAG POS=1 TYPE=SELECT FORM=NAME:form1 ATTR=NAME:_geo_id CONTENT=4
TAG POS=1 TYPE=SELECT FORM=NAME:form1 ATTR=NAME:_geo_id EXTRACT=TXT
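
For the second problem in the original post (a new file per loop), here is a rough sketch of how the end of the macro might look with the two TAG commands in place. It assumes the same form and field names as the original script. The output file name census_blocks.csv is made up for illustration, and the ADD !EXTRACT line is an optional extra that is not part of the original script. SAVEAS TYPE=EXTRACT appends to the file if it already exists, so a fixed file name, without the {{!NOW:...}} timestamp, should give one row per loop in a single file.

```
' Fill in the street address from column 1 of the current CSV row
TAG POS=1 TYPE=INPUT:TEXT FORM=NAME:form1 ATTR=ID:tx_address CONTENT={{!COL1}}
TAG POS=1 TYPE=IMG ATTR=ALT:Go
' Optional (assumption): put the address into the extract first,
' so each output row pairs the address with its block
ADD !EXTRACT {{!COL1}}
' Select the list entry, then extract it with a separate TAG
TAG POS=1 TYPE=SELECT FORM=NAME:form1 ATTR=NAME:_geo_id CONTENT=4
TAG POS=1 TYPE=SELECT FORM=NAME:form1 ATTR=NAME:_geo_id EXTRACT=TXT
' Fixed file name (hypothetical): every loop appends to the same file
SAVEAS TYPE=EXTRACT FOLDER=* FILE=census_blocks.csv
```

If the whole list box is wanted for post-processing instead, EXTRACT=TXTALL in place of EXTRACT=TXT should return all entries of the select box, and the selection TAG can then be dropped.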
Hannes, iOpus Support
Hannes, iOpus
 
Posts: 1838
Joined: Thu Sep 21, 2006 6:27 am