-
You provide the URL of the seed page in Box 1. The seed page can contain any number
of URLs: it could be an ordinary web page, or it could be a Google or Alta Vista results page.
To work with a Google or Alta Vista page, first run the search in Google or Alta
Vista so that the list of results is displayed. Copy the URL shown at the top of the
page and paste it into Box 1 of Dig Deeper.
-
You can enter multiple search URLs, provided they are separated by commas and the
final URL also has a comma after it.
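
The comma rule above can be sketched in Python as follows. This is only an illustration, not Dig Deeper's actual code: the function name split_seed_urls is hypothetical, and the sketch assumes that a URL never contains a comma of its own.

```python
def split_seed_urls(box1_text: str) -> list[str]:
    """Split the Box 1 text into individual seed URLs.

    Assumptions (not confirmed by Dig Deeper itself): every URL,
    including the last one, is followed by a comma, and no URL
    contains a comma of its own.
    """
    if not box1_text.strip().endswith(","):
        raise ValueError("the final URL must be followed by a comma")
    # Splitting on commas leaves an empty piece after the trailing comma,
    # so empty pieces are dropped.
    pieces = [piece.strip() for piece in box1_text.split(",")]
    return [piece for piece in pieces if piece]
```

Requiring the trailing comma means every URL is ended the same way, which is one plausible reason for the rule.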
-
Click button 1 to process the seed URL.
-
The URLs are sorted alphabetically and listed for you in Box 2.
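
As a rough sketch of this step, the links on a seed page could be collected and sorted with Python's standard library as shown here. The names LinkCollector and list_links are hypothetical, and two details are assumptions: relative links are resolved against the seed URL, and the sort ignores upper and lower case.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects the href of every <a> tag on a page."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Assumption: relative links are made absolute using the seed URL.
                self.links.append(urljoin(self.base_url, value))


def list_links(page_html: str, base_url: str) -> list[str]:
    """Return the page's links in alphabetical order (case-insensitive)."""
    collector = LinkCollector(base_url)
    collector.feed(page_html)
    return sorted(collector.links, key=str.lower)
```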
-
Now you can set Dig Deeper to visit each page in the list and bring back the text.
-
If you continue by pressing button 2, each link in the list is visited and its text
is written to the screen in Box 3.
Note: I have used regular expressions to filter out just the text
for display, but where a page's HTML is heavily interleaved with JavaScript,
some code will be returned along with the text. Also, if the linked page uses HTML frames,
no content is returned.
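
The exact regular expressions used by Dig Deeper are not listed here, so the following is only a minimal sketch of the general approach: remove script and style blocks, remove comments, remove the remaining tags, then tidy the whitespace. The function name extract_text and all of the patterns are assumptions.

```python
import html
import re

# Hypothetical patterns; the real filters in Dig Deeper may differ.
SCRIPT_OR_STYLE = re.compile(r"<(script|style)\b.*?</\1\s*>", re.IGNORECASE | re.DOTALL)
COMMENT = re.compile(r"<!--.*?-->", re.DOTALL)
TAG = re.compile(r"<[^>]+>")
WHITESPACE = re.compile(r"\s+")


def extract_text(page_html: str) -> str:
    """Reduce an HTML page to its visible text using regular expressions."""
    text = SCRIPT_OR_STYLE.sub(" ", page_html)  # drop script/style blocks whole
    text = COMMENT.sub(" ", text)               # drop HTML comments
    text = TAG.sub(" ", text)                   # drop every remaining tag
    text = html.unescape(text)                  # turn &amp; etc. back into characters
    return WHITESPACE.sub(" ", text).strip()    # collapse runs of whitespace
```

A sketch like this also shows why some code can slip through: a script block that is never closed does not match the first pattern, so its contents are treated as ordinary text.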
-
Certain URLs are filtered out: links to Word documents, PDF documents or graphic
files are not returned. Some links point to pages that no longer exist or do not
resolve to a proper URL; in these cases a message is usually displayed, stating
why.
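
A filter of this kind might look like the sketch below. The list of file extensions, the wording of the messages and the name skip_reason are all assumptions made for illustration; the check for pages that no longer exist is left out because it needs a live request.

```python
import posixpath
from typing import Optional
from urllib.parse import urlparse

# Assumed extensions for Word, PDF and graphic files.
SKIPPED_EXTENSIONS = {".doc", ".pdf", ".gif", ".jpg", ".jpeg", ".png", ".bmp"}


def skip_reason(url: str) -> Optional[str]:
    """Return a message saying why a URL is skipped, or None if it can be visited."""
    parts = urlparse(url)
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return "not a proper http(s) URL"
    # Only the path is checked, so a query string such as ?x=1 is ignored.
    extension = posixpath.splitext(parts.path)[1].lower()
    if extension in SKIPPED_EXTENSIONS:
        return f"{extension} files are filtered out"
    return None
```

For example, skip_reason("mailto:someone@example.com") returns a message, while an ordinary page address returns None and would go on to be visited.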
-
If an error page is displayed, one of the linked pages is returning text
that causes the program to fail. You can send us the seed URL so that we can fix
it.
Example search URLs, Alta Vista and then Google, searching on art prizes in Australia:
http://www.altavista.com/web/results?q=%22australian+art+prizes%22&mik=photo&mik=graphic&mip=all&mis=all&miwxh=all&stq=10,
http://www.google.com.au/search?q=%22art+award%22+OR+%22art+prize%22+OR+%22sculpture+prize%22+OR+%22sculpture+award%22+AND+2006&num=100&hl=en&lr=&cr=countryAU&newwindow=1&start=100&sa=N,
Here is a normal URL with links:
http://www.discoverymedia.com.au/artzinePub/index.asp
Enter the full URLs from the source page, separated by commas. It is important
that the last URL has a comma after it as well.
Box 1 - Enter the search strings separated by commas.
Box 2 - URLs from the target page
Box 3 - Text from the links