Please use this article thread for any questions. All scripts in Appendix A are examples, and you are encouraged to study and modify them according to your requirements (if you are a developer). Please use care when querying search engines.
Most of the source scripts are in Perl. Please ensure that you have installed all of the required Perl modules before running any scripts. Some of the scripts also use wget.exe (for simplicity). As websites and internet services constantly change, some parts of the scripts may need to be adjusted from time to time. I will be releasing the next script updates according to the following schedule:
January 15th, 2010
(will include Bing related changes)
March 15th, 2010
(will include any bug fixes, etc.)
You can always find the latest scripts (source code) at http://book.seowarrior.net
Thank you!
John
Hi John
First of all, let me say that you’ve done a great job on your book.
I’ve recently been going through the chapters and the code. I have two simple questions regarding chapter II.
1. stopwords.txt – What is the exact purpose of loading stopwords.txt? I’m asking whether it’s there to exclude some keywords (SEO-wise) or simply to filter out HTML code.
2. It appears that there is no space between HTML elements in the spider view output. Is that only how the script works, or is that how search engines parse the text?
Thanks
Rad
Rad,
Thanks for the comments. Let me answer your questions.
1) The purpose of the stopwords.txt is to list any common words that search engines would typically ignore. For more details, you can refer to the explanation offered at the following links:
a> http://searchenginewatch.com/2156061
b> http://en.wikipedia.org/wiki/Stop_words
2) The spider viewer script is meant only as an illustration of how search engines would view each URL. Each search engine is free to do this in its own way. With regard to the space between HTML elements, I have used no space just to make a point about the way search engines read a particular URL (HTML) serially, after stripping out the (semantically irrelevant) HTML tags.
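To make both answers concrete, here is a minimal sketch of the two ideas: reading a page's text serially with the tags stripped, and then dropping stop words. It is written in Python rather than Perl and is not the script from the book. The names `TextCollector`, `spider_view`, `remove_stopwords`, and the short `SAMPLE_STOPWORDS` list are all hypothetical, and skipping `script` and `style` content is an assumption about what a spider view might ignore.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample list; the book's stopwords.txt is not reproduced here.
SAMPLE_STOPWORDS = {"a", "an", "and", "the", "of", "to", "in", "is"}


class TextCollector(HTMLParser):
    """Collect text nodes in document order, ignoring tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip_depth = 0  # > 0 while inside <script> or <style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only visible, non-empty text.
        if self._skip_depth == 0 and data.strip():
            self.chunks.append(data.strip())


def spider_view(html, separator=""):
    """Return the page text read serially, joined by `separator`."""
    parser = TextCollector()
    parser.feed(html)
    parser.close()
    return separator.join(parser.chunks)


def remove_stopwords(text, stopwords=SAMPLE_STOPWORDS):
    """Return the lowercase words of `text` that are not stop words."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return [w for w in words if w not in stopwords]


if __name__ == "__main__":
    page = "<h1>The SEO Warrior</h1><p>Source <b>code</b> of the book</p>"
    print(spider_view(page))                  # text runs together, no spaces
    print(remove_stopwords(spider_view(page, separator=" ")))
```

With the default empty separator, the text of neighbouring elements runs together, which is the effect asked about in the second question. Passing `separator=" "` keeps the words apart so that the stop-word filter can see them individually.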
Thanks,
John
Fabulous book, by far the best (if not the only) in the “quantitative market research on the web” field; thanks for all the work you put into it.
OT comments:
Maybe the “perlold” directory in xampp.zip can be deleted?
(WP 2.9.1 is out, according to the dashboard of your web site.)
Thanks for the compliments! Comments like these make it worthwhile to have spent two years writing a book.
Also, thanks for the “perlold” tip. Yes, it can be deleted. Yes, I am aware of the WP 2.9.1 release; I am just doing the backup first.
Regards,
John
I guess it will be pretty simple to set up on Linux (Fedora, for example). Are you planning to write up a summary for this?
I haven’t tried all of the scripts on Linux, but it should be straightforward. I would probably try using XAMPP for Linux first. If there is enough interest, I might port everything to Linux. For now, I would suggest copying everything under htdocs to your Apache web folder on your Linux box (if it is not htdocs). Let me know if you run into any problems. I might actually try this on my Mac, as it is also Unix-based.