[Tutor] [ann] CGI Link Checker 0.1
Adayapalam Appaiah Kumaraswamy kumanna at myrealbox.com
Mon Jul 12 08:13:52 CEST 2004
- Previous message: [Tutor] Please critique my temperature_conversion.py
- Next message: [Tutor] Please critique my temperature_conversion.py
- Messages sorted by: [ date ] [ thread ] [ subject ] [ author ]
Dear Python users, I am new to Python. As I learnt a bit more on coding in Python, I decided to try out a simple project: to write a CGI script in Python to check links on a single HTML page on the web. Although I am just a hobby programmer, I thought I could show it to others and ask for their comments and suggestions. It is my first CGI script as well as my first Python application.
I looked around the net, but found only a few link-checking resources related to Python. So, I thought I could write a no-frills one myself.
BTW the W3C Link Checker is written in Perl. I don't know Perl, so I couldn't look at it for ideas.
I am working on a slow dial-up connection. I had to face the following problems:
1. Delayed responses for large pages: I worked around this by flushing sys.stdout after every three links checked; that might lead to inefficiency, but it does throw the results three at a time to the impatient user. Otherwise, the Python interpreter would wait until the output buffer is full before dumping it to the web server's output.
2. Slow: I don't know how to make the script perform better. I've tried to look into the code to make it run faster, but I couldn't do so. Also, I think the hosting server's bandwidth may contribute to this. Still, it takes only about 5 to 10 seconds more than the W3C validator for very large pages, and 2 to 3 seconds more for smaller ones. Your results may vary; I'd love to know what you see.
3. HTML parsing: I have made no attempt to (and I do not propose to) check pages with incorrect HTML/XHTML. This means that if the Python HTMLParser fails, my script exits gracefully. An example of invalid HTML is www.yahoo.com.
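For readers who want to see the shape of points 1 and 3, here is a minimal sketch, not the code from the script itself. It assumes Python 3's html.parser module rather than the HTMLParser module available in 2004, and the names LinkExtractor, extract_links and report_in_batches are made up for illustration. It collects href values from anchor tags and writes results to an output stream, flushing after every three lines. It does not fetch any URLs (the status strings are placeholders supplied by the caller), and it does not reproduce the exit-on-invalid-HTML behaviour described above.

```python
import sys
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; tag names arrive lowercased.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def extract_links(html_text):
    """Return the href values found in html_text, in document order."""
    parser = LinkExtractor()
    parser.feed(html_text)
    parser.close()
    return parser.links


def report_in_batches(results, out=None, batch_size=3):
    """Write one line per (url, status) pair, flushing every batch_size lines.

    Returns the number of flush calls made, including the final one.
    """
    if out is None:
        out = sys.stdout
    flushes = 0
    for count, (url, status) in enumerate(results, start=1):
        out.write(f"{url}: {status}\n")
        if count % batch_size == 0:
            out.flush()  # push this batch to the web server now
            flushes += 1
    out.flush()  # push out any remaining partial batch
    return flushes + 1
```

In a CGI setting, out would simply be sys.stdout, and the (url, status) pairs would come from whatever check is run on each link returned by extract_links.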
Finally, since this is my first Python program, I might not have properly adapted to the style of programming experienced Python users may be accustomed to. So, I request you to please correct me in this regard as well.
In all, it was a good experience, and gave me more than a glimpse of the power offered by Python.
Please read the instructions on the page before entering your URL to test the script. You can spawn the script from:
http://kumar.travisbsd.org/pyprogs/example.html
Personally, I have tried the following sites with this script:
http://www.w3.org/ - Works 100% perfect.
http://www.yahoo.com/ - Invalid HTML. Exits gracefully.
Source code only (meaning without the fancy images and CSS I have used): http://kumar.travisbsd.org/pyprogs/cgilink.txt
If you want to try hosting the script on your own server, get this and see the README (this includes all the images and fancy CSS): http://kumar.travisbsd.org/pyprogs/cgilink-0.1.tar.gz
Thank you. Kumar
-- Adayapalam Appaiah Kumaraswamy (Kumar Appaiah)
Web: http://www.ee.iitm.ac.in/~ee03b091/
1, Balaji Apartments, 32, Third Street, East Abhiramapuram, Mylapore, Chennai - 600004 India