entries for looking at later. I'm not sure how many individual sites it has visited, but it's grabbed about 13,000 html documents so far, and it still has a vast number of unvisited links. Maybe you'd like to compare stats/data?

-Jon

--
Jonathon Fletcher, Information Services, Stirling University.
j.fletcher@stirling.ac.uk (X400: "/S=jf1/O=stirling/PRMD=uk.ac/C=gb/")
WWW Home Page: http://www.stir.ac.uk/~jf1

------------------------------

From: Charlie Stross
To: mkgray@MIT.EDU
Subject: the wanderer ...
Date: Wed, 15 Dec 1993 11:42:03 +0000 (GMT)

Hi there!

You mentioned in (URL: http://www.mit.edu:8001/afs/sipb/user/mkgray/ht/web-growth.html) that your knowbot The Wanderer is written in Perl. Now it just so happens that I need to write something similar, and was thinking of doing it in Perl. I know you're not keen on letting source code out, but if I beg would you mind letting me see a copy? I'm pretty much a Perl novice so I won't comment on the elegance of the program ... what I'm interested in is writing a limited knowbot-like tool suitable for the following tasks:

1. Given a URL, and a constraint (depth of traversal, domain of traversal, number of URLs traversed, fan-out of graph, or some similar metric),
   a. search the file at that URL for a regexp and save it if the regexp is found, then
   b. traverse all dependent URLs hanging off that URL, until the initial constraints are exhausted.

2. Given a URL, dump the file and all its children (subject to some constraints) to a local filesystem, optionally translating URLs so that Mosaic can be used to view the web locally.

3. Create a dependency map of all children hanging off a URL.

Basically, all these tasks sound like spin-offs of The Wanderer, and they're all suitable for automation. I'd prefer to re-use any source code available on the net for this job rather than re-invent the wheel. What do you say?

-- Charlie (SCO Open Desktop development team, Technical Publications)
--------------------------------------------------------------------------------
Charlie Stross is charless@sco.com, charlie@antipope.demon.co.uk

------------------------------

From: David Sisson
Subject: Re: Growth of the World Wide Web
To: mkgray@MIT.EDU
Date: Wed, 15 Dec 1993 16:35:31 -0500 (EST)

Your Wanderer program sounds real neat. I was planning to write a mirror program at some point (to steal neat HTML documents and ensure faster lookup/reliability for access). But what I really had questions about were things like: if I change my home page from gopher.vt.edu to www.vt.edu, would it catch it? Would it be easy to identify who still uses gopher.vt.edu? Also, why does it have gopher.vt.edu:80 and gopher.vt.edu:10021 as separate home pages? I've got one standard home page plus two extra servers (10021 and 10020) to do specific add-ons. I also didn't find connections to the nearby computer science home pages I reference from mine (fox.cs.vt.edu:80 and vtopus.cs.vt.edu:80). You may have run the program before I added those to my home page, though. Hope these suggestions help make Wanderer a little better. Enjoy your holidays!

BTW, there is a DOS/Windows job opening up around here (interviews end the 7th of January) working with information systems. DOS/Windows work probably dies after a year, then migrates into Unix responsibilities. I'm going to post this one to misc.jobs.offered later today in case you're interested. We're doing lots of work with WWW and other information systems. We even have apartments down here with Ethernet (not directly school related) in 'em!
This is a job with the school though. -- daves@vt.edu (Dave Sisson)
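------------------------------

Charlie Stross's task list above maps naturally onto a small queue-driven crawler. What follows is a hedged sketch of task 1 only (depth-limited traversal that saves any page matching a regexp). It is not the Wanderer's code: it leans on the modern LWP and URI Perl modules, and the script name, user-agent string, and saved_* file naming are all invented for the example.

#!/usr/bin/perl
# mini-wanderer.pl -- illustrative sketch, not the actual Wanderer.
# Usage: perl mini-wanderer.pl <start-url> <regexp> [max-depth]
use strict;
use warnings;
use LWP::UserAgent;
use URI;

my ($start, $pattern, $max_depth) = @ARGV;
$max_depth = 2 unless defined $max_depth;

my $ua    = LWP::UserAgent->new(agent => 'mini-wanderer/0.1');
my %seen;                      # URLs already visited
my @queue = ([$start, 0]);     # [url, depth] pairs, breadth-first

while (my $item = shift @queue) {
    my ($url, $depth) = @$item;
    next if $seen{$url}++ or $depth > $max_depth;

    my $resp = $ua->get($url);
    next unless $resp->is_success;
    my $html = $resp->decoded_content;

    # Task 1a: save the document if the regexp matches.
    if ($html =~ /$pattern/) {
        (my $file = $url) =~ s/[^\w.-]+/_/g;
        open my $fh, '>', "saved_$file" or die "saved_$file: $!";
        print $fh $html;
        close $fh;
    }

    # Task 1b: queue the children, crudely pulled out of href attributes.
    while ($html =~ /<a\s[^>]*href\s*=\s*["']?([^"'\s>]+)/gi) {
        my $child = URI->new_abs($1, $url)->canonical->as_string;
        push @queue, [$child, $depth + 1] if $child =~ m{^http}i;
    }
}

Tasks 2 (local mirroring with URL rewriting) and 3 (a dependency map) would reuse the same visited-hash-plus-queue skeleton, changing only what is done with each fetched page.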
------------------------------

To: mkgray@MIT.EDU
Subject: Mosaic/WWW
Date: Wed, 15 Dec 93 16:43:53 EST
From: Keith Morgan

I have just spent a couple of hours playing with the SIPB WWW stuff and wanted to tell you how enjoyable and impressive it was. I was just looking up IAP times and found lots more. One question: when I click on Dilbert on your home page, should it run, or do I have to run it through another viewer? It didn't work for me, but Doctor Fun did. Anyway, WWW and Mosaic are great.

Keith Morgan
Dewey Library

------------------------------
Date: Wed, 15 Dec 93 13:55:37 PST
Reply-To: robg@halcyon.com
From: robg@halcyon.com
To: mkgray@MIT.EDU
Subject: Your Web research

This is a very interesting piece of analysis. I'd be interested in learning more about your methodology. Do you have anything written up that summarizes it? Also, do you know whether similar analyses have been done regarding gopher, WAIS, and other net data resources?

Also, at the end you (perhaps jokingly) suggest you might be interested in a job that allowed you to do net-related analysis and work. In a few different capacities (easier to explain verbally than over the phone) I might be able to either hire you or line you up with people who might be able to. If you're interested in this, please give me a call at (206) 447-0567.

Rob Glaser

------------------------------

Date: Wed, 15 Dec 1993 17:26-0500
From: John C. Mallery
Subject: 623 is pretty small
To: mkgray@MIT.EDU

Gopher has 1200 sites listed on its "all world-wide gophers" page. There are over 2 million hosts on the Internet, with perhaps 40 million email users worldwide. We're ramping up an intelligent information infrastructure project here at the AI Lab. How about coming over and chatting sometime soon? I suspect we have some quite strongly overlapping interests (e.g. pointer GC on the net).

John Mallery
Artificial Intelligence Laboratory, NE43-797
Massachusetts Institute of Technology
545 Technology Square
Cambridge, Massachusetts 02139-4301
(617) 253-5966

PS Why not clean up some of the typos in your WWW page so you don't look too much like a techie?

------------------------------
Date: Mon, 27 Sep 93 13:16:46 -0600
From: rdaniel@acl.lanl.gov (Ronald E. Daniel)
To: webmaster@MIT.EDU
Subject: Weather map

Hi, a compliment and a bug report on the weather map. It is really nice, and I look forward to using it. The problem is that I can't do much with it. Clicking in the northern half of New Mexico gets me the Denver, CO forecast; clicking in the southern half gets the Phoenix, AZ forecast. Could you please pass along my request to at least make Albuquerque (ABQ) the response for the state of NM? Picking up the forecasts for Santa Fe (SAF) would be nice too, since it is much closer to Los Alamos than Albuquerque is. Thanks in advance, and once again the map is really cool.

Ron Daniel Jr.             email: rdaniel@acl.lanl.gov
Advanced Computing Lab     voice: (505) 665-7453
MS B-287 TA-3 Bldg. 2011   fax: (505) 665-4939
Los Alamos National Lab    tautology: "Conformity is very popular"
Los Alamos, NM, 87545

------------------------------

To: webmaster@MIT.EDU
From: Stephan Deutsch stephan.deutsch@germany.eu.net
Subject: Click on weathermap
Date: Fri, 12 Nov 1993 19:18:09 +0100

Dear Administrators, I have a question. You have a weather map on your WWW server from which one can obtain specific information by clicking. I have seen this capability several other times, and from the document source I always gathered that ISMAP is used as a token in the HTML code. Searching through the WWW/HTML documentation I never found this keyword (maybe I'm blind or looking in the wrong places ;-). Would you please be so kind as to point me to the right document for doing such a thing? Thank you very much in advance!

Best Regards,
Stephan Deutsch

BTW: Your weather map is real fun, and we showed this service at the Frankfurt book fair several times to demonstrate the possibilities of the Internet and what services are offered by organisations on the net. The response was great :-) You should add the names of the people who do this stuff so that those looking at the work can see its creators.

Stephan Deutsch, Public Relations -- sd@Germany.EU.net
EUnet Deutschland GmbH, Emil-Figge-Str. 80, 44227 Dortmund
Connecting Europe since 1982 -- Tel. (Fax) +49 231 972 2222 (1111)

------------------------------

Date: Fri, 10 Dec 93 09:44 EST
From: larryk@computone.com (larry kollar)
To: webmaster@MIT.EDU
Subject: Weather Gateway

The weather gateway is a GREAT idea! Here are the bugs I've run into so far:

- I get frequent timeout errors. It may be a system load issue, so I'm not sure what you could do about it.
- In the weather map, clicking on Atlanta returns weather for Memphis. Also, clicking on Ft. Myers (FL) returns weather for Orlando. I haven't been able to get anything but timeout errors on Tampa.

Don't get discouraged -- the weather map is going to be one of the best things about WWW once you've shaken out the bugs. I imagine that anyone showing off the Web to a newbie would use it as the first example of "what can you do with it?" I know what a bitch it is getting things set up -- we're almost done with a Web server here and I'm doing most of the work. I sure wouldn't tackle anything as ambitious as an interactive weather map as a first project. :-) I plan to add image maps to our server in the future, though.

Larry Kollar, Senior Technical Writer  | email: larryk@computone.com
Computone Inc, Roswell, GA             | "You help your country by investing
Disclaimer: I just write the manuals!  |  in the future, not by waving flags."
Check out our World-Wide Web server, http://www.computone.com/

------------------------------

Date: Mon, 13 Dec 93 09:22:15 -0500
From: mbr@bellcore.com (Mark Rosenstein)
To: mkgray@MIT.EDU
Subject: W4 - other stats
Reply-To: mbr@bellcore.com (Mark Rosenstein)

Matthew: I was reading through http://www.mit.edu:8001/afs/sipb/user/mkgray/ht/web-growth.html and was wondering if you also had number-of-documents stats as well as number of sites (also, though this would be harder, a measure of the size of the docs). W4 is a totally Cool idea.

Mark.
=-=-=-= "Ignorance is the mother of Adventure" --Hagar the Horrible

------------------------------

Date: Tue, 14 Dec 93 15:49:41 CST
From: jon@balder.us.dell.com (Jon Boede)
To: mkgray@MIT.EDU
Subject: Weronica :-)

It would be interesting if your WWW Wanderer would collect the title from each html document that it comes across ... you could then index that document and put it up on the Web the way Veronica exists for gopherspace. You've probably already thought of this, but I thought I'd just say hi. :-)

Jon
--
Jon Boede, jon@dell.com, +1 512 728-4802
Engineering, Dell Computer Corp. Server OS Development, Austin, TX
"When I was 10, mean old man Miller's house burned down. We put home plate where his toilet once stood -- his garden became our center field... and in these ways the laws of karma were revealed."

------------------------------

From: Michael James Gebis mjg51721@uxa.cso.uiuc.edu
Subject: Growth of the
To: mkgray@MIT.EDU
Date: Tue, 14 Dec 1993 16:45:03 -0600 (CST)

I just browsed your "Growth of the World Wide Web" document that was recently posted on the What's New with NCSA Mosaic page. It's cool, but here's another set of statistics to add:

http://www.cen.uiuc.edu/~mg7932/web.use.html

This is from the network information center; it's a measure of traffic on the NSFNet to each port. I've collected several months' worth of data. The originals can be found at:

ftp://nic.merit.edu/nsfnet/statistics/

It's another indication of just how quickly the Web is growing.

--
Mike Gebis m-gebis@uiuc.edu
Mean people suck.
http://www.cen.uiuc.edu/~mg7932/mike.html

------------------------------

To: mkgray@MIT.EDU
Subject: W4
Date: Wed, 15 Dec 1993 10:39:59 MET
From: Kevin Laws kevin@oc3s-emh1.army.mil

So you've created a World Wide Web Wanderer? I was thinking about that... with just a little work to interface it with WAIS and a database, you could create an index of all World Wide Web documents (similar to Veronica for Gopher). That would be incredibly useful when trying to do World Wide Web-based research. All you would have to do is retrieve every document (which you already do), strip out the tags, and index it along with its title and an anchor around the title. The database would be very large, and your system administrators may not appreciate having it on their machine. However, I'm sure that if you just created the software, somebody would be willing to run it (like the 4 or 5 sites that run Veronica today). Plus you would be hailed as a savior in the WWW community! I know you are just doing this in your spare time, and I think what you've done so far is an impressive contribution to Web research. This is just an idea.

-- Kevin

------------------------------

Date: Wed, 15 Dec 1993 12:16:46 +0100
From: Girardin Luc girardin@heisun1.unige.ch
To: mkgray@MIT.EDU
Subject: Table of Wanderer

Hi! I had a look at your document about the growth of the World Wide Web.
It's really interesting to see how, and how fast, the Web is growing. I hope you will continue to provide this service. However, your list doesn't include my World-Wide Web server (http://heiwww.unige.ch/). Your Wanderer probably didn't discover it because it is only referenced by about 10 servers. My server contains about 50 documents, and I'm considering a project in political science that will give access to 10 GB of documents on the Web. I'm sure that the Web will continue to grow very fast.

Sincerely,
-------------------------------------------------------
Luc Girardin, System manager, Computer center
The Graduate Institute for International Studies
Institut Universitaire de Hautes Etudes Internationales
Avenue de la Paix 11A, CH-1202 Geneve, Switzerland
Phone: +41 (22) 734.89.50  Fax: +41 (22) 733.30.49
Internet: girardin@hei.unige.ch
WWW: http://heiwww.unige.ch/girardin/

------------------------------

Subject: W4 - the Wanderer
To: mkgray@MIT.EDU
Date: Wed, 15 Dec 93 10:25:59 GMT
From: Jonathon Fletcher j.fletcher@stirling.ac.uk

Matthew, I just read the growth thing in the 'What's New' page for Mosaic and saw you had a net wanderer. I too have a net wanderer, written in C, using Berkeley socket code (or rather, HP's implementation - the two are not likely to be identical). I'm currently running a 'wander', if you like to call it that - it's been running since Sunday night. It collects html references and titles, size, and entries for looking at later.
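------------------------------

Kevin Laws's and Jon Boede's suggestion above -- fetch each document, strip out the tags, and index the text together with its title -- is easy to prototype alongside the crawler. The fragment below is a hedged sketch, not part of the Wanderer: LWP::Simple, the web-index.txt flat file, and the index_document helper are assumptions made for illustration, and a real index would feed WAIS or a database rather than a text file.

#!/usr/bin/perl
# index-sketch.pl -- illustrative only; see the caveats above.
# Usage: perl index-sketch.pl <url> [<url> ...]
use strict;
use warnings;
use LWP::Simple qw(get);

sub index_document {
    my ($url) = @_;
    my $html = get($url) or return;

    # Keep the title, plus an anchor around it as Kevin suggests.
    my ($title) = $html =~ m{<title[^>]*>(.*?)</title>}is;
    $title = '(untitled)' unless defined $title;

    # Crude tag stripping -- adequate for 1993-era HTML, not for modern pages.
    (my $text = $html) =~ s/<[^>]*>/ /gs;
    $text =~ s/\s+/ /g;

    # Append to a flat index that WAIS, grep, or a database loader could consume.
    open my $idx, '>>', 'web-index.txt' or die "web-index.txt: $!";
    print $idx qq{<a href="$url">$title</a>\n$text\n\n};
    close $idx;
}

index_document($_) for @ARGV;

Run over the list of URLs a wanderer has already collected, something along these lines would give the Web a rough equivalent of Veronica for gopherspace.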