Wikipedia:Version 1.0 Editorial Team


This page in a nutshell: We are publishing selections of Wikipedia articles for offline use in Kiwix releases, in print, or for distribution on flash drives/memory cards. Most of the work is organized off wiki.

Shortcuts: WP:1.0, WP:1

This is the home page of the Version 1.0 Editorial Team, a broad group of people on the English Wikipedia who work to produce offline versions of content. This content is used in places where internet access is limited. It includes (but is not limited to) English Wikipedia articles. The Wikipedia content assessment scheme was originally developed here to allow us to put together article selections, and it is still managed by members of this group through the WP 1.0 bot, although the main users of that scheme are Wikipedia editors and WikiProjects.

General background

Article ratings assessment scheme

In late 2003, Wikipedia co-founder Jimmy Wales proposed making an offline release version of Wikipedia. This group was formed in late 2004 to meet this challenge. Our work involves identifying and organizing articles, and improving and maintaining a core set. Our work does not hinder the existing wiki process for creating and editing articles, but rather supports that work by providing additional organization. We aim to produce collections that can be used in places where internet coverage is expensive or non-existent. Our early collections were distributed via DVD; now they are shared via download, then distributed on hardware such as a Raspberry Pi. Originally, only a fixed selection was available, but there is now much flexibility in how selections can be made. This project is now mainly one point in a network of groups that collect and distribute open educational resources from the Internet in an offline form.

Our approach diverged from earlier Wikipedia collections developed by the German offline project and by Wikimedia Polska, both of which involved a large group of volunteers checking over a snapshot at the time of release. On the English Wikipedia, we elected to set up an assessment scheme in advance, with the idea that collections could be assembled using quality & importance information compiled by WikiProjects. The assessment scheme originated as a manual tool for WikiProjects to organise articles in their subject area, and (once automated) this is the main application for article assessment.

See these more detailed related articles:

How you can help

You are encouraged to join us and help out with one of the projects, or to discuss Wikipedia 1.0 on the talk page. A significant part of our work centers around maintaining the assessment scheme, which is now used on more than seven million articles by over 1,000 active WikiProjects on the English Wikipedia. It is also being used on other language projects. Generally, work on this team is sporadic – periods of hectic activity followed by long periods of waiting! Often work is long and tedious – checking through a list of 22,000 instances of profanities one by one, organizing 10,000 keywords taken from category names, or dealing with technical bugs when the assessment bot fails for no apparent reason. However, it is all worth it in the end.

Current needs

Technical needs

We work extensively with Kiwix, who maintain this GitHub page, and they have this list of WP1 issues they are working on. If you have tech skills and know how to solve some of these problems, please contact the Kiwix people via GitHub or via the Kiwix main site. One of the most important goals for us at WP1.0 is the project to create a new list builder module, coordinated by User:Audiodude.

On Wikipedia

Although we have much of the requisite system automated, there are still some outstanding tasks:

Out in the real world

The main goal here is to help with distribution, especially in remote or undeveloped areas without Internet access. The main target audiences are:

Please let us know on the Talk page if you can help with any of these, and be sure to ping the user in your message to make sure they see your request.

A page read offline in Kiwix

Status

At present, the main activities are:

To select articles, we are mainly using a bot-assisted selection process based on assessment by individual WikiProjects, where articles are selected automatically based on quality and importance project rankings.
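As a rough illustration of selection driven by quality and importance ratings, the sketch below ranks articles by a combined score and keeps the top few. The rank values, the `importance_weight` parameter, the `Assessment` type, and the `select_articles` function are all hypothetical; this is not the formula used by the actual selection process, which is not described on this page.

```python
from dataclasses import dataclass

# Hypothetical ordinal ranks; these are assumptions made for illustration only.
QUALITY_RANK = {"Stub": 1, "Start": 2, "C": 3, "B": 4, "GA": 5, "A": 6, "FA": 7}
IMPORTANCE_RANK = {"Low": 1, "Mid": 2, "High": 3, "Top": 4}


@dataclass(frozen=True)
class Assessment:
    title: str
    quality: str
    importance: str


def score(a: Assessment, importance_weight: int = 2) -> int:
    """Combine quality and importance into one number (assumed weighting)."""
    quality = QUALITY_RANK.get(a.quality, 0)        # unknown class scores 0
    importance = IMPORTANCE_RANK.get(a.importance, 0)
    return quality + importance_weight * importance


def select_articles(assessments, limit: int):
    """Return the titles of the `limit` highest-scoring articles."""
    # Sort by descending score, then by title so ties are deterministic.
    ranked = sorted(assessments, key=lambda a: (-score(a), a.title))
    return [a.title for a in ranked[:limit]]


articles = [
    Assessment("Water", "B", "Top"),
    Assessment("Obscure village", "Stub", "Low"),
    Assessment("Oxygen", "FA", "High"),
]
print(select_articles(articles, 2))  # ['Oxygen', 'Water']
```

Raising `importance_weight` favours broadly important topics over polished but niche ones, which is the kind of trade-off a selection of limited size has to make.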

RevID selection

Based on discussions (at the 2017 Potsdam hackathon and since), we plan to reactivate RevID selection. Previously, code based on WikiTrust was used in Version 0.8, and this appeared to produce a largely vandalism-free collection of articles. It worked by scoring each RevID based on the edits remaining in it, and choosing the most "trustworthy" recent RevID according to the WikiTrust algorithm.
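The idea of choosing a trustworthy recent revision can be sketched as follows. This is only an illustration under stated assumptions: the trust scores are taken as precomputed inputs (WikiTrust's own scoring is not reproduced here), and the `Revision` type, the `pick_revision` function, the `min_trust` threshold, and the fallback rule are all hypothetical.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Revision:
    rev_id: int
    timestamp: str  # ISO 8601 UTC string, so string order matches time order
    trust: float    # assumed precomputed trust score between 0 and 1


def pick_revision(revisions: Sequence[Revision],
                  min_trust: float = 0.8) -> Optional[Revision]:
    """Pick the newest revision whose trust score meets the threshold.

    If no revision is trusted enough, fall back to the one with the
    highest trust score (newest first on ties). Both rules are assumptions.
    """
    if not revisions:
        return None
    trusted = [r for r in revisions if r.trust >= min_trust]
    if trusted:
        return max(trusted, key=lambda r: r.timestamp)
    return max(revisions, key=lambda r: (r.trust, r.timestamp))


history = [
    Revision(101, "2011-01-05T10:00:00Z", 0.92),
    Revision(102, "2011-02-01T09:30:00Z", 0.88),
    Revision(103, "2011-02-20T18:45:00Z", 0.35),  # e.g. recent vandalism
]
print(pick_revision(history).rev_id)  # 102
```

In this example the newest revision (103) is skipped because its score is below the threshold, so the snapshot would use revision 102 instead.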

Offline Wikipedia projects

Active offline projects

If you would like to start a new project, please discuss it on the talk page first before adding it here.

Wikipedia 1.0 Projects
| Name | Summary of overall strategy | Coordinator | Description of activities |
|---|---|---|---|
| Work via WikiProjects (WVWP) | Use "networking" to mobilise our existing subject specialists on wiki | User:Walkerma | Organise and facilitate compilation of article lists from the WikiProjects and seek to identify important topics within each WikiProject's area of expertise. Locate important topics that are currently not being managed by projects. In conjunction with WP:COUNCIL, the project serves as a link with the editing community, and may later help locate expert reviewers. |
| m:WikiMed encyclopedia | Create a high-quality medical encyclopedia as a free, offline resource for use in places where resources and internet are limited. | Doc James on MDWiki | Our goal is to make clear, reliable, comprehensive, up-to-date education resources and information in the biomedical and related social sciences freely available to all people in the language of their choice, online and off. While we are working on resources for all people, we are particularly interested in developing material useful for those in low and middle income countries. |

Past releases

Past Wikipedia 1.0 Projects
| Name | Summary of overall strategy | Month of release | Description of activities | Website | Next release |
|---|---|---|---|---|---|
| Version 0.5 | A test release prior to release of Version 1.0 above. | April 2007 | A test release designed to pave the way for Version 1.0. Used manual nominations and approval based on importance and quality. Approval was by only one person, from the review team. | Okawix site now inactive | Version 0.7 |
| Version 0.7 | A test release of automated article selections, prior to release of Version 1.0 above. | Early 2010 | A test release designed to pave the way for Version 1.0. Used SelectionBot to make an article selection based on importance and quality. Vandalism prevention used a script, with manual checks, which delayed the release significantly. [1] | ZIM download | Version 0.8 |
| Version 0.8 | A test release of automated article selections, prior to release of Version 1.0 above. | March 2011 | A test release designed to pave the way for Version 1.0. Version 0.8 used bot-assisted article selection, with manual adjustments based on feedback from WikiProjects. Used as a test of the WikiTrust revisionID selection code; this worked well. | Wikipedia:Version 0.8/downloads | Version 0.9 |
| 2006 Wikipedia CD Selection (previously called "Test Version") | Work with release version done off site that was coordinated by BozMo | April 2006 | 2000 articles with content filtered/selected for use by children (see Wikipedia:Wikipedia CD Selection). | No longer available - see 2008/9 release below | 2007 Wikipedia CD Selection (below) |
| 2007 Wikipedia CD Selection | Work with release version done off site that was coordinated by BozMo | May 2007 | 4655 articles with content filtered/selected for use by children (see Wikipedia:Wikipedia CD Selection). | No longer available - see 2008/9 release below | 2008/9 Wikipedia CD Selection (below) |
| 2008/9 Wikipedia CD Selection | Work with release version done off site that was coordinated by BozMo | October 2008 | 5502 articles with content filtered/selected for use by children (see Wikipedia:Wikipedia CD Selection). | http://schools-wikipedia.org | Not yet known |

Inactive projects

Inactive Wikipedia 1.0 Projects
| Name | Summary of overall strategy | Coordinator | Description of activities |
|---|---|---|---|
| School selection | Put together selections of 1–10 GB sizes for use in high schools and elementary schools | User:Walkerma and others | Uses new code that starts with a seed and works outward, guided by the WP 1 selection ranking |

Publishing steps

The process of generating an offline version of a sub-selection of Wikipedia articles is multistage. It requires many dedicated, single-purpose operations. The following chart shows how the WP1 project envisioned things in 2010.

The general process for producing an offline release

Although this chart is still largely valid, we practice and envision things slightly differently nowadays. One of the most important paradigm changes we had to make was to remove as much manual human activity as possible, because the amount of work is simply too large to complete in a reasonable amount of time. We now tend to automate as much of the process as possible. As a consequence, the project is now predominantly focused on technology.

Technical approach

Support WikiProject assessment effort

The first software created to support the WP1 project was User:WP_1.0_bot. It was first written in Perl by User:CBM and then slightly modified and maintained by a few other volunteers. In 2020, the bot was completely rewritten in Python by User:Audiodude, following modern development practices (API, automated tests, etc.). The code base is available and developed on GitHub.

The WP1 bot had, and still has, three traditional purposes:

Select article titles

...

Select article revision

...

Scrape selected articles for offline usage

...

Orchestrate periodic and multiple scraping

...

Publish and distribute offline snapshots

...

Statistics

The WP 1.0 bot tracks assessment data (article quality and importance data for individual WikiProjects) assigned via Talk page banners. If you would like to add a new WikiProject to the bot's list, please read the instructions at Wikipedia:Version 1.0 Editorial Team/Using the bot.

The global summary table below is computed by taking the highest quality and importance rating for each assessed article in the main namespace.

All articles by quality and importance
Rows are quality classes; columns are importance ratings.

| Quality | Top | High | Mid | Low | ??? | Total |
|---|---|---|---|---|---|---|
| FA | 1,658 | 2,621 | 2,498 | 2,136 | 187 | 9,100 |
| FL | 193 | 688 | 788 | 699 | 99 | 2,467 |
| A | 375 | 701 | 808 | 613 | 111 | 2,608 |
| GA | 3,488 | 7,977 | 16,051 | 22,286 | 1,916 | 51,718 |
| B | 18,174 | 35,729 | 59,935 | 84,993 | 28,478 | 227,309 |
| C | 18,085 | 58,847 | 150,063 | 372,999 | 108,185 | 708,179 |
| Start | 18,772 | 95,736 | 436,758 | 1,823,947 | 446,961 | 2,822,174 |
| Stub | 4,042 | 30,627 | 273,946 | 2,901,742 | 790,505 | 4,000,862 |
| List | 5,258 | 18,442 | 57,245 | 223,292 | 99,942 | 404,179 |
| Assessed | 70,045 | 251,368 | 998,092 | 5,432,707 | 1,476,384 | 8,228,596 |
| Unassessed | 113 | 395 | 967 | 12,796 | 316,993 | 331,264 |
| Total | 70,158 | 251,763 | 999,059 | 5,445,503 | 1,793,377 | 8,559,860 |

Articles in this table may be listed in multiple projects. The counts, especially the total article count, are not a count of the total number of articles in the English Wikipedia.
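The aggregation rule stated for the global summary table (take the highest quality and the highest importance rating for each assessed article) can be sketched as follows. This is a minimal illustration, not the bot's actual code: the `summarize` and `best` functions and the input format are hypothetical, and the ranking of classes simply follows the row and column order of the table, which is an assumption.

```python
from collections import Counter

# Assumed ranking, best first, taken from the table's row and column order.
QUALITY_ORDER = ["FA", "FL", "A", "GA", "B", "C", "Start", "Stub", "List", "Unassessed"]
IMPORTANCE_ORDER = ["Top", "High", "Mid", "Low", "???"]


def best(values, order):
    """Return the value that appears earliest in `order` (the highest rating)."""
    return min(values, key=order.index)


def summarize(ratings):
    """Count articles per (quality, importance) cell.

    `ratings` is an iterable of (title, quality, importance) tuples, one per
    WikiProject rating. Each article is counted once, using its highest
    quality and its highest importance across all projects.
    """
    per_article = {}
    for title, quality, importance in ratings:
        qualities, importances = per_article.setdefault(title, ([], []))
        qualities.append(quality)
        importances.append(importance)

    counts = Counter()
    for qualities, importances in per_article.values():
        cell = (best(qualities, QUALITY_ORDER), best(importances, IMPORTANCE_ORDER))
        counts[cell] += 1
    return counts


ratings = [
    ("Oxygen", "GA", "High"),   # rated by one project
    ("Oxygen", "B", "Top"),     # rated differently by another project
    ("Obscure village", "Stub", "???"),
]
print(summarize(ratings)[("GA", "Top")])  # 1
```

Note that quality and importance are maximised independently, so an article can land in a cell that no single project assigned to it, as "Oxygen" does here.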


Notes

  1. ^ Per Wikipedia:Version 1.0 Editorial Team/Release Version Criteria
