Wikibooks:Reading room/Proposals/2018/March

Change from comma-based article counting
In light of recent events (described below), this community needs to discuss whether it wants to change the way content pages are "officially" counted on this wiki.

First, some background on "article counting": As described at mw:Manual:Article count and mw:Manual:$wgArticleCountMethod, MediaWiki-based wikis like this one have three possible ways of counting "content pages" (historically called "good articles" and now usually just called "articles"). All three counting methods require that a page be in a "content namespace" and not be a redirect in order to be counted as an article. The wiki may be set up to also require that the page contain at least one wikilink (the 'link' method) or at least one comma (the 'comma' method); otherwise all such pages are counted (the 'any' method). (BTW, the single-quotes are intentional, since single-quotes are used in the [large] [//noc.wikimedia.org/conf/InitialiseSettings.php.txt configuration file] that stores the relevant settings — search for the text "ArticleCountMethod" to see.)
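To make the three criteria concrete, here is a minimal sketch of the decision just described. It is not MediaWiki's actual code: the `Page` class, the `is_article` function, and the default namespace set are hypothetical names chosen for illustration, and the 'link' test is approximated by looking for `[[` in the wikitext.

```python
from dataclasses import dataclass


@dataclass
class Page:
    namespace: str            # "" stands for the main namespace (assumption for this sketch)
    text: str                 # raw wikitext of the page
    is_redirect: bool = False


def is_article(page, method, content_namespaces=frozenset({""})):
    """Return True if `page` would count as an article under `method`.

    Illustrative only: the 'link' test is approximated by a search for
    "[[" in the wikitext, which is a simplification.
    """
    # Criteria shared by all three methods.
    if page.namespace not in content_namespaces or page.is_redirect:
        return False
    if method == "link":
        return "[[" in page.text
    if method == "comma":
        return "," in page.text
    if method == "any":
        return True
    raise ValueError(f"unknown article count method: {method!r}")
```

For example, a main-namespace page containing "Hello world" counts under 'any' but not under 'comma' or 'link'.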

This wiki uses the 'comma' method and has 3 content namespaces: the main namespace (which needs no "prefix" in front of page titles) and the "Cookbook:" and "Wikijunior:" namespaces. Therefore, the count of "content pages" on this wiki (which I will continue to call the "article count", mainly out of habit) nominally includes all non-redirects in those three namespaces that contain at least one comma. I say "nominally" because, in fact, what I just said is not actually true. (!) Allow me to explain…

Many bugs related to article counting have been found and fixed in the MediaWiki code over the years, but this wiki has (probably) never been recounted to fix the article count itself. This is mainly because the two maintenance scripts that are run to fix on-wiki statistics (which are called  and  ) do not actually implement the 'comma' method at all; instead, for performance reasons, they simply check for positive page length (in addition to the usual non-redirect and content-namespace criteria). Since April 2015, the second script (to recount only articles) has been run on the 21st of each month in order to keep the article counts more-or-less correct, even if relevant bugs remain in the code. But all Wikibooks wikis have been excluded from the periodic recounting because the English and Portuguese Wikibooks use the 'comma' criterion (and are the only Wikimedia wikis to do so — FYI, the Czech Wikinews, Chinese Wikinews, Gujarati Wikisource, Polish Wikisource, Serbian Wikisource, Serbian Wikiquote, and Wikidata use the 'any' method, and all other Wikimedia wikis use the 'link' method).
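The mismatch between the two criteria can be sketched as follows. This is only an illustration under stated assumptions: the function names and the sample pages are invented, and the real maintenance scripts work against the database rather than a Python list.

```python
# Content namespaces of this wiki as described above ("" = main namespace).
CONTENT_NAMESPACES = {"", "Cookbook", "Wikijunior"}


def count_positive_length(pages):
    """Count as the recount scripts are described to: non-redirect,
    content namespace, page length greater than zero."""
    return sum(
        1
        for namespace, text, is_redirect in pages
        if namespace in CONTENT_NAMESPACES and not is_redirect and len(text) > 0
    )


def count_comma(pages):
    """Count under the 'comma' criterion: non-redirect, content namespace,
    at least one comma in the text."""
    return sum(
        1
        for namespace, text, is_redirect in pages
        if namespace in CONTENT_NAMESPACES and not is_redirect and "," in text
    )


# Hypothetical sample data: (namespace, wikitext, is_redirect)
sample_pages = [
    ("", "Intro, with a comma", False),   # counted by both
    ("", "No comma here", False),         # counted by positive-length only
    ("Cookbook", "", False),              # empty page: counted by neither
    ("", "#REDIRECT [[Intro]]", True),    # redirect: counted by neither
    ("Talk", "Hello, world", False),      # not a content namespace
]

positive_length_total = count_positive_length(sample_pages)
comma_total = count_comma(sample_pages)
```

With this sample the positive-length count is 2 and the comma count is 1, which is the kind of gap a recount would introduce on a wiki that otherwise counts by commas.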

It appears that the 'comma' method is correctly implemented when pages are saved (i.e., created or edited), but that's largely irrelevant if the article count itself has become very wrong because of past bugs. In any case, it doesn't matter anymore, because…

On February 15th and again on February 21st this wiki was recounted using the  script. What this means is, the current article count no longer reflects the comma-based criterion, nor is it possible at this point to fix it (anytime soon) so that it accurately reflects the comma-based criterion (because there is no maintenance script to recount it in that way).

As reported at Wikimedia News, the article count of this wiki changed from 57,843 (about 18 hours before the February 15th recount) to 79,075 (up to 6 hours afterwards), a 37% increase. Presumably the same thing would have happened if the 'any' method had been used, since there are no zero-length pages here. (Note that the "live" article count is .)
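As a quick check of the percentage quoted above, using only the two counts given (the variable names are just for illustration):

```python
before = 57_843   # article count before the February 15th recount
after = 79_075    # article count after the recount

increase = after - before                 # 21,232 additional pages counted
pct_increase = 100 * increase / before    # relative to the earlier count

print(f"{increase} more articles, a {pct_increase:.1f}% increase")
```

This works out to roughly 36.7%, which rounds to the 37% stated.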

So, now we find ourselves in a situation where the current article count of this wiki is (approximately) "correct" with respect to a different criterion than the one we have assumed was being used to count articles here. This count may tend to drift a bit as the wiki is edited (since page-saves still use the 'comma' criterion), but it will never match the true 'comma' based count unless someone writes some relevant code and gets it approved for use. (This may be unlikely because searching wikitext for a specific string is resource intensive.) Meanwhile, there are plans to periodically recount all stats on all Wikimedia wikis, including Wikibooks, which would mean that the count here would periodically be reset to this weird "positive-length" based count, anyway. Although the current count doesn't technically match that given by any of the official, documented counting methods, as a practical matter it approximately matches what the 'any' method would give.

So, should we switch from the 'comma' article-counting method to the 'any' method, since that is essentially the situation we find ourselves in now?

(For additional context, see the discussion at T59788 and the information at Article counts revisited.)

- dcljr (discuss • contribs) 08:48, 28 February 2018 (UTC)


 * The following two posts appear to account for the current configuration:
 * Technical Assistance reading room, 24 January–5 February 2011: Page counts are incorrect
 * Proposals reading room, 4–8 February 2011: $wgUseCommaCount
 * --Pi zero (discuss • contribs) 02:23, 1 March 2018 (UTC)


 * It looks as if we decided to switch from links to commas in February 2011 because we deemed links unsuitable here, and then the 'any' option was added later that year. I wonder whether the two events are related.  I see the technical suggestion of commas was provided to us at the time by bawolff;  do you have any thoughts on this?  --Pi zero (discuss • contribs) 13:07, 1 March 2018 (UTC)
 * Comma count method has been an issue for a long time, as it's difficult to recount. I'd recommend using the 'any' method if the community finds that acceptable. Bawolff (discuss • contribs) 22:04, 1 March 2018 (UTC)
 * . Situation clarified; I'm comfortable. --Pi zero (discuss • contribs) 18:10, 3 March 2018 (UTC)
 * . Good to see there are no objections to this yet. Be warned, though, that a decision to remove comma-based counting entirely from the software (T188472) is moving ahead faster than I had anticipated, so if anyone does object to this change, speak now… - dcljr (discuss • contribs) 00:49, 4 March 2018 (UTC)

FYI, comma-based article counting has been removed entirely from the MediaWiki software, and this wiki has been switched to use the 'any' article-counting method (T188472). - dcljr (discuss • contribs) 03:07, 8 March 2018 (UTC)