
Run fixInconsistentBoards on Wikidata
Closed, ResolvedPublic

Description

@Infopetal reports:

I had the Structured Discussions beta enabled on my Wikidata talk page and attempted to disable it in my Wikidata preferences upon seeing that it is being deprecated but seem to have broken something. I now see the error:

The Structured Discussions workflow is not associated with this page.

[9a2892fd-32e3-4536-a174-37a85066dedc] 2024-10-04 18:16:13: Fatal exception of type "Flow\Exception\InvalidDataException"

I am unable to view the talk page history or make any changes

Per https://www.mediawiki.org/wiki/Talk:Structured_Discussions/Deprecation#%22Fatal_exception_of_type_'Flow\Exception\InvalidDataException'%22

This is due to T371769. The broken page is https://www.wikidata.org/wiki/User_talk:Infopetal

Event Timeline

Restricted Application added a subscriber: Aklapper.

Following up on T371769#10209219, I checked @DLynch's suspicion that this issue is caused by a stale cache. Just like in T371582#10034787, I checked both what the Flow logic claims about the title and what the database says; see the logs below:

[urbanecm@mwmaint2002 ~]$ mwscript shell.php wikidatawiki
DEPRECATION WARNING: Maintenance scripts are moving to Kubernetes. See
https://wikitech.wikimedia.org/wiki/Maintenance_scripts for the new process.
Maintenance hosts will be going away; please submit feedback promptly if
maintenance scripts on Kubernetes don't work for you. (T341553)
Psy Shell v0.12.3 (PHP 7.4.33 — cli) by Justin Hileman
> $c = require "$IP/extensions/Flow/container.php";
= Flow\Container {#6616}

> $f = $c['factory.loader.workflow']
= Flow\WorkflowLoaderFactory {#9166}

> $f->createWorkflowLoader(Title::newFromText('User_talk:Infopetal'))

   Flow\Exception\InvalidDataException  Flow workflow is for different page.

> $wikipage = \MediaWiki\MediaWikiServices::getInstance()->getWikiPageFactory()->newFromTitle(Title::newFromText('User_talk:Infopetal'))
= WikiPage {#9050}

> $content = $wikipage->getContent()
= Flow\Content\BoardContent {#9115}

> $workflowId = $content->getWorkflowId()
= Flow\Model\UUID {#9114}

> $storage = $c['storage']
= Flow\Data\ManagerGroup {#6607}

> $workflow = $storage->getStorage('Workflow')->get($workflowId)
= Flow\Model\Workflow {#9060}

> $workflow->getArticleTitle()->getPrefixedText()
= "User talk:Infopetal/Structured Discussions Archive 1"

> ^D

   INFO  Ctrl+D.

[urbanecm@mwmaint2002 ~]$
wikiadmin2023@10.192.32.134(flowdb)> select * from flow_workflow where workflow_page_id=114557683 and workflow_type='discussion'\G
*************************** 1. row ***************************
                   workflow_id: $&;+
                 workflow_wiki: wikidatawiki
            workflow_namespace: 3
              workflow_page_id: 114557683
           workflow_title_text: Infopetal/Structured_Discussions_Archive_1
                 workflow_name: 
workflow_last_update_timestamp: 20241004180443
              workflow_user_id: NULL
           workflow_lock_state: 0
        workflow_definition_id: NULL
              workflow_user_ip: NULL
            workflow_user_wiki: NULL
                 workflow_type: discussion
1 row in set (0.206 sec)

wikiadmin2023@10.192.32.134(flowdb)>

In this case, the DB is consistent with what the code reports. This means a stale cache is not what is causing this error.

I messaged @DLynch on Slack to see if there is any other investigation that should happen while we have a known broken board. I'll run the script as soon as I hear back from him. Sorry for the delay in fixing this – it is an opportunity for us to gather more information about why this bug is happening (see also T371769).

Reading through the original report, there is one thing that caught my attention. Infopetal said (emphasis mine):

I had the Structured Discussions beta enabled on my Wikidata talk page and attempted to disable it in my Wikidata preferences upon seeing that it is being deprecated

This means Infopetal/Structured_Discussions_Archive_1 is the new title. In that case, it would be the page_title field that would be out of date:

wikiadmin2023@10.192.16.216(wikidatawiki)> select * from page where page_id=114557683\G
*************************** 1. row ***************************
           page_id: 114557683
    page_namespace: 3
        page_title: Infopetal
  page_is_redirect: 0
       page_is_new: 1
       page_random: 0.978932955143
      page_touched: 20230802135322
page_links_updated: 20230802140427
       page_latest: 1926106923
          page_len: 1
page_content_model: flow-board
         page_lang: NULL
1 row in set (0.000 sec)

wikiadmin2023@10.192.16.216(wikidatawiki)>

Is it possible that the transaction to change page_title was not committed for some reason, thus causing this bug? (Of course, the real question is why that would happen.)
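To make the comparison between the two query results explicit, here is a minimal sketch in Python. It is not Flow code: the function name and the returned labels are made up for illustration, and it assumes both titles are given as database keys (underscores, no namespace prefix), as they appear in `page_title` and `workflow_title_text`.

```python
def compare_titles(core_title: str, flow_title: str) -> str:
    """Classify how a core page title relates to the Flow workflow title.

    Hypothetical helper: both arguments are DB keys without the namespace,
    e.g. the values of page.page_title and flow_workflow.workflow_title_text.
    """
    # Treat spaces and underscores as equivalent, as MediaWiki titles do.
    core = core_title.replace(" ", "_")
    flow = flow_title.replace(" ", "_")

    if core == flow:
        return "consistent"
    if flow.startswith(core + "/"):
        # Flow already records the archive subpage, core still has the old title.
        return "flow_is_subpage_of_core"
    if core.startswith(flow + "/"):
        # Core already records the archive subpage, Flow still has the old title.
        return "core_is_subpage_of_flow"
    return "unrelated"


# Values taken from the two query results above.
print(compare_titles("Infopetal", "Infopetal/Structured_Discussions_Archive_1"))
```

With the values from this task, the Flow row is ahead of the core row, which matches the reading that the core half of the move did not stick.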

And the core part of the move isn't logged either:

https://www.wikidata.org/wiki/Special:Log?type=&user=&page=User_talk%3AInfopetal&wpdate=&tagfilter=&wpfilters%5B%5D=newusers&wpFormIdentifier=logeventslist

Presumably running fixInconsistentBoards will get the page to a working state, as if the move had never happened, and then someone can try to move the Flow board on-wiki again, maybe with better logging this time.

Another interesting piece of logs, from around the time of the move, logged in the Flow channel:

2024-10-04 18:04:45.106111 [1edb3151-dedd-4868-93b5-94c1ad665a6e] mw-web.codfw.main-6dd9755db7-kjt6x wikidatawiki 1.43.0-wmf.25 Flow ERROR: Flow\Import\OptInController::initiateChange failed to disable Flow on 'User talk:Infopetal' for user 'Infopetal'. Failed creating revision at User talk:Infopetal/Archive 1 {trace} {"action":"disable","talkpage":"User talk:Infopetal","user":"Infopetal","message":"Failed creating revision at User talk:Infopetal/Archive 1"} 
[Exception Flow\Import\ImportException] (/srv/mediawiki/php-1.43.0-wmf.25/extensions/Flow/includes/Import/OptInController.php:319) Failed creating revision at User talk:Infopetal/Archive 1
  #0 /srv/mediawiki/php-1.43.0-wmf.25/extensions/Flow/includes/Import/OptInController.php(588): Flow\Import\OptInController->createRevision(MediaWiki\Title\Title, string, string)
  #1 /srv/mediawiki/php-1.43.0-wmf.25/extensions/Flow/includes/Import/OptInController.php(178): Flow\Import\OptInController->removeArchiveTemplateFromWikitextTalkpage(MediaWiki\Title\Title)
  #2 /srv/mediawiki/php-1.43.0-wmf.25/extensions/Flow/includes/Import/OptInController.php(110): Flow\Import\OptInController->disable(MediaWiki\Title\Title)
  #3 /srv/mediawiki/php-1.43.0-wmf.25/includes/deferred/MWCallableUpdate.php(52): Flow\Import\OptInController->Flow\Import\{closure}(string)
  #4 /srv/mediawiki/php-1.43.0-wmf.25/includes/deferred/DeferredUpdates.php(460): MediaWiki\Deferred\MWCallableUpdate->doUpdate()
  #5 /srv/mediawiki/php-1.43.0-wmf.25/includes/deferred/DeferredUpdates.php(204): MediaWiki\Deferred\DeferredUpdates::attemptUpdate(MediaWiki\Deferred\MWCallableUpdate)
  #6 /srv/mediawiki/php-1.43.0-wmf.25/includes/deferred/DeferredUpdates.php(291): MediaWiki\Deferred\DeferredUpdates::run(MediaWiki\Deferred\MWCallableUpdate)
  #7 /srv/mediawiki/php-1.43.0-wmf.25/includes/deferred/DeferredUpdatesScope.php(243): MediaWiki\Deferred\DeferredUpdates::MediaWiki\Deferred\{closure}(MediaWiki\Deferred\MWCallableUpdate, int)
  #8 /srv/mediawiki/php-1.43.0-wmf.25/includes/deferred/DeferredUpdatesScope.php(172): MediaWiki\Deferred\DeferredUpdatesScope->processStageQueue(int, int, Closure)
  #9 /srv/mediawiki/php-1.43.0-wmf.25/includes/deferred/DeferredUpdates.php(310): MediaWiki\Deferred\DeferredUpdatesScope->processUpdates(int, Closure)
  #10 /srv/mediawiki/php-1.43.0-wmf.25/includes/MediaWikiEntryPoint.php(674): MediaWiki\Deferred\DeferredUpdates::doUpdates()
  #11 /srv/mediawiki/php-1.43.0-wmf.25/includes/MediaWikiEntryPoint.php(496): MediaWiki\MediaWikiEntryPoint->restInPeace()
  #12 /srv/mediawiki/php-1.43.0-wmf.25/includes/MediaWikiEntryPoint.php(454): MediaWiki\MediaWikiEntryPoint->doPostOutputShutdown()
  #13 /srv/mediawiki/php-1.43.0-wmf.25/includes/MediaWikiEntryPoint.php(209): MediaWiki\MediaWikiEntryPoint->postOutputShutdown()
  #14 /srv/mediawiki/php-1.43.0-wmf.25/index.php(58): MediaWiki\MediaWikiEntryPoint->run()
  #15 /srv/mediawiki/w/index.php(3): require(string)
  #16 {main}

It seems that when a move happens as a result of changing beta preferences, Flow catches any exceptions and re-logs them in its own channel. Unfortunately, I cannot find any record of a similar exception for the T371582 case. However, an exception would certainly explain a hypothetical uncommitted transaction to the core database (at least in this case).
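The hypothesis can be illustrated with a small, self-contained simulation. This is only a sketch under loud assumptions: it uses two in-memory SQLite databases as stand-ins for the core and Flow databases, the table layouts are reduced to two columns, and the order of the two updates is assumed, not taken from the Flow or MediaWiki source. It says nothing about how MediaWiki actually manages transactions.

```python
import sqlite3


def simulate_move(fail_in_core: bool) -> tuple[str, str]:
    """Return (core title, flow title) after a simulated board move."""
    core = sqlite3.connect(":memory:")  # stand-in for the wiki's core DB
    flow = sqlite3.connect(":memory:")  # stand-in for flowdb
    core.execute("CREATE TABLE page (page_id INTEGER, page_title TEXT)")
    flow.execute(
        "CREATE TABLE flow_workflow (workflow_page_id INTEGER, workflow_title_text TEXT)"
    )
    core.execute("INSERT INTO page VALUES (1, 'Example')")
    flow.execute("INSERT INTO flow_workflow VALUES (1, 'Example')")
    core.commit()
    flow.commit()

    new_title = "Example/Structured_Discussions_Archive_1"
    try:
        # Assumption: the Flow-side update commits on its own connection.
        with flow:
            flow.execute(
                "UPDATE flow_workflow SET workflow_title_text = ? "
                "WHERE workflow_page_id = 1",
                (new_title,),
            )
        # The `with` block commits on success and rolls back on an exception.
        with core:
            core.execute(
                "UPDATE page SET page_title = ? WHERE page_id = 1", (new_title,)
            )
            if fail_in_core:
                raise RuntimeError("simulated failure after the title update")
    except RuntimeError:
        pass  # mirrors an exception being caught and only logged

    core_title = core.execute("SELECT page_title FROM page").fetchone()[0]
    flow_title = flow.execute(
        "SELECT workflow_title_text FROM flow_workflow"
    ).fetchone()[0]
    return core_title, flow_title


print(simulate_move(fail_in_core=True))
```

When the simulated failure fires, the core row keeps the old title while the Flow row has the new one, which is the same shape of mismatch seen in the queries on this task.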

It would be nice if it logged what the error was.

Unfortunately, the code seems to clobber it by throwing a nondescript exception in the event of a bad status: https://github.com/wikimedia/mediawiki-extensions-Flow/blob/master/includes/Import/OptInController.php#L318
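As a rough illustration of the general idea (keeping the status details instead of discarding them), here is a Python sketch. The `Status` and `ImportException` classes are simplified stand-ins invented for this example; this is not the Flow code and not necessarily what any patch on this task does.

```python
from dataclasses import dataclass, field


@dataclass
class Status:
    """Hypothetical stand-in for a status object: a flag plus error messages."""
    ok: bool
    errors: list[str] = field(default_factory=list)


class ImportException(Exception):
    """Stand-in exception type for this sketch."""


def raise_if_failed(status: Status, title: str) -> None:
    """Raise an exception that carries the status errors, not just the title."""
    if status.ok:
        return
    detail = "; ".join(status.errors) or "no error details available"
    raise ImportException(f"Failed creating revision at {title}: {detail}")


try:
    raise_if_failed(Status(False, ["simulated edit conflict"]), "User talk:Example/Archive 1")
except ImportException as e:
    print(e)
```

With the details appended to the message, a log entry like the one above would show why the revision could not be created, not only that it failed.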

It's worth pointing out that the code that threw an exception here was specific to the Beta feature disabling, whereas the previous case was a manual move.

Change #1081239 had a related patch set uploaded (by Urbanecm; author: Urbanecm):

[mediawiki/extensions/Flow@master] OptInController: Log more details about failures to create revision

https://gerrit.wikimedia.org/r/1081239

It's worth pointing out that the code that threw an exception here was specific to the Beta feature disabling, whereas the previous case was a manual move.

Yeah, this seems to be the exact opposite of the previous issue: it's a failed move that did update the workflow page title, as opposed to a successful move that didn't update the workflow page title.

Mentioned in SAL (#wikimedia-operations) [2024-10-17T18:48:00Z] <urbanecm> mwscript-k8s --comment=T377360 -f -- extensions/Flow/maintenance/FlowFixInconsistentBoards.php --wiki=wikidatawiki # T377360

Urbanecm_WMF edited projects, added Growth-Team (Current Sprint); removed Growth-Team.

Thanks all for the quick investigation here. Considering there's nothing else we can learn from this instance, I went ahead and ran the script; see logs:

[urbanecm@deploy2002 ~]$ mwscript-k8s --comment=T377360 -f -- extensions/Flow/maintenance/FlowFixInconsistentBoards.php --wiki=wikidatawiki
⏳ Starting extensions/Flow/maintenance/FlowFixInconsistentBoards.php on Kubernetes as job mw-script.codfw.n7lpjss7 ...
⏳ Waiting for the container to start...
🚀 Job is running.
📜 Streaming logs:

Checked a total of 300 Flow boards.  Of those, 0 boards had an inconsistent title; 0 were fixed.

Checked a total of 600 Flow boards.  Of those, 0 boards had an inconsistent title; 0 were fixed.
   
Checked a total of 900 Flow boards.  Of those, 0 boards had an inconsistent title; 0 were fixed.

Checked a total of 1200 Flow boards.  Of those, 0 boards had an inconsistent title; 0 were fixed.
INCONSISTENT: Core title for 'v73q4742vi3wb49x' is 'User talk:Slowking4/Structured Discussions Archive 1', but Flow title is 'User talk:Slowking4'
ERROR: 'User talk:Slowking4/Structured Discussions Archive 1' has page ID '66908402', but no workflow is linked to this page ID

Checked a total of 1500 Flow boards.  Of those, 1 boards had an inconsistent title; 0 were fixed.

Checked a total of 1800 Flow boards.  Of those, 1 boards had an inconsistent title; 0 were fixed.

Checked a total of 2100 Flow boards.  Of those, 1 boards had an inconsistent title; 0 were fixed.

Checked a total of 2400 Flow boards.  Of those, 1 boards had an inconsistent title; 0 were fixed.
INCONSISTENT: Core title for 'xhja8b509c3jo898' is 'User talk:TNLHK', but Flow title is 'User talk:TNLHK/Structured Discussions Archive 1'
FIXED: Updated 'xhja8b509c3jo898' to match core title, 'User talk:TNLHK'
INCONSISTENT: Core title for 'xkz7rt74tip8xtxm' is 'User talk:Infopetal', but Flow title is 'User talk:Infopetal/Structured Discussions Archive 1'
FIXED: Updated 'xkz7rt74tip8xtxm' to match core title, 'User talk:Infopetal'
INCONSISTENT: Core title for 'xmi6m34du659htx5' is 'User talk:V0lkanic', but Flow title is 'User talk:V0lkanic/Structured Discussions Archive 1'
FIXED: Updated 'xmi6m34du659htx5' to match core title, 'User talk:V0lkanic'

Checked a total of 2523 Flow boards.  Of those, 4 boards had an inconsistent title; 3 were fixed.
[urbanecm@deploy2002 ~]$

I then manually moved the page where it is supposed to be – no exception this time. I also moved the three other pages to the new name, and this succeeded with no issues. With that, calling this resolved.
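For readers unfamiliar with the script, the behaviour visible in the log can be sketched roughly as follows. This Python sketch is inferred from the log output only, not from the script's source: the `Board` class, the `workflow_linked_to_page` flag and the function name are hypothetical, and the real script may decide what is fixable differently.

```python
from dataclasses import dataclass


@dataclass
class Board:
    workflow_id: str
    core_title: str   # title according to the core page table
    flow_title: str   # title according to the Flow workflow row
    workflow_linked_to_page: bool = True  # assumption: precondition for a fix


def check_boards(boards, batch_size=300):
    """Compare titles, fix what can be fixed, and collect log-style lines."""
    checked = inconsistent = fixed = 0
    lines = []

    def summary():
        return (f"Checked a total of {checked} Flow boards.  Of those, "
                f"{inconsistent} boards had an inconsistent title; {fixed} were fixed.")

    for board in boards:
        checked += 1
        if board.core_title != board.flow_title:
            inconsistent += 1
            lines.append(f"INCONSISTENT: Core title for '{board.workflow_id}' is "
                         f"'{board.core_title}', but Flow title is '{board.flow_title}'")
            if board.workflow_linked_to_page:
                # The core title wins: the Flow row is updated to match it.
                board.flow_title = board.core_title
                fixed += 1
                lines.append(f"FIXED: Updated '{board.workflow_id}' to match "
                             f"core title, '{board.core_title}'")
            else:
                lines.append(f"ERROR: '{board.core_title}' has no workflow "
                             f"linked to its page ID")
        if checked % batch_size == 0:
            lines.append(summary())
    if checked % batch_size != 0:
        lines.append(summary())  # final partial batch
    return inconsistent, fixed, lines
```

The point to note is the direction of the fix: the core title is treated as authoritative, which is why the affected boards ended up back at their original titles and had to be moved again afterwards.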

It's worth pointing out that the script left behind a broken page at https://www.wikidata.org/wiki/User_talk:Slowking4/Structured_Discussions_Archive_1. Despite the errors, https://www.wikidata.org/wiki/User_talk:Slowking4 somehow works, so maybe the broken page can just be deleted.

Or maybe that would make things worse and it's best to leave well enough alone until Flow is actually undeployed and it becomes moot.

Change #1081239 merged by jenkins-bot:

[mediawiki/extensions/Flow@master] OptInController: Log more details about failures to create revision

https://gerrit.wikimedia.org/r/1081239