It also uses a lot of CPU, which seems to indicate there is something really inefficient going on there.
Description
Details
Subject | Repo | Branch | Lines +/- | |
---|---|---|---|---|
Add SubsetMessageGroup | mediawiki/extensions/Translate | master | +206 -0 | |
Add tag cache to MediaWikiTopMessageGroup | translatewiki | master | +3 -1 |
Event Timeline
To reproduce: MessageGroupStats::forGroup( 'core-0-mostused', MessageGroupStats::FLAG_NO_CACHE );
Job queue was stuck again, so I used xdebug profiling to investigate. Over half of the time was spent in MediaWikiTopMessageGroup->getTags.
Change 966508 had a related patch set uploaded (by Nikerabbit; author: Nikerabbit):
[translatewiki@master] Add tag cache to MediaWikiTopMessageGroup
Here's a screenshot of time profile with my suggested fix, from live environment (note that this is an incomplete run, I did Ctrl+C after 30 seconds or so).
Change 966508 merged by jenkins-bot:
[translatewiki@master] Add tag cache to MediaWikiTopMessageGroup
New bottleneck is getDefinitions:
Still takes 30 minutes which is more than I'd like.
Change 966515 had a related patch set uploaded (by Nikerabbit; author: Nikerabbit):
[mediawiki/extensions/Translate@master] Add SubsetMessageGroup
Change 966515 merged by jenkins-bot:
[mediawiki/extensions/Translate@master] Add SubsetMessageGroup
cat debug.php MessageGroupStatsRebuildJob::newRefreshGroupsJob( [ 'core-0-mostused' ] )->run(); time php /srv/mediawiki/workdir/maintenance/eval.php < debug.php ******************************************************************************* NOTE: Do not run maintenance scripts directly, use maintenance/run.php instead! Running scripts directly has been deprecated in MediaWiki 1.40. It may not work for some (or any) scripts in the future. ******************************************************************************* real 0m31.487s user 0m17.516s sys 0m0.654s
30 seconds is much faster than 30 minutes (or multiple hours before the first improvement).
Found one instance from the logs how long the job would run:
Oct 17 10:45:07 web2.translatewiki.net php[211581]: 2023-10-17 08:45:07 MessageGroupStatsRebuildJob Special:MainPage cleargroups=["wikimedia-advanced","ext-0-wikimedia","ext-timedmediahandler-user","core-0-mostused","mediawiki","wikimedia-main","ext-0-all","ext-timedmediahandler"] namespace=-1 title=MainPage requestId=7d88edddbed20efa22c22c81 (id=89289724,timestamp=20231017084507) STARTING Oct 17 22:16:40 web2.translatewiki.net php[211581]: 2023-10-17 20:16:40 MessageGroupStatsRebuildJob Special:MainPage cleargroups=["wikimedia-advanced","ext-0-wikimedia","ext-timedmediahandler-user","core-0-mostused","mediawiki","wikimedia-main","ext-0-all","ext-timedmediahandler"] namespace=-1 title=MainPage requestId=7d88edddbed20efa22c22c81 (id=89289724,timestamp=20231017084507) t=41493097 good
It's not direct comparison, but the running time of other things is marginal compared to this one, so roughly speaking we improved the job running time from 11 and half hours 30 seconds. Or relative speaking 99.93% improvement.