Tuesday, 2024-07-02

*** ykarel__ is now known as ykarel10:37
opendevreviewAurelio Jargas proposed zuul/zuul-jobs master: Add ensure-poetry role  https://review.opendev.org/c/zuul/zuul-jobs/+/92228613:54
opendevreviewMonty Taylor proposed zuul/zuul-jobs master: Hook poetry into ensure-python and build-python-release  https://review.opendev.org/c/zuul/zuul-jobs/+/92309415:00
opendevreviewMonty Taylor proposed zuul/zuul-jobs master: Add ensure-poetry role  https://review.opendev.org/c/zuul/zuul-jobs/+/92228615:02
opendevreviewMonty Taylor proposed zuul/zuul-jobs master: Hook poetry into ensure-python and build-python-release  https://review.opendev.org/c/zuul/zuul-jobs/+/92309415:02
jrosseri have one particular job using ubuntu-jammy-32GB, and when things get busy it ends up with NODE_FAILURE. I saw this once before and now it's happening again today https://zuul.opendev.org/t/openstack/build/d72396ac4299433a9b2d63a31cc564c815:21
clarkbjrosser: yes only one cloud is currently offering that node type iirc and if it can't boot the node you get a node failure15:22
clarkbthe nested virt labels have the same sort of limitation, which is why we encourage people to use them sparingly15:22
jrosserwould it not wait for a node?15:22
clarkbjrosser: the way nodepool works in our configuration it will attempt to boot the node three times in a cloud before moving to another cloud. If all clouds fail to boot the node you get node failure. With this node type only one cloud can boot the label so only three attempts are made. Nodepool will only wait if the cloud is accurately providing quota information back to nodepool15:24
clarkbindicating that there isn't enough quota remaining. However clouds can fail to boot nodes for a variety of reasons15:24
clarkbno valid host found is a common one where clouds are near enough to capacity and quota isn't conservative enough15:24
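
The launch behaviour clarkb describes can be sketched as a small simulation. This is a minimal illustration, not nodepool's actual code: the provider names `cloud-a` and `cloud-b`, the `request_node` function, and the `try_boot` callback are all hypothetical, and the quota-aware waiting case is deliberately left out.

```python
from typing import Callable, Dict, Set

ATTEMPTS_PER_PROVIDER = 3  # "three times in a cloud", per the explanation above
NODE_FAILURE = "NODE_FAILURE"


def request_node(
    label: str,
    providers: Dict[str, Set[str]],
    try_boot: Callable[[str, str], bool],
) -> str:
    """Return the provider that booted the node, or NODE_FAILURE.

    Providers that do not offer the label are skipped entirely, so a label
    offered by a single provider only ever gets three boot attempts.
    """
    for provider, labels in providers.items():
        if label not in labels:
            continue
        for _ in range(ATTEMPTS_PER_PROVIDER):
            if try_boot(provider, label):
                return provider
    return NODE_FAILURE


# Hypothetical setup: only one provider offers the large label.
providers = {
    "cloud-a": {"ubuntu-jammy"},
    "cloud-b": {"ubuntu-jammy", "ubuntu-jammy-32GB"},
}
attempts = []


def always_fails(provider: str, label: str) -> bool:
    """Simulate a cloud that cannot boot the node (e.g. no valid host found)."""
    attempts.append((provider, label))
    return False


result = request_node("ubuntu-jammy-32GB", providers, always_fails)
print(result, len(attempts))
```

With only one provider carrying the label, three failed boots are enough to end in a node failure; a label carried by several providers would get three attempts in each before giving up.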
jrosserclarkb: might this be one of those cases where the potential number of nodes and the quota are mismatched?15:30
clarkbjrosser: yes potentially. It could be the cloud simply has less capacity today than it did in the past. I would have to look at logs to get a better sense of this though. It's also possible the cloud is flaky at booting nodes (though I suspect that isn't the case)15:31
fungikeep in mind we're running flat out at capacity in nodepool for normal node types at the moment due to fixes for a very large/complicated openstack security advisory, not sure if that might be related15:32
fungihttps://grafana.opendev.org/d/21a6e53ea4/zuul-status15:33
clarkbfungi: I don't think the cloud providing the large nodes provides any of the regular flavors15:33
clarkbit isn't likely to be related if that is true15:33
fungiyeah, that's why i was unsure15:33
fungiunless there's some interaction on the nodepool launcher side breaking requests to other providers, which seems unlikely as well15:34
jrossergrafana suggests that vexxhost-ca-ymq-1 has a max nodes of 72 but in practice it never exceeds 5015:37
fungithat can also be in part due to mixing and matching of different memory sizes for nodes if we're doing 16g and 32g there (i don't recall)15:37
fungiso there might be memory quota for 72x 16g nodes but not 72x 32g15:38
clarkbya the main thing to check would be if the node failures are a result of no valid host found errors or similar15:39
clarkbif so we can probably dial back the max servers value15:39
fungiin situations like that nodepool relies on the quota information the cloud supplies to determine available capacity for satisfying requests, and the max-servers is a sort of backstop15:39
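
The arithmetic behind fungi's point about mixed memory sizes can be sketched as follows. The max-servers figure of 72 comes from the discussion; the RAM quota is an assumption picked purely for illustration (enough for 72 nodes of 16 GB), and `effective_capacity` is a hypothetical helper, not a nodepool function.

```python
def effective_capacity(max_servers: int, ram_quota_gb: int, node_ram_gb: int) -> int:
    """Nodes that fit under both the max-servers backstop and the RAM quota."""
    return min(max_servers, ram_quota_gb // node_ram_gb)


MAX_SERVERS = 72        # figure quoted from grafana in the discussion
RAM_QUOTA_GB = 72 * 16  # ASSUMPTION: quota sized for 72 x 16 GB nodes

cap_16 = effective_capacity(MAX_SERVERS, RAM_QUOTA_GB, 16)
cap_32 = effective_capacity(MAX_SERVERS, RAM_QUOTA_GB, 32)
print(cap_16, cap_32)
```

Under that assumed quota the same provider fits 72 of the 16 GB nodes but only 36 of the 32 GB nodes, so max-servers alone overstates what the cloud can boot.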
clarkbI'm not in a good spot for that right this moment. I have to finish some local patching for regresshion and then have code reviews I've promised15:39
clarkbfungi: frickler: https://review.opendev.org/c/opendev/git-review/+/920845 should we go ahead and land that one?16:02
fungii think so, assuming frickler was good with my answer16:03
fricklerah, I missed that answer, approved after unmangling what gerrit UI made out of the __init__ ;)16:12
fungiyeah, it's not super straightforward and could certainly benefit from some refactoring for clarity16:13
opendevreviewMerged opendev/git-review master: Update the upper bound for Python and Gerrit tests  https://review.opendev.org/c/opendev/git-review/+/92084516:49
opendevreviewMerged zuul/zuul-jobs master: Add ensure-poetry role  https://review.opendev.org/c/zuul/zuul-jobs/+/92228617:39
fungii manually enqueued and promoted a couple of bug fix changes in the gate pipeline which are blocking openstack security vulnerability fixes, just a heads up19:20
fungitesting fix changes specifically19:20
clarkbI'm going to be afk for a bit (popping out for lunch and then probably a hair cut)20:01
clarkbI don't quite have fungi hair but it is getting too long20:02
tonybfungi: If you could look at the held wiki node for various anti-SPAM extensions that'd be helpful20:03
tonybtony@thor:~$ grep wiki99 /etc/hosts20:03
tonyb104.239.143.6 wiki99.opendev.org20:03
tonybAny testing hints would be great so I don't need to lean so heavily on you20:04
fungiis the vhost serving as wiki.opendev.org or just wiki99?20:06
fungilooks like it's redirecting to wiki9920:09
tonybFor testing it's wiki9920:09
fungik20:10
fungilooks like my login is working at least20:10
tonybNice20:10
tonybI assume that's OpenID20:10
fungiyeah20:11
fungilooks like https://wiki99.opendev.org/w/index.php?title=Special:RecentChanges&limit=500&hidepatrolled=1&days=30 is probably working, but i'll need to log in as a separate unverified user and test making some edits, which will need to wait until the openstack vulnerability stuff quiets down a bit more20:13
fungii'll also test mass deletion when i do that, which is the other half of the workflow for dealing with spammy/vandal edits20:13
tonybSounds good.  So SPAM looks like newuser is created (which also means creating an ubuntu-one account?) The new user posts a lot of stuff (possibly to their :talk page)20:16
tonybYou see that content in the link above and then mass delete everything they did and ban/block the account?20:16
tonybBut Also somewhere in there ... in the case the newuser is legit you'd instead add them to the autopatrolled group to filter them from further inspection?20:18
tonybIf I'm more or less on the right track you can add my user (or I can create a new specific one) to help manage the spam on the wiki20:19
fungithat is precisely the workflow, yes20:22
fungithere is also a grey area where their first edits might not be convincing so i leave them patrolled and then the next time they make edits i look back at their edit history and see they're a returning user20:23
fungirather i leave the new user unverified but not blocked and i mark their initial edits as patrolled but then check their subsequent edits the next time they make some20:24
fungiso the goal is that the query i linked above normally returns no results (or occasionally references to deleted pages which were spam)20:24
tonybOkay I think I understand how that'd work20:25
fungiit's easy to visually skip over the already deleted pages because the links in the recent changes report show up red since they go to pages that don't exist20:25
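
The moderation workflow fungi and tonyb walk through can be summarised as a small decision table. This is only a sketch of the manual process as described in the conversation: the `Assessment` enum and `triage` function are hypothetical names, and nothing here calls a real MediaWiki API.

```python
from enum import Enum
from typing import List


class Assessment(Enum):
    SPAM = "spam"
    LEGIT = "legit"
    UNSURE = "unsure"


def triage(assessment: Assessment) -> List[str]:
    """Map a moderator's judgement of a new user's edits to follow-up actions."""
    if assessment is Assessment.SPAM:
        return ["mass-delete the user's edits", "block the account"]
    if assessment is Assessment.LEGIT:
        return ["add the user to the autopatrol group"]
    # Grey area: clear the queue now, but look again when they edit next.
    return [
        "mark the initial edits as patrolled",
        "leave the user unverified and re-check their next edits",
    ]


for verdict in Assessment:
    print(verdict.value, "->", "; ".join(triage(verdict)))
```

Either way, the unpatrolled recent-changes query ends up empty once every new edit has been handled.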
tonybI see a lot of users that I'd expect to be whitelisted (which is what I thought adding the account to autopatrolled would do) in the list you linked to20:26
tonybis that normal or could it be a byproduct of something not migrating across20:27
tonybWell the same link for wiki.openstack looks approximately correct20:28
fungihttps://wiki99.opendev.org/w/index.php?title=Special%3AListUsers&username=&group=autopatrol&limit=50 is the verified users20:28
fungithe group name is autopatrol (as in their edits won't be manually patrolled by a moderator)20:29
fungilooks like there are a little over a thousand users in there20:30
fungiin that group i mean20:30
tonyband my account TonyBreeds is in that list, but my edits are also in the moderation list you linked20:31
fungioh, i think you need to be in a more privileged group to see the filtering correctly20:32
tonybAh okay.20:32
tonybThat's helpful20:33
tonybI feel like I've hijacked your time which I didn't mean to do.20:33
fungii've added you to administrator and bureaucrat groups on wiki99 if you want to refresh the quert20:33
fungiquery20:33
fungihopefully the page should be empty now20:33
tonyband indeed it is!20:35
tonyband I have a "Show patrolled edits" filter that wasn't there before20:35
fungiyeah, i think that's limited to bureaucrats or admins which is why it wasn't filtering them out for you before20:39
fungiit's also what i see normally on the old server if my login has expired20:39
tonybgot it20:39
opendevreviewTony Breeds proposed opendev/system-config master: [DNM] Run ansible-devel under python-3.11  https://review.opendev.org/c/opendev/system-config/+/92270421:55
