openstackgerrit | Merged opendev/system-config master: Make bindep installs non-interactive https://review.opendev.org/738121 | 00:18 |
---|---|---|
openstackgerrit | Ian Wienand proposed opendev/system-config master: [wip] graphite container deployment https://review.opendev.org/738125 | 00:23 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: [wip] graphite container deployment https://review.opendev.org/738125 | 01:06 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: [wip] graphite container deployment https://review.opendev.org/738125 | 01:37 |
*** ykarel|away is now known as ykarel | 04:27 | |
*** _cipher has joined #opendev | 04:32 | |
*** ysandeep|away is now known as ysandeep | 04:40 | |
*** _cipher has quit IRC | 04:55 | |
*** _cipher has joined #opendev | 04:56 | |
*** aannuusshhkkaa has quit IRC | 05:37 | |
*** diablo_rojo has quit IRC | 05:50 | |
*** _cipher has quit IRC | 05:59 | |
*** _cipher has joined #opendev | 06:05 | |
openstackgerrit | OpenStack Proposal Bot proposed openstack/project-config master: Normalize projects.yaml https://review.opendev.org/738150 | 06:10 |
*** danpawlik is now known as dpawlik|EoD | 06:15 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: [wip] graphite container deployment https://review.opendev.org/738125 | 06:16 |
*** rpittau|afk is now known as rpittau | 06:29 | |
openstackgerrit | Merged openstack/project-config master: Retire networking-onos, openstack-ux, solum-infra-guest-agent: Step 1 https://review.opendev.org/737987 | 06:50 |
*** ysandeep is now known as ysandeep|afk | 07:04 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: Graphite container deployment https://review.opendev.org/738125 | 07:12 |
*** hashar has joined #opendev | 07:20 | |
*** _cipher has quit IRC | 07:22 | |
*** ysandeep|afk is now known as ysandeep | 07:23 | |
*** _cipher has joined #opendev | 07:27 | |
*** bhagyashris|afk is now known as bhagyashris | 07:27 | |
frickler | mnaser: wow, those amd nodes really seem to rock, a good 30% off of a complete tempest run. makes me wonder whether we might want to consider having a flavor with like 6 cores instead of 8, if we could increase our quota with that proportionally | 07:29 |
openstackgerrit | Ian Wienand proposed opendev/system-config master: Graphite container deployment https://review.opendev.org/738125 | 07:30 |
*** tosky has joined #opendev | 07:35 | |
openstackgerrit | Ian Wienand proposed opendev/system-config master: Graphite container deployment https://review.opendev.org/738125 | 07:44 |
AJaeger | ianw: afs building failed on https://review.opendev.org/737995 (job openafs-rpm-package-build-promote) | 07:46 |
AJaeger | ianw: I think I know what's going on, patch coming | 07:47 |
AJaeger | fix is https://review.opendev.org/738155 | 07:50 |
*** moppy has quit IRC | 08:01 | |
*** moppy has joined #opendev | 08:03 | |
*** _cipher has quit IRC | 08:04 | |
*** hashar_ has joined #opendev | 08:21 | |
*** hashar has quit IRC | 08:22 | |
*** hashar_ is now known as hashar | 08:29 | |
*** DSpider has joined #opendev | 08:46 | |
*** ykarel is now known as ykarel|lunch | 08:48 | |
openstackgerrit | Carlos Goncalves proposed openstack/project-config master: Add nested-virt-centos-8 label https://review.opendev.org/738161 | 08:50 |
*** ysandeep is now known as ysandeep|lunch | 09:06 | |
openstackgerrit | Shivanand Tendulker proposed openstack/project-config master: Removes py35 and py27 jobs for proliantutils https://review.opendev.org/738168 | 09:09 |
openstackgerrit | Carlos Goncalves proposed openstack/project-config master: Add nested-virt-centos-8 label https://review.opendev.org/738161 | 09:13 |
openstackgerrit | Shivanand Tendulker proposed openstack/project-config master: Removes py35 and py27 jobs for proliantutils https://review.opendev.org/738168 | 09:14 |
*** ysandeep|lunch is now known as ysandeep | 09:22 | |
*** ykarel|lunch is now known as ykarel | 09:55 | |
*** priteau has joined #opendev | 10:03 | |
*** rpittau is now known as rpittau|bbl | 10:04 | |
ttx | fungi, AJaeger: so I got a hit overnight on the github mirroring race condition, and it failed to behave the way I expected. It did not ignore the error in mirroring as it should have: https://zuul.openstack.org/build/4966cd5617624d348fa0048de14f1f96/console | 10:55 |
ttx | It should ignore the error if zuul.newrev is defined: https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/upload-git-mirror/tasks/main.yaml#L41 | 10:56 |
ttx | And it was defined: https://zuul.openstack.org/build/4966cd5617624d348fa0048de14f1f96/log/zuul-info/inventory.yaml#37 | 10:56 |
ttx | Any hint welcome :) | 10:57 |
*** _cipher has joined #opendev | 11:03 | |
ttx | ... now thinking that I should just add a retry and suppress the newrev check, that would take care of the race condition too. | 11:06 |
ttx | ... but still interested in understanding why that first solution fails. | 11:06 |
mnaser | frickler: happy to experiment. These machines have really fast I/O and brand new processors. Nothing officially announced yet though :) | 11:16 |
*** _cipher has quit IRC | 11:19 | |
openstackgerrit | Thierry Carrez proposed zuul/zuul-jobs master: upload-git-mirror: use retries to avoid races https://review.opendev.org/738187 | 11:21 |
*** jaicaa has quit IRC | 11:23 | |
*** ryohayakawa has quit IRC | 11:35 | |
AJaeger | avass: do you know what's wrong with ttx's ignore_errors line? See his comments above - help appreciated, please | 11:46 |
*** jaicaa has joined #opendev | 11:49 | |
frickler | mnaser: yeah I saw the processors in the zuul output ;) | 11:57 |
*** rpittau|bbl is now known as rpittau | 12:12 | |
*** sorin-mihai has joined #opendev | 12:21 | |
*** ykarel is now known as ykarel|afk | 13:14 | |
*** cloudnull has quit IRC | 13:17 | |
openstackgerrit | Monty Taylor proposed opendev/system-config master: Set noninteractive in assemble script too https://review.opendev.org/738204 | 13:28 |
*** mordred has quit IRC | 13:34 | |
*** cloudnull has joined #opendev | 13:40 | |
*** cloudnull has quit IRC | 13:44 | |
*** cloudnull has joined #opendev | 13:45 | |
*** mlavalle has joined #opendev | 13:58 | |
openstackgerrit | Shivanand Tendulker proposed openstack/project-config master: Removes py35, tox and cover jobs for proliantutils https://review.opendev.org/738168 | 14:22 |
clarkb | frickler: mnaser we did that with osic 4 core flavors back in the day. Definitely worth considering | 14:35 |
clarkb | infra-root should we proceed with https://review.opendev.org/#/c/738109/ now? and try to get a run of manage projects in prod? | 14:36 |
*** ykarel|afk is now known as ykarel | 14:37 | |
fungi | clarkb: sounds good, i just reviewed and approved it | 14:39 |
fungi | i'm around, albeit barely treading water this morning | 14:39 |
fungi | but can help with the production debugging if it doesn't work | 14:39 |
clarkb | I rechecked it a bunch last night so fairly confident it is working | 14:40 |
clarkb | passing the gate will be like a 5th or 6th successful run | 14:41 |
* clarkb makes tea while that goes through the gate | 14:42 | |
*** priteau has quit IRC | 14:44 | |
*** priteau has joined #opendev | 14:47 | |
*** cloudnull has quit IRC | 14:50 | |
*** cloudnull has joined #opendev | 14:51 | |
fungi | yeah, it looks like a reasonable approach | 14:52 |
fungi | basically grab the list, if it's in there we know we don't need to create it, if it's not in there assume it might just be missing because the listing we got is incomplete so check directly whether it exists, and then if it doesn't exist create it | 14:52 |
*** priteau has quit IRC | 14:52 | |
openstackgerrit | Merged openstack/project-config master: Removes py35, tox and cover jobs for proliantutils https://review.opendev.org/738168 | 14:59 |
*** priteau has joined #opendev | 15:25 | |
clarkb | fungi: manage projects fix is about to merge | 15:29 |
*** priteau has quit IRC | 15:30 | |
openstackgerrit | Merged opendev/system-config master: Deal with gitea pagination of repo lists https://review.opendev.org/738109 | 15:30 |
*** priteau has joined #opendev | 15:30 | |
clarkb | there it is | 15:30 |
*** mordred has joined #opendev | 15:35 | |
mordred | clarkb: https://crbug.com/gerrit/13027 ... the gerrit folks upstream are looking at various things related to the master/main sitch - they're considering making it possible to support both targets one as the alias for the other | 15:36 |
mordred | clarkb: I think maybe we should add a bug to their list https://bugs.chromium.org/p/gerrit/issues/list?q=label%3AHotlist-Respectful-Terminology&sort=priority about the inability to replicate the default branch change | 15:36 |
mordred | it's not a gerrit issue to solve - but they might have a good context from which to discuss it with upstream | 15:36 |
clarkb | mordred: ++ fwiw I'm fairly certain the reason git push doesn't update HEAD is git push can work in a non replication setup | 15:37 |
clarkb | mordred: the solution there may be a --replicate type flag to git push where it implies force and updating HEAD and all that | 15:37 |
mordred | yeah | 15:37 |
clarkb | and ya gerrit can probably articulate that need better than us | 15:37 |
fungi | right, and git doesn't want to assume your local branches are the same as the remote's branches | 15:37 |
clarkb | fungi: manage projects is doing jeepyb things to gerrit now | 15:39 |
clarkb | so the gitea issue is past us (at least temporarily as we need to make those other improvements) | 15:39 |
clarkb | fungi: this is the bit you may be interested in though as it should do those acl updates | 15:39 |
mordred | clarkb: I've submitted an issue | 15:43 |
clarkb | mordred: I haven't had a chance to read the thing they wrote yet, but if aliasing they probably want to avoid assuming master is the only branch people would do that with. I could see that being useful for some projects like ansible that have decided not to use master but also use a name other than what people are converging on | 15:45 |
mordred | clarkb: yeah. they've also got an issue written to discuss the general idea of rename a branch | 15:46 |
mordred | clarkb: so it sounds like they're trying to look at each of the pieces of this potentially generally | 15:46 |
openstackgerrit | Thierry Carrez proposed openstack/project-config master: [DNM] Define maintain-github-mirror job https://review.opendev.org/738228 | 15:47 |
ttx | fungi, clarkb: interested in early feedback on ^ | 15:48 |
ttx | (no urgency) | 15:48 |
ttx | The script is pretty well-tested, but the Ansible/Zuul part might need fixes | 15:49 |
ttx | Regarding: https://review.opendev.org/738187, it's probably too late today to approve it, won't be able to watch it much, so probably better wait for Monday | 15:50 |
*** rpittau is now known as rpittau|afk | 15:59 | |
clarkb | mordred: fungi: for when you get a chance. It appears that the gitea and gerrit side of things all went well according to the manage-projects log | 16:03 |
clarkb | mordred: fungi the problem I'm seeing is that the job isn't ending now that manage-projects on review.o.o has completed | 16:03 |
clarkb | I expect that is because review-test is causing it to have a sad | 16:03 |
mordred | clarkb: \o/ | 16:04 |
clarkb | for prod we seem ok but in order to not have noise in the signal there (which caused confusion debugging the gitea thing) we may want to remove review-test from there or something? | 16:04 |
*** ykarel is now known as ykarel|away | 16:04 | |
clarkb | and more generally, we should think about how we can make ansible fail faster in those situations? | 16:05 |
mordred | clarkb: yes to both- we should definitely exclude review-test here | 16:06 |
mordred | clarkb: I think I left it in originally because it seemed like testing that manage projects works right against new gerrit would be good | 16:07 |
clarkb | k, I'm not able to do that right this moment. Now that I'm reasonably confident its happy on the prod side I need to figure out a bike ride before summer wakes up and says hello | 16:07 |
mordred | but maybe we're missing something to make that work | 16:08 |
clarkb | not sure how people operate in warm weather like this | 16:08 |
* mordred waves to clarkb from new orleans | 16:08 | |
clarkb | mordred: I'm learning if I don't get out before like 11am I'm better off waiting until tomorrow | 16:10 |
openstackgerrit | Monty Taylor proposed opendev/system-config master: Stop running manage projects on review-test https://review.opendev.org/738232 | 16:12 |
mordred | clarkb: I also recommend just learning to enjoy sweating | 16:12 |
*** mordred has quit IRC | 16:33 | |
*** mordred has joined #opendev | 16:38 | |
*** _cipher has joined #opendev | 16:42 | |
*** priteau has quit IRC | 16:48 | |
AJaeger | clarkb: seems jeepyb ran fine - https://opendev.org/openstack/charm-neutron-api-plugin-arista is created and content was imported | 17:07 |
openstackgerrit | Merged openstack/project-config master: Normalize projects.yaml https://review.opendev.org/738150 | 17:23 |
AJaeger | infra-root, infra-prod-manage-projects timed out after 30mins on the change above ^ | 17:57 |
*** mtreinish has quit IRC | 18:04 | |
*** mtreinish has joined #opendev | 18:05 | |
clarkb | AJaeger: yes we think the timeout is related to trying to run on review-test | 18:07 |
clarkb | AJaeger: https://review.opendev.org/738232 should help. I'm reviewing that now | 18:07 |
mordred | AJaeger: the manage-projects itself should have been successful | 18:11 |
mordred | clarkb: does manage-projects just spin indefinitely trying to connect if the server isn't there? haven't we seen that issue in other contexts? | 18:12 |
mordred | clarkb: because, you know, we're not even running gerrit there yet | 18:12 |
clarkb | mordred: that could be | 18:12 |
mordred | clarkb: I think manage-projects will indefinitely retry | 18:12 |
mordred | clarkb: I unfortunately have to step out of the house for a little bit so I can't hands-on help right now | 18:13 |
clarkb | mordred: no worries I don't think its urgent now that the main failure is addressed | 18:16 |
clarkb | and its friday etc etc | 18:16 |
mordred | clarkb: yeah. yay friday | 18:16 |
clarkb | mordred: for docker zuul executors is there anything I can do to help move that along? | 18:16 |
mordred | clarkb: now that the krb5-user patch landed - we can try doing a docker-compose pull ; docker-compose up -d again | 18:21 |
mordred | on ze01 | 18:21 |
clarkb | mordred: cool I'll give that a go after breakfast/lunch | 18:21 |
mordred | I can do that real quick again if you wanna keep an eye on it | 18:21 |
mordred | or - I can leave it to you - turns out it's a simple operation either way :) | 18:22 |
clarkb | ya I can watch it between sandwich bites | 18:23 |
clarkb | mordred: just docker-compose down it if it is sad? | 18:23 |
mordred | clarkb: yeah | 18:24 |
mordred | clarkb: so -... | 18:24 |
mordred | I'm doing pull | 18:24 |
mordred | I ran out of space the first time - but the other container was still running. so I stopped it and repulled and it was fine | 18:25 |
mordred | but we might need to investigate disk space requirements | 18:25 |
clarkb | mordred: ok zuul runtime is on a separate partition | 18:25 |
mordred | clarkb: yeah. / is at 100% | 18:25 |
clarkb | for containers I've tried to aggressively prune images in our ansible and we may need to do that there since it isn't ansibling right now? | 18:25 |
clarkb | I'll take a look | 18:25 |
mordred | clarkb: ++ - I bet there's something we're not doing right - we're using 39G in / | 18:26 |
mordred | that's ... heavy | 18:26 |
* mordred runs out - biab | 18:27 | |
clarkb | my normal incantation of cd / && du -hs * | sort -h doesn't work when we've got /afs mounted | 18:30 |
clarkb | infra-root ze01:/root/var-lib-zuul-backup is 20GB large and accounts for a significant portion of our disk use on / | 18:35 |
clarkb | looks like that was made back in 2017 (must've been part of initial zuulv3 rollout) | 18:35 |
clarkb | can we clean that up? | 18:35 |
fungi | i suspect it can | 18:44 |
clarkb | I'll clean it up in a bit if I don't hear objections | 18:46 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Paginate all the gitea get requests https://review.opendev.org/737885 | 18:59 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Increase parallelism of gitea project creation https://review.opendev.org/738064 | 18:59 |
clarkb | I'll WIP those because I want to recheck them a bunch just to be sure there aren't any other weird corner cases here to address | 18:59 |
mordred | clarkb: I agree, I think that can go | 19:01 |
mordred | clarkb: I mean - it's a backup from 2017 | 19:02 |
clarkb | ya I'm just about to context switch to that and clean it up | 19:02 |
mordred | clarkb: I've got headroom if you want | 19:02 |
clarkb | its fine I was just getting those gitea management changes rebased | 19:03 |
mordred | kk | 19:03 |
clarkb | rm'ing that dir on ze01 now | 19:03 |
clarkb | and done. mordred can I just docker-compose up -d now? or should I do another image pull? | 19:04 |
mordred | clarkb: should be good - although it also won't hurt | 19:04 |
mordred | the pull should be a no-op | 19:05 |
clarkb | k | 19:05 |
clarkb | zuul-executor is running on ze01 now | 19:06 |
clarkb | I'm going to prune docker images | 19:06 |
clarkb | all done | 19:06 |
clarkb | ze01 seems extremely busy so it is doing work. I guess we just wait now to see if the post run tasks are happy | 19:10 |
mordred | clarkb: ++ | 19:11 |
mordred | clarkb: it seemed to generally work except for that last time | 19:12 |
openstackgerrit | Andreas Jaeger proposed openstack/project-config master: Finish retirement of openstack-ux,solum-infra-guestagent https://review.opendev.org/737992 | 19:16 |
openstackgerrit | Andreas Jaeger proposed openstack/project-config master: Finish retirement of networking-onos https://review.opendev.org/738263 | 19:16 |
*** aannuusshhkkaa has joined #opendev | 20:31 | |
*** shtepanie has joined #opendev | 20:41 | |
*** hashar has quit IRC | 21:02 | |
*** paladox has quit IRC | 21:33 | |
*** paladox has joined #opendev | 21:37 | |
mordred | clarkb: look at this: https://zuul.opendev.org/t/openstack/build/3d1fe33142dc42aeb0ce66f80a95674b/log/job-output.txt#5854 | 21:51 |
mordred | clarkb: I had tests timeout in sdk unittests - the test timeout is set to 5 seconds (which is still _absurdly_ long for a test) | 21:51 |
mordred | all of them are in random - which makes me think - perhaps test node is missing the random stuff? | 21:52 |
mordred | content = ''.join(random.SystemRandom().choice( | 21:52 |
mordred | string.ascii_uppercase + string.digits) | 21:52 |
mordred | for _ in range(file_size)).encode('latin-1') | 21:53 |
mordred | is the code in question | 21:53 |
fungi | "the random stuff" | 21:53 |
fungi | didn't it move to math.random? | 21:53 |
fungi | maybe newer python interpreter? | 21:53 |
mordred | fungi: we run <insert name of thing> on the vms to generate entropy no? | 21:53 |
fungi | oh, it was timing out? yeah, usually <thing> | 21:54 |
* fungi refreshes fridaybrain | 21:54 | |
mordred | fungi: can't think of the name of <thing> for the life of me | 21:54 |
fungi | haveged | 21:55 |
mordred | yes! | 21:55 |
fungi | i had to grep the dpkg -l output on one of my virtual machines for random terms | 21:55 |
fungi | proof i should not be behind a keyboard right now i guess | 21:55 |
mordred | fungi: haveged is listed in infra-package-needs | 21:58 |
mordred | and this ran on a bionic node in vexxhost - so it shouldn't be new or exciting | 22:00 |
clarkb | maybe haveged isnt running for some reason? | 22:00 |
clarkb | but ya haveged should provide plenty of entropy I think | 22:00 |
fungi | (...if it gets started) | 22:01 |
mordred | and this is only really asking for 4000 bytes - although it's doing it per-thread - so 28k total | 22:01 |
clarkb | also perhaps related to new hardware in vexxhost | 22:01 |
clarkb | like maybe it uses hardware pool in kvm amd that is sad or something | 22:01 |
*** _cipher has quit IRC | 22:01 | |
mordred | that said - why is this using systemrandom in the first place | 22:02 |
*** _cipher has joined #opendev | 22:03 | |
mordred | I do not need random for security - this is a test fixture | 22:03 |
fungi | urandom would totally be sufficient there | 22:04 |
clarkb | is it possible that bypasses haveged somehow? | 22:06 |
clarkb | haveged should feed /dev/random though | 22:07 |
fungi | yeah, on bionic the kernel should be new enough to have nonblocking /dev/random after seeding | 22:07 |
fungi | i think | 22:08 |
fungi | or it could be friday, in which case all bets on the accuracy of my memory are suspect | 22:08 |
mordred | yeah | 22:08 |
fungi | i've been mowing for the past hour, so it's possible the sun has addled my brain | 22:09 |
* fungi gets back to it, this lawn isn't going to destroy itself after all | 22:09 | |
*** _cipher has quit IRC | 22:12 | |
*** _cipher has joined #opendev | 22:12 | |
mordred | fungi: have you considered getting a goat? | 22:15 |
mordred | fungi: it would have the added benefit of also eating any other object you leave downstairs | 22:16 |
clarkb | avoid the bit flies | 22:16 |
clarkb | *bit and dont google that | 22:16 |
clarkb | bah | 22:16 |
clarkb | bot | 22:16 |
mordred | clarkb: you type good | 22:17 |
clarkb | the bestest typist | 22:18 |
fungi | yeah, aware of botflies | 22:18 |
fungi | no thanks | 22:18 |
fungi | we have sandflies and those are already getting on my nerves more than the mosquitoes | 22:18 |
clarkb | system-config-run-base seems to be broken | 22:33 |
clarkb | https://zuul.opendev.org/t/openstack/build/903e7e94e1b6472696fb590ea083ee96/log/job-output.txt#1383 for a half a second I thought that could be related to tripleo's problems but this is mad about an ssh rsa key file | 22:33 |
clarkb | we are failing to run the nested ansible to apply base to all the hosts | 22:34 |
clarkb | did we change how ssh works there? | 22:35 |
clarkb | oddly the gitea jobs passed both times and that also runs nested ansible | 22:35 |
clarkb | I'm not sure I have the friday afternoon motivation to debug that :) | 22:36 |
fungi | yeah, i'm in the middle of a half-hearted attempt to diagnose openstack constraints proposal failures | 22:41 |
clarkb | I rechecked it again. If it happens a third time I'll do my best to look closer | 22:44 |
clarkb | I wonder if this is an ssh-keygen issue like we have with zuul quickstart to get the format right | 22:45 |
clarkb | but ansible uses openssh not paramiko so it should just work | 22:45 |
fungi | but why only now? | 22:45 |
clarkb | distro update maybe? | 22:45 |
clarkb | that could explain the delta between base and gitea jobs too as we run them on different platforms maybe? | 22:46 |
fungi | yeah maybe | 22:46 |
*** tosky has quit IRC | 23:00 | |
*** DSpider has quit IRC | 23:30 | |
*** _cipher has quit IRC | 23:35 | |
*** _cipher has joined #opendev | 23:36 | |
*** _cipher has quit IRC | 23:40 | |
*** _cipher has joined #opendev | 23:44 |
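
Note on the 14:52 description of the gitea pagination fix: fungi's summary (grab the list; if the repo is listed, skip it; if it is not listed, check directly because the listing may be incomplete; create it only if that check also fails) might be sketched roughly as below. This is only an illustration of the described logic, not the code from https://review.opendev.org/738109. The callables `fetch_page`, `repo_exists` and `create_repo` are hypothetical stand-ins for whatever HTTP calls the real role makes, and the function names are assumptions as well.

```python
def list_all_repos(fetch_page):
    """Collect repo names across pages until an empty page is returned.

    fetch_page is a hypothetical callable: page number -> list of names.
    """
    names = set()
    page = 1
    while True:
        batch = fetch_page(page)
        if not batch:
            break
        names.update(batch)
        page += 1
    return names


def ensure_repo(name, listed_repos, repo_exists, create_repo):
    """Create a repo only when neither the listing nor a direct check finds it.

    repo_exists and create_repo are hypothetical callables taking a repo name.
    Returns a short string describing which branch was taken.
    """
    if name in listed_repos:
        # Present in the (possibly incomplete) listing: nothing to do.
        return "listed"
    if repo_exists(name):
        # Missing from the listing but found by a direct lookup.
        return "exists"
    create_repo(name)
    return "created"
```

The direct existence check is what makes this tolerant of a truncated listing: a missing entry alone never triggers a create.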
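
Note on the 21:52 to 22:04 exchange about the test fixture: mordred points out that the fixture does not need security-grade randomness. One possible alternative, offered only as a sketch and not as what the SDK tests actually adopted, is a seeded `random.Random` instance, which never touches the operating system's entropy source. The function name `make_fixture_content` and the `seed` parameter are assumptions made for illustration.

```python
import random
import string


def make_fixture_content(file_size, seed=0):
    """Build file_size bytes of uppercase letters and digits for a test fixture.

    Uses a seeded, non-cryptographic generator, so the output is repeatable
    and does not depend on the system entropy pool.
    """
    rng = random.Random(seed)
    alphabet = string.ascii_uppercase + string.digits
    # random.choices (Python 3.6+) draws k characters with replacement.
    return ''.join(rng.choices(alphabet, k=file_size)).encode('latin-1')
```

A fixed seed also makes a failing test reproducible, which the `random.SystemRandom()` version quoted in the log cannot offer.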