| *** mtreinish_ is now known as mtreinish | 00:22 | |
| opendevreview | OpenStack Proposal Bot proposed openstack/project-config master: Normalize projects.yaml https://review.opendev.org/c/openstack/project-config/+/962557 | 02:12 |
| hemanth | Hey I have the following patch in sunbeam-charms project https://review.opendev.org/c/openstack/sunbeam-charms/+/959080/10 where zuul shows Change has successfully merged but i do not see the commit in https://opendev.org/openstack/sunbeam-charms/commits/branch/main | 03:14 |
| hemanth | The promote pipeline failed, and it is expected to fail. Does this have any effect on the merge? | 03:17 |
| tonyb | hemanth: that obviously seems strange. I'm having lunch ATM but if no-one else looks at it I will when I'm at my desk | 03:17 |
| hemanth | thank you | 03:17 |
| tonyb | we are having issues with review.o.o ATM but I don't think they'd cause this | 03:17 |
| tonyb | how long ago did that change merge? just now or ages ago? where "ages ago" is more than an hour | 03:18 |
| hemanth | couple of hours ago | 03:19 |
| hemanth | and this one an hour ago https://review.opendev.org/c/openstack/sunbeam-charms/+/962894 | 03:19 |
| tonyb | okay. | 03:20 |
| hemanth | tonyb: I see the commits now in opendev.org/openstack/sunbeam-charms, thanks! | 04:45 |
| tonyb | hemanth: I was just starting to look at it. :/ | 04:45 |
| tonyb | hemanth: I don't think that the charm will get published to charmhub without another change merging | 04:51 |
| hemanth | tonyb: yeah I am fine with that.. I will handle it | 04:52 |
| tonyb | okay | 04:52 |
| opendevreview | Ivan Anfimov proposed openstack/project-config master: wip https://review.opendev.org/c/openstack/project-config/+/962925 | 12:04 |
| fungi | tonyb: hemanth: the cogent connectivity issues to review might also be impacting git replication to gitea | 13:25 |
| fungi | once things settle down it would probably be a good idea to force re-replicate all projects just to make sure there aren't other commits missing in gitea too | 13:26 |
| clarkb | hemanth: tonyb: fungi triggered replication and I see that commit and a few others in gitea now. It was almost certainly the cogent outage as the timing lines up perfectly | 15:25 |
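For reference, a minimal sketch of how a full re-replication can be kicked off through Gerrit's replication plugin over its SSH API; the account name is a placeholder, and whether OpenDev's admins drive it this way rather than through their own automation is an assumption.

```sh
# Requires Gerrit admin rights; "admin" is a placeholder account name.
# "replication start --all" asks the replication plugin to re-push every
# project to its configured remotes (the gitea backends in this case);
# --wait blocks until the queued replication tasks have completed.
ssh -p 29418 admin@review.opendev.org replication start --all --wait
```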
| dansmith | seems like network access into opendev servers is really slow today? | 15:31 |
| dansmith | gerrit and even the docs site take a long time to respond | 15:31 |
| dansmith | zuul seemed pretty responsive though | 15:31 |
| clarkb | dansmith: the docs site and gerrit are completely independent and in different cloud providers in different parts of north america. Gerrit was impacted by a large cogent outage last night that we've had to rerun background tasks for to sync things back up with the gitea servers. I suspect that is why you noticed slowness there. It should be back to normal at this point | 15:33 |
| clarkb | though. | 15:33 |
| clarkb | dansmith: then the docs server is overwhelmed by clients doing slow reads filling all the slots. We've got a change in flight to increase the limits in apache | 15:33 |
| dansmith | my devstack just timed out cloning from it | 15:33 |
| fungi | cloning from gerrit (review.opendev.org) or gitea (opendev.org)? | 15:34 |
| clarkb | and just == last 20 minutes or literally right now? | 15:34 |
| dansmith | review.opendev and yes just now | 15:36 |
| clarkb | ok ideally no one is cloning from review.opendev (you should use the giteas for that). I was just able to clone opendev/system-config and it is slower than I'd anticipate: "Receiving objects: 100% (135717/135717), 48.49 MiB | 871.00 KiB/s, done." but functional | 15:37 |
| clarkb | let me test from my ovh node to see if this could be slowness in cogent still | 15:37 |
| dansmith | (brb) | 15:37 |
| clarkb | I think it is cogent still being sad. I get "Receiving objects: 100% (135717/135717), 48.49 MiB | 10.28 MiB/s, done." from my ovh node, which was not affected by the cogent issue | 15:38 |
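A quick way to reproduce this comparison from any vantage point; the repository is the one clarkb cloned above, and timing a clone is only a rough probe of path bandwidth, not server health.

```sh
# Clone the same repo from the gitea farm and from gerrit and compare wall time.
# Throughput differences here mostly reflect the network path between you and
# each server rather than the servers themselves.
time git clone https://opendev.org/opendev/system-config /tmp/sc-gitea
time git clone https://review.opendev.org/opendev/system-config /tmp/sc-gerrit
```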
| dansmith | clarkb: for unmerged patches? I figured cloning from gerrit was the only way | 15:38 |
| dansmith | are all those refs mirrored to gitea? | 15:38 |
| clarkb | dansmith: they are | 15:39 |
| fungi | they're all mirrored actually, yes | 15:39 |
| clarkb | but also you don't typically clone to get unmerged patches | 15:39 |
| clarkb | but either way the data should be in gitea | 15:39 |
| clarkb | fungi manually deployed the apache tunables update to the server hosting docs and now we shall see if this is sufficient to get ahead of the slow clients | 15:41 |
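For context, a hedged sketch of the kind of Apache tuning being described: raising the mpm_event worker limits so slow readers cannot occupy every slot. The directive names are standard Apache ones, but the values and whether these are the exact knobs the OpenDev change touches are assumptions.

```sh
# Illustrative only: values are made up, not taken from the actual change.
# MaxRequestWorkers must not exceed ServerLimit * ThreadsPerChild (8 * 64 = 512).
sudo tee /etc/apache2/conf-available/worker-tuning.conf >/dev/null <<'EOF'
<IfModule mpm_event_module>
    ServerLimit           8
    ThreadsPerChild      64
    MaxRequestWorkers   512
</IfModule>
EOF
sudo a2enconf worker-tuning
sudo systemctl reload apache2
```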
| clarkb | mtr shows a path that looks similar to what had problems last night. Maybe they had to reduce capacity to remove broken hardware from the path or something? | 15:45 |
| clarkb | their status page only shows an ongoing issue with att in the atlanta region though | 15:46 |
| clarkb | (however last night I couldn't even reach their status page) | 15:46 |
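The kind of path check being referred to; the flags are standard mtr options and the target is just the host under discussion.

```sh
# Report mode (-r), wide hostnames (-w), show both DNS names and IPs (-b),
# include AS numbers (-z), and run 100 probe cycles (-c 100).
mtr -rwbz -c 100 review.opendev.org
```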
| sfernand | hey folks! Regarding the failing NFS jobs with POST_FAILURE, adding the depends-on for the patch has helped us get past the immediate failure, so now we have access to the logs. Thanks a lot clarkb for the help! :) | 15:51 |
| sfernand | It seems the underlying issue is that qemu-img convert is randomly getting stuck when it tries to write to the NFS mount. When it stalls, I see messages in the syslog like "task nfsd:71407 blocked for more than 122 seconds" [1] right after the qemu-img call. | 15:51 |
| sfernand | Is there any chance we could get a node held so I can ssh in and investigate further? I suspect it's a resource issue on the node itself (like memory exhaustion or I/O limits) | 15:51 |
| sfernand | [1] https://zuul.opendev.org/t/openstack/build/8e77fc30cdfe4ac98f2d4ef663c57f20/log/controller/logs/syslog.txt#4429 | 15:51 |
| clarkb | sfernand: yes, if you let us know the change number and job you wish to hold I can put that in place for you | 15:51 |
| clarkb | sfernand: basically I tell zuul to hold things then you recheck to trigger the failure and that will preserve the test env | 15:52 |
| sfernand | awesome! change is https://review.opendev.org/c/openstack/devstack-plugin-nfs/+/952476 | 15:58 |
| sfernand | job is devstack-plugin-nfs-tempest-full, we may need to retrigger a couple of times to see it fail | 15:58 |
| clarkb | sfernand: the hold request is in place. recheck away. Once you've got your failure let us know what your ssh pubkey is and we can add it to the environment | 16:01 |
| sfernand | will do. thanks! | 16:07 |
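Roughly what the admin-side hold looks like using zuul-client's autohold command, with the change and job from the messages above; whether OpenDev's operators invoke it exactly this way or via their own wrappers is an assumption.

```sh
# Requires tenant admin credentials on the Zuul API.
zuul-client --zuul-url https://zuul.opendev.org autohold \
  --tenant openstack \
  --project opendev.org/openstack/devstack-plugin-nfs \
  --job devstack-plugin-nfs-tempest-full \
  --change 952476 \
  --reason "sfernand: debug qemu-img hang on NFS" \
  --count 1
```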
| dansmith | clarkb: fungi sorry, was on a call | 16:34 |
| dansmith | we clone to get unmerged patches all the time when we're testing things before merge... I'm not sure why that's unusual | 16:35 |
| dansmith | so you're saying I can use git.opendev but with the same refs/ path that gerrit gives me for a given patch and revision? | 16:35 |
| dansmith | surely that mirroring can't be instant, right? so how long after someone pushes a rev to gerrit can I see it from the gitea side? | 16:36 |
| fungi | dansmith: not instant, but usually sub-second | 17:13 |
| fungi | basically every time a change updates, gerrit internally triggers replication to gitea and those tasks can queue but usually get processed near-instantly | 17:14 |
| fungi | and yeah, the refs/... path should be identical | 17:15 |
| fungi | they're not browseable because gitea lacks the ability to browse named refs, but you should still be able to git fetch them | 17:15 |
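Putting fungi's point together, a sketch of fetching an unmerged patchset from the gitea mirrors, using the sunbeam-charms change from earlier in the log as the example; gerrit's ref layout is refs/changes/&lt;last two digits of the change number&gt;/&lt;change number&gt;/&lt;patchset&gt;.

```sh
git clone https://opendev.org/openstack/sunbeam-charms
cd sunbeam-charms
# Change 959080, patchset 10 -> refs/changes/80/959080/10
git fetch origin refs/changes/80/959080/10
git checkout -b review/959080-10 FETCH_HEAD
```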
| clarkb | dansmith: zuul only fetches the new changes from gerrit. It doesn't do a new clone each time | 17:22 |
| clarkb | we try as much as possible to get people to not clone from gerrit itself | 17:23 |
| clarkb | that is why we have a farm of replicas | 17:23 |
| clarkb | and yes usually subsecond but for larger repos like nova it will be longer. However it should still only be a few seconds I think | 17:23 |
| clarkb | but I don't think that is why you're seeing problems. I think the issue is related to the cogent problems from last night. Now we don't have packet loss but do appear to have much reduced bandwidth | 17:24 |
| dansmith | okay, I'll try to start cloning from gitea then | 17:31 |
| dansmith | clarkb: I understand zuul doesn't, but for a fresh devstack in a VM, it's easier to let it clone than pre-provision the repos, especially since clean and unstack blow away /opt | 17:31 |
| fungi | dansmith: the other thing to keep in mind is that if you "clone" at a specific gerrit change, you're getting that change's parentage rather than how zuul is actually going to test it. instead you want to fetch and cherry-pick any involved changes onto the target branch tip | 17:33 |
| dansmith | fungi: no, I really don't.. what I want is to test a given patch set the way it is in review.. I'm not trying to reproduce what zuul is going to do when/if the merge happens | 17:34 |
| fungi | we avoid gratuitously rebasing changes in review, which can lead to them having an outdated commit history compared to the branch state unless there's an actual merge conflict, but yeah if your goal is not to replicate what zuul is going to test then i guess that's sufficient | 17:34 |
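And the variant fungi describes, for when the goal is to approximate what zuul will actually test: start from the branch tip and cherry-pick the patch onto it (same example change as above).

```sh
git clone https://opendev.org/openstack/sunbeam-charms
cd sunbeam-charms            # a fresh clone is already at the tip of main
git fetch origin refs/changes/80/959080/10
git cherry-pick FETCH_HEAD   # apply just the patch, not its recorded parents
```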
| sfernand | clarkb: job failed on first run | 20:30 |
| sfernand | https://zuul.opendev.org/t/openstack/build/339e847cb5de4ae0aa5a5dc8705da8c8 | 20:30 |
| fungi | sfernand: looks like your held node is listed near the top of https://zuul.opendev.org/t/openstack/nodes so that worked. what ssh private key do you want granted access to it? | 20:36 |
| sfernand | what is the preferred way to share a public key with you? | 20:38 |
| fungi | you can dump it here in irc or /msg it to me, or if you have it up at a url in, like, launchpad or github i can pull it from there just let me know the url | 20:38 |
| fungi | whatever works for you really | 20:40 |
| fungi | er, share the public key not the private key, of course | 20:40 |
| fungi | the file with the .pub extension, e.g. ~/.ssh/id_ed25519.pub | 20:41 |
| fungi | i said private initially, totally not what i meant | 20:41 |
| sfernand | don't worry, I got it, just sent it to you | 20:43 |
| fungi | yep, you should be all set, let me know if not | 20:43 |
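With access set up, a few generic checks for the suspected memory/I-O issues on the held node, along the lines sfernand mentioned earlier; nothing here is specific to the job, and the package assumptions (sysstat for iostat, nfs-common/nfs-utils for nfsstat) may not match what's installed.

```sh
# Hung-task warnings like the syslog line linked above, with readable timestamps.
dmesg -T | grep -i 'blocked for more than'
# The kernel's hung-task threshold that triggers those messages.
cat /proc/sys/kernel/hung_task_timeout_secs
# Memory pressure and swap usage at the time of the stall.
free -m
# Per-device I/O utilization sampled a few times (iostat comes from sysstat).
iostat -x 5 3
# NFS client- and server-side op counters (nfsd runs on the same node in this job).
nfsstat -c
nfsstat -s
```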