| Nick | Message | Time |
|---|---|---|
| Clark[m] | bit of a slow start for me today. I'm working on system updates, then our meeting agenda, then I'm into a big block of meetings. | 16:12 |
| Clark[m] | corvus: for the zk upgrade the release notes indicate no special upgrade steps from 3.7 or 3.8 to 3.9. I do think the "normal" upgrade process is to upgrade each non-leader node to the new version first, then upgrade the leader last. I don't think our current ansible playbook is able to do things in that order (though it may do so by chance). Do you think we should do that upgrade manually? | 16:13 |
| corvus | Clark: that's what you did last time. i think that's the safe thing to do. i would also be willing to try just letting it run unmanaged and see what happens (over a weekend). recovering shouldn't be a big deal if something goes wrong. either way wfm (and i'm happy to help / run it). | 16:23 |
| clarkb | corvus: ack | 16:26 |
| clarkb | in other news it seems my usb drive or usb controller or usb port is unhappy, making loading secrets more painful this morning than it should be | 16:27 |
| clarkb | I'll have to test the device in another computer to see if I can narrow down the problem | 16:27 |
| clarkb | ok I've hacked up the agenda on the wiki. I think we managed to complete a lot of stuff (dealt with gerrit restarts, etherpad upgrade, trixie mirroring, etc) so I've cleared a bunch of stuff off the agenda and made some updates to other items | 16:33 |
| clarkb | is there anything new that should be added? If not I'll send this out in about 10-15 minutes probably | 16:33 |
| corvus | clarkb: want to put zk on there? | 16:34 |
| clarkb | corvus: ++ | 16:44 |
| clarkb | also I think I have a bad usb port, but another port on the same controller seems to work | 16:45 |
| clarkb | unsure if this is possibly related to the recent kernel update (wouldn't surprise me) | 16:45 |
| clarkb | and now my usb dac doesn't want to show up. This is "fun" | 16:46 |
| fungi | recent kernel update? | 16:46 |
| fungi | maybe something changed with the driver for your hub chip | 16:47 |
| clarkb | ya I'm wondering. It is a minor patch update, not a new major version, but weirder things have been known to happen | 16:47 |
| clarkb | I may just try a reboot here and see if that reinitializes things more happily | 16:50 |
| fungi | have you tried turning it off and on again? | 16:52 |
| clarkb | ok agenda is sent | 16:57 |
| clarkb | I'm going to see if a reboot helps anything | 16:57 |
| clarkb | my av setup is also not working, which will make my meeting later today more annoying. Here I was thinking "this is smart, usb just always works" and now usb has failed me | 16:57 |
| clarkb | ya that port works now after a reboot. But my usb dac is still not coming up so I'm beginning to suspect the kernel update is impacting usb behavior. Fun | 17:02 |
| fungi | have the ability to boot your prior kver? | 17:05 |
| clarkb | I do, I'll test that if I can't get av stuff working | 17:09 |
| clarkb | further debugging: if I plug my external hub back into that port (which was working when the device was directly attached) it goes back to not working. So maybe something to do with this external hub | 17:14 |
| fungi | on the topic of "why i hate the typical github pull request workflow" i give you https://github.com/orcwg/orcwg/pull/213/commits as a prime example | 17:17 |
| fungi | push a huge commit... oh not suitable? push a revert, then a tiny change | 17:18 |
| fungi | now when that merges you get a pile of unnecessary noise in the commit history for what should have been a comparatively small diff | 17:18 |
| fungi | unless you squash them, rewriting the author's commits entirely | 17:19 |
| clarkb | the built-in audio controller continues to not function and I can't get my usb dac going. I suspect that I have a bank of working ports on controller A and a bank of less working ports on controller B, but some devices seem to work with controller B | 17:22 |
| clarkb | and the dac doesn't work with any ports. | 17:23 |
| clarkb | but I suspect that the dac issues may be unrelated (or maybe whatever affected my controller hit the connected dac too) | 17:24 |
| fungi | possible some usb-connected device is throwing noise onto the bus? is dmesg reporting any usb resets? | 17:29 |
| fungi | or could something be drawing too much current? | 17:30 |
| clarkb | yes on the bad port only | 17:30 |
| clarkb | I got it working with some combo of other ports, then disconnected to tidy up cables, and now it doesn't work again. So something is definitely breaking in a weird way. I may just put it down for today and figure out how to use my phone as an av device | 17:30 |
| clarkb | ya ok I think if I just avoid that one bad port things eventually settle into a happier steady state | 17:37 |
| clarkb | not ideal but I can make this work for now | 17:37 |
| clarkb | corvus: were you able to track down that arm64 node in osuosl that was not getting cleaned up after being put in a used state? | 17:54 |
| corvus | clarkb: yes! https://review.opendev.org/966500 should fix the immediate issue with that node | 17:57 |
| corvus | then every change after it in that stack is making sure it doesn't happen again (and ultimately prompted the zk upgrade conversation) | 17:57 |
| clarkb | awesome that is now on my review list | 17:57 |
| corvus | it probably makes sense to merge that today and do a launcher restart with just that | 17:58 |
| clarkb | ack I should be able to review after all my meetings are done ~noon today | 17:58 |
| corvus | ++. should be a short review. :) | 17:58 |
| mnaser | corvus: i was thinking of slowly adding patches to my zuul-web stack to bump up things a few components at a time (aka redux... then react... then patternfly... etc) -- would that duplicate something that's on your list already since you mentioned it yesterday? | 18:01 |
| mnaser | Sorry. Thought this was the Zuul channel. | 18:03 |
| corvus | mnaser: yes, i've done enough work in that direction already that i know that eventually we hit a deadlock that's going to require untangling a lot, including removing CRA. i think it's going to be easier to just do it all at once, and yes, i currently have that penciled in for early december on my calendar. | 18:03 |
| opendevreview | Goutham Pacha Ravi proposed openstack/project-config master: Set noop job for the governance-sigs repository https://review.opendev.org/c/openstack/project-config/+/966755 | 18:43 |
| tonyb | clarkb: anytime from 1900 UTC works for me for the Gerrit update. that's middle of the day(ish) for the US though so maybe a little later? | 19:40 |
| clarkb | tonyb: ya what about 2100 or 2200 UTC? that's 1pm or 2pm pacific so not quite end of day (which is my preference) while still minimizing impact on others | 19:41 |
| clarkb | tonyb: I think we can expect that gerrit shutdown will time out trying to process the h2 databases (a 5 minute timeout). Then startup will be "cold" with caches needing to be rebuilt | 19:41 |
| tonyb | works for me | 19:41 |
| clarkb | so it may take around 15 minutes or so from start to finish | 19:41 |
| tonyb | okay. Any of that time window works for me. so pick based on your preference (lunch/bike ride/weather etc) | 19:44 |
| tonyb | I'll be around | 19:44 |
| clarkb | tonyb: I think we can get started at 2100 which should have the actual restart happening by about 2130 | 19:44 |
| * | clarkb makes notes now to not forget | 19:44 |
| tonyb | sounds good | 19:46 |
| opendevreview | Merged opendev/zuul-providers master: Use mirror for trixie image build job https://review.opendev.org/c/opendev/zuul-providers/+/966615 | 19:51 |
| clarkb | fungi: I ran a tail on the apache log for lists and don't see anything that stands out as particularly bad. Just the expected crawling behaviors | 20:05 |
| clarkb | I need to eat lunch now, but could it be that the issue is unrelated to normal traffic in that case? | 20:05 |
| opendevreview | Merged opendev/system-config master: Fix test_yamlgroup https://review.opendev.org/c/opendev/system-config/+/966639 | 20:07 |
| fungi | yeah, without knowing what the database is spending so much of its time on, it's hard to say | 20:09 |
| clarkb | after lunch I looked at the apache logs again and I think we can identify a couple of potentially problematic sets of requests | 21:36 |
| clarkb | not positive they are the cause of the load issues, but they follow patterns we've seen in the past that have caused problems | 21:37 |
| clarkb | I'm going to take advantage of the current weather situation and go outside for a bit. Back later and can catch up on lists or anything else that may come up then | 21:57 |
| corvus | clarkb: the cleanup change landed, so i'll restart the launchers and verify the node gets deleted | 22:01 |
| corvus | there are now 0 "used" nodes | 22:06 |
| corvus | Ramereth[m]: that instance you asked about on friday is deleted now (and the zuul bug is fixed). thanks for letting us know. | 22:07 |
| corvus | i have also cleaned up the leaked image upload records (that was a bug from a couple of weeks ago that should be fixed now) | 22:09 |
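
For the ZooKeeper rolling-upgrade order discussed at 16:13 and 16:23 (upgrade each follower first, then the leader last), a minimal sketch of how the order could be derived by asking each ensemble member for its role. The hostnames are placeholders, the `srvr` four-letter command is assumed to be allowed by `4lw.commands.whitelist`, and this is not the ansible playbook mentioned above, just an illustration of the check.

```python
#!/usr/bin/env python3
"""Print ZooKeeper ensemble members in rolling-upgrade order
(followers first, leader last) using the "srvr" four-letter command."""

import socket

# Hypothetical member list -- substitute the real ensemble hosts.
MEMBERS = ["zk01.example.org", "zk02.example.org", "zk03.example.org"]
CLIENT_PORT = 2181


def zk_mode(host, port=CLIENT_PORT, timeout=5):
    """Return the mode ("leader", "follower", ...) reported by one node."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"srvr\n")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:
                break
            chunks.append(data)
    reply = b"".join(chunks).decode(errors="replace")
    for line in reply.splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return "unknown"


def upgrade_order(members):
    """Sort members so any node reporting itself as leader comes last."""
    roles = {host: zk_mode(host) for host in members}
    return sorted(members, key=lambda h: roles[h] == "leader"), roles


if __name__ == "__main__":
    order, roles = upgrade_order(MEMBERS)
    for host in order:
        print(f"{host}: {roles[host]}")
```

Running this before each step of a manual upgrade confirms the leader has not moved since the last restart.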
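For the dmesg check fungi suggests at 17:29, a small sketch that scans the kernel ring buffer for USB resets, errors, and over-current events. It assumes a `dmesg` binary on PATH and permission to read the ring buffer (root, or `kernel.dmesg_restrict=0`).

```python
import re
import subprocess

# Read the kernel ring buffer and print USB-related trouble lines
# (resets, errors, over-current, disconnects).
log = subprocess.run(
    ["dmesg"], capture_output=True, text=True, check=True
).stdout

pattern = re.compile(r"usb.*(reset|error|over-current|disconnect)",
                     re.IGNORECASE)

for line in log.splitlines():
    if pattern.search(line):
        print(line)
```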
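For the apache log triage discussed at 20:05 and 21:36, a rough sketch that counts the busiest client addresses and user agents in a combined-format access log, which is one way to surface crawler-style request patterns. The log path is hypothetical; the real vhost log name on the lists server may differ, and this says nothing about whether those requests actually caused the load.

```python
"""Count top client IPs and user agents in an Apache combined-format log."""

import re
from collections import Counter

LOG_PATH = "/var/log/apache2/lists-access.log"  # hypothetical path

# Minimal combined-log pattern: capture the client IP and the user agent.
LINE_RE = re.compile(
    r'^(\S+) \S+ \S+ \[[^\]]+\] "[^"]*" \d+ \S+ "[^"]*" "([^"]*)"'
)

ips, agents = Counter(), Counter()
with open(LOG_PATH, errors="replace") as fh:
    for line in fh:
        match = LINE_RE.match(line)
        if match:
            ips[match.group(1)] += 1
            agents[match.group(2)] += 1

print("Top client addresses:")
for ip, count in ips.most_common(10):
    print(f"{count:8d}  {ip}")

print("\nTop user agents:")
for agent, count in agents.most_common(10):
    print(f"{count:8d}  {agent[:80]}")
```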