Wednesday, 2024-09-11

cardoeSorry. Kid afterschool stuff pulled me away.01:20
fungidon't apologize, we operate very asynchronously02:00
fungii'm going ahead and approving 926078 so that it will hopefully be deploying early enough in the day that we can get confirmation of production functionality out of the way13:32
fungithis is taking longer than anticipated... it's been an hour and system-config-run-etherpad is only just now getting a node request14:28
fungiit has finally started running14:34
fungiestimated 15 minutes to merge so deploy should hopefully beat the hourly jobs14:35
opendevreviewMerged opendev/system-config master: Update etherpad to 2.2.4  https://review.opendev.org/c/opendev/system-config/+/926078 14:52
clarkbI have arrived just in time14:53
fungiinfra-prod-service-etherpad is running already14:55
clarkbya the container has restarted too14:56
clarkbit says it is healthy14:57
fungiloading existing pads seems to be fine, with headings too14:57
clarkbsame for me14:57
clarkbI'm going to try loading them in an incognito tab just to rule out caching making things work14:57
clarkbthat still seems happy so this is looking good14:58
fungiand on that note, i need to disappear for some quick lunch/errands, but can help test meetpad when i get back in an hour or so14:58
clarkbthanks15:00
clarkbfwiw etherpad introduced a new local db type in addition to its "dirtydb" local type in 2.2.3. But we override that explicitly in settings to use mysql/mariadb so should be unaffected15:00
clarkbthe fact that we can still load old dbs implies we haven't gotten the db settings wrong15:00
clarkb*still load old etherpads implies we are using the old db15:01
corvusclarkb: have a test pad?15:02
clarkbcorvus: https://etherpad.opendev.org/p/gerrit-upgrade-3.7 is the one I was looking at since it has a bunch of headers in it but maybe best to not edit that one15:03
clarkbcorvus: https://etherpad.opendev.org/p/isitbroken is one that we can type in15:05
clarkbthat seems to be working for me between firefox and chrome15:08
corvuslgtm too15:08
clarkbwe did break meetpad fwiw. And it is almost certainly related to the rewrite rule adjustments15:54
clarkbI'm guessing we need to add similar pass throughs for these things in the meetpad nginx config15:54
clarkbhrm though I see a successful response for padbootstrap-*.min.js in my browser debug tooling so maybe that isn't the issue15:56
clarkbwould be annoying if the 2.2.2 updates for code loading are just incompatible currently15:56
clarkbthe console log shows the error occurs in padBootstrap trying to read the skinName property of something undefined15:57
clarkbfirefox has a slightly different error and says parent.parent.clientVars is undefined16:06
corvusclarkb: fungi i feel like maybe we didn't sufficiently discuss "system-config" as a potential home for the image build jobs... pros: it's an untrusted repo that deals with how opendev runs services; cons: it's a big repo that's *mostly* focused on ansible-based system operations.16:07
corvusbut i made my original test change in that repo; so that's a rough sketch of what it could look like: https://review.opendev.org/848792 16:10
corvuswould probably call that top level directory something like "zuul-images" instead of "nodepool"16:10
clarkbya I guess that would work. Might be a little awkward due to being in the openstack tenant and having other responsibilities if we want to load the configs into other tenants later (so they get access to the images?)16:10
corvusoh ha i forgot about that16:11
corvusi have a strong preference for doing this in the opendev tenant, so i'll stick with the new-repo plan :)16:12
fungiokay, back16:13
opendevreviewJames E. Blair proposed openstack/project-config master: Add opendev/zuul-jobs repo  https://review.opendev.org/c/openstack/project-config/+/928945 16:14
opendevreviewJames E. Blair proposed openstack/project-config master: Add opendev/zuul-jobs to Zuul  https://review.opendev.org/c/openstack/project-config/+/928946 16:14
fungiand yeah, i would have made the same point about opendev/system-config still being in the openstack tenant16:14
corvus(my strong preference comes from the new pipelines we're going to be experimenting with; we may end up adding them to openstack eventually, but that's not certain now, so i'd like them to be in opendev)16:15
corvuswe can always move em later16:15
clarkbhttps://opendev.org/opendev/system-config/src/branch/master/playbooks/roles/jitsi-meet/files/meet.conf#L104-L117 this is the code block for proxying /etherpad/ to https://etherpad.opendev.org in meetpad's nginx conf16:16
clarkband looking in developer tooling there is no obvious failure to load files16:16
clarkbbut there does seem to be missing content (particularly settings I guess?) in the running js16:17
clarkbbut things like locales.json are being loaded just fine16:17
clarkbit's almost like it thinks it has loaded everything then it starts to run the js and explodes. But all of the previous requests 200 and appear to return the rough type of data that I would expect. I guess I can compare this set of requests to what a normal etherpad with jitsi meet looks like16:21
fungithanks, i was about to ask if anyone was already working on the path changes for meetpad16:23
clarkbfungi: well I'm not sure yet that the path changes are needed since we basically send everything under /etherpad/ to the etherpad server16:24
clarkbcomparing network traces between etherpad alone and meetpad I see etherpad alone requests and receives a manifest.json but I don't see that in meetpad at all16:24
fungipossible something in etherpad itself has regressed to using a full/explicit path instead of relative16:25
clarkblooks like meetpad does serve a manifest.json for jitsimeet. etherpad's manifest.json is actually html.... so maybe those two things are getting confused?16:26
fungioh, one shadowing the other?16:27
clarkbhowever I don't see a request to /etherpad/p/manifest.json at all under meetpad but etherpad alone does request /p/manifest.json16:27
clarkbok here's something that may actually be useful. If I open the wss messages pane for our websocket comms under etherpad alone i see messages for client vars and skinname16:28
clarkbso this is probably a websocket issue if we're not getting that data back under meetpad?16:28
clarkbhrm meetpad shows those messages too though16:29
clarkbaccording to the console in chrome the initial failure occurs during init, but then there are subsequent errors of the same "TypeError: Cannot read properties of undefined (reading 'skinName')" in handle message from server16:35
clarkbso ya I think this must be some sort of websocket read failure16:35
clarkbmaybe not of the socket itself but at least of the data coming out of the socket16:36
opendevreviewClark Boylan proposed opendev/system-config master: Don't redirect etherpad manifest.json  https://review.opendev.org/c/opendev/system-config/+/928951 16:41
clarkbat the very least I think ^ should be fixed16:41
clarkbI don't know if that will fix this issue though16:41
clarkbbut I'm fairly stumped otherwise without a better understanding of etherpad js and how everything is tied together with the data coming over the websocket16:42
fungiyeah, let's see what happens after that deploys16:43
fungihashar: i'm curious, any idea if the machine learning work mentioned in https://x.com/chrisalbon/status/1833873528757903434 is running on openstack?16:43
hasharoh16:44
fungiif so, i know people who would love to help you promote that16:44
hasharwell that is how I learn our ML team is doubling in size :]16:44
fungihah! okay16:45
hasharI think they run on a dedicated Kubernetes cluster built on top of ML dedicated baremetal16:45
hasharI imagine cause they need some GPUs16:45
fungifair enough!16:45
fungiopenstack does have gpu management capabilities, but i can understand just wanting something dedicated16:45
hasharthey are on Libera.chat in #wikimedia-ml  if you want to ask. Chris Albon is the team director16:46
fungithanks!16:46
hasharwe don't use OpenStack in production. It was baremetal and Ganeti for VMs16:47
hasharwe now have some kubernetes cluster which afaik is built on top of baremetal16:47
hasharbut we do have a cloud offering16:47
fungineat, thanks for the details16:47
fungias one of the directors for spi, i similarly like to see when spi associated projects are relied on in production ;)16:50
fungiit's not often i come across anyone mentioning ganeti these days16:51
hasharI think the idea was that for our production use case, OpenStack was largely overkill16:52
fungiparticularly if you didn't need multi-tenancy, yes16:52
hasharbut the cloud we offer to our volunteers is powered by openstack with some kubernetes built on top of it16:53
fungiawesome!16:53
hasharhttps://wikitech.wikimedia.org/wiki/Portal:Cloud_VPS 16:53
hasharso yeah there is a whole ecosystem built on top of OpenStack :]16:54
fungii'll pass that along. i saw someone asking about the wmf ai/ml activity mostly because there's been a marketing push lately about the various organizations who are doing that sort of stuff on openstack, but i'm always happy to know about any uses of it really16:56
hasharwell they can reach out to Chris Albon directly, he will know for sure16:56
hasharbetter reach out to the canonical source than relying on my hearsay16:57
hasharafaik the very first thing he started when joining was to create a Kubernetes cluster.16:57
fungiyeah, i mostly just abuse your presence here because i like to have opportunities to talk to you ;)16:58
* hashar blushes16:59
hasharelse for our cloud offer the team has communication channels at https://wikitech.wikimedia.org/wiki/Help:Cloud_Services_introduction#Communication_and_support 16:59
hasharI am not sure who would be the best person to talk to17:00
hasharbut for sure bd808 (here in this channel) knows all about it since AFAIK he did a lot of product management / definition17:00
fungithanks!17:01
hasharand please ping me anytime, I am happy to chat :-]17:01
fungisame, of course ;)17:02
JayFhashar: if there's ever anything we can do to help you all get those BM pieces running on Ironic (should you want to move your BM into OpenStack, too), feel free to reach out. (OpenStack+)Ironic with k8s on top is an extremely common setup.17:06
hasharJayF: I don't know much about how SRE manages our machines, but we have a couple decades of experience doing that and I imagine they would be reluctant to add an abstraction layer17:09
hasharthey picked up Ganeti cause it was good enough for our use case17:10
JayFIronic was mostly written by SRE-types for SRE-types fwiw; not trying to sell you a thing just telling you we're here and actively willing to help if you ever wanna :)17:10
hasharsure thing :]17:11
JayFwe always say you can come chat with us just if you wanna complain about horrible hardware with us :D 17:11
hasharthen I am not involved in the SRE team or the baremetal layers, so I can't tell much about it17:11
opendevreviewMerged opendev/system-config master: Don't redirect etherpad manifest.json  https://review.opendev.org/c/opendev/system-config/+/928951 17:31
clarkbas expected ^ didn't fix things but did fix serving of that file17:43
clarkbthis is fun the chat messaging functionality works but not the pad content17:45
clarkbchat messaging uses websocket traffic too so the websocket is working at least minimally (which seemed previously confirmed by the fact that the messages are in the developer tool log)17:45
opendevreviewJames E. Blair proposed openstack/project-config master: Add opendev/zuul-jobs repo  https://review.opendev.org/c/openstack/project-config/+/928945 17:48
opendevreviewJames E. Blair proposed openstack/project-config master: Add opendev/zuul-jobs to Zuul  https://review.opendev.org/c/openstack/project-config/+/928946 17:49
clarkblooks like initializing the editorcontainer is where things break so that may explain why chat works but not the editable pad area18:00
clarkband then subsequent message handling fails because the editorcontainer isn't initialized?18:00
clarkbhttps://github.com/ether/etherpad-lite/blob/v2.2.4/src/static/js/ace2_inner.ts#L242-L243 I suspect this is the code that is exploding though am not 100% positive of that (I'm inferring it based on the two different errors from ff and chrome involving parent.parent.clientVars and skinName)18:06
clarkbthat code is new from august 1718:06
clarkboh yup I think I figured out how to confirm that and yes that appears to be the line18:08
clarkbthe system knows what the skin name is because it is requesting the no-skin/pad.css file18:17
clarkbhttps://github.com/ether/etherpad-lite/issues/6618 this is it18:31
clarkbunfortunately instead of linking to the fix on the develop branch they've just suggested we deploy the develop branch....18:31
clarkbhttps://github.com/ether/etherpad-lite/commit/a61f634586017dcadffd859820b66cd5916cef3a 18:32
clarkbthe issue is that `parent` is the meetpad window I think and so they move to `window` to select the inner iframe?18:32
clarkbdo we want to cherry pick that fix on top of v2.2.4 and see if it works, or deploy develop, or just wait for the next release?18:33
clarkbI'll stew on that for a bit while I do other things18:34
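The `parent` versus `window` guess above can be illustrated with a toy model. This is only a sketch under assumptions: the `Window` class, the frame names, and the nesting depth are hypothetical and do not reflect Etherpad's real frame layout or code. The one browser behaviour it borrows is that a top-level window's `parent` is the window itself, which is why a `parent.parent` lookup can work standalone and then land on the wrong window once the pad is embedded one level deeper.

```python
class Window:
    """Toy stand-in for a browser window or iframe (not a real DOM API)."""

    def __init__(self, name, parent=None, client_vars=None):
        self.name = name
        # In browsers, a top-level window's `parent` refers to the window itself.
        self.parent = parent if parent is not None else self
        self.client_vars = client_vars


def skin_via_parent_parent(frame):
    """Mimic a `parent.parent.clientVars.skinName` lookup from inside `frame`."""
    owner = frame.parent.parent
    if owner.client_vars is None:
        raise TypeError(f"{owner.name}.clientVars is undefined")
    return owner.client_vars["skinName"]


# Standalone: the pad page is the top-level window (hypothetical layout).
pad = Window("pad", client_vars={"skinName": "no-skin"})
editor = Window("editor", parent=pad)
standalone_skin = skin_via_parent_parent(editor)

# Embedded: a hypothetical meetpad page wraps the same pad in one more frame.
meetpad = Window("meetpad")
embedded_pad = Window("pad", parent=meetpad, client_vars={"skinName": "no-skin"})
embedded_editor = Window("editor", parent=embedded_pad)
try:
    skin_via_parent_parent(embedded_editor)
    embedded_error = None
except TypeError as exc:
    embedded_error = str(exc)

print(standalone_skin)
print(embedded_error)
```

In this model the standalone lookup resolves to the pad page, while the embedded lookup walks one level too far and reaches a window with no `clientVars`, which is the shape of the error Firefox reported. A lookup that does not depend on a fixed number of `parent` hops would avoid that.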
fungii guess without knowing when the next release will be, that might be "roll back to 2.2.1 and restore the database backup... then wait"?18:49
clarkbfungi: I suspect we may not need a db backup rollback because I don't think they've made changes to the db format. But also I think everything mostly works except for the embedded etherpad in meetpad (meetpad itself should still work for comms) so maybe we just roll forward?18:50
clarkbunrelated to everything else https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/thread/H6NWHXBZI4FKTLY6VY3GVNYNWN7XSTFH/ is an interesting email: it doesn't have content in the archive, I think because they didn't send a plain text version too?18:50
clarkbfungi: maybe we reach out on the issues and ask if we cherry pick that one commit if that is a workable compromise or if there are other commits we need too?18:51
clarkbor ask if a 2.2.5 will be happening soon18:52
clarkbthe commit log since 2.2.4 seems like a small number of bugfixes and it may make sense to do another quick bugfix release if we ask them18:52
fungii don't think it's just the lack of plaintext part in that message that results in the weird archive result, if memory serves there are some message types defaulted in clients more popular with southeast asian script users that end up with a mime part that is an encoded message object consisting of additional mime parts, and hyperkitty may have trouble decoding that correctly to pick which part18:56
fungiis the actual message text18:56
clarkbah. There are other odd things too like I think maybe I was bcc'd but also the headers are there for the mailing list so my mua knows how to reply to list but if I hit reply to all it only replies to the sender not the list and sender like it does normally18:57
clarkbdefinitely doing weird things with their client18:57
fungiit seems these anomalies originate from people who don't typically use e-mail for communication18:58
clarkbfungi: ya the more I think about this the more it makes sense to me that etherpad should make a bugfix release soon given the bugfixes on the develop branch. I think we should just respond on the issue and ask if they are open to doing that18:58
clarkbya also using screenshots to capture text logs is painful to me18:59
fungior may only use it for inter-office communication on some bespoke corporate group communications platform where compatibility with external mail systems is a half-implemented afterthought18:59
clarkbfungi: I posted https://github.com/ether/etherpad-lite/issues/6587#issuecomment-2344526946 and now time for lunch19:21
fungiyeah, fingers crossed. but at least we can run a snapshot in the interim if we want20:44
fungiheaded out to grab dinner, but will check back in later20:44
clarkbshould I prep a build of current develop or a cherry pick of that specific change?20:45
clarkbconsidering the blast radius of this is fairly small I think I'm willing to wait a bit to see if etherpad has a better answer for us before blazing ahead in any particular direction20:50
clarkbbut I'm happy to be more proactive if others disagree20:50
opendevreviewJames E. Blair proposed openstack/project-config master: Add opendev/zuul-jobs repo  https://review.opendev.org/c/openstack/project-config/+/928945 21:25
opendevreviewJames E. Blair proposed openstack/project-config master: Add opendev/zuul-jobs to Zuul  https://review.opendev.org/c/openstack/project-config/+/928946 21:25
cardoeSo might be the wrong place to ask but since OpenInfra is OIDC. Do you guys use that for Keystone and OpenStack services at all?21:26
clarkbwe don't run any of our own keystones and I'm not aware of any that integrate with the openinfra oidc server so no on that front. There was some initial movement to move login for things to that but that stalled out, then we came up with the even better idea to have keycloak act as an identity proxy so that you can login with whatever you prefer including openinfra oidc but that is21:29
clarkbcurrently stalled out on getting a proxy layer in place for openid to ubuntu one (since that is what the bulk of current users are using having that will ease the transition). tonyb is working on getting that going currently though21:29
clarkbfor example I think zanata (and maybe weblate) translation stuff authenticated with openinfra's identity server21:29
jrosseri integrate keystone with my employers OIDC (including keycloak in the middle as an identity broker)21:31
jrosserthe deployment projects have examples of this21:31
opendevreviewMerged openstack/project-config master: Add opendev/zuul-jobs repo  https://review.opendev.org/c/openstack/project-config/+/928945 22:51
cardoeGuess I should tinker with keycloak then cause I’ve been messing with dex for the proxy piece.23:11
cardoeThe docs around keystoneauth (client wise) are sparse for OIDC. The examples I’ve seen you have to have a lot of variables / data exported. I’ve got something working with vexxhost’s websso but that’s a third party plugin.23:12
opendevreviewMerged openstack/project-config master: Add opendev/zuul-jobs to Zuul  https://review.opendev.org/c/openstack/project-config/+/928946 23:17
cardoeIt’s really the client side authentication piece that seemed awkward to me.23:20
fungicardoe: you might ask for tips in #openstack-keystone too. i recall knikolla mentioning doing something similar for mass open cloud's federation, he's not been around much lately but other folks in there may have also done similar things23:54
cardoeGood point.23:58
