openstackgerrit | Merged openstack/project-config master: Wheel publish jobs: include system-config roles https://review.opendev.org/734704 | 00:03 |
openstackgerrit | Merged openstack/project-config master: Be more explicit about using python3 to run tools/ https://review.opendev.org/734393 | 00:05 |
openstackgerrit | Merged openstack/project-config master: Add Backport-Candidate label for Kolla deliverables https://review.opendev.org/733243 | 00:05 |
*** cloudnull has quit IRC | 00:23 | |
*** cloudnull has joined #opendev | 00:24 | |
*** Meiyan has joined #opendev | 01:02 | |
*** xiaolin has joined #opendev | 01:07 | |
ianw | clarkb: hrm, i think i'm hitting "If a child job inherits from a parent which defines a pre and post playbook, then the pre and post playbooks it inherits from the parent job will run only with the roles that were defined on the parent." | 01:14 |
ianw | i guess i need to add the system-config roles to https://opendev.org/openstack/openstack-zuul-jobs/src/branch/master/zuul.d/jobs.yaml#L1160 ; even though they're only used by the publish | 01:15 |
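The fix ianw describes would look roughly like this in openstack-zuul-jobs. This is a hypothetical sketch, not the real job definition (the job name, playbook paths, and role source are illustrative; the actual definitions live at the jobs.yaml link above):

```yaml
# Sketch: roles must be declared on the parent job, because pre/post
# playbooks inherited from a parent run only with the parent's roles.
- job:
    name: publish-wheel-cache-base
    pre-run: playbooks/publish-wheel-cache/pre.yaml
    post-run: playbooks/publish-wheel-cache/post.yaml
    roles:
      # added so the inherited post playbook can find system-config roles
      - zuul: opendev/system-config
    required-projects:
      - opendev/system-config
```

Adding the roles only on the child job would not help, since the parent's playbooks never see them.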
*** xiaolin has quit IRC | 01:22 | |
*** xiaolin has joined #opendev | 01:28 | |
*** xiaolin has quit IRC | 01:33 | |
*** mlavalle has quit IRC | 02:01 | |
*** xiaolin has joined #opendev | 03:16 | |
*** xiaolin has quit IRC | 03:20 | |
openstackgerrit | Ian Wienand proposed openstack/project-config master: Revert "Wheel publish jobs: include system-config roles" https://review.opendev.org/734739 | 03:40 |
*** ykarel|away is now known as ykarel | 04:11 | |
ianw | /afs/.openstack.org/mirror/wheel/debian-10-x86_64 : Connection timed out | 04:25 |
ianw | gosh darn it ... | 04:25 |
ianw | i dunno what the heck is up with those volumes http://paste.openstack.org/show/794549/ | 04:33 |
*** sgw has quit IRC | 04:34 | |
ianw | clarkb/fungi/corvus: ^ maybe you could take a bit to have a look at why these volumes appear corrupt on the executors? i'm a bit brain-dead on it now | 05:08 |
openstackgerrit | Merged openstack/project-config master: Revert "Wheel publish jobs: include system-config roles" https://review.opendev.org/734739 | 05:26 |
AJaeger | ianw: could you review https://review.opendev.org/732490 for dib - this should be fine now IMHO | 05:32 |
AJaeger | , please? | 05:32 |
*** xiaolin has joined #opendev | 06:02 | |
*** factor has quit IRC | 06:17 | |
*** factor has joined #opendev | 06:17 | |
*** Dmitrii-Sh has quit IRC | 06:17 | |
*** Dmitrii-Sh has joined #opendev | 06:18 | |
*** hashar has joined #opendev | 07:04 | |
*** xiaolin has quit IRC | 07:05 | |
*** iurygregory has quit IRC | 07:11 | |
*** xiaolin has joined #opendev | 07:17 | |
*** rpittau|afk is now known as rpittau | 07:21 | |
xiaolin | hello, opendev, we want to donate computing resources. do we need to build our own cloud to meet the minimum requirements: support for 100 concurrent VM instances, each with 8GB RAM, 8 vCPUs, and 80GB storage? | 07:24 |
*** tosky has joined #opendev | 07:29 | |
*** iurygregory has joined #opendev | 07:33 | |
frickler | xiaolin: our experience with operating a cloud ourselves hasn't been too positive, so from our side the best solution would be if you could operate a cloud yourself. if that isn't possible, we might consider some other option, but that would require some more discussion | 07:33 |
frickler | xiaolin: the size of the cloud isn't a hard limit, in particular if you are talking about mips-based resources (iirc) instead of x86 | 07:34 |
*** xiaolin has quit IRC | 07:36 | |
frickler | xiaolin: most of the team members are US based, it would be great if you could continue the discussion during their business hours. if that's too inconvenient, maybe sending a mail describing your plans would be easier | 07:36 |
frickler | see http://lists.opendev.org/cgi-bin/mailman/listinfo/service-discuss | 07:37 |
openstackgerrit | Merged opendev/irc-meetings master: Update QA office hour time https://review.opendev.org/734612 | 07:46 |
*** moppy has quit IRC | 08:01 | |
*** moppy has joined #opendev | 08:01 | |
*** ravsingh has joined #opendev | 08:24 | |
*** DSpider has joined #opendev | 08:26 | |
*** ykarel is now known as ykarel|lunch | 09:13 | |
*** hashar has quit IRC | 09:25 | |
*** xiaolin has joined #opendev | 09:45 | |
*** ysandeep is now known as ysandeep|lunch | 09:48 | |
*** xiaolin has quit IRC | 09:51 | |
*** ykarel|lunch is now known as ykarel | 10:04 | |
*** xiaolin has joined #opendev | 10:04 | |
openstackgerrit | Chandan Kumar (raukadah) proposed openstack/diskimage-builder master: [DNM] missing file /etc/pki/tls/private https://review.opendev.org/734782 | 10:11 |
*** sshnaidm|afk is now known as sshnaidm | 10:14 | |
*** rpittau is now known as rpittau|bbl | 10:19 | |
*** ysandeep|lunch is now known as ysandeep | 10:20 | |
*** xiaolin has quit IRC | 10:23 | |
openstackgerrit | Carlos Goncalves proposed zuul/zuul-jobs master: configure-mirrors: add CentOS 8 Stream https://review.opendev.org/734787 | 10:27 |
openstackgerrit | Carlos Goncalves proposed opendev/base-jobs master: Add centos-8-stream nodeset https://review.opendev.org/734788 | 10:29 |
*** Meiyan has quit IRC | 10:30 | |
openstackgerrit | Carlos Goncalves proposed openstack/project-config master: CentOS 8 Stream initial deployment https://review.opendev.org/734791 | 10:40 |
openstackgerrit | Luigi Toscano proposed openstack/project-config master: gerritbot: more notifications in the cinder channel https://review.opendev.org/734792 | 10:47 |
*** tkajinam has quit IRC | 10:53 | |
*** lpetrut has joined #opendev | 10:59 | |
mordred | frickler: it occurs to me - one of the options we might want to consider (and consider putting in that document) - is that if someone wants to donate compute resources but is not already a cloud operator, it might be easier for them to ship some computers to one of our existing cloud providers | 11:22 |
frickler | mordred: I've been thinking that too, but gathered we should ask the affected operators first, donnyd and mnaser would first come to my mind | 11:24 |
frickler | there are likely also complications like how to handle hardware replacements and how the hardware integrates into the existing environment. if I were a cloud operator, I'd rather get a financial donation and order the same hardware I use everywhere else | 11:28 |
mordred | frickler: yeah - donnyd and mnaser are who I was thinking about | 11:29 |
mordred | frickler: and yes to financial - unless there is some sort of specific hardware requirement, such as mips | 11:30 |
openstackgerrit | Carlos Goncalves proposed openstack/diskimage-builder master: Add support for CentOS 8 Stream https://review.opendev.org/734083 | 11:35 |
fungi | agreed, often the folks interested in donating very specific hardware have ties to the manufacturer of said hardware, and so can provide it at much lower overall cost than normal market prices | 11:41 |
fungi | so an actual hardware donation could go a lot further than a purely financial one | 11:42 |
mordred | yeah - without needing to learn how to be a cloud operator | 12:02 |
mordred | just for what it's worth - I am beset upon by kittens atm. one has decided my trackpad is a pillow and the other has decided my forearm is a perch. so -- computering is currently under duress | 12:03 |
fungi | yes, christine showed me a photo. remember, they *are* carnivores and *you* are made of meat | 12:04 |
mordred | yes, this is rather true | 12:04 |
fungi | they could just be trying to lull their breakfast into a false sense of security | 12:04 |
mordred | I am 100% certain they only see me as breakfast | 12:05 |
*** rpittau|bbl is now known as rpittau | 12:05 | |
mordred | I think they are employing the correct tactics | 12:05 |
fungi | one of ours still hasn't stopped trying to eat us, after four years of abject failure | 12:06 |
mordred | one day success may be forthcoming | 12:06 |
fungi | he'll eventually wear us down, yep | 12:07 |
*** ravsingh has quit IRC | 12:20 | |
*** olaph has quit IRC | 12:28 | |
fungi | so, looking at the mirror volumes ianw mentioned, it appears all the recently-created wheel volumes in afs are showing issues while the older 3 are behaving normally | 12:31 |
fungi | i see the same behavior locally from my workstation even | 12:31 |
AJaeger | ttx, config-core, looking at https://review.opendev.org/#/c/734640/ - I think we want to keep the official-openstack-repo-jobs for this step of retirement so that the repo gets emptied and is visible on github. Do you agree? ttx, or are you doing something in github so that this is not needed? | 12:33 |
fungi | AJaeger: i believe retired repos have been getting deleted from github | 12:34 |
AJaeger | fungi: ok, in that case we don't need it. | 12:35 |
fungi | they'll still be available on opendev, but the github mirror for openstack is now only active/maintained projects | 12:35 |
AJaeger | https://github.com/openstack/syntribos redirects to https://github.com/openstack-archive/syntribos - without the final change to delete the repo | 12:36 |
fungi | ahh, yeah, i guess they've been getting transferred to the openstack-archive org rather than just deleted | 12:36 |
fungi | maybe ttx knows how the repos in that archive org should look | 12:37 |
AJaeger | So, it's not available anymore in /openstack/ - but is missing the final deletion change. So, ttx, what's your preference? Continue like with syntribos or do you want the final change as well? | 12:37 |
AJaeger | thanks fungi - let's wait for ttx | 12:38 |
openstackgerrit | Sagi Shnaidman proposed zuul/zuul-jobs master: Add jobs for testing ensure-ansible https://review.opendev.org/734584 | 12:40 |
fungi | ianw: clarkb: corvus: i'm mildly suspicious of afs01.ord.o.o, dmesg has entries from friday which look like a xen domu suspend/restore | 12:40 |
fungi | still trying to work out where the actual afs timeouts are coming from though, not sure if that server is involved | 12:40 |
fungi | yeah, nevermind, vos listvol says it doesn't host copies of those volumes anyway | 12:41 |
fungi | and afs02.dfw.o.o is probably not the source of the problem as it's only hosting readonly replicas while the timeouts are for interactions with the rw volumes | 12:42 |
*** hashar has joined #opendev | 12:46 | |
fungi | for some reason `fs checkservers -cell openstack.org` doesn't seem to work the way the manpage implies (it tells me the local machine is unavailable) | 12:50 |
fungi | this is weird... for some reason all the problem volumes have an extra "server afs01.dfw.openstack.org partition /vicepa RO Site" listed by `vos listvldb` | 12:57 |
fungi | (so one rw replica on afs01.dfw and two ro replicas on afs01.dfw as well as one ro replica on afs02.dfw) | 12:58 |
*** rajinir has quit IRC | 13:01 | |
fungi | no, wait! that's because i was querying from afs01.dfw | 13:03 |
*** sgw has joined #opendev | 13:03 | |
fungi | if i query from my workstation it lists an rw and ro site *on* my workstation?!? | 13:04 |
*** rajinir has joined #opendev | 13:04 | |
fungi | now the fs checkservers error is starting to make sense | 13:04 |
fungi | this is _bizarre_ | 13:04 |
fungi | http://paste.openstack.org/show/794566/ | 13:05 |
openstackgerrit | Hervé Beraud proposed openstack/project-config master: gerritbot: more notifications in the oslo channel https://review.opendev.org/734827 | 13:05 |
openstackgerrit | Sagi Shnaidman proposed zuul/zuul-jobs master: Add jobs for testing ensure-ansible https://review.opendev.org/734584 | 13:05 |
*** ravsingh has joined #opendev | 13:07 | |
fungi | similarly if i run vos listvldb on the mirror-update instance, it reports "server mirror-update01.opendev.org partition /vicepa RW Site" and similar for RO | 13:08 |
fungi | so something in the record is saying that the rw volume and one ro replica are on the client's local system | 13:09 |
*** olaph has joined #opendev | 13:09 | |
ttx | looking | 13:15 |
AJaeger | ttx, compare https://opendev.org/openstack/syntribos and https://github.com/openstack-archive/syntribos | 13:16 |
AJaeger | github missed the last change that we have in opendev. Is that correct or should those be the same? | 13:17 |
ttx | AJaeger: in openstack-archive we have both forms. Some have an archiving commit and some do not. | 13:17 |
ttx | I think I have a slight preference for those that have the archiving commit | 13:17 |
AJaeger | ttx, ok - then we need to keep the official-openstack-repo-jobs until the last minute - I'll know what to do. | 13:18 |
AJaeger | thanks, ttx | 13:18 |
ttx | as it may not be super-obvious this is archived content otherwise | 13:18 |
ttx | (you have to notice the change in org name) | 13:18 |
ttx | AJaeger: if that's not too much of a hassle, i think that's better yes | 13:18 |
AJaeger | ttx, it's no problem | 13:20 |
AJaeger | ttx, https://review.opendev.org/734835 updates the docs | 13:24 |
openstackgerrit | Emilien Macchi proposed openstack/project-config master: Retire Paunch https://review.opendev.org/734640 | 13:27 |
*** ykarel is now known as ykarel|afk | 13:31 | |
fungi | i need to disappear for a grocery pickup appointment, but can resume banging my head against these weird afs volumes in a bit | 13:46 |
openstackgerrit | Emilien Macchi proposed openstack/project-config master: Retire Paunch https://review.opendev.org/734640 | 13:59 |
*** ykarel|afk is now known as ykarel | 14:02 | |
openstackgerrit | Sagi Shnaidman proposed zuul/zuul-jobs master: Add jobs for testing ensure-ansible https://review.opendev.org/734584 | 14:04 |
mordred | infra-root: I've got to run an errand and will be out for a couple of hours. | 14:12 |
*** sshnaidm is now known as sshnaidm|bbl | 14:23 | |
*** hashar has quit IRC | 14:39 | |
openstackgerrit | Oleksandr Kozachenko proposed openstack/project-config master: Add magnum and magnum-tempest-plugin in required-projects https://review.opendev.org/734863 | 14:43 |
*** mlavalle has joined #opendev | 14:47 | |
clarkb | fungi: ianw: I wonder if that is an afs version mismatch problem? Like perhaps using your local openafs system to talk to older fileservers to create the volumes is a problem (this assumes we didn't create the volumes on the servers themselves) | 14:58 |
*** priteau has joined #opendev | 14:58 | |
clarkb | maybe we should try to create a new test volume on afs01 and see if it exhibits the same behavior? | 14:58 |
*** ykarel is now known as ykarel|away | 15:01 | |
*** lpetrut has quit IRC | 15:09 | |
openstackgerrit | Drew Walters proposed openstack/project-config master: Add missing project to Airship doc job https://review.opendev.org/734874 | 15:14 |
fungi | clarkb: i saw the same behavior on mirror-update01.openstack.org which is xenial, same as afs01.dfw, and also saw it when running locally *on* afs01.dfw | 15:18 |
clarkb | fungi: right but where did we create the volumes? | 15:18 |
clarkb | I'm just wondering if it could be a creation problem with new openafs talking to old openafs to create the volume | 15:19 |
fungi | ahh, so an issue on creation... maybe? i created some from my workstation, but i think ianw may have needed to delete and recreate them later (now i don't remember why, i'll look up the earlier discussions) | 15:20 |
clarkb | I think it was a similar situation with extra volume replicas? | 15:20 |
clarkb | but I'm not 100% sure on that | 15:20 |
fungi | i'll try to find that earlier conversation after lunch | 15:20 |
*** ysandeep is now known as ysandeep|away | 15:30 | |
clarkb | fungi: has https://review.opendev.org/#/c/729029/1 been tested to check that the ep_headings plugin continues to work? | 15:33 |
*** olaph has quit IRC | 15:33 | |
clarkb | fungi: if not I think what we can do is push a followup that forces the system-config-run etherpad job to fail, add a hold for that node then use /etc/hosts to talk to the test node as if it were production and check it? | 15:33 |
*** yoctozepto has quit IRC | 15:34 | |
*** yoctozepto has joined #opendev | 15:35 | |
fungi | i have not tested it, no | 15:42 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Test etherpad with testinfra https://review.opendev.org/734880 | 15:50 |
clarkb | k, I've put a hold on ^ and when we are done checking things that way we can drop the assert False and have a bit more checking done automatically too | 15:51 |
*** rpittau is now known as rpittau|afk | 16:07 | |
openstackgerrit | Merged openstack/diskimage-builder master: Fix yumdownloader cache dir https://review.opendev.org/698788 | 16:24 |
openstackgerrit | Carlos Goncalves proposed openstack/diskimage-builder master: Add support for CentOS 8 Stream https://review.opendev.org/734083 | 16:27 |
*** ravsingh has quit IRC | 16:36 | |
clarkb | fungi: put 213.32.76.138 in /etc/hosts as etherpad.opendev.org and load https://etherpad.opendev.org/p/clarkb-test - it works but it's not perfect | 16:41 |
fungi | mmm | 16:42 |
clarkb | fungi: I think next step may be disabling ep_headings and checking if it renders properly? | 16:42 |
fungi | probably. it's just one line in the config | 16:43 |
clarkb | trying to figure out how to do that on the running instance | 16:45 |
clarkb | we bake it into the image, might be easiest to just push another update without that in the image | 16:48 |
clarkb | before I do that I'll try rebuilding the image on the test node and restart with docker-compose | 16:49 |
clarkb | heh, I've just realized the ep_headings thing is the only thing we change, so I can just switch to the upstream image to test this | 16:53 |
clarkb | I'll do that if the image rebuild fails for some reason | 16:53 |
clarkb | fungi: ya, without ep_headings it's a bit better. What I notice though is that we're running with the new skin, not the old ui | 16:54 |
clarkb | which may be related | 16:54 |
openstackgerrit | James E. Blair proposed zuul/zuul-jobs master: Allow upload-docker-image role to be used outside of promote https://review.opendev.org/734890 | 16:56 |
openstackgerrit | Oleksandr Kozachenko proposed openstack/project-config master: Add openstack/magnum and openstack/magnum-tempest-plugin in required-projects https://review.opendev.org/734863 | 17:00 |
*** priteau has quit IRC | 17:01 | |
clarkb | fungi: ok test it now. I think the problem is 1.8.3 switched to colibris skin by default | 17:06 |
clarkb | even though the 1.7.x series docs said this wouldn't happen until etherpad 2.0 | 17:06 |
clarkb | forcing skinName to no-skin in the settings seems to fix this | 17:06 |
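The skinName override clarkb describes is a one-line setting in Etherpad's settings.json. A minimal sketch (the surrounding keys are illustrative; only skinName is the actual change being discussed):

```json
{
  "title": "Etherpad",
  "skinName": "no-skin"
}
```

With skinName unset, Etherpad 1.8.3+ defaults to the colibris skin, which is what the test node exhibited.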
clarkb | I'll get a change up that does that and we can recapture the host and double check it is happy with config management (and not my manual fiddling) | 17:07 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Upgrade Etherpad to 1.8.4 https://review.opendev.org/729029 | 17:10 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Test etherpad with testinfra https://review.opendev.org/734880 | 17:10 |
clarkb | I'm putting a hold on that again | 17:10 |
clarkb | fwiw using the test node as a test system seems to be working reasonably well. And better yet I can recycle them easily | 17:11 |
clarkb | *test node as dev system | 17:13 |
fungi | clarkb: we set the "noTheme" (or whatever it's called) skin explicitly in our configs, so should override the colibris default | 17:14 |
clarkb | fungi: I couldn't find that fwiw | 17:15 |
clarkb | fungi: it wasn't until 1.8.3 that the default changed from no-skin to colibris though | 17:15 |
clarkb | which is why I think we were fine (I looked for a place we were explicitly overriding it and couldn't find it so I added it to your change) | 17:15 |
*** hashar has joined #opendev | 17:16 | |
fungi | it used to be in our config, i even removed the line so we could test the colibris default on etherpad-dev | 17:16 |
fungi | i wonder if we lost that when we containerized | 17:16 |
clarkb | fungi: that could be | 17:17 |
fungi | though yes, you're right that was when i was testing a commit from the devel branch while they were working toward 1.8.3 | 17:17 |
fungi | but it was explicitly set to no theme in our config at that point | 17:18 |
fungi | which prevented us from following the default change to colibris | 17:18 |
clarkb | I personally don't care for colibris, it feels more like an official document than an ethereal note pad | 17:19 |
clarkb | but I think we should update to colibris as a separate step if we want to go that route | 17:19 |
clarkb | (for anyone wondering why not jump to colibris) | 17:19 |
fungi | yes, i mean, we tried it out on the old review-dev and none of us seemed impressed with it | 17:23 |
fungi | it's trying too hard to be google docs i think | 17:23 |
fungi | i don't want a collaborative word processor, i want a collaborative text editor | 17:24 |
fungi | it's possible folks who spend a lot of time in wysiwyg environments prefer the word processor feel, while folks who spend a lot of time in terminal emulators and shell environments prefer the text editor feel | 17:25 |
clarkb | that could be | 17:25 |
clarkb | it is rare that i start soffice | 17:26 |
clarkb | (and yes libreoffice still installs that binary) | 17:26 |
fungi | i do everything i can to avoid starting a word processor | 17:26 |
fungi | including command-line converters which turn word processing documents into plain text | 17:26 |
fungi | (antiword, for example) | 17:27 |
fungi | i suppose giving users a display toggle to switch between themes individually/locally would allow for the best of both worlds | 17:27 |
fungi | but it doesn't appear that they've designed for such a case | 17:28 |
clarkb | fungi: ya we'd need headings and other potential plugins to work with both skins in that case | 17:28 |
clarkb | the previous test instance that I fixed manually continues to lgtm | 17:31 |
clarkb | if the new test instance which should be up in about 20 minutes or so looks good to others then I think we can go ahead and land the upgrade | 17:31 |
fungi | awesome, thanks for picking that up! | 17:38 |
fungi | i've been buried under ml discussions and reviews | 17:38 |
*** sshnaidm|bbl is now known as sshnaidm | 17:40 | |
clarkb | well I wanted to approve it and then realized I should double check it was ready :) | 17:49 |
openstackgerrit | Emilien Macchi proposed openstack/project-config master: Deprecate Paunch https://review.opendev.org/734640 | 17:49 |
*** hashar has quit IRC | 17:57 | |
fungi | clarkb: ianw: back to the afs volumes, this actually looks like the same thing we saw previously with those volumes: http://eavesdrop.openstack.org/irclogs/%23opendev/%23opendev.2020-05-26.log.html#t2020-05-26T23:45:01 | 17:58 |
fungi | it's odd, both the volumes i created and those ianw created are exhibiting the issue | 18:00 |
clarkb | infra-root 104.239.168.111 in /etc/hosts as etherpad.opendev.org seems to be working now with my update to set no-skin on fungi's upgrade change | 18:01 |
clarkb | https://etherpad.opendev.org/p/clarkb-test is the etherpad I used really quickly there | 18:02 |
openstackgerrit | Clark Boylan proposed opendev/system-config master: Test etherpad with testinfra https://review.opendev.org/734880 | 18:02 |
clarkb | that change should pass now that I don't need it to fail for node holding. I've also cleaned up my earlier holds | 18:04 |
fungi | clarkb: seems to work, though now the weird background overlap we were seeing through meetpad appears in this test deploy of etherpad | 18:05 |
clarkb | fungi: its not as bad this time, my 'g' renders with my color properly | 18:06 |
clarkb | but it breaks into your color which is weird | 18:06 |
clarkb | the meetpad situation was the g had no tail | 18:06 |
clarkb | oh wait I had to add more text :) | 18:06 |
clarkb | :/ | 18:06 |
clarkb | hard refresh doesn't change that | 18:07 |
clarkb | I wonder if this is a new etherpad bug and we were simply noticing it with meetpad because we were doing some testing | 18:07 |
fungi | entirely possible | 18:07 |
fungi | it's like they added more top and bottom padding within the authorship color container or something | 18:08 |
fungi | also the ability to independently dock or float the authors and chat boxes has changed | 18:09 |
fungi | now you can alter the behavior through the config modal, but there are no buttons to switch them between docked and floating independently | 18:10 |
fungi | also the author colors toggle no longer temporarily fixes the background overlaps like we saw with meetpad, so maybe this is a slightly different problem | 18:11 |
clarkb | I can transition the chat box between docked, float and closed using the little buttons for it | 18:14 |
clarkb | and clicking the authors thing toggles it too | 18:14 |
clarkb | do you mean some other behavior? | 18:14 |
fungi | looking again | 18:16 |
fungi | oh, yep, i missed the buttons for chat | 18:17 |
clarkb | I do also find toggling authorship colors doesn't change the overlap | 18:17 |
fungi | i guess it's that the floating user list is now independent of your chat mode | 18:17 |
clarkb | maybe we should try without ep_headings again and see if that color overlap behavior changes | 18:18 |
fungi | so author list can float over the chat column unless you toggle the "show chat and users" config option | 18:18 |
clarkb | fungi: to do that you can edit /etc/etherpad-docker/docker-compose.yaml on the server to change the image for etherpad from our etherpad image to the upstream 1.8.4 image, since the only thing we change in our image is the addition of the ep_headings plugin | 18:19 |
clarkb | (not sure if you are interested in doing that or if I should go for it) | 18:19 |
clarkb | though I'm about to pop out for a bike ride | 18:19 |
fungi | i can't seem to ssh into 104.239.168.111 as root | 18:19 |
fungi | oh! it has my user on it | 18:20 |
clarkb | fungi: yes, its been converted to one of our production nodes (at least as far as behavior goes) | 18:20 |
fungi | i guess that's an artifact of our production-like testing | 18:20 |
clarkb | yup | 18:20 |
fungi | neat | 18:20 |
fungi | yeah, i can do it, just need to find the name of their dockerhub org | 18:21 |
fungi | looking now | 18:21 |
clarkb | fungi: its in our dockerfile | 18:21 |
clarkb | I think you want to change docker.io/opendevorg/etherpad to docker.io/etherpad/etherpad:1.8.4 | 18:22 |
*** mlavalle has quit IRC | 18:22 | |
clarkb | then sudo docker-compose down && sudo docker-compose up -d and refresh browser | 18:22 |
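The image swap clarkb walks through above amounts to a one-line edit in the compose file. A sketch, with the rest of the service definition elided since only the image line is known from the discussion:

```yaml
# /etc/etherpad-docker/docker-compose.yaml (sketch)
services:
  etherpad:
    # was: docker.io/opendevorg/etherpad
    image: docker.io/etherpad/etherpad:1.8.4
```

Then recreate the container with `sudo docker-compose down && sudo docker-compose up -d` and refresh the browser; swapping the tag back (e.g. to 1.8.0) reverses the test.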
fungi | yep, i concur | 18:22 |
clarkb | fungi: looks like you used 1.8.0? we want 1.8.4 (unless you want to check the old and new behavior in comparison) | 18:23 |
fungi | aha, yep | 18:23 |
fungi | and after `cd /etc/etherpad-docker/` | 18:23 |
clarkb | we should check it with the older version anyway | 18:23 |
fungi | well, i switched to .4 just now | 18:24 |
clarkb | ya we can do it after | 18:24 |
fungi | but yes, we can do both | 18:24 |
fungi | still seems to do it with upstream image | 18:24 |
clarkb | problem continues after removing ep_headings | 18:24 |
clarkb | (which makes me think it is a bug in etherpad with the no-skin skin) | 18:24 |
fungi | i'll try dropping back to .0 again using upstream | 18:24 |
clarkb | 1.8.0 looks fine | 18:25 |
fungi | downgrading to 1.8.0 makes the overlap go away | 18:25 |
clarkb | I also notice that the font sizes changed | 18:25 |
fungi | yeah | 18:25 |
fungi | they did indeed | 18:25 |
clarkb | I wonder if those things are related | 18:25 |
fungi | likely | 18:26 |
clarkb | on a positive note testing things like production in testing is remarkably easy | 18:26 |
fungi | amazingly | 18:26 |
clarkb | fungi: maybe we should file a bug with etherpad about it (and take screenshots using this test instance?) and see what they say before upgrading? | 18:27 |
clarkb | I'm worried that we'll be told no-skin isn't supported anymore | 18:27 |
fungi | that makes sense as a next step, sure | 18:27 |
fungi | and yes, i have the same expectation | 18:27 |
clarkb | k, I can work on that after a bike ride and lunch if you don't want to bother. | 18:28 |
clarkb | and with that I'm popping out now for a bit | 18:28 |
*** sshnaidm is now known as sshnaidm|afk | 19:05 | |
Open10K8S | Hi team | 19:19 |
Open10K8S | https://review.opendev.org/#/c/734863/ | 19:19 |
Open10K8S | Please check this PS | 19:19 |
Open10K8S | I updated the commit message | 19:19 |
Open10K8S | Regards | 19:19 |
Open10K8S | https://review.opendev.org/#/c/734891/ is waiting | 19:20 |
clarkb | Open10K8S: +2 | 20:30 |
corvus | it looks like https://review.opendev.org/733409 ran for the first time in a cloud with differing public and private ip addresses in the gate and failed | 20:37 |
corvus | i guess we should actually map the nodepool private_ipv4 to our ansible inventory public_v4 in the gate | 20:39 |
corvus | since it seems like the private addresses are what we write to /etc/hosts | 20:39 |
openstackgerrit | James E. Blair proposed opendev/system-config master: Stop using backend hostname in zuul testinfra tests https://review.opendev.org/733409 | 20:42 |
clarkb | corvus: hrm, we put the private IP there to avoid traversing NAT which has been problematic in the past | 20:43 |
clarkb | (for things like vxlan tunnels) | 20:43 |
clarkb | fungi: I think 5fd6aeeea62674cecf997421546a675d91cf45ef may be the commit that broke things | 20:43 |
corvus | clarkb: yeah, i think just using the private ip should be fine | 20:44 |
clarkb | fungi: I don't understand why yet, but the commit message in etherpad-lite makes it seem likely | 20:44 |
clarkb | fungi: I'll file a bug now with a couple screenshots anda pointer to that commit and see if they say anything | 20:47 |
fungi | clarkb: thanks! i could do it, but not until after i'm done prepping dinner | 20:48 |
openstackgerrit | James E. Blair proposed opendev/system-config master: Fake zuul_connections for gate https://review.opendev.org/730929 | 20:49 |
corvus | and i think the public/private ipv4 issue also broke the last run of that, so that's a rebase | 20:49 |
openstackgerrit | James E. Blair proposed opendev/system-config master: WIP: add Zookeeper TLS support https://review.opendev.org/720302 | 20:50 |
clarkb | no worries, I'm going to update etherpad on our test node to get screenshots of the broken stuff | 20:50 |
fungi | cool, that's how i was considering doing it too, since it's just a one-liner edit and down/up the container | 20:52 |
openstackgerrit | Merged zuul/zuul-jobs master: Allow upload-docker-image role to be used outside of promote https://review.opendev.org/734890 | 20:55 |
clarkb | https://github.com/ether/etherpad-lite/issues/4106 has been filed | 21:03 |
clarkb | corvus: on the ip address switch, do we use public_ipv4 for anything? (iptables?) | 21:04 |
clarkb | I know ansible itself is going to use the ansible_host value | 21:04 |
corvus | clarkb: yes that's exactly it | 21:04 |
clarkb | thanks | 21:05 |
clarkb | change lgtm then | 21:05 |
corvus | that's the value we (just recently) started using in the iptables rules; that replaced a dns lookup -- it's now an ansible inventory lookup so that we can do iptables by ansible group | 21:05 |
clarkb | rgr | 21:05 |
clarkb | corvus: btw not sure if you saw but the "use zuul test node as -dev server standin" worked really well earlier today | 21:06 |
clarkb | mordred: ^ you too | 21:06 |
corvus | clarkb: oh nice, sorry i missed a bunch earlier. but that's pretty cool. might make some kind of self-service hold thing worthwhile | 21:07 |
clarkb | I changed my vote on the etherpad upgrade change from +2 to -W with a link to the issue I filed | 21:10 |
clarkb | we'll see where that takes us I think | 21:10 |
fungi | awesome, thanks again! | 21:28 |
*** mlavalle has joined #opendev | 21:29 | |
corvus | woohoo! zuul started in system-config-run-zuul: https://zuul.opendev.org/t/openstack/build/ef6229a9233f4206a1d24e0724839f83/log/zuul01.openstack.org/debug.log | 21:47 |
corvus | i'm going to do one more rebase of that stack | 21:47 |
openstackgerrit | James E. Blair proposed opendev/system-config master: Stop using backend hostname in zuul testinfra tests https://review.opendev.org/733409 | 21:48 |
openstackgerrit | James E. Blair proposed opendev/system-config master: Fake zuul_connections for gate https://review.opendev.org/730929 | 21:48 |
openstackgerrit | James E. Blair proposed opendev/system-config master: WIP: add Zookeeper TLS support https://review.opendev.org/720302 | 21:48 |
*** DSpider has quit IRC | 21:58 | |
Open10K8S | clarkb: thank you | 22:00 |
clarkb | fungi: going back to the openafs oddity, is your local openafs a 1.8 version? I wonder if that could be part of it and we should try a create on the fileserver itself? | 22:01 |
fungi | well, a bunch of those volumes (possibly all of them?) were created by ianw, so it may make more sense to check how and from where he created them | 22:02 |
clarkb | ah | 22:02 |
fungi | or at least double-check whether i'm misreading the discussion here from may 26 | 22:03 |
fungi | but yes, i've currently got openafs 1.8.6~pre1-3 | 22:04 |
fungi | from debian unstable | 22:04 |
* clarkb looks at python2 things again | 22:09 | |
openstackgerrit | Clark Boylan proposed openstack/project-config master: Install git-review under python3 for proposed updates https://review.opendev.org/735019 | 22:13 |
clarkb | that was an easy one to address so I went for it | 22:13 |
ianw | clarkb/fungi: i'm guessing the afs volumes still aren't happy? | 22:13 |
clarkb | ianw: ya | 22:13 |
clarkb | ianw: one thing I was wondering about is if a newer openafs was used to create them which potentially caused problems | 22:14 |
ianw | the thing is they *were* happy, for a bit | 22:14 |
clarkb | I've only ever created them on the fileserver and while its been a while those have always been happy as far as I know | 22:14 |
ianw | vos examine mirror.wheel.focala64 | 22:18 |
ianw | Could not fetch the information about volume 536871131 from the server | 22:18 |
ianw | Possible communication failure | 22:18 |
ianw | that's on mirror-update | 22:18 |

clarkb | corvus: comment on https://review.opendev.org/#/c/730929/6 | 22:18 |
ianw | i wonder if tcpdumping again we'll see this icmp stuff | 22:18 |
ianw | server mirror-update01.opendev.org partition /vicepa RW Site | 22:19 |
ianw | this seems familiar. somehow it has decided that mirror-update01 is in the vldb, or at least is showing it as such | 22:19 |
clarkb | ianw: was the volume created from mirror-update01.opendev.org or something like that | 22:19 |
clarkb | ianw: and ya fungi found that this was the same problem we had earlier (a week or two back) when we hit problems around these volumes | 22:20 |
clarkb | that was based on irc logs | 22:20 |
ianw | clarkb: i think it was, but this is similar to what i saw when looking at fungi's volumes; why i deleted them and recreated them, which worked when i left it, but appears to have gone back to the same thing now | 22:20 |
ianw | rx data vldb reply get-entry-by-name-n "mirror.wheel.focala64" numservers 4 servers 127.0.1.1 127.0.1.1 23.253.73.143 104.130.138.161 partitions a a a a rwvol 536871131 rovol 536871132 backup 536871133 (504) | 22:26 |
ianw | that's from a tcpdump to afsdb01 | 22:26 |
ianw | that sure looks like it's saying the servers for wheel.focala64 include 127.0.1.1 | 22:26 |
fungi | yep | 22:26 |
fungi | that's exactly what i found looking into it today too | 22:26 |
fungi | so the vldb records have somehow replaced the afs01.dfw.openstack.org record for the rw volume with 127.0.1.1, and added a 127.0.1.1 ro replica | 22:28 |
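The capture can be checked mechanically; a small sketch using the vldb reply line pasted above (hardcoded here for illustration — in practice it would come from tcpdump on port 7003, the vlserver port):

```shell
# Sample vldb reply as captured above.
reply='rx data vldb reply get-entry-by-name-n "mirror.wheel.focala64" numservers 4 servers 127.0.1.1 127.0.1.1 23.253.73.143 104.130.138.161 partitions a a a a'
# Flag any loopback addresses registered as volume servers.
echo "$reply" | grep -o '127\.0\.1\.1' | sort -u
```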
ianw | 127.0.1.1 and openafs has some google hits | 22:28 |
fungi | could it be because /etc/hosts on afs01.dfw has "127.0.1.1 afs01.dfw.openstack.org afs01" | 22:28 |
fungi | and so it's resolving its ip address based on that? | 22:29 |
ianw | "This seems to bite everyone who installs the Debian or Ubuntu packages on | 22:29 |
ianw | a non-modified server which has" | 22:29 |
fungi | last modified date on that file is more than a year ago though | 22:29 |
clarkb | fungi: it's part of our normal setup to do that | 22:29 |
fungi | right | 22:29 |
fungi | which is why i'm wondering what has caused that to suddenly become an issue | 22:30 |
clarkb | ya thats why I wondered if openafs version used to do the create is important | 22:30 |
fungi | unless maybe these are the first volumes we've added since the hosts file was "normalized" to match our other servers? | 22:30 |
clarkb | because we've left these things alone for a long time and they've been fine | 22:30 |
clarkb | fungi: that's possible, I don't know if we've added new volumes since the xenial upgrade | 22:30 |
ianw | well i think we have a smoking gun ... first thing is how to get rid of it | 22:31 |
ianw | https://lists.openafs.org/pipermail/openafs-info/2013-December/040285.html | 22:31 |
fungi | yeah, occam's razor says we had it set up correctly on trusty, the xenial upgrade undid the hosts file back to "normal" and these are the only volumes we've added since | 22:33 |
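The Debian/Ubuntu pattern being blamed can be demonstrated with a sample hosts file (the real check would read /etc/hosts on the fileserver itself):

```shell
# Reproduce the Debian-style self-alias: resolving the machine's own
# hostname through a file like this returns a loopback address, which the
# fileserver then registers in the VLDB.
cat > /tmp/hosts.sample <<'EOF'
127.0.0.1 localhost
127.0.1.1 afs01.dfw.openstack.org afs01
EOF
grep '^127\.0\.1\.1' /tmp/hosts.sample
```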
clarkb | ianw: I'll admit I don't quite understand any of what that email is trying to say | 22:34 |
clarkb | like the 127.0.1.1 problem is because it is already in /etc/hosts ? why do we need to update it? | 22:34 |
ianw | clarkb: heh, me either yet :) | 22:34 |
ianw | but i think it's our current best clue :) | 22:34 |
clarkb | ya I agree it seems to be the thread to pull on | 22:34 |
clarkb | https://docs.openafs.org/Reference/5/NetRestrict.html is the other thing I didn't recognize | 22:35 |
clarkb | I think that means we can add netrestrict files on the servers to exclude 127.0.1.1 and whatever else | 22:35 |
clarkb | which is probably a reasonable enough workaround for us as we can stick all of that into config management | 22:35 |
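Per the NetRestrict manpage linked above, the server-side file is just a list of addresses the server must never register; on Debian-packaged servers the path is likely /etc/openafs/server/NetRestrict (path hedged — check the local packaging):

```
# NetRestrict: one IP address per line; these interfaces are excluded
# from the set the fileserver registers with the VLDB.
127.0.1.1
```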
ianw | i think if we do the vos remsite on afs01 it will remove appropriately | 22:36 |
clarkb | ianw: maybe we need both things? remsite to fix the existing volumes and netrestrict to avoid this in the future? | 22:37 |
ianw | yeah, remove the 127.0.1.1 entries, then make sure they don't come back | 22:37 |
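The two-step plan above might look like the following hedged sketch (volume and server names taken from this log, vos syntax per the OpenAFS manpages and ianw's earlier invocation; DRY_RUN is set so the commands are only printed):

```shell
DRY_RUN=echo   # drop this to actually run (requires an AFS admin token)

# 1. Remove the bogus loopback replica site; remsite accepts an IP
#    address, so 127.0.1.1 can be named directly.
$DRY_RUN vos remsite -server 127.0.1.1 -partition a -id mirror.wheel.focala64

# 2. Make sure it cannot come back: exclude the loopback alias on the
#    server side (NetRestrict path is an assumption for Debian packaging).
$DRY_RUN sh -c 'echo 127.0.1.1 >> /etc/openafs/server/NetRestrict'
```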
mordred | clarkb: yay! (re: -dev server standin) | 22:38 |
mordred | also - sorry, my "I'll be gone for a couple of hours for errand" - I forgot to translate that to nola time | 22:39 |
clarkb | mordred: I just assume the rain flooded the streets so everyone raided the bars for togo cups full of $drink_of_choice | 22:39 |
ianw | $ vos remsite -server afs01.dfw.openstack.org -partition a -id mirror.wheel.focala64 | 22:39 |
ianw | Deleting the replication site for volume 536871131 ...Removed replication site afs01.dfw.openstack.org /vicepa for volume mirror.wheel.focala64 | 22:39 |
mordred | clarkb: that's an excellent assumption with very accurate details | 22:40 |
ianw | 22:39:08.878673 IP 104.130.136.20.7003 > 104.130.137.130.56972: rx data vldb reply get-entry-by-name-n "mirror.wheel.focala64" numservers 3 servers 127.0.1.1 23.253.73.143 104.130.138.161 partitions a a a rwvol 536871131 rovol 536871132 backup 536871133 (504) | 22:40 |
clarkb | ianw: so its still reporting 127.0.1.1? | 22:40 |
mordred | clarkb: the real issue is that we're still only at 30% capacity for bars that don't serve food - so the lines for go-cups of $drink are extra long | 22:41 |
ianw | clarkb: before (top) and after (bottom) : http://paste.openstack.org/show/794603/ | 22:41 |
ianw | i think there's two entries ... and it removed the "real one" | 22:42 |
clarkb | ianw: ya your tcpdump shows 127.0.1.1 so I assume it removed the real one | 22:42 |
mordred | clarkb, ianw: so - just to catch up - our entries in /etc/hosts with 127.0.1.1 hostname are bad and break things. didn't we spend a bunch of time at one point to make sure those entries existed? | 22:42 |
clarkb | mordred: yes, those entries make other things happy iirc | 22:43 |
clarkb | something to do with unbound maybe | 22:43 |
clarkb | (because it listens on that addr?) | 22:43 |
clarkb | ianw: remsite says you can provide the ip address or the name | 22:43 |
clarkb | ianw: so maybe add the proper site back then use 127.0.1.1 to remsite the wrong one? | 22:44 |
ianw | ok, "vos examine" with "-noresolve" shows : server 127.0.1.1 partition /vicepa RW Site | 22:44 |
mordred | clarkb: nod. well - fwiw, we only do that in set-hostname which we only do in launch-node | 22:44 |
ianw | so the *rw* volume is on 127.0.1.1 | 22:44 |
clarkb | i'm looking at http://docs.openafs.org/Reference/1/vos_remsite.html fwiw | 22:44 |
clarkb | mordred: ah so we could maybe make an exception for afs servers | 22:44 |
mordred | so if we wanted it to be different on some of the hosts, it shouldn't be hard, nor should it break anything | 22:44 |
mordred | yeah | 22:44 |
mordred | I think we could just write an /etc/hosts file in the afs role | 22:45 |
mordred | nothing should fight | 22:45 |
*** tkajinam has joined #opendev | 22:45 | |
ianw | maybe vos move is way to update the rw volume? | 22:49 |
ianw | something happened ... maybe ... http://paste.openstack.org/show/794604/ | 22:53 |
clarkb | fwiw vos move looks correct for RW sites reading manpages | 22:53 |
clarkb | addsite is RO only | 22:53 |
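Reading the same manpages, the RW relocation being attempted would be roughly the following (hedged sketch; names from this log, DRY_RUN set so nothing executes):

```shell
DRY_RUN=echo   # remove to execute for real (requires an AFS admin token)

# vos move relocates the read/write volume between server/partitions;
# vos addsite only defines RO replica sites, which is why it cannot fix
# a RW entry pointing at loopback.
$DRY_RUN vos move -id mirror.wheel.focala64 \
    -fromserver 127.0.1.1 -frompartition a \
    -toserver afs01.dfw.openstack.org -topartition a
```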
clarkb | ianw: does vos examine -noresolve look happier now too? | 22:54 |
ianw | no ;) | 22:58 |
ianw | $ vos examine -noresolve mirror.wheel.focala64 | 22:58 |
ianw | Could not fetch the information about volume 536871131 from the server | 22:58 |
ianw | : No such device | 22:58 |
*** aannuusshhkkaa has joined #opendev | 22:59 | |
*** shtepanie has joined #opendev | 23:00 | |
clarkb | doesn't seem like there is a way to convert a RO site to a RW site? | 23:00 |
rm_work | Hey, trying to walk some folks through registering for a new openstack account, and it's failing to create accounts right after clicking register on https://openstackid.org/auth/register | 23:00 |
rm_work | Is this a known issue? | 23:00 |
clarkb | otherwise I'd say addsite with the correct IP, this gives us a RO volume. Then switch it to the RW volume | 23:00 |
rm_work | HTTP 500: openstackid.org is currently unable to handle this request. | 23:00 |
clarkb | rm_work: I'll ping the foundation sysadmins and see if the server logs say anything obvious to me | 23:01 |
rm_work | ok, both of them are getting it, and I was able to replicate as well. | 23:01 |
clarkb | it looks like PHP is running out of memory but the service itself has plenty | 23:02 |
clarkb | *server itself | 23:02 |
clarkb | rm_work: foundation sysadmin is looking at it now. Will let you know what they find | 23:04 |
rm_work | alright, thanks! | 23:08 |
ianw | clarkb: the move seems to have failed, i'm not sure what to do now | 23:10 |
clarkb | ianw: ya I don't know either. Maybe ignore that one for now (and we'll rebuild it), but try the remsite using the IP on another one and see if that fixes it? | 23:11 |
ianw | i tried on bionica64 and it did the same thing | 23:13 |
ianw | Failed to create a transaction on the source volume 536871125 | 23:13 |
ianw | VOLSER: volume is busy | 23:13 |
clarkb | ianw: using the IP it removed the other IP? | 23:13 |
clarkb | with remsite I mean | 23:13 |
*** tosky has quit IRC | 23:15 | |
clarkb | rm_work: I've been asked to confirm you are ticking the captcha box | 23:21 |
rm_work | yes. | 23:21 |
ianw | clarkb: no, it seemed to fail. i'm going to try just removing all the broken volumes | 23:21 |
clarkb | ianw: ok, before we create new ones should we add the netrestriction? | 23:21 |
clarkb | I guess we can do that manually then put it in config management later if that is easier | 23:22 |
rm_work | you can fairly easily replicate, they should be able to throw a junk name and email in and replicate themselves in like 10 seconds | 23:22 |
ianw | clarkb: yeah, let me clean what's there then we can try making them again :) | 23:22 |
clarkb | rm_work: the main person says they haven't been able to reproduce though someone else has reproduced | 23:22 |
rm_work | huh. all three of us trying here got the same thing. so maybe if we just hammer it? :D | 23:22 |
clarkb | rm_work: we are looking at the memory issue as a possible cause though | 23:22 |
ianw | oh, i wonder if it's the mount in the r/w partition? | 23:22 |
clarkb | ianw: oh we need to remove it from the fs side, the move then remount? that would make sense | 23:23 |
ianw | clarkb: i can't do an rmmount because the "file doesn't exist" | 23:25 |
clarkb | hrm | 23:26 |
clarkb | maybe we have to rmmount before remsite? | 23:26 |
clarkb | (I claim no expertise though, maybe mordred or corvus know better?) | 23:29 |
mordred | I didn't do it | 23:29 |
ianw | vos remove -noresolve -server afs01.dfw.openstack.org -partition a -id mirror.wheel.bionica64 | 23:31 |
ianw | Volume 536871125 does not exist on server and partition | 23:31 |
ianw | VOLSER: no such volume - location specified incorrectly or volume does not exist | 23:31 |
ianw | this is really starting to annoy me :) | 23:31 |
*** mlavalle has quit IRC | 23:32 | |
clarkb | rm_work: latest word is that it may be input specific | 23:32 |
clarkb | rm_work: debugging continues, but I'm just playing messenger right now so don't have all the details | 23:32 |
rm_work | O_o | 23:32 |
rm_work | so on our side, we had three separate people doing it and it failing in the same way. I wonder what all three of us could have independently done to get that error | 23:33 |
fungi | sounds like it was input-dependent | 23:40 |
fungi | so that all of them were hitting it isn't too surprising | 23:41 |
openstackgerrit | sebastian marcet proposed opendev/system-config master: OpenstackId v3.0.10 https://review.opendev.org/735022 | 23:43 |
rm_work | ^^ related? | 23:44 |
rm_work | I mean what input would three different people have shared? different names, different email domains... | 23:44 |
rm_work | unless being in the USA is an input problem :D | 23:44 |
openstackgerrit | sebastian marcet proposed opendev/system-config master: OpenstackId v3.0.11 https://review.opendev.org/735022 | 23:44 |
clarkb | rm_work: field size I think | 23:45 |
clarkb | rm_work: not the exact data, but its width | 23:45 |
clarkb | (which may also explain memory issues if the allocated memory is dependent on expected input sizes?) | 23:45 |
rm_work | O_o | 23:45 |
clarkb | but again I don't know specifics | 23:45 |
rm_work | My name and email address were both pretty short | 23:45 |
rm_work | but yeah | 23:46 |
* rm_work shrugs | 23:46 | |
rm_work | as long as it works | 23:46 |
fungi | *shrug* php ;) | 23:47 |
ianw | clarkb: through a series of vos removes on afs01.dfw i think i have cleared the bad volumes | 23:51 |
ianw | i've removed the mounts | 23:55 |
ianw | i think we can try recreating | 23:55 |
ianw | i need a cup of tea first :) | 23:55 |
Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!