clarkb | We will get started on the meeting shortly | 18:59 |
---|---|---|
clarkb | anyone else here for the meeting? seems like it has been a busy distracting day | 19:00 |
fungi | i will busily switch to meeting mode | 19:00 |
clarkb | #startmeeting infra | 19:01 |
openstack | Meeting started Tue Apr 21 19:01:06 2020 UTC and is due to finish in 60 minutes. The chair is clarkb. Information about MeetBot at http://wiki.debian.org/MeetBot. | 19:01 |
openstack | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 19:01 |
*** openstack changes topic to " (Meeting topic: infra)" | 19:01 | |
openstack | The meeting name has been set to 'infra' | 19:01 |
mordred | o/ | 19:01 |
clarkb | #link http://lists.opendev.org/pipermail/service-discuss/2020-April/000010.html Our Agenda | 19:01 |
ianw | o/ | 19:01 |
clarkb | #topic Announcements | 19:01 |
*** openstack changes topic to "Announcements (Meeting topic: infra)" | 19:01 | |
zbr | o/ | 19:01 |
clarkb | I wanted to call out here that splitting opendev into its own comms channels seems to be working for getting more people to engage | 19:02 |
AJaeger | o/ | 19:02 |
clarkb | welcome! to all those people (not sure if any are here in this channel now but we've seen more traffic on the mailing list) | 19:02 |
fungi | we're up to 80 nicks currently in the #opendev channel | 19:03 |
fungi | (still a far cry from the 250+ in #openstack-infra, but many of those may be zombies for all intents and purposes) | 19:03 |
clarkb | #topic Actions from last meeting | 19:04 |
fungi | also 20 subscribers to service-discuss and 25 to service-announce | 19:04 |
*** openstack changes topic to "Actions from last meeting (Meeting topic: infra)" | 19:04 | |
clarkb | #link http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-04-14-19.01.txt minutes from last meeting | 19:04 |
clarkb | there were no actions. | 19:04 |
clarkb | #topic Priority Efforts | 19:04 |
*** openstack changes topic to "Priority Efforts (Meeting topic: infra)" | 19:04 | |
clarkb | #topic Update Config Management | 19:05 |
*** openstack changes topic to "Update Config Management (Meeting topic: infra)" | 19:05 | |
clarkb | maybe mordred can update on his activity here then ianw? | 19:05 |
clarkb | maybe we lost mordred | 19:07 |
clarkb | my understanding of it is that we've continued to push towards zuul driven CD of things | 19:07 |
clarkb | in particular we are now looking at cleaning up puppetry as and where necessary | 19:07 |
corvus | does anyone know what the status of containerized zuul is? | 19:07 |
mordred | heya - sorry | 19:08 |
mordred | yes! | 19:08 |
mordred | so - three things going on | 19:08 |
corvus | (i'd like to proceed with the tls work, so catching up on that would be helpful for me) | 19:08 |
mordred | first - I'm still working through the followup from the gerrit rollout - next on that list is gerritbot - this led me to eavesdrop which has turned in to reorganizing how we run puppet a bit | 19:09 |
mordred | so - sorry for that rabbithole - but I think it'll be worth it | 19:09 |
mordred | https://review.opendev.org/#/q/topic:puppet-apply-jobs | 19:09 |
mordred | that's the topic related to that | 19:09 |
mordred | second and third are nodepool and zuul | 19:09 |
mordred | https://review.opendev.org/#/q/topic:container-zuul | 19:10 |
corvus | (i think that rabbit hole -- getting the puppet jobs down to size -- is great and worth it) | 19:10 |
mordred | nodepool-launcher is ready to go and I think now safe to land: https://review.opendev.org/#/c/720527/ | 19:10 |
mordred | it won't restart containers in prod, so I think we can land it then do a manual rolling restart of the launchers | 19:10 |
mordred | if people are happy with what we did there with starting vs. not starting docker-compose ... I can apply the same thing to the zuul patch: | 19:11 |
mordred | https://review.opendev.org/#/c/717620/ | 19:11 |
mordred | (issue being we don't necessarily want ansible to run docker-compose up every time it runs - but we DO want that to happen in the gate) | 19:12 |
mordred | I believe once I update that patch with the start boolean - it'll also be ready to go | 19:12 |
mordred | and I think also safe to land | 19:12 |
mordred | but - since that's nodepool and zuul - please review with an eye to "is this safe to land" | 19:12 |
corvus | we probably could start nodepool-launcher every time | 19:13 |
mordred | corvus: maybe we land first with nothing starting - because we have to stop the systemd stuff ... | 19:13 |
corvus | yeah | 19:13 |
corvus | i'm okay with starting conservative there | 19:13 |
mordred | and then land a patch to flip the var on the things where we're happy to do it every time | 19:13 |
corvus | what's the thinking on nodepool builders? i didn't get the full story yesterday | 19:14 |
fungi | also ianw discovered that debootstrap (used by dib to make debian/ubuntu images) needs a couple of patches to work from a container, so has published a custom build of it in a ppa and confirmed that's working from a container | 19:14 |
corvus | (is nb04 broken? or what?) | 19:14 |
mordred | ianw is further with diagnosing the issue - I think he's got a working build | 19:14 |
fungi | we debated switching to something newer like mmdebstrap, but don't want dib to break for users of older platforms where those newer tools aren't yet shipped as part of the distro | 19:14 |
mordred | but it involves two unlanded merge requests | 19:14 |
mordred | corvus: nb04 is broken for debuntu builds | 19:14 |
mordred | so they have been removed from it | 19:14 |
ianw | yes, a few things in progress | 19:14 |
mordred | oh good - it's ianw | 19:14 |
clarkb | specifically because debootstrap in docker containers explodes the next thing that runs in the container? | 19:15 |
ianw | yes, it likes to unmount /proc | 19:15 |
frickler | https://review.opendev.org/721394 | 19:15 |
corvus | it sounds like we can't really run our builders or executors in containers at the moment | 19:15 |
corvus | i'm a little worried that the zuul tls work is starting to collide with this | 19:15 |
corvus | the zuul patch has the executors running outside of containers | 19:16 |
corvus | should we rethink what we're doing with the builders? or can we get them into a consistent state soon? | 19:16 |
mordred | corvus: well - I think we can get to full ansible | 19:16 |
ianw | i am working to get our dib functional tests converted to building from the container | 19:16 |
fungi | it sounds like we should be able to run builders from containers with a patched debootstrap | 19:16 |
mordred | corvus: which would be the part of the story that would most impact tls work, yes? | 19:17 |
corvus | mordred: yeah | 19:17 |
corvus | so we're looking at having 3 builders run from ansible+puppet, and 1 from ansible+containers? | 19:17 |
mordred | so - yeah - let's give ianw a little bit to see if we can get a solid container story for the builder with patched debootstrap | 19:17 |
corvus | or 3 from just "ansible" | 19:17 |
ianw | please don't forget there is an arm64 builder which has not had a lot of attention, but i would not like to drop | 19:17 |
mordred | I think just ansible if we can't get the container build going | 19:17 |
mordred | ianw: I have thoughts on that - let's come back to arm | 19:18 |
corvus | what kind of time are we talking about there, cause it sounds like ianw is working on a rabbit hole of his own with the container functional testing? | 19:18 |
corvus | basically, we're holding a zuul release on opendev being able to test this stuff | 19:18 |
mordred | well - it sounds like the patched debootstrap works - so now it's about updating testing to prove that it works and make sure we don't regress, yes? | 19:18 |
fungi | (and working out the arm story) | 19:19 |
corvus | so i think we need to either get the system into a place where we can realistically land a coordinated configuration change to the whole system in a day or two, or else sever the dependency between opendev and zuul releases (at least, temporarily) | 19:19 |
mordred | ok. so - there are a couple of options for that | 19:19 |
mordred | we can work on an ansible+pip install (I can work on that right now)- based on the current ansible+docker install and similar to how we did zuul-executors in the zuul patch | 19:20 |
mordred | we'll need focal nodes for them to be new enough | 19:20 |
clarkb | mordred: why do we need focal for that? | 19:21 |
clarkb | everything is pip installed so shouldn't depend on focal? | 19:21 |
mordred | because of the reasons we're using the containers in the first place- the rpm helper tools on bionic are old or missing | 19:21 |
clarkb | oh for builders specifically. Got it | 19:21 |
mordred | yeah | 19:21 |
ianw | mordred: if you mean, just use pip on a plain host to install, i.e. replicating the puppet in ansible, i have a patch that does that | 19:21 |
AJaeger | for focal, we need to merge https://review.opendev.org/#/c/720718/ to mirror it - and stop mirroring trusty | 19:21 |
mordred | I think it's not unreasonable to upload a focal base image | 19:21 |
mordred | yes | 19:22 |
mordred | and that would be good so that we can have integration test jobs | 19:22 |
mordred | but - I think we can work those in parallel | 19:22 |
ianw | also, arm only builds xenial/buster/bionic/centos atm. we don't need the updated tools which are required for fedora, as of right now | 19:22 |
corvus | fungi: i don't understand your comment in 720718 | 19:22 |
mordred | and get a focal base image uploaded to rax-dfw and boot a nb on it that we can use for fedora builds | 19:23 |
corvus | fungi: i don't know what the differences between those two hosts are | 19:23 |
mordred | as ianw says - we only need that for fedora builds | 19:23 |
clarkb | corvus: its a response to my comment | 19:23 |
ianw | corvus: we have not done the work to move reprepro from puppet to ansible yet | 19:23 |
corvus | clarkb: i understand that. i don't understand how mirror-update.opendev.org and mirror-update.openstack.org are different | 19:23 |
mordred | so we can boot the other ansible-based builders on bionic | 19:23 |
clarkb | corvus: mirror-update.opendev.org is ansible managed and only does rsync based mirror updates currently | 19:24 |
fungi | the opendev.org server is the new one cron jobs are being migrated too, off the older openstack.org server | 19:24 |
clarkb | corvus: mirror-update.openstack.org does all the other mirror updates (reprepro and maybe other tools too) | 19:24 |
fungi | s/too/to/ | 19:24 |
corvus | but they're both afs heads? | 19:24 |
fungi | yes, both write into afs | 19:25 |
mordred | I think we should not tie this to reworking anything about how reprepro and old mirror-update works - really just upping the quota and doing a manual release should be fine to get this moving, yes? | 19:25 |
clarkb | mordred: yes | 19:25 |
corvus | sorry, this is proving a distraction. i still don't understand fungi's comment and the implications, but i'll just follow up later. | 19:25 |
clarkb | basically my comment was calling out that you need to bump the quota and do the manual release | 19:26 |
clarkb | if you do that it should all be fine | 19:26 |
corvus | and fungi said something isn't necessary, but i don't know what. | 19:26 |
mordred | great - so I think tasks would be: get focal mirroring going, get nodepool building focal nodes, build a manual focal-minimal to upload as a base image into rax-dfw, get a pure-ansible port of nodepool-builder | 19:26 |
mordred | most of those can be done in parallel | 19:26 |
fungi | corvus: oh, because mirror-update.opendev.org gets vos release run remotely by ansible and uses localauth to avoid timeouts | 19:27 |
corvus | so do we want to switch all of the nb nodes to pure-ansible, retiring the current nb04? | 19:27 |
mordred | I'm happy to take the pure-ansible port since I'm cranking on that stuff- can someone else help drive the mirror update? | 19:27 |
fungi | sorry, i had to page all that back in | 19:27 |
corvus | then make the container switch later after there's lots more testing? | 19:27 |
ianw | mordred: bionic is sufficient to build fedora. in fact, i already did all of that, let me find the patch | 19:27 |
fungi | my comment was specifically in response to clarkb's | 19:27 |
mordred | yeah - I think that's a sane thing to do for now - although I do think that continuing the container debugging and testing work is important | 19:27 |
mordred | ianw: but not suse | 19:28 |
mordred | ianw: because it doesn't have zypper | 19:28 |
fungi | in response to clarkb's "Its possible this is no longer a concern..." | 19:28 |
mordred | so - I think we should operate under the assumption that having at least one focal node would be beneficial - and that we also might need at least one bionic node because arm. hopefully we can coalesce on only focal once we can prove out that it works fine for arm | 19:29 |
clarkb | mordred: ianw I think the only major risk with the focal plan is focal + arm64. But it sounds like we can maybe keep that on xenial or bionic for a bit longer | 19:29 |
fungi | so i was saying initial vos release timeouts *are* a concern for anything added to mirror-update.openstack.org (like reprepro-based mirroring which is still there for the moment) but not for things mirrored using mirror-update.opendev.org (like rsync-based stuff) | 19:29 |
mordred | yeah | 19:29 |
mordred | I don't think the ansible differences between bionic and focal are likely large | 19:29 |
fungi | corvus: does that answer your question? | 19:29 |
mordred | we don't have big things like systemd vs sysvinit | 19:29 |
corvus | fungi: does that change apply to opendev or openstack? | 19:30 |
corvus | mordred: that sounds reasonable | 19:30 |
fungi | corvus: openSTACK because it's an ubuntu mirror | 19:31 |
fungi | so vos release timeouts are still a concern for that change | 19:31 |
fungi | i probably should have quoted clarkb's comment in my reply, but thought it was obvious what i was replying to (clearly it wasn't, sorry!) | 19:31 |
corvus | well, the part you were referring to with "this" would have been helpful | 19:32 |
mordred | I'm happy to work on the ansible nodepool-builder (which ianw may have already done) and the focal-minimal image in rax-dfw - can someone else drive the steps needed to get the vos release done safely? | 19:32 |
ianw | https://review.opendev.org/#/c/692924/ | 19:32 |
mordred | ianw: awesome, thanks! | 19:32 |
clarkb | ianw: ^ does that seem reasonable? I think we can switch to containers on top of ansible without containers easily enough when we have that working reliably | 19:32 |
ianw | tbh this is what i was proposing about 6 months ago :) | 19:33 |
mordred | ianw: although I think a lot of the current code can stay as it is in container-builder - so it'll be about picking out the appropriate things that we need for pip nodepool | 19:33 |
clarkb | mordred: I can start a root screen on mirror-update and grab the lock to get this started | 19:33 |
clarkb | then we need to update the quota, merge the change, and manually trigger the update | 19:33 |
mordred | clarkb: ++ | 19:33 |
clarkb | but first I need to load my ssh key | 19:34 |
mordred | ianw: well - I think we would have been golden with the container work you did -- if there wasn't this crazy debootstrap bug :) | 19:34 |
mordred | of all the things to derail this - I wasn't expecting _that_ ;) | 19:34 |
ianw | it's just a lot of uncharted territory all around | 19:34 |
ianw | anyway, i'll keep working on it | 19:35 |
mordred | ianw: cool - and yes - I'd love for us to be able to switch back to pure-containers there | 19:35 |
clarkb | mordred: if we can swing back to gerrit things, we have a third party ci operator that is using devstack-gate and discovered that we are still not replicating to review.o.o/p/ properly | 19:35 |
mordred | although there is also an arm thing we have to solve | 19:36 |
clarkb | mordred: I triggered replication of openstack/requirements (the out of date repo) just in case that was something that didn't get rereplicated after config updates and no change | 19:36 |
mordred | I've got thoughts on the arm thing - it's solvable | 19:36 |
mordred | http://eavesdrop.openstack.org/irclogs/%23opendev/%23opendev.2020-04-19.log.html#t2020-04-19T15:04:32 <-- basic summary of what we need to do to build arm and x86 images when we build images | 19:36 |
clarkb | I also noticed we have replication things failing for the local /opt/git thing there | 19:36 |
mordred | clarkb: ok - we need to investigate that - it should be fixed | 19:37 |
mordred | can we swing back to that after the meeting? | 19:37 |
clarkb | mordred: yup | 19:37 |
corvus | weird, a quick check shows that checks out for me | 19:38 |
clarkb | corvus: you can clone it but you end up on an older commit aiui | 19:38 |
corvus | clarkb: i'm saying i don't see an older commit | 19:39 |
mordred | oh - I know what it is | 19:39 |
corvus | but anyway, mordred requested we defer this | 19:39 |
mordred | well - I looked anyway | 19:39 |
mordred | the issue is a few missing repos that we created while the mount wasn't properly in place | 19:40 |
mordred | so we didn't actually create them on the real filesystem | 19:40 |
corvus | (my local test case is opendev/system-config, because that's the ancient checkout i noticed the problem with) | 19:40 |
mordred | starlingx/kernel would be one I believe | 19:40 |
mordred | actually - it's owned by root | 19:41 |
mordred | anywho - we can fix that | 19:41 |
mordred | we should figure out if we're running manage-projects as the wrong user | 19:42 |
mordred | and thus creating the local replication target repos as the wrong user | 19:42 |
clarkb | k anything else on this subject? as a time check we have 18 minutes left and a few other things to get to (but this was also a huge chunk of change last week so want to make sure we get through it) | 19:42 |
mordred | I'm good | 19:43 |
fungi | all clear here | 19:43 |
clarkb | #topic OpenDev | 19:44 |
*** openstack changes topic to "OpenDev (Meeting topic: infra)" | 19:44 | |
clarkb | As mentioned we seem to be picking up some new traffic which is good | 19:45 |
clarkb | Fungi has proposed that openstack-infra become a SIG and the openstack TC is on board with that | 19:45 |
clarkb | I think it makes sense too | 19:45 |
frickler | why not fold it into qa? | 19:46 |
fungi | #link http://lists.openstack.org/pipermail/openstack-discuss/2020-April/014296.html Forming a Testing and Collaboration Tools (TaCT) SIG | 19:46 |
AJaeger | The TaCT SIG - I love the proposal ;) | 19:46 |
clarkb | frickler: I think gmann is concerned they don't have the knowledge to manage some of the tools officially like that | 19:46 |
fungi | frickler: members/leaders of the openstack qa team have expressed concern over suddenly becoming responsible for "more stuff" | 19:46 |
clarkb | frickler: I think long term that may make sense but for now this gives us the ability to keep it a thing that is more tightly scoped and work with qa team as necessary | 19:46 |
fungi | (also i could see qa shutting down as a team and folding into the same sig eventually) | 19:47 |
frickler | oh, maybe that direction might work, too | 19:47 |
frickler | or split of some things like devstack | 19:47 |
frickler | o.k. | 19:47 |
frickler | s/of/off | 19:47 |
fungi | we could stand some volunteers to serve as chairs for that sig, if it's something folks are generally in favor of forming | 19:48 |
clarkb | If you'd like to volunteer to be service coordinator for opendev now is the time to do so | 19:49 |
clarkb | I mentioned I'd be happy to continue but also think new involvement is good as well | 19:49 |
fungi | keep in mind that there is little responsibility as a sig chair, mostly just be aware of what's going on generally within the sig and be able to serve as a representative for it | 19:49 |
clarkb | ya we'll need a sig chair as well but that's less involved I expect | 19:50 |
clarkb | #topic General Topics | 19:50 |
*** openstack changes topic to "General Topics (Meeting topic: infra)" | 19:50 | |
clarkb | It is PTG planning time | 19:50 |
clarkb | we have been given some constraints in an effort to have collaboration happen between projects and keep hours sane for attendees | 19:50 |
fungi | i have a link i was going to paste with sig chair responsibilities, but maybe i'll just follow up to that ml post with it | 19:51 |
frickler | so is the vPTG to run completely on meetpad? or something else like zoom or bj? | 19:51 |
fungi | (since we're short on time) | 19:51 |
clarkb | the result of that is a giant ethercalc where we need to sign up for time | 19:51 |
fungi | frickler: not determined yet | 19:51 |
clarkb | frickler: I think that is still being sorted out. I'd like to be able to make meetpad an option | 19:51 |
clarkb | so we should keep pushing on it | 19:51 |
clarkb | from that etherpad I've identified 3 two hour blocks that I think work with our global presence | 19:52 |
fungi | i gather there's communication going out in the next day or two from the event planners to try and nail down requirements for collaboration software | 19:52 |
corvus | i'll see about working out what the deal is with the python version there | 19:52 |
clarkb | corvus: I think mordred pushed a fix for it | 19:52 |
frickler | o.k., but that'd need some big push IMO. I'll try to get some time allocated for that | 19:52 |
corvus | oh awesome | 19:52 |
mordred | corvus: the root cause was fun | 19:52 |
clarkb | Monday 1300-1500 UTC, Monday 2300-0100 UTC, Wednesday 0400-0600 UTC | 19:52 |
mordred | corvus: https://review.opendev.org/#/c/721707/ | 19:52 |
clarkb | those are the blocks I think will work so that we can each attend ~2 out of ~3 without too much pain | 19:52 |
clarkb | if there isn't any immediate objection to those blocks I can go ahead and sign us up for them (and tweak later if necessary) | 19:53 |
clarkb | (and yes it will mean an early morning or a late night for many of us if you intend to hit 2 of the 3) | 19:54 |
clarkb | (but it seemed to be an equitable distribution when I wrote out the times in a table ) | 19:54 |
frickler | +1 from me | 19:55 |
fungi | yeah, i'll make myself available whenever | 19:55 |
AJaeger | LGTM | 19:55 |
fungi | maybe i can even swing all three with appropriate quantities of caffeine coursing through my veins | 19:55 |
clarkb | cool. I probably won't get to signing up for those times until later today so let me know if there is a major conflict | 19:55 |
clarkb | Next up is the wiki update but I think we can skip it due to time | 19:55 |
clarkb | which takes us to etherpad | 19:56 |
fungi | etherpad is dead, long live etherpad? | 19:56 |
clarkb | as part of mordred's container/ansible/cd work the old etherpad servers are gone including etherpad-dev | 19:56 |
clarkb | we haven't replaced that server and will instead rely on system-config end to end testing | 19:56 |
clarkb | the idea mordred and I had was if we need to verify UI behavior we can hold a test node and use it to manually verify off what zuul built | 19:56 |
clarkb | I expect this will work reasonably well as a tool we can leverage for various services | 19:57 |
mordred | yeah - if we find that sucks - we can always spin up a new etherpad-dev | 19:57 |
clarkb | Wanted to call this out as a separate agenda because if it isn't working well then that feedback would be good to hear | 19:57 |
clarkb | mordred: ++ | 19:57 |
clarkb | #topic Open Discussion | 19:57 |
*** openstack changes topic to "Open Discussion (Meeting topic: infra)" | 19:57 | |
clarkb | alright, we have a few minutes for anything else | 19:57 |
clarkb | Sounds like that might be it. Thank you everyone! | 19:59 |
clarkb | #endmeeting | 19:59 |
*** openstack changes topic to "Incident management and meetings for the OpenDev sysadmins; normal discussions are in #opendev" | 19:59 | |
openstack | Meeting ended Tue Apr 21 19:59:30 2020 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 19:59 |
openstack | Minutes: http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-04-21-19.01.html | 19:59 |
openstack | Minutes (text): http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-04-21-19.01.txt | 19:59 |
fungi | #link https://governance.openstack.org/sigs/reference/sig-guideline.html#select-sig-chairs sig chair responsibilities | 19:59 |
openstack | Log: http://eavesdrop.openstack.org/meetings/infra/2020/infra.2020-04-21-19.01.log.html | 19:59 |
fungi | d'oh, too late | 19:59 |
fungi | will follow up to the ml like i originally decided | 19:59 |