opendevreview | Eduardo Santos proposed openstack/diskimage-builder master: Bump Ubuntu release to focal https://review.opendev.org/c/openstack/diskimage-builder/+/806296 | 02:28 |
---|---|---|
*** ysandeep|away is now known as ysandeep | 05:36 | |
opendevreview | Eduardo Santos proposed openstack/diskimage-builder master: General improvements to the ubuntu-minimal docs https://review.opendev.org/c/openstack/diskimage-builder/+/806308 | 05:41 |
opendevreview | Ian Wienand proposed openstack/diskimage-builder master: yum-minimal: use DNF tools on host https://review.opendev.org/c/openstack/diskimage-builder/+/806318 | 06:38 |
*** rpittau|afk is now known as rpittau | 07:07 | |
opendevreview | Riccardo Pittau proposed openstack/diskimage-builder master: Fix debian-minimal security repos https://review.opendev.org/c/openstack/diskimage-builder/+/806188 | 07:22 |
opendevreview | Ian Wienand proposed openstack/diskimage-builder master: yum-minimal: use DNF tools on host https://review.opendev.org/c/openstack/diskimage-builder/+/806318 | 07:35 |
*** jpena|off is now known as jpena | 07:38 | |
*** mgoddard- is now known as mgoddard | 07:58 | |
opendevreview | Ian Wienand proposed openstack/diskimage-builder master: yum-minimal: use DNF tools on host https://review.opendev.org/c/openstack/diskimage-builder/+/806318 | 08:23 |
*** ykarel is now known as ykarel|lunch | 08:24 | |
opendevreview | Ian Wienand proposed openstack/diskimage-builder master: yum-minimal: use DNF tools on host https://review.opendev.org/c/openstack/diskimage-builder/+/806318 | 09:07 |
*** ysandeep is now known as ysandeep|lunch | 09:12 | |
*** ysandeep|lunch is now known as ysandeep | 09:36 | |
opendevreview | Ian Wienand proposed openstack/diskimage-builder master: yum-minimal: use DNF tools on host https://review.opendev.org/c/openstack/diskimage-builder/+/806318 | 09:53 |
*** ykarel|lunch is now known as ykarel | 10:04 | |
*** jpena is now known as jpena|lunch | 11:34 | |
opendevreview | Jean Paul Gatt proposed openstack/diskimage-builder master: Support setting custom repository for Redhat Satellite. https://review.opendev.org/c/openstack/diskimage-builder/+/806395 | 11:37 |
lucasagomes | hi folks, if you have some time today mind taking a look at this new project proposal https://review.opendev.org/c/openstack/project-config/+/805802 ? It needs another +2. Thanks in advance! | 12:17 |
*** dviroel|out is now known as dviroel|ruck | 12:17 | |
fungi | config-core: ^ | 12:22 |
fungi | i +2'd it yesterday, but we've been spread a bit thin on reviewers this week | 12:23 |
lucasagomes | fungi, it's all good! thanks much for reviewing it | 12:28 |
*** jpena|lunch is now known as jpena | 12:31 | |
*** rpittau is now known as rpittau|afk | 13:42 | |
opendevreview | Merged openstack/project-config master: New project: OVN BGP Agent https://review.opendev.org/c/openstack/project-config/+/805802 | 13:54 |
opendevreview | Monty Taylor proposed openstack/diskimage-builder master: yum-minimal: use DNF tools on host https://review.opendev.org/c/openstack/diskimage-builder/+/806318 | 14:02 |
*** ysandeep is now known as ysandeep|away | 14:41 | |
*** ykarel is now known as ykarel|away | 14:43 | |
*** dviroel|ruck is now known as dviroel|ruck|lunch | 14:57 | |
clarkb | looks like it got approved | 15:03 |
opendevreview | James E. Blair proposed opendev/system-config master: Pin base and builder images to buster https://review.opendev.org/c/opendev/system-config/+/806423 | 15:03 |
clarkb | Looking at my day I expect to be around a lot more today, but probably not with consistent blocks of time to do things like the job mapping or gerrit account cleanup. | 15:03 |
clarkb | Will probably try to focus on reviews and helping with bullseye stuff | 15:04 |
corvus | mordred, fungi: take a look at 806423 -- the commit msg there outlines a plan if we want to do this stepwise | 15:04 |
corvus | i think we should either do the plan in the commit message, or manually push the tag and try to push things through quickly | 15:05 |
corvus | it's just that i'm not sure how much time folks have to devote to this. if we can't wrap it up more or less immediately, i think 806423 is a good idea because it gets us back to a stable/maintainable state quickly and we can do the transition slowly | 15:06 |
clarkb | corvus: the plan outlined there makes sense to me. Then we can shift to bullseye on a case by case basis which limits blast radius as we go | 15:06 |
mordred | corvus: reading | 15:07 |
clarkb | and buster isn't eol for a while so no concern there | 15:07 |
mordred | corvus: ++ | 15:09 |
fungi | i'm good with the plan | 15:09 |
fungi | and sorry i'm intermittently responsive, hurricane shutter people still trying to finish up and there's a crew here foam-insulating the attic all day too | 15:10 |
lucasagomes | infra-root hi, the patch https://review.opendev.org/c/openstack/project-config/+/805802 is now merged (thanks!) and the gerrit groups have been created (https://review.opendev.org/admin/groups/e7b50bd1c1fcb7b67d184c0df2de5bdee06b7b03 and https://review.opendev.org/admin/groups/b47387254ae1ed75db7479f1725759e43500dced) can you please add me to these groups? | 15:10 |
lucasagomes | so I can start adding the rest of the members, thanks! | 15:10 |
fungi | lucasagomes: yep, just a moment and i'll take care of it | 15:10 |
lucasagomes | fungi++ thanks much! | 15:10 |
mordred | corvus: +A | 15:10 |
fungi | lucasagomes: done, let me know if it's not working like you expect | 15:13 |
lucasagomes | fungi, thanks much! Will do | 15:13 |
clarkb | fungi: I have approved https://review.opendev.org/c/opendev/system-config/+/805407 to reflect the upgrade you did. thanks! | 15:28 |
fungi | oh, thanks! i should have done that, sorry i forgot | 15:29 |
fungi | also i guess it's time to think about scheduling a similar lists.o.o upgrade | 15:30 |
clarkb | yup, do we want to discuss that in next week's meeting? | 15:31 |
clarkb | the openstack release makes that maybe a little tricky, but overall things went well so really would only expect issues in our vhost setup? | 15:31 |
fungi | that's my thinking too | 15:31 |
fungi | i'd be happy to do the upgrade on a saturday or something to minimize impact of the (longer in this case mainly because snapshotting will take longer) outage | 15:32 |
fungi | or we could try snapshotting it live instead and just live with the possibility of a janky filesystem if we need to boot from the image | 15:33 |
clarkb | do you recall how long the snapshot took last time? | 15:33 |
fungi | not really, a few hours i think | 15:33 |
clarkb | probably the thing to do is propose a date and then ask openstack et al to weigh in if they have concerns with the timing. Then OpenStack can indicate the conflicts for their release if problematic | 15:34 |
fungi | i'll see if i can find a more precise duration for the last lists.o.o image i made | 15:36 |
clarkb | fungi: I'm happy to help on a weekend too though definitely not this one. This week was exhausting :) | 15:39 |
*** jpena is now known as jpena|off | 15:39 | |
fungi | yeah, of course not this weekend ;) | 15:40 |
fungi | i don't have plans next weekend, though that's a holiday weekend in the usa i think? so you're likely to have plans | 15:41 |
clarkb | I don't have concrete plans but ya it is the weekend before school starts here | 15:42 |
fungi | looks like i made the last lists.o.o image on 2021-05-04 | 15:43 |
fungi | "5:15 PM WET" according to rackspace... what tz is that? | 15:44 |
clarkb | western european time zone says google | 15:45 |
clarkb | weird that rax would report timestamps in that timezone unless this was a hack to get utc because they don't use utc directly? | 15:45 |
clarkb | it is utc+0 | 15:45 |
fungi | aha | 15:45 |
fungi | yeah, except it may also have a daylight time | 15:45 |
fungi | anyway, that's coarse enough for me to hopefully find it in the irc log | 15:45 |
fungi | ahh, right, this was in prep for esm enrollment with the exim vulnerability around that time | 15:48 |
fungi | CVE-2020-28020 | 15:50 |
lucasagomes | fungi, FYI everything seems to be working great! Thanks much | 15:50 |
fungi | you're welcome, glad to hear it | 15:51 |
fungi | clarkb: looks like i started creating the image at 17:15 utc and remarked at 21:52 utc that it had finally completed, though not sure i caught it right when it completed. i did remark at 20:00 utc that it was still being created, so that means somewhere between 2.75 and 4.5 hours | 15:53 |
fungi | keeping in mind that was imaging it while the server was running, so if we do take it offline to create the image instead it might be quicker (no guarantees though) | 15:55 |
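The timing bounds fungi worked out above can be double-checked with a quick calculation; the timestamps come straight from his messages (image creation started 17:15 UTC, still running at 20:00, observed complete at 21:52):

```python
from datetime import datetime

# Timestamps (UTC, 2021-05-04) taken from the discussion above.
start = datetime(2021, 5, 4, 17, 15)
still_running = datetime(2021, 5, 4, 20, 0)   # last "still being created" check
observed_done = datetime(2021, 5, 4, 21, 52)  # first observation of completion

lower_bound_h = (still_running - start).total_seconds() / 3600
upper_bound_h = (observed_done - start).total_seconds() / 3600

print(f"snapshot took between {lower_bound_h:.2f} and {upper_bound_h:.2f} hours")
# the lower bound is 2.75 hours; the upper bound works out to about 4.62 hours
```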
opendevreview | Merged opendev/system-config master: Test lists.kc.io on focal https://review.opendev.org/c/opendev/system-config/+/805407 | 16:00 |
opendevreview | sean mooney proposed openstack/project-config master: Add review priority label to nova deliverables https://review.opendev.org/c/openstack/project-config/+/787523 | 16:02 |
clarkb | fungi: ya we can't be sure of what the limitation is there (disk iops or catching up with running changes, etc) | 16:05 |
*** dviroel|ruck|lunch is now known as dviroel|ruck | 16:06 | |
clarkb | if we figure 4.5 hours to snapshot then allocate 2 hours for the upgrade itself that is still doable within a day. | 16:07 |
fungi | yep | 16:14 |
fungi | related, the openinfra foundation staff are interested in moving a bunch of general foundation mailing lists from the lists.openstack.org site to a new lists.openinfra.dev site, which i'm pushing out until post-upgrade because i want to make sure i can get an accurate picture of what the memory pressure increase from that will likely be | 16:19 |
fungi | should be able to check the output of free, stop all mailman services, run free again, take the difference and divide by 5 to get our average per-site memory consumption, then project what it would look like to increase overall memory utilization on the server by a similar amount | 16:21 |
fungi | it's not precise, but should be good enough to gauge the potential impact | 16:22 |
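The estimate fungi describes can be sketched as arithmetic; all the memory figures below are hypothetical placeholders, not real measurements from lists.openstack.org:

```python
# Sketch of the estimate: measure memory used with all mailman services
# running, stop them, measure again, divide the difference by the number of
# list sites (5 here) to get a per-site average, then project adding a site.
# The `free`-style readings below are made-up numbers for illustration only.

MIB = 1024 * 1024

used_with_mailman = 3800 * MIB     # hypothetical "used" with services running
used_without_mailman = 2300 * MIB  # hypothetical "used" after stopping them
num_sites = 5

per_site = (used_with_mailman - used_without_mailman) / num_sites
projected_total = used_with_mailman + per_site  # one additional site

print(f"per-site average: {per_site / MIB:.0f} MiB")
print(f"projected usage with one more site: {projected_total / MIB:.0f} MiB")
```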
clarkb | I think it may be apache that consumes a good chunk of the memory too which we can potentially tune down | 16:23 |
opendevreview | Merged openstack/project-config master: Add review priority label to nova deliverables https://review.opendev.org/c/openstack/project-config/+/787523 | 16:23 |
fungi | though i guess i can also try to infer the impact of the upgrade by performing a similar exercise pre-upgrade (now) and comparing to the same on lists.k.i since it's already upgraded | 16:23 |
fungi | taking the 5x multiplier into account | 16:24 |
fungi | i'll try to find a quiet time this weekend to do all that quickly | 16:24 |
clarkb | might be a good idea to do that anyway | 16:24 |
clarkb | since it will indicate if 5x on focal is likely to cause problems | 16:24 |
fungi | yep | 16:24 |
fungi | my concern as well | 16:25 |
fungi | it's mostly python-based daemons, which means individual python processes, and those aren't exactly light on memory use | 16:25 |
corvus | fungi: any chance you can take the opportunity to move those lists to mm3? | 16:25 |
fungi | corvus: i see the operating system upgrade as making that easier, but would probably want to do that as a separate step | 16:26 |
clarkb | the plan we had been operating under was convert to ansible (done), upgrade to focal (in progress), then figure out mm3 on the modern tooling and os | 16:26 |
fungi | corvus: but yes, ultimately i do want to do that | 16:26 |
corvus | Ack. Just wondered if we could skip a step, sounds like no because tooling isn't ready | 16:27 |
fungi | however, i'm not sure if we have sufficient resources on the existing server to run it all side-by-side | 16:27 |
fungi | that's another thing we need to dig into | 16:27 |
fungi | i think the running mm3 part shouldn't be hard. i already did a poc a few years back with distro packages though we could switch to using the semi-official container images | 16:28 |
clarkb | fwiw using our system-config-run- jobs we can test the deployment of mm3 and possibly even an upgrade from mm2 to mm3 on focal like what I've done with gerrit 3.2 -> 3.3 | 16:28 |
clarkb | that means we can make progress on the operating system upgrade and mm3 concurrently if we have time | 16:29 |
fungi | in theory once everything is moved to mm3 the memory footprint could be smaller, as we get full multi-domain support in doing so and can drop the current vhost setup | 16:29 |
fungi | mm3 correctly maps name@domain to a unique list, so no need to worry about collisions over the name portion (which is the primary concern with mm2's domain functionality) | 16:30 |
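The addressing difference fungi describes can be illustrated with a toy model (this is not Mailman's actual data model): mm2 effectively keys lists by the name portion alone, so two domains cannot both have a list with the same name, while mm3 keys by the (name, domain) pair:

```python
# Toy illustration of mm2 vs mm3 list addressing; not real Mailman internals.
mm2_lists = {}  # keyed by name only
mm3_lists = {}  # keyed by (name, domain)

def mm2_create(name, domain):
    # a second list with the same name collides regardless of domain
    collided = name in mm2_lists
    mm2_lists[name] = domain
    return collided

def mm3_create(name, domain):
    key = (name, domain)
    collided = key in mm3_lists
    mm3_lists[key] = domain
    return collided

assert mm2_create("general", "lists.a.example") is False
assert mm2_create("general", "lists.b.example") is True   # mm2 collision
assert mm3_create("general", "lists.a.example") is False
assert mm3_create("general", "lists.b.example") is False  # fine in mm3
```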
clarkb | the buster image switch should merge momentarily. Then we cross fingers for happy promotion | 17:05 |
mordred | Woot | 17:06 |
opendevreview | Merged opendev/system-config master: Pin base and builder images to buster https://review.opendev.org/c/opendev/system-config/+/806423 | 17:11 |
clarkb | base 3.8 and uwsgi base 3.9 failed to promote | 17:13 |
clarkb | now where that gets interesting is if we've got a bullseye 3.8 builder and a buster 3.8 base | 17:13 |
clarkb | I don't think that is an issue in this case | 17:13 |
clarkb | https://zuul.opendev.org/t/openstack/build/48199b22f7d74af292fe77a93ae17787/console says we failed after the promotion if I read that correctly. Someone other than myself should probably double check and confirm that | 17:14 |
clarkb | the uwsgi builds seem to be in a similar situation | 17:17 |
mordred | yeah - that happened yesterday - there's issues in the cleanup | 17:54 |
fungi | so what's the next step? we can just merge the fix and then follow up with the unpin? | 18:02 |
Clark[m] | I may be missing some context but wasn't that a fix? | 18:04 |
Clark[m] | Since it sets us back to where we were? | 18:04 |
fungi | the pin? i thought that was so we could merge dib fixes | 18:09 |
fungi | the yum->dnf switch | 18:09 |
fungi | or am i crossing the streams? | 18:09 |
mordred | yeah - the buster pin should open the door for us to fix dib and nodepool | 18:09 |
mordred | without things being blocked while we work it | 18:10 |
fungi | right, cool | 18:10 |
fungi | Clark[m]: so you're not necessarily missing context, it was a fix for the fix | 18:10 |
fungi | so that we can fix it fixed and then unfix once it no longer needs fixing | 18:10 |
fungi | because that's in no way confusing | 18:11 |
Palaver | I would need to push to zuul an image of an OS to run tests for a change for kolla-ansible | 18:51 |
mazzy | Who can help with that? | 18:53 |
fungi | mazzy who can help with what? or are you also Palaver? | 18:53 |
mazzy | I spoke already with a core maintainer of kolla ansible project and he redirect | 18:53 |
mazzy | Yes fungi. It's me, wrong nickname | 18:53 |
mordred | fungi, Clark, corvus: I'm working on a followup patch for the buster/bullseye stuff | 18:55 |
fungi | mazzy: our nodepool-builder servers build operating system images using diskimage-builder, what specifically are you looking for? we may already have a representative image of the linux distribution and version on which you want to test | 18:55 |
mazzy | fungi: thanks. The image I would need is of Flatcar | 18:56 |
*** odyssey4me is now known as Guest5585 | 18:56 | |
mazzy | I'm not sure you have it already | 18:56 |
mordred | this: https://www.kinvolk.io/flatcar-container-linux/ right? | 18:57 |
mazzy | mordred: correct | 18:57 |
fungi | mazzy: so the first step would be making sure https://pypi.org/project/diskimage-builder has elements for building images of that | 18:57 |
mazzy | I would address the stable version | 18:58 |
mazzy | Flatcar distributes several images for several platforms | 18:58 |
fungi | once it's supported by diskimage-builder, we'd add configuration for it to our nodepool-builder servers so they would start building images of it | 18:58 |
mordred | gotcha. that might not be super immediately compatible with how we run vms - a decent amount of work would need to go into the zuul jobs - the base jobs are going to make a lot of assumptions about being able to ansible in to a node and do stuff. so some design work would need to be done to figure out the best way to accomplish using it | 18:58 |
mazzy | I've already tested kolla against flatcar | 18:59 |
mazzy | And it works | 18:59 |
mazzy | I was able to spin up a fleet just by changing a few lines of kolla-ansible | 19:00 |
fungi | we'd also want to get package mirrors and python wheel builders set up to reduce network overhead from installing things on the nodes | 19:00 |
mordred | that's not the issue - it's the mechanism of interaction. I'm sure it's solvable, but it'll take more design than just being able to boot one | 19:00 |
mordred | fungi: that's the thing - you dont install things on those nodes | 19:00 |
mordred | this is coreos - just different | 19:00 |
fungi | um, you don't install things? | 19:00 |
fungi | how do you install kolla on it? | 19:00 |
mordred | it's immutable base os designed for running containers | 19:00 |
mordred | so you can run containers | 19:00 |
mazzy | Exactly | 19:00 |
mazzy | I have just installed Python | 19:00 |
mordred | but ansibling in and running pip install is not going to work | 19:01 |
mazzy | And that's it | 19:01 |
fungi | ahh, we don't test on containers, we test on virtual machines, so presumably kolla would have jobs to build flatcar linux containers on some other distro in that case | 19:01 |
mazzy | Python is still possible to be installed along with pip | 19:01 |
fungi | when we test container things, we install the containers onto virtual machines managed by nodepool | 19:01 |
mordred | mazzy: I assume one bootstraps via cloud-init like with coreos? | 19:02 |
mazzy | Ignition | 19:02 |
fungi | so zuul isn't communicating directly with containers to run the jobs, it's communicating with the virtual machines where those containers are installed | 19:02 |
mordred | fungi: yah - flatcar is the os for the VM | 19:02 |
fungi | er, how can it be immutable then? | 19:03 |
mordred | but it has a vastly different operating model than what we expose | 19:03 |
mordred | the base filesystem is an immutable snapshot. when you boot it, you provide cloud-config info that tells it what containers you want it to run | 19:03 |
mazzy | Because everything runs in containers and it does not have any package manager | 19:03 |
fungi | you still need to be able to write somewhere on the node | 19:03 |
mordred | yah - there's usually data volumes | 19:03 |
mazzy | Correct | 19:04 |
mordred | but they're accessed/exposed as container volumes | 19:04 |
fungi | mazzy: well, anyway, i guess this discussion highlights that flatcar isn't designed like a typical linux distribution, so you'll want to get very familiar with how zuul and nodepool work before trying to design a way to integrate them | 19:04 |
mordred | the model of "shell in to the node with ssh and perform os commands to do stuff" is not the model | 19:04 |
mordred | yeah - I do think it's possible | 19:04 |
mordred | but it's going to be very non-trivial | 19:04 |
mazzy | Wait a sec | 19:05 |
mazzy | What I'm trying to do is not deliver support of the base os for the containers. | 19:05 |
mazzy | I'm trying to deliver support for the os where containers will run | 19:05 |
mordred | right | 19:05 |
mordred | that's what's going to be very non-trivial to support | 19:06 |
mordred | mazzy: for context, we don't use cloud metadata services for instance specialization *AT ALL* | 19:06 |
mazzy | Sorry do not follow you | 19:06 |
fungi | the way our ci jobs normally work is that nodepool builds an image which contains cached copies of a lot of stuff, boots virtual machines from those images, allocates them to builds for jobs when requested by zuul, then the zuul executor connects to the node(s) for the build via ssh to run the playbooks for those jobs, which generally involves installing the things the job will need into the | 19:06 |
fungi | node and then running some testsuite and collecting results/artifacts and reporting results, at which time the nodes are returned and garbage-collected to free cloud quota | 19:06 |
mazzy | What does that mean? | 19:06 |
mordred | so we currently don't even have a mechanism to pass any info to ignition | 19:06 |
mordred | yah. what fungi said. we don't expose any interface to the cloud mechanisms that you would need to interact with to be able to boot and interact with a flatcar os image | 19:07 |
fungi | we expect the cloud provider to pass information to the virtual machine via configdrive, so it knows how to configure networking so that zuul will be able to connect to it | 19:08 |
fungi | we use a lightweight agent https://pypi.org/project/glean which is like a very stripped-down cloud-init replacement | 19:08 |
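The configdrive mechanism fungi mentions delivers networking information as a `network_data.json` file, which an agent like glean reads at boot. A minimal sketch of that step, using a hand-written sample document (the real file follows the OpenStack network_data.json schema, and real agents do much more):

```python
import json

# Illustrative network_data.json of the kind a cloud writes to the config
# drive; this sample is hand-written, not copied from a real cloud.
sample = json.loads("""
{
  "links": [
    {"id": "eth0", "type": "phy", "ethernet_mac_address": "fa:16:3e:00:00:01"}
  ],
  "networks": [
    {"id": "net0", "link": "eth0", "type": "ipv4",
     "ip_address": "203.0.113.10", "netmask": "255.255.255.0"}
  ]
}
""")

# Pair each static IPv4 network with its interface, roughly the first thing
# an agent must do before writing out interface configuration.
configs = [
    (net["link"], net["ip_address"], net["netmask"])
    for net in sample["networks"] if net["type"] == "ipv4"
]
print(configs)  # [('eth0', '203.0.113.10', '255.255.255.0')]
```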
mordred | now - I'm 100% certain a solution could be designed. but it'll be some deep design and interaction | 19:08 |
mordred | fungi: yah - so in flatcar they use Ignition instead of cloud-init - which is like a more powerful beefed _up_ cloud-init replacement ;) | 19:09 |
mazzy | Ok but in my case I could bundle a flatcar image with just Python and pip installed. This is the only thing I need | 19:09 |
mordred | you need to be able to ssh into the node | 19:09 |
mordred | and you need to be able to interact with the node to do your actions via that ssh connection | 19:10 |
fungi | also if we don't rebuild the image with cached copies of the openstack projects, the builds will end up fetching tons of git state on every run | 19:10 |
fungi | we generally rebuild all our operating system images daily | 19:10 |
mordred | well - I imagine the kolla container builds would happen on a different node and these would use the built containers images | 19:11 |
fungi | to make sure their contents and cached data are as current as is manageable | 19:11 |
mordred | but we would want a variation of the base jobs that did not attempt to push the git repo contents to the flatcar node | 19:11 |
fungi | if the idea is to test changes to kolla, then the kolla images would need to be built somewhere by the job, so yeah it could be a multi-node job which uses an ubuntu node to build the images and then a flatcar node gets them deployed from an instantiated registry, or there could be a build job and a flatcar test job depending on that sharing images from the buildset registry | 19:12 |
mazzy | How do you usually ssh into the image? | 19:12 |
*** sshnaidm is now known as sshnaidm|afk | 19:12 | |
mazzy | User/pwd? | 19:12 |
fungi | mazzy: rsa key | 19:13 |
mordred | there's a public key | 19:13 |
fungi | supplied via cloud provider metadata and installed at boot by glean | 19:13 |
mordred | fungi: it's like our opendev servers where we are just running docker-compose. except instead of building from a base os, installing docker with ansible and then putting the compose file on the node, docker would be pre-baked in and we'd pass the compose file in at instance boot time via instance metadata | 19:13 |
mazzy | Iirc flatcar should still support cloudinit | 19:14 |
fungi | or are we installing them into the built vm images with a nodepool element? i need to double-check | 19:14 |
mordred | fungi: I think we moved to keys via instance metadata | 19:14 |
mordred | but you might be right :) | 19:14 |
fungi | well, we do set a ZUUL_USER_SSH_PUBLIC_KEY in the env-vars for nodepool builders | 19:15 |
fungi | so i think it's baked into the images | 19:15 |
mazzy | https://github.com/kinvolk/coreos-cloudinit | 19:15 |
mazzy | Yeah seems cloudinit still supported | 19:15 |
fungi | mazzy: while you were gone i think i convinced myself we just bake the zuul ssh public key into our node images anyway | 19:16 |
Clark[m] | We don't use glean or cloud-init for the zuul credentials. We do use glean for our root credentials | 19:16 |
Clark[m] | A better approach might be to reboot into flatcar | 19:16 |
Clark[m] | That way zuul can bootstrap per usual then the job can convert itself to the target | 19:16 |
Clark[m] | But I'm not sure how feasible that flip would be | 19:17 |
fungi | oh, could even do something like kexec maybe to save time | 19:17 |
mazzy | Which type the images you use are? | 19:17 |
mazzy | Are qemu images? | 19:18 |
Clark[m] | It depends on the cloud. We build a single image with diskimage-builder then convert it to raw, qcow2, and vhd for various clouds | 19:20 |
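The format fan-out Clark describes can be sketched by assembling the conversion commands; the commands are built here rather than run, and `qemu-img convert -O vpc` is qemu's spelling for VHD output (diskimage-builder can also do this directly via its `-t` option):

```python
# Map the file extensions used above to qemu-img output format names.
# "vpc" is qemu-img's name for the VHD format.
FORMATS = {"raw": "raw", "qcow2": "qcow2", "vhd": "vpc"}

def convert_cmd(src, name, fmt):
    """Assemble (without running) a qemu-img convert command line."""
    return ["qemu-img", "convert", "-O", FORMATS[fmt], src, f"{name}.{fmt}"]

# One source image converted to each format a cloud might require.
for fmt in FORMATS:
    print(" ".join(convert_cmd("node.qcow2", "node", fmt)))
```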
mazzy | Interesting because they already do that | 19:20 |
mazzy | Kinvolk builds images for anything out there | 19:20 |
mazzy | https://stable.release.flatcar-linux.net/amd64-usr/current/ | 19:21 |
Clark[m] | Right but we want to build our own images. It allows us to control what goes in them and ensure they are all identical other than format across our clouds | 19:21 |
mordred | yeah. booting the image is not the problem | 19:21 |
Clark[m] | And unfortunately we tend to have clouds that can't boot random internet images | 19:21 |
mordred | the problem will be interacting with the image once it is booted | 19:21 |
Clark[m] | Well it is part of the problem due to networking setup | 19:21 |
Clark[m] | But then once that is solved you have the next issue of talking to the image | 19:22 |
mazzy | Well but that's exactly the point. If we bake the image then we can push what we want into the image | 19:22 |
mordred | mazzy: we can - but we still can't have a _job_ pass anything to ignition | 19:23 |
Clark[m] | It might be helpful to talk about what you are trying to achieve with flatcar and our CI system | 19:23 |
mazzy | I mean at the moment with the flatcar builder they provide I can bake everything inside of the official image | 19:23 |
mazzy | Clark[m]: adding Flatcar support to kolla | 19:24 |
mordred | right - but to make use of that you'd need the ability to have zuul job build a flatcar image containing the kolla images you need inside of it ... and then the ability to boot that flatcar image in the cloud. that's super unpossible. the next option would be to have one job build kolla images and then a second job boot a flatcar instance that you would run kolla containers inside of - but unless you can start those containers via an ansible | 19:25 |
mordred | ssh connection, which is not how people use flatcar, it's still going to be an issue | 19:25 |
fungi | yeah, maybe part of the confusion here is that we don't boot nodes in a job, we boot nodes and then jobs ask for an available already booted node to run on | 19:25 |
mordred | right | 19:25 |
fungi | so the nodes are not job-specific | 19:25 |
fungi | they are generic nodes booted with a representative of whatever general-purpose operating system they are meant to replicate | 19:26 |
mazzy | But we can leverage flatcar official tools to build image for zuul jobs | 19:27 |
mazzy | Those are open source | 19:27 |
fungi | yes, there could for example be diskimage-builder elements for building generic flatcar images | 19:27 |
fungi | and then we could boot generic flatcar virtual machines from them which jobs could request | 19:27 |
mazzy | Exactly | 19:28 |
fungi | and then *magic happens* to add the things which the job will need in the running flatcar virtual machine | 19:28 |
mordred | yup. that part is easy enough (it's work, but it's easy) | 19:28 |
mordred | yah. that's the part that needs a story | 19:28 |
mordred | mazzy: so - once we have booted a flatcar instance for your job, how do you expect to run kolla containers in it? | 19:28 |
mazzy | mordred: what do you mean? | 19:29 |
fungi | how do you install kolla once flatcar is booted | 19:29 |
mazzy | Running ansible playbooks | 19:29 |
fungi | got it | 19:29 |
mazzy | This is what I've already done and fully tested in my servers today | 19:30 |
mordred | ok. so you can ssh from outside the flatcar instance and do all the things via ansible? | 19:30 |
mazzy | Exactly | 19:30 |
fungi | so zuul will ssh into a running flatcar virtual machine, then run ansible playbooks to install kolla | 19:30 |
mazzy | Correct | 19:30 |
mordred | ok - sweet. that's not nearly as much of a mismatch than we were worried about | 19:30 |
mordred | in that case, then, adding dib support for building a flatcar image is going to be the main work | 19:31 |
fungi | so in that sense it should work like a generic gnu/linux distribution on a virtual machine | 19:31 |
mazzy | Exactly. | 19:31 |
mazzy | To be used, the flatcar image must come with Python installed | 19:31 |
mazzy | Which of course is not the case | 19:31 |
mazzy | And we need to bake with Python | 19:32 |
fungi | yeah, we install python into all our virtual machine images when building them | 19:32 |
mazzy | Or ask to ansible to install it | 19:32 |
mordred | heh. ... https://kinvolk.io/docs/flatcar-container-linux/latest/reference/developer-guides/sdk-modifying-flatcar/#using-cork <-- | 19:32 |
mordred | that works with chroots already | 19:32 |
mordred | so adding dib support for flatcar stuff might not be horribly difficult | 19:32 |
mazzy | Cork -yeah their tool | 19:32 |
mazzy | Cork and mantle to be exact | 19:33 |
mazzy | What is your opinion on Python? When should it be installed? | 19:33 |
fungi | making sure glean works instead of cloud-init may be useful, for consistency with our other images | 19:33 |
mordred | yeah | 19:34 |
mordred | mazzy: it'll need to be in the image - first thing we do with one is ansible in - so without python that will not work well :) | 19:34 |
fungi | mazzy: we install a "default" python but allow jobs to install other versions of python after starting if they need different interpreter versions | 19:34 |
Clark[m] | fungi: not just consistency but cloud-init may not work with certain clouds | 19:34 |
mazzy | Make sense | 19:35 |
mazzy | So we need to bake it with Python | 19:35 |
mordred | "Flatcar Container Linux is based on ChromiumOS, which is based on Gentoo." <-- there's already gentoo support in dib - so ultimately it's possible it might be fairly straightforward to add support | 19:35 |
fungi | but yeah, we bake some python version into all our virtual machine images just to make things go more smoothly with ansible | 19:36 |
mazzy | Ok cool cool. Seems we are on the same page. Then what's next? | 19:37 |
mazzy | Should I create a proposal anywhere? | 19:37 |
fungi | (generally whatever the default python3 interpreter is for the distro, but if the distro doesn't have a default i guess we pick one) | 19:37 |
mordred | mazzy: look at https://opendev.org/openstack/diskimage-builder | 19:37 |
mordred | that's what nodepool uses to build images for zuul | 19:37 |
mordred | it'll need flatcar support | 19:37 |
fungi | yeah, that's essentially our entry point for creating images | 19:37 |
mazzy | Yes. So I can wire changes in there. Cool cool | 19:38 |
fungi | nodepool is going to call diskimage-builder specifying some set of elements, for example debian-minimal | 19:38 |
mazzy | In our case I would like to start simple. Address only the stable version | 19:38 |
mazzy | Which Python do you usually use? | 19:39 |
mazzy | I used to run PyPy on flatcar because it's easy to bundle and install | 19:39 |
fungi | we also have functional test jobs which run on proposed diskimage-builder changes to make sure the built image can be booted under devstack and reached by network, so should be a fairly good indicator to reviewers whether it's working | 19:40 |
fungi | mazzy: cpython (generally a supported 3.x) | 19:40 |
mazzy | Perfect. Oh Wait sec | 19:40 |
mazzy | Important point... Networking | 19:40 |
fungi | would probably make sense to use cpython 3.9.whatevers-latest | 19:40 |
mazzy | There is dhcp around? | 19:41 |
mordred | on some clouds | 19:41 |
fungi | it depends on the cloud | 19:41 |
mordred | https://opendev.org/openstack/project-config/src/branch/master/nodepool/nb03.opendev.org.yaml#L86-L96 <-- this is the list of elements we include in our dib builds | 19:41 |
fungi | the configdrive metadata will provide overrides if defaulting to dhcp isn't an option | 19:41 |
mordred | simple-init is one of the ones you'll really need to focus on - it's what installs glean which is what deals with networking | 19:41 |
mordred | it already supports gentoo - so it's likely not that bad to handle | 19:42 |
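[Editor's note: the element list mordred links and the simple-init point above can be sketched as a minimal nodepool diskimage definition. This is illustrative only; the real list used by opendev's builders is in the linked nb03.opendev.org.yaml, and the image name and env-vars here are assumptions, not opendev's actual configuration.]

```yaml
# Illustrative nodepool builder config fragment (not opendev's real config).
diskimages:
  - name: debian-bullseye
    elements:
      - debian-minimal   # base OS element passed to diskimage-builder
      - vm               # partitioning/bootloader bits for bootable VM images
      - simple-init      # installs glean, which handles network configuration
    env-vars:
      DIB_RELEASE: bullseye
```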
mazzy | Gotcha | 19:42 |
mazzy | Ok should be all clear for the moment | 19:46 |
mazzy | I will get back definitely in case of some blockers | 19:46 |
fungi | we'll be around! | 19:47 |
fungi | we also have a mailing list, service-discuss@lists.opendev.org | 19:47 |
mazzy | Thanks a lot | 19:48 |
mazzy | Noted! | 19:48 |
fungi | lance is asking whether we still have any leaks in the osuosl environment... i guess i can just look for images or server instances with a non-current date in our account there? anything else need checking? | 19:57 |
Clark[m] | That was what I did last time. Also volumes as it is boot from volume iirc | 19:58 |
fungi | i only see two of each of our image types in openstack image list, and just one server instance according to openstack server list | 19:59 |
fungi | i'll check volume list | 19:59 |
fungi | volume list is empty... is that accurate? | 19:59 |
fungi | maybe we're not doing bfv there? | 20:00 |
fungi | i need to pop out to run an errand but will be back in a few and can reply to his message | 20:00 |
Clark[m] | We might not be doing bfv then | 20:02 |
Clark[m] | Sounds like it is nice and clean | 20:02 |
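[Editor's note: the leak check fungi describes above — looking for images, servers, or volumes with a non-current date — can be sketched as a small filter over `openstack ... list -f json` output. The `created_at` field name is an assumption; it varies slightly between resource types.]

```python
from datetime import datetime, timedelta, timezone

def find_stale(resources, max_age_hours=24, now=None):
    """Return entries whose created_at is older than max_age_hours.

    `resources` is a list of dicts shaped like the JSON output of
    `openstack image list -f json` or `openstack server list -f json`;
    the ISO-8601 "created_at" field is an assumption for illustration.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(hours=max_age_hours)
    stale = []
    for record in resources:
        created = datetime.fromisoformat(
            record["created_at"].replace("Z", "+00:00"))
        if created < cutoff:
            stale.append(record)
    return stale
```

Anything returned would be a candidate leaked resource worth eyeballing before deletion.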
opendevreview | Monty Taylor proposed opendev/system-config master: Produce both buster and bullseye container images https://review.opendev.org/c/opendev/system-config/+/806448 | 20:20 |
mordred | corvus, Clark, fungi : ^^ I *think* that's the next step? | 20:21 |
clarkb | mordred: looking; my only question is how does that affect existing users? it shouldn't do much because the existing names remain, they will just stop being updated? | 20:23 |
clarkb | oh actually we keep tagging buster with the old tags | 20:24 |
clarkb | which means they will get updated until buster eols or we switch the tag to bullseye | 20:24 |
clarkb | I do think we should aim to delete buster, but ya this looks like a good next step | 20:28 |
mordred | yeah - I figure we'll swap the latest tag to bullseye at some point - and then stop building busters - but no rush on that | 20:29 |
Ramereth | <- Lance from OSUOSL BTW in case you want to ping me here | 20:41 |
clarkb | Hello. As mentioned above it seems like things are pretty clean right now | 20:49 |
clarkb | I'll let fungi write up a proper response when he returns | 20:50 |
fungi | Ramereth: d'oh! yep, all good, thanks for checking in! | 21:00 |
fungi | i only see the expected images and server instances in openstack image and server list output | 21:00 |
fungi | i'll follow up by e-mail too | 21:00 |
Ramereth | \o/ | 21:02 |
fungi | and sent | 21:05 |
fungi | one thing we need to look into on our end is that nodepool thinks we have an unlocked node there in a deleting state from 128 days ago (it doesn't show up in openstack server list though, so i think it's stale info in our zookeeper or something) | 21:06 |
opendevreview | Clark Boylan proposed opendev/system-config master: Produce both buster and bullseye container images https://review.opendev.org/c/opendev/system-config/+/806448 | 22:51 |
clarkb | mordred: ^ a minor update that should fix the job failures. Turns out ARG is weird | 22:51 |
fungi | i prefer to use ARRRGH | 22:54 |
clarkb | aye matey | 23:00 |
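[Editor's note: on "ARG is weird" — in a Dockerfile, an ARG declared before the first FROM is only in scope for FROM instructions themselves, and must be redeclared inside a build stage to be usable there. A minimal sketch; the image and ARG names are illustrative, not the actual system-config change.]

```dockerfile
# ARG before FROM is visible only to FROM instructions.
ARG DEBIAN_RELEASE=bullseye
FROM debian:${DEBIAN_RELEASE}

# Inside the stage the variable is empty unless redeclared:
ARG DEBIAN_RELEASE
RUN echo "building against ${DEBIAN_RELEASE}"
```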
clarkb | next issue is I think we reverted too much? | 23:27 |
clarkb | ya we're doing eavesdrop and uwsgi as if they are on bullseye | 23:28 |
clarkb | I think we can do that in followups; more important is that we have the images to start, so I'll fix the bindep files | 23:28 |
opendevreview | Clark Boylan proposed opendev/system-config master: Produce both buster and bullseye container images https://review.opendev.org/c/opendev/system-config/+/806448 | 23:31 |
*** dviroel|ruck is now known as dviroel|out | 23:44 |
Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!