*** zigo_ is now known as zigo | 09:43 |
jrosser | morning | 11:53 |
noonedeadpunk | o/ | 12:28 |
opendevreview | Andrew Bonney proposed openstack/openstack-ansible master: WIP: [doc] Update distribution upgrades document for 2023.1/jammy https://review.opendev.org/c/openstack/openstack-ansible/+/906832 | 13:03 |
noonedeadpunk | andrewbonney: I've commented our solution to one of your TODO items fwiw | 13:13 |
andrewbonney | Ta. I've got a copy of your script, just haven't got to using it during this process yet | 13:13 |
noonedeadpunk | ah, ok, was not sure if I've ever shared this | 13:13 |
noonedeadpunk | also likely it should be polished a bit... but well | 13:14 |
jrosser | noonedeadpunk: do you know how nested virt is supposed to work in CI jobs? | 13:14 |
jrosser | i see that there are nested virt enabled flavors here https://opendev.org/openstack/project-config/src/branch/master/nodepool/nl03.opendev.org.yaml#L147-L188 | 13:14 |
noonedeadpunk | i somehow thought that nested virt was a requirement from the infra side before joining | 13:15 |
jrosser | but we always make `nova_virt_type: qemu` in user_variables for any zuul job | 13:15 |
jrosser | and this is terribly slow for the capi job, like 10x slower | 13:15 |
noonedeadpunk | qemu is very slow, yes | 13:16 |
noonedeadpunk | as generally tempest just spawns VMs with cirros to check connectivity and drops them | 13:16 |
jrosser | so i was wondering if we are somehow doing it wrong, and the nodeset choice is supposed to decide if it's nested virt or not | 13:16 |
noonedeadpunk | or well, maybe cases with octavia where LBs are spawned, but all cases have quite a small workload | 13:17 |
jrosser | we can detect this at runtime with `cat /sys/module/kvm_amd/parameters/nested` | 13:18 |
noonedeadpunk | so you mean that with nested virt we should be able to use kvm? | 13:18 |
jrosser | (or kvm_intel as needed) | 13:18 |
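A minimal shell sketch of the runtime check jrosser describes, assuming the standard kvm_intel/kvm_amd module parameter paths (newer kernels report 1/0, older Intel hosts Y/N):

```bash
# nested virt is enabled when the loaded kvm module reports it; only the
# file matching the host CPU vendor's module will exist
for mod in kvm_intel kvm_amd; do
    f="/sys/module/${mod}/parameters/nested"
    [ -r "$f" ] && echo "${mod}: $(cat "$f")"
done
```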
jrosser | yeah so in my cloud here i am running AIO for capi with nested virt enabled | 13:18 |
jrosser | and because it's not a zuul job it actually is using kvm accel | 13:18 |
noonedeadpunk | and the holds you saw all had it enabled? | 13:19 |
jrosser | and the magnum cluster creates in 4 mins | 13:19 |
jrosser | and in the CI hold i have, the nodes are nested virt enabled, but we set it to qemu in zuul user_variables | 13:19 |
noonedeadpunk | yeah, I see | 13:19 |
jrosser | and then it takes like 40mins+ to create the capi cluster | 13:19 |
noonedeadpunk | makes sense to use kvm when we can, sure | 13:20 |
jrosser | and the CPUs are just pegged at 100% on qemu processes | 13:20 |
noonedeadpunk | maybe this will get some boost for other jobs as well | 13:20 |
jrosser | yeah, and i guess if i can make it dynamic detection then if the provider supports it, it will use it | 13:21 |
noonedeadpunk | I feel like it should be >90% of cases... | 13:22 |
noonedeadpunk | Like testing kata would be impossible otherwise... | 13:23 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible master: Enable nested virtualisation in the AIO when it is available https://review.opendev.org/c/openstack/openstack-ansible/+/907327 | 14:11 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible-os_magnum master: Add job to test Vexxhost cluster API driver https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/905199 | 14:12 |
* noonedeadpunk wonders if execution time will drop significantly | 14:23 |
jrosser | i guess in a normal job this will only affect booting cirros, so perhaps not | 14:23 |
jrosser | but the capi job makes an amphora and two ubuntus... | 14:23 |
noonedeadpunk | yeah, true... | 14:23 |
noonedeadpunk | tempest run still takes 8mins... | 14:25 |
noonedeadpunk | So if it drops to 3-5 mins, that's noticeable... | 14:25 |
jrosser | that would be good | 14:25 |
noonedeadpunk | jrosser: seems it's not working out: https://zuul.opendev.org/t/openstack/build/bad24e6c50114083a56c7ae392f99dfe/log/logs/openstack/aio1-utility/tempest_run.log.txt | 15:17 |
noonedeadpunk | I know it's a distro job, but it fails very much like it would with such a change | 15:17 |
noonedeadpunk | with the VM crashing on startup | 15:18 |
jrosser | hmm maybe it needs to be opt-in for the job | 15:26 |
noonedeadpunk | and you have KVM not QEMU locally? | 15:27 |
noonedeadpunk | Like, you sure about that?:) | 15:27 |
noonedeadpunk | jrosser: as we are already trying to guess kvm/qemu here: https://opendev.org/openstack/openstack-ansible-os_nova/src/branch/master/tasks/nova_virt_detect.yml | 15:29 |
noonedeadpunk | so actually, we could just leave it undefined and rely on the nova role... | 15:30 |
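Roughly, the detection noonedeadpunk points at boils down to preferring kvm when the host exposes usable hardware acceleration; a loose shell rendering of that idea (the real logic lives in the nova_virt_detect.yml linked above, this is only a sketch):

```bash
# prefer hardware-accelerated kvm when /dev/kvm is present,
# otherwise fall back to plain qemu emulation
if [ -e /dev/kvm ]; then
    nova_virt_type=kvm
else
    nova_virt_type=qemu
fi
echo "detected nova_virt_type=${nova_virt_type}"
```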
jrosser | argh i see | 15:31 |
opendevreview | Jonathan Rosser proposed openstack/openstack-ansible master: Allow nova role to detect virtalisation type in CI jobs https://review.opendev.org/c/openstack/openstack-ansible/+/907327 | 15:36 |
jrosser | noonedeadpunk: in my local AIO vm i see `-accel kvm` on the /usr/bin/qemu-system-x86_64 process | 15:37 |
noonedeadpunk | mhm, I see | 15:37 |
noonedeadpunk | jrosser: fwiw, it's failing quite dramatically as well | 17:03 |
noonedeadpunk | but not for everything.... | 17:04 |
jrosser | ok so there is probably very good reason to have nested virt specific nodesets | 17:04 |
noonedeadpunk | I wonder what's wrong though, and why it's not caught by the nova role then | 17:05 |
jrosser | as it may just be br0k in certain providers | 17:05 |
noonedeadpunk | as the conditions there looked about right | 17:05 |
jrosser | well if you have unfortunate kernel troubles between host/guest doesn't this all go a bit bad? | 17:05 |
jrosser | afaik it's very sensitive to host side things | 17:05 |
jrosser | if you codesearch for the virt_type stuff it's just tons of "use qemu in CI" comments all over | 17:07 |
noonedeadpunk | yeah, that's probably the reason why it's there | 17:09 |
jrosser | anyway - the node type for the capi job is one where this is actually supposed to work | 17:12 |
jrosser | so we still need a way to opt in | 17:12 |
jrosser | OMG it works https://review.opendev.org/c/openstack/openstack-ansible-os_magnum/+/905199?tab=change-view-tab-header-zuul-results-summary | 17:44 |
jrosser | \o/ | 17:44 |
noonedeadpunk | wow | 17:56 |
noonedeadpunk | that is veeeery promising :) | 17:56 |
noonedeadpunk | only 2h:) | 17:57 |
noonedeadpunk | even faster than the upgrade job :D | 17:57 |
jrosser | they are fast nodes | 18:47 |
TheCompWiz | jrosser: I finally broke down and tried the AIO install, and I'm hitting yet another brick wall. The bootstrap_aio script dies when trying to "Download EPEL gpg keys" | 19:32 |
TheCompWiz | wait... why on earth would it be trying to install centos EPEL keys on ubuntu? | 19:38 |
noonedeadpunk | TheCompWiz: that is the correct question :D | 19:39 |
TheCompWiz | ansible facts say distribution is ubuntu... | 19:50 |
jrosser | can you share the output? | 19:50 |
TheCompWiz | https://paste.openstack.org/show/bPrmCUBAQKp3mRy0LpGb/ | 19:51 |
TheCompWiz | the ansible facts show this: "ansible_os_family": "Debian" | 19:51 |
TheCompWiz | and everything I see says the "Install EPEL" block shouldn't even be run... | 19:52 |
TheCompWiz | brb 1 sec... switching PCs. | 19:54 |
*** TheCompWiz is now known as TCW | 19:55 | |
TheCompWiz | back | 19:56 |
jrosser | it's not to do with debian or not | 19:58 |
jrosser | the error says `error while evaluating conditional ('s3fs' in systemd_mount_types)` | 19:58 |
jrosser | seeing the output of tasks to do with setting up the AIO storage is going to be the only way to understand this | 20:11 |
noonedeadpunk | TheCompWiz: to be more specific `'_bootstrap_host_data_disk_device' is undefined.` | 20:36 |
TheCompWiz | which is odd... because I set export BOOTSTRAP_OPTS="bootstrap_host_data_disk_device=sdb bootstrap_host_data_disk_fs_type=xfs bootstrap_host_public_interface=ens34" | 20:37 |
TheCompWiz | before running bootstrap. | 20:37 |
TheCompWiz | and yes, sdb does exist. | 20:37 |
TheCompWiz | and was partitioned & formatted. | 20:37 |
noonedeadpunk | aha, ok, that's important input | 20:39 |
dmsimard[m] | noonedeadpunk: are you going to fosdem after all? :P | 20:39 |
noonedeadpunk | dmsimard[m]: nah :( Like I even got time and funding, but task workload is just /o\ | 20:39 |
noonedeadpunk | so next time I guess | 20:40 |
dmsimard[m] | I know the feeling, there'll be a next time, no stress | 20:40 |
noonedeadpunk | TheCompWiz: that actually should have worked.... | 20:47 |
noonedeadpunk | I'd need to try to reproduce that, but only tomorrow :( | 20:48 |
noonedeadpunk | though then it should have failed waaaay earlier I guess | 20:49 |
noonedeadpunk | like here: https://opendev.org/openstack/openstack-ansible/src/branch/master/tests/roles/bootstrap-host/tasks/check-requirements.yml#L135-L144 | 20:50 |
noonedeadpunk | TheCompWiz: oh.... what if you try to drop the partition table? | 20:53 |
noonedeadpunk | that looks like a bug | 20:53 |
noonedeadpunk | so we define _bootstrap_host_data_disk_device only when no partition table exists | 20:54 |
noonedeadpunk | https://opendev.org/openstack/openstack-ansible/src/branch/master/tests/roles/bootstrap-host/tasks/prepare_data_disk.yml#L36-L38 | 20:54 |
noonedeadpunk | But the next task assumes it's defined and uses another condition | 20:55 |
noonedeadpunk | which I think is exactly what you see | 20:55 |
noonedeadpunk | so the solution - add the disk, set everything the same, but don't bother yourself with partition creation on the drive :D | 20:55 |
noonedeadpunk | TheCompWiz: ^ | 20:55 |
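A sketch of the workaround noonedeadpunk suggests, using the device name from TheCompWiz's BOOTSTRAP_OPTS above (destructive: it erases all signatures on /dev/sdb, and both commands need root):

```bash
# show whether a partition table is present ("dos" or "gpt" when it is)
blkid -o value -s PTTYPE /dev/sdb
# wipe all filesystem/partition-table signatures so the bootstrap role
# sees a blank disk and defines _bootstrap_host_data_disk_device itself
wipefs --all /dev/sdb
```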
TheCompWiz | noonedeadpunk: wiped partitions, and re-running bootstrap-aio. I'll paste the results. | 21:22 |
TheCompWiz | hmmm... that seems to have worked. ... not sure how I ended up in that situation. | 21:24 |
spatel | How to force detach volume? | 21:54 |
spatel | I have one VM that has a single volume attached twice | 21:54 |
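spatel's question goes unanswered in the log; one commonly used approach (an assumption here, not advice from the channel) is to detach through the normal API and, if the attachment is stuck, reset the volume's attach status via the cinder admin CLI:

```bash
# normal detach first; for a volume showing two attachments this may
# need to be repeated per attachment
openstack server remove volume <server-uuid> <volume-uuid>
# if the attachment is stuck, reset the state in the cinder DB
# (admin-only; this does not clean up the hypervisor side)
cinder reset-state --attach-status detached <volume-uuid>
```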