opendevreview | Takashi Kajinami proposed openstack/os-vif master: Fix missing or unnecessary dependencies https://review.opendev.org/c/openstack/os-vif/+/909341 | 01:40 |
opendevreview | Takashi Kajinami proposed openstack/os-vif master: Fix missing or unnecessary dependencies https://review.opendev.org/c/openstack/os-vif/+/909341 | 01:41 |
*** mklejn_ is now known as mklejn | 08:36 | |
*** tkajinam is now known as Guest213 | 08:56 | |
opendevreview | Merged openstack/nova master: HyperV: Remove RDP console connection information API https://review.opendev.org/c/openstack/nova/+/906991 | 09:28 |
opendevreview | Merged openstack/nova-specs master: Create specs directory for 2024.2 Dalmatian https://review.opendev.org/c/openstack/nova-specs/+/906073 | 11:03 |
opendevreview | Stephen Finucane proposed openstack/nova-specs master: Add openapi spec https://review.opendev.org/c/openstack/nova-specs/+/909448 | 13:18 |
opendevreview | Takashi Kajinami proposed openstack/nova-specs master: libvirt: Stateless firmware support https://review.opendev.org/c/openstack/nova-specs/+/908297 | 13:23 |
opendevreview | Takashi Kajinami proposed openstack/nova-specs master: libvirt: AMD SEV-ES support https://review.opendev.org/c/openstack/nova-specs/+/907702 | 13:24 |
bauzas | dansmith: sean-k-mooney: when you have a minute, I'd like to discuss about the mdev live-mig series https://review.opendev.org/c/openstack/nova/+/904209/16 | 14:37 |
bauzas | (about the periodic call) | 14:38 |
sean-k-mooney | sure | 14:41 |
sean-k-mooney | we will be hitting meetings shortly but im free for a while | 14:41 |
bauzas | awaiting dan when he can | 14:42 |
bauzas | my concern is about the API we could use for calling the driver | 14:42 |
dansmith | bauzas: I'm just getting started in my morning so I'm doing lots of stuff, sorry | 14:43 |
bauzas | no worries then | 14:43 |
bauzas | I'll try to provide a WIP then | 14:43 |
sean-k-mooney | bauzas: just do this in update_available_resource | 14:44 |
sean-k-mooney | that's already being called and it's very closely related | 14:45 |
bauzas | sean-k-mooney: sure, but given we don't call the driver from it, I need to create a new driver method :( | 14:45 |
sean-k-mooney | we do | 14:45 |
sean-k-mooney | update_available_resource calls the driver | 14:45 |
sean-k-mooney | that's why I'm suggesting it | 14:45 |
bauzas | https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L9582 | 14:46 |
sean-k-mooney | I believe that is called from https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L10587 | 14:47 |
sean-k-mooney | yep here https://github.com/openstack/nova/blob/master/nova/compute/resource_tracker.py#L911 | 14:48 |
sean-k-mooney | the resource tracker calls the driver | 14:48 |
sean-k-mooney | so you can just add it to get_available_resource | 14:48 |
sean-k-mooney | in the driver | 14:48 |
sean-k-mooney | which is what you linked to | 14:49 |
sean-k-mooney | the alternative, which I assume you wanted to discuss, | 14:50 |
sean-k-mooney | was adding a new virt driver api function | 14:50 |
sean-k-mooney | and then calling that directly | 14:50 |
sean-k-mooney | alternatively you could decorate a method in the libvirt driver as a periodic | 14:50 |
sean-k-mooney | and have it run separately from those run by the compute manager | 14:51 |
sean-k-mooney | both are workable but the minimal change is to just make get_available_resource call a new function to reconcile the dict. | 14:52 |
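A minimal sketch of that option, assuming a hypothetical `_reconcile_mdev_reservations()` helper and an illustrative `claimed_mdevs` dict (not the actual attribute names in the series): since the resource tracker's `update_available_resource` periodic already calls the driver's `get_available_resource()`, the reconciliation can piggyback on that call.

```python
# Sketch only, not actual Nova code: names here are assumptions.
class SketchLibvirtDriver:
    def __init__(self):
        # Hypothetical in-memory map of instance uuid -> reserved mdevs.
        self.claimed_mdevs = {}

    def get_available_resource(self, nodename):
        resources = {'hypervisor_hostname': nodename}
        # ... existing inventory collection would happen here ...
        # Piggyback on the periodic call the resource tracker already makes.
        self._reconcile_mdev_reservations()
        return resources

    def _reconcile_mdev_reservations(self):
        """Drop reservations that no longer map to anything we still track."""
        for uuid in list(self.claimed_mdevs):
            if self._is_stale(uuid):
                self.claimed_mdevs.pop(uuid, None)

    def _is_stale(self, uuid):
        # Placeholder predicate; which check is right (allocations vs
        # in-progress migrations) is exactly what the discussion below covers.
        return False
```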
sean-k-mooney | bauzas: looking at the code I wonder if there is a better approach | 14:53 |
sean-k-mooney | bauzas: have you considered doing it in update_provider_tree? | 14:53 |
sean-k-mooney | https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L8869C9-L8869C29 | 14:54 |
sean-k-mooney | that has the advantage of being passed in the allocations | 14:54 |
sean-k-mooney | so if you see there is no allocation against the provider but there is a reservation in the dict for an instance | 14:54 |
sean-k-mooney | that would be a stale reservation correct? | 14:55 |
sean-k-mooney | hum perhaps not | 14:55 |
sean-k-mooney | the allocation may be owned by the migration uuid | 14:56 |
sean-k-mooney | if you were to do it in _update_provider_tree I would do it in the existing _update_provider_tree_for_vgpu function where you do reshapes | 14:56 |
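A rough sketch of the allocation-based variant sean-k-mooney describes, assuming `allocations` is the consumer-keyed dict that `update_provider_tree()` can be passed (it may also be None) and `claimed_mdevs` is the same illustrative reservation dict as above:

```python
# Sketch only: prune reservations whose consumer has no allocation here.
def prune_reservations_without_allocations(claimed_mdevs, allocations):
    if allocations is None:
        # update_provider_tree() is not always given allocations; do nothing
        # rather than wrongly dropping reservations.
        return claimed_mdevs
    return {uuid: mdevs for uuid, mdevs in claimed_mdevs.items()
            if uuid in allocations}


# Example: one instance still has an allocation, the other does not.
claimed = {'inst-1': ['mdev-a'], 'inst-2': ['mdev-b']}
allocs = {'inst-1': {'allocations': {'rp-uuid': {'resources': {'VGPU': 1}}}}}
print(prune_reservations_without_allocations(claimed, allocs))
# -> {'inst-1': ['mdev-a']}
```

As the rest of the thread notes, the open question for this variant is whether the destination allocation is held by the instance uuid or the migration uuid.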
bauzas | sean-k-mooney: sorry got a phonecall meanwhile | 14:56 |
sean-k-mooney | no worries | 14:57 |
sean-k-mooney | readback and we can chat later | 14:57 |
* bauzas scrolling back | 14:57 | |
bauzas | sean-k-mooney: okay I see your point | 14:58 |
bauzas | so, we would need to inject the db to get the migrations in _update_provider_tree_for_vgpu() | 14:58 |
bauzas | sean-k-mooney: would you be okay ? | 14:59 |
bauzas | I mean calling objects.MigrationList.get_in_progress_and_error() in _update_provider_tree_for_vgpu() | 15:00 |
dansmith | I've lost track of what we're trying to solve here | 15:01 |
dansmith | I thought the periodic was to clean up reservations that are now stale for migrations that never happened? | 15:02 |
dansmith | in the driver memory-resident dict | 15:02 |
sean-k-mooney | yes | 15:03 |
sean-k-mooney | so i think we can do that in _update_provider_tree_for_vgpu | 15:03 |
dansmith | so what injecting into the DB needs to happen? | 15:03 |
sean-k-mooney | nothing has to happen in the db | 15:03 |
dansmith | ack, okay, I'm confused by bauzas' statement then | 15:04 |
sean-k-mooney | dansmith: is the destination allocation held by the instance uuid or the migration uuid? | 15:04 |
sean-k-mooney | i think the instance_uuid? | 15:04 |
dansmith | instance IIRC yeah | 15:04 |
sean-k-mooney | ack, assuming that is correct, all _update_provider_tree_for_vgpu needs to do | 15:04 |
sean-k-mooney | is loop over the in memory dict | 15:05 |
dansmith | maybe bauzas is talking about _querying_ for canceled migration objects to know what to clean from the dict? | 15:05 |
sean-k-mooney | and remove any key that does not have an allocation | 15:05 |
bauzas | dansmith: btw. I added a new check by https://review.opendev.org/c/openstack/nova/+/904209/16/nova/virt/libvirt/driver.py#1740 | 15:05 |
dansmith | sean-k-mooney: you can't do that I think because you might be racing with the start of the migration? | 15:05 |
bauzas | dansmith: my question was about where we could verify the dict | 15:05 |
dansmith | sean-k-mooney: or are you thinking the instance already has an allocation for that host, until it's canceled and we can go on the presence of that allocation? | 15:06 |
sean-k-mooney | we should have claimed the placement allocation before calling migrate, correct? | 15:06 |
bauzas | and I just don't want to create a new virt driver API method just for this | 15:06 |
dansmith | right, okay, yeah that makes sense | 15:06 |
dansmith | I'm just wondering what bauzas meant by "so, we would need to inject the db to get the migrations in _update_provider_tree_for_vgpu() " | 15:06 |
bauzas | sean-k-mooney: I wouldn't check the allocations, I would just verify that all the dict keys have a current migration | 15:07 |
sean-k-mooney | well that requires additional db lookups | 15:07 |
sean-k-mooney | which can only be done in the compute manager or resource tracker | 15:07 |
sean-k-mooney | not the driver | 15:07 |
bauzas | dansmith: I'm saying that I want to get the current migrations | 15:07 |
dansmith | bauzas: but "has an allocation for gpu on this host" is good enough right? | 15:07 |
sean-k-mooney | so that's why I'm trying to use the allocations instead | 15:07 |
dansmith | it also means the reservation mirrors the allocation in placement, which seems right | 15:08 |
bauzas | dansmith: no, because you could have an allocation for an instance that isn't because of a migration | 15:08 |
bauzas | if we want to check the dict, we need to look at the *migrations* | 15:08 |
dansmith | bauzas: okay, but that would be for an instance that migrated in successfully in which case it doesn't hurt right? | 15:08 |
bauzas | dansmith: that's what you asked me | 15:08 |
bauzas | dansmith: no, it would only be for current migrations | 15:09 |
bauzas | once the migration is done, then the dict key is removed | 15:09 |
bauzas | (or aborted) | 15:09 |
dansmith | bauzas: actually, what I asked is for it not to get out of sync silently so that things stop working and a restart is the only solution :) | 15:09 |
bauzas | so we need to compare | 15:09 |
sean-k-mooney | bauzas: no, what dansmith is saying is if we have an entry in the dict and the instance has an allocation against the host and there is no migration | 15:09 |
dansmith | bauzas: if that happens, then we're fine, this is just for cleanup if the migration does *not* get rolled back successfully right? | 15:09 |
sean-k-mooney | then it can only have happened because the migration succeeded | 15:09 |
dansmith | sean-k-mooney: well, not exactly | 15:10 |
sean-k-mooney | and we failed to remove it from the dict due to some bug | 15:10 |
bauzas | dansmith: that's what I'm saying, we need to compare migrations with the dict | 15:10 |
bauzas | if a dict key is not related to a current migration, then it's stale | 15:10 |
dansmith | sean-k-mooney: what I'm saying is that if there's a reservation in the dict and there's an allocation, then it's either still in progress, has yet to be rolled back, or it succeeded in which case keeping the reservation in the dict doesn't prevent other stuff from working | 15:10 |
dansmith | once the instance is deleted or migrated away, healing based on the allocation would work itself out too | 15:11 |
sean-k-mooney | dansmith: ya i agree with that | 15:11 |
sean-k-mooney | and yes to the healing too | 15:11 |
sean-k-mooney | also bauzas if we ever use consumer_types | 15:11 |
dansmith | bauzas: a dict key for a completed migration where the instance is on our host is not incorrect, even if it is stale, and won't harm anything yeah? | 15:11 |
sean-k-mooney | you can actually tell if the allocation is a migration if you really want to | 15:11 |
sean-k-mooney | so if we start using consumer types for that, and we do this based on the allocation, you will be able to filter in the future if you want to | 15:12 |
bauzas | dansmith: I wouldn't be super happy, as we should have removed it by https://review.opendev.org/c/openstack/nova/+/904209/16/nova/virt/libvirt/driver.py#11355 | 15:12 |
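For contrast, the migration-based check bauzas has in mind would look roughly like the sketch below. It assumes the existing `objects.MigrationList.get_in_progress_and_error()` query with a `(context, host, node)` signature, which is why, as noted above, it would have to run somewhere with DB access (the resource tracker or compute manager) rather than in the driver; the dict name is illustrative.

```python
# Sketch only: keep reservations only for instances with an in-flight
# migration on this host; everything else in the dict is considered stale.
from nova import objects


def prune_stale_reservations(context, host, nodename, claimed_mdevs):
    migrations = objects.MigrationList.get_in_progress_and_error(
        context, host, nodename)
    migrating_instances = {m.instance_uuid for m in migrations}
    return {uuid: mdevs for uuid, mdevs in claimed_mdevs.items()
            if uuid in migrating_instances}
```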
dansmith | can we gmeet? | 15:12 |
dansmith | I feel like we're going in circles here | 15:12 |
bauzas | sure, I can | 15:12 |
bauzas | sec | 15:12 |
bauzas | https://meet.google.com/cyg-ucsv-xri | 15:13 |
dansmith | sean-k-mooney: are you able to join? | 15:14 |
opendevreview | Amit Uniyal proposed openstack/nova master: enforce remote console shutdown https://review.opendev.org/c/openstack/nova/+/901824 | 15:14 |
opendevreview | Amit Uniyal proposed openstack/nova master: enforce remote console shutdown https://review.opendev.org/c/openstack/nova/+/901824 | 15:17 |
sean-k-mooney | dansmith: sorry no, I was in/am in another call | 15:28 |
sean-k-mooney | dansmith: I can join quickly now since this call is just ending | 15:29 |
opendevreview | Mateusz Klejn proposed openstack/nova master: Fixed: unit tests for force delete instance https://review.opendev.org/c/openstack/nova/+/909464 | 16:01 |
opendevreview | Mateusz Klejn proposed openstack/nova master: Changed: adjust shelving for shelved image protection https://review.opendev.org/c/openstack/nova/+/909465 | 16:01 |
mnaser | Good morning! https://review.opendev.org/c/openstack/nova/+/909228 + https://review.opendev.org/c/openstack/nova/+/909229 are ready for review today :) | 16:04 |
melwitt | when I've looked into nova-lvm job failures, what I have seen is that they fail due to the glance apache server responding to nova with an HTTP 502 bad gateway, meanwhile with no errors in the glance logs https://etherpad.opendev.org/p/nova-ci-failures-minimal#L11 | 16:43 |
melwitt | not really sure what to do next to debug | 16:44 |
sean-k-mooney | melwitt: this may be related to glance-api being OOM killed | 16:45 |
sean-k-mooney | melwitt: artom was debugging a failure where that happened, so it would be good to confirm that is not what's happening for the nova-lvm case | 16:45 |
artom | "debugging" is a strong word | 16:46 |
artom | I chased a red herring only to figure out that glanceclient does chunking for us | 16:46 |
melwitt | hm ok. I didn't think there was OOM killer in these but let me check | 16:46 |
artom | So I still have no idea why glance-api gets oom-killed | 16:46 |
melwitt | oh, yeah there is some OOM kill in here | 16:47 |
sean-k-mooney | melwitt: there may not be but if there is no error in the glance api log i would first check for an OOM event | 16:47 |
sean-k-mooney | so we chatted about turning on zswap on nova-next | 16:47 |
sean-k-mooney | but since we are having OOM events in nova-lvm i might turn it on there too | 16:47 |
sean-k-mooney | or increase the swap there to 8G if its still at 4G | 16:48 |
melwitt | yeah, g-api. hm | 16:48 |
melwitt | Feb 15 13:59:43 np0036770057 kernel: oom-kill:constraint=CONSTRAINT_NONE,nodemask=(null),cpuset=user.slice,mems_allowed=0,global_oom,task_memcg=/system.slice/system-devstack.slice/devstack@g-api.service,task=uwsgi,pid=86510,uid=1001 | 16:49 |
sean-k-mooney | melwitt: so it's running with 4G of swap today | 16:51 |
dansmith | also, g-api might not be actually consuming lots of memory, it might just be that the OOM killer decided it was the best thing to kill to create some space | 16:51 |
dansmith | so look at the oom output to see if it was huge or not | 16:51 |
opendevreview | sean mooney proposed openstack/nova master: bump nova-lvm to use 8G of swap from 4G https://review.opendev.org/c/openstack/nova/+/909469 | 16:55 |
melwitt | Feb 15 13:59:43 np0036770057 kernel: Out of memory: Killed process 86510 (uwsgi) total-vm:2290700kB, anon-rss:1733784kB, file-rss:6324kB, shmem-rss:32kB, UID:1001 pgtables:3732kB oom_score_adj:0 4707 so I guess 2.2 GB? not sure if that's typical | 16:56 |
sean-k-mooney | melwitt: dansmith ^ that's the swap bump and I'll look at adding zswap to that and nova-next in a separate patch later | 16:56 |
sean-k-mooney | melwitt: so it should be maybe 150-200mb at most normally | 16:57 |
dansmith | makes sense to try it on nova-lvm too so that seems okay | 16:57 |
sean-k-mooney | that looks like it's holding a guest image in memory | 16:57 |
sean-k-mooney | that, or we need to add an explicit python garbage collector call in the image upload path | 16:58 |
melwitt | Feb 15 13:59:43 np0036770057 kernel: Free swap = 0kB | 16:58 |
dansmith | the image upload path doesn't store the image | 16:58 |
sean-k-mooney | i.e. to eagerly run the garbage collector after streaming the image | 16:58 |
dansmith | it's possible that python keeps allocating new buffers for each thing we read and needs a break to do garbage collection, that's certainly possible | 16:59 |
sean-k-mooney | I believe we have had issues with that in either glance or cinder before | 17:00 |
sean-k-mooney | maybe I'm thinking of swift | 17:01 |
sean-k-mooney | https://codesearch.opendev.org/?q=gc.collect&i=nope&literal=nope&files=&excludeFiles=&repos= | 17:01 |
sean-k-mooney | there are a few projects that have an explicit gc.collect() call | 17:02 |
sean-k-mooney | hum although mainly in test code | 17:03 |
dansmith | there was also something about python never deflating the heap, which I think gorka can tell us about | 17:03 |
sean-k-mooney | yeah, that's what I was recalling: speaking to gorka about the agents growing their memory usage over time | 17:03 |
dansmith | but I would think that wouldn't be our problem here since we should be good to re-use those portions of the heap over and over, especially with allocations of exactly the same size each time | 17:03 |
sean-k-mooney | in any case, bumping to 8G of swap will mitigate the issue for now | 17:04 |
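If the eager-GC idea were pursued, it would amount to something like this sketch (illustrative only; the helper name and call site are made up, not Glance or Nova code):

```python
import gc


def stream_image(chunks, write_chunk):
    """Write image chunks to a backend, then eagerly collect garbage."""
    try:
        for chunk in chunks:
            write_chunk(chunk)
    finally:
        # Hint to CPython to reclaim the per-chunk buffers now instead of
        # waiting for the next automatic collection cycle.
        gc.collect()
```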
opendevreview | Jay Faulkner proposed openstack/nova-specs master: Sharding: More detailed test plan https://review.opendev.org/c/openstack/nova-specs/+/908673 | 17:05 |
JayF | https://review.opendev.org/c/openstack/nova-specs/+/908673 Heads up, I'm going to be implementing the plan laid out here this week; please make loud noises quickly if it's unacceptable :D | 17:05 |
sean-k-mooney | ack for 2024.2 or are you still hoping to merge the shard stuff for 2024.1 | 17:07 |
sean-k-mooney | i guess for 2024.1 | 17:08 |
sean-k-mooney | if you can get the testing working in time before FF i guess that is fine | 17:08 |
sean-k-mooney | for me the minimum I think we need is a 2 node job with two sets of vms, with 2 ironic nova-compute services partitioning the vms into shards within the same conductor group | 17:09 |
sean-k-mooney | JayF: having tempest tests that actually assert the topology is probably not in the scope of tempest. it would be in the scope of whitebox or could be done via a different tempest plugin | 17:11 |
gibi | Uggla: i responded in the manila series | 17:12 |
JayF | sean-k-mooney: It was going to be in ironic-tempest-plugin | 17:12 |
sean-k-mooney | ack, that's probably ok then | 17:12 |
JayF | Two node job is not something we do in Ironic really | 17:12 |
JayF | And we are working to slim down our job set; try to get more cases in less VMs | 17:12 |
sean-k-mooney | JayF: can you deploy 2 nova-compute services on a single node? | 17:13 |
JayF | so adding more extensive jobs works against that -- but if we validate the resource pool only contains sharded nodes, we're golden (and already have code to do this in devstack that I can lift) | 17:13 |
JayF | sean-k-mooney: We have zero tests where we do >1 configured nova-compute in Ironic. | 17:13 |
JayF | sean-k-mooney: and the devstack machinery does not support it | 17:13 |
sean-k-mooney | right so that is why i was expecting to have a 2 node job | 17:14 |
sean-k-mooney | each running a single nova-compute with ironic | 17:14 |
sean-k-mooney | but ok, we can get a similar effect with what you're suggesting | 17:14 |
sean-k-mooney | it just sounds like much more work | 17:14 |
JayF | No, much much less work. I've already looked into this with Julia and we have a good amount of confidence in this approach. | 17:15 |
sean-k-mooney | well, not necessarily much more, just more than I would have expected | 17:15 |
* JayF famous last words | 17:15 | |
sean-k-mooney | ack, did you have time to respin the patches to nova for the pep8 error? | 17:15 |
JayF | I just don't want to do it and someone come along and say "not good enough" :D | 17:15 |
JayF | yeah, I just did | 17:15 |
sean-k-mooney | cool | 17:15 |
JayF | right before talking in here because I went to link it and saw the V-1 :D | 17:16 |
dansmith | I guess I'm confused about how we can know the partitioning works with only one node? | 17:17 |
sean-k-mooney | ok, I put the spec update on the same gerrit topic as the code | 17:17 |
sean-k-mooney | "ironic-shards" | 17:17 |
sean-k-mooney | dansmith: the job will create more than 1 ironic "vm" and only tag a subset of them with the shard key | 17:17 |
sean-k-mooney | dansmith: so the job will need to assert that not all the nodes in the ironic node list | 17:18 |
sean-k-mooney | show up in the RPs | 17:18 |
JayF | Yeah, we'll end up with e.g. 10 nodes configured (fake drivers, no backend), and only 10-N of them configured with a shard | 17:18 |
sean-k-mooney | and that the ones that do show up match the expected shard key | 17:18 |
JayF | once the good nodes show up, we start a timer and look for any bad nodes | 17:18 |
dansmith | so just making sure that the one nova-ironic we have doesn't pick up the other fake resources outside the configured shard? | 17:18 |
JayF | exactly | 17:18 |
dansmith | okay the spec doesn't say ten.. so to be clear, it will be a bunch of fake ironic resources and not just .. one real and one fake, right? | 17:19 |
dansmith | because the existing hash ring thing will get 1/2 right half the time | 17:19 |
JayF | with fake nodes, the number can be as many as we want | 17:19 |
sean-k-mooney | which works and covers the gap we had. it's just not how I would have approached it | 17:19 |
JayF | so if we need a large number of "N" to have confidence, it's trivial to do it | 17:19 |
JayF | where large is like, small two digit number | 17:20 |
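A rough sketch of the assertion such a job would make, purely illustrative (the real check would live in the ironic-tempest-plugin and use its own clients): with a set of fake nodes and only a subset tagged with the shard key, the single nova-compute should expose exactly the tagged nodes as resource providers.

```python
# Illustrative only: the node names and helper are made up for the example.
def assert_shard_partitioning(all_nodes, sharded_nodes, resource_providers):
    sharded = set(sharded_nodes)
    providers = set(resource_providers)
    assert providers == sharded, (
        "compute picked up nodes outside its shard: %s"
        % sorted(providers - sharded))
    assert set(all_nodes) - sharded, (
        "test is vacuous: every node carries the shard key")


# Example: 4 fake nodes, 2 tagged with the shard key, 2 untagged.
assert_shard_partitioning(
    all_nodes=['n1', 'n2', 'n3', 'n4'],
    sharded_nodes=['n1', 'n2'],
    resource_providers=['n1', 'n2'])
```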
sean-k-mooney | JayF: we don't, but it would have been nice to have more than one nova-compute in the same ironic conductor group with different shard keys | 17:20 |
dansmith | okay I was hoping for better, tbh, but I guess it's better than nothing | 17:20 |
dansmith | right exactly | 17:20 |
JayF | sean-k-mooney: doing that same test for conductor groups is a follow on | 17:20 |
JayF | oh, not exactly what you said | 17:20 |
JayF | but ensuring we have nodes in good/bad conductor groups and testing there | 17:20 |
JayF | similarly to how we are planning for shards (so then we'll have CI coverage for conductor group based paritioning, too, which AIUI doesn't exist at all today either) | 17:21 |
sean-k-mooney | so from a nova point of view we want to ensure we can have multiple nova compute services with different shard keys in the same cloud | 17:21 |
sean-k-mooney | and for each to manage their own slice | 17:21 |
sean-k-mooney | we kind of get that by what you propose | 17:22 |
sean-k-mooney | it just would have been nicer to actually test that directly | 17:22 |
JayF | We are validating up to the virt driver level, but not up to the compute level | 17:22 |
JayF | and you'd rather us go up a little bit further, which makes sense | 17:22 |
JayF | This as an MVP is good; but I will keep an eye out for opportunities to test >1 nova compute -- this is likely something to plan and do as a separate project b/c I suspect it'll be a massive change to the ironic devstack plugin to do so | 17:23 |
JayF | TheJulia: ^ WDYT; tl;dr how hard would it be to do an Ironic test job with >1 nova-compute pulling from the same Ironic? | 17:23 |
sean-k-mooney | i dont think it would need much work at all | 17:23 |
JayF | Maybe I'll reword it to be more personal: that is closer to the approach I was taking last year that kept me running into walls; if it's possible/easy I don't see that path myself | 17:24 |
sean-k-mooney | JayF: the way multi node plugins are meant to work is running the same plugin once per node, just with bits enabled/disabled as needed | 17:24 |
sean-k-mooney | JayF: so to add a second nova-compute with ironic we would just turn off all the extra bits the plugin does for the ironic api | 17:24 |
JayF | I think you're ignoring that we'd have to have the second node talk to Ironic on the first node, which I think would require some network magic | 17:25 |
sean-k-mooney | and only deploy nova-compute with the ironic driver with a different shard key configured | 17:25 |
sean-k-mooney | JayF: yes but we have that for all the other multi node jobs | 17:25 |
sean-k-mooney | anyway when you have a job up for review | 17:26 |
sean-k-mooney | we can look at making a multi node variant later | 17:26 |
JayF | I am not sure Ironic community at large will be thrilled to add multinode to our gate complexities, but timing-wise that's probably PTG topic anyway. | 17:26 |
sean-k-mooney | if we are not creating any additional fake nodes then we can likely do it with just a config override to change the virt driver to ironic and populate the nova.conf with the ironic details | 17:27 |
dansmith | ironic has no multinode jobs? | 17:27 |
TheJulia | We have one, adding further complexity to it is not ideal | 17:27 |
Uggla | gibi, thx, I will have a look. | 17:27 |
dansmith | eesh | 17:28 |
JayF | oh, we do > ironic-tempest-ipa-wholedisk-direct-tinyipa-multinode https://zuul.opendev.org/t/openstack/build/7a8be5f2987745f390f6078e8ac87d82 : SUCCESS in 1h 14m 03s (non-voting) | 17:28 |
JayF | non-voting | 17:28 |
TheJulia | JayF: to the original ask, that is how the job operates now | 17:28 |
JayF | Ack, so as a second step, I can look into making that job also test sharding | 17:28 |
TheJulia | I think it just would if set, fwiw | 17:28 |
dansmith | I guess I'm not sure what configuration is more deserving of a multinode layout than the configuration that arranges for more-than-trivial nodes | 17:29 |
JayF | I honestly don't know what that job tests; it'll be on my plate to look into it | 17:29 |
JayF | if I can just flip sharding on in that job and use it to test, that'd be even simpler (although I still think there's value in doing the "check resources" type job, because it'll fail more cleanly if we break things) | 17:30 |
sean-k-mooney | there is some logic in devstack to run multiple fake computes on a single node | 17:31 |
sean-k-mooney | so we could (next cycle) explore doing that for ironic. but honestly I think all that is needed is to enable the compute service on a second node, set the virt driver to ironic, and set the other config options | 17:31 |
TheJulia | JayF: multiple ironic-conductors and nova-computes if I'm recalling correctly | 17:31 |
TheJulia | It has just been a while since I've done any work on it | 17:32 |
JayF | oooh, so is that the job I didn't know existed that does coverage for conductor_groups | 17:32 |
JayF | that would make a lot of sense, actually | 17:32 |
sean-k-mooney | that's one of the two jobs we have in our experimental pipeline | 17:33 |
sean-k-mooney | https://github.com/openstack/nova/blob/master/.zuul.yaml#L981-L990 | 17:33 |
sean-k-mooney | we have ironic-tempest-bfv and ironic-tempest-ipa-wholedisk-direct-tinyipa-multinode | 17:34 |
JayF | I think that is the whole mystery laid out there then | 17:35 |
JayF | looks like I have a template job, and need to just shardify it | 17:35 |
sean-k-mooney | yeah, so this is how the second node is configured https://github.com/openstack/ironic/blob/master/zuul.d/ironic-jobs.yaml#L692-L733 | 17:35 |
JayF | those OVS config bits were the parts I was worried about figuring out, honestly | 17:36 |
sean-k-mooney | well thats common with the libvirt jobs | 17:37 |
sean-k-mooney | for the most part anyway | 17:37 |
JayF | My Ironic knowledge is concentrated closer to the endpoint :) I'm working to spread it up the stack more and more over time but networking is a big blind spot | 17:37 |
sean-k-mooney | JayF: what I would suggest is to focus on the testing you had planned before | 17:39 |
sean-k-mooney | if we can separately add the config setting to tag the ironic nodes in that other job with a shard key | 17:39 |
sean-k-mooney | and configure the relevant nova-computes on each host to look at only the shard key for the vms on the relevant host | 17:39 |
sean-k-mooney | that will cover us for the multi node job | 17:40 |
JayF | Well, except for the "you should do the other one first", that's what I had in mind | 17:40 |
sean-k-mooney | well if you want to do the multi node one first that's fine | 17:41 |
JayF | I'll take your advice and still keep on the path I added to the spec -- even though you did an ace job of making the multinode seem easier :D | 17:41 |
sean-k-mooney | I was just hoping you could provide an example of configuring it | 17:41 |
sean-k-mooney | then we could help port it to the multi node one | 17:41 |
opendevreview | Fabian Wiesel proposed openstack/nova master: Do not close returned image-chunk iterator https://review.opendev.org/c/openstack/nova/+/909474 | 17:41 |
sean-k-mooney | JayF: you can set the shard key in the nova.conf trivially like this https://github.com/openstack/nova/blob/master/.zuul.yaml#L459-L485 | 17:43 |
sean-k-mooney | I'm not sure how you go about tagging the relevant fake ironic nodes with a shard key | 17:43 |
sean-k-mooney | if you really needed to hack things in quickly I would use a local.sh file created by a pre playbook | 17:44 |
sean-k-mooney | like this https://review.opendev.org/c/openstack/nova/+/679656/13/playbooks/nfv/nfv.yaml | 17:45 |
sean-k-mooney | but I assume you have a nicer way to do that in the ironic plugin | 17:45 |
sean-k-mooney | ideally via a local.conf parameter | 17:45 |
sean-k-mooney | JayF: if you're not familiar with /opt/stack/devstack/local.sh: if it exists it runs right before devstack finishes, so if a job creates one it will be run after devstack has otherwise run fully but before any tests are run | 17:46 |
sean-k-mooney | so it's a way to hack in functionality quickly via a zuul job that should eventually be in a devstack plugin | 17:47 |
*** dmitriis is now known as Guest251 | 17:48 | |
JayF | honestly that's as nifty for local dev setup as it is for a CI trick, ty | 17:50 |
sean-k-mooney | yes, so the reason ci jobs should not use it heavily | 17:50 |
sean-k-mooney | is because it's meant to be reserved for local dev | 17:51 |
sean-k-mooney | but they can use it | 17:51 |
sean-k-mooney | plugins just should not require it to have stuff to work | 17:51 |
sean-k-mooney | since its for the user not the "framework/plugins" to use | 17:51 |
sean-k-mooney | so form a ci point of view its perfectly valid to use | 17:52 |
sean-k-mooney | but if it's complex it probably should be a feature in the plugin/devstack instead | 17:52 |
JayF | I mean, yeah, it's kinda obvious it'd be hacky to land a CI job using that | 17:54 |
sean-k-mooney | JayF: I was expecting to see something in enroll_nodes https://github.com/openstack/ironic/blob/245505dc04da6e401e1810812809fef3f3757f04/devstack/lib/ironic#L2431 | 17:56 |
sean-k-mooney | to set the shard key on the node | 17:57 |
sean-k-mooney | did I miss that? | 17:57 |
JayF | shard key is optional; you only need it when setting things up | 17:57 |
JayF | and it's unused upstream until the nova change lands | 17:58 |
JayF | so there's nothing to miss (yet) | 17:58 |
sean-k-mooney | ah ok, so the devstack plugin change has not happened yet | 17:58 |
sean-k-mooney | I was expecting to see a new setting added to the plugin | 17:58 |
sean-k-mooney | and then have it append the shard key into the node options if it's defined | 17:59 |
JayF | sean-k-mooney: https://opendev.org/openstack/ironic-tempest-plugin/commit/f2b9c74c9cb8459a979d9002aae3c1a41737c77a is the testing that exists r/n in tempest, no scenario tests yet | 17:59 |
JayF | this conversation is me kicking off that work for this week | 17:59 |
sean-k-mooney | right but that patch is in the tempest plugin not the devstack plugin | 17:59 |
JayF | yep; that's what I'm saying, nothing scenario is done, I have the old patch from last cycle which is basically useless now and never landed (but did what you're lookign for) | 18:00 |
sean-k-mooney | ah ok | 18:00 |
sean-k-mooney | well its not useless in that it allows you to deploy locally with ironic fake nodes via devstack | 18:01 |
sean-k-mooney | with shards | 18:01 |
JayF | Yeah, but it's useless in terms of testing coverage | 18:01 |
JayF | because it was passing even when the client wasn't even passing ?shard=blah to the Ironic API | 18:02 |
JayF | that's what I mean by useless :D | 18:02 |
sean-k-mooney | it depends on what you were trying to test, but I agree it alone was not sufficient | 18:18 |
sean-k-mooney | dansmith: melwitt nova-lvm passed with https://review.opendev.org/c/openstack/nova/+/909469 by the way https://zuul.opendev.org/t/openstack/build/5069b706178e4430ba48bdd52234b065 so do we want to just merge that for now | 18:20 |
sean-k-mooney | memory_tracker low_point: 358852 | 18:21 |
sean-k-mooney | SwapFree: 5804540 kB | 18:21 |
sean-k-mooney | so we are not using most of the additional swap | 18:22 |
dansmith | I'm cool with it yeah :) | 18:22 |
sean-k-mooney | but that does not mean it was not needed and just not captured by the memory tracker | 18:22 |
dansmith | um, parse error | 18:23 |
dansmith | but yeah, I'm in favor of 8G swap for any of these that are on the border | 18:23 |
sean-k-mooney | I mean the memory tracker log runs periodically and the extra swap could have been needed briefly in between runs | 18:23 |
melwitt | sean-k-mooney: noice, +W | 18:26 |
*** NeilHanlon is now known as nhanlon | 19:00 | |
*** nhanlon is now known as NeilHanlon | 19:02 | |
opendevreview | Merged openstack/nova stable/2023.1: libvirt: stop enabling hyperv feature reenlightenment https://review.opendev.org/c/openstack/nova/+/909229 | 20:43 |
opendevreview | Merged openstack/nova master: bump nova-lvm to use 8G of swap from 4G https://review.opendev.org/c/openstack/nova/+/909469 | 21:11 |