opendevreview | Takashi Kajinami proposed openstack/nova master: libvirt: Add new option to require mulitpathd for volume attachment https://review.opendev.org/c/openstack/nova/+/845660 | 02:41 |
opendevreview | Takashi Kajinami proposed openstack/nova master: libvirt: Add new option to require mulitpathd for volume attachment https://review.opendev.org/c/openstack/nova/+/845660 | 02:43 |
opendevreview | Takashi Kajinami proposed openstack/nova master: libvirt: Add new option to require mulitpathd for volume attachment https://review.opendev.org/c/openstack/nova/+/845660 | 02:46 |
opendevreview | Merged openstack/nova stable/xena: Fix eventlet.tpool import https://review.opendev.org/c/openstack/nova/+/840733 | 03:26 |
opendevreview | Merged openstack/nova stable/xena: Gracefull recovery when attaching volume fails https://review.opendev.org/c/openstack/nova/+/829433 | 03:26 |
opendevreview | Merged openstack/nova stable/yoga: Fix segment-aware scheduling permissions error https://review.opendev.org/c/openstack/nova/+/840732 | 04:07 |
opendevreview | Merged openstack/nova stable/yoga: Isolate PCI tracker unit tests https://review.opendev.org/c/openstack/nova/+/840830 | 04:07 |
opendevreview | Merged openstack/nova stable/yoga: Remove unavailable but not reported PCI devices at startup https://review.opendev.org/c/openstack/nova/+/840831 | 04:22 |
opendevreview | Takashi Kajinami proposed openstack/nova master: libvirt: Add new option to require mulitpathd for volume attachment https://review.opendev.org/c/openstack/nova/+/845660 | 04:24 |
opendevreview | Takashi Kajinami proposed openstack/nova master: libvirt: Add new option to require mulitpathd for volume attachment https://review.opendev.org/c/openstack/nova/+/845660 | 04:26 |
opendevreview | Merged openstack/nova stable/wallaby: Fix inactive session error in compute node creation https://review.opendev.org/c/openstack/nova/+/811809 | 04:35 |
opendevreview | Takashi Kajinami proposed openstack/nova master: libvirt: Add new option to require mulitpathd for volume attachment https://review.opendev.org/c/openstack/nova/+/845660 | 06:45 |
gibi | good morning | 07:32 |
bauzas | good morning | 07:47 |
gibi | sean-k-mooney, artom: I have concerns in https://review.opendev.org/c/openstack/nova/+/824048 | 07:54 |
*** lajoskatona_ is now known as lajoskatona | 07:59 | |
bauzas | gibi: following the discussion we had last week about VGPU allocations https://github.com/openstack/nova/blob/master/nova/virt/libvirt/driver.py#L8203-L8206 | 08:34 |
bauzas | https://images.squarespace-cdn.com/content/518f5d62e4b075248d6a3f90/1385326046909-827X09Z2RML6LYHWQBVI/git-blame2.jpg?format=1500w&content-type=image%2Fjpeg | 08:37 |
gibi | bauzas: ahh, it was wishful thinking | 08:42
bauzas | gibi: I just wrote a quick functest and this is weird | 08:43 |
bauzas | I asked for VGPU=2 and I got only one allocation | 08:43 |
bauzas | so I need to dig into the code | 08:43 |
gibi | one allocation of two vgpus from the same pgpu? | 08:43 |
bauzas | just a plain simple flavor with resources:VGPU=2 | 08:44
bauzas | which should have created a single allocation of VGPU=2 | 08:44 |
bauzas | when arriving before the above code | 08:44 |
bauzas | against one single RP | 08:45 |
gibi | yepp placement does not split allocation against a single RC between RPs | 08:45 |
bauzas | but I don't know why when I introspect, I see : | 08:45 |
bauzas | (Pdb) vgpu_allocations | 08:45 |
bauzas | {'1df35fc7-41f4-4ef4-b995-7fd561c6a391': {'resources': {'VGPU': 1}}} | 08:45 |
gibi | ohh, that is strange | 08:45 |
gibi | you should have 2 in the allocation | 08:45 |
bauzas | yeah | 08:45 |
bauzas | correct | 08:45 |
gibi | but then you have a nice problem to debug :) | 08:46 |
bauzas | I'm checking the max limit of the inventory | 08:46 |
bauzas | that could be the reason | 08:46 |
bauzas | if we cap to 1, then there is no way to create an allocation of 2 | 08:46
bauzas | but... the scheduler should have failed, right? | 08:46 |
gibi | but then you should get no allocation candidates | 08:47 |
bauzas | yeah that | 08:47 |
bauzas | something got messed up somewhere and I still need to investigate whether this is just a fixture issue | 08:47
gibi | do you have the placement log from the a_c query? | 08:48 |
gibi | did nova request 1 VGPU in the a_c query? | 08:48
bauzas | I'll set the DEBUG level | 08:49 |
bauzas | shit OS_DEBUG=1 doesn't seem to work with functional tests | 08:50 |
gibi | bauzas: yep it does not, I was not able to track that down last time | 08:51 |
bauzas | pdb'ing a bit further down then | 08:52 |
bauzas | anyway, looks like a nice bone to snag | 08:52 |
bauzas | oh f*** | 08:53 |
bauzas | forget | 08:53 |
bauzas | it's PEBKAC | 08:53 |
* bauzas facepalms | 08:53 | |
* bauzas hides | 08:53 | |
bauzas | I wrote stupid code | 08:53 |
bauzas | https://paste.opendev.org/show/bgjrdJhdinSqXXIY6EWm/ | 08:54 |
bauzas | way better now this is fixed | 08:55 |
bauzas | let's pretend this whole conversation never existed | 08:55 |
bauzas | but my original point remains | 08:56 |
bauzas | we could let operators isolate the allocations between different named groups | 08:56
bauzas | in their flavors | 08:56 |
bauzas | the point is, we will just swallow all of them but one | 08:57 |
gibi | ahh different flavor :) | 08:57
* bauzas blushes | 08:57 | |
bauzas | I'm just about to modify the flavor to ask for one VGPU per group | 08:58
bauzas | and I'm pretty sure we'll end up with only one mdev | 08:58 |
ygk_12345 | hi all can anyone help me with this https://bugs.launchpad.net/nova/+bug/1978065 | 09:14 |
gibi | ygk_12345: please check the request-id in the conductor and scheduler logs as well | 09:18 |
ygk_12345 | gibi: All I can find are those messages from all nova logs. They are still stuck in scheduling and building state. Even now out of 10 vms, only 8 are created fine. rest two are in building state | 09:21 |
ygk_12345 | gibi: i have added those logs now to the case., pls check them | 09:22 |
gibi | I saw. that is an awfully small amount of logs for an instance boot. do you see more logs for those VMs that booted successfully? | 09:22
ygk_12345 | gibi: let me check that | 09:23 |
opendevreview | Rajat Dhasmana proposed openstack/python-novaclient master: Add support to rebuild boot volume https://review.opendev.org/c/openstack/python-novaclient/+/827163 | 09:25 |
ygk_12345 | gibi: i have added the log | 09:28 |
gibi | I don't think you are actually finding all the logs. for a successful boot you should see many log lines in the conductor / scheduler and compute service | 09:30
bauzas | gibi: sorry to interrupt you but my a_c language is a bit rusty | 09:32
gibi | bauzas: no worries | 09:32 |
bauzas | gibi: /placement/allocation_candidates?group_policy=isolate&in_tree=adbdb144-d84b-4b80-b03b-a0bc520d91ba&limit=1000&resources=DISK_GB%3A20%2CMEMORY_MB%3A2048%2CVCPU%3A2&resources1=VGPU%3A1&resources2=VGPU%3A1&root_required=%21COMPUTE_STATUS_DISABLED gives me no valid candidates and I wonder why | 09:32 |
bauzas | the two RPs are supporting different types but I don't ask them | 09:33 |
bauzas | I just ask for one VGPU per RP | 09:33 |
gibi | do you have the PGPU RP in the tree of in_tree=adbdb144-d84b-4b80-b03b-a0bc520d91ba ? | 09:35 |
gibi | so you request two vgpus from two different pGPUs (isolate) | 09:36
gibi | I mean one vgpu from each pgpu | 09:36
gibi | that should work if you have two pGPU RPs | 09:37
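For readability, that allocation_candidates query string decodes as follows; a quick standard-library sketch, with every value copied verbatim from the URL bauzas pasted above:

```python
from urllib.parse import parse_qs

# The allocation_candidates query pasted above, split for readability.
query = (
    "group_policy=isolate"
    "&in_tree=adbdb144-d84b-4b80-b03b-a0bc520d91ba"
    "&limit=1000"
    "&resources=DISK_GB%3A20%2CMEMORY_MB%3A2048%2CVCPU%3A2"
    "&resources1=VGPU%3A1"
    "&resources2=VGPU%3A1"
    "&root_required=%21COMPUTE_STATUS_DISABLED"
)
params = {k: v[0] for k, v in parse_qs(query).items()}
# Two numbered request groups each ask for one VGPU; group_policy=isolate
# forces them onto different resource providers, so the tree needs two
# pGPU RPs to yield any candidate.
print(params["resources1"], params["resources2"])  # VGPU:1 VGPU:1
print(params["group_policy"])                      # isolate
```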
gibi | could you paste the output of https://github.com/gibizer/osc-placement-tree ? (or publish the test you are running so I can reproduce it?) | 09:39 |
bauzas | sorry was doing other thing | 09:41 |
bauzas | gibi: I can install it in the func venv | 09:42 |
bauzas | oh, you said it in the README :) | 09:42 |
bauzas | lemme pip it | 09:42 |
bauzas | gibi: I need to disappear for gym reasons but I'll work on it this afternoon | 09:48 |
bauzas | I just installed, I just need to add the placement api module in my test class | 09:48 |
gibi | bauzas: have a nice workout. feel free to ping me later | 09:48 |
bauzas | gibi: <3 | 09:51 |
bauzas | gibi: found my problem | 09:51 |
bauzas | I was only having one child RP | 09:51 |
* bauzas needs to go off but thanks | 09:51 | |
gibi | ack, good to hear that | 09:51 |
gibi | bauzas, sean-k-mooney: an interesting performance bug https://bugs.launchpad.net/nova/+bug/1978372 | 09:55 |
sean-k-mooney | first thought is they don't know how to use the hw:numa_* extra specs | 09:59
sean-k-mooney | it's very bad practice to use hw:numa_cpus or hw:numa_mem with symmetric numa topologies | 10:00
sean-k-mooney | it's explicitly an anti-pattern | 10:00
sean-k-mooney | and should not be done | 10:00
sean-k-mooney | it won't affect the performance but they also misunderstand the relationship between hw:cpu_threads and hw:cpu_max_threads | 10:01
sean-k-mooney | if hw:cpu_threads is specified then max is ignored | 10:01
sean-k-mooney | same for sockets | 10:02
sean-k-mooney | so that flavor definition annoys me on a fundamental level :) | 10:02
sean-k-mooney | gibi: honestly I'm not that surprised, we know that this is a very slow algorithm | 10:03
gibi | then look at the minimal reproduction that has no flavor :) | 10:03
sean-k-mooney | I did some experiments a few years ago reimplementing it from scratch | 10:04
sean-k-mooney | but the reason I did not continue with that was we were going to implement numa in placement any day now | 10:04
gibi | also we are emitting 18G debug logs during that run which is pretty extreme | 10:04 |
sean-k-mooney | people complained there was not enough logging in the hardware module :P | 10:05
sean-k-mooney | https://github.com/SeanMooney/cpu-pinning | 10:06 |
sean-k-mooney | this was my experimental approach | 10:06
gibi | as far as I understand this call does not call out to any external system (no fs, db, rabbit calls) so it runs in pure python. So taking minutes indicates that we are doing something crazy | 10:08
sean-k-mooney | we are trying to validate cpu (floating and pinned), ram, pci, and pmem affinity based on the rules set in the flavor with regards to thread affinity, device affinity and any symmetric or asymmetric numa requirements | 10:10
sean-k-mooney | gibi: by the way we have known about this basically from the start, that it's quadratic or worse as you scale the number of host/guest numa nodes | 10:16
sean-k-mooney | it's why the NUMATopologyFilter should always be last in your filter list if you enable it | 10:16
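As a sketch of that ordering advice, a nova.conf fragment (the filters listed before NUMATopologyFilter are just a plausible example set, not a recommendation for any particular deployment):

```ini
[filter_scheduler]
# Cheap filters run first and shrink the candidate host list, so the
# expensive NUMATopologyFilter only evaluates the hosts that survive.
enabled_filters = ComputeFilter,ImagePropertiesFilter,PciPassthroughFilter,NUMATopologyFilter
```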
sean-k-mooney | I am not sure if I can reproduce that simple example in my other implementation but I'm going to give it a try quickly | 10:19
sean-k-mooney | [11:27:50]❯ ./run.sh | 10:28 |
sean-k-mooney | Courtesy Notice: Pipenv found itself running within a virtual environment, so it will automatically use that environment, instead of creating its own for any project. You can set PIPENV_IGNORE_VIRTUALENVS=1 to force pipenv to ignore that environment and create its own instead. You can set PIPENV_VERBOSITY=-1 to suppress this warning. | 10:28 |
sean-k-mooney | 10:28 | |
sean-k-mooney | 0:0,1:32,2:512,3:544 | 10:28 |
sean-k-mooney | 0:1,1:2,2:3,3:4 | 10:28 |
sean-k-mooney | 0:5,1:69,2:133,3:197 | 10:28 |
sean-k-mooney | 0:33,1:513,2:34,3:514,4:35,5:515,6:36,7:6 | 10:28 |
sean-k-mooney | 0:2047,1:1983,2:1919,3:1855,4:2015,5:1951,6:1887,7:1823 | 10:28 |
sean-k-mooney | bug reproducer | 10:28
sean-k-mooney | ---------------------------------------- | 10:28 |
sean-k-mooney | 0:0,1:3,2:6,3:9,4:12,5:15,6:18,7:21,8:48,9:51,10:54,11:57,12:60,13:63,14:66,15:69,16:1,17:4,18:7,19:10,20:13,21:16,22:19,23:22,24:49,25:52,26:55,27:58,28:61,29:64,30:67,31:70,32:2,33:5,34:8,35:11,36:14,37:17,38:20,39:23,40:50,41:53,42:56,43:59,44:62,45:65,46:68,47:71 | 10:28 |
sean-k-mooney | real 0m1.004s | 10:28 |
sean-k-mooney | user 0m0.893s | 10:28 |
sean-k-mooney | sys 0m0.094s | 10:28 |
sean-k-mooney | gibi: is ^ better | 10:28 |
sean-k-mooney | or | 10:29 |
sean-k-mooney | real 0m0.986s | 10:29 |
sean-k-mooney | user 0m0.887s | 10:29 |
sean-k-mooney | sys 0m0.078s | 10:29 |
sean-k-mooney | if I just run the reproducer | 10:29
gibi | the reproducer took 6 mins to run on my laptop | 10:32 |
gibi | so yours seems to be a loooot faster | 10:32 |
sean-k-mooney | https://github.com/SeanMooney/cpu-pinning/commit/018a1d0a9abeeec40d3e3ccbfd640363158433c5 | 10:34 |
sean-k-mooney | that is the same case right | 10:34 |
sean-k-mooney | 1 socket 16 numa nodes and 48 cores with 2 threads per cpu for 96 threads | 10:35
sean-k-mooney | then boot a 48 core guest with pinning and the prefer thread policy with 1G hugepages | 10:35
sean-k-mooney | and 488GB of ram | 10:35 |
gibi | there are some usage too in the reproducer | 10:35 |
gibi | the first 7 host numa cells are used | 10:36
sean-k-mooney | ack I can add that and see if it makes any difference | 10:36
sean-k-mooney | basically the same | 10:50 |
sean-k-mooney | real 0m1.074s | 10:50 |
sean-k-mooney | user 0m0.936s | 10:50 |
sean-k-mooney | sys 0m0.115s | 10:50 |
sean-k-mooney | https://github.com/SeanMooney/cpu-pinning/blob/master/pinning.py#L378-L415 | 10:51
sean-k-mooney | the issue with my alternative implementation is it only supports hugepages and cpu pinning | 10:52
sean-k-mooney | it is missing pci support and mixed cpu support | 10:52
sean-k-mooney | I'll run their reproducer to have a comparison on the same hardware | 10:53
sean-k-mooney | I'm going to make coffee while this runs... | 10:59
gibi | dansmith, bauzas: what do you think do we want to / can do something about this? https://bugs.launchpad.net/nova/+bug/1978549 It feels like a bug but I'm not sure I want to go back and add a db migration to stein in nova or touch the placement db init script. https://bugs.launchpad.net/nova/+bug/1978549 | 11:03 |
sean-k-mooney | gibi: don't we tend to not drop the columns right away | 11:06
sean-k-mooney | to cater for rolling upgrades | 11:06 |
gibi | sure | 11:06 |
sean-k-mooney | we stop the usage and then a few release later we can drop the column | 11:06 |
gibi | but then we forgot to add it to the placement db init script | 11:07
gibi | so if you have a DB that was created before placement was moved out of the nova repo then you have can_host | 11:07
sean-k-mooney | well that would only be an issue if you were initing on a version that used it right | 11:07
gibi | we removed the can_host column but without a DB migration | 11:08 |
gibi | the removal happened when the new db init was created for the moved-out placement | 11:08
sean-k-mooney | i see | 11:08 |
sean-k-mooney | gibi: real 12m32.814s | 11:10 |
sean-k-mooney | maybe my 4 year old laptop is starting to show its age | 11:10 |
sean-k-mooney | gibi: we could just document this as a workaround for the db issue | 11:11 |
sean-k-mooney | the fix is only required if moving from in-nova to out-of-tree placement right | 11:12
sean-k-mooney | if you are doing a db restore | 11:12
sean-k-mooney | and even then only if you are migrating db backends | 11:12
sean-k-mooney | if you were just doing a db export and import in mysql it would be fine because the import would create the tables | 11:13
sean-k-mooney | in their case they are likely just migrating the data and using placement-manage to init an empty db | 11:13
gibi | I agree to only document the workaround. I'm not sure where to put that documentation though. As adding a reno now in zed in placement for an issue introduced in stein in nova does not feel right | 11:16
sean-k-mooney | we can do a stable-only reno | 11:16
sean-k-mooney | on stein | 11:16
sean-k-mooney | or update the existing one for splitting out placement which I assume exists somewhere | 11:17
gibi | hm there is admin/upgrade-to-stein.rst | 11:18 |
gibi | that would be OK | 11:18 |
gibi | thanks | 11:18 |
opendevreview | Balazs Gibizer proposed openstack/placement master: Add WA about resource_providers.can_host removal https://review.opendev.org/c/openstack/placement/+/845730 | 11:32 |
sean-k-mooney | gibi: I'm not going to do this now but if I were to update my poc to implement all or a subset of the current numa_fit_instance_to_host function | 11:33
sean-k-mooney | would we consider that backportable if it was opt-in and both implementations could be in the code in parallel even if only one would be used on a compute host at a time | 11:33
gibi | why do we need both? does yours produce a different result than the current? | 11:34
sean-k-mooney | mine does not do everything the current one does | 11:34
gibi | yepp, then we might need a config flag | 11:34 |
sean-k-mooney | so I'm wondering if we could make it incremental | 11:35
sean-k-mooney | and also have the option to run both as filters | 11:35
sean-k-mooney | so mine currently only handles hugepages and cpu pinning | 11:35
sean-k-mooney | it's missing pci devices and mixed cpus and maybe pmem | 11:36
sean-k-mooney | pmem enables a numa topology but I don't think we provide affinity for pmem | 11:36
sean-k-mooney | so it's really just mixed mode and pci devices that would initially be missing | 11:36
sean-k-mooney | my thought is you could enable both filters and have the fast one run first so you would only validate the hosts with the slow version if we knew the cpu and ram request were valid | 11:37
gibi | overall I'm OK to make it incremental or even selectable, but I'm not sure stable cores will like it | 11:38 |
gibi | this will include a new bunch of code to stable branches | 11:38
sean-k-mooney | so basically you would do pci_passthrough_filter,numa_v2,numa_topology_filter | 11:38
gibi | that is a liability | 11:39
sean-k-mooney | yep | 11:39 |
sean-k-mooney | it is | 11:39 |
gibi | what if we just add what we have to master as selectable, then we improve on it later on master | 11:39 |
sean-k-mooney | we could yes | 11:39 |
sean-k-mooney | I'm just trying to think if there is anything we can do to address the bug report on older branches too | 11:40
sean-k-mooney | I don't think there is anything trivial that can be fixed on the older branches to make the performance acceptable. | 11:40
gibi | I don't know. maybe we can look at the reproducer in a profiler | 11:41 |
sean-k-mooney | I think it's to do with our usage of itertools permutations | 11:42
sean-k-mooney | we do a linear loop over all permutations of numa nodes when trying to fit the instance | 11:42
sean-k-mooney | we early out once the instance fits but that is really not efficient with more than a handful of numa nodes | 11:43
sean-k-mooney | gibi: I can try and take a look at it briefly | 11:44
gibi | sean-k-mooney: only if you want :) I'm not asking to take this | 11:45 |
gibi | I'm not promising either that I can fire up a profiler today | 11:45 |
sean-k-mooney | hehe | 11:45 |
sean-k-mooney | this is not new | 11:45 |
sean-k-mooney | so I don't think it's super high priority but I was halfway through fixing up my vdpa patches | 11:46
sean-k-mooney | so I want to finish those today | 11:46
sean-k-mooney | so I'm somewhat interested in whether there is a minimal fix we can do to make it better | 11:46
sean-k-mooney | I just don't want to spend all day looking at it | 11:46
sean-k-mooney | so I'll give it till the top of the hour | 11:46
sean-k-mooney | we can take another look later in the week or next week | 11:47 |
gibi | ack | 11:47 |
* sean-k-mooney forgot how nice it is to have a working debugger | 11:57 | |
opendevreview | Merged openstack/nova stable/wallaby: Add service version check workaround for FFU https://review.opendev.org/c/openstack/nova/+/844202 | 11:57 |
gibi | :) | 11:59 |
sean-k-mooney | it's amazing how much simpler nova code becomes without eventlet | 11:59
sean-k-mooney | actually there might be a small tweak we can do | 12:01
sean-k-mooney | we currently do this | 12:02 |
sean-k-mooney | for host_cell_perm in itertools.permutations( | 12:02 |
sean-k-mooney | host_cells, len(instance_topology) | 12:02 |
sean-k-mooney | ): | 12:02 |
sean-k-mooney | so we get the next permutation for the full pinning | 12:02
sean-k-mooney | i wonder if we could do this one numa node at a time | 12:02 |
sean-k-mooney | so loop over the instance numa cells | 12:03
sean-k-mooney | and try to pin them one at a time | 12:03 |
sean-k-mooney | then try to pin the rest | 12:03 |
sean-k-mooney | i think that would be a lot faster | 12:04 |
sean-k-mooney | as currently if the first 7 numa nodes are full we have to try all permutations for the 7 numa nodes before we will try the 8th I think | 12:04
sean-k-mooney | if I refactor this we will do 7 tests for the first guest numa node, it will match on the 8th numa node and then we will try fitting the next numa node | 12:05
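The refactor sean-k-mooney sketches can be played out with toy numbers (free-core counts stand in for real NUMA cell objects; the function names are made up for illustration, not nova's API):

```python
import itertools

def fit_full_permutations(host_cells, instance_cells):
    """Rough model of the current loop: test whole permutations of host
    cell indexes until one fits every instance cell at once."""
    tried = 0
    for perm in itertools.permutations(range(len(host_cells)),
                                       len(instance_cells)):
        tried += 1
        if all(host_cells[i] >= want
               for i, want in zip(perm, instance_cells)):
            return perm, tried
    return None, tried

def fit_incremental(host_cells, instance_cells):
    """Sketch of the proposed per-node idea: place one instance cell at
    a time, backtracking only when the remainder cannot be placed."""
    checks = 0

    def place(remaining, used):
        nonlocal checks
        if not remaining:
            return used
        want = remaining[0]
        for i, free in enumerate(host_cells):
            if i in used:
                continue
            checks += 1
            if free >= want:
                fit = place(remaining[1:], used + (i,))
                if fit is not None:
                    return fit
        return None

    return place(tuple(instance_cells), ()), checks

# The first six host cells are full; a two-cell guest only fits on 6 and 7.
hosts = [0, 0, 0, 0, 0, 0, 4, 4]
guest = [2, 2]
print(fit_full_permutations(hosts, guest))  # ((6, 7), 49)
print(fit_incremental(hosts, guest))        # ((6, 7), 14)
```

Both find the same placement; the incremental version just rules out a full host cell once per guest cell instead of once per permutation that contains it.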
gibi | would that just do the permutation generation with loops? sure the order of the search would be different | 12:05 |
gibi | so it might optimize for the current case | 12:06
sean-k-mooney | the worst case performance would still be the same but I think it would improve best and average case | 12:06
gibi | so we optimize based on some heuristics that the first numa nodes are more likely to be filled than the later numa nodes | 12:11 |
sean-k-mooney | well until master it did a linear search | 12:11
sean-k-mooney | so yes the first numa nodes were always filled first deterministically | 12:11
sean-k-mooney | we recently added numa node balancing | 12:11
gibi | true, so using the old linear fill, it makes sense to add a heuristic to the search too | 12:12
sean-k-mooney | also my entire system just hard locked up while I was debugging that even with it paused | 12:12
gibi | wondering if just simply using permutations(reversed(host_cells)) would be enough to optimize too | 12:12 |
sean-k-mooney | perhaps I should close some browser tabs | 12:12
sean-k-mooney | well we are not sorting the host_cells | 12:13 |
gibi | the other day I ran out of memory on my laptop, so now I added some swap | 12:13
sean-k-mooney | based on available ram, disk, pci devices and if you asked for them | 12:13
sean-k-mooney | I ran out of swap but still had 8GB of ram free | 12:14
sean-k-mooney | I do have a 2 node devstack and openshift running in 3 8G vms currently too | 12:14
sean-k-mooney | oh my email client is only using 4G of ram today that is nice of it. it was using 15 last week... | 12:16 |
sean-k-mooney | [1910527.379325] Out of memory: Killed process 1338765 (.qemu-system-x8) total-vm:13382508kB, anon-rss:7832544kB, file-rss:0kB, shmem-rss:4kB, UID:0 pgtables:17324kB oom_score_adj:0 | 12:17
sean-k-mooney | yep I ran out of memory, I'll stop the vms for now | 12:17
sean-k-mooney | that's totally a good sign for the efficiency of this code right | 12:18
sean-k-mooney | gibi: :) | 12:29 |
sean-k-mooney | import nova.conf | 12:29 |
sean-k-mooney | CONF = nova.conf.CONF | 12:29 |
sean-k-mooney | CONF.compute.packing_host_numa_cells_allocation_strategy = False | 12:29 |
sean-k-mooney | that fixes the issue | 12:29 |
sean-k-mooney | we disable the numa balancing by default for "backwards compatibility" | 12:29
sean-k-mooney | gibi: if you enable it, when we sort by free memory all the used nodes go to the end | 12:30
sean-k-mooney | so the first permutation fits | 12:30
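A toy illustration of why that spread sort ends the search immediately (cell names and free-memory values are invented for the example, not nova objects):

```python
import itertools

# Three full host NUMA cells and two empty 16 GB ones; the guest wants
# two 8 GB cells.
host_cells = {"n0": 0, "n1": 0, "n2": 0, "n3": 16, "n4": 16}
need = [8, 8]

def first_fit_index(order):
    """Return which permutation (1-based) is the first one that fits."""
    for n, perm in enumerate(itertools.permutations(order, len(need)), 1):
        if all(host_cells[c] >= w for c, w in zip(perm, need)):
            return n
    return None

pack_order = list(host_cells)  # cells in reported order, full ones first
spread_order = sorted(host_cells, key=host_cells.get, reverse=True)
print(first_fit_index(pack_order))    # 16 - wades through the full cells
print(first_fit_index(spread_order))  # 1  - emptiest cells come first
```

With most-free-first ordering, the very first permutation the fitting loop generates already has room, so the expensive search exits at once.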
sean-k-mooney | [13:30:55]➜ time python t.py | 12:31 |
sean-k-mooney | InstanceNUMATopology(cells=[InstanceNUMACell(8),InstanceNUMACell(9),InstanceNUMACell(10),InstanceNUMACell(11),InstanceNUMACell(12),InstanceNUMACell(13),InstanceNUMACell(14)],emulator_threads_policy=None,id=<?>,instance_uuid=<?>) | 12:31 |
sean-k-mooney | real 0m1.489s | 12:31 |
sean-k-mooney | user 0m1.373s | 12:31 |
sean-k-mooney | sys 0m0.095s | 12:31 |
sean-k-mooney | gibi: not as fast as my out of tree version but pretty close | 12:31 |
gibi | ahh | 12:31 |
sean-k-mooney | and fully feature complete | 12:31 |
gibi | that is an easy workaround for the particular case | 12:32 |
sean-k-mooney | well the spread approach is generally better provided you don't need really large vms | 12:32
sean-k-mooney | that depend on fully filling the numa nodes to spawn | 12:32 |
sean-k-mooney | but yes | 12:32 |
sean-k-mooney | gibi: ok added a comment https://bugs.launchpad.net/nova/+bug/1978372/comments/6 | 12:38 |
sean-k-mooney | gibi: we have already started backporting the pack/spread behavior to xena, they reported it on wallaby | 12:38
sean-k-mooney | so if we just continue the backport of the sorting behavior we could close it as a dupe I guess | 12:39
gibi | sean-k-mooney: thanks | 12:39 |
sean-k-mooney | that does raise the question of whether we should change the default for pack vs spread on master and/or still look at the other implementation in the future | 12:39
gibi | I think we can change the default on master now | 12:40
gibi | if we want | 12:40 |
gibi | probably not on stable | 12:40 |
sean-k-mooney | right, not stable, but I can propose a patch to change the default and add a release note for people to review | 12:44
gibi | yeah, lets try | 12:47 |
gibi | I have to refresh myself about the trade off though | 12:48 |
sean-k-mooney | spread will try to put vms on empty numa nodes first, pack does the reverse, trying to use all available space on numa nodes before using the next one | 12:49
sean-k-mooney | if you have 2 numa nodes but 3 vms, 1 that needs a full numa node and 2 that each need half a numa node | 12:50
sean-k-mooney | then with pack all 3 will fit in any order | 12:50
sean-k-mooney | with spread, unless the big vm is booted first, only 2 of the 3 will schedule | 12:50
sean-k-mooney | gibi: ^ that's the main trade off | 12:50
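That trade-off can be played out with toy numbers (a sketch, not nova's placement logic; node and guest sizes are invented):

```python
# Two 16 GB host NUMA nodes; three guests: two 8 GB (half a node each)
# and one 16 GB (a full node), booted small-first, which is the bad
# ordering for spread.
def place_all(guests, strategy):
    nodes = [16, 16]  # free memory per host NUMA node
    placed = 0
    for want in guests:
        # pack: most-used (least free) node first; spread: emptiest first.
        order = sorted(range(len(nodes)), key=lambda i: nodes[i],
                       reverse=(strategy == "spread"))
        for i in order:
            if nodes[i] >= want:
                nodes[i] -= want
                placed += 1
                break
    return placed

print(place_all([8, 8, 16], "pack"))    # 3 - the halves share one node
print(place_all([8, 8, 16], "spread"))  # 2 - the halves land on both nodes
```

Pack keeps a whole node free for the big guest; spread fragments both nodes, so the full-node guest no longer fits.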
gibi | thanks | 12:51 |
sean-k-mooney | but if you spread you get better cpu/memory performance in the guest until the node is full and then it's the same | 12:51
gibi | I think both ways are valid, but if we know that the scheduling performance is better in the spread case then that might be enough reasoning to switch | 12:52
gibi | the default | 12:52 |
sean-k-mooney | yep both are valid but I personally prefer spread as the default | 12:52
sean-k-mooney | it does need to be configurable | 12:53 |
sean-k-mooney | so that operators can choose | 12:53 |
sean-k-mooney | you could technically have different values on scheduler vs compute too | 12:53
sean-k-mooney | you could use spread in the scheduler and pack on computes | 12:53
bauzas | gibi: good news, I was able to ask for multiple vGPUs | 12:54 |
sean-k-mooney | currently we recompute the pinning on the compute anyway since we throw away the scheduler info | 12:54
bauzas | but yeah, we hit the warning, so we only have one VGPU | 12:55 |
sean-k-mooney | bauzas: using generic mdevs to use different device classes for different host gpus | 12:55
sean-k-mooney | or via multi create | 12:55 |
bauzas | sean-k-mooney: lemme upload the functest | 12:55 |
sean-k-mooney | ack | 12:55 |
opendevreview | Sylvain Bauza proposed openstack/nova master: Add a functest for verifying multiple VGPU allocations https://review.opendev.org/c/openstack/nova/+/845747 | 13:09 |
bauzas | sean-k-mooney: ^ | 13:09 |
bauzas | sean-k-mooney: tl;dr: given the nvidia driver issue (you can't ask for more than one VGPU per pGPU per instance), operators would then want to spread the vGPUs between multiple pGPUs | 13:10
opendevreview | Alexey Stupnikov proposed openstack/nova stable/victoria: Test aborting queued live migration https://review.opendev.org/c/openstack/nova/+/845748 | 13:12 |
opendevreview | Alexey Stupnikov proposed openstack/nova stable/victoria: Add functional tests to reproduce bug #1960412 https://review.opendev.org/c/openstack/nova/+/845753 | 13:26 |
opendevreview | Alexey Stupnikov proposed openstack/nova stable/victoria: Clean up when queued live migration aborted https://review.opendev.org/c/openstack/nova/+/845754 | 13:29 |
opendevreview | Sylvain Bauza proposed openstack/nova master: WIP : Support multiple allocations for vGPUs https://review.opendev.org/c/openstack/nova/+/845757 | 13:42 |
bauzas | gibi: ^ | 13:42 |
gibi | bauzas: left feedback :) | 14:15 |
bauzas | cool, this is just a WIP tho | 14:16 |
bauzas | gibi: thanks for commenting, those are good thoughts | 14:17 |
gibi | I didn't even realize that it is a WIP, it has a functional test :D | 14:18
bauzas | Zuul will hit my face as I think I'm closing some gaps | 14:19 |
bauzas | and I'll need to change some UTs | 14:19 |
ygk_12345 | can someone look into this https://bugs.launchpad.net/oslo.messaging/+bug/1978562 please | 14:27 |
*** dasm|off is now known as dasm | 14:31 | |
opendevreview | Balazs Gibizer proposed openstack/nova master: Clean up mapping input to address spec types https://review.opendev.org/c/openstack/nova/+/845765 | 14:32 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Clean up mapping input to address spec types https://review.opendev.org/c/openstack/nova/+/845765 | 14:36 |
opendevreview | Elod Illes proposed openstack/placement stable/victoria: Add periodic-stable-jobs template https://review.opendev.org/c/openstack/placement/+/845770 | 15:02 |
opendevreview | Artom Lifshitz proposed openstack/nova master: libvirt: remove default cputune shares value https://review.opendev.org/c/openstack/nova/+/824048 | 15:03 |
opendevreview | Alexey Stupnikov proposed openstack/nova stable/victoria: Test aborting queued live migration https://review.opendev.org/c/openstack/nova/+/845748 | 15:09 |
opendevreview | Alexey Stupnikov proposed openstack/nova stable/victoria: Add functional tests to reproduce bug #1960412 https://review.opendev.org/c/openstack/nova/+/845753 | 15:10 |
opendevreview | Alexey Stupnikov proposed openstack/nova stable/victoria: Clean up when queued live migration aborted https://review.opendev.org/c/openstack/nova/+/845754 | 15:11 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Remove unused PF checking from get_function_by_ifname https://review.opendev.org/c/openstack/nova/+/845775 | 15:21 |
*** lajoskatona_ is now known as lajoskatona | 15:24 | |
opendevreview | Balazs Gibizer proposed openstack/nova master: Fix type annotation of pci.Whitelist class https://review.opendev.org/c/openstack/nova/+/845780 | 15:30 |
ygk_12345 | can someone look into this https://bugs.launchpad.net/oslo.messaging/+bug/1978562 please | 15:35 |
bauzas | reminder: nova meeting in 20 mins | 15:40 |
opendevreview | Balazs Gibizer proposed openstack/nova master: Move __str__ to the PciAddressSpec base class https://review.opendev.org/c/openstack/nova/+/845781 | 15:50 |
bauzas | #startmeeting nova | 16:00 |
opendevmeet | Meeting started Tue Jun 14 16:00:00 2022 UTC and is due to finish in 60 minutes. The chair is bauzas. Information about MeetBot at http://wiki.debian.org/MeetBot. | 16:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 16:00 |
opendevmeet | The meeting name has been set to 'nova' | 16:00 |
bauzas | howdy back | 16:00 |
dansmith | o/ | 16:00 |
bauzas | and welcome on our first June nova meeting | 16:00 |
gibi | o/ | 16:00 |
melwitt | o/ | 16:01 |
* bauzas just hopes we'll have more people joining | 16:01 | |
bauzas | but we can start | 16:01 |
Uggla | o/ | 16:01 |
bauzas | #topic Bugs (stuck/critical) | 16:02 |
bauzas | #info No Critical bug | 16:02 |
elodilles | o/ | 16:02 |
bauzas | #link https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New 14 new untriaged bugs (+0 since the last meeting) | 16:02 |
bauzas | #link https://storyboard.openstack.org/#!/project/openstack/placement 26 open stories (0 since the last meeting) in Storyboard for Placement | 16:02 |
bauzas | #info Add yourself in the team bug roster if you want to help https://etherpad.opendev.org/p/nova-bug-triage-roster | 16:02 |
bauzas | I have to admit publicly I feel ashamed | 16:02 |
bauzas | I forgot about the baton when it was mine | 16:02 |
bauzas | throw me tomatoes | 16:02 |
melwitt | 🍅 | 16:03 |
bauzas | but alas I triaged one bug :) | 16:03 |
bauzas | apparently, preparing the Summit trip and doing bug triage doesn't mix on my side | 16:03 |
gibi | I took the baton from bauzas in Berlin in person \o/ | 16:03 |
bauzas | literally | 16:03 |
gibi | there was a normal amount of bug inflow | 16:04 |
gibi | https://etherpad.opendev.org/p/nova-bug-triage-20220607 | 16:04 |
bauzas | it could have been an olympic torch | 16:04 |
gibi | does that count as carry on baggage? | 16:04 |
melwitt | it could be your personal item | 16:05 |
gibi | :) | 16:05 |
bauzas | depends on the size I guess | 16:05 |
bauzas | or it could be seen as a sport gear | 16:05 |
bauzas | anyway | 16:05 |
gibi | I saw two interesting bugs | 16:05 |
bauzas | thanks gibi | 16:05 |
gibi | https://bugs.launchpad.net/nova/+bug/1978372 numa_fit_instance_to_host() algorithm is highly ineffective on higher number of NUMA nodes | 16:05 |
gibi | sean-k-mooney updated me that this is a known inefficiency of our algo | 16:06 |
bauzas | what kind of hardware has this large amount of NUMA nodes ? | 16:06 |
* bauzas is always unpleasantly surprised by all the new things that are created around him | 16:06 | |
gibi | I'm not sure | 16:07 |
gibi | but I accept that it is possible | 16:07 |
bauzas | 8 NUMA nodes seems large to me, but I'm not a tech savvy | 16:07 |
sean-k-mooney | bauzas: most recent AMD servers | 16:07 |
sean-k-mooney | 16 numa nodes is not uncommon now | 16:07 |
bauzas | my brain hurts. | 16:08 |
sean-k-mooney | you can get 16 numa nodes in a single socket now | 16:08 |
sean-k-mooney | and I have seen systems with 64 | 16:08 |
* bauzas is network-bound by his gigabit switches at home while he can download at 10 | 16:09 | |
sean-k-mooney | our current packing default falls apart after about 4-8 numa nodes | 16:09 |
gibi | so right now we are slow by default, but if numa spread is enabled instead of the default pack then it is much better, as sean-k-mooney discovered | 16:09 |
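[editor's note] The pack-versus-spread behaviour gibi describes can be sketched with a toy model. This is not nova's actual `numa_fit_instance_to_host()` (which lives in nova/virt/hardware.py and permutes host cells); the function names and cell counts below are purely illustrative of why a pack-style search degrades on hosts with many mostly-full NUMA cells while a capacity-sorted spread finds a fit immediately:

```python
# Toy model of host NUMA cell selection -- illustrative only, not nova code.
from itertools import permutations

def fit_pack(host_cells, want_cpus, want_cells):
    """Pack-style: try host cells in their natural order; the first
    permutation whose cells all have room wins. Worst case is factorial
    in the number of cells."""
    tried = 0
    for combo in permutations(range(len(host_cells)), want_cells):
        tried += 1
        if all(host_cells[i] >= want_cpus for i in combo):
            return combo, tried
    return None, tried

def fit_spread(host_cells, want_cpus, want_cells):
    """Spread-style: sort cells by free capacity first, so the emptiest
    cells are tried immediately."""
    order = sorted(range(len(host_cells)),
                   key=lambda i: host_cells[i], reverse=True)
    combo = tuple(order[:want_cells])
    ok = all(host_cells[i] >= want_cpus for i in combo)
    return (combo if ok else None), 1

# 8 NUMA cells of free pCPUs; most are nearly full, only the last two fit.
cells = [1, 1, 1, 1, 1, 1, 8, 8]
print(fit_pack(cells, want_cpus=4, want_cells=2))    # churns through many candidates
print(fit_spread(cells, want_cpus=4, want_cells=2))  # first candidate fits
```

With 8 cells the pack-style search already tries dozens of candidates before succeeding; at 16 or 64 cells the permutation space explodes, which matches the slowdown reported in the bug.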
bauzas | anyway, sounds like an opportunity for optimization then | 16:09 |
sean-k-mooney | i have a patch in flight to change the default | 16:10 |
bauzas | the whole packing strategy is hidden within the code | 16:10 |
sean-k-mooney | ill work on the release note and push it later | 16:10 |
sean-k-mooney | we can continue the discussion there if you like | 16:10 |
bauzas | sure | 16:10 |
sean-k-mooney | bauzas: yep its also not part of the api contract and never was | 16:10 |
* bauzas shrugs | 16:11 | |
gibi | so the other bug I would like to mention | 16:11 |
gibi | https://bugs.launchpad.net/nova/+bug/1978549 Placement resource_providers table has dangling column "can_host" | 16:11 |
bauzas | anyway, I understand people wondering why our packing strategy struggles with only 16 nodes to iterate | 16:11 |
gibi | I marked it as wontfix with a small note in the placement documentation | 16:11 |
gibi | https://review.opendev.org/c/openstack/placement/+/845730 | 16:11 |
gibi | this was a mistake back in stable/stein | 16:12 |
gibi | and I don't want to go back and touch DB migrations there | 16:12 |
gibi | that is all I had for bug triage this week | 16:13 |
bauzas | gibi: can_host is not part of the DB contract ? | 16:13 |
bauzas | I mean the model | 16:13 |
gibi | it was removed from the DB model since stein | 16:14 |
gibi | but we never added a DB migration to drop the column from the schema | 16:14 |
gibi | but when Placement was split out of nova, a new initial DB schema was defined, this time without can_host | 16:14 |
gibi | hence the inconsistency | 16:14 |
gibi | on the schema level | 16:14 |
gibi | but nothing uses can_host | 16:14 |
bauzas | oh I understand | 16:14 |
bauzas | but, if this is post-Stein, the table is removed anyway, no ? | 16:16 |
bauzas | as said in 'Finalize the upgrade' | 16:16 |
sean-k-mooney | i think the issue here is they are doing a postgres to mariadb migration | 16:17 |
sean-k-mooney | so they were using placement-manage to create the new db schema | 16:17 |
sean-k-mooney | then trying to do a data migration | 16:17 |
sean-k-mooney | and their original db had the column | 16:17 |
sean-k-mooney | but the target does not | 16:17 |
sean-k-mooney | so if they drop the column on the source db then do the data migration it would be fine | 16:18 |
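[editor's note] sean-k-mooney's suggestion — make the source schema match the target before copying rows — is the generic pattern for this kind of migration. A minimal stdlib-sqlite sketch of rebuilding the table without the dangling column (the `can_host` column comes from the bug report; the table contents and the rebuild-and-rename approach here are illustrative, not placement's actual migration tooling):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Source schema still carries the dangling can_host column (pre-stein leftover).
cur.execute("CREATE TABLE resource_providers "
            "(id INTEGER PRIMARY KEY, uuid TEXT, can_host INTEGER)")
cur.execute("INSERT INTO resource_providers (uuid, can_host) VALUES ('rp-1', 1)")

# Rebuild the table without can_host so it matches the target schema;
# afterwards a row-by-row data migration lines up column for column.
cur.executescript("""
    CREATE TABLE resource_providers_new (id INTEGER PRIMARY KEY, uuid TEXT);
    INSERT INTO resource_providers_new (id, uuid)
        SELECT id, uuid FROM resource_providers;
    DROP TABLE resource_providers;
    ALTER TABLE resource_providers_new RENAME TO resource_providers;
""")

cols = [row[1] for row in cur.execute("PRAGMA table_info(resource_providers)")]
print(cols)  # ['id', 'uuid'] -- can_host is gone, data preserved
```

The same rebuild can be done with a plain `ALTER TABLE ... DROP COLUMN` on databases that support it; the copy-and-rename form works everywhere.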
bauzas | ok, I didn't want to enter into the details, let's move on, I think it's safe bet what gibi did | 16:18 |
gibi | ack | 16:18 |
bauzas | melwitt: are you OK with bug triaging this week or do you want me to do it as punishment for my negligence last week ? | 16:19 |
bauzas | the latter is fine to me | 16:19 |
melwitt | bauzas: sure, maybe better bc I am out on pto next week | 16:19 |
bauzas | melwitt: cool, then I'll steal it from gibi | 16:20 |
melwitt | cool thanks | 16:20 |
bauzas | #info Next bug baton is passed to bauzas | 16:20 |
bauzas | if you don't mind, I'll pass you the baton next week | 16:20 |
bauzas | or we could give it to anyone else | 16:21 |
melwitt | yes that is ok | 16:21 |
bauzas | moving on | 16:21 |
bauzas | #topic Gate status | 16:21 |
* gibi feels sudden emptiness in his life | 16:21 | |
bauzas | #link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure Nova gate bugs | 16:21 |
bauzas | #link https://zuul.openstack.org/builds?project=openstack%2Fplacement&pipeline=periodic-weekly Placement periodic job status | 16:21 |
bauzas | #link https://zuul.opendev.org/t/openstack/builds?job_name=nova-emulation&pipeline=periodic-weekly&skip=0 Emulation periodic job runs | 16:21 |
melwitt | 😂 gibi | 16:21 |
bauzas | #info Please look at the gate failures and file a bug report with the gate-failure tag. | 16:21 |
bauzas | #info STOP DOING BLIND RECHECKS aka. 'recheck' https://docs.openstack.org/project-team-guide/testing.html#how-to-handle-test-failures | 16:21 |
bauzas | voilà | 16:22 |
bauzas | I had nothing to report gate-wise | 16:22 |
bauzas | anything anyone ? | 16:22 |
bauzas | gibi: I don't feel exactly empowered with the baton y'know | 16:22 |
bauzas | ok, next topic then | 16:23 |
bauzas | #topic Release Planning | 16:23 |
bauzas | #link https://releases.openstack.org/zed/schedule.html | 16:23 |
bauzas | #info Zed-2 is in 4 weeks, mind your specs | 16:24 |
bauzas | as a reminder, we'll have a SpecApprovalFreeze on Zed-2 | 16:24 |
bauzas | fwiw, here is the current list of accepted blueprints, including specless ones : https://blueprints.launchpad.net/nova/zed | 16:25 |
bauzas | (I eventually updated it one hour before...) | 16:25 |
artom | Oh snap, that's in 2 weeks | 16:25 |
bauzas | no | 16:25 |
bauzas | July 14 | 16:25 |
artom | wtf brain | 16:25 |
bauzas | unless I'm counting wrong | 16:25 |
artom | Sorry, ignore me, carry on | 16:25 |
bauzas | that being said, as the clock ticks, next week we'll discuss a spec review day | 16:26 |
bauzas | just sayin | 16:26 |
bauzas | next topic | 16:26 |
bauzas | #topic OpenInfra Summit | 16:27 |
bauzas | lemme just do a quick wrap-up | 16:27 |
bauzas | #info bauzas, gibi and stephenfin attended the summit | 16:27 |
bauzas | #info Nova meet-and-greet Operators feedback session on Wednesday, June 8, 2:50pm - 3:20pm got positive feedback | 16:27 |
gibi | (there was beer) | 16:27 |
bauzas | we had a large audience | 16:27 |
sean-k-mooney | gibi: was it tasty | 16:27 |
chateaulav | fun times | 16:27 |
gibi | good beer | 16:27 |
bauzas | gibi: not during sessions tho | 16:27 |
gibi | yeah, bad timing | 16:27 |
sean-k-mooney | bauzas: that's good to hear. was the nova session well attended? | 16:27 |
bauzas | let's claim this was a productive session | 16:28 |
bauzas | sean-k-mooney: packed room | 16:28 |
sean-k-mooney | excelent | 16:28 |
artom | Full glass, full room, nice | 16:28 |
bauzas | I think this was well deserved, most of the operators thought we had been disconnected for a bit too long | 16:28 |
bauzas | and the PTG thing doesn't help | 16:28 |
bauzas | one outcome is at least a strong need for a nova recap at every possible gathering | 16:29 |
bauzas | at least they were expecting one at the Summit, but I have to admit I didn't make it | 16:30 |
bauzas | and the OpenInfra Live thing was on April | 16:30 |
bauzas | I guess only a few of them saw it | 16:30 |
sean-k-mooney | the project updates went live on youtube 2 weeks ago | 16:30 |
bauzas | I know | 16:30 |
sean-k-mooney | but I would assume many did not see it | 16:30 |
bauzas | that's the OpenInfra Live thing I mentioned | 16:30 |
bauzas | apparently, people pay more attention to cycle highlights when it's in-person | 16:31 |
sean-k-mooney | right, that may have been in April but the videos only got published on YouTube in June | 16:31 |
bauzas | anyway, something easily solvable | 16:31 |
bauzas | one other thing, communication | 16:32 |
bauzas | not a surprise, our ML isn't read | 16:32 |
bauzas | and given they lag a lot, they don't think this is a valuable time to chime in | 16:32 |
artom | Wait, so ops show up to Summit, but don't read the ML? How do they know when Summit is? ;) | 16:32 |
bauzas | (they lag by the number of releases) | 16:33 |
bauzas | artom: easy answer : Twitter | 16:33 |
sean-k-mooney | and infra foundation marketing | 16:33 |
gibi | yes, we were asked to tweet more | 16:33 |
bauzas | someone very seriously explained to me they'd prefer nova tweets | 16:33 |
artom | We as in the developers? o_O | 16:33 |
gmann | yeah summit info is communicated in many other ML and places not only openstack-discuss | 16:33 |
gibi | artom: yes, please :) | 16:33 |
melwitt | huh. | 16:33 |
sean-k-mooney | could we auto-tweet the release notes somehow | 16:33 |
dansmith | that's crazy, IMHO | 16:34 |
bauzas | sean-k-mooney: they know about our prelude | 16:34 |
bauzas | but again, they laaaag | 16:34 |
* sean-k-mooney remembers the April Fools "Twitter as a message bus" spec | 16:34 | |
artom | If they lag releases, what's the point of tweeting, presumably about stuff we're working on *now*? | 16:34 |
chateaulav | build interest and involvement | 16:35 |
bauzas | sean-k-mooney: I'm half considering to register a Twitter handle like @Yo_the_Openstack_Nova_gang | 16:35 |
gmann | chateaulav: +1 | 16:35 |
artom | But what if I'm an anti-social curmudgeon? | 16:35 |
bauzas | artom: heh | 16:35 |
bauzas | anyway, that one wasn't an easy problem to solve | 16:36 |
gmann | operator involvement with developers is one of the key open issues in board meetings too, and the TC also raised it to them. | 16:36 |
bauzas | fwiw, I proposed that they just register to the 'ops' and 'nova' ML tags | 16:36 |
gmann | one idea is to combine the ops meetup and the developer event, but let's see | 16:36 |
bauzas | both in conjunction | 16:36 |
chateaulav | i am too, you don't have to be a genius. just start with little things. if it goes to the ML and you think it's worthwhile, then tweet it and reference the ML archive and irc chat | 16:36 |
bauzas | gmann: please | 16:36 |
sean-k-mooney | gmann: the best way to address that would probably be to converge the events and bring back the devs summit | 16:37 |
bauzas | gmann: I feel the community is more fragmented now that we're split | 16:37 |
gmann | sean-k-mooney: +1 | 16:37 |
gmann | yeah, true | 16:37 |
bauzas | chateaulav: as I said, I begged them to correctly use the ML tags | 16:37 |
gibi | sean-k-mooney: +1 | 16:37 |
gmann | we got separated when we combined the things :) | 16:37 |
bauzas | and I ask people to *not* make use of [nova][ops] for communicating unless we agree here on the need to engage | 16:38 |
bauzas | #action nova team will only exceptionally make use of [nova][ops] for important communication to ops. If you're an Ops, feel free to register to both tags in the ML | 16:39 |
artom | Tbf, the ML lately seems to be openstack-support, anecdotally | 16:39 |
sean-k-mooney | not entirely | 16:39 |
dansmith | mostly | 16:39 |
bauzas | artom: alas, we merged openstack@ openstack-dev@ and openstack-ops@ | 16:39 |
sean-k-mooney | we do discuss gate issues and some dev issues | 16:39 |
artom | So maybe there is room for dev -> ops announcement-type stuff. | 16:40 |
bauzas | openstack@ was the place for troubleshooting | 16:40 |
artom | sean-k-mooney, right, I'm missing a "mostly" in there | 16:40 |
artom | @Nova_PTL twitter account? :D | 16:40 |
bauzas | anyway, there are things the nova team can solve and there are other things that are way out of our team scope :) | 16:40 |
sean-k-mooney | so we split the events and merged the lists. if only we had done the reverse :) | 16:40 |
sean-k-mooney | I'm not sure there is much we can do right now to address this topic | 16:41 |
bauzas | sean-k-mooney: correct and I want to move on | 16:41 |
bauzas | this wasn't an ask to find a solution, just a feedback | 16:41 |
artom | You can't tell us "plz tweet moar" and not expect the convo to derail :P | 16:41 |
bauzas | #link https://etherpad.opendev.org/p/r.ea2e9bd003ed5aed5e25cd8393cf9362 readonly etherpad of the meet-and-greet session | 16:41 |
bauzas | artom: I personally stopped tweeting except for exceptional opportunities, so I'm not the one to blame | 16:42 |
bauzas | now, back to productive things | 16:42 |
* artom never tweets, but always twit | 16:42 | |
bauzas | you'll see a long list of complaints | 16:42 |
sean-k-mooney | some of which have been addressed in newer releases | 16:43 |
bauzas | I encourage any of you to go read the etherpad and amend it (with the write URL of course) | 16:43 |
gmann | bauzas: any feedback on RBAC scope things if that is discussed in nova sessions also other than ops meetup ? | 16:43 |
bauzas | sean-k-mooney: yeah, I've seen you munging a lot of them, thanks | 16:43 |
bauzas | gmann: I asked about it, this was way too advanced for them | 16:44 |
bauzas | but I pointed them to the links to the new rules and personas | 16:44 |
bauzas | also, this was a 30-min session, | 16:44 |
gmann | bauzas: ok | 16:45 |
bauzas | so, please understand we were basically only able to scratch the surface | 16:45 |
gibi | gmann: we had another session around service roles | 16:45 |
sean-k-mooney | some of the pain points are on our backlog | 16:45 |
sean-k-mooney | so it's good that operators have validated that they still care about them | 16:45 |
gmann | bauzas: I understand, just checking in case any specific feedback we got from nova sessions | 16:45 |
bauzas | for the pain points, I'll diligently try to make sure all of them are addressed | 16:45 |
sean-k-mooney | I'm thinking of iothreads and virtio multiqueue | 16:45 |
bauzas | gmann: honestly this was frustrating | 16:46 |
gmann | gibi: ack. do you have any link of that, I will combine those to discuss in RBAC meeting next week | 16:46 |
gibi | gmann: my short summary on the service roles https://meetings.opendev.org/irclogs/%23openstack-nova/%23openstack-nova.2022-06-13.log.html#t2022-06-13T06:43:52 | 16:46 |
bauzas | give me 30 mins more and I could have made operators sign off on sending herds of contributors to the nova project | 16:46 |
gmann | gibi: thanks | 16:46 |
bauzas | don't be surprised if I'm pinging some of you | 16:47 |
bauzas | I want the etherpad to be curated | 16:47 |
gibi | gmann: and this is the session etherpad but it is a bit of a mess https://etherpad.opendev.org/p/deprivilization-of-service-accounts | 16:47 |
bauzas | gibi: this wasn't a mess | 16:47 |
bauzas | this was rather a prank, I guess | 16:47 |
gibi | then I was pranked :) | 16:48 |
bauzas | exactly my point | 16:48 |
bauzas | step 1 : propose a forum session | 16:48 |
bauzas | step 2: let gibi see it | 16:48 |
bauzas | step 3 : make sure gibi will attend it | 16:48 |
sean-k-mooney | in principle we should be able to label all endpoints that are used for inter-service communication as needing the service role | 16:48 |
bauzas | step 4: don't attend your own session and let gibi lead it instead | 16:48 |
bauzas | step 5 : profit. | 16:48 |
gibi | (just background: when I entered the room for that session I was cornered that there is nobody who can lead the session) | 16:49 |
sean-k-mooney | the service role really should not be able to access any api other than the inter-service apis | 16:49 |
sean-k-mooney | that would allow us to entirely drop our use of the admin role eventually | 16:50 |
gmann | sean-k-mooney: yes, that is the direction we are going, and there has to be a careful audit to verify this. | 16:50 |
bauzas | can we stop on the forum discussions ? | 16:50 |
sean-k-mooney | yes we can move on | 16:50 |
bauzas | anyone having a last question or remark ? | 16:50 |
bauzas | (just timeboxing, sorry) | 16:50 |
sean-k-mooney | no worries | 16:50 |
bauzas | #topic Review priorities | 16:50 |
bauzas | #link https://review.opendev.org/q/status:open+(project:openstack/nova+OR+project:openstack/placement+OR+project:openstack/os-traits+OR+project:openstack/os-resource-classes+OR+project:openstack/os-vif+OR+project:openstack/python-novaclient+OR+project:openstack/osc-placement)+label:Review-Priority%252B1 | 16:50 |
bauzas | #link https://review.opendev.org/c/openstack/project-config/+/837595 Gerrit policy for Review-prio contributors flag. Naming bikeshed in there. | 16:50 |
bauzas | #action bauzas to propose a revision of https://review.opendev.org/c/openstack/project-config/+/837595 | 16:51 |
bauzas | #link https://docs.openstack.org/nova/latest/contributor/process.html#what-the-review-priority-label-in-gerrit-are-use-for Documentation we already have | 16:51 |
bauzas | that's it on my side | 16:51 |
bauzas | I encourage cores to make use of the flag if they wish | 16:51 |
bauzas | #topic Stable Branches | 16:52 |
bauzas | elodilles: your time | 16:52 |
elodilles | #info stable/train is blocked - melwitt's fix: https://review.opendev.org/c/openstack/nova/+/844530/ | 16:52 |
elodilles | #info stable branch status / gate failures tracking etherpad: https://etherpad.opendev.org/p/nova-stable-branch-ci | 16:52 |
elodilles | release patches proposed (yoga, xena, wallaby): https://review.opendev.org/q/project:openstack/releases+is:open+intopic:nova | 16:52 |
sean-k-mooney | yep i was going to proceed with merging https://review.opendev.org/c/openstack/nova/+/844530 but wanted to ask if there were any objections | 16:52 |
sean-k-mooney | I have also commented on the release patches | 16:52 |
sean-k-mooney | most of the patches I wanted to land landed overnight | 16:53 |
bauzas | cool, I'll do a bit of reviews then | 16:53 |
bauzas | any other point to raise about stable ? | 16:53 |
elodilles | nothing else i think | 16:54 |
bauzas | cool | 16:54 |
elodilles | sean-k-mooney bauzas : thanks for looking at the release patches | 16:54 |
bauzas | last point then | 16:54 |
sean-k-mooney | elodilles: happy to | 16:54 |
bauzas | elodilles: I have to do it, sean-k-mooney told me already :) | 16:54 |
elodilles | :] | 16:54 |
bauzas | #topic Open discussion | 16:54 |
bauzas | there was nothing on the agenda | 16:54 |
bauzas | for the sake of those last 5 mins, any item to raise ? | 16:55 |
melwitt | bauzas: I realized it would be better if I did bugs this week bc if I'm out next week, that's even less time 😆 | 16:55 |
melwitt | I won't be at the next meeting but I can put my bug etherpad link on the agenda for yall | 16:55 |
bauzas | melwitt: I'm both flexible and ashamed | 16:55 |
bauzas | melwitt: pick anytime you want | 16:55 |
bauzas | and I'll do the overlap | 16:55 |
sean-k-mooney | the only item I was going to raise was releases. we had a request to do a stable release last week but that is proceeding anyway | 16:55 |
melwitt | bauzas: ok, I will do this week. sorry for the confusion | 16:55 |
bauzas | melwitt: np | 16:56 |
bauzas | I guess not a lot of people are reading our weekly meeting and even less of them do bug triage | 16:56 |
bauzas | weekly minutes* | 16:56 |
bauzas | but, not a reason for anarchy with no meetings and agenda ! :D | 16:57 |
bauzas | (and proper highlights) | 16:57 |
gibi | we should try to have our meeting on twitter ;) | 16:57 |
bauzas | OK, I guess we can call the wrap | 16:57 |
bauzas | gibi: I was surprised no one debated on the tool itself | 16:58 |
* sean-k-mooney looks side eyed at gibi | 16:58 | |
bauzas | I could instagram nice pictures of me coding | 16:58 |
bauzas | like, me outside coding | 16:58 |
bauzas | me inside in my office room | 16:58 |
melwitt | start a twitch channel | 16:58 |
sean-k-mooney | totally we should all just stream our coding on twitch :) | 16:58 |
bauzas | I'm feeling too old | 16:59 |
bauzas | but at least I'm happy to hear the TC be young-minded with Tik-Tok releases | 16:59 |
gibi | :D | 16:59 |
bauzas | on that last word, | 16:59 |
bauzas | #endmeeting | 16:59 |
opendevmeet | Meeting ended Tue Jun 14 16:59:59 2022 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 16:59 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/nova/2022/nova.2022-06-14-16.00.html | 16:59 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/nova/2022/nova.2022-06-14-16.00.txt | 16:59 |
opendevmeet | Log: https://meetings.opendev.org/meetings/nova/2022/nova.2022-06-14-16.00.log.html | 16:59 |
bauzas | oh, snap | 17:00 |
sean-k-mooney | bauzas: i would expect that from artom, not you :P | 17:00 |
bauzas | I haven't thanked you all | 17:00 |
gibi | thanks bauzas | 17:00 |
elodilles | yepp, thanks bauzas o/ | 17:00 |
artom | sean-k-mooney, I'm angry with myself for not thinking of that | 17:01 |
bauzas | artom: sean-k-mooney: can't wait for OpenInfra Live events on Tik-Tok | 17:01 |
* bauzas reads everything wrong | 17:02 | |
sean-k-mooney | can't wait to see powerpoint in portrait mode | 17:02 |
chateaulav | the funny thing is, there is a twitter profile: @OpenStackNova | 17:03 |
chateaulav | but its a usergroup.... | 17:03 |
sean-k-mooney | bauzas: isn't there an announcement feature in the channel bot | 17:03 |
bauzas | chateaulav: NOOOOO | 17:05 |
gibi | we can always try TheRealOpenStackNova :) | 17:06 |
chateaulav | pretty sure that can be revoked or something as it is an actual organization project name | 17:06 |
chateaulav | gibi: true, sounds better too | 17:06 |
bauzas | or rather: https://tenor.com/view/luke-skywalker-no-star-wars-mark-hamill-hanging-on-gif-11368455 | 17:06 |
sean-k-mooney | im pretty sure https://twitter.com/OpenStackStatus is tied into irc status mesages | 17:07 |
sean-k-mooney | bauzas: if we really wanted to get stuff on twitter and make announcements we probably could have a similar feed for nova or the community in general | 17:07 |
sean-k-mooney | and allow ptls to make announcements via the bot | 17:07 |
bauzas | like I said, I was surprised no one argued about Twitter, with the whole group debating for 10 mins about Mastodon vs. Twitter | 17:08 |
chateaulav | definitely, then you don't have to tweet. just irc as normal | 17:08 |
* bauzas wonders if people would hear more about us if I was playing in Geordie Shore | 17:10 | |
bauzas | or Jersey Shore | 17:10 |
opendevreview | Artom Lifshitz proposed openstack/nova master: libvirt: remove default cputune shares value https://review.opendev.org/c/openstack/nova/+/824048 | 17:23 |
kpdev | Hi, on a freshly installed | 17:46 |
kpdev | ubuntu 20.04, I see tox fail on stable/xena | 17:46 |
kpdev | lots of workarounds are mentioned in threads, e.g. use setuptools==58.0.0 | 17:47 |
kpdev | but they still don't resolve the issue. Anyone faced and solved a similar issue in the past ? | 17:48 |
kpdev | Collecting suds-jurko>=0.6 Using cached suds-jurko-0.6.zip (255 kB) Preparing metadata (setup.py): started Preparing metadata (setup.py): finished with status 'error' error: subprocess-exited-with-error × python setup.py egg_info did not run successfully. │ exit code: 1 ╰─> [1 lines of output] error in suds-jurko setup command: use_2to3 is invalid. [end of output] note: This er | 17:48 |
sean-k-mooney | potentially dumb question but has a hw:bfv extra spec to enable the boot from volume workflow via flavor ever come up | 18:27 |
sean-k-mooney | I know the idea of a cinder images backend was discussed and then never implemented | 18:28 |
sean-k-mooney | but what about hw:bfv=True hw:bfv_type=mass-storage-volume-type hw:bfv_delete_on_terminate=True | 18:29 |
sean-k-mooney | i was just looking at https://etherpad.opendev.org/p/r.ea2e9bd003ed5aed5e25cd8393cf9362#L235 | 18:29 |
sean-k-mooney | and that seemed to be a way to achieve that without having to add a cinder images backend | 18:30 |
sean-k-mooney | it would allow operators to define flavors that always use cinder remote storage, and they could define the size with the normal disk value | 18:31 |
sean-k-mooney | just something to think about | 18:32 |
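[editor's note] To make the proposal concrete: the hw:bfv* keys above are sean-k-mooney's hypothetical extra specs, not a real nova interface. A sketch of how such flavor extra specs could be translated into the boot-from-volume block device mapping that nova's API already accepts (the helper name and translation logic are purely illustrative):

```python
def bfv_bdm_from_flavor(extra_specs, image_ref, disk_gb):
    """Translate the proposed (hypothetical) hw:bfv* flavor extra specs
    into a boot-from-volume block device mapping dict. None of these
    keys exist in nova today; this only illustrates the idea."""
    if extra_specs.get("hw:bfv") != "True":
        return None  # normal image-backed boot, nothing to do
    return {
        "boot_index": 0,
        "source_type": "image",
        "destination_type": "volume",
        "uuid": image_ref,
        "volume_size": disk_gb,  # reuse the flavor's normal disk value
        "volume_type": extra_specs.get("hw:bfv_type"),
        "delete_on_termination":
            extra_specs.get("hw:bfv_delete_on_terminate") == "True",
    }

specs = {"hw:bfv": "True",
         "hw:bfv_type": "mass-storage-volume-type",
         "hw:bfv_delete_on_terminate": "True"}
print(bfv_bdm_from_flavor(specs, "my-image-uuid", 40))
```

The appeal of the idea is visible in the sketch: the operator sets three flavor keys once, and every boot against that flavor becomes boot-from-volume with a fixed volume type, with no user-supplied block device mapping needed.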
*** dasm is now known as dasm|off | 21:02 | |
zigo | Nova (from Yoga) fails with jsonschema 4.6.0: https://ci.debian.net/data/autopkgtest/unstable/amd64/n/nova/22676760/log.gz | 22:44 |
zigo | It'd be nice if someone could investigate. | 22:44 |
sean-k-mooney | zigo: that's probably because upper-constraints for yoga is 3.2.0 https://github.com/openstack/requirements/blob/stable/yoga/upper-constraints.txt#L583= | 22:46 |
sean-k-mooney | zigo: so that is not tested/supported on yoga | 22:46 |
zigo | sean-k-mooney: It's not tested anywhere, not even on Master. | 22:46 |
zigo | And that's my point... :) | 22:46 |
sean-k-mooney | then we need to unpin it on master and fix issues there | 22:47 |
sean-k-mooney | but we dont normally backport that support to stable branches | 22:47 |
zigo | If we have a patch in master, that's enough for me (very often, I push backports of this kind of patch as Debian only patches...). | 22:48 |
sean-k-mooney | I don't see where that is being pinned | 22:49 |
sean-k-mooney | it's in upper-constraints | 22:49 |
sean-k-mooney | but that is auto-generated when there are pypi releases | 22:49 |
sean-k-mooney | https://github.com/openstack/requirements/search?q=jsonschema | 22:49 |
sean-k-mooney | so it's not clear why it is still on 3.2.0 | 22:50 |
sean-k-mooney | 4.6 is the most recent release https://pypi.org/project/jsonschema/ | 22:50 |
sean-k-mooney | 3.2 is from November 18th 2019 | 22:51 |
sean-k-mooney | we should probably bring this up on #openstack-qa or infra | 22:51 |
gmann | jsonschema 4.6.0 will be pretty new for nova and we might need to test properly not just nova but all project using it before we bump it from 3.2.0 to 4.6.0 | 22:59 |
sean-k-mooney | gmann: I'm more concerned by the fact that we have not updated this in 2 years | 23:00 |
sean-k-mooney | there were plenty of releases in the 4.x series | 23:00 |
sean-k-mooney | that were not automatically updated in requirements | 23:00 |
sean-k-mooney | 4.6.0 came out in June but there were plenty of other releases since November 2019 | 23:01 |
sean-k-mooney | that should have been picked up in xena and yoga | 23:01 |
gmann | yeah. we might not need the new things, as we are also using Draft4Validator which is enough as per nova's needs. but yes, before bumping to the new version we should also bump Draft4Validator to Draft7Validator | 23:04 |
gmann | and with proper testing, as we need to check backward compatibility rather than any new features, which we do not need as such | 23:04 |
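[editor's note] One concrete backward-compatibility break to audit when bumping Draft4Validator to Draft7Validator: from JSON Schema draft 6 onward, `exclusiveMaximum`/`exclusiveMinimum` changed from boolean modifiers of `maximum`/`minimum` into standalone numeric bounds. A stdlib-only sketch of converting the draft-4 form (the helper is illustrative, not part of nova or the jsonschema library):

```python
def draft4_to_draft7(schema):
    """Rewrite the draft-4 boolean exclusiveMaximum/exclusiveMinimum
    form into the draft-7 numeric form. Illustrative helper showing one
    known incompatibility between the two drafts."""
    out = dict(schema)
    for excl, bound in (("exclusiveMaximum", "maximum"),
                        ("exclusiveMinimum", "minimum")):
        if isinstance(out.get(excl), bool):
            # Draft 4: {"maximum": 10, "exclusiveMaximum": true}
            # Draft 7: {"exclusiveMaximum": 10}
            if out.pop(excl) and bound in out:
                out[excl] = out.pop(bound)
    # Recurse into property sub-schemas.
    if "properties" in out:
        out["properties"] = {k: draft4_to_draft7(v)
                             for k, v in out["properties"].items()}
    return out

d4 = {"type": "integer", "maximum": 10, "exclusiveMaximum": True}
print(draft4_to_draft7(d4))  # {'type': 'integer', 'exclusiveMaximum': 10}
```

A real audit would also cover the other draft changes (e.g. `$id` replacing `id`, new keywords like `const`), which is exactly the "proper testing" gmann is asking for.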
sean-k-mooney | gmann: I'm more worried that this is not the only lib that was not updated | 23:06 |
gmann | that is possible, I am not 100% sure how constraints generation work for non openstack deps on their new release. prometheanfire in requirement channel can tell us if there is bug in requirements scripts | 23:10 |
sean-k-mooney | gmann: I think it's meant to be triggered periodically, pulling the latest release from pypi | 23:11 |
sean-k-mooney | I'm trying to run it manually with tox -e generate -- -p $(which python3) -r ./global-requirements.txt | 23:12 |
sean-k-mooney | I'm getting failures in the command execution | 23:13 |
sean-k-mooney | error in anyjson setup command: use_2to3 is invalid. | 23:13 |
sean-k-mooney | https://zuul.openstack.org/job/propose-update-constraints | 23:14 |
sean-k-mooney | I think that is meant to do it | 23:14 |
sean-k-mooney | https://zuul.openstack.org/builds?job_name=propose-update-constraints | 23:15 |
sean-k-mooney | that is running fine; I don't see anything that calls generate | 23:15 |
sean-k-mooney | hmm, looks like it's defined in project-config | 23:17 |
sean-k-mooney | https://opendev.org/openstack/project-config/src/branch/master/playbooks/proposal/propose_update.sh#L31-L42 | 23:18 |
sean-k-mooney | more or less the same issue with anyjson if i do | 23:21 |
sean-k-mooney | .venv/bin/generate-constraints -b blacklist.txt -p python3.8 -r global-requirements.txt > upper-constraints.txt | 23:21 |
sean-k-mooney | which is basically what the job does | 23:21 |
sean-k-mooney | it gets much further if I comment out anyjson, but I'm missing some bindeps which I can't install on my laptop | 23:26 |
sean-k-mooney | I'll try and run this in a devstack env tomorrow | 23:26 |
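[editor's note] Conceptually, the generate-constraints step sean-k-mooney is debugging installs global-requirements into a fresh venv and freezes the result into `===` pins, skipping blacklisted projects (the `-b blacklist.txt` flag above). A rough stdlib-only sketch of just the freeze-to-constraints formatting step (the helper name and its handling are illustrative; the real tool lives in openstack/requirements):

```python
def to_constraints(frozen, blacklist=()):
    """Turn 'pip freeze'-style lines into upper-constraints pins,
    skipping blacklisted projects -- a rough sketch of the last step of
    generate-constraints after global-requirements has been installed
    into a venv."""
    skip = {name.lower() for name in blacklist}
    out = []
    for line in frozen:
        name, _, version = line.partition("==")
        if name.lower() in skip or not version:
            continue  # blacklisted projects and editable installs are dropped
        out.append(f"{name}==={version}")
    return out

frozen = ["jsonschema==3.2.0", "anyjson==0.3.3", "greenlet==1.1.2"]
print(to_constraints(frozen, blacklist=["anyjson"]))
# ['jsonschema===3.2.0', 'greenlet===1.1.2']
```

This also shows why one broken sdist like anyjson can block the whole refresh: if the install step fails, there is no freeze output to pin, and upper-constraints stays stale for every library, not just the broken one.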
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!