Tuesday, 2021-03-09

<yonglihe> gibi: thanks for raising the VIC type concerns, I tested by wiring that in a local repo. I'll try to verify Rodolfo's patch today: https://review.opendev.org/c/openstack/neutron/+/779292  01:17
<gmann> stephenfin: gibi updated the policy patch for modernize-os-hypervisors-api BP https://review.opendev.org/c/openstack/nova/+/765798/5  01:31
<yonglihe> gibi: alex_xu: FYI, I've talked with ralonsoh and slaweq; that neutron patch is likely to merge today: https://review.opendev.org/c/openstack/neutron/+/779292  08:52
<lyarwood> melwitt: ack np looking  09:01
<lyarwood> elod: https://review.opendev.org/c/openstack/nova/+/777217/ - I think you forgot to vote on this?  09:04
<elod> lyarwood: wow. apparently. thx. :S  09:20
<lyarwood> elod: np, I've done that lots with the new UI  09:20
<elod> yeah, let's blame that o:) I don't know how I missed voting :S sorry :S  09:22
<stephenfin> oh, no Gerrit bot?  09:48
<stephenfin> trivial fix to address some issues coming down the pipeline with setuptools here: https://review.opendev.org/c/openstack/nova/+/779449  09:48
<jrosser> could I get some help with this: debugging why I can't rescue boot-from-volume images https://opendev.org/openstack/nova/src/branch/master/nova/api/openstack/compute/rescue.py#L60  09:52
<jrosser> if I print req.api_version_request I get "API Version Request Major: 2, Minor: 1"  09:53
<jrosser> that seems like it will always fail when passed to api_version_request.is_supported(req, '2.87')  09:53
<stephenfin> jrosser: How are you making the request?  09:55
<jrosser> stephenfin: either through horizon or `openstack server rescue --image 14600a9b-a240-4210-8a32-cea43d0499ac 4f2b0a96-7b6b-4d01-bb75-1248574e71d6`  09:57
<stephenfin> you need to request API microversion 2.87, as that code would suggest  09:57
<stephenfin> openstack --os-compute-api-version 2.87 server rescue ...  09:57
<stephenfin> should do the trick  09:57
<stephenfin> OSC currently defaults to 2.1. We're working on fixing that  09:58
<stephenfin> I don't know how to do the equivalent in Horizon  09:58
<jrosser> ooooohhhh - well that explains it!  09:58
<jrosser> perhaps that warrants a docs fix here: https://docs.openstack.org/nova/latest/user/rescue.html  09:59
<stephenfin> Yes, good point  09:59
<stephenfin> lyarwood: Maybe if you have time in the next few weeks? ^  09:59
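
Spelling out stephenfin's fix above: OSC speaks microversion 2.1 by default, and rescuing a boot-from-volume server needs at least 2.87. A minimal sketch, reusing the IDs from jrosser's command:

    # request microversion 2.87 explicitly for a single command...
    openstack --os-compute-api-version 2.87 server rescue \
        --image 14600a9b-a240-4210-8a32-cea43d0499ac \
        4f2b0a96-7b6b-4d01-bb75-1248574e71d6
    # ...or set it once for the whole session
    export OS_COMPUTE_API_VERSION=2.87
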
<jawad_axd> Hi folks, how is the LUKS password generated for an instance booted from an encrypted volume? Barbican is used in the environment. I can grab the LUKS password with `virsh secret-get-value SECRET-UUID | base64 --decode` when the instance is running on a compute node. How can I generate this password on my own? The use case is: an instance is backed up somewhere outside and needs to boot standalone, so it needs the LUKS password. Any comments on this? Thanks  10:11
<jrosser> jawad_axd: there was a thread on the mailing list about this recently http://lists.openstack.org/pipermail/openstack-discuss/2021-February/020374.html  10:15
<jawad_axd> @jrosser I missed that one. I'll look into it. Thanks  10:17
<lyarwood> stephenfin: sorry was afk, reading  10:21
<lyarwood> huh how did I miss that  10:23
<lyarwood> stephenfin: https://review.opendev.org/c/openstack/nova/+/779479  10:32
<lyarwood> jawad_axd: hey sorry reading  10:32
<lyarwood> jawad_axd: yeah that post on the ML spells out how to grab the passphrase from the key manager, it's awful I know but thanks to design choices made before my time :/  10:33
<lyarwood> jawad_axd: http://lists.openstack.org/pipermail/openstack-discuss/2021-February/020421.html specifically lists the steps  10:34
jawad_axd@lyarwood Thanks. Apparently, it requires volume to to be mapped. I am wondering if there is an easy way to decrypt/decode it from barbican cli, when encrypted volume exists in environment and is in in AVAILABLE state.a10:42
<lyarwood> jawad_axd: No, you can't stream/decode an unmapped encrypted volume.  10:44
<lyarwood> jawad_axd: unlike Glance, Cinder doesn't send actual volume data via the API  10:45
<lyarwood> jawad_axd: so you will need to map the encrypted volume to a host before being able to decrypt it  10:45
<lyarwood> jawad_axd: as the owner of the volume you could snapshot it and download the encrypted volume snapshot from glance I guess  10:46
<lyarwood> jawad_axd: that would give you a local file you could then decrypt but it's not the live volume  10:46
<gibi> stephenfin: question in https://review.opendev.org/c/openstack/nova/+/779449/1/setup.cfg#11  10:47
<jawad_axd> @lyarwood Ok.. if I understand correctly, the LUKS header is attached to the disk (rbd in my case) when we create an instance from an encrypted volume, is that right? My understanding is that when we create an encrypted volume from an image, it attaches the LUKS header to the volume at that time; if that's true then it should be possible to retrieve the LUKS password when it is not mapped.  10:57
<lyarwood> jawad_axd: the header is always attached, but the actual passphrase is stored outside of that in the key manager and doesn't require access to the volume or image  10:59
<lyarwood> jawad_axd: to be clear, fetching the passphrase and decrypting data are two separate steps  11:01
<lyarwood> jawad_axd: you don't need access to the volume to fetch the passphrase from the key manager  11:01
<lyarwood> jawad_axd: you do need access to the volume or a downloaded image to decrypt the actual data held within them  11:01
<jawad_axd> @lyarwood I am actually concerned about the passphrase here. How to get it from the key manager..  11:02
<lyarwood> jawad_axd: kk then it's this part  11:03
<lyarwood> openstack secret get --payload_content_type 'application/octet-stream' http://192.168.122.208/key-manager/v1/secrets/6fd4f879-005d-4b7d-9e5f-2505f010be7c --file mysecret.key  11:03
<lyarwood> jawad_axd: ^ where the URL is provided by `openstack secret list` and lists your secret UUID  11:03
* lyarwood forgets if cinder lists the secret uuid somewhere  11:03
<lyarwood> ah yeah it does  11:04
<lyarwood> https://docs.openstack.org/api-ref/block-storage/v3/index.html?expanded=show-a-volume-s-details-detail#show-a-volume-s-details  11:04
<lyarwood> if you do an `openstack volume show $volume`  11:04
<lyarwood> encryption_key_id is the secret uuid  11:04
<lyarwood> jawad_axd: `hexdump -e '16/1 "%02x"' mysecret.key` will then give you the passphrase used to unlock the LUKS header  11:05
<lyarwood> which is awful  11:05
<lyarwood> but here we are  11:05
<lyarwood> jawad_axd: FWIW by default only the owner of the volume should have access to this secret  11:06
<jawad_axd> @lyarwood Yesss... I just tested and got the passphrase from `hexdump -e '16/1 "%02x"' mysecret.key`. Very cool. Thanks a lot.  11:12
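
Pulling lyarwood's steps together into one sequence (a sketch; the volume name is illustrative and the key-manager URL will differ per cloud):

    # 1. find the secret UUID for the volume's encryption key
    openstack volume show myvolume -f value -c encryption_key_id
    # 2. fetch the raw key payload (the secret href comes from `openstack secret list`)
    openstack secret get --payload_content_type 'application/octet-stream' \
        http://192.168.122.208/key-manager/v1/secrets/6fd4f879-005d-4b7d-9e5f-2505f010be7c \
        --file mysecret.key
    # 3. hex-encode the payload; the resulting string is the LUKS passphrase
    hexdump -e '16/1 "%02x"' mysecret.key
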
<stephenfin> gibi: Done  12:50
<jrosser> stephenfin: thank you for the API version tip before, I am now able to put a boot-from-volume instance into rescue  12:52
<stephenfin> hth  12:52
<jrosser> it doesn't quite behave as I expect though :)  12:53
<jrosser> I seem to get the original instance root disk as the root disk in the rescue instance  12:53
<jrosser> this is the xml I get for the disks http://paste.openstack.org/show/803376/  12:54
<lyarwood> jrosser: yup that's by design, it's using stable device rescue so the original disks are presented first  13:05
<lyarwood> jrosser: with the rescue disk last  13:05
<jrosser> but I would expect it to boot off the USB disk in that case?  13:06
<lyarwood> jrosser: note the boot order element, it boots from the rescue disk  13:06
<gibi> stephenfin: thanks +2  13:06
<jrosser> yeah, so with hw_rescue_bus=usb I don't see it boot from the usb device  13:06
<lyarwood> jrosser: is it the same image in the volume and rescue image?  13:07
<jrosser> it should be, +/- the snapshots in the ceph backend  13:07
<lyarwood> jrosser: right so are you sure it's booting from the original volume?  13:07
<lyarwood> this caught me out a few times while testing this  13:08
<jrosser> yes, it's confusing, one moment  13:08
<lyarwood> hmm actually I wonder if this is a valid bug when using a different bus  13:09
<jrosser> ok before I check that, I had another thing with hw_rescue_bus=scsi  13:09
<lyarwood> is there the original boot element higher in the XML?  13:09
<jrosser> in the instance console bus=scsi gives "no bootable device"  13:09
* jrosser puts this back to USB  13:10
<jrosser> lyarwood: I've just redone it with bus=usb http://paste.openstack.org/show/803378/  13:14
<sean-k-mooney> lyarwood: we don't really support mixing buses properly  13:14
<sean-k-mooney> I don't think we generate the controller properly in all cases  13:14
<lyarwood> sean-k-mooney: the instance was already using SCSI in the bus=scsi case so that should work  13:15
<lyarwood> sean-k-mooney: and bus=usb worked for me while I was testing this back in the day with virtio disks attached  13:16
<sean-k-mooney> yep but I think we have some edge cases with scsi and virtio-blk  13:16
<lyarwood> yeah  13:16
<sean-k-mooney> usb would likely have worked because the usb controller was previously always added by libvirt  13:17
<sean-k-mooney> I have seen issues where the scsi controller was not always added but that might have been fixed by now  13:17
<lyarwood> jrosser: just building an env now to play with this  13:17
<jrosser> lyarwood: oh cool, thank you :)  13:18
<lyarwood> jrosser: oh wait, does this reproduce if you use a different image?  13:18
<jrosser> I can try that  13:18
<lyarwood> yeah please, it might be that the rescue disk is finding the original disk first and mounting it as the root filesystem as the labels match  13:19
<lyarwood> so we are booted into the kernel from the rescue disk using the filesystem from the original  13:19
<jrosser> also somewhat contrary to the stuff right at the end of here, leaving --image off does some fail-y thing I've not yet found: https://docs.openstack.org/nova/latest/user/rescue.html  13:20
<lyarwood> hmm with a boot from volume instance it should try to boot from the original image referenced by the volume, if one is present. I forget what the behaviour is if an image wasn't used to create the volume.  13:22
* lyarwood grabs some lunch while devstack builds  13:24
<jrosser> from the wording I'd inferred that no --image would make it use the 'default', i.e. the one from the instance being rescued  13:24
<sean-k-mooney> do we support rescue for BFV?  13:28
<sean-k-mooney> we didn't for a long time  13:28
<sean-k-mooney> I think it was added in the last 2-3 cycles but can't recall if it landed  13:28
<jrosser> ussuri I think  13:28
<sean-k-mooney> ya, I know we still have no support for rebuild with bfv  13:29
<jrosser> ah that's interesting, changing the rescue image to one != the original instance makes things work a whole lot better  13:30
<sean-k-mooney> jrosser: the image used for rescue, if you don't pass an image, is the image used to boot the vm, unless a rescue image is set in nova.conf: https://docs.openstack.org/nova/latest/configuration/config.html#libvirt.rescue_image_id  13:30
<jrosser> I rescued a focal vm with a bionic image and that's now as I expect  13:30
<sean-k-mooney> jrosser: yep it normally does  13:30
<sean-k-mooney> it should work in both cases but you often don't have the same disk label issues  13:31
<jrosser> there's a bunch of trap doors for the unwary here :)  13:31
<sean-k-mooney> for what it's worth I have generally not had issues with this. it's typically just worked  13:32
<sean-k-mooney> even before the stable rescue work  13:32
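
For reference, the option sean-k-mooney links can pin a cloud-wide default rescue image so the instance's own image is never used as the fallback. A minimal sketch, reusing an image UUID from earlier in the log purely as an illustration:

    # on the compute node, for the libvirt driver
    cat >> /etc/nova/nova.conf <<'EOF'
    [libvirt]
    # used when a rescue request does not pass --image
    rescue_image_id = 14600a9b-a240-4210-8a32-cea43d0499ac
    EOF
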
<jrosser> here's what I get if I don't pass --image http://paste.openstack.org/show/803379/  13:35
<lyarwood> jrosser: kk that's a bug  13:36
<lyarwood> jrosser: but glad the original issue is resolved at least  13:36
<lyarwood> https://github.com/openstack/nova/blob/31889ce296d1e1a62fe5825292479009118ddfab/nova/compute/manager.py#L4123-L4130 doesn't look right  13:38
<jrosser> lyarwood: would you expect hw_rescue_bus=scsi to work?  13:38
<lyarwood> jrosser: with a different image and an instance that already had a disk attached via SCSI, yes  13:39
<jrosser> feels like something else there as I get "No Bootable device" in the instance console  13:39
<lyarwood> jrosser: would you mind raising a bug for that and the API error above when --image is missing?  13:44
<lyarwood> I'm not sure about the SCSI failure to find a boot device tbh  13:44
<lyarwood> unless again it's something weird with the image  13:44
<sean-k-mooney> jrosser: which image did you set the rescue bus on?  13:44
<jrosser> the one I'm specifying with --image  13:45
<sean-k-mooney> if you don't specify it, it would have to be on the original image  13:45
<sean-k-mooney> ya ok, that should be the image whose metadata we use  13:45
<jrosser> in this case it turns out those options are set on both images I've tried as the rescue image now  13:46
<jrosser> one of which is the original instance image  13:47
<jrosser> lyarwood: bug reports done  14:13
<jrosser> thank you again for your help, I have something usable now with the usb bus and understanding the need for a different rescue image  14:14
<lyarwood> jrosser: np and thanks for the bugs, I'll try to get them resolved after feature freeze later this week  14:17
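
The image-side property driven throughout this thread comes from the stable device rescue work: hw_rescue_bus is set on the image used for the rescue. A sketch, with an illustrative image name:

    openstack image set --property hw_rescue_bus=usb my-rescue-image
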
<sean-k-mooney> lyarwood: by the way, do you know why the volume detach is sometimes failing in the live migration job?  14:36
<sean-k-mooney> looks like it's hitting nova.exception.DeviceDetachFailed: Device detach failed for vdb: Unable to detach the device from the live config.  14:40
<lyarwood> sean-k-mooney: no, I've been trying to push gibi's rework along to see if that resolves it tbh  14:41
<sean-k-mooney> http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22Unable%20to%20detach%20the%20device%20from%20the%20live%20config%5C%22%20AND%20loglevel%3A%20ERROR  14:42
<lyarwood> sean-k-mooney: there's nothing obvious in the logs but I wonder if it could be related to https://bugs.launchpad.net/cinder/+bug/1917750  14:43
<openstack> Launchpad bug 1917750 in Cinder "Running parallel iSCSI/LVM c-vol backends is causing random failures in CI" [Undecided,New]  14:43
<sean-k-mooney> ~300 hits in 30 days  14:43
<sean-k-mooney> I'm seeing some libvirt issues on the controller too, not the node with the detach issue  14:43
<lyarwood> do you have an example to hand?  14:44
<sean-k-mooney> em, it's hitting my vdpa patches but also neutron, let me get one  14:45
<sean-k-mooney> so ya https://review.opendev.org/c/openstack/nova/+/778350/4 https://zuul.opendev.org/t/openstack/build/fb643b53835341ac8589afeadfa7044d/logs  14:47
* lyarwood cracks knuckles  14:48
<sean-k-mooney> it's showing up in neutron too https://review.opendev.org/c/openstack/neutron/+/777785  14:49
<sean-k-mooney> in https://zuul.opendev.org/t/openstack/build/32f3dd64008b469eb9fb8b13ed33f137  14:49
<sean-k-mooney> so I think this is just an issue with master in general  14:50
<sean-k-mooney> it could be related to https://bugs.launchpad.net/cinder/+bug/1917750, maybe, haven't looked at it yet  14:50
<openstack> Launchpad bug 1917750 in Cinder "Running parallel iSCSI/LVM c-vol backends is causing random failures in CI" [Undecided,New]  14:50
<lyarwood> sean-k-mooney: yeah I think it's related  14:57
<lyarwood> gah  15:01
<lyarwood> yeah it's that  15:01
<lyarwood> so we end up in a situation where one test attaches a volume to the host as /dev/sda from c-vol LVM/iSCSI backend #1  15:02
<lyarwood> another test then attaches another volume with the same WWN to the host as /dev/sdb from c-vol LVM/iSCSI backend #2  15:02
<lyarwood> the first test finishes and removes what it thinks is the first volume attachment  15:03
<lyarwood> but it's actually the second  15:03
<lyarwood> leaving libvirt unable to detach the device from the instance as I assume it can't flush  15:03
<lyarwood> I guess our current code swallows or ignores the failure from libvirt and just retries?  15:04
<sean-k-mooney> probably, ya  15:04
<sean-k-mooney> we don't use the events yet  15:04
<sean-k-mooney> so we need to revert running those in parallel  15:04
<sean-k-mooney> although is this not an os-brick bug?  15:05
<sean-k-mooney> I mean the concurrent attach should work  15:05
<lyarwood> yeah it's a job topology bug  15:05
<lyarwood> it `works` but the first test will end up sending I/O to the volume from the second test  15:05
<lyarwood> something has changed recently somewhere in the stack to allow both backends to return the same WWN tbh  15:06
<bauzas> stephenfin: apologies for forgetting that a string is immutable  15:06
* bauzas hides  15:06
<lyarwood> unless people have always missed these failures  15:06
<sean-k-mooney> lyarwood: would that not be possible though in general  15:06
<sean-k-mooney> for them to have the same WWN  15:06
<lyarwood> not with a real world backend  15:07
<lyarwood> https://en.wikipedia.org/wiki/World_Wide_Name  15:08
<sean-k-mooney> lyarwood: ya, I thought if you have iSCSI portals from different storage backends the WWN was only unique within any given portal  15:12
<sean-k-mooney> lyarwood: I don't think cinder is choosing the WWN for the volume  15:13
<lyarwood> right yeah that would make sense  15:16
* lyarwood builds a test env to prove wtf is going on here  15:21
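
A quick way to spot the symptom lyarwood describes, assuming the standard udev by-id symlinks: two different sdX devices resolving through the same wwn- link means two backends handed out the same WWN.

    ls -l /dev/disk/by-id/ | grep wwn-
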
<stephenfin> bauzas: nw. I've replied to the rest of your comments also  15:22
<bauzas> yeah I'll review again  15:42
<gibi> lyarwood: do I understand correctly from the scrollback that making some cinder tests serial would help merging patches?  15:42
<lyarwood> gibi: removing c-vol from the computes would  15:42
<lyarwood> gibi: if we really need to run another c-vol service then it needs to be on the controller alongside the original  15:43
<lyarwood> gibi: just building an env now to prove things  15:43
<stephenfin> lyarwood, gibi, sean-k-mooney: any aversion to adding mypy to pre-commit?  15:43
<sean-k-mooney> not specifically, no  15:43
<stephenfin> assuming I can do it while respecting mypy-files  15:44
<lyarwood> stephenfin: against the files that are changing?  15:44
<gibi> lyarwood: thanks for looking into that, sign me up for review when there is something I can push  15:44
<sean-k-mooney> I have been bitten by flake8 passing but pep8 not  15:44
<stephenfin> yeah  15:44
<gibi> lyarwood: is having c-vol along with n-cpu an invalid config?  15:44
<lyarwood> stephenfin: no issues assuming it isn't adding a huge amount of delay  15:44
<sean-k-mooney> I have gotten used to relying on pre-commit to do that for me  15:44
<stephenfin> sean-k-mooney: me too :( I ran it on HEAD but it turns out I broke something then fixed it in the next change  15:44
<stephenfin> me too * 2 :)  15:44
<stephenfin> pre-commit FTW  15:44
<gibi> stephenfin: no problem for me, I don't use pre-commit :)  15:44
<stephenfin> gibi: You're missing out. It's pretty great :)  15:45
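
What "adding mypy to pre-commit" might look like; a sketch using the pre-commit/mirrors-mypy hook, not necessarily how the nova change is wired up:

    pip install pre-commit
    cat >> .pre-commit-config.yaml <<'EOF'
      - repo: https://github.com/pre-commit/mirrors-mypy
        rev: v0.812
        hooks:
          - id: mypy
    EOF
    pre-commit install               # run the hooks on every git commit
    pre-commit run mypy --all-files  # or run the hook once across the tree
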
<lyarwood> gibi: no, the issue here is with the default c-vol backend, LVM/iSCSI, that you can't have >1 running version of in an env AFAICT  15:45
* bauzas disappears for around 1.5h  15:45
<gibi> stephenfin: I do run fast8, py39 and functional-py39 on every commit I push (except on old stable branches where there is no py39 or py38 support)  15:46
<bauzas> wife is sick and I need to taxi her to the doctor  15:46
<lyarwood> gibi: we end up mapping different volumes from different backends to the computes with the same WWN (world wide name) that should be unique  15:46
<gibi> lyarwood: so we cannot have multiple backends per compute?  15:46
<stephenfin> gibi: That's better than me. I run what I think are relevant tests and then let the CI do the rest. Our tests take too long to run locally  15:46
<lyarwood> gibi: we can't have multiple LVM/iSCSI backed c-vols in the same env  15:46
* stephenfin might do some work on that in Xena  15:47
<gibi> stephenfin: I have a beefy blade in a lab to run unit and func tests (and devstack on baremetal to test sriov)  15:47
* lyarwood has some _unit_last and _func_last commands that run unit and func for any test files touched  15:47
<lyarwood> it's not great but better than nothing  15:47
<stephenfin> I've a four year old laptop 0:)  15:47
<lyarwood> refresh is 3 soooooooooooo ;)  15:48
<gibi> still, a laptop is slow, you should ask for a lab :)  15:48
* stephenfin might not have managed to switch everything over yet...  15:48
<gibi> lyarwood: so in a real deployment there can be only one c-vol service using the LVM/iSCSI backend?  15:49
<gibi> lyarwood: sorry that I'm slow to understand this :)  15:49
<lyarwood> gibi: I don't think that has ever been enforced, but the LVM/iSCSI backend is only ever used for testing  15:50
<gibi> OK, then I stop worrying :)  15:50
<gibi> it is just test :)  15:50
<lyarwood> I'll just caveat all of the above with the fact that no one from Cinder has agreed with any of that in the bug as yet  15:52
<lyarwood> so I might be missing the point entirely  15:52
<lyarwood> but two volumes with the same WWN connected to the same host smells like something that will bork devicemapper  15:52
<gibi> if it fixes the CI and lets us merge the api db compaction before FF then I'm happy to take the hit if we need to revert the change later  15:53
<gibi> :)  15:53
<lyarwood> gibi: is that stuck in a recheck loop?  15:53
<gibi> pretty much yes  15:54
<lyarwood> gah okay  15:54
<lyarwood> let me confirm and then I can push some changes  15:54
<gibi> the Queens one has been rechecked through the whole weekend and is still bouncing back  15:54
<gibi> mostly with the detach issue  15:54
<gibi> but also kernel panic, and recently with POST_FAILURE  15:54
<stephenfin> sean-k-mooney: because I didn't write it down, can you remind me again how you were suggesting I map a project to a hypervisor in '/os-hypervisors'? Was it metadata?  16:07
<stephenfin> sean-k-mooney: to clarify, if I say "get all hypervisors relevant to this user", what should I be filtering on?  16:07
<sean-k-mooney> use the existing metadata keys for tenant isolation  16:10
<sean-k-mooney> one sec, I'll get it  16:11
<sean-k-mooney> https://docs.openstack.org/nova/latest/admin/aggregates.html#tenant-isolation-with-placement  16:12
<sean-k-mooney> you would be looking for filter_tenant_id*  16:12
<lyarwood> sigh, why is devstack writing out /etc/cinder/cinder-api-uwsgi.ini when we only deploy c-vol  16:12
<stephenfin> sean-k-mooney: great  16:12
<sean-k-mooney> basically, if the host is not a member of an aggregate with filter_tenant_id* then anyone could view it  16:14
<sean-k-mooney> if it is, then only those listed in it can view it  16:14
<sean-k-mooney> the way I was suggesting doing it was to look for all hosts with filter_tenant_id=<my-project> and if that is none then allow all hosts  16:16
<sean-k-mooney> you could do it other ways of course but I think that is what I suggested in the past  16:16
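
A sketch of the aggregate setup sean-k-mooney points at (the aggregate, host name and project ID are illustrative; per the linked docs the filter_tenant_id key can carry a numeric suffix to list several projects):

    openstack aggregate create myagg
    openstack aggregate add host myagg compute-0
    openstack aggregate set \
        --property filter_tenant_id=9691591f913949818a514f95286a6b90 myagg
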
<stephenfin> gmann, gibi: replied on https://review.opendev.org/c/openstack/nova/+/765798  16:16
<stephenfin> the policy checks are correct, but we're not filtering the compute nodes retrieved  16:16
* sean-k-mooney didn't write that down either unless I left it in a comment  16:16
<gibi> stephenfin: good point  16:22
<sean-k-mooney> stephenfin: by the way, are you ok with https://review.opendev.org/c/openstack/nova/+/773792 now? I did not remove the extension check since bauzas suggested I should add it when I discussed it previously  16:37
<sean-k-mooney> it hit the cinder issue I was discussing with lyarwood earlier so I have not rechecked it yet  16:37
<stephenfin> Holy s***, OSC has a REPL??  16:38
<stephenfin> sean-k-mooney: I'd like to remove it if we can. I'm literally trying to test if it will work as we speak :)  16:39
<sean-k-mooney> yes....  16:39
<sean-k-mooney> you didn't know that  16:39
<stephenfin> but tbf I didn't read your replies yet. Looking  16:39
<sean-k-mooney> stephenfin: we can remove it  16:39
<sean-k-mooney> I tested it locally  16:39
<sean-k-mooney> if you request a field that does not exist it does not break anything  16:39
<sean-k-mooney> also it passed tempest for the run where I did not have the extension test  16:40
<stephenfin> hmm, so we can avoid a second API check with no consequences?  16:40
<sean-k-mooney> so if bauzas is ok with me removing it and gibi is ok to review it again I can remove it  16:40
<sean-k-mooney> stephenfin: it's cached  16:40
<sean-k-mooney> we will only ever check once, right  16:40
<sean-k-mooney> oh wait, no, we refresh the cache  16:41
<stephenfin> yup :)  16:41
<sean-k-mooney> ya ok, we can kill the check  16:41
<stephenfin> though tbf, that refresh is behind a timer  16:41
<sean-k-mooney> I just didn't want to revert it and get -2 again  16:41
<stephenfin> so it's not as harmful as I thought  16:41
<sean-k-mooney> cool, well I need to recheck due to the cinder issue so I can respin instead, but I don't want to keep revving it for no reason  16:42
<sean-k-mooney> well, not no reason, but going back and forth  16:42
<stephenfin> That's fair  16:43
<sean-k-mooney> stephenfin: did I do https://review.opendev.org/c/openstack/nova/+/773792/8/nova/network/neutron.py#2043 correctly by the way  16:43
<stephenfin> bauzas is AFK at the moment  16:43
<stephenfin> so maybe gibi can weigh in?  16:43
<stephenfin> nah, what gibi suggested is what I was expecting  16:43
<stephenfin> :param: foo foo foo  16:44
<stephenfin>     foo foo foo  16:44
<stephenfin> well, :return:  16:44
<stephenfin> you know what I mean :)  16:44
<sean-k-mooney> ah so just 4 spaces instead of like 8  16:44
<stephenfin> yup  16:44
<sean-k-mooney> instead of removing the indent entirely  16:44
<stephenfin> please  16:44
<gibi> I can re-review  16:44
<gibi> if that was the question  16:44
<sean-k-mooney> well, would you prefer I keep the extension check or remove it  16:45
<sean-k-mooney> since the call won't fail in either case and we don't check for the port resources extension  16:45
<sean-k-mooney> gibi: specifically this if https://review.opendev.org/c/openstack/nova/+/773792/8/nova/network/neutron.py#2050  16:46
<gibi> I'm OK to remove the check as neutron does not bark on a nonexistent field and you access the field in the response conditionally  16:47
<sean-k-mooney> ya, I use get and default to None and handle that properly later  16:48
<sean-k-mooney> in that case I'll remove that and put back the correct indent for the return comment  16:48
<sean-k-mooney> sorry for the churn, I'll go do that now  16:49
<sean-k-mooney> stephenfin: anything else for me to address while I do that, or are you happy other than that?  16:49
<stephenfin> sean-k-mooney: I'd really like to see that Wallaby reference dropped from the reno too /o\  16:53
<stephenfin> I just left suggestions for other ways to store/get that metadata if you really rely on it  16:54
<sean-k-mooney> I know how to get it from git, I just hate seeing release notes without it  16:54
<sean-k-mooney> but fine, I can drop it  16:54
<sean-k-mooney> I almost never read release notes outside of git  16:55
<stephenfin> Yeah, figured you would  16:55
<stephenfin> you could do a comment  16:55
<sean-k-mooney> hum, I guess I could. what's the yaml comment syntax, //  16:56
<sean-k-mooney> it follows c right  16:56
<stephenfin> as I noted, release notes should avoid version information in general since it doesn't make sense for backported fixes. That's not applicable here but in general, it's a good guide  16:56
<sean-k-mooney> I think I'll just drop it for now  16:56
<stephenfin> '#' I think, but I'm not sure  16:56
<sean-k-mooney> well this is a feature  16:56
<sean-k-mooney> so it won't be backported, but sure  16:57
<stephenfin> yeah, like I said, not applicable here but a good guide  16:57
<stephenfin> just never include version information in the release note and you never need to think about it  16:57
<sean-k-mooney> I'm not sure I actually agree on the backport thing but I also don't want to spend time debating it FF week :)  16:57
<lyarwood> has anyone deployed a local multinode devstack env recently? for some reason I can't curl keystone on the controller from the compute but I can ssh and ping between the hosts just fine  17:17
<sean-k-mooney> lyarwood: likely iptables  17:18
<lyarwood> sean-k-mooney: disabled, as is firewalld  17:18
<sean-k-mooney> do "sudo iptables -F; sudo iptables -X"  17:18
<lyarwood> what the flying  17:18
<lyarwood> ><  17:18
<lyarwood> ggwp systemd  17:19
<lyarwood> the service was dead but for whatever reason systemd didn't flush the rules  17:19
<sean-k-mooney> devstack/neutron add some iptables rules directly  17:19
<openstackgerrit> Kashyap Chamarthy proposed openstack/nova master: libvirt: Use improved guest CPU config APIs  https://review.opendev.org/c/openstack/nova/+/762330  17:19
<sean-k-mooney> I used to keep that command in my local.sh  17:19
<sean-k-mooney> so that devstack would just do it every time I stacked  17:20
<sean-k-mooney> I assume it's working now?  17:20
<lyarwood> yeah, not a bad idea  17:20
<lyarwood> yeah, hopefully just stacking the compute again  17:20
<sean-k-mooney> I never bothered to check what exactly adds the iptables rules that break it, but the fix was simple so I never felt the need  17:21
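
sean-k-mooney's workaround, dropped into devstack's local.sh hook so it runs at the end of every stack.sh (a sketch):

    cat > local.sh <<'EOF'
    #!/bin/bash
    # flush stale iptables rules left behind by a previous run
    sudo iptables -F
    sudo iptables -X
    EOF
    chmod +x local.sh
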
<kashyap> lyarwood: gibi: The above patch (I resolved conflicts from a rebase) is chunky ... but solves a real live migration problem.  In the past it was a blueprint and I wrote some docs in a "spec".  But there's a case to be made for it to be an "advanced bug-fix"  17:28
<kashyap> I now need to go out, perhaps I can bring this up on the upstream meeting for discussion  17:28
<kashyap> [That needs to be split up for easier review]  17:31
<gibi> kashyap: ack, I have to look at it tomorrow  17:32
<sean-k-mooney> stephenfin: gibi: I am doing some testing of vdpa for different api actions; tl;dr I need to block shelve in addition to live migrate, which was already planned.  18:33
<sean-k-mooney> but the reason I need to block shelve is because the vdpa change shares the same code for shelve as normal neutron vf ports  18:33
<sean-k-mooney> e.g. vnic_type=direct  18:33
<sean-k-mooney> so it also hits https://bugzilla.redhat.com/show_bug.cgi?id=1767797  18:33
<openstack> bugzilla.redhat.com bug 1767797 in openstack-nova "When unshelving an SR-IOV instance, the binding profile isn't reclaimed or rescheduled, and this might cause PCI-PT conflicts" [High,Assigned] - Assigned to alifshit  18:34
<sean-k-mooney> nova is correctly claiming the device in the pci tracker but not updating the neutron port  18:34
<sean-k-mooney> before we regenerate the xml  18:34
<sean-k-mooney> I'm wondering if I should block shelve for all sriov port types including vdpa, vf and pf  18:35
<stephenfin> I think that would be wise  18:35
<sean-k-mooney> or even do it in two patches?  18:35
<sean-k-mooney> one for the vdpa related things and one for the other sriov port types so that it could be backported?  18:35
<sean-k-mooney> if I only have one vm, unshelve works fine, but that's just because it can use the same vf/vdpa device  18:36
<sean-k-mooney> so if we fix that bug it will work  18:36
<stephenfin> Two patches make sense  18:37
<sean-k-mooney> I kind of feel this is like the numa live migration case. it sort of works but fundamentally it's broken, although this time I think we can backport a fix for unshelve with sriov  18:37
<openstackgerrit> Artom Lifshitz proposed openstack/nova master: Follow up from bp/pci-socket-affinity series  https://review.opendev.org/c/openstack/nova/+/779556  18:39
<lyarwood> sean-k-mooney: so it looks like the issue is with focal nodes still using tgtadm; just going to use your cloud to rebuild a multinode env if that's okay  18:45
<sean-k-mooney> ya, it should be fine  18:46
<sean-k-mooney> assuming it has free space go for it, if not tell me and I can shelve something  18:46
<lyarwood> sean-k-mooney: ah nvm, virt-builder supports focal now  18:47
<sean-k-mooney> ah ya, it has about 20-30G of hugepages free  18:47
<sean-k-mooney> well there is space, if you want to boot a couple of 8G vms you should be able to spawn 3-4  18:47
<sean-k-mooney> lyarwood: I'm planning to redeploy the cloud in a month or two and enable memory oversubscription. currently everything uses hugepages but for how lightly used the vms are it probably makes sense to not do that  18:48
<openstackgerrit> Artom Lifshitz proposed openstack/nova master: fakelibvirt: make kB_mem default not laughable  https://review.opendev.org/c/openstack/nova/+/779559  18:56
<lyarwood> sean-k-mooney: kk  18:58
<sean-k-mooney> artom: lyarwood: since ye appear to be around, could ye weigh in on https://review.opendev.org/c/openstack/nova/+/778347/4 - just trying to get more input before I respin  18:59
<sean-k-mooney> artom: lyarwood: basically two questions: should we use a different name, e.g. hw:mem_lock or hw:locked_memory, instead of hw:mlock  18:59
<sean-k-mooney> and should that require hw:mem_page_size to be set  19:00
<lyarwood> sean-k-mooney: I'll look once I've kicked off devstack  19:00
<sean-k-mooney> thanks  19:00
<lyarwood> sean-k-mooney: did you move your jump host again btw?  19:00
<sean-k-mooney> no, it should still be dyn.seanmooney.info  19:01
<sean-k-mooney> openstack.seanmooney.info is load-balanced by Cloudflare's CDN  19:01
<lyarwood> yeah there we go, I still had openstack.seanmooney.info  19:02
<lyarwood> .ssh/config updated  19:02
<sean-k-mooney> dyn.seanmooney.info seems to be working for me but because of NAT I can't really test that properly  19:02
<sean-k-mooney> ah ok  19:03
<artom> sean-k-mooney, I'm obviously missing context here, but do we need that patch at all right now?  19:04
<sean-k-mooney> artom: yep  19:04
<artom> As in, why not continue to add it implicitly when we detect a VDPA device?  19:04
<artom> Sorry, not "continue to add", but just "add implicitly"  19:05
<sean-k-mooney> libvirt does not do it right now  19:05
<artom> Like you're saying we do for SEV and realtime  19:05
<sean-k-mooney> and the way libvirt currently does it is a big problem for us  19:05
<artom> Ah, so Nova doesn't do it, *libvirt* does it  19:05
<sean-k-mooney> yep  19:05
<artom> (For SEV and realtime)  19:05
<sean-k-mooney> for sev and realtime nova does it  19:05
<sean-k-mooney> but that is slightly different  19:05
<artom> Tbh, I think having to set an extra spec that you have no choice for is bad UX, no?  19:05
<artom> What's preventing Nova from detecting VDPA devices and adding the required XML?  19:06
<sean-k-mooney> em, kind of, but we should not be changing the memory we are using based on a neutron port  19:06
<sean-k-mooney> artom: tl;dr our current memory tracking is really broken  19:06
<artom> Ah, because not all VDPA devices require it?  19:06
<sean-k-mooney> and any vm with a vfio (sriov port or pci passthrough), vgpu or nvmeof device is being locked in memory by libvirt  19:07
<sean-k-mooney> meaning oversubscription does not work  19:07
<sean-k-mooney> artom: em, the simulator does not require it  19:08
<sean-k-mooney> artom: it's not clear if dpdk based vdpa devices would  19:08
<sean-k-mooney> so I don't know, maybe we could auto add it  19:09
<artom> You're saying "let's not change the memory based on Neutron port", except libvirt kinda does that already for vGPU, to use your own example  19:11
<sean-k-mooney> my concern right now is that to track locked memory properly we might need to severely restrict what type of vms can use neutron sriov ports or passthrough of pci or vgpu devices  19:11
<sean-k-mooney> artom: well, it does it for neutron VF ports  19:11
<sean-k-mooney> so we are already inadvertently doing it based on ports  19:11
<artom> Yeah  19:12
<sean-k-mooney> well, libvirt is  19:12
<artom> UX-wise, if something needs doing regardless, we should be asking the user to do it for us  19:12
<artom> The fact that our memory tracking is broken is a tangential problem :P  19:12
<sean-k-mooney> well, I'm concerned we might need to block vms that don't use hw:mem_page_size from using sriov ports in the future to solve this issue  19:13
<sean-k-mooney> artom: ya it is, but I was trying not to boil the ocean; I guess doing it based on the port type makes sense  19:14
<sean-k-mooney> I was originally hoping this was just a bug that would go away  19:14
<artom> I get it, the deadline is looming and you're rushing  19:14
<sean-k-mooney> partly that  19:14
<sean-k-mooney> and partly, in theory mellanox/nvidia could fix this if their driver supported page faults  19:15
<sean-k-mooney> they don't right now and may never, but if they did it would not need to be locked  19:15
<sean-k-mooney> the vdpa sim module does not need locking but it likely either supports page faults or just does not use dma memory  19:15
<artom> My paranoid conservative opinion is that this needs to be figured out in a spec next cycle ;)  19:16
<artom> Instead of panic-merging stuff ;)  19:17
<sean-k-mooney> well, we discussed locked memory before for sev and realtime but did not have a usecase for it.  19:17
<sean-k-mooney> I could also just mark the guest as realtime today  19:17
<sean-k-mooney> that requirement would go away when libvirt starts treating vdpa like a vf and upping the mlock limit  19:18
<sean-k-mooney> maybe that's a better idea for now.  19:18
<sean-k-mooney> so no new extra spec but require a newer libvirt or a realtime guest.  19:19
<sean-k-mooney> I agree though, I would like to not rush this  19:19
<sean-k-mooney> artom: for what it's worth, to use ovs-dpdk and vhost-user you have to set hw:mem_page_size=large  19:28
<sean-k-mooney> otherwise it won't work, the same way that vdpa breaks today  19:28
<sean-k-mooney> if you don't add locked  19:28
<artom> Yeah, I agree there's precedent  19:28
<artom> (In terms of breaking unless the user does something they strictly-speaking should not have to do)  19:29
<artom> But... doesn't mean we shouldn't strive to improve on that :)  19:29
<sean-k-mooney> well again, we did not want resource usage to change based on port type  19:30
<sean-k-mooney> that is why you were required to enable hugepages in the flavor or image  19:30
<sean-k-mooney> there are other issues this creates for attach too  19:30
<artom> Well you clearly shouldn't be allowed to attach ports that require locked memory to a running instance :)  19:31
<artom> (Unless it already has locked memory)  19:31
<sean-k-mooney> yep  19:32
<artom> Which we would need to track, and it could come from not only the extra spec, but also other sources that caused libvirt to do it for us, etc etc  19:32
<sean-k-mooney> but not just running  19:32
<artom> Hence: spec discussion :)  19:32
<sean-k-mooney> it would be incorrect to change it on hard reboot  19:32
<sean-k-mooney> or if it was off  19:32
<sean-k-mooney> e.g. you can't add to any instance already on the host  19:33
<sean-k-mooney> as it changes the memory usage of the guest  19:33
<sean-k-mooney> the same way adding a vhost-user port should not make the guest suddenly use hugepages  19:33
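
For reference, the flavor knob sean-k-mooney mentions for ovs-dpdk/vhost-user guests (the flavor name is illustrative):

    openstack flavor set --property hw:mem_page_size=large my-dpdk-flavor
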
<openstackgerrit> sean mooney proposed openstack/nova master: Support per port numa policies with SR-IOV  https://review.opendev.org/c/openstack/nova/+/773792  19:57
<lyarwood> sean-k-mooney: sorry, had to go afk; devstack is still running somehow. Is your host pretty loaded at the moment?  20:05
<sean-k-mooney> no, load is at 4.38 on the host  20:06
<sean-k-mooney> it has 24 cores/48 threads  20:07
<lyarwood> weird  20:07
<sean-k-mooney> it might be ipv6 being slow  20:07
<lyarwood> I'm using the medium flavor with 12 vcpus  20:07
<sean-k-mooney> I don't have native ipv6 and the tunnel is sometimes slow  20:07
<lyarwood> I'm not using the async stuff but still, it's been running for over an hour now  20:07
<lyarwood> kk  20:08
<sean-k-mooney> ya that's weird, what does the load look like in the vm  20:08
<sean-k-mooney> io wait is also not that high on the host, 0.90%  20:09
<lyarwood> kk let me try again with DEVSTACK_PARALLEL=true  20:10
<sean-k-mooney> it should only take about 20 mins without it  20:11
<lyarwood> takes about 6 on a local f32 vm  20:12
<lyarwood> kk, it's looking quicker tbh  20:12
<sean-k-mooney> ya, something is wrong with my ipv6 routing http://paste.openstack.org/show/803402/  20:19
<sean-k-mooney> dns is working but I can't ping  20:19
<sean-k-mooney> that's from the server that is hosting the vm  20:19
<openstackgerrit> Merged openstack/nova master: pci manager: replace node_id parameter with compute_node  https://review.opendev.org/c/openstack/nova/+/778747  20:30
<openstackgerrit> Merged openstack/nova master: apidb: Compact Queens database migrations  https://review.opendev.org/c/openstack/nova/+/759404  20:30
<sean-k-mooney> lyarwood: problems on my router, the tunnel is not working properly. for now if you do  20:31
<sean-k-mooney> sysctl -w net.ipv6.conf.all.disable_ipv6=1  20:31
<sean-k-mooney> sysctl -w net.ipv6.conf.default.disable_ipv6=1  20:31
<sean-k-mooney> it will work around the issue  20:31
<sean-k-mooney> with sudo of course  20:31
<sean-k-mooney> it should be falling back to ipv4 anyway but that can slow down installs  20:31
<sean-k-mooney> this is one of the reasons I'm going to be reinstalling my cloud in a month or two  20:32
<openstackgerrit> Merged openstack/nova master: pci: track host NUMA topology in stats  https://review.opendev.org/c/openstack/nova/+/774149  20:32
<sean-k-mooney> ipv6 is nice but sometimes it has issues, so until I have it natively I'm going to remove it  20:32
<sean-k-mooney> am what? ^  20:32
<sean-k-mooney> stephenfin: gibi: the numa node is already in the pci_stats in the compute node table  20:34
<sean-k-mooney> as is the host numa topology blob  20:35
<sean-k-mooney> {"product_id": "101e", "vendor_id": "15b3", "numa_node": 0, "tags": {"dev_type": "vdpa", "physical_network": null, "parent_ifname": "enp6s0f0_0"}, "count": 3}  20:36
<sean-k-mooney> artom: I really wish you had asked me to review that  20:39
<openstackgerrit> Artom Lifshitz proposed openstack/nova master: fakelibvirt: make kB_mem default not laughable  https://review.opendev.org/c/openstack/nova/+/779559  20:40
<sean-k-mooney> these were class methods because we did not want to store this data in an object  20:40
<artom> sean-k-mooney, the NUMA node yes, but not the socket  20:40
<sean-k-mooney> right, but the socket you were going to store/look up separately  20:41
<artom> sean-k-mooney, I save the host numa_topology in the pci stats object  20:41
<artom> (I forget what it's called exactly)  20:41
<sean-k-mooney> I'm reviewing https://review.opendev.org/c/openstack/nova/+/774149 now to figure out what you changed but this is going to conflict with most of my patches  20:41
<openstackgerrit> Merged openstack/nova master: conf: Clean up docs for scheduler options  https://review.opendev.org/c/openstack/nova/+/773639  20:42
<artom> Storing it (the numa_topology) in the object was cleaner than passing it through  20:42
<artom> Only the socket policy filter() method needs it, but it'd have been passed through like 42 other methods to get there  20:43
<sean-k-mooney> but those functions were meant to be pure functions of their input  20:43
<artom> They still are  20:43
<artom> They're just in the object now  20:43
<sean-k-mooney> but do they use any data via self  20:44
<sean-k-mooney> if so they are not  20:44
<artom> Nope  20:44
<artom> Only the new filter for the socket policy  20:44
<sean-k-mooney> then they can stay class methods  20:44
<sean-k-mooney> you can call class methods via self  20:44
<artom> No, because they *call* the new socket filter method  20:44
<artom> Which needs to be on self, not cls  20:44
<artom> Because of the aforementioned requirement for numa_topology  20:44
<artom> You can call cls from self, but not the other way around  20:45
<sean-k-mooney> sure, but you could pass in the numa object  20:46
<artom> Yeah, it's what I was saying  20:46
<sean-k-mooney> where are you using this, the next patch?  20:46
<artom> Yeah, next patch  20:46
<artom> It's what I was saying - storing self.numa_topology was much cleaner than passing it through  20:47
<sean-k-mooney> I would not have done what you have in the one that just merged  20:47
<sean-k-mooney> ya, but it breaks the design of the class  20:47
<sean-k-mooney> and that makes me uncomfortable  20:47
<sean-k-mooney> we intentionally did not do what you are now doing  20:47
<sean-k-mooney> so ya, you are now using self here https://review.opendev.org/c/openstack/nova/+/772779/17/nova/pci/stats.py#336  20:48
<artom> I figured there was a reason; on the other hand, there was also a thing where node_id was optional which looks like it was only for testing  20:50
<artom> So it was hard to tell what was legit reason and what was programmer laziness ;)  20:50
<melwitt> sean-k-mooney: curious how you get ~5 min devstack times? do you disable certain services or something?  20:50
<sean-k-mooney> I get times of about 15 mins  20:51
<sean-k-mooney> artom: the numa node is optional  20:51
<sean-k-mooney> not all devices have one  20:51
<artom> sean-k-mooney, not the numa node, the compute node_id  20:51
<sean-k-mooney> oh, that's required  20:52
<artom> It didn't use to be  20:52
<sean-k-mooney> where is that optional  20:52
<melwitt> hm ok. when I do it with DEVSTACK_PARALLEL=1 it takes 25-30 minutes but that is with default things enabled, nearly empty local.conf  20:52
<melwitt> I was wondering how people are getting these really fast times  20:52
<artom> sean-k-mooney, https://review.opendev.org/c/openstack/nova/+/778747/2/nova/pci/manager.py  20:52
<sean-k-mooney> melwitt: it has to be DEVSTACK_PARALLEL=True  20:52
<sean-k-mooney> unless they fixed that, 1 will not work  20:53
<melwitt> sean-k-mooney: I used it, it gave me the async time report at the end  20:53
<melwitt> maybe I used True, I don't know for sure  20:53
<sean-k-mooney> this is also my default controller http://paste.openstack.org/show/803405/  20:53
<sean-k-mooney> I have some services disabled, yes  20:53
<melwitt> thanks  20:53
<sean-k-mooney> you could disable tempest and horizon from that list if you wanted  20:54
<sean-k-mooney> or cinder I guess  20:54
<sean-k-mooney> but that's a fairly simple compute cloud  20:54
<sean-k-mooney> basically no heat or swift from the default set  20:55
<sean-k-mooney> artom: I think that was needed  20:56
<artom> sean-k-mooney, only for testing as far as I could find  20:56
<sean-k-mooney> artom: we initialise the pci tracker before the compute service is registered  20:56
<artom> That's the only place where it was ever not passed in  20:56
<sean-k-mooney> artom: no, I think it's needed for a fresh install  20:56
<artom> sean-k-mooney, wouldn't that then explode in testing?  20:57
<sean-k-mooney> it's not that it is passed, I think it's set  20:57
<artom> sean-k-mooney, well, you're in luck, the top-most patch failed the gate :)  20:58
<artom> So any feedback you have, it's now or never  20:59
<melwitt> sean-k-mooney: thanks  20:59
<sean-k-mooney> artom: https://github.com/openstack/nova/blame/421c52d9d341b07d850c21e0e702a008a8e1d3b7/nova/compute/resource_tracker.py#L493-L495  21:01
<sean-k-mooney> artom: when that runs it's possible we have not registered the compute node yet  21:02
<sean-k-mooney> if I remember correctly  21:02
<openstackgerrit> Merged openstack/nova master: Differentiate between InstanceNotFound and ConstraintNotMet  https://review.opendev.org/c/openstack/nova/+/775309  21:03
<openstackgerrit> Merged openstack/nova master: Add functional test for bug 1837995  https://review.opendev.org/c/openstack/nova/+/775449  21:04
<openstack> bug 1837995 in OpenStack Compute (nova) ""Unexpected API Error" when use "openstack usage show" command" [Undecided,In progress] https://launchpad.net/bugs/1837995 - Assigned to melanie witt (melwitt)  21:04
<artom> Now it's only called from _setup_pci_tracker, which has the compute_node object available to it  21:04
<artom> sean-k-mooney, anyways, I've -1'ed the top-most patch that hasn't merged yet  21:05
<sean-k-mooney> yes, but I think that has a similar behavior  21:05
<artom> Take your time to review it  21:05
<artom> We can revisit this tomorrow  21:05
<sean-k-mooney> I was trying to find it but you updated it and it's hard to find the old version :)  21:05
<artom> I *only* changed node_id to compute_node  21:06
<sean-k-mooney> actually I can just go in the history I guess  21:06
<artom> Nothing else  21:06
<sean-k-mooney> these are the places where it's called that I was worried about  21:08
<sean-k-mooney> https://github.com/openstack/nova/blob/1273c5ee0b18974d9837e9221fc9270429d428bf/nova/compute/resource_tracker.py#L727-L761  21:08
<sean-k-mooney> well, I think in those cases it's fine  21:08
<sean-k-mooney> assuming create has the side effect of it having the id  21:08
<sean-k-mooney> artom: why did you pass the compute node object in  21:10
<sean-k-mooney> instead of the id  21:10
<artom> Don't need to look it up then  21:10
<artom> Saves a DB query - gibi suggested it  21:10
<sean-k-mooney> it does not  21:11
<sean-k-mooney> https://review.opendev.org/c/openstack/nova/+/778747/2/nova/pci/manager.py#64  21:11
<sean-k-mooney> we were passing in the compute node id before  21:11
<sean-k-mooney> now we don't save it and just extract the id  21:11
<artom> We don't save what?  21:12
<sean-k-mooney> the compute node object in this object  21:12
<artom> We don't need to  21:17
<sean-k-mooney> you split out the change that refactored the interface from the change that used the compute node object  21:17
<artom> We save the numa_topology in PciDeviceStats  21:17
<artom> https://review.opendev.org/c/openstack/nova/+/774149/12/nova/pci/stats.py  21:17
<sean-k-mooney> that should not have been done imo  21:18
<artom> Debate with stephenfin on that, his idea :)  21:18
<artom> But to not throw him under the bus too much, I agree with it  21:18
<artom> Makes the changes cleaner  21:18
<artom> One to always pass compute_node instead of the optional node_id=None  21:18
<sean-k-mooney> not really  21:18
<sean-k-mooney> it was not optional  21:18
<artom> And another to pull the numa_topology from that and pass it to PciDeviceStats  21:19
<sean-k-mooney> it was a keyword argument  21:19
<sean-k-mooney> but the compute node id was required in all the production code  21:19
<artom> How is https://review.opendev.org/c/openstack/nova/+/778747/2/nova/pci/manager.py#54 not optional?  21:19
<artom> It literally says node_id=None  21:19
<sean-k-mooney> right, but all uses of it outside of tests always set it  21:19
<artom> Yeah, my point exactly  21:20
<artom> Codify that it's always expected  21:20
<artom> And because we'll need the full compute_node later on, replace node_id with the full object  21:20
<sean-k-mooney> sure, but not in a patch separate from the new usage or the full object  21:20
<sean-k-mooney> artom: sorry, this just annoys me because you were changing something I did not think you were going to change. it conflicts with my changes and it breaks my mental model of how the pci tracker works  21:22
<artom> sean-k-mooney, the changes aren't that dramatic...  21:23
<sean-k-mooney> the main one was that it never stored state in the pci tracker object directly  21:24
<sean-k-mooney> the numa topology object will have to be kept consistent now  21:25
<sean-k-mooney> ok, we do store state, but differently  21:26
<artom> We're talking about the *host* numa_topology  21:26
<sean-k-mooney> yes  21:26
<artom> When is that ever going to change...  21:27
<sean-k-mooney> within the lifetime of the agent I guess it's not going to  21:27
<sean-k-mooney> I mean memory and cpu hotplug are things, and you can reconfigure hyperthreading on the fly, or hugepages for that matter  21:28
<sean-k-mooney> hugepages are actually the most likely to change at runtime  21:28
<sean-k-mooney> but to have that picked up you need to restart libvirtd  21:28
<sean-k-mooney> actually we also store the currently pinned cpus in the host numa topology blob  21:29
<sean-k-mooney> so it's updated every time we boot new vms  21:30
<artom> sean-k-mooney, the PCI tracker never uses that information though  21:36
<artom> I could add a comment to warn future programmers  21:36
<artom> We just need the socket/node mapping, and that's effectively constant  21:37
<lyarwood> melwitt / sean-k-mooney: so in my defence, it's late and I shouldn't be working, but when I said ~5mins earlier what I actually wanted to say was ~500 seconds. http://paste.openstack.org/show/803406/  21:40
<sean-k-mooney> lyarwood: ah right, that's about right with parallel  21:41
<sean-k-mooney> I think I missed where you said it took ~500 though  21:42
<sean-k-mooney> sub ten minutes is doable if you have good networking, io and a fast cpu  21:44
<lyarwood> yup, I'm just running a 4 vCPU, 16GB, 50GB RAW disk VM on my p1 gen2 with a 1Gbps connection  21:46
<melwitt> lyarwood: heh, sorry, it wasn't only you, I had seen other mentions of 5-6 min before and your mention made me think to ask what I am doing wrong to not get this result 😆  21:52
<sean-k-mooney> melwitt: yep, dansmith was around the 5-8 minute mark  21:52
<sean-k-mooney> that I think was on baremetal  21:53
<sean-k-mooney> rather than nested virt, but it's doable  21:53
<dansmith> talking about devstack time?  21:54
<sean-k-mooney> yep  21:54
<sean-k-mooney> I think you were getting about 430-ish seconds if I'm not mistaken  21:55
<dansmith> yeah, I can do about 5 mins with a less-than-full devstack config  21:55
<dansmith> with OCaaS plus parallel I can get 3xx, yeah  21:55
<melwitt> what are the main things you disable?  21:56
<dansmith> disable_service c-bak etcd3 c-api c-vol c-sch swift horizon dstat  21:56
<melwitt> thanks  21:56
<dansmith> tempest if I don't need it  21:56
<sean-k-mooney> swift I think is kind of slow to set up  21:56
<sean-k-mooney> dstat should not make much of a difference; horizon takes a while to compile and compress the static pages  21:57
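
A minimal local.conf along the lines dansmith describes, combined with the parallel flag from earlier in the day (a sketch; the passwords are illustrative):

    cat > local.conf <<'EOF'
    [[local|localrc]]
    ADMIN_PASSWORD=secret
    DATABASE_PASSWORD=$ADMIN_PASSWORD
    RABBIT_PASSWORD=$ADMIN_PASSWORD
    SERVICE_PASSWORD=$ADMIN_PASSWORD
    DEVSTACK_PARALLEL=True
    disable_service c-bak etcd3 c-api c-vol c-sch swift horizon dstat
    EOF
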
<dansmith> well, not all of the systemctl commands are super fast  21:57
<dansmith> sometimes depending on what is running, daemon-reload can take a couple seconds, and start if it waits for the first child, etc  21:58
<sean-k-mooney> ya, I notice that more on unstack than anything else  22:00
<sean-k-mooney> some services take a long time to stop, randomly  22:00
<dansmith> yeah, that's another good reason though,  22:00
<dansmith> more stuff to unstack makes the process slower when you're iterating  22:01
<sean-k-mooney> yep, although I normally see how long I can go with just doing sudo systemctl restart devstack@n-*  22:02
<sean-k-mooney> if I'm hacking on stuff I generally don't restack unless I have to  22:02
<dansmith> well, when you're working on stuff that crosses multiple projects, as I have been lately, unstack/stack time is important  22:04
<dansmith> especially if one of those _is_ devstack :)  22:04
<sean-k-mooney> yep, I used to restack multiple times a day  22:04
<sean-k-mooney> now I just have different envs for different things  22:05
<sean-k-mooney> so I restack less  22:05
<sean-k-mooney> still important to be quick  22:05
<dansmith> yeah, just depends on what you're doing  22:06
<dansmith> obviously hacking on a single project, service restart is by far the most efficient :)  22:06
<sean-k-mooney> basically, if I don't need a db change I try to just check out the patch I need and restart it  22:07
<sean-k-mooney> if I get errors I restack  22:07
<sean-k-mooney> it works more times than it probably should  22:07
<dansmith> it *should* work for most things, so .. I'd be concerned if it didn't ;)  22:09
<sean-k-mooney> I sometimes get bitten by the compute service version if I change to a different series that I want to test  22:10
<sean-k-mooney> I know I can fix that but I never do  22:10
<sean-k-mooney> I just restack  22:10
