Tuesday, 2019-04-16

*** markvoelker has quit IRC00:00
*** tosky has quit IRC00:01
openstackgerritLee Yarwood proposed openstack/nova master: compute: Use source_bdms to reset attachment_ids during LM rollback  https://review.openstack.org/65280000:02
*** lbragstad has joined #openstack-nova00:07
*** tetsuro has joined #openstack-nova00:17
*** jding1__ has quit IRC00:23
*** erlon has joined #openstack-nova00:28
openstackgerritzhufl proposed openstack/nova master: Remove conductor_api and _last_host_check from manager.py  https://review.openstack.org/65105900:53
openstackgerritDustin Cowles proposed openstack/nova master: Introduces the openstacksdk to nova  https://review.openstack.org/64366401:05
*** eharney has quit IRC01:14
*** keekz has joined #openstack-nova01:45
*** threestrands has joined #openstack-nova01:47
*** zhubx has quit IRC01:49
*** zhubx has joined #openstack-nova01:49
*** whoami-rajat has joined #openstack-nova01:53
*** erlon has quit IRC01:58
*** hongbin has joined #openstack-nova02:12
*** lbragstad has quit IRC02:21
*** nicolasbock has quit IRC02:29
*** ricolin has joined #openstack-nova02:45
*** weshay_pto is now known as weshay02:49
*** zhubx has quit IRC02:49
*** boxiang has joined #openstack-nova02:49
*** cfriesen has quit IRC02:52
*** lbragstad has joined #openstack-nova03:00
openstackgerritBoxiang Zhu proposed openstack/nova-specs master: Add host and hypervisor_hostname flag to create server  https://review.openstack.org/64545803:01
openstackgerritBoxiang Zhu proposed openstack/nova master: Fix live migration break group policy simultaneously  https://review.openstack.org/65196904:05
*** _erlon_ has quit IRC04:05
*** imacdonn has quit IRC04:06
*** imacdonn has joined #openstack-nova04:07
*** ratailor has joined #openstack-nova04:16
*** hongbin has quit IRC04:40
*** gyee has quit IRC05:08
openstackgerritMerged openstack/nova master: Handle unsetting '[DEFAULT] dhcp_domain'  https://review.openstack.org/65266205:23
*** ratailor_ has joined #openstack-nova05:34
*** ratailor has quit IRC05:36
*** Luzi has joined #openstack-nova05:37
*** ivve has joined #openstack-nova05:40
*** lbragstad has quit IRC05:55
*** sridharg has joined #openstack-nova05:58
*** lpetrut has joined #openstack-nova06:02
openstackgerritBoxiang Zhu proposed openstack/nova master: [WIP] Add host and hypervisor_hostname flag to create server  https://review.openstack.org/64552006:05
*** lpetrut has quit IRC06:16
*** lpetrut has joined #openstack-nova06:16
openstackgerritMerged openstack/nova master: Remove dead code  https://review.openstack.org/65027706:18
*** awestin1 has quit IRC06:26
*** pcaruana has joined #openstack-nova06:26
*** masayukig has quit IRC06:27
*** seyeongkim has quit IRC06:27
*** rpittau|afk has quit IRC06:28
*** kmalloc has quit IRC06:28
*** vdrok has quit IRC06:28
*** TheJulia has quit IRC06:29
*** kmalloc has joined #openstack-nova06:30
*** seyeongkim has joined #openstack-nova06:30
*** belmoreira has joined #openstack-nova06:31
*** hogepodge has quit IRC06:32
*** johnsom has quit IRC06:32
*** seyeongkim has quit IRC06:35
*** kmalloc has quit IRC06:40
*** ralonsoh has joined #openstack-nova06:43
*** TheJulia has joined #openstack-nova06:43
*** markvoelker has joined #openstack-nova06:43
*** NobodyCam has quit IRC06:46
*** TheJulia has quit IRC06:47
*** slaweq has joined #openstack-nova06:49
*** TheJulia has joined #openstack-nova06:52
*** seyeongkim has joined #openstack-nova06:52
*** kmalloc has joined #openstack-nova06:53
*** rpittau|afk has joined #openstack-nova06:53
*** NobodyCam has joined #openstack-nova06:53
*** luksky has joined #openstack-nova06:54
*** johnsom has joined #openstack-nova06:55
*** masayukig has joined #openstack-nova06:55
*** awestin1 has joined #openstack-nova06:56
*** hogepodge has joined #openstack-nova06:56
*** ileixe has quit IRC06:56
*** belmoreira has quit IRC06:56
*** brinzhang has joined #openstack-nova06:57
*** vdrok has joined #openstack-nova06:58
*** ileixe has joined #openstack-nova06:59
*** belmoreira has joined #openstack-nova07:00
*** rpittau|afk is now known as rpittau07:06
*** awalende has joined #openstack-nova07:08
*** tesseract has joined #openstack-nova07:11
*** ratailor__ has joined #openstack-nova07:20
*** rcernin has quit IRC07:20
*** ratailor_ has quit IRC07:22
*** tosky has joined #openstack-nova07:23
*** tssurya has joined #openstack-nova07:29
*** helenafm has joined #openstack-nova07:29
*** ccamacho has joined #openstack-nova07:48
openstackgerritBoxiang Zhu proposed openstack/nova master: Fix live migration break group policy simultaneously  https://review.openstack.org/65196907:56
*** threestrands has quit IRC07:58
*** johnsom has quit IRC08:04
*** seyeongkim has quit IRC08:04
*** johnsom has joined #openstack-nova08:05
*** seyeongkim has joined #openstack-nova08:05
*** rpittau_ has joined #openstack-nova08:05
*** masayukig has quit IRC08:05
*** masayukig_ has joined #openstack-nova08:05
*** masayukig_ is now known as masayukig08:05
*** rpittau_ is now known as rpittua08:05
*** rpittua is now known as rpittau_08:05
*** awestin1_ has joined #openstack-nova08:06
*** awestin1 has quit IRC08:06
*** awestin1_ is now known as awestin108:06
*** rpittau has quit IRC08:06
*** rpittau_ is now known as rpittau08:06
*** phasespace has joined #openstack-nova08:08
*** priteau has joined #openstack-nova08:09
*** ttsiouts has joined #openstack-nova08:21
openstackgerritMerged openstack/nova master: Dropping the py35 testing  https://review.openstack.org/64387108:22
*** tkajinam has quit IRC08:24
openstackgerritMerged openstack/nova master: Bump to hacking 1.1.0  https://review.openstack.org/65155308:25
openstackgerritMerged openstack/nova master: Remove 'nova-cells' service  https://review.openstack.org/65129008:25
*** janki has joined #openstack-nova08:28
*** ratailor__ has quit IRC08:30
*** luksky has quit IRC08:36
lyarwoodstephenfin: https://review.openstack.org/#/c/637224/ - would you mind taking a look at this today?08:37
lyarwood^ not a spec FWIW so feel free to ignore for now08:37
*** luksky has joined #openstack-nova08:39
*** belmoreira has quit IRC08:41
kashyaplyarwood: Morning, hope this answers your question: https://review.openstack.org/#/c/506720/8/specs/train/approved/allow-secure-boot-for-qemu-kvm-guests.rst@28708:43
kashyap(Thanks for the review, so far.)08:43
*** belmoreira has joined #openstack-nova08:43
stephenfinlyarwood: added to the queue :)08:47
openstackgerritStephen Finucane proposed openstack/nova master: Remove '/os-cells' REST APIs  https://review.openstack.org/65129108:52
openstackgerritStephen Finucane proposed openstack/nova master: Stop handling cells v1 in '/os-hypervisors' API  https://review.openstack.org/65129208:52
openstackgerritStephen Finucane proposed openstack/nova master: Stop handling cells v1 in '/os-servers' API  https://review.openstack.org/65129308:52
openstackgerritStephen Finucane proposed openstack/nova master: Remove 'nova-manage cell' commands  https://review.openstack.org/65129408:52
openstackgerritStephen Finucane proposed openstack/nova master: Stop handling cells v1 for console authentication  https://review.openstack.org/65129508:52
openstackgerritStephen Finucane proposed openstack/nova master: Remove old-style cell v1 instance listing  https://review.openstack.org/65129608:52
openstackgerritStephen Finucane proposed openstack/nova master: Remove 'bdm_(update_or_create|destroy)_at_top'  https://review.openstack.org/65129708:52
openstackgerritStephen Finucane proposed openstack/nova master: Remove 'instance_fault_create_at_top'  https://review.openstack.org/65129808:52
openstackgerritStephen Finucane proposed openstack/nova master: Remove 'instance_info_cache_update_at_top'  https://review.openstack.org/65129908:52
openstackgerritStephen Finucane proposed openstack/nova master: Remove 'get_keypair_at_top'  https://review.openstack.org/65130008:52
openstackgerritStephen Finucane proposed openstack/nova master: Remove 'instance_update_at_top', 'instance_destroy_at_top'  https://review.openstack.org/65130108:52
openstackgerritStephen Finucane proposed openstack/nova master: Remove 'instance_update_from_api'  https://review.openstack.org/65130208:52
openstackgerritStephen Finucane proposed openstack/nova master: Stop handling 'update_cells' on 'BandwidthUsage.create'  https://review.openstack.org/65130308:52
openstackgerritStephen Finucane proposed openstack/nova master: Stop handling cells v1 for instance naming  https://review.openstack.org/65130408:52
openstackgerritStephen Finucane proposed openstack/nova master: Remove cells code  https://review.openstack.org/65130608:52
lyarwoodkashyap: ack thanks, replied. I was more suggesting we generate the file once and make copies for each instance to use after that.08:55
openstackgerritStephen Finucane proposed openstack/nova master: Remove 'instance_update_from_api'  https://review.openstack.org/65130208:56
openstackgerritStephen Finucane proposed openstack/nova master: Stop handling 'update_cells' on 'BandwidthUsage.create'  https://review.openstack.org/65130308:56
openstackgerritStephen Finucane proposed openstack/nova master: Stop handling cells v1 for instance naming  https://review.openstack.org/65130408:56
openstackgerritStephen Finucane proposed openstack/nova master: Remove cells code  https://review.openstack.org/65130608:56
lyarwoodkashyap: and that was what led me to ask about where that template/master file could be stored.08:56
openstackgerritStephen Finucane proposed openstack/nova master: Stop handling 'InstanceUnknownCell' exception  https://review.openstack.org/65130708:56
openstackgerritStephen Finucane proposed openstack/nova master: Remove unnecessary wrapper  https://review.openstack.org/65130808:56
openstackgerritStephen Finucane proposed openstack/nova master: db: Remove cell APIs  https://review.openstack.org/65130908:56
lyarwoodkashyap: generating it for each instance is expensive IMHO, I'm just looking for ways to avoid that.08:56
openstackgerritStephen Finucane proposed openstack/nova master: conf: Remove cells v1 options, group  https://review.openstack.org/65131008:56
kashyaplyarwood: Let me be sure I'm not misreading you:08:57
kashyaplyarwood: You're saying to generate _once_ and reuse it?  Or generate them ahead and store them somewhere and _then_ use a unique VARS file per instance?08:57
* kashyap reads the reply08:58
kashyaplyarwood: Okay, I see what you mean: generate a template once, make copies and use that when needed.09:00
lyarwoodkashyap: yup that's it09:00
kashyapLet me see if there's any gotchas there.  (And yeah, I'd like to avoid needless expensive creations, too)09:00
openstackgerritStephen Finucane proposed openstack/nova master: conf: Remove cells v1 options, group  https://review.openstack.org/65131009:06
openstackgerritStephen Finucane proposed openstack/nova master: Remove conductor_api and _last_host_check from manager.py  https://review.openstack.org/65105909:06
*** luksky has quit IRC09:07
*** bhagyashris has joined #openstack-nova09:09
*** luksky has joined #openstack-nova09:19
kashyaplyarwood: Important bit - I didn't mention this at all in the spec: there's actually a config attribute in /etc/libvirt/qemu.conf that lets you specify the master 'nvram' template ("VARS" file)09:20
lyarwoodkashyap: and that can already have the keys enrolled?09:28
kashyaplyarwood: So currently if you peek at your /etc/libvirt/qemu.conf, it has a mapping of which _CODE.fd file belongs to which _VARS.fd file09:30
kashyaplyarwood: Checking with the libvirt folks for further usage09:30
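For context on the qemu.conf mapping discussed above, the 'nvram' stanza looks roughly like the following; this is only an illustrative excerpt (firmware paths vary by distribution and build), not a recommended configuration:

    # /etc/libvirt/qemu.conf (illustrative only; paths differ per distro)
    nvram = [
      "/usr/share/OVMF/OVMF_CODE.fd:/usr/share/OVMF/OVMF_VARS.fd",
      "/usr/share/OVMF/OVMF_CODE.secboot.fd:/usr/share/OVMF/OVMF_VARS.secboot.fd"
    ]

Each _CODE.fd firmware image is paired with the _VARS.fd template that libvirt copies into a per-guest NVRAM file.
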
*** bhagyashris has quit IRC09:34
*** dtantsur|afk is now known as dtantsur09:35
*** avolkov has joined #openstack-nova09:53
avolkovkashyap: hi, have a couple of questions on https://review.openstack.org/#/c/506720/. Could you take a look?09:58
sean-k-mooney ... i realised i just spent an hour reviewing patchset 8 of kashyap's spec09:58
sean-k-mooneyit's on patchset 10 now09:58
*** brinzhang has quit IRC10:00
*** rcernin has joined #openstack-nova10:02
kashyapsean-k-mooney: Heya10:02
kashyapThanks for the review, looking10:02
kashyapA quick point off the top of my head: the key enrollment script should _not_ be a libvirt feature10:02
kashyapAnd we're _not_ "(Bash) shelling out" anything10:03
kashyapIf you read closely, I merely mentioned it to show what is going on behind the scenes.10:03
sean-k-mooneythe external tool is starting qemu and then executing commands on the instance right10:03
kashyapNo.10:03
*** luksky has quit IRC10:03
kashyapAs noted in an earlier response on the change, the VARS file generator script is _not_ launching any instance.10:04
sean-k-mooneyok but it is printing raw bytes and strings to something10:04
sean-k-mooneyhttps://github.com/puiterwijk/qemu-ovmf-secureboot/blob/master/ovmf-vars-generator#L10410:05
openstackgerritChris Dent proposed openstack/nova master: Delete the placement code  https://review.openstack.org/61821510:05
kashyapsean-k-mooney: Have you ever tried Secure Boot?10:05
kashyapsean-k-mooney: It is simply sending the escape char so that you can _run_ the tool10:06
sean-k-mooneyonly on phyical machines10:06
kashyapPlease rephrase.10:06
kashyapIt can work the same in a nested env. too.  The script is merely launching the ISO and enrolling the default keys10:07
kashyapThe standard behaviour is documented on all the distribution wiki pages10:07
sean-k-mooneyi have used secure boot, but only on physical machines, and i have never replaced the keys10:07
kashyap(Ah, okay)10:07
sean-k-mooneykashyap: yes so its using qemu to launch a vm10:07
sean-k-mooneyit may not be the guest vm10:07
kashyapNo, it is launching a QEMU process.  Not every QEMU process is a "VM"10:07
sean-k-mooneybut its still launching a vm10:07
kashyapSo what?10:08
kashyapIt is the correct thing to do.10:08
sean-k-mooneyso have you considered how this will work on a host that is configured to run guests with hugepages only10:08
sean-k-mooneye.g. where the host has been tuned so that there is not a large amount of free memory, or on a realtime host10:09
kashyap(The process for you to try on a guest: https://wiki.ubuntu.com/UEFI/OVMF)10:09
sean-k-mooneywhat im concerned about is how much memory this will use and whether it can impact other guests10:09
kashyapsean-k-mooney: That is not relevant.  Because you only launch the process *once* with 256MB of RAM10:09
kashyapAnd generate the VARS file, and that's it.10:09
sean-k-mooneyonce per vm10:10
kashyapsean-k-mooney: No need for concern.  It's a one-time thing for a few seconds10:10
kashyapNo.10:10
kashyapJust once.10:10
kashyapWe will use a template file.10:10
sean-k-mooneyok so why is this not a step TripleO should be doing10:10
sean-k-mooneyor why are we not just shipping a template file10:10
kashyapThat is a different topic.  I don't want to derail the core thing at hand10:10
sean-k-mooneyor requiring the operator to do it manually as part of install.10:11
kashyap(Two steps at a time.)10:11
kashyapsean-k-mooney: Yes, that's what I discussed with Lee.10:11
kashyapThe first step is to document that it is an external step an operator must take10:11
kashyapPlease take a few minutes to read the conversation before.10:11
sean-k-mooneywell i dont think nova should be doing it.10:11
kashyap(I'm repeating myself here)10:11
sean-k-mooneyok ill reread10:11
kashyapsean-k-mooney: Nova won't do it.10:12
sean-k-mooneyok then it should not be in the spec. we should have documentation for it, or at most mention it in the "Other deployer impact" section if it's in the spec10:12
kashyapYes, will mention it as such10:13
sean-k-mooneythe current way the spec is laid out implies that it was a required change to nova. the actual required change will be knowing where the template file is10:13
sean-k-mooneyok10:13
kashyapNo, it is _not_ required10:15
kashyapI didn't say it as such10:15
sean-k-mooneyrather than copying over all my comments ill leave them on patchset 810:15
kashyapI said "consider" if we should.  That is different than "required".10:15
sean-k-mooneyactually you said "Work out a way to integrate the external Python tool,10:18
sean-k-mooney     `ovmf-vars-generator` into Nova.10:18
sean-k-mooney"10:18
sean-k-mooneyi was reviewing patchset 8 remember10:19
aspiershi folks10:19
sean-k-mooneyin patchset 10 you have changed it to document that the operator ...10:19
sean-k-mooneyaspiers: hi10:19
kashyapYes.  PS 10 was there when you started reviewing :-)10:19
aspiersYesterday I updated the AMD SEV spec in case anyone has time to review today10:20
aspiersNot sure how many other specs are in your queue ;-)10:20
sean-k-mooneykashyap: yep but i clicked on it from a gerrit comment from lee and never checked it was the latest :)10:20
openstackgerritLee Yarwood proposed openstack/nova master: compute: Use source_bdms to reset attachment_ids during LM rollback  https://review.openstack.org/65280010:20
lyarwoodmdbooth: ^ you might be interested in this if you have time today10:20
aspiersI'll be around in case there are things to discuss10:20
sean-k-mooneyaspiers: link?10:20
aspierssean-k-mooney: https://review.openstack.org/#/c/641994/10:20
mdboothlyarwood: Shiny10:20
sean-k-mooneyyep just found it10:21
aspiersI also put it in https://etherpad.openstack.org/p/nova-spec-review-day10:21
aspierssean-k-mooney: In particular I am wondering if hugepages can help with memory accounting. It might be a problem if it can only account for guest RAM and not the other QEMU memory chunks10:21
kashyapsean-k-mooney: Can't parse this sentence: "-1 while i see if any of the comments i just left on patchset 8 have been adressed"10:22
sean-k-mooneykashyap: ill clear my head a bit and take it from the top later today. by the way having material content in the work items section makes it much harder to review IMO10:22
sean-k-mooneykashyap: well i was going to -1 patchset 8 and i wanted to see if my comments were addressed in 1010:23
sean-k-mooneymost weren't so my -1 would still stand10:23
*** ttsiouts has quit IRC10:23
*** ttsiouts has joined #openstack-nova10:24
kashyapsean-k-mooney: After I read your review in full, I'll post a new iteration and write a top-level summary addressing your comments (as needed).  (Instead of more inline replies there.)10:24
kashyapThanks for looking.10:24
aspierssean-k-mooney: but TBH I am a bit confused because the QEMU memory usage breakdown in the table https://specs.openstack.org/openstack/nova-specs/specs/stein/approved/amd-sev-libvirt-support.html#proposed-change seems to contradict what Daniel Berrange has said more recently https://review.openstack.org/#/c/641994/2/specs/train/approved/amd-sev-libvirt-support.rst@167 where he suggests a fudge factor10:25
aspiersof 1.510:25
sean-k-mooneykashyap: cool sounds good. the main reason for the -1 is the implication that the nova libvirt driver ever supported secure boot before and that it was "actually" insecure.10:27
kashyapsean-k-mooney: Just responding to that point.  Will fix it.10:27
kashyapsean-k-mooney: The reason I wrote that is that some customers and users were _confused_ into thinking that Nova supports "SB"10:28
sean-k-mooneyyes they were wrong10:28
sean-k-mooneybut the release notes were pretty clear10:28
*** ttsiouts has quit IRC10:28
kashyapCorrect.  I'll adjust the phrasing in spec10:28
sean-k-mooneyhyperv supports SB and libvirt only supports UEFI boot10:28
openstackgerritStephen Finucane proposed openstack/nova master: docs: Remove references to nova-consoleauth  https://review.openstack.org/65296510:32
openstackgerritStephen Finucane proposed openstack/nova master: tests: Stop starting consoleauth in functional tests  https://review.openstack.org/65296610:32
openstackgerritStephen Finucane proposed openstack/nova master: xvp: Start using consoleauth tokens  https://review.openstack.org/65296710:32
openstackgerritStephen Finucane proposed openstack/nova master: nova-status: Remove consoleauth workaround check  https://review.openstack.org/65296810:32
openstackgerritStephen Finucane proposed openstack/nova master: Remove nova-consoleauth  https://review.openstack.org/65296910:32
openstackgerritStephen Finucane proposed openstack/nova master: objects: Remove ConsoleAuthToken.to_dict  https://review.openstack.org/65297010:32
kashyapsean-k-mooney: While removing "implications", I will definitely mention the point that some people can get confused.10:33
kashyap(Wording in the new iteration coming soon.)10:33
*** luksky has joined #openstack-nova10:33
sean-k-mooneysure that is reasonable i guess10:34
sean-k-mooneyit should be part of the problem statement section right?10:35
*** belmoreira has quit IRC10:35
kashyap(Yeah)10:37
kashyapsean-k-mooney: I see that you were commenting as you read the spec; I mentioned the 'os_secure_boot' & 'os:secure_boot' bits in the Work Items10:37
kashyapsean-k-mooney: How about I add a pointer to Work Items in the "Proposed change" section?10:37
sean-k-mooneyya10:37
kashyap(It is clearer than paragraphs of text.)10:38
sean-k-mooneyno the work items section as i said really should be short, like 10 lines max10:38
sean-k-mooneyall that content in my opinion should be in the proposed changes section10:38
*** cdent has joined #openstack-nova10:39
sean-k-mooneyjust grabbed a random spec but this is about the extent of what should be in work items https://github.com/openstack/nova-specs/blob/master/specs/rocky/approved/list-show-all-server-migration-types.rst#work-items10:39
kashyapsean-k-mooney: Note on your remark about extending Hyper-V also to report traits -- I don't want to "boil the lake" (much less an ocean) with one spec10:43
kashyapFew things at a time, with reasonable scope.10:43
kashyap(That Hyper-V thing should be a separate item)10:43
kashyapsean-k-mooney: Well, "10 lines max" for Work Items is a personal preference.10:44
kashyapI'd like to follow this structure (which I think is far clearer):10:44
kashyap  - Describe the change at high-level in the "Proposed change" section10:44
kashyap  - Describe the "how" in _some_ detail in the Work Items section10:44
aspiersIn the SEV spec I've probably gone somewhere in between those two10:45
*** tbachman has quit IRC10:45
cdentkashyap: I agree that that format would probably make more sense, but it isn't the pattern that's been followed in the past10:45
aspiers"Work Items" is not short, but "Proposed change" has more detail.10:46
kashyapcdent: Nod10:46
aspiersHowever that's because for SEV, the design part is much more difficult than the actual implementation10:46
* kashyap --> lunch; been plugging away non-stop at this10:46
aspiersWe've had to debate the design for ages and consider all kinds of complex stuff, but once the design is decided, the actual coding reflected in Work Items is reasonably simple10:47
aspiersI expect each spec will have a different ratio between design complexity and implementation complexity, therefore the two sections will differ in size accordingly10:47
lpetruthi guys, I work on maintaining the Hyper-v driver and I'd also be interested in reporting traits / provider inventory10:47
lpetrutin fact, I was catching up on the specs / libvirt patches to see what needs to be done10:48
*** belmoreira has joined #openstack-nova10:51
*** nicolasbock has joined #openstack-nova10:55
aspierslpetrut: I'm far from an expert so this might be wrong, but IIUC the driver needs to implement update_provider_tree()10:55
lpetrutthanks for the hint10:55
aspierslpetrut: see the libvirt driver for an example10:56
aspiersactually this is also documented very nicely10:57
aspiershttps://docs.openstack.org/nova/latest/reference/update-provider-tree.html10:57
* aspiers feels slightly more confident that he is not spouting misinformation ;-)10:57
lpetrutthat's great. I'm wondering how hypervisor-specific features are handled: standard traits via os-traits or custom traits10:59
aspiersI think best practice is that drivers are supposed to only provide standard traits11:00
aspiersSo if hyperv wants to provide a new trait, you should probably try to submit that to os-traits first11:00
aspiersThat's what I did with HW_CPU_AMD_SEV11:00
aspierslpetrut: You can also consider using capabilities since I managed to get that patch landed11:01
aspiershttps://docs.openstack.org/nova/latest/admin/configuration/schedulers.html#compute-capabilities-as-traits11:01
*** ttsiouts has joined #openstack-nova11:02
aspierslpetrut: efried_pto and I drew a diagram which might help https://pasteboard.co/I3iqqNm.jpg11:02
aspiersbut again I knew *nothing* about any of this a few months ago, so take it with a slight pinch of salt11:03
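To make the update_provider_tree() hook mentioned above a little more concrete, here is a minimal, illustrative sketch; it is not the Hyper-V or libvirt implementation, the inventory numbers and the chosen trait are made up, and the exact ProviderTree helper methods should be checked against the update-provider-tree doc linked earlier:

    import os_traits


    class HypotheticalDriver(object):
        """Illustrative virt driver fragment, not real nova code."""

        def update_provider_tree(self, provider_tree, nodename, allocations=None):
            # Replace the compute node's CPU inventory in the provider tree.
            inventory = {
                'VCPU': {
                    'total': 16,           # hypothetical host core count
                    'reserved': 0,
                    'min_unit': 1,
                    'max_unit': 16,
                    'step_size': 1,
                    'allocation_ratio': 16.0,
                },
            }
            provider_tree.update_inventory(nodename, inventory)
            # Report a standard trait from os-traits rather than a CUSTOM_* one,
            # per the best-practice point above.
            traits = set(provider_tree.data(nodename).traits)
            traits.add(os_traits.HW_CPU_X86_AVX2)
            provider_tree.update_traits(nodename, traits)
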
sean-k-mooneycdent: i find the format that kashyap proposed and uses to be much harder to read top to bottom, and while syntactically correct i dont think it semantically aligns with the spec template definitions11:03
kashyapNo one else complained so far.  I don't want to rework it.11:03
kashyapI spend a _lot_ of time carefully choosing words and maintaining a structure11:03
kashyapI will also note in the summary comment I'm writing11:03
sean-k-mooneykashyap: if you like i can rework it11:03
kashyapPlease no11:04
kashyapLet's not make specs into sterile and overly rigid environments.  There is already a high-level structure.  Let the author's thoughts reflect in their writing11:04
sean-k-mooneyi really think this is a bad way to write specs. its one of the few documentation forms i truly care about.11:04
kashyapAnd if it is clear, it is fine.11:04
kashyapsean-k-mooney: Again, "bad way" is an opinion11:04
kashyapI've had several people read it, none of them had complaints11:04
sean-k-mooneyfrom the template "Work Items11:05
sean-k-mooneyWork items or tasks -- break the feature up into the things that need to be done to implement it. Those parts might end up being done by different people, but we're mostly trying to understand the timeline for implementation.11:05
sean-k-mooney"11:05
*** panda is now known as panda|lunch11:05
sean-k-mooneythe intention is to describe the order in which things need to be done to implement the feature11:06
kashyapFor now, I'm addressing the main points.11:06
kashyap(Let's not lose sight of what is important.)11:06
kashyapIf many others complain, I will re-adjust the content a bit.11:06
kashyapFor now, I want to re-focus the attention on the actual design and content.11:06
lpetrutaspiers: interesting, wasn't aware that driver capabilities are now reported through traits11:07
sean-k-mooneywell part of my main objection to the current formatting is you dont present your actual design at all in the proposed change section.11:07
sean-k-mooneyat least you didn't in the version i read. i'll read the latest version when it's ready11:07
aspierslpetrut: it's new, mriedemann originally prototyped it ~1 year ago and then I finished it off https://review.openstack.org/#/c/538498/11:08
aspierslpetrut: https://docs.openstack.org/releasenotes/nova/stein.html#new-features11:09
openstackgerritMerged openstack/nova master: Change a log level for overwriting allocation  https://review.openstack.org/64978811:14
*** pcaruana has quit IRC11:17
*** ccamacho has quit IRC11:34
*** phasespace has quit IRC11:35
*** bbowen has joined #openstack-nova11:36
*** tetsuro has quit IRC11:37
*** belmoreira has quit IRC11:41
*** cdent has quit IRC11:42
*** belmoreira has joined #openstack-nova11:43
*** dtantsur is now known as dtantsur|brb11:45
*** lpetrut has quit IRC11:52
*** erlon has joined #openstack-nova11:59
*** lpetrut has joined #openstack-nova12:00
*** tbachman has joined #openstack-nova12:03
*** pcaruana has joined #openstack-nova12:04
*** NewBruce has joined #openstack-nova12:08
NewBrucemnaser / sean-k-mooney howdy…12:17
*** priteau has quit IRC12:17
sean-k-mooneyNewBruce: o/12:17
NewBruceupdate on https://bugs.launchpad.net/nova/+bug/1822884 - we’ve finished upgrading all nodes on our site to rocky (service level 35), but the issue still exists...12:18
openstackNewBruce: Error: malone bug 1822884 not found12:18
NewBruce(we have 3 rows in the services table not on level 35, but they have deleted > 0, so im hoping they are ignored)12:18
NewBruceso im officially out of ideas on this one….12:19
sean-k-mooneyif deleted = id then it's a soft-deleted service record12:19
*** priteau has joined #openstack-nova12:19
sean-k-mooneythat bug link does not work for me by the way but i remember the bug you were having12:19
NewBruceNova-compute deleted = 1012:20
NewBrucedeleted = 6012:20
NewBrucedeleted = 6112:20
NewBruceyeah i moved the bug to private for now; there are some logs i want to remove; but haven’t heard back from the nova admin on that yet12:20
*** gaoyan has joined #openstack-nova12:21
*** panda|lunch is now known as panda12:22
sean-k-mooneyah ok12:22
NewBrucecleaned up and public again; i can mail you the log files if you are interested12:22
NewBrucehttps://bugs.launchpad.net/nova/+bug/182288412:22
openstackLaunchpad bug 1822884 in OpenStack Compute (nova) "live migration fails due to port binding duplicate key entry in post_live_migrate" [Undecided,New]12:22
*** cdent has joined #openstack-nova12:23
NewBruceposted some more to the launchpad now12:23
*** Luzi has quit IRC12:24
sean-k-mooneyso at this point all compute nodes are running rocky. the neutron control plane is entirely on osa12:24
sean-k-mooneyand all compute services are using the same compute service version12:25
NewBrucecorrect12:25
kashyaplyarwood: sean-k-mooney: I forgot to mention: the key enrollment tool is already part of EDK2 package (at least in Fedora), as a sub-RPM, look for: 'edk2-qosb'12:27
sean-k-mooneyhonestly i dont know either. cold migration should work as a last resort but i dont quite understand why this would happen12:27
kashyap(Updating the spec)12:27
sean-k-mooneyok yes that is good to add to the reference section12:28
kashyapYes, adding, actually12:28
kashyapI totally forgot that we did it some 8 months ago.  My memory is like goldfish's12:28
*** belmoreira has quit IRC12:28
sean-k-mooneyit will be useful for projects like kolla that need to figure out what to install12:28
*** belmoreira has joined #openstack-nova12:33
*** Luzi has joined #openstack-nova12:46
openstackgerritMerged openstack/nova master: conf: Undeprecate and move the 'dhcp_domain' option  https://review.openstack.org/48061612:48
stephenfinsean-k-mooney: We're going to have to make a call on https://review.openstack.org/#/c/555081/23/specs/train/approved/cpu-resources.rst@18012:49
bauzasstephenfin: I still need to provide my comments12:49
stephenfinsean-k-mooney: On one hand, I totally get where you're coming from and agree that it's a valid concern12:49
stephenfinOn the other though, if we don't use 'vcpu_pin_set' to populate this stuff, the user will be left in a state where they can no longer boot any instances with 'hw:cpu_policy' configured until they do additional configuration on each host12:50
sean-k-mooney yes that is one of the bits im most uncomfortable with.12:50
stephenfinThat's as much a breaking change as anything else we've discussed12:50
*** liuyulong has joined #openstack-nova12:50
stephenfinbauzas: Ack. Just focussing on that one point atm12:51
sean-k-mooneyyes it's not simple. if we use the totally new config options i suggest, we could have them pre-set and possibly provide an upgrade check in nova-status?12:51
sean-k-mooneyvcpu_pin_set was never intended to be "pinned cpus" although it was assumed to be that in some cases12:52
sean-k-mooneyso i dont know what the right way forward is12:52
kashyaplyarwood: Okay, aside: the 'nvram' stanza in /etc/libvirt/qemu.conf will be deprecated.  So we can ignore that12:52
stephenfinHmm, neither do I :\12:52
stephenfinI'm also thinking we need to kill 'hw:emulator_threads_policy=isolate'12:53
sean-k-mooneyi kind of feel we should require the dedicated_cpu_set to be in the config before enabling any of this12:53
sean-k-mooneystephenfin: ya that i was originally not sure about but i think i agree12:54
stephenfinBecause I can't think of any way to account for the extra core without mangling the request12:54
sean-k-mooneyhw:emulator_threads_policy=share should be sufficient and more efficient12:54
sean-k-mooneywell the accounting is trivial12:54
stephenfinI really, really wish we'd overloaded 'isolate' instead of 'share' to do this offload to 'shared_cpu_set'12:54
stephenfinbecause I don't see a reason why anyone would want to use 'isolate' as it's implemented12:55
sean-k-mooneywe already generate the resources:VCPU part of the placement request from flavor.vcpus and add 1 core to it for isolate12:55
sean-k-mooneywe can easily just add a PCPU instead12:55
sean-k-mooneywell personally i would like to kill the option entirely.12:56
stephenfinTrue, but rewriting the request like that feels rotten12:56
stephenfinSo would I12:56
sean-k-mooneymy preference would be to make the emulator pinning work for floating cpus too12:56
sean-k-mooneyand have a separate pin set for it12:56
stephenfinWhat would the alternative be? Always offload emulator threads if 'cpu_shared_set' is defined?12:57
stephenfinWhy?12:57
stephenfinThe reason you're offloading this stuff is performance, no?12:57
stephenfinMore specifically, real-time performance12:57
stephenfinIf you're using floating CPUs, you've already given up on that12:57
sean-k-mooneyyes but performance still matters for non-pinned instances12:57
sean-k-mooneythe main reason however is to simplify the code12:57
sean-k-mooneyand the config12:57
sean-k-mooneybasically if you define emulator_pin_set in the libvirt section we will always use it12:58
sean-k-mooneyfor all vms, and if not then it overlaps with the vm cores12:58
stephenfinI was thinking a similar thing but only for instances with PCPUs and only if 'cpu_shared_set' is defined12:58
stephenfin(As opposed to all instances and only if 'emulator_pin_set' is defined)12:59
sean-k-mooneyin either case if emulator_pin_set is not overlapping with the dedicated and floating pin sets then you dont need any accounting in placement12:59
*** belmoreira has quit IRC12:59
sean-k-mooneyi would do it for all instances as i would like to try and share more code between pinned and floating instances12:59
*** tbachman has quit IRC12:59
stephenfinthat makes sense13:00
sean-k-mooneylike we do with numa i would like to soft pin all vms to a floating_pin_set so you can properly reserve cores on the host for OS or vswitch use13:00
stephenfinWell, I think we'll be doing that going forward regardless13:00
sean-k-mooneywe need to do that anyway when we mix pinned and floating vm on the same host13:00
stephenfinYeah13:01
sean-k-mooneyyep13:01
stephenfinAdding another configuration option would be yet another breaking change though13:01
stephenfinSo many breaking changes13:01
sean-k-mooneyit does not need to be13:03
sean-k-mooneyhttps://review.openstack.org/#/c/555081/23/specs/train/approved/cpu-resources.rst@15713:03
sean-k-mooney[compute]13:03
sean-k-mooneypinned_cpu_set13:03
sean-k-mooneyfloating_cpu_set13:03
sean-k-mooney[libvirt]13:03
sean-k-mooneyemulator_cpu_set13:03
sean-k-mooneyi suggested deprecating cpu_shared_set and renaming/moving it to [libvirt]/emulator_cpu_set13:04
sean-k-mooneyso it will not need a config change initially although they will get a deprecation warning13:04
stephenfinHmm, I need to think about this13:06
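To illustrate the accounting point sean-k-mooney makes above (today 'isolate' bumps the requested VCPU count by one; under the proposed model it could bump PCPU instead), here is a toy sketch; it is an assumption-laden illustration, not nova's actual request-building code:

    def emulator_thread_resources(flavor_vcpus, emulator_threads_policy,
                                  use_pcpu_class):
        """Toy placement-resource calculation for a pinned instance."""
        resource_class = 'PCPU' if use_pcpu_class else 'VCPU'
        resources = {resource_class: flavor_vcpus}
        if emulator_threads_policy == 'isolate':
            # One extra dedicated host core just for the emulator threads.
            resources[resource_class] += 1
        return resources

    # e.g. a 4-vCPU pinned flavor with isolate under the proposed PCPU model
    print(emulator_thread_resources(4, 'isolate', use_pcpu_class=True))
    # -> {'PCPU': 5}
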
NewBrucesean-k-mooney / mnaser … we seem to actually have made this worse - RDO -> RDO, which was working previously, is now also broken13:07
*** pcaruana has quit IRC13:07
sean-k-mooneyis it broken the same way?13:08
NewBruceseems to be13:08
NewBrucesame 500 internal error; same duplicate port entries in ml2_port_bindings13:09
sean-k-mooneyand all RDO nodes have the same code deployed and config13:09
*** lbragstad has joined #openstack-nova13:10
sean-k-mooneythis is beginning to feel like it might be related to the specific configuration of some of the compute nodes rather than a global configuration issue, but that is a feeling rather than anything based on fact13:10
NewBrucenot identical (due to the time span taken to update), we have mostly 18.2.0, but some 18.1.0 and an 18.0.313:11
NewBrucewe did try rolling an 18.2.0 back to 18.1.0 - but that didn’t seem to help13:11
sean-k-mooneycan you try migration between two nodes of the same version on the rdo side for each of the 3 versions you have deployed13:12
NewBruceyeah, i agree its starting to feel something odd (random)… but as you say, feeling - not fact13:12
NewBruceYeah, will do now13:12
sean-k-mooneyperhaps we can narrow it down to a version and then do a git bisect for the changes13:12
mnaserhmm13:17
mnaserthis is strange13:17
mnaserNewBruce: so that cloud is 100% rocky at this point?13:17
*** jmlowe has quit IRC13:19
*** ccamacho has joined #openstack-nova13:20
*** mriedem has joined #openstack-nova13:21
*** eharney has joined #openstack-nova13:22
*** yaawang has quit IRC13:22
NewBrucemnaser yeah, 100% rocky13:22
NewBruce… in all senses of the word :D13:23
*** yaawang has joined #openstack-nova13:23
*** dtantsur|brb is now known as dtantsur13:24
sean-k-mooneythis is the delta in terms of patches https://github.com/openstack/nova/compare/18.0.3...18.1.013:25
mnaserNewBruce: curl -H "X-Auth-Token: `openstack token issue -c id -f value`" http://network-sjc1.vexxhost.us/v2.0/extensions | python -mjson.tool | grep binding-extended13:25
mnaserchange network-sjc1 with your neutron endpoint13:26
mnasercan you run that several times and see if there is ever a time that it doesn't return binding-extended?13:26
sean-k-mooneyi wonder if it could be related to https://github.com/openstack/nova/commit/4a12c9c298913f99570f2f8e93500db687e98dc913:26
sean-k-mooneyhum although that is only on revert13:28
NewBruceso an RDO 18.1.0 -> RDO 18.1.0 does the same thing; fails on duplicate ports13:29
NewBrucemnaser : curl -s -H "X-Auth-Token: $OS_TOKEN" https://kna1.citycloud.com:9696/v2.0/extensions/binding-extended | python -m json.tool13:30
NewBruce{13:30
NewBruce    "extension": {13:30
NewBruce        "alias": "binding-extended",13:30
NewBruce        "description": "Expose port bindings of a virtual port to external application",13:30
NewBruce        "links": [],13:30
NewBruce        "name": "Port Bindings Extended",13:30
NewBruce        "updated": "2017-07-17T10:00:00-00:00"13:30
NewBruce    }13:30
NewBruce}13:30
*** egonzalez has quit IRC13:30
*** egonzalez has joined #openstack-nova13:30
*** efried_pto is now known as efried13:31
NewBruce(#SoccerDad duties - back online in about 30 min)13:31
mnaserNewBruce: but did you hit it many different times, and it always gave a response?13:31
mnaserim wondering if one backend might be acting weird13:31
*** boxiang has quit IRC13:32
*** pcaruana has joined #openstack-nova13:33
*** boxiang has joined #openstack-nova13:33
sean-k-mooneymnaser: so you're suggesting running it with "watch -d -n 1 ..." and seeing if it changes13:33
mnasersean-k-mooney: well just knowing that there are probably multiple network servers, maybe keep hitting it because *maybe* one of them is responding without that extension13:34
mnaserhttps://github.com/openstack/nova/blob/stable/rocky/nova/conductor/tasks/live_migrate.py#L282 just trying to eliminate all those things here13:34
openstackgerritMatt Riedemann proposed openstack/nova stable/stein: Change a log level for overwriting allocation  https://review.openstack.org/65299413:34
sean-k-mooneymnaser: i think i asked NewBruce to check that against each of the neutron api endpoints directly in the past, bypassing the load balancer13:35
*** belmoreira has joined #openstack-nova13:35
sean-k-mooneythat would cause this issue, especially if there was a network partition or some other factor causing different nodes to prefer different api nodes13:35
sean-k-mooneybut if that was the case we would expect the same results for osa to osa too right?13:36
mnaserthat is true as well13:38
*** jmlowe has joined #openstack-nova13:38
mnasersean-k-mooney: but this is one of those issues where you're just desperately trying to do whatever works lol13:38
efrieddansmith: When you get a chance, would you please do the channel topic thing with https://etherpad.openstack.org/p/nova-spec-review-day ?13:38
openstackgerritMerged openstack/nova-specs master: Add host and hypervisor_hostname flag to create server  https://review.openstack.org/64545813:39
*** ChanServ sets mode: +o dansmith13:39
sean-k-mooneyya i know. from what we can tell it should be working, if it wasn't for the evidence that it's not - that would be my response if someone asked "should this work"13:39
*** dansmith changes topic to "Spec review day: https://etherpad.openstack.org/p/nova-spec-review-day -- Current runways: https://etherpad.openstack.org/p/nova-runways-train -- This channel is for Nova development. For support of Nova deployments, please use #openstack."13:39
*** tbachman has joined #openstack-nova13:41
openstackgerritMatt Riedemann proposed openstack/nova stable/queens: Add missing libvirt exception during device detach  https://review.openstack.org/65163913:43
openstackgerritsean mooney proposed openstack/os-traits master: add libvirt image metadata traits  https://review.openstack.org/65299613:43
*** cfriesen has joined #openstack-nova13:47
bauzasmriedem: sorry, finished earlier yesterday13:52
bauzasmriedem: could you please ping me again which stable changes I could review ?13:52
mriedembauzas: at this point, probably queens https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:stable/queens13:53
bauzasack13:53
mriedembauzas: and https://review.openstack.org/#/q/status:open+project:openstack/nova+branch:stable/pike+topic:bug/1669054 for pike13:53
dansmithefried: as a placement and request groups expert, I hope you can look at this one and provide a short path to being able to make request groups expressive enough that we don't have to keep adding legacy stuff like this: https://review.openstack.org/#/c/650963/213:57
efrieddansmith: on my list. I had suggested https://review.openstack.org/#/c/650476/ which is generating lots of discussion both on the spec and on the ML http://lists.openstack.org/pipermail/openstack-discuss/2019-April/thread.html#478213:59
dansmithefried: ack, glad to see that13:59
efriedunfortunately it's not an easy problem.13:59
dansmithstephenfin: bauzas: I trust you will be all over that ^ spec to make sure it will solve your problems13:59
efriedwhich is why we've been talking about it for a couple years14:00
*** awaugama has joined #openstack-nova14:00
*** cdent has quit IRC14:01
mriedemmelwitt: are consoles/console pools in the db model a xen only thing?14:02
mriedemah yes, "The nova-console service is deprecated as it is Xen specific"14:04
*** amodi has joined #openstack-nova14:05
melwittcool. I don't recall the term "console pools" off the top of my head14:05
*** sapd1_x has joined #openstack-nova14:06
NewBrucemnaser have run it in a loop and no changes14:07
*** markvoelker has quit IRC14:07
mriedemmelwitt: ok, reason being https://review.openstack.org/#/c/570202/7/nova/db/sqlalchemy/api.py@179414:08
*** Luzi has quit IRC14:09
mnaserhmm14:09
mriedemlooks like we likely leak consoles records when a server is deleted,14:09
mriedembut that's super latent if so, and xen specific,14:09
mriedemso i care little14:10
mriedemTheo could open a bug for it but i wouldn't hold this up for it14:10
melwitthaha, nice14:10
mnaseris migrate_data stored in a db anywhere or is it thrown into the rpc request14:10
mriedemmnaser: it's only rpc, not persisted14:10
melwittwe deprecated the os-console service kind of recently https://review.openstack.org/61007514:12
mriedemstephenfin: the bottom of your cells v1 removal series needs to be fixed14:12
stephenfinmriedem: I just saw :(14:13
melwittoh yeah, you already said that14:13
stephenfinLooking at it now14:13
mriedemmelwitt: yeah i was looking at that - that deprecates the service,14:13
mriedembut not the apis14:13
melwittah14:13
*** belmoreira has quit IRC14:13
mriedemso if we were to remove the nova-console service the os-consoles api would be broken,14:13
mriedemso we can't really deprecate the api with a microversion and expect it to still work on 2.1 w/o the service itself,14:13
mriedemwhich would mean we either (1) leave the nova-console service forever or (2) obsolete the api and drop the service, like we're doing with nova-cells and nova-network14:14
mriedemmaybe ptg topic fodder - but it'd be nice to have xen people there to actually tell us if they need the thing anymore, BobBall seemed to suggest in the ML that we didn't14:14
mriedemi don't know who works on or uses xenapi in nova anymore though14:15
NewBrucehey mriedem seen the latest in our lovely migration issues ?!14:15
openstackgerritStephen Finucane proposed openstack/nova-specs master: Standardize CPU resource tracking  https://review.openstack.org/55508114:15
mriedemNewBruce: not really - can't get the neutron api extension to go away?14:15
mriedemmelwitt: i'll just add an item in the ptg etherpad14:16
*** belmoreira has joined #openstack-nova14:16
NewBrucewe upgraded the entire site to rocky, and now RDO -> RDO fails in addition to RDO -> OSA!14:16
NewBruce(updated the launchpad)14:16
*** awalende has quit IRC14:17
*** awalende has joined #openstack-nova14:18
mriedem:/14:18
*** priteau has quit IRC14:19
mriedemwe definitely have multinode jobs running live migration in rocky that uses the new flow14:19
melwittyeah, I had thought BobBall gave the go ahead to remove it but agreed, I don't know of any people currently using xenapi. actually, I thought I've seen some patches proposed over there fairly recently. /me looks14:19
*** gaoyan has quit IRC14:20
melwittoh, nvm, it was coreycb fixing an openssl handling thing https://review.openstack.org/63553314:21
*** awalende_ has joined #openstack-nova14:21
*** awalende has quit IRC14:22
openstackgerritStephen Finucane proposed openstack/nova master: Remove '/os-cells' REST APIs  https://review.openstack.org/65129114:22
openstackgerritStephen Finucane proposed openstack/nova master: Stop handling cells v1 in '/os-hypervisors' API  https://review.openstack.org/65129214:22
openstackgerritStephen Finucane proposed openstack/nova master: Stop handling cells v1 in '/os-servers' API  https://review.openstack.org/65129314:22
openstackgerritStephen Finucane proposed openstack/nova master: Remove 'nova-manage cell' commands  https://review.openstack.org/65129414:22
openstackgerritStephen Finucane proposed openstack/nova master: Stop handling cells v1 for console authentication  https://review.openstack.org/65129514:22
openstackgerritStephen Finucane proposed openstack/nova master: Remove old-style cell v1 instance listing  https://review.openstack.org/65129614:22
openstackgerritStephen Finucane proposed openstack/nova master: Remove 'bdm_(update_or_create|destroy)_at_top'  https://review.openstack.org/65129714:23
openstackgerritStephen Finucane proposed openstack/nova master: Remove 'instance_fault_create_at_top'  https://review.openstack.org/65129814:23
openstackgerritStephen Finucane proposed openstack/nova master: Remove 'instance_info_cache_update_at_top'  https://review.openstack.org/65129914:23
openstackgerritStephen Finucane proposed openstack/nova master: Remove 'get_keypair_at_top'  https://review.openstack.org/65130014:23
openstackgerritStephen Finucane proposed openstack/nova master: Remove 'instance_update_at_top', 'instance_destroy_at_top'  https://review.openstack.org/65130114:23
openstackgerritStephen Finucane proposed openstack/nova master: Remove 'instance_update_from_api'  https://review.openstack.org/65130214:23
openstackgerritStephen Finucane proposed openstack/nova master: Stop handling 'update_cells' on 'BandwidthUsage.create'  https://review.openstack.org/65130314:23
openstackgerritStephen Finucane proposed openstack/nova master: Stop handling cells v1 for instance naming  https://review.openstack.org/65130414:23
openstackgerritStephen Finucane proposed openstack/nova master: Remove cells code  https://review.openstack.org/65130614:23
openstackgerritStephen Finucane proposed openstack/nova master: Stop handling 'InstanceUnknownCell' exception  https://review.openstack.org/65130714:23
openstackgerritStephen Finucane proposed openstack/nova master: Remove unnecessary wrapper  https://review.openstack.org/65130814:23
openstackgerritStephen Finucane proposed openstack/nova master: db: Remove cell APIs  https://review.openstack.org/65130914:23
*** ttsiouts has quit IRC14:23
*** ttsiouts has joined #openstack-nova14:24
mriedemNewBruce: and you've checked to make sure there aren't any lingering nova-compute services table records in the rdo site database with an older version causing issues?14:25
stephenfinmelwitt: This is something we could do later this cycle, right? https://review.openstack.org/#/q/topic:bp/remove-consoleauth14:25
mriedemNewBruce: and unpinned rpc versions for [upgrade_levels]/compute ?14:25
*** awalende_ has quit IRC14:25
NewBrucemriedem posted the contents of the services table to the launchpad; we have 3 entries which are not 35 but all have deleted status > 014:25
stephenfinmelwitt: Also, this has the feel of a bugfix to me. Thoughts? https://review.openstack.org/#/c/652967/14:25
*** rcernin has quit IRC14:27
mriedemstephenfin: the consoleauth thing has been really confusing for people, in rocky at least,14:27
*** janki has quit IRC14:27
mriedemif there is some migration that people need to do and we can check that in some automated way, to say "you shouldn't upgrade to train" that would be good before ripping the service out14:27
*** dtantsur has quit IRC14:27
*** ttsiouts has quit IRC14:27
*** dpawlik has quit IRC14:28
stephenfinmriedem: You mean more than the nova-status check we already have?14:28
*** ttsiouts has joined #openstack-nova14:28
mriedemstephenfin: also note that there is a rest api that relies on consoleauth https://developer.openstack.org/api-ref/compute/?expanded=create-remote-console-detail#create-remote-console14:28
stephenfinOh, I thought that was able to use the DB tokens too? I must have misread it14:29
mriedemmaybe i'm thinking of https://developer.openstack.org/api-ref/compute/?expanded=create-remote-console-detail#show-console-connection-information14:29
mriedemok remote-consoles is something else, nvm14:30
stephenfinmriedem: I had a look at that and thought it was also able to use the DB tokens14:30
mriedemi think you are https://review.openstack.org/#/c/652969/1/nova/api/openstack/compute/console_auth_tokens.py14:30
*** mlavalle has joined #openstack-nova14:30
bauzaserr, I was about to update my thoughts on the cpu-resources spec :)14:31
bauzasbut then I saw a new PS :)14:31
bauzasdammit14:31
stephenfinbauzas: Update them away :) I'll apply them retrospectively14:31
*** dtantsur has joined #openstack-nova14:31
stephenfinIt hasn't changed significantly outside of changing how we use Flavour.vcpus14:31
bauzasstephenfin: I'm trying to wrap my head around the upgrade impact14:31
melwittstephenfin: hm, yeah, I guess that xvp thing was missed.. so it feels more like a bug fix14:32
stephenfinmelwitt: Cool. I can drag that out so. I imagine no one has spotted it because no one is using it (It's Xen-specific and BobBall said we could kill it)14:32
melwittyeah, that's what I'm thinking too as far as it not being noticed14:33
stephenfinAlas, that was only deprecated last cycle so I guess we can't kill that too this cycle14:33
stephenfin(Removing nova-cells, nova-network, nova-consoleauth, nova-xvpvncproxy and the placement code in one fell swoop/cycle sure would make for interesting release note reading)14:34
NewBrucemriedem we have upgrade_levels = auto across the site, but since everything is service level 35 that shouldnt be an issue right?14:34
*** gaoyan has joined #openstack-nova14:35
*** lpetrut has quit IRC14:36
*** mdbooth has quit IRC14:37
*** priteau has joined #openstack-nova14:37
mriedemNewBruce: on a call and i'd need to load all of this context back into my head14:38
mriedembut you're talking about this check https://github.com/openstack/nova/blob/stable/rocky/nova/conductor/tasks/live_migrate.py#L5114:39
*** mdbooth has joined #openstack-nova14:40
mriedemNewBruce: "we have 3 entries which are not 35 but all have deleted status > 0" yeah those shouldn't be included in the min version check14:41
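A rough sketch of the kind of minimum-service-version lookup being discussed, assuming nova's Service object API (the exact rocky conductor code lives in the live_migrate.py link above, and 35 is the threshold mentioned in this thread); soft-deleted rows (deleted > 0) are excluded by the default read_deleted='no' query behaviour, which matches mriedem's point that they should not drag the minimum down:

    # Hypothetical helper; run where nova is already configured (e.g. inside a
    # running service or an interactive shell with nova set up), not standalone.
    from nova import context as nova_context
    from nova import objects


    def compute_services_support_extended_binding(min_required=35):
        ctxt = nova_context.get_admin_context()
        minimum = objects.Service.get_minimum_version(ctxt, 'nova-compute')
        print('minimum (non-deleted) nova-compute service version: %d' % minimum)
        return minimum >= min_required
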
NewBrucemriedem ok - ive updated the launchpad, i might have also mailed you some logs at some point14:43
mnaseroh hmm now that I think about it14:44
NewBrucebut ping me when you're off the call… i will test cold migrate tonight so that we have that as an option; the fact that it's failing on RDO -> RDO after the rocky upgrade is at least encouraging that the problem seems to be in the rocky side14:44
mriedems/rocky/rdo/?14:44
mnasercould it be possible that not all services have been restarted since everything is at rocky14:45
mriedemosa -> osa live migration is fine right?14:45
mnaserand the max rpc version is not rocky ?14:45
NewBrucemnaser very possible14:45
NewBrucehavent tested OSA - OSA yet, on my todo14:45
*** mdbooth_ has joined #openstack-nova14:45
mnaseraren't you supposed to "SIGHUP" (which is broken right now) to get new versions of rpc stuff14:45
mriedemyes14:45
mnaserso really a restart14:45
mnaserso is it possible the conductors haven't been restarted and are running older code?14:45
mriedemdoes rdo sighup rather than full restart the services on upgrade?14:45
mnaserOSA used to do sighup14:46
mnasertill we found that bug14:46
mriedemyeah so maybe rdo still does as well14:46
mriedemi was never able to recreate one of the theories about the break either https://review.openstack.org/#/c/649464/14:47
*** mdbooth has quit IRC14:48
openstackgerritMerged openstack/nova stable/rocky: Temporarily mutate migration object in finish_revert_resize  https://review.openstack.org/64869114:48
*** dklyle has quit IRC14:49
*** luksky has quit IRC14:50
*** dklyle has joined #openstack-nova14:50
*** mdbooth_ has quit IRC14:50
mriedemi.e. i'm not able to recreate the duplicate entry error in neutron when post_live_migration_at_destination updates the port's host binding14:50
mriedemsean-k-mooney: ^ were you able to recreate that with a neutron functional test?14:50
efriedmriedem: Are you my "stable release liaison"?14:51
mriedemsure14:51
efriedjust added you to https://review.openstack.org/#/c/652868/ and https://review.openstack.org/#/c/652869/14:51
efriedIt is not clear to me how much of https://review.openstack.org/#/q/status:open+(project:openstack/os-vif+OR+project:openstack/python-novaclient+OR+project:openstack/nova)+branch:stable/pike we need to merge before those ^ are a go.14:51
efriedand/or if more things need to be flushed from stein->rocky->queens->pike first14:52
sean-k-mooneymriedem: no although it kind of fell off my plate. i can try again however.14:53
mriedemefried: i've -1ed with a comment14:53
efriedthank you sir14:53
sean-k-mooneymriedem: the main issue was figuring out how to test that withing neutron exsiting test suite14:53
mnaserNewBruce: how long has this system been up for?14:54
mnaseresp the nova ctlplane processes14:54
NewBrucewe had a maintenance window maybe a month ago when everything was shutdown14:55
mnaserthere isn't a way to get the current "highest" level of detected rpc version eh14:55
NewBruceso that would have been the latest - certainly quite some time before the rest of the computes were upgraded14:55
*** helenafm has quit IRC14:57
mriedemmnaser: i've thought about exposing the service version in the api and/or a nova-manage command, but the problem is the services will cache the version so an api/cli could tell you you're at the highest but a service could be running with a lower version in its cache14:58
NewBruceis it worth restarting everything to remove this as a possibility?14:58
mriedemit looks like if you sighup nova-conductor the only thing it will do is reset that cache14:58
mriedemhttps://github.com/openstack/nova/blob/stable/rocky/nova/conductor/manager.py#L18814:59
mordredmriedem: you know everything ...15:00
mordredmriedem: https://wiki.openstack.org/wiki/VirtDriverImageProperties - are those documented anywhere other than that wiki page?15:01
mriedemhttps://docs.openstack.org/glance/latest/admin/useful-image-properties.html ?15:01
mriedemstephenfin: the table format in ^ is bad now15:01
mriedemis that a known issue?15:01
mriedemthat used to actually have grid lines like a .... table15:02
mordredoh! thanks15:02
stephenfinmriedem: kashyap spotted that a few days ago. It's a style thing done by openstackdocstheme15:02
sean-k-mooneymordred: they are ment to be documened in teh glance metadef too but several are missing15:03
kashyapstephenfin: What is "that"?  The "if you reference the same ref twice you'll see funny rendering"?15:03
mriedemmordred: there are some missing from that docs page too - like sean-k-mooney said for metadefs15:03
stephenfinkashyap: The lack of borders on tables15:03
sean-k-mooneymordred: stephenfin is working on a better way to define and validate image properties and flavor extra specs15:03
mriedemi try to report glance bugs when i find missing properties in nova code15:03
kashyapstephenfin: Ah, I missed the context earlier15:03
mriedemhttps://bugs.launchpad.net/glance/+bug/181189715:03
openstackLaunchpad bug 1811897 in Glance "Useful image properties in glance - hw_disk_bus is also used by the vmware driver" [Undecided,New]15:03
mriedemhttps://bugs.launchpad.net/glance/+bug/180886815:03
openstackLaunchpad bug 1808868 in Glance "Useful image properties in glance - hw_cdrom_bus is not documented" [Medium,Confirmed]15:03
mordredsean-k-mooney: I would support anything that provides a better way to define and validate image properties :)15:03
mriedemhttps://bugs.launchpad.net/glance/+bugs?search=Search&field.bug_reporter=mriedem&orderby=-datecreated&start=0 etc15:04
sean-k-mooneymordred: in theory they should all be defined here https://github.com/openstack/glance/tree/master/etc/metadefs15:04
mordredtrait:<trait_name> = required is a really special interface15:04
sean-k-mooneybut it has not been maintained for all new specs and across all drivers15:05
mordredsean-k-mooney: of course it hasn't :)15:05
*** cdent has joined #openstack-nova15:05
sean-k-mooneymordred: well we never added any testing to enforce it so it never will be.15:05
mriedemi found out the other day that azure has a completely undocumented templating rest api and i was pretty surprised and somehow happy that even a giant closed source thing like azure has poor documentation15:06
* mordred cries15:06
mriedemsean-k-mooney: core reviewers in nova can certainly say "i'm not going to approve your shiny nugget until i see the glance docs patch written"15:06
mordredalso - fwiw - in the docs, it says auto_disk_config should return true or false15:06
mordredand on rackspace it returned "disabled"15:06
mordredso - you know - there's that15:07
sean-k-mooneymriedem: true we have mentioned that for some of the recent windriver ones like vTPM15:07
sean-k-mooneyit's a relatively trivial change if you do it when you add the feature15:07
mriedemmordred: auto_disk_config is a string field in the nova schema so it can be whatever as far as nova is concerned15:07
mordredmriedem: awesome15:07
mnasermriedem: I'd probably avoid suggesting SIGHUP-ing things for now till we figure out the oslo.service stuff, but yeah, maybe worth restarting nova-conductor to reset its cache I guess15:08
mriedemit does look like the xen driver tries to treat it as a bool though15:08
sean-k-mooneymordred: you should review https://review.openstack.org/#/c/638734/15:08
mriedemlike other booleans in the openstack rest api like 1/yes/true/True etc15:08
mordredare _all_ of the extra properties like that string fields?15:09
mriedemmordred: btw, i think i kind of have a monopoly on depressing topics in this channel and i will fight you over territory15:09
*** seyeongkim has quit IRC15:09
mordredmriedem: I will not fight back - I concede your supremacy in depressing topics in this channel15:09
*** seyeongkim has joined #openstack-nova15:10
mriedemmordred: not all https://github.com/openstack/nova/blob/master/nova/objects/image_meta.py#L23315:10
mordredI claim that monopoly in #openstack-sdks15:10
mriedemeverything in ^ should be documented in https://docs.openstack.org/glance/latest/admin/useful-image-properties.html and in glance metadefs15:10
sean-k-mooneymordred: where there is a finite set we restrict it but where there is not we don't15:10
mriedemanything not in https://github.com/openstack/nova/blob/master/nova/objects/image_meta.py#L233 will fail in nova (unless you've forked nova)15:10
sean-k-mooneymriedem: well if the key does not match anything in that we will allow it15:11
mordredok. so that nova file is essentally the ultimate truth15:11
sean-k-mooneye.g. you can define my_random_metadata_property=whatever15:11
*** mdbooth has joined #openstack-nova15:11
mriedemsean-k-mooney: ummm, are you sure?15:11
sean-k-mooneyyes, if we can't then we broke our api15:11
mriedemi'm pretty sure we intentionally broke the api for this15:12
mriedemand told people to upstream their forks15:12
mriedemthis is also why we haven't codified flavor extra specs15:12
dansmithyeah, pretty sure you can have anything you want, we just enforce the format of the ones we know about15:12
dansmithbecause otherwise you can't use some of the things like jsonfilter right?15:12
sean-k-mooneyoperators use this for steering guests to specific hosts using the image properties filter15:12
mriedemthe image properties filter works on like 3-4 known properties15:13
mordredhave I mentioed how much it sucks that images in v2 just take the extra properties in the root of the image object?15:13
mriedemyou mean flat rather than a special 'properties' sub-dict?15:14
mordredv1's subdict where user-defined metadata went is SO MUCH BETTER15:14
mordredyeah15:14
mordredflat is horrifically terrible15:14
sean-k-mooneyhttps://github.com/openstack/nova/blob/master/nova/scheduler/filters/aggregate_image_properties_isolation.py i think supports all the image metadata keys15:14
mriedemnow you get to get the base properties and subtract anything to figure out the extras15:14
mriedemweeee15:14
mordredyup15:14
mordredespecially since some of the base properties have data types that aren't string15:14
mordredso if you _don't_ deal with all of the base properties, you can be really screwed15:15
dansmithmriedem: this is where we would enforce that all the ones in the dict have to be known right? https://github.com/openstack/nova/blob/master/nova/objects/image_meta.py#L580-L59815:15
*** gtema has joined #openstack-nova15:15
mriedemdansmith: i believe so15:17
openstackgerritMerged openstack/nova stable/rocky: Error out migration when confirm_resize fails  https://review.openstack.org/65212715:17
dansmithmriedem: so, I think we only process the ones we know about, ignore anything else you throw in there15:17
sean-k-mooneyya, that's what i'm seeing too15:17
mriedemi remember people bringing this up in the ML but maybe they were just saying, "my special unicorn properties don't make it down to my forked compute manager code, why not?"15:18
sean-k-mooneyi don't remember a spec for removing this, so this is a regression as it is an api change15:18
dansmithmriedem: that's a different thing15:18
mriedemsean-k-mooney: this has been this way for *years*15:19
mriedemdanpb did all this work15:19
dansmithmriedem: the unknown image props don't get put into the object, so the compute nodes never see them, that's definitely true15:19
sean-k-mooneywell the unknown ones are for the scheduler only15:19
mriedemsean-k-mooney: https://github.com/openstack/nova/blob/master/nova/scheduler/filters/aggregate_image_properties_isolation.py#L45 is an ImageMetaProps object,15:20
mriedemso if we don't know about the property, it won't be in that object and the filter can't filter on it15:20
mriedemhttps://github.com/openstack/nova/blob/master/nova/scheduler/filters/aggregate_image_properties_isolation.py#L5615:20
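(For illustration: a simplified stand-in for the behavior just described, where only known properties survive into the ImageMetaProps object, so scheduler filters and the compute manager never see custom keys. This is not nova's actual implementation.)

    # Simplified stand-in, not nova's real ImageMetaProps: unknown keys are
    # silently dropped, so filters and the compute manager never see them.
    KNOWN_PROPS = {'hw_disk_bus', 'hw_cpu_policy', 'hw_architecture'}  # illustrative subset

    class FakeImageMetaProps:
        def __init__(self, props):
            self._props = {k: v for k, v in props.items() if k in KNOWN_PROPS}

        def get(self, name, default=None):
            return self._props.get(name, default)

    props = FakeImageMetaProps({'hw_disk_bus': 'virtio',
                                'my_random_metadata_property': 'whatever'})
    assert props.get('hw_disk_bus') == 'virtio'
    assert props.get('my_random_metadata_property') is None  # dropped, filters can't see it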
dansmithsame for image_props_filter15:22
*** hamzy has quit IRC15:22
dansmithI guess jsonfilter is only on host state15:22
mriedemyeah the jsonfilter is a whole other unvalidated piece of gorp15:23
dansmithyeah, but I thought it could operate on the image too, but no15:23
NewBruceheading off line for a while; bbl15:23
*** gaoyan has quit IRC15:23
dansmithwell, anyway, we definitely removed some functionality in the image props filter when we did that, but (a) it's been a long time with no real complaint that I know of and (b) I wouldn't call it an api breakage15:24
mriedemliberty https://review.openstack.org/#/c/76234/15:25
dansmither, hmm, maybe we didn't actually15:25
dansmithI thought image props filter could do more generic matching, but it doesn't15:25
mriedemright it's like 3 or 4 properties15:26
mriedemhw version, type, arch something like that15:26
mriedemvery specific15:26
dansmithwhich is why I was thinking about the jsonfilter15:26
dansmithso if there's not some other filter I'm thinking of I guess we're good15:26
mriedemit's AggregateImagePropertiesIsolation15:26
mriedemthat's the generic one15:26
sean-k-mooneyok, do we have the same restriction now with flavor extra specs15:26
mriedemto tie aggregates to images15:26
dansmithmriedem: ah, just a straight match to host meta I see15:27
sean-k-mooneyyes15:27
*** gtema has left #openstack-nova15:27
sean-k-mooneythis was used for things like "run this image on the NFV aggregate"15:27
*** pcaruana has quit IRC15:27
dansmithsean-k-mooney: no such restriction with flavors15:27
dansmithmriedem: mostly unrelated to this, I wanted to throw something out there just for maybe future use15:29
mriedemtotally unrelated to this, but there are 2 simple changes below https://review.openstack.org/#/c/570202/ (which i need for cross-cell resize as well as rebuild from cell0) that have a +2 and i'm looking for another core to hit those15:29
dansmithmriedem: when I was reading the two numa specs today I was reminded of something I was thinking about earlier, related to cases where we have instances which store old-format data, like something stuck in their flavor which needs to be migrated15:30
sean-k-mooneydansmith: ok, i had thought we still supported "bring your own" metadata keys for images too, i guess not15:30
*** pcaruana has joined #openstack-nova15:30
dansmithmriedem: things that we would need to online_migrate or something, and things that we would be tempted to resolve with "meh, just migrate all your instances left one rack to clean those up".. kinda like the recent discussion about reshape for that stuff15:31
dansmithmriedem: we might benefit from a "needs upgrade" flag on the instance, that would be shown in detailed list, to admins only,15:31
dansmithwhere I could list all tenants and see instances that have "needs upgrade"15:31
sean-k-mooneyso the old use case would be achieved with a member_of request in the flavor extra spec15:32
dansmithanything could set that flag, like compute manager when it notices some legacy data, or even libvirt when it notices something like a disk format that needs to be upgraded or something15:32
mriedemdansmith: if the thing setting the flag has to already calculate that it will set the flag, why not just do the online migration right then?15:32
*** ivve has quit IRC15:33
dansmithmriedem: well, because things like the resource topology thing would need to be done to all instances on the compute node at the same time, and after a config change15:33
dansmithmriedem: referring to the thing stephenfin and jaypipes and bauzas and I were discussing last week15:33
bauzasFWIW, on a 1.5h meeting atm15:34
bauzasbut listening15:34
mriedemso instances a,b,c need an upgrade, but the admin doesn't know how to upgrade them, i.e. is it migrating them, running the online_data_migrations cli, restarting the compute they are on, etc15:34
mriedemif the flag were an enum that's one way15:35
*** belmoreira has quit IRC15:35
dansmithmriedem: yeah, so it'd be nice if we also tagged "issues" to the instance that you whittle down until the upgrade flag goes away, but I think that's too heavy for the moment.. but if we LOG.warning() for each instance that we were setting the flag on, then you would at least have a record15:35
dansmithmriedem: yeah, could be iterative, like service_version15:35
dansmithif we always did those things in order15:36
sean-k-mooneydansmith: could you tie it into a nova status check or something15:36
mriedemi don't think i'd rely on someone catching that warning15:36
openstackgerritMerged openstack/nova stable/rocky: Delete allocations even if _confirm_resize raises  https://review.openstack.org/65214615:36
mriedembefore their logs wrap or something15:36
mriedemdepends on how long they retain stuff in es15:36
mriedemwe already blast the logs with warnings that aren't useful15:37
dansmithmriedem: sure, it just makes it lighter for the first rev of the idea, but if we made it a "schema version" sort of iterative "level" like service_version that could be easy, as you would look up the reason in the decoder ring15:37
cdentIs there something up with the gate today? queue seems long and slow15:38
dansmithand we could translate it in the api to "is current or not" if the version is backlevel (and show the version I guess)15:38
mriedemcdent: it was like that yesterday too15:38
mriedemdansmith: i'm not sure what service_version you're referring to15:38
mriedemjust the Service.version?15:38
dansmithno,15:39
openstackgerritMerged openstack/nova stable/rocky: Don't warn on network-vif-unplugged event during live migration  https://review.openstack.org/65179715:39
dansmithSERVICE_VERSION is a global counter of things we've done to services, so we can tell if they're up to date or not15:39
dansmithwe could have a similar global "instance version" which was a little more like a schema version, where we record whether instance records had been modified for a specific transition15:40
*** pcaruana has quit IRC15:40
dansmiththis is not important right now, I was thinking you'd latch onto this so we could expose more info to the operators about upgraded-ness, but I'll just bring it up the next time it's appropriate, like in those specs15:41
mriedemack15:41
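(Purely illustrative of the idea dansmith is floating; nothing like this exists in nova: a global "instance data version" counter with a decoder ring, in the spirit of SERVICE_VERSION.)

    # Hypothetical only: none of these names exist in nova.
    INSTANCE_DATA_VERSION = 2
    INSTANCE_DATA_VERSION_HISTORY = {
        1: 'flavor stored in new format',
        2: 'CPU resources reshaped to PCPU inventories',
    }

    def needs_upgrade(instance_version):
        # An admin-only field in server detail could be derived from this.
        return instance_version < INSTANCE_DATA_VERSION

    def outstanding_transitions(instance_version):
        # The "decoder ring": which transitions this instance still needs.
        return [desc for version, desc in sorted(INSTANCE_DATA_VERSION_HISTORY.items())
                if version > instance_version]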
bauzassean-k-mooney: so, about the upgrade impact of the cpu-resources thing, I had an idea15:42
bauzaswhat we could do is potentially leave operators upgrading to Train without any impact15:42
bauzasie. config options act exactly like Stein15:42
bauzas(for the existing ones)15:43
mriedemcdent: http://grafana.openstack.org/d/T6vSHcSik/zuul-status?panelId=3&fullscreen&orgId=115:43
bauzasbut, before you would like to use new config options, you'd have to say "hey, reshape my stuff"15:43
sean-k-mooneybauzas: yes. i suggested not enabling any of the new functionality if the new option was not set15:43
bauzasfor this host15:43
sean-k-mooneyyep15:43
sean-k-mooneyso they would upgrade15:43
bauzasand in this case, we would model the new inventories15:43
cdentmriedem: "data without context can never be information"15:43
sean-k-mooneythen move vms if needed and then enable the new functionality15:44
bauzasfrom an operator pov, it would mean "upgrade your cloud, upgrade your hosts"15:44
sean-k-mooneyso they could do a rolling config update after the upgrade15:44
mriedemcdent: just saying there is a spike in the check queue so things are maybe slow as a result15:44
bauzaswhen you're done, then upgrade your config when you want15:44
sean-k-mooneyyep15:44
bauzasa nova-status check could do the pre-flight check15:44
sean-k-mooneyi think we are on the same page for that workflow15:44
cdentmriedem: sure, but we knew that already (or at least I did)15:44
sean-k-mooneybauzas: it requires us to support both codepaths in train15:45
sean-k-mooneythen remove the old one in U15:45
bauzassean-k-mooney: yup, this was my concern15:45
bauzasbut15:45
sean-k-mooneybut i think that will help with upgrades significantly15:45
mriedemcdent: well you didn't tell me that, so who's misinformed now?!15:45
bauzaswith the addition being: trigger a reshape by the operator15:45
bauzasor15:45
bauzastrigger a reshape by the config change15:46
cdentheh. touche15:46
*** pcaruana has joined #openstack-nova15:46
sean-k-mooneybauzas: yep i was going to suggest that too.15:48
sean-k-mooneystephenfin: ^ when you get a chance after this call, does the above make sense?15:48
mriedemstephenfin: https://review.openstack.org/#/c/651291/315:49
mriedemyou've got a pep8 failure now in your bottom cells v1 removal series15:49
mriedemi pulled it down since i didn't want to wait for zuul15:49
*** gyee has joined #openstack-nova15:50
openstackgerritMerged openstack/nova master: Add minimum value in max_concurrent_live_migrations  https://review.openstack.org/64830215:50
openstackgerritMerged openstack/nova stable/rocky: libvirt: disconnect volume when encryption fails  https://review.openstack.org/65179615:50
openstackgerritMatt Riedemann proposed openstack/nova stable/queens: libvirt: disconnect volume when encryption fails  https://review.openstack.org/65303315:53
melwittdansmith: the other day when you mentioned about how ideally quota usage counting from placement would take max(old, new) flavor for a resize, were you meaning only a same host resize? or also for a move?15:53
dansmithmelwitt: well, definitely for same-host, but I think it's probably reasonable for either too, depending on your view of it15:54
dansmithmelwitt: and depending on the resource type15:55
*** pcaruana has quit IRC15:55
dansmithmelwitt: as a user, I would probably think it's kinda silly to consume sum(old, new) of my resources during a resize of any kind because I'm not able to use both resources simultaneously, and the fact that they're counted against my old instance for a while is just an artifact of how nova does the two-phase resize, you know?15:56
melwittdansmith: ok, thanks. just wanted to make sure I understood. I've been thinking about "how would we do that?" eventually and was thinking, would that be done on the placement side /usages API? when it detects allocations for the same 'instance' type consumer for the same resource class? or would this be something we do on the nova side somehow?15:56
dansmithas an operator, I might want to "charge" the user for those resources until they confirm/revert, but that too is an artifact of how nova behaves15:57
dansmithmelwitt: that would be super nova-specific behavior, so I would not think that should ever go into placement15:57
mnasercan someone point me to where network_info is updated from?15:57
melwittdansmith: yeah, I agree from a user perspective definitely sum(old, new) is weird. but I was feeling unsure because of how we really are holding resources in two places15:58
mnaserI have unsuccessfully been digging the code, and I have a cloud here where a bunch of instances have network_info=[]15:58
mriedemmnaser: for neutron, nova.network.neutronv2.api.API._get_instance_network_info or something like that15:58
dansmithmelwitt: right, that's the operator argument, but it still seems silly to me15:58
dansmithmelwitt: that's kinda overhead and transient15:58
melwittdansmith: ok. that was my assumption but like.... we wouldn't be able to just use the /usages API simply anymore. how would we get all the info about what /usages are part of a resize etc. I can't even think about it right now15:58
dansmithmelwitt: "cost of business"15:59
*** ttsiouts has quit IRC15:59
mriedemmelwitt: i don't think you can unless placement has consumer types15:59
*** ttsiouts has joined #openstack-nova15:59
melwittdansmith: yeah, I agree with that too, overhead and transient15:59
dansmithmelwitt: well, the reserved resources are owned by a migration and not an instance, but yeah, consumer types :)15:59
mriedemmnaser: https://github.com/openstack/nova/blob/master/nova/network/neutronv2/api.py#L182415:59
melwittmriedem: yes, definitely true, need consumer types but even still, I'm not 100% sure that would be enough16:00
mriedemmnaser: that's called from here https://github.com/openstack/nova/blob/master/nova/network/base_api.py#L25316:00
melwittand what would the calls look like to work that out on the nova side16:00
mriedemGET /usages?project_id=foo&consumer_type=instance16:01
mnasermriedem: it doesn't seem like there's a task that just calls get_instance_nw_info() from time to time ,gr16:01
*** tesseract has quit IRC16:01
mriedemmnaser: oh but there is16:01
mriedemmnaser: question is, which release is this cloud on?16:01
mnaserrocky16:01
melwittmriedem: yeah, but then you get a sum of everything. how to pick apart and remove the min(old, new)?16:01
dansmithmelwitt: no16:01
*** pcaruana has joined #openstack-nova16:02
mnaser*something* has set the network info cache to empty.. dunno what/why yet16:02
dansmithmelwitt: that would exclude all the migration-held resources16:02
melwittno :)16:02
mriedemmnaser: i've got the bug for you and patch, sec16:02
melwittoh it would? I guess I didn't know that16:02
melwittoh, because the consumer type would be 'migration'?16:02
dansmithmelwitt: if you only show instance-held resources?16:02
dansmithmelwitt: right, what else would you use consumer types for in this case? :)16:02
mriedemright if you want to know new flavor usage, you'd filter on consumer_type=instance,16:02
mriedemif you want to know old flavor usage, you'd filter on consumer_type=migration16:03
dansmiththe only limitation there would be that you'd get current not max(current, old) but I think that's okay16:03
mriedemunless,16:03
dansmithmax() would be nice to charge them for the most they're potentially going to use at any given point to avoid a revert-to-bigger making them go over quota16:03
mriedemyou then create some more servers filling up your quota and then you can't revert the resize16:03
melwittok, so you'd have to have two queries, one for the 'instance' consumer type and one for the 'migration' consumer type in order to take max(old, new)16:03
mriedemdansmith: jinx16:04
dansmithmelwitt: or get them back grouped by type in one query16:04
*** ttsiouts has quit IRC16:04
melwittok, I see16:04
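(A sketch of the nova-side accounting this would enable if placement's GET /usages grew a consumer_type filter; at this point that parameter is only a proposal, and the placement client helper below is hypothetical.)

    # Hypothetical sketch: count quota usage as max(instance-held, migration-held)
    # per resource class, assuming a consumer_type filter on GET /usages.
    def usages_by_type(placement, project_id, consumer_type):
        # e.g. GET /usages?project_id=<id>&consumer_type=<type>
        resp = placement.get('/usages', params={'project_id': project_id,
                                                'consumer_type': consumer_type})
        return resp['usages']  # e.g. {'VCPU': 8, 'MEMORY_MB': 16384}

    def quota_cpu_ram_usage(placement, project_id):
        instance_usage = usages_by_type(placement, project_id, 'instance')
        migration_usage = usages_by_type(placement, project_id, 'migration')
        # max() rather than sum(): don't double-charge a resize that parks the
        # old flavor's resources on the migration consumer.
        classes = set(instance_usage) | set(migration_usage)
        return {rc: max(instance_usage.get(rc, 0), migration_usage.get(rc, 0))
                for rc in classes}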
mriedemmnaser: https://review.openstack.org/#/c/591607/16:05
mriedemhttps://bugs.launchpad.net/nova/+bug/175192316:05
openstackLaunchpad bug 1751923 in OpenStack Compute (nova) "_heal_instance_info_cache periodic task bases on port list from nova db, not from neutron server" [Medium,Fix released] - Assigned to Maciej Jozefczyk (maciej.jozefczyk)16:05
mriedemmnaser: that was something from the public cloud wg (ovh worked the fix i started), and our public cloud ops team needed it as well16:05
*** mdbooth has quit IRC16:05
mriedemb/c of the same thing you said - network_info gets wiped out and the heal task wouldn't refresh from neutron, but from the cache itself, which is ... dumb16:05
mriedemnote there is a pretty beefy data migration patch before that in the series16:06
mriedemwhich is why we haven't backported it16:06
mnaserah this is tein16:06
mnaserstein16:06
mnaserpoop16:06
mriedemmnaser: i think you also mentioned something like this a few weeks ago which prompted me to write this https://review.openstack.org/#/c/640516/16:07
mriedem^ seems obvious to me, but when i dug into history there were previous attempts to do the same which were reverted because of "potential race issues" or something16:07
mriedemthose were also many years ago so idk if they'd still exist16:07
mnaserI dunno how this kinda just appeared out of nowhere16:08
mnaserqueens cloud upgraded to rocky and poof16:08
mriedemmnaser: that bug has some scenarios where people hit it16:08
*** pcaruana has quit IRC16:08
mriedemmnaser: https://bugs.launchpad.net/nova/+bug/1751923/comments/4 is the case our ops team hit - changing policy16:09
openstackLaunchpad bug 1751923 in OpenStack Compute (nova) "_heal_instance_info_cache periodic task bases on port list from nova db, not from neutron server" [Medium,Fix released] - Assigned to Maciej Jozefczyk (maciej.jozefczyk)16:09
mnasermriedem: ok so I assume _get_ordered_port_list() will not work in rocky16:09
mnaserbecause of the lack of index16:09
mriedemmnaser: it depends on if the instances were created after mitaka16:10
mriedembecause it relies on the virtual interface record and we didn't start creating those for servers until newton16:10
mriedemhence the online data migration https://review.openstack.org/#/c/614167/16:10
mriedema bug was reported last week for that online data migration as well https://bugs.launchpad.net/nova/+bug/1824435 but so far we don't have a reproducer16:14
openstackLaunchpad bug 1824435 in OpenStack Compute (nova) stein "fill_virtual_interface_list migration fails on second attempt" [High,Triaged]16:14
mnaseroh well16:15
mnaserthis cloud has been running since queens16:15
mnaserso I guess I could get away with running this once?16:15
bauzasdansmith: thanks for your comments on https://review.openstack.org/#/c/650963/16:15
mriedemif all the instances on that cloud were created since at least queens you should be ok16:15
bauzasdansmith: I don't disagree with you and I understand your concerns16:16
mriedemmnaser: you could find out by comparing the instances table count to the virtual_interfaces table or something, i.e. is there at least one vif per instance?16:16
bauzasdansmith: I guess we could just have a 'preferred' policy16:16
bauzasdansmith: would that be okay for you ?16:16
dansmithbauzas: meaning, not add the knob, and just make the behavior always "preferred" right?16:17
stephenfinmriedem: Sorry, I rushed that. I assume if you've pulled it down I should leave it alone16:17
bauzasyup16:17
bauzasno UX16:17
sean-k-mooneypreferred in the context of?16:17
sean-k-mooneyvgpu numa?16:17
bauzasI just need to think about the upgrade tho16:17
bauzasbut I don't think it would be a problem16:18
sean-k-mooneywasn't that the suggested default in the spec16:18
bauzassean-k-mooney: correct16:18
stephenfinIt's dumb but running tox cripples Bluejeans and any day that I've loads of meetings (like today) means I can't kick off anything locally that's going to run in the background16:18
bauzassean-k-mooney: nope, the default was 'nothing changes'16:18
dansmithbauzas: yes I think that'd be ideal16:18
sean-k-mooneyoh well, preferred was just going to be implemented by a weigher, right16:18
bauzasdansmith: okay, I'll -W the spec and work on a new PS16:19
dansmithcool16:19
sean-k-mooneywe will prefer hosts that can provide numa affinity but not filter any out16:19
sean-k-mooneylike the pci weigher16:19
mnasermriedem: SELECT instances.uuid, COUNT(virtual_interfaces.uuid) FROM instances LEFT JOIN virtual_interfaces ON virtual_interfaces.instance_uuid = instances.uuid WHERE instances.deleted=0 GROUP BY instances.uuid;16:19
mnasershows 1/2 but no 0's16:19
bauzassean-k-mooney: yup, just a way to see whether we can have *some affinity*16:19
bauzasif no affinity possible, fine16:19
mnaserso I guess I can cherry-pick that code and run it once..16:19
sean-k-mooneyyep16:20
sean-k-mooneythat is why i assumed it would be the default16:20
sean-k-mooneybecause it's best effort16:20
mriedemmnaser: cool, let me know how it goes16:20
bauzassean-k-mooney: the concern wasn't really the default16:20
bauzasvalue*16:20
bauzassean-k-mooney: the concern is more whether we want to introduce more knobs, and 'required' needed those16:20
openstackgerritMohammed Naser proposed openstack/nova stable/rocky: Force refresh instance info_cache during heal  https://review.openstack.org/65304016:21
bauzasanyway, I'll just write a new revision, and people could chime on it16:21
stephenfinbauzas, sean-k-mooney: RE: the cpu-resources discussion above, it's not nova-compute that makes the call to placement16:22
bauzasstephenfin: which call are we talking about ?16:22
stephenfinthe claim16:23
bauzasyup, I don't disagree16:23
bauzasbut then I don't get your point16:23
mnaserat least it applies cleanly16:24
* mnaser needs to run to an appointment quickly16:24
bauzasmy upgrade concerns are about when and how we should transform inventories16:24
bauzas(and move allocations accordingly)16:24
stephenfinbauzas: We'd either have to request a certain amount of PCPU resources, which couldn't be fulfilled since we're not reporting any inventory (because the config values haven't been set)16:25
stephenfinOr we'd have to keep requesting VCPU resources for everything, which borks the whole idea16:25
bauzasI tend toward the latter16:25
bauzasif people start asking for PCPU resources, they necessarily have to be fulfilled by hosts ready to accept them, I don't disagree16:26
bauzasbut then, Train is necessarily a mitigation release16:27
bauzasbecause of computes16:27
bauzaswe did have the same problem with rolling upgrades16:27
bauzasyou can't really make use of a feature unless all computes are up to date in general, in particular when it comes to resources usage16:27
stephenfinbauzas: hmm, that removes our ability to transform 'hw:cpu_policy=dedicated' under the hood though16:29
bauzasthat doesn't mean operators can't use PCPU requests with Train16:29
stephenfinand if a CPU is part of the PCPU pool, it can't be part of the VCPU pool16:29
bauzasbut they absolutely need to converge all their inventories *before* they use the PCPU request16:29
stephenfinwithout the transform in place, any existing instances can't be migrated (they wouldn't be using PCPUs but rather the legacy 'hw:cpu_policy=dedicated' extra spec) plus flavours and images would not be requesting the correct stuff16:31
stephenfinThe point is I think we need that rewriting in place, and if we need that then we need some initial PCPU inventories in place as soon as the upgrade is in place, otherwise the ability to move existing instances or create new ones is gone :(16:32
stephenfinThis could really do with a call, I think16:32
sean-k-mooneyprobably16:34
*** pcaruana has joined #openstack-nova16:36
bauzasstephenfin: I have to leave in a few, but I'll summarize my thoughts16:36
bauzaswe could just leave existing options as they are and leave inventories be VCPU16:37
bauzasbut16:37
bauzasonce the operator sets config options (and we can discuss this specific trigger later), then we do a reshape for this host and split the VCPU inventory into VCPU and PCPU inventories16:38
bauzasand we accordingly move allocations16:38
bauzaswith the slight detail that we verify resources *before* doing the reshape so we can raise an exception16:38
bauzasat compute startup16:38
bauzasfor requests, we somehow need to make sure that we transform requests by using PCPU based on a specific point in time16:39
bauzaseither by providing a flag16:40
bauzasor by verifying the PCPU inventories16:40
bauzasstephenfin: thoughts on that ?16:40
bauzasothers: too16:40
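(A very rough sketch of the startup-time reshape bauzas is describing: if the new dedicated-CPU option is set, split the existing VCPU inventory into VCPU and PCPU and move allocations, failing before the reshape if the new pools can't hold the existing pinned allocations. All names below are placeholders, not the actual virt driver or report client interfaces.)

    # Rough, hypothetical sketch of the startup-time reshape described above.
    def maybe_reshape(provider, allocations, cpu_dedicated_set, cpu_shared_set):
        if not cpu_dedicated_set:
            # Old-world config: keep reporting everything as VCPU, no reshape.
            return

        # Pre-flight check *before* touching anything: the existing pinned
        # allocations must fit into the new PCPU pool, otherwise raise and
        # let the compute service fail to start.
        pinned = sum(a['VCPU'] for a in allocations if a.get('pinned'))
        if pinned > len(cpu_dedicated_set):
            raise RuntimeError('cannot reshape: %d pinned CPUs allocated but '
                               'only %d dedicated CPUs configured'
                               % (pinned, len(cpu_dedicated_set)))

        # New-world inventories: split the single VCPU inventory in two.
        provider.set_inventory({'VCPU': len(cpu_shared_set),
                                'PCPU': len(cpu_dedicated_set)})

        # Move allocations for pinned instances from VCPU to PCPU.
        for alloc in allocations:
            if alloc.get('pinned'):
                alloc['PCPU'] = alloc.pop('VCPU')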
*** rpittau is now known as rpittau|afk16:43
openstackgerritLee Yarwood proposed openstack/nova-specs master: Re-propose stable device rescue for Train  https://review.openstack.org/65115116:43
stephenfinbauzas: If that happens though, then everything has to be done at once.16:45
bauzasstephenfin: that happens what ?16:46
bauzasasking for PCPU ?16:46
stephenfinif we wait until some point in time to start reporting PCPU16:46
stephenfinso if we say the operator setting this configuration option is that point in time, then we must also ensure the operator also twists the knob that says "transform all my legacy extra specs/image meta to PCPU requests"16:47
*** tssurya has quit IRC16:47
efriedWe said we weren't going to allow reshapes except on upgrade boundaries. When "we" said that, I was a dissenting vote. There are a number of specs we're discussing in the current cycle where that is going to need to be re-evaluated.16:48
stephenfin(that knob has to exist because there isn't a point in time where we can have no PCPU resources, so therefore we can't always transform)16:48
efriedWe must either allow reshapes "any time" (conceivably with a compute service restart), or stick to the above guns and require a host to be cleared out before configuration tweaks are done. The latter means we're not reshaping, just shuffling inventory around in a regular update_provider_tree code path, because we don't have allocations to dork with.16:49
bauzasefried: we said earlier that the latter is highly terrible for ops16:50
*** bryan_stephenson has joined #openstack-nova16:50
bauzasearlier being last week16:50
bauzasefried: so that's why I'm considering a reshape at compute startup16:50
bauzasbased on config flags modification16:50
dansmithefried: what you mean is make reshapes obligatory and not contingent on a config value or something right?16:50
efriedI'm in favor of that.16:50
bauzasand the request knob be ops-driven, I like this16:51
dansmithefried: because I think the two are not mutually exclusive, if the shape of the reshape would be defined by something in config (i.e. how many cpus are dedicated vs. shared)16:51
efrieddansmith: I mean allow reshapes to happen when they need to happen, rather than restricting them to upgrade boundaries. That's what bauzas is talking about as well.16:51
efriedYes, dansmith and bauzas we still need to fail the reshape if it entails moving allocations in an impossible way.16:52
dansmithefried: right I know, but I think there's subtlety here16:52
*** markvoelker has joined #openstack-nova16:52
bauzasefried: to be fair, VGPU reshapes are done on compute startup already, not upgrade :)16:52
bauzasof course, it will in theory run once, after upgrading16:52
dansmithefried: we have to maintain config compatibility, but if we don't have information in the N-1 config to do the reshape, then we have to be able to punt the reshape (triggered by an upgrade) until after the config is updated16:52
efriedlike if you suddenly specify your PCPU pinset to be empty, but have instances running with dedicated CPUs, that's a fail.16:52
dansmithand, I agree that if you have to change how many pcpus are dedicated, we have to reshape again16:53
efriedI think we're on the same page16:53
bauzasefried: that's why I proposed to check the allocations and inventories *before* providing the reshape16:53
*** ccamacho has quit IRC16:53
bauzasif the operator changes the config, but placement says "sorry but you can't", then the compute will fail to restart16:53
dansmithefried: I think the thing I don't want, which I expressed as "only at upgrade time" is something like we reshape every time we restart compute because we decide we can arrange things better, or some state in the db has changed, but reshape due to a config/structural change makes sense16:54
bauzasif the operator changes the config, and placement resources are okay, then the driver returns a ReshapeNeeded16:54
efrieddansmith: wfm16:54
bauzasand then the new inventories and allocations16:54
bauzasokay, so dansmith, efried and I are on the same page16:54
bauzasthere is one last concern from stephenfin about the config knob for the PCPU request16:55
efriedSo e.g. in the PCPU spec, we're inferring the counts and pinsets of VCPUs vs PCPUs based on existing conf options.16:55
bauzasbut I think it's okay to make the request transformation to be "config-driven"16:55
efriedSo the operator needs to change to the new config in such a way that it *exactly* matches what we inferred, right?16:56
bauzasso, the operator would basically tell us when they're okay to count PCPUs (i.e. probably after the config change on all nodes)16:56
*** markvoelker has quit IRC16:56
efriedOtherwise we don't just need a reshape (move allocations) - we would also possibly need to re-pin guests to different physical processors and such.16:56
bauzasefried: no, I'm saying that existing config will report VCPUs anyway16:57
bauzas(including options that were asking for pinned cpus)16:57
bauzasefried: only the new config option (explicitly cpu_dedicated_set) will trigger a reshape16:57
efriedyes, I get that bauzas, what I'm saying is, we're going to *infer* VCPU/PCPU counts and pinsets based on legacy conf options; but then the operator wants to cut over to using the new conf options.16:57
bauzasno, I don't want us to infer VCPU and PCPU based on those options because they are error-prone16:58
bauzasefried: ^16:58
bauzasthose options being the legacy ones16:58
efriedoh, that's the basis for the PCPU spec as written at PS24 anyway. Haven't checked since then...16:58
bauzasthat's exactly why I'm saying "don't touch anything until the operator explicitly says 'I want cpu_dedicated_set'"16:59
*** itlinux has joined #openstack-nova16:59
bauzasold world = VCPU16:59
bauzasnew world = VCPU and PCPU17:00
bauzasfor the request, trigger the request option when you consider having enough hosts to sustain PCPU requests17:00
bauzasanyway, I need to bail out17:00
bauzaskids aren't in town, and I promised some evening to my spouse17:01
efriedI think I see. Did you comment accordingly on the spec?17:01
bauzasefried: I think so17:01
efriedokay.17:01
sean-k-mooneyefried: yes the cpu spec currently does infer but i agree with bauzas that we should not17:03
sean-k-mooneyefried: i think not supporting inference from the old config values would remove much of the upgrade impact or at least help us too.17:04
efriedI'm good with that. Let's get it reflected in the spec17:05
bauzasI just provided a comment trying to summarize my thoughts17:05
bauzasthis said, calling it a day17:05
sean-k-mooneyo/17:05
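(A minimal sketch of the request-side transformation being debated above: once the operator opts in via some new flag (hypothetical here), legacy hw:cpu_policy=dedicated requests are translated into PCPU resource requests; until then they keep requesting VCPU.)

    # Hypothetical sketch: translate a legacy pinning request into a placement
    # resource request only once the operator has opted in.
    def cpu_resources_for_request(flavor_vcpus, extra_specs, translate_pinned_to_pcpu):
        dedicated = extra_specs.get('hw:cpu_policy') == 'dedicated'
        if dedicated and translate_pinned_to_pcpu:
            # New world: request PCPU so only reshaped hosts can satisfy it.
            return {'PCPU': flavor_vcpus}
        # Old world (or opt-in not flipped yet): keep requesting VCPU.
        return {'VCPU': flavor_vcpus}

    # A 4-vCPU pinned flavor before and after the operator opts in:
    assert cpu_resources_for_request(4, {'hw:cpu_policy': 'dedicated'}, False) == {'VCPU': 4}
    assert cpu_resources_for_request(4, {'hw:cpu_policy': 'dedicated'}, True) == {'PCPU': 4}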
efriedimacdonn: I agree with your assessment in https://bugs.launchpad.net/nova/+bug/182443517:06
openstackLaunchpad bug 1824435 in OpenStack Compute (nova) stein "fill_virtual_interface_list migration fails on second attempt" [High,Triaged]17:06
efried(I wanted to say that, but not pollute the bug with it)17:06
mriedemdansmith: you want to check my rolling upgrade validation logic that i dumped in gibi's spec https://review.openstack.org/#/c/652608/4/specs/train/approved/server-move-operations-with-ports-having-resource-request.rst@190 ?17:06
dansmithmriedem: I probably have to read the whole spec to make sense of that huh?17:10
imacdonnefried, which one? i.e. do you think we need to address (2) or would fixing (1) obviate that ?17:11
*** munimeha1 has joined #openstack-nova17:11
*** priteau has quit IRC17:11
efriedimacdonn: fixing (1) would obviate. But that assumes we can do so.17:11
efriedimacdonn: IMO we should fix (1) and change (2) to raise an explicit exception to "guarantee" it.17:12
imacdonnefried, right ... so now I'm trying to understand why the row is being created in the first place17:12
efried++17:12
efriedimacdonn: Is it possible for the rows to differ in any material way?17:12
efried(i.e. a way that makes a difference to the outcome)17:13
mriedemdansmith: not really, it's just the usual "how could this fail during an upgrade"17:13
mriedemdansmith: he needs to pass new parameters to compute rpc api methods,17:13
dansmithwell, I read it and seemed like I needed to understand more, so I'm reading the wholething now17:13
mriedemwhich could be (1) stein computes that don't handle those or (2) rpc pinned so we pop those parameters17:13
mriedemok17:13
imacdonnefried, I'm assuming that _security_group_get_by_names() is used elsewhere (or at least intended to be reusable), so probably should consider all possible use-cases, if we tackle that one17:13
efriedused in two places17:14
mriedemimacdonn: efried: honestly i'm not sure how much relevance that code even has anymore if you're using neutron17:15
imacdonnmriedem, I was wondering about that17:15
mriedemi think it at least means if you're using neutron, every project that ever created an instance in nova has a 'default' security_groups table record that is never used or cleaned up17:16
mriedemvestigial17:16
imacdonnthat seems plausible17:17
dansmithmriedem: see if my words help at all17:22
*** ricolin has quit IRC17:25
openstackgerritStephen Finucane proposed openstack/nova master: Remove '/os-cells' REST APIs  https://review.openstack.org/65129117:26
openstackgerritStephen Finucane proposed openstack/nova master: Stop handling cells v1 in '/os-hypervisors' API  https://review.openstack.org/65129217:26
openstackgerritStephen Finucane proposed openstack/nova master: Stop handling cells v1 in '/os-servers' API  https://review.openstack.org/65129317:26
openstackgerritStephen Finucane proposed openstack/nova master: Remove 'nova-manage cell' commands  https://review.openstack.org/65129417:26
openstackgerritStephen Finucane proposed openstack/nova master: Stop handling cells v1 for console authentication  https://review.openstack.org/65129517:26
openstackgerritStephen Finucane proposed openstack/nova master: Remove old-style cell v1 instance listing  https://review.openstack.org/65129617:26
openstackgerritStephen Finucane proposed openstack/nova master: Remove 'bdm_(update_or_create|destroy)_at_top'  https://review.openstack.org/65129717:26
openstackgerritStephen Finucane proposed openstack/nova master: Remove 'instance_fault_create_at_top'  https://review.openstack.org/65129817:26
openstackgerritStephen Finucane proposed openstack/nova master: Remove 'instance_info_cache_update_at_top'  https://review.openstack.org/65129917:26
openstackgerritStephen Finucane proposed openstack/nova master: Remove 'get_keypair_at_top'  https://review.openstack.org/65130017:26
openstackgerritStephen Finucane proposed openstack/nova master: Remove 'instance_update_at_top', 'instance_destroy_at_top'  https://review.openstack.org/65130117:26
openstackgerritStephen Finucane proposed openstack/nova master: Remove 'instance_update_from_api'  https://review.openstack.org/65130217:26
openstackgerritStephen Finucane proposed openstack/nova master: Stop handling 'update_cells' on 'BandwidthUsage.create'  https://review.openstack.org/65130317:26
openstackgerritStephen Finucane proposed openstack/nova master: Stop handling cells v1 for instance naming  https://review.openstack.org/65130417:26
openstackgerritStephen Finucane proposed openstack/nova master: Remove cells code  https://review.openstack.org/65130617:26
openstackgerritStephen Finucane proposed openstack/nova master: Stop handling 'InstanceUnknownCell' exception  https://review.openstack.org/65130717:26
*** sapd1_x has quit IRC17:26
openstackgerritStephen Finucane proposed openstack/nova master: Remove unnecessary wrapper  https://review.openstack.org/65130817:26
openstackgerritStephen Finucane proposed openstack/nova master: db: Remove cell APIs  https://review.openstack.org/65130917:26
mriedemdansmith: replied but yeah i think so17:27
*** hamzy has joined #openstack-nova17:27
dansmithcool17:27
sean-k-mooneymriedem: so, related but separate from gibi's spec: should we just use the live migration multiple port binding workflow for all move operations17:28
sean-k-mooneymriedem: i mentioned it in gibi's spec but it would simplify things to have a single common code path17:29
mriedemgiven the issues NewBruce is hitting idk17:29
mriedemmoving all move operations to that model would be a big change i think, and likely not something we should block gibi's spec on17:30
* mriedem lunches17:30
sean-k-mooneyoh im not suggesting we should block on it17:30
sean-k-mooneyit's just that some/a lot of the complexity would be reduced17:31
sean-k-mooneybut NewBruce's issue is concerning, i'll grant that17:31
*** ralonsoh has quit IRC17:34
francoisp_alex_xu, hi, would you have time to look at https://review.openstack.org/#/c/648123/6 - thanks!17:35
*** dtantsur is now known as dtantsur|afk17:36
*** dklyle has quit IRC17:45
*** Sundar has joined #openstack-nova17:45
*** priteau has joined #openstack-nova17:50
*** igordc has joined #openstack-nova17:50
*** erlon has quit IRC17:54
*** erlon has joined #openstack-nova17:56
openstackgerritMatt Riedemann proposed openstack/nova master: Do not create default security group during instance create if using Neutron  https://review.openstack.org/65306518:05
mriedemimacdonn: let's see what blows up ^18:05
*** erlon has quit IRC18:07
imacdonnmriedem: hmm .. wouldn't use_neutron be true in most cases, when the migration is being run ?18:09
mriedemyes use_neutron is the default and what 99% of deployments are probably using at this point18:10
imacdonnmriedem: I haz the dumb ... how would this solve the problem?18:11
mriedemwe don't hit the problem code if you're using neutron18:12
imacdonnbut .. I am using neutron, and I do hit the problem18:12
mriedemwith this patch18:12
mriedem?18:12
imacdonnno, but the patch only makes a difference if you're not using neutron18:12
imacdonn(?)18:12
mriedemif not NEUTRON: create default sec group18:13
imacdonnoh wait, I had it upside-down18:13
imacdonnyeah OK ... I was about to propose making _security_group_ensure_default() return None if the context has no project_id (which does make the migration work for me)18:14
imacdonnwonder if the migration hits https://github.com/openstack/nova/blob/master/nova/db/sqlalchemy/api.py#L174018:16
imacdonnmriedem: ^ I think it will18:19
*** luksky has joined #openstack-nova18:21
imacdonnmriedem: confirmed that I still hit the problem with your change, for above reason18:22
openstackgerritMatt Riedemann proposed openstack/nova master: Do not create default security group during instance create if using Neutron  https://review.openstack.org/65306518:24
mriedemtry this ^18:24
imacdonnmriedem: that seems to work (at least allow the migration to work)18:28
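(A simplified sketch of the guard in the patch being discussed: skip creating the legacy 'default' security group row when neutron is in use. [DEFAULT]use_neutron is a real option, but the helper names here are placeholders.)

    # Simplified sketch of the guard: only ensure the legacy nova-network
    # 'default' security group row when neutron is NOT the backend.
    def maybe_ensure_default_security_group(context, use_neutron, ensure_default):
        if use_neutron:
            # Neutron owns security groups; don't create a vestigial row in
            # nova's security_groups table (which is also what trips up the
            # fill_virtual_interface_list migration when project_id is None).
            return None
        return ensure_default(context)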
*** priteau has quit IRC18:33
mriedemi still don't know how to recreate your issue18:35
imacdonnyou'd have to delete the marker instance (the one with uuid 00000000-0000-0000-0000-000000000000)18:35
mriedemok so like,18:36
mriedem1. run data migration,18:36
mriedem2. archive/purge deleted records18:36
mriedem3. run data migration18:36
imacdonnI think you have to have at least one project with at least one instance18:36
imacdonn(that's not deleted, I assume)18:37
*** irclogbot_2 has quit IRC18:39
*** irclogbot_1 has joined #openstack-nova18:41
aspiersefried: haven't been following the channel but I'm around for the next few hours in case you get to reviewing the SEV spec18:42
efriedaspiers: ack18:43
imacdonnmriedem: I'm not sure of the exact sequence that gets me into the bad state to begin with, but it has happened repeatedly (after both upgrade and fresh install) ... to force it, you may have to (delete marker instance, run the migration) twice18:44
mriedemon a fresh install you wouldn't have any instances to migrate so i'm not sure how the marker is getting created18:45
mriedemunless you mean: 1. create a test server, 2. run the migration, 3. delete the marker record 4. run the migration again18:46
imacdonnmriedem: by fresh install, I mean that it was not an upgrade ... so like 1) install, 2) create a test instance ... some time later; 3) run the migration18:48
mriedemyeah ok18:48
melwittrandom question: does a cold migration (no change in flavor) resize need to be resize confirmed like a flavor changing resize does?18:54
mriedemyes18:55
imacdonnLast time I tried, it was required... whether or not it *should* ......18:55
melwittthanks y'all18:56
mriedemimacdonn: well i'm unable to recreate your issue in a functional test but i found a new regression, 500 in the api19:09
*** eharney has quit IRC19:17
cdentmriedem: you're so good at that19:18
imacdonnmriedem: hmm, I was just pondering if maybe the marker instance gets deleted when the last "real" instance for a project is deleted ... in testing that, I just got a "ClientException: Unknown Error (HTTP 504)" - not sure if related19:20
openstackgerritMatt Riedemann proposed openstack/nova master: Add regression test for bug 1825034  https://review.openstack.org/65309819:20
openstackbug 1825034 in OpenStack Compute (nova) "listing deleted servers from the API fails after running fill_virtual_interface_list online data migration" [High,Confirmed] https://launchpad.net/bugs/182503419:20
mriedemimacdonn: would be interested to know why my recreate steps for *your* bug don't hit here ^19:20
openstackgerritMatt Riedemann proposed openstack/nova master: Add regression test for bug 1825034  https://review.openstack.org/65309819:21
openstackbug 1825034 in OpenStack Compute (nova) "listing deleted servers from the API fails after running fill_virtual_interface_list online data migration" [High,Confirmed] https://launchpad.net/bugs/182503419:21
mriedemimacdonn: the marker instance is soft deleted as soon as it's created19:21
mriedemhttps://github.com/openstack/nova/blob/master/nova/objects/virtual_interface.py#L30819:22
imacdonnmriedem: ah, right, so it probably requires archival to have happened to reproduce my original problem19:23
mriedemthat's what my functional test does19:23
mriedembut it doesn't hit your issue19:23
imacdonndoes it create the two null rows in security_groups ?19:24
gmannlooking for review on these 2 specs - https://review.openstack.org/#/c/603969/ https://review.openstack.org/#/c/547850/19:24
gmannmriedem: would you like to give second round review on this (API cleanup)  - https://review.openstack.org/#/c/60396919:24
mriedemimacdonn: that i don't know - this is also sqlite so i'm not sure if sqlite is more strict about null values in constraints than mysql (that would be funny if it is)19:24
mriedemgmann: it's in the queue somewhere19:25
gmannok, thanks19:25
imacdonnmriedem: I wouldn't be surprised if sqlite observes null when checking for unique constraints (which we already established mysql does not)19:26
imacdonnmriedem: OTOH https://sqlite.org/faq.html#q26 seems to suggest otherwise19:29
*** itlinux has quit IRC19:30
mriedemimacdonn: i added this to the test and the test passes http://paste.openstack.org/show/749386/19:31
*** jmlowe has quit IRC19:33
imacdonnmriedem: is this test running with the admin context ?19:33
imacdonnmriedem: i.e. the one that has no project_id19:34
*** itlinux has joined #openstack-nova19:34
mriedemctxt.project_id is an admin context with no project_id,19:34
mriedemself.api.project_id is a non-null value19:34
*** awaugama has quit IRC19:34
melwittgmann: those are in my queue too19:35
*** sridharg has quit IRC19:36
gmannmelwitt: thanks.19:39
openstackgerritMatt Riedemann proposed openstack/nova master: Add post-test wrinkle to list deleted servers before archive  https://review.openstack.org/65313119:39
imacdonnmriedem: FWIW, I can reproduce my original problem with this sequence: 1) create an instance 2) run migrations 3) archive 4) run migrations19:44
mriedemimacdonn: that's what my functional test does though19:46
mriedembut let me try in devstack19:46
mriedemthe neutron fixture in the functional test is likely not creating any virtual interface records19:46
imacdonnmriedem: yeah, except sqlite vs. mysql .... or .... ?19:46
mriedemsure19:46
mriedemi'll spin up a devstack19:46
mriedemalso need to get up and stretch the legs and get some coffee19:46
mriedemsean-k-mooney: dansmith: btw that mitaka->newton regression mentioned yesterday is a thing, i reported a bug https://bugs.launchpad.net/nova/+bug/182501819:48
openstackLaunchpad bug 1825018 in OpenStack Compute (nova) "security group driver gets loaded way too much in the api" [Low,Triaged]19:48
*** dklyle has joined #openstack-nova19:58
*** jmlowe has joined #openstack-nova20:01
*** bbowen has quit IRC20:03
*** igordc has quit IRC20:05
*** eharney has joined #openstack-nova20:14
*** weshay has quit IRC20:14
*** weshay has joined #openstack-nova20:15
*** pcaruana has quit IRC20:19
*** priteau has joined #openstack-nova20:24
*** priteau has quit IRC20:27
mnasermriedem: backporting fixed it cleanly.20:28
mnaserI have an abandoned backport if anyone wants to cherry pick cause it applies cleanly right now20:28
mnaserLeft a comment too so if someone finds the bug and sees the proposed but then abandoned patch, they’ll see some useful info there20:29
openstackgerritmelanie witt proposed openstack/nova master: Add get_counts() to InstanceMappingList  https://review.openstack.org/63807220:30
openstackgerritmelanie witt proposed openstack/nova master: Count instances from mappings and cores/ram from placement  https://review.openstack.org/63807320:30
openstackgerritmelanie witt proposed openstack/nova master: Use instance mappings to count server group members  https://review.openstack.org/63832420:30
openstackgerritmelanie witt proposed openstack/nova master: Add get_usages_counts_for_quota to SchedulerReportClient  https://review.openstack.org/65314520:30
openstackgerritmelanie witt proposed openstack/nova master: Set [quota]count_usage_from_placement = True in nova-next  https://review.openstack.org/65314620:30
mriedemmnaser: ah so the heal task fixed up the network info cache?20:30
mriedemon that rocky cloud20:30
mnaserYep. Just watched them slowly get repopulated20:30
mriedemnice20:30
mnaserHad to apply on all computes tho which was annoying but yeah20:30
mriedembtw, another reason to not backport that data migration, i just found this https://bugs.launchpad.net/nova/+bug/182443520:31
openstackLaunchpad bug 1824435 in OpenStack Compute (nova) stein "fill_virtual_interface_list migration fails on second attempt" [High,Triaged]20:31
mnaserI probably will redeploy again without it.. if they disappear again.. will raise question20:31
mnaserYeah I saw that too as well. Never ran into that though.. yet :X20:31
mriedemmnaser: run this as admin on your stein cloud: openstack server list --all-projects --deleted20:31
mnasero should I be worried about that20:32
mriedemrunning the command or the bug?20:32
mnaserThe command20:32
mriedemit's read-only20:32
mnaserAs in is there some magical bug we’re about to discover20:32
* mnaser doesn’t feel like more work :p20:32
mriedemheh i've already discovered the bug20:32
mriedemi think you already have it20:33
mnaserBut I can do that later, I also can do it both on a cloud that has db archive enabled and disabled too20:33
mriedemsure the workaround is to archive, but you have to do it after running that migration every time20:33
efriedaspiers: Still hanging around?20:39
mriedemimacdonn: efried: yup, recreated http://paste.openstack.org/show/749391/20:41
efriedwoot, in a func test?20:41
mriedemno, devstack20:41
openstackgerritmelanie witt proposed openstack/nova master: Count instances from mappings and cores/ram from placement  https://review.openstack.org/63807320:41
openstackgerritmelanie witt proposed openstack/nova master: Set [quota]count_usage_from_placement = True in nova-next  https://review.openstack.org/65314620:41
openstackgerritmelanie witt proposed openstack/nova master: Use instance mappings to count server group members  https://review.openstack.org/63832420:41
*** ttsiouts has joined #openstack-nova20:44
efriedmriedem: Well, if you can do it in devstack, you can at least mock it in a func test I guess.20:48
mriedemnot necessarily - func test is using sqlite20:49
mriedemdevstack is using mysql20:49
*** cdent has quit IRC20:50
mriedembut https://sqlite.org/faq.html#q26 suggests sqlite and mysql have the same behavior about how nulls are handled in unique constraints20:51
mriedemefried: but my functional test is doing the same steps https://review.openstack.org/#/c/653098/2/nova/tests/functional/regressions/test_bug_1825034.py20:51
*** hamzy has quit IRC20:54
*** wwriverrat has joined #openstack-nova20:56
imacdonnmriedem: I think it'd be interesting to see if your sqlite db has the duplicate rows after the migration is run the first time ... is it possible to query that? Not sure what conditions the tests run under....20:59
mriedemi've added an assertion for that locally and it's just returning 1 security group for the null project_id21:00
imacdonnOK, so that FAQ is wrong .. or we're misinterpreting it21:01
openstackgerritDakshina Ilangovan proposed openstack/nova-specs master: Resource Management Daemon - Last Level Cache  https://review.openstack.org/65123321:04
openstackgerritDustin Cowles proposed openstack/nova master: Introduces the openstacksdk to nova  https://review.openstack.org/64366421:05
openstackgerritDakshina Ilangovan proposed openstack/nova-specs master: Resource Management Daemon - Last Level Cache  https://review.openstack.org/65123321:12
*** wwriverrat has quit IRC21:13
aspiersefried: back21:22
aspiersalthough it's getting late-ish here21:22
*** mchlumsky_ has quit IRC21:23
cfriesenanyone here know OVMF?  Looks like centos 7.6 has modified the OVMF-20180508-3 rpm to no longer contain the file /usr/share/OVMF/OVMF_CODE.fd that nova looks for in nova/virt/libvirt/driver.py.  Instead it now seems to be named /usr/share/OVMF/OVMF_CODE.secboot.fd21:24
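(cfriesen's question doesn't get an answer here; for what it's worth, a purely illustrative sketch of probing a list of candidate OVMF firmware paths instead of hard-coding one filename. This is not what nova's libvirt driver actually does.)

    import os

    # Illustrative only: fall back through candidate UEFI firmware paths.
    OVMF_CANDIDATES = [
        '/usr/share/OVMF/OVMF_CODE.fd',
        '/usr/share/OVMF/OVMF_CODE.secboot.fd',
    ]

    def find_uefi_loader():
        for path in OVMF_CANDIDATES:
            if os.path.exists(path):
                return path
        return None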
openstackgerritDakshina Ilangovan proposed openstack/nova-specs master: Resource Management Daemon - Last Level Cache  https://review.openstack.org/65123321:24
mnasermriedem: is there a patch/fix for the `openstack server list --all-projects --deleted` thing?21:28
imacdonnmnaser, discussion at https://bugs.launchpad.net/nova/+bug/1825034 , I suppose21:35
openstackLaunchpad bug 1825034 in OpenStack Compute (nova) stein "listing deleted servers from the API fails after running fill_virtual_interface_list online data migration" [High,Confirmed]21:35
mriedemmnaser: i don't have a fix no21:38
mriedemthe workaround is to archive21:38
mriedemi put some thoughts into the bug report but they kind of all suck21:38
mnaseryeah, I went through it, none are really ideal21:38
mnaseris --deleted ever supposed to actually return data?21:39
mriedemyeah21:39
mriedemuntil you archive21:39
mnaserI don't remember the openstack api returning deleted records21:39
mnaserbut til I guess21:39
melwittI think that might be the only case when it does21:39
mriedemthere is no guarantee that you'll get results because it depends on how the cloud is setup to archive21:39
mnaserof course21:39
mnaserI just didn't know there was an actual api way21:39
melwittand I agree all of the options suck. I kind of liked the last option mriedem put on the bug but that doesn't help if the migration was only run once (and completed). or maybe the migration could hard delete the marker instance itself if it completed. still, if multiple runs are needed it sucks21:40
mriedemright, so the only way to hit this i think is to filter on all_tenants and deleted, which at least thankfully is admin-only21:40
mriedemcould break some internal tools21:40
mriedembut shouldn't break external users21:40
mnasertbh21:43
mnaserall_tenants+deleted will probably hurt a lot in a bigger environment anyways21:43
mriedemas in spanking your dbs?21:43
imacdonnor just making a lot of output21:44
mnaserboth21:45
imacdonnup to api.max_limit, I guess21:45
mnaserhits the db hard, hits tons of apis hard too21:45
melwittyeah, was gonna say max_limit saves the day21:45
mnaseryeah we have a "don't do it™" rule but that might be a good stopgap21:45
openstackgerritMatt Riedemann proposed openstack/nova master: Exclude fake marker instance when listing servers  https://review.openstack.org/65315821:51
mriedemwell here is one option ^21:51
imacdonnthe fact that the fake UUID is defined in virtual_interface makes it slightly more icky :/21:52
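[Editor's note: a hedged sketch of the general idea in the patch linked above: drop the sentinel "marker" instance that the fill_virtual_interface_list migration creates from list results. The constant name and UUID below are hypothetical placeholders, not the value nova actually defines in its virtual_interface code.]

    # Hypothetical sentinel; nova defines the real one elsewhere.
    FAKE_MARKER_INSTANCE_UUID = "00000000-0000-0000-0000-000000000000"

    def exclude_fake_marker(instances):
        """Filter the migration's marker instance out of API list results."""
        return [inst for inst in instances
                if inst["uuid"] != FAKE_MARKER_INSTANCE_UUID]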
mriedemmnaser: btw, remind me to bring this up the next time you ask that the online data migrations use markers to be more efficient :)21:53
*** itlinux has quit IRC21:53
mnasermriedem: bahaha21:53
mnaserI think I liked the idea jaypipes proposed of using different migration repos like keystone does21:54
mnaserbut I think that means you can't batch them21:54
melwittlol, touche21:54
dansmithmnaser: the point of that suggestion was to *not* batch them21:54
dansmithmnaser: it also doesn't really work well for translations we have to do in python, which is a lot of them21:54
mnaserah yes I see21:55
mnaserit's a lot of rebuilding data21:55
dansmithit just helps us avoid re-running them on presumed idempotency21:55
mnaserrather than a simple drop column or add column21:55
dansmithand we could solve that pretty easily with something else21:55
mnasera .... marker?21:55
mnaser:-P21:55
dansmithdoesn't have to be in-band21:55
mriedemspeaking of online data migrations, want to drop this old one now? https://review.openstack.org/#/c/651001/21:55
mnaseryeah21:56
dansmithjust a feature flag sort of "I've converted all the flavors, stop asking me"21:56
dansmithproblem is,21:56
mriedemwe could also be better about following up and cleaning these things up21:56
dansmithif you end up with some older services by accident, you create old data, and stop running the migrations afterwards with no way of cleaning them up21:56
dansmithmriedem: yup21:56
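[Editor's note: for context, a hedged sketch of the batched online data migration pattern being discussed. Roughly, nova-manage calls each registered migration with a context and a max count and expects back how many candidate rows were seen and how many were converted; a "done already" flag would short-circuit exactly that re-run loop. The function name and in-memory data below are illustrative only.]

    # Editor's sketch of the (found, done) batch pattern, not a real nova
    # migration. UNMIGRATED stands in for rows still in the old format.
    UNMIGRATED = [{"id": i, "new_field": None} for i in range(5)]

    def migrate_widgets_to_new_format(ctxt, count):
        """Migrate up to `count` rows; return (found, done).

        `found` is how many candidate rows this run saw, `done` is how many
        it converted. Once found == 0 the tooling stops re-running the
        migration.
        """
        batch = [row for row in UNMIGRATED if row["new_field"] is None][:count]
        done = 0
        for row in batch:
            row["new_field"] = "converted-%d" % row["id"]
            done += 1
        return len(batch), done

    # Run repeatedly until nothing is left, as the tooling does:
    while True:
        found, done = migrate_widgets_to_new_format(None, count=2)
        if not found:
            break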
mnaserI guess something that could be neat is opt-in soft delete21:57
mnaseror opt-out21:57
mnaserthat'd probably make life a lot simpler and those online data migrations won't hurt as much, as imho you probably rarely find a million (running) instance cloud, but you're much more likely to have a million (in total) vms21:58
dansmithopt-out from ever showing deleted would be okay, because it's equivalent (api-wise) to running archive in a tight loop in the background21:58
dansmithbut opt-out from the soft deleting at all is harder to do21:58
efriedaspiers: Sorry, I missed you again. Left comments on the SEV review.21:59
aspiersthanks!21:59
*** itlinux has joined #openstack-nova22:07
*** ttsiouts has quit IRC22:10
*** ttsiouts has joined #openstack-nova22:11
*** munimeha1 has quit IRC22:12
*** ttsiouts has quit IRC22:12
mriedemhuh this is fun https://docs.openstack.org/python-openstackclient/latest/cli/command-objects/server.html#server-create22:22
mriedem --config-drive <config-drive-volume>|True22:22
mriedemUse specified volume as the config drive, or ‘True’ to use an ephemeral drive22:22
mriedemas far as i know, that's not at all how that parameter works in the API22:22
mriedemdansmith: has nova ever supported passing a volume id for the config drive to server create?22:24
dansmithnfi22:24
dansmithbut no, doesn't sound familiar to me22:24
mriedemmaybe some special sauce from 2012 https://review.openstack.org/#/c/3994/22:29
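[Editor's note: to make the mismatch above concrete, in the compute API the config_drive field of a server-create request body is a boolean, so there is no obvious place to hand it a volume id the way the OSC help text suggests. A minimal request-body sketch, with placeholder image and flavor values:]

    # Editor's sketch of a server create body; image/flavor values are placeholders.
    server_create_body = {
        "server": {
            "name": "demo",
            "imageRef": "IMAGE_UUID",
            "flavorRef": "FLAVOR_ID",
            "config_drive": True,   # boolean in the API, not a volume id
        }
    }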
*** luksky has quit IRC22:35
*** rcernin has joined #openstack-nova22:38
imacdonnmriedem: speaking of cleaning up after migrations .... I wonder if something should be done to handle those null security_groups rows that would have been created for anyone who has already run the migrations while instances exist ...22:40
mriedemdefinitely maybe22:43
imacdonnseems like the sort of thing that could come back to bite later :/22:43
eanderssonYo22:49
eanderssonhttps://github.com/openstack/nova/commit/35f49f403534e174578dcd1b9ab33daf6f14c3e822:50
eanderssonWe need this in stable/rocky22:50
eanderssonironic_url does not actually do anything in the ironicclient for Rocky22:50
eanderssonSo ironic does not respect regions at all22:50
eanderssonTheJulia, ^22:50
*** tkajinam has joined #openstack-nova22:54
mriedemeandersson: and that's due to https://review.openstack.org/#/c/359061/ in python-ironicclient in rocky?22:56
eanderssonactually it's odd22:57
eanderssonconvert_keystoneauth_opts should fix that22:57
mriedemi'm not sure how easy it is to backport that given it's dependent on ironicclient >= 2.4.022:58
eanderssonYea - let me do a bit more research22:59
TheJuliaeandersson: I was just about to link to the discussion http://eavesdrop.openstack.org/irclogs/%23openstack-ironic/%23openstack-ironic.2019-04-16.log.html#t2019-04-16T22:39:4923:00
*** whoami-rajat has quit IRC23:02
melwittmriedem: fyi, I got some more reviews on https://review.openstack.org/611974, even got a +1 from melissaml. should be good to go23:14
mriedemi can't handle that at this late hour23:17
mriedembut will star it23:17
openstackgerritAdam Spiers proposed openstack/nova-specs master: Re-approve AMD SEV support for Train  https://review.openstack.org/64199423:18
*** tosky has quit IRC23:19
openstackgerritmelanie witt proposed openstack/nova master: Fix assert in test_libvirt_info_scsi_with_unit  https://review.openstack.org/65316823:21
melwittmriedem: thanks. I got antsy with some of the pike changes going to the gate23:21
*** bbowen has joined #openstack-nova23:26
*** mlavalle has quit IRC23:33
*** itlinux has quit IRC23:35
aspierskashyap: definitely need your input on https://review.openstack.org/#/c/641994/6/specs/train/approved/amd-sev-libvirt-support.rst@123 :)23:37
* aspiers goes to bed23:37
*** avolkov has quit IRC23:52
