Wednesday, 2019-10-30

*** panda has quit IRC00:02
*** panda has joined #openstack-nova00:05
*** dviroel has quit IRC00:08
*** tkajinam has joined #openstack-nova00:16
*** brinzhang_ has joined #openstack-nova00:41
*** brinzhang has quit IRC00:44
*** ricolin has quit IRC00:45
*** pots has quit IRC00:54
*** pots has joined #openstack-nova00:55
*** ricolin has joined #openstack-nova00:57
*** Liang__ has joined #openstack-nova01:03
*** nanzha has joined #openstack-nova01:07
openstackgerritArthur Dayne proposed openstack/nova master: libvirt:driver:Disallow AIO=native when 'O_DIRECT' is not available  https://review.opendev.org/68277201:16
*** aloga has quit IRC01:24
*** brinzhang has joined #openstack-nova01:29
*** brinzhang_ has quit IRC01:32
*** nanzha has quit IRC01:39
*** nanzha has joined #openstack-nova01:39
*** dviroel has joined #openstack-nova01:45
*** Xuchu has joined #openstack-nova01:46
*** Dinesh_Bhor has quit IRC02:10
*** slaweq has joined #openstack-nova02:13
*** brinzhang has quit IRC02:13
*** brinzhang_ has joined #openstack-nova02:13
*** slaweq has quit IRC02:17
*** slaweq has joined #openstack-nova02:22
*** slaweq has quit IRC02:27
*** brinzhang has joined #openstack-nova02:28
*** brinzhang_ has quit IRC02:31
*** slaweq has joined #openstack-nova02:37
openstackgerritSundar Nadathur proposed openstack/nova master: ksa auth conf and client for Cyborg access  https://review.opendev.org/63124202:41
openstackgerritSundar Nadathur proposed openstack/nova master: Add Cyborg device profile groups to request spec.  https://review.opendev.org/63124302:41
openstackgerritSundar Nadathur proposed openstack/nova master: Create and bind Cyborg ARQs.  https://review.opendev.org/63124402:41
openstackgerritSundar Nadathur proposed openstack/nova master: Get resolved Cyborg ARQs and add PCI BDFs to VM's domain XML.  https://review.opendev.org/63124502:41
openstackgerritSundar Nadathur proposed openstack/nova master: Delete ARQs for an instance when the instance is deleted.  https://review.opendev.org/67373502:41
openstackgerritSundar Nadathur proposed openstack/nova master: Add cyborg tempest job.  https://review.opendev.org/67099902:41
*** slaweq has quit IRC02:52
*** yaawang has quit IRC02:55
*** yaawang has joined #openstack-nova02:56
*** slaweq has joined #openstack-nova02:57
*** brinzhang has quit IRC03:03
*** brinzhang has joined #openstack-nova03:04
*** brinzhang has quit IRC03:08
*** brinzhang has joined #openstack-nova03:09
*** brinzhang has quit IRC03:10
*** brinzhang has joined #openstack-nova03:10
*** slaweq has quit IRC03:12
*** mdbooth has quit IRC03:12
*** slaweq has joined #openstack-nova03:13
*** mdbooth has joined #openstack-nova03:14
openstackgerritMerged openstack/nova master: Make nova-next multinode and drop tempest-slow-py3  https://review.opendev.org/68398803:14
*** slaweq has quit IRC03:22
*** slaweq_ has joined #openstack-nova03:23
openstackgerritQiu Fossen proposed openstack/nova-specs master: Support fuzzy querying instance by tag  https://review.opendev.org/69165103:24
openstackgerritSundar Nadathur proposed openstack/nova-specs master: Updated Nova-Cyborg interaction spec.  https://review.opendev.org/68415103:25
*** slaweq_ has quit IRC03:27
*** slaweq_ has joined #openstack-nova03:32
*** belmoreira has quit IRC03:33
*** slaweq_ has quit IRC03:37
*** psachin has joined #openstack-nova03:37
*** slaweq_ has joined #openstack-nova03:38
*** slaweq_ has quit IRC03:47
*** slaweq_ has joined #openstack-nova03:48
*** mkrai has joined #openstack-nova03:48
*** nweinber__ has joined #openstack-nova03:52
*** slaweq_ has quit IRC03:57
*** dave-mccowan has quit IRC04:02
*** nweinber__ has quit IRC04:15
*** dviroel has quit IRC04:23
*** brinzhang_ has joined #openstack-nova04:25
*** brinzhang has quit IRC04:29
*** abaindur has joined #openstack-nova04:31
*** udesale has joined #openstack-nova04:35
*** brinzhang has joined #openstack-nova04:48
*** brinzhang has quit IRC04:49
*** brinzhang has joined #openstack-nova04:50
*** brinzhang_ has quit IRC04:51
*** brinzhang has quit IRC04:51
*** brinzhang has joined #openstack-nova04:51
*** abaindur has quit IRC04:52
*** brinzhang has quit IRC05:00
*** brinzhang has joined #openstack-nova05:00
*** janki has joined #openstack-nova05:17
*** brinzhang_ has joined #openstack-nova05:24
*** brinzhang has quit IRC05:28
*** brinzhang has joined #openstack-nova05:31
*** brinzhang_ has quit IRC05:34
*** brinzhang has quit IRC05:35
*** brinzhang has joined #openstack-nova05:35
*** larainema has joined #openstack-nova05:39
*** ratailor has joined #openstack-nova05:47
*** links has joined #openstack-nova05:51
*** ileixe has quit IRC05:56
*** ivve has joined #openstack-nova05:57
*** ileixe has joined #openstack-nova05:59
openstackgerritAndreas Jaeger proposed openstack/nova stable/train: Switch to opensuse-15 nodeset  https://review.opendev.org/69203206:04
*** jawad_axd has quit IRC06:05
openstackgerritAndreas Jaeger proposed openstack/nova stable/stein: Switch to opensuse-15 nodeset  https://review.opendev.org/69203306:07
*** ivve has quit IRC06:15
*** Xuchu has quit IRC06:34
*** jawad_axd has joined #openstack-nova06:36
*** Xuchu has joined #openstack-nova06:36
*** igordc has quit IRC06:37
*** sridharg has joined #openstack-nova06:43
*** mkrai has quit IRC06:43
*** brinzhang has quit IRC06:47
*** luksky has joined #openstack-nova06:57
*** dpawlik has joined #openstack-nova06:58
*** dpawlik has quit IRC07:03
*** pcaruana has joined #openstack-nova07:09
*** ircuser-1 has quit IRC07:14
*** dpawlik has joined #openstack-nova07:18
*** mkrai has joined #openstack-nova07:20
*** Luzi has joined #openstack-nova07:22
*** jawad_axd has quit IRC07:32
*** lpetrut has joined #openstack-nova07:35
openstackgerritArthur Dayne proposed openstack/nova master: libvirt:driver:Disallow AIO=native when 'O_DIRECT' is not available  https://review.opendev.org/68277207:39
*** jawad_axd has joined #openstack-nova07:42
*** sridharg has quit IRC07:44
*** tkajinam has quit IRC08:03
*** maciejjozefczyk has joined #openstack-nova08:12
*** mkrai has quit IRC08:13
*** ratailor has quit IRC08:16
*** ratailor has joined #openstack-nova08:18
*** damien_r has joined #openstack-nova08:20
*** jraju__ has joined #openstack-nova08:21
*** links has quit IRC08:21
*** jraju__ has quit IRC08:22
*** links has joined #openstack-nova08:26
*** slaweq_ has joined #openstack-nova08:30
*** janki has quit IRC08:46
*** jchhatbar has joined #openstack-nova08:47
alex_xugibi: good morning08:50
gibialex_xu: good morning08:50
alex_xugibi: just want to check the plan of project update08:51
gibialex_xu: for me the slides look OK08:51
alex_xuyes, I think it is ok also08:51
alex_xudo we need to expand the U release, I guess there are more specs08:51
alex_xubut I can't be sure which of them will actually be accepted08:52
alex_xugibi: so who is responsible for which pages?08:52
gibialex_xu: let me check what the freshly approved specs are...08:53
*** ivve has joined #openstack-nova08:53
gibialex_xu: I think we should only mention already approved specs08:53
alex_xucool08:53
gibialex_xu: regarding splitting the work: I'm OK if you or stephenfin talk about the generic part at the beginning. I can talk about a couple of features on slides 5 - 9. I think we can note who talks about what in the speaker notes08:55
*** jangutter has joined #openstack-nova08:56
alex_xugibi: ok, cool, I guess we are all familiar with different features08:56
gibialex_xu: yes definitely08:56
alex_xugibi: looks like stephenfin is on vacation already, probably we need to catch him at Shanghai :)08:57
gibialex_xu: I've just checked: we have fewer specs approved than what is already mentioned on the Ussuri slides, so I think we don't need to extend that08:58
*** rpittau|afk is now known as rpittau08:58
*** ralonsoh has joined #openstack-nova08:58
gibialex_xu: yes stephenfin is already on vacation. I will travel on Friday. Let's try to sit together at some point during Monday next week08:59
alex_xugibi: ok, no problem. the features in the slides look like ones that began in a previous release, and seem more certain to be accepted08:59
gibialex_xu: I will add some speaker notes now about what I can talk about08:59
alex_xugibi: cool09:00
*** dtantsur|afk is now known as dtantsur09:01
*** vesper has joined #openstack-nova09:08
*** vesper11 has quit IRC09:09
*** slaweq_ is now known as slaweq09:18
*** trident has quit IRC09:26
*** dpawlik has quit IRC09:33
alex_xugibi: it is time to assign everything else to Stephen :)09:34
gibialex_xu: yeah. I guess that is the default behavior :)09:34
*** trident has joined #openstack-nova09:34
alex_xuhah09:34
gibialex_xu: I will drop a mail to you and stephen to summarize what we talked here now, and ask stephenfin about a sitdown on Monday09:35
alex_xugibi: cool, thanks09:35
*** Liang__ has quit IRC09:36
*** elod has quit IRC09:37
*** ratailor_ has joined #openstack-nova09:38
*** tesseract has joined #openstack-nova09:38
*** ratailor has quit IRC09:41
*** dpawlik has joined #openstack-nova09:48
*** cgoncalves has quit IRC09:49
*** nanzha has quit IRC09:51
*** Xuchu has quit IRC09:55
*** elod has joined #openstack-nova09:57
*** cgoncalves has joined #openstack-nova09:58
*** nanzha has joined #openstack-nova10:02
*** aloga has joined #openstack-nova10:22
*** gryf has quit IRC10:27
*** rcernin has quit IRC10:27
*** gryf has joined #openstack-nova10:29
*** tbachman has quit IRC10:34
*** pcaruana has quit IRC10:35
*** tbachman has joined #openstack-nova10:38
*** openstackstatus has quit IRC10:44
*** mgoddard has quit IRC10:46
*** tbachman has quit IRC10:46
*** mgoddard has joined #openstack-nova10:47
*** sulo has joined #openstack-nova10:55
*** sulo has left #openstack-nova10:56
*** panda is now known as panda|pto11:00
*** arxcruz is now known as arxcruz|lunch11:10
*** jchhatbar has quit IRC11:14
openstackgerritSilvan Kaiser proposed openstack/nova master: Move Nova Quobyte driver to LibvirtMountedFileSystemVolumeDriver  https://review.opendev.org/68706611:19
*** brinzhang has joined #openstack-nova11:34
*** dpawlik has quit IRC11:41
*** dviroel has joined #openstack-nova11:51
*** ratailor_ has quit IRC11:56
*** brinzhang_ has joined #openstack-nova11:57
*** pcaruana has joined #openstack-nova12:00
*** brinzhang has quit IRC12:01
*** brinzhang has joined #openstack-nova12:01
*** brinzhang_ has quit IRC12:02
*** rouk has quit IRC12:02
*** brinzhang has quit IRC12:03
*** brinzhang has joined #openstack-nova12:03
*** dpawlik has joined #openstack-nova12:08
*** larainema has quit IRC12:09
*** tbachman has joined #openstack-nova12:09
*** dpawlik has quit IRC12:12
*** arxcruz|lunch is now known as arxcruz12:18
*** AdamMork has joined #openstack-nova12:18
*** jawad_axd has quit IRC12:24
*** jawad_axd has joined #openstack-nova12:24
*** jawad_axd has quit IRC12:25
*** markvoelker has quit IRC12:25
*** markvoelker has joined #openstack-nova12:25
*** jawad_axd has joined #openstack-nova12:26
*** AJaeger has joined #openstack-nova12:26
*** udesale has quit IRC12:26
AJaegernova stable cores, please review https://review.opendev.org/692032 to update train opensuse job so that Infra can retire openSUSE 15.012:27
*** brinzhang_ has joined #openstack-nova12:28
*** jawad_ax_ has joined #openstack-nova12:28
*** brinzhang has quit IRC12:30
*** jawad_axd has quit IRC12:31
*** jawad_ax_ has quit IRC12:33
*** nweinber__ has joined #openstack-nova12:35
*** jmlowe has quit IRC12:36
*** Sundar has joined #openstack-nova12:36
*** mvkr has quit IRC12:38
*** brinzhang has joined #openstack-nova12:38
*** brinzhang_ has quit IRC12:41
*** READ10 has joined #openstack-nova12:50
*** jawad_axd has joined #openstack-nova12:51
*** jawad_axd has quit IRC12:53
*** jawad_ax_ has joined #openstack-nova12:53
*** jawad_axd has joined #openstack-nova12:54
*** jawad_ax_ has quit IRC12:57
*** eharney has quit IRC12:57
*** jawad_axd has quit IRC13:00
*** jawad_axd has joined #openstack-nova13:01
*** jawad_axd has quit IRC13:01
*** jawad_axd has joined #openstack-nova13:02
*** psachin has quit IRC13:02
openstackgerritBrin Zhang proposed openstack/nova-specs master: Support re-configure deleted_on_termination in server  https://review.opendev.org/58033613:10
*** jmlowe has joined #openstack-nova13:11
*** mvkr has joined #openstack-nova13:12
*** xek has joined #openstack-nova13:13
*** luksky has quit IRC13:19
*** ygk_12345 has joined #openstack-nova13:20
*** tbachman has quit IRC13:20
*** mriedem has joined #openstack-nova13:22
*** tbachman has joined #openstack-nova13:23
*** brinzhang_ has joined #openstack-nova13:27
*** brinzhang_ has joined #openstack-nova13:28
*** brinzhang has quit IRC13:30
*** READ10 has quit IRC13:30
mriedemsmcginnis: queens-em tag patch is up https://review.opendev.org/#/c/692142/13:46
mriedemelod: ^13:46
*** artom has quit IRC13:48
*** tbachman has quit IRC13:49
smcginnisThanks mriedem13:50
smcginnisDid that last one make it, or still stuck in recheck hell?13:50
*** tbachman has joined #openstack-nova13:51
*** luksky has joined #openstack-nova13:52
*** dpawlik has joined #openstack-nova13:54
mriedemmerged late last night13:56
smcginnisNice!13:58
*** dave-mccowan has joined #openstack-nova14:01
*** eharney has joined #openstack-nova14:03
*** tbachman has quit IRC14:04
elodmriedem: thanks, i'm a bit sad that gibi's patch sets couldn't make it, but of course, those are not crucial to be part of the final release o:)14:06
*** tssurya has joined #openstack-nova14:06
elodso, the final and em patch looks good, thanks!14:06
*** tbachman has joined #openstack-nova14:07
*** tbachman has quit IRC14:20
*** jaosorior has joined #openstack-nova14:21
*** tbachman has joined #openstack-nova14:21
openstackgerritMerged openstack/nova master: Log some stats for image pre-cache  https://review.opendev.org/68817314:26
*** mlavalle has joined #openstack-nova14:29
*** ganso has quit IRC14:29
*** dasp has quit IRC14:32
*** artom has joined #openstack-nova14:34
*** ygk_12345 has quit IRC14:35
mriedemsomeone want to take a pass at mel's host_status UNKNOWN policy change patch? i'm +2 on it. https://review.opendev.org/#/c/679181/14:39
mriedemsame with gibi's evacuate support for qos ports series https://review.opendev.org/#/q/topic:bp/support-move-ops-with-qos-ports-ussuri+status:open14:39
mriedemdansmith: when you get a chance it'd be cool if you could take a pass at my re-worked fix for the cross_az_attach=False bug fix https://review.opendev.org/#/c/469675/14:40
mriedemshould be 35% less gross than before14:41
dansmithmriedem: enqueued14:41
dansmith35.0%? pretty precise grossness measurement14:41
mriedemi was in the lab all last night14:41
dansmithheh14:41
*** dasp has joined #openstack-nova14:42
efriedmriedem: I was working through that14:42
*** links has quit IRC14:42
efriedI have some questions, so dansmith if you wouldn't mind holding off approving it...14:42
mriedemefried: which? the az one?14:43
efriedsorry, the host_status-unknown one14:43
mriedemoh14:43
mriedemdan and i are talking about the az one14:43
efriedack14:43
sean-k-mooneydansmith: thanks for reviewing the image metadata prefilter series. I'll try to address your comments this week14:55
dansmithsean-k-mooney: np14:56
mriedemeasy multi-cell functional test-only change that's been sitting a long time https://review.opendev.org/#/c/452006/14:56
mriedem2.5 years...14:56
dansmithlet's not rush that one14:57
*** JamesBenson has joined #openstack-nova15:00
efriedmriedem, melwitt: I'm +2 on the host_status UNKNOWN policy patch, but would like some help understanding a thing before +Wing. (No hurry)15:00
*** xek_ has joined #openstack-nova15:03
efriedmriedem: how come those other release patches are stacked on the queens-em one?15:03
dansmithmriedem: for N volumes your patch also avoids making N volume.get() calls instead opting for the bulk query once yeah?15:03
mriedemno particular reason, laziness15:03
efriedight15:04
mriedemdansmith: yeah - which is where i mentioned i could split that part out if it helps15:04
dansmithoh wait15:04
dansmithit actually just does volume.get() on each one in _get_volumes_for_bdms()15:05
*** xek has quit IRC15:05
dansmithI was thinking this was an improvement, but it just moves where you make all the individual calls15:05
*** eharney has quit IRC15:05
*** jaosorior has quit IRC15:06
*** READ10 has joined #openstack-nova15:06
*** efried1 has joined #openstack-nova15:07
mriedemefried: i replied in https://review.opendev.org/#/c/679181/15:07
*** efried has quit IRC15:08
*** efried1 is now known as efried15:08
mriedemdansmith: yeah, i mentioned using GET /volumes with a list but as commented i don't think that would work and i should remove that comment15:08
*** READ10 has quit IRC15:08
dansmithyeah, sorry, just pulling all this context back in15:08
mriedemit does mean we'd avoid multiple GETs on the same volume, but you can't really boot from volume with multiple servers and the same pre-existing volume anyway15:09
mriedemunless it's a multiattach volume anyway15:09
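A minimal sketch of the point mriedem makes above about avoiding multiple GETs on the same volume: look up each distinct volume ID once and reuse the result. The helper name `get_volumes_once` and the `get_volume` callable are hypothetical stand-ins for illustration, not the code in the patch under review; this still issues one GET per distinct volume rather than a single bulk query.

```python
from typing import Callable, Dict, Iterable


def get_volumes_once(volume_ids: Iterable[str],
                     get_volume: Callable[[str], dict]) -> Dict[str, dict]:
    """Fetch each distinct volume once (hypothetical helper).

    get_volume stands in for whatever performs the per-volume GET.
    """
    volumes: Dict[str, dict] = {}
    for volume_id in volume_ids:
        # Skip IDs already fetched, e.g. the same volume listed twice.
        if volume_id not in volumes:
            volumes[volume_id] = get_volume(volume_id)
    return volumes


# Usage: count how many lookups are actually made.
calls = []


def fake_get_volume(volume_id: str) -> dict:
    calls.append(volume_id)
    return {"id": volume_id, "availability_zone": "nova"}


result = get_volumes_once(["vol-a", "vol-b", "vol-a"], fake_get_volume)
```

With the repeated ID in the usage example, the fake lookup is called twice rather than three times.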
melwittefried: let me fix it, I think I just spaced bc gmann added those two in a patch below mine15:10
mriedemmelwitt: do it in a follow up15:10
mriedemit's hard enough getting things through the gate right now15:11
melwittok15:11
efriedI just want to understand why the tests work15:11
efriedstill reading mriedem's reply...15:11
mriedemefried: the policy docs are just an omission15:11
mriedemnote that those are just docs...not code15:11
efriedoh, that's just a doc thing? (/me doesn't really understand policy)15:11
efriedgot it.15:11
mriedemright, the policy rule is real, the API method / route stuff is docs15:11
*** Sundar has quit IRC15:12
openstackgerritBalazs Gibizer proposed openstack/nova master: Add functional test for two-cell scheduler behaviors  https://review.opendev.org/45200615:12
mriedemah gdi, fake_nodes15:13
efriedmelwitt, mriedem: +W. If you want to hit any of that other stuff in the fup too, feel free, but none of it was super important.15:13
melwittok, thanks15:14
gibimriedem: fix is on the way, and then I will +2 it15:14
efriedmelwitt: also, wouldn't hurt to update the bp text for at least these two things:15:14
efried- the name of the policy rule15:14
efried- the fact that the host_status field is omitted, not included as ""15:14
efried(there turned out to be no spec for this, right?)15:15
melwittah, right, will do15:15
efriedthx15:15
melwittyeah no spec15:15
openstackgerritBalazs Gibizer proposed openstack/nova master: Add functional test for two-cell scheduler behaviors  https://review.opendev.org/45200615:15
*** dpawlik has quit IRC15:16
dansmithmriedem: question for you in there, and figure I'm standing by for you to remove that comment and split the patches if you're going to15:18
*** eharney has joined #openstack-nova15:19
*** elod has quit IRC15:19
*** jawad_axd has quit IRC15:19
mriedemi'll split the patches if needed but not keen to if i can help it, at least for the volume GET stuff since that touches a lot of unit tests. the config option docs and such are easy to split out though, but also minor.15:19
dansmithmriedem: yep I figured, hence my comment about it :)15:19
*** elod has joined #openstack-nova15:19
dansmithit's easy enough to see what test changes are related to the refactor so I'm not too concerned15:20
dansmithmriedem: are you planning to backport this?15:20
*** JamesBen_ has joined #openstack-nova15:21
*** lpetrut has quit IRC15:22
*** ganso has joined #openstack-nova15:23
*** JamesBenson has quit IRC15:23
*** gyee has joined #openstack-nova15:24
*** Luzi has quit IRC15:25
mriedemdansmith: replied, it's about pinning the instance to the default 'nova' zone which is a no-no15:26
mriedemdansmith: backporting it would probably be difficult, at least past train15:27
*** brinzhang_ has quit IRC15:27
mriedemi guess it depends on how much people/distros need it, but it's extremely latent and from talking to smorrison about this he says his users just have to always specify an az15:28
dansmithmriedem: okay, I didn't think about the cinder default being 'nova'15:28
dansmithmriedem: sure, was just curious15:28
*** brinzhang_ has joined #openstack-nova15:28
*** brinzhang_ has quit IRC15:29
*** brinzhang_ has joined #openstack-nova15:29
mriedemactually it won't even backport cleanly to train b/c of https://review.opendev.org/#/c/667133/15:29
*** brinzhang_ has quit IRC15:29
mriedemthat's pretty trivial to resolve though15:29
*** brinzhang_ has joined #openstack-nova15:30
openstackgerritMatt Riedemann proposed openstack/nova master: Add support matrix for Delete (Abort) on-going live migration  https://review.opendev.org/62578115:33
*** brinzhang_ has quit IRC15:34
artommriedem, can https://review.opendev.org/#/c/649419/  be expected to merge? I want to know if I can backport it internally to Newton15:35
dansmithmriedem: yeah, it's cool, I just kinda thought you'd be wanting that, is all15:35
artomOr should I stick to the merged pike version15:35
openstackgerritMatthew Booth proposed openstack/nova master: Cleanup libvirt test_mount unit tests  https://review.opendev.org/69217315:37
openstackgerritMatthew Booth proposed openstack/nova master: Allow alternate implementations of mount/umount in _HostMountState  https://review.opendev.org/69217415:37
mriedemartom: ask your stable core brethren15:38
artomlyarwood, around? plz to be lookink at https://review.opendev.org/#/c/649419/ :)15:39
mriedemdansmith: i'm cool with not backporting it and letting it bake on master. if let's say some edge distro product thing needed it then i could be convinced to backport...15:39
artomlyarwood, it would facilitate the backport for https://bugzilla.redhat.com/show_bug.cgi?id=1633909, which we want for OSP10z1415:39
openstackartom: Error: Error getting bugzilla.redhat.com bug #1633909: NotPermitted15:39
dansmithmriedem: heh15:39
artomopenstack, apologies15:39
artomAh, he took the whole day off as PTO15:40
mriedemthere are others15:40
mriedemthere is one in france always saying he's happy to review backports15:40
*** jmlowe has quit IRC15:41
artomHe's on PTO as well15:41
mriedemjesus15:42
dansmithartom: I can hit that if you want15:42
artomdansmith, thank you :)15:42
* melwitt slips under the radar15:42
dansmithseems reasonable, it's a nasty issue, if we still care about it15:42
artommelwitt, you were next on the list! But dansmith preempted tht15:42
artom*that15:42
melwittwoohoo15:42
openstackgerritEric Fried proposed openstack/nova master: Only allow one scheduler service in tests  https://review.opendev.org/68248615:42
dansmithartom: I just couldn't stand to see where it was going with mriedem calling out all the team15:44
artomdansmith, ... it was going to melwitt :P15:44
dansmithartom: right, who I figured was stranded with no services in the third world country we call California15:45
artomAnd on fire.15:45
melwitthaha15:46
mriedemyou can always bug tonyb https://review.opendev.org/#/admin/groups/540,members15:46
openstackgerritEric Fried proposed openstack/nova master: Only allow one scheduler service in tests  https://review.opendev.org/68248615:47
melwittI have learned to greatly appreciate when I have internet access15:48
mriedemmy grandpappy always said internet is a privilege not a right15:48
mriedemyou know this has to be done: https://southpark.cc.com/clips/x2wsii/is-there-internet-here15:49
melwittlol15:51
*** jaosorior has joined #openstack-nova15:51
dansmiththat's pretty much how I picture california in my head15:51
*** luksky has quit IRC15:52
melwittthat was me driving to different coffee shops and libraries looking for wifi. first three places I tried didn't have internet15:54
melwitt"is there internet here??"15:55
artommelwitt, actually, you might still be on the hook for https://review.opendev.org/#/c/649419/ and the patch above it, looks like dansmith didn't +W15:55
artomTwo RH +2s for a non-RH backport is cool, right?15:55
melwittyeah15:55
dansmithartom: yeah I figured if she was around she'd be hitting it after I'm done15:55
artomI misled her by explicitly telling her you'd preempted her15:56
artomMea culpa15:56
melwitt:)15:56
*** mkrai has joined #openstack-nova15:56
dansmithmriedem: aren't you respinning the az one to remove the bulk comment? that's why I didn't +216:02
*** davee_ has joined #openstack-nova16:03
mriedemdansmith: yeah i will, was busy abandoning old patches from my dashboard16:08
*** ivve has quit IRC16:10
*** nanzha has quit IRC16:12
openstackgerritMatt Riedemann proposed openstack/nova master: WIP: Reset vm_state to original value if rebuild claim fails  https://review.opendev.org/69218516:15
melwittmriedem: do you remember how is REQUIRES_LOCKING related to NeutronFixture? I couldn't find a note about it in nova/test.py https://review.opendev.org/64941916:19
mriedemmelwitt: i think it's related to https://review.opendev.org/#/c/649385/ and how if you're using nova-net you have to require locking b/c of the network manager16:21
*** igordc has joined #openstack-nova16:21
mriedembut it's been a long time since i did that backport16:21
melwittok, was just curious16:21
openstackgerritMatt Riedemann proposed openstack/nova master: Default AZ for instance if cross_az_attach=False and checking from API  https://review.opendev.org/46967516:22
mriedemdansmith: ^16:22
*** mvkr has quit IRC16:31
*** ricolin has quit IRC16:41
*** jawad_axd has joined #openstack-nova16:44
openstackgerritEric Fried proposed openstack/nova master: Use SDK for add/remove instance info from node  https://review.opendev.org/65969116:46
openstackgerritEric Fried proposed openstack/nova master: Use SDK for getting network metadata from node  https://review.opendev.org/67021316:46
*** jawad_axd has quit IRC16:49
*** links has joined #openstack-nova16:50
*** links has quit IRC16:50
*** damien_r has quit IRC16:51
AJaegernova stable cores, please review https://review.opendev.org/692032 to update train opensuse job so that Infra can retire openSUSE 15.016:57
*** rpittau is now known as rpittau|afk16:59
*** mdbooth has quit IRC16:59
*** openstackstatus has joined #openstack-nova17:03
*** ChanServ sets mode: +v openstackstatus17:03
*** jmlowe has joined #openstack-nova17:05
mriedemdone17:09
*** tssurya has quit IRC17:12
*** mdbooth has joined #openstack-nova17:13
*** dlbewley has joined #openstack-nova17:15
openstackgerritMatt Riedemann proposed openstack/nova master: Reset vm_state to original value if rebuild claim fails  https://review.opendev.org/69218517:17
artommelwitt, dansmith thank you folks for the backport reviews!17:18
*** mvkr has joined #openstack-nova17:21
*** JamesBen_ has quit IRC17:23
*** dpawlik has joined #openstack-nova17:23
*** JamesBenson has joined #openstack-nova17:24
*** Garyx_ has quit IRC17:25
*** xek_ has quit IRC17:29
*** mdbooth has quit IRC17:33
*** mdbooth has joined #openstack-nova17:34
*** dtantsur is now known as dtantsur|afk17:49
*** Garyx has joined #openstack-nova17:51
*** mlavalle has quit IRC17:53
*** macz has joined #openstack-nova18:04
mriedemdansmith: i think i found a justification for https://review.opendev.org/#/c/669545/ now - https://bugs.launchpad.net/nova/+bug/185068218:09
openstackLaunchpad bug 1850682 in OpenStack Compute (nova) "functional tests in rocky randomly fail with "Build of instance was re-scheduled: Cannot modify readonly field uuid"" [Undecided,New]18:09
dansmithaight18:10
*** pcaruana has quit IRC18:10
*** nweinber__ has quit IRC18:10
openstackgerritMatt Riedemann proposed openstack/nova master: Nova compute: add in log exception to help debug failures  https://review.opendev.org/66954518:10
*** eharney has quit IRC18:14
*** mkrai has quit IRC18:14
openstackgerritEric Fried proposed openstack/nova master: Use annotated ddt for test_cpu_policy_constraint  https://review.opendev.org/69220518:18
* artom can't seem to POST a bug to launchpad18:18
artomSome sort of timeout18:19
artomAnd their stupid interface means hitting the back button loses all your bug text18:19
artomIt's still saved by firefox as part of the POST request though, so if I just keep hitting refresh and re-sending...18:19
openstackgerritArtom Lifshitz proposed openstack/nova master: Avoid error 500 on shelve task_state race  https://review.opendev.org/69220618:22
artomThere we go18:24
artommdbooth, sean-k-mooney ^^ keep me honest18:24
artomThere was a definite lack of hairy tentacles18:24
artomDid I miss something?18:24
*** tbachman has quit IRC18:25
*** luksky has joined #openstack-nova18:26
*** dpawlik has quit IRC18:26
mriedemartom: yeah just refresh on lp until it goes through18:30
AJaegerthanks, mriedem18:31
*** tbachman has joined #openstack-nova18:34
mriedemartom: question on your patch18:34
*** dpawlik has joined #openstack-nova18:37
*** AdamMork has quit IRC18:40
artommriedem, cheers, replied18:41
artom(To one of the things, at any rate)18:43
mriedemyeah, this is just whack a mole18:44
mriedemwhich i'm not really against but18:44
mriedemwe could also just have a generic decorator in the api methods that handles this exception and returns HTTPConflict so we don't have to do the dance in the wsgi controller route handler code18:45
artomYep18:45
artomI wasn't aware of the widespreaded-ness of this18:45
artomSuch a decorator would also remain backportable18:45
mriedemyeah and no maybe,18:46
mriedemnot all route handlers have 409 in their @wsgi.expected_errors((404, 409)) decorator18:46
mriedemi don't know how much that matters honestly - i wouldn't be surprised if we leak things through in some cases18:47
artomWe could always do the wait and see thing18:47
artomWhack this mole, and if another mole pops up, think about a more generic approach18:47
mriedemthat's what i'd probably do18:48
mriedemno point overengineering this now18:48
artomAck18:48
* artom continues whacking the mole18:48
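The generic decorator mriedem floats above could look roughly like the sketch below. Everything here is an assumption for illustration: `UnexpectedTaskStateError` and `HTTPConflict` are local stand-in classes (a real API layer would presumably raise something like `webob.exc.HTTPConflict`), and `conflict_on_task_state_race` is a hypothetical name. It is not the approach the patch took, which handled the one shelve case directly.

```python
import functools


class UnexpectedTaskStateError(Exception):
    """Stand-in for the exception raised on a task_state race."""


class HTTPConflict(Exception):
    """Stand-in for an HTTP 409 response exception."""
    status_code = 409


def conflict_on_task_state_race(func):
    """Translate a task_state race into a 409 instead of a 500."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except UnexpectedTaskStateError as exc:
            # Re-raise as a conflict so the caller sees 409, not 500.
            raise HTTPConflict(str(exc)) from exc
    return wrapper


@conflict_on_task_state_race
def shelve(instance: dict) -> str:
    # Hypothetical API method: refuse if another task is in flight.
    if instance.get("task_state") is not None:
        raise UnexpectedTaskStateError("instance is busy")
    return "shelving"
```

As noted in the discussion, a route handler would also need 409 listed among its expected errors for this to come through cleanly.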
*** mlavalle has joined #openstack-nova18:53
umbSublimeIs there any reference documentation to help fill out a new blueprint (to avoid wasting people's time)?19:00
*** abaindur has joined #openstack-nova19:01
mriedemumbSublime: https://docs.openstack.org/nova/latest/contributor/blueprints.html19:03
*** tesseract has quit IRC19:03
*** eharney has joined #openstack-nova19:03
*** pcaruana has joined #openstack-nova19:05
*** ivve has joined #openstack-nova19:06
efriedmriedem: added comment as requested to https://review.opendev.org/#/c/682486/ and gibi is +2, wanna send it?19:14
mriedemhmmm, maybe19:15
dansmithcah-ripes19:15
dansmithchange failed in the gate and rechecked 4.5h ago, still hasn't started a single job.19:15
mriedemthe gate is not happy19:16
mriedembuttloads of ssh fails in guests for different reasons,19:16
*** dpawlik has quit IRC19:16
mriedemthe timed out talking to cell db thing,19:16
mriedemhttps read timeout errors19:16
artom    b'Expected: save(expected_tast_state=[None])'19:16
artom    b'Actual:   save(expected_task_state=[None])'19:16
artom*sigh*19:16
artomOh!19:16
* artom fails19:16
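The failure artom pasted above is the kind `unittest.mock` reports when a keyword argument in the assertion is misspelled. A small sketch, using a plain `mock.Mock()` as a stand-in for the instance object (an assumption, not the actual test from the patch):

```python
from unittest import mock

# Stand-in for the instance object; the real test is not shown in the log.
instance = mock.Mock()
instance.save(expected_task_state=[None])

try:
    # Misspelled kwarg ("tast"), so the recorded call does not match.
    instance.save.assert_called_once_with(expected_tast_state=[None])
    typo_detected = False
except AssertionError:
    typo_detected = True

# The correctly spelled assertion passes.
instance.save.assert_called_once_with(expected_task_state=[None])
```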
openstackgerritArtom Lifshitz proposed openstack/nova master: Avoid error 500 on shelve task_state race  https://review.opendev.org/69220619:19
* artom -> school run19:21
*** artom has quit IRC19:21
*** jaosorior has quit IRC19:21
*** abaindur has quit IRC19:21
*** abaindur has joined #openstack-nova19:21
*** tbachman has quit IRC19:23
openstackgerritMerged openstack/nova master: api-ref: remove mention about os-migrations no longer being extended  https://review.opendev.org/68210219:28
*** pcaruana has quit IRC19:29
*** jawad_axd has joined #openstack-nova19:29
efriedmriedem: are there other projects that are running tempest-slow that could stop doing that, similar to  https://review.opendev.org/683988 ?19:32
*** jawad_axd has quit IRC19:34
*** READ10 has joined #openstack-nova19:36
mriedemefried: don't know19:36
*** pcaruana has joined #openstack-nova19:39
mriedemefried: i'd like to remove that part of the comment that gibi pointed out and then approve, is that ok with you?19:43
efriedsure19:43
openstackgerritMatt Riedemann proposed openstack/nova master: Only allow one scheduler service in tests  https://review.opendev.org/68248619:44
openstackgerritMerged openstack/nova master: Refactor volume connection cleanup out of _post_live_migration  https://review.opendev.org/68274119:54
openstackgerritMerged openstack/nova master: Move pre-3.44 Cinder post live migration test to test_compute_mgr  https://review.opendev.org/68359719:55
openstackgerritMatt Riedemann proposed openstack/nova master: Avoid error 500 on shelve task_state race  https://review.opendev.org/69220619:57
*** pcaruana has quit IRC20:09
*** abaindur has quit IRC20:13
*** abaindur has joined #openstack-nova20:13
*** abaindur_ has joined #openstack-nova20:17
*** abaindur has quit IRC20:18
*** abaindur_ has quit IRC20:19
*** abaindur has joined #openstack-nova20:19
*** READ10 has quit IRC20:27
*** adriant has quit IRC20:31
*** aloga has quit IRC20:36
*** artom has joined #openstack-nova20:45
*** ralonsoh has quit IRC20:46
*** tbachman has joined #openstack-nova20:47
dansmithefried: asking because I think I've seen some notes from you around this code in the past,20:52
dansmithbut do you have strong opinions on what we should do if we have, say, three glance endpoints and we get an error from one trying to do something?20:52
efrieddansmith: we should not have three glance endpoints20:52
dansmithlooks like we round-robin the available endpoints, but don't move onto the next one after a failure20:52
efriedunless you mean three interfaces20:53
dansmithefried: three urls20:53
efriedf, did we not get rid of that api_servers thing yet?20:53
efriedstand by20:53
dansmithno, the round-robin-ing you mean? that's still there20:53
dansmithI honestly thought that we had pushed responsibility for that down into the glance client, but ... the round-robin stuff is there20:54
efriedomg we haven't even deprecated it yet!20:54
efriedwe should do that.20:54
dansmithefried: use more words.. what are the people currently using that supposed to do?20:54
efriedsorry20:54
efriedokay20:54
*** markvoelker has quit IRC20:55
efriedpeople using api_servers were theoretically stuffing multiple endpoints in there, and then they were responsible for making sure all of those endpoints pointed to the same images.20:55
*** markvoelker has joined #openstack-nova20:55
dansmithright20:55
efriednova would, on a given call, pick "the next one" and use it.20:55
efriedafaik nova never had any logic to, say, "try the next one" if one failed.20:56
dansmithright, it doesn't, but if we can have multiples we should, but.. go on20:56
efriedBecause nova doesn't keep track of how many there are. It just loads them up and cycles over them.20:56
efriedso if we had only one, we still "cycle" over just the one.20:56
efriedand if we tried to add a "try next" thing, we would just end up trying... the same one.20:56
dansmithwell, all we'd have to do is retry N-1 times for an endpoint count of N and then we'd hit them all and not dupe one or more20:57
efriedif api_servers was a thing we still wanted people to use, I could see enhancing our logic to do that properly - keep track of how many there are and, on certain classes of error, try the next and so on until we succeed or come full circle20:57
efriedbut20:57
efriedwe don't want people to use api_servers.20:57
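The "retry N-1 times for an endpoint count of N" idea discussed above could be sketched roughly as follows. This is only an illustration under stated assumptions, not nova's actual glance client code: `RoundRobinClient`, `EndpointError`, and the `request` callable are hypothetical names introduced here.

```python
import itertools


class EndpointError(Exception):
    """Hypothetical error type for a failed call against one endpoint."""


class RoundRobinClient:
    """Cycle over endpoints; on failure, try each remaining endpoint once."""

    def __init__(self, endpoints):
        if not endpoints:
            raise ValueError("need at least one endpoint")
        # Remember how many endpoints exist so a retry loop can stop
        # after coming full circle instead of hitting the same one again.
        self._count = len(endpoints)
        self._cycle = itertools.cycle(endpoints)

    def call(self, request):
        """Run request(endpoint): one attempt plus at most N-1 retries."""
        last_error = None
        for _ in range(self._count):
            endpoint = next(self._cycle)
            try:
                return request(endpoint)
            except EndpointError as exc:
                last_error = exc
        # Every endpoint failed once; propagate the last failure.
        raise last_error
```

With a single configured endpoint the loop makes exactly one attempt, so the "we would just end up trying the same one" case does not arise.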
dansmithand that is because why?20:57
efriedThey should use a single endpoint that's a front for a load balancer20:57
efriedand use standard ksa options like every other service20:58
dansmithyeah, so the problem there is that you have to have a HA LB for that20:58
dansmithcloud client software needs to handle failures like this to be robust, I'm not sure why nova shouldn't20:59
efriedI thought there was a project that did that for you20:59
efrieddo we try to HA other things? cinder, keystone, neutron...?20:59
efried(trick question)20:59
*** markvoelker has quit IRC20:59
*** aloga has joined #openstack-nova20:59
dansmithwell, glance and keystone would be much easier to retry than the others20:59
smcginnisNot the API, but cinder services can be run active/active.20:59
smcginnisAnd I've seen people use the active/passive pacemaker setup for things too.21:00
dansmithyeah21:00
efriedsmcginnis: behind a single catalog endpoint tho?21:00
smcginnisYeah21:00
efriedmelwitt: would you mind giving this another look? https://review.opendev.org/#/c/615704/21:00
dansmithefried: well, point is, if you're going to deprecate it, I'd appreciate you getting that patch up so I can point the people I'm talking to about it21:00
dansmithefried: I remember you talking about this a while ago, had kinda assumed it had already happened21:01
efrieddansmith: are you playing devil's advocate or do you really think nova should be responsible for load balancing glance?21:01
dansmithtripleo is using api_servers, so they're going to need some notice21:01
dansmithefried: not load balancing, but being able to have multiple endpoints and find one that works, yeah21:02
efriedyeah, "everyone" is probably still using api_servers, because we only added the standard ksa stuff in, I think, queens.21:02
dansmithhaving to have a floating VIP and HA'ing services behind that VIP for everything really sucks21:02
efriedis tripleo using multiple endpoints to api_servers?21:02
dansmithefried: just to be clear, "the ksa stuff" doesn't imply anything about multiple endpoints right?21:03
dansmithefried: they are21:03
melwittefried: whoa, that is old. I can look at it again once I figure out what it's about again21:03
dansmithefried: for glance, but we don't do the failover stuff, so not really for much gain21:03
efriedmelwitt: thanks, I got a nudge from a downstream (not $employer) who needs it.21:03
melwittack21:03
efrieddansmith: right, the way nova handles it really isn't HA-ish, it's more load-balance-ish.21:04
dansmithefried: currently, right21:04
efrieddansmith: and no, "the ksa stuff" doesn't have anything for multiple endpoints... I don't think. (Can you put multiple endpoints in the service catalog?)21:04
dansmithgah21:05
efrieddansmith: I can tell you this, cutting over to sdk to talk to glance, api_servers will have to be gone first.21:05
dansmithbrb21:05
*** ociuhandu has joined #openstack-nova21:05
dansmithefried: I dunno about the service catalog, probably not21:06
dansmithnot saying that's a good thing, but.. :)21:06
dansmithefried: okay so, can you put up a patch to deprecate it real quick so I can point people to it?21:07
efriedI'm going to have to nail down mordred for the more philosophical aspects of this.21:07
dansmithefried: -W it if you want for completeness and comments or something21:07
efriedI can do that, yeah, though now you've got me doubting if we *should*.21:07
dansmithefried: ack, we can blame lots of stuff on him like this21:07
dansmithefried: that's fine, a place to collect feedback21:07
efried...21:07
dansmithefried: the tripleo people can go voice their hate there if they want, or they may say "meh, okay, we can VIP this too" I honestly don't know21:08
efriedcool21:08
dansmithefried: can you do that, like, uh, quickly so I can reply to someone quickly? :D21:08
efriedon it rn21:08
dansmithspanks.21:08
efriedif people would stop bugging me in IRC.21:08
efrieddansmith: while you're waiting, I found this (search for api_servers) http://specs.openstack.org/openstack/nova-specs/specs/queens/implemented/use-ksa-adapter-for-endpoints.html#proposed-change21:12
*** ircuser-1 has joined #openstack-nova21:12
dansmith"The exception is [glance]api_servers, which will continue to be supported."21:13
*** panda|pto has quit IRC21:14
dansmithI wonder if "valid_interfaces" is an arbitrary list, such that you could have internal1, internal2, internal3?21:14
openstackgerritJoshua Cornutt proposed openstack/nova master: Moving get_hash_str() from md5 to sha-256  https://review.opendev.org/61570421:15
dansmithI also don't know how LBs (or HAproxy) handle a failed http session post-request.. meaning if I do a GET against glance and then the connection is reset after I've sent the GET, does the LB retry and re-play that against the next server?21:16
mriedemdansmith: i think it might be, we've had to dig into the ksa code a few times to answer that21:16
dansmithI'd kindof doubt it.. figure it'd fail the client, mark the server, next request doesn't include the bad server21:16
dansmithwhich means if we don't retry we propagate the failure up21:16
dansmithwhich is fine in the trivial case, but not so much when we're making multiple calls as part of a complex operation, like against cinder or neutron21:17
dansmithmriedem: might be what, arbitrary?21:17
dansmithor might be constrained?21:17
*** panda has joined #openstack-nova21:18
dansmithbased on the description in that spec, though, it tries them in order, so you'd hammer internal1 until it fails, and then try internal2, it sounds like21:18
dansmithwhich wouldn't be what you'd want really21:18
openstackgerritEric Fried proposed openstack/nova master: Deprecate [glance]api_servers  https://review.opendev.org/69222721:21
efrieddansmith: ^21:21
mriedemdansmith: i think valid_interfaces is arbitrary but i've had to look this up multiple times in the ksa code21:21
dansmithefried: thanks21:21
efrieddansmith: no, valid_interfaces is public/private/internal only21:21
efriedI think they can have 'URL' suffixes for legacy reasons21:21
mriedemalso, on that note in the spec for api_servers, i want to say the godaddy or smorrison made a point that they still really needed that when the spec was being reviewed21:22
*** maciejjozefczyk has quit IRC21:22
dansmithmriedem: yeah, I *think* people are going to be quite unhappy about it, but we'll see21:22
mriedemmight have been in the ML, i don't see it in review comments21:22
efried"really needed" meaning it was easier for us to continue to support it than it was for them to stop relying on our shoddy load balancing hack21:22
efriedI'm having trouble finding sympathy for "wah, I don't want to set up a load balancer in my cloud" from someone who has enough resources to have multiple glance API endpoints in the first place.21:24
efriedIsn't there an openstack project that does load balancing for you?21:24
dansmithefried: a six-node mini cloud would still have three controllers in our setups21:24
mriedemefried: are you thinking of octavia?21:25
dansmithefried: we don't deploy less than three controllers (and thus glance apis) no matter how big or small your cloud is, which means it's *always* an issue21:25
efried...yes, it looks like I may be thinking of octavia21:25
mriedemi don't know much about octavia, but doesn't that spawn VMs to run the load balancer?21:25
mriedemand thus kind of a chicken and egg issue?21:25
efriedI dunno how it works21:25
dansmithyeah, octavia is not a solution to this problem21:25
mriedemjohnsom would know21:26
dansmithit's a thing you use on top of a functioning cloud to LB an application21:26
mriedemright21:26
dansmith(afaik)21:26
efriedit's not a chicken/egg when you're using it as a *load balancer*. It would be chicken/egg if you were using it for HA.21:27
mriedeme.g. we don't use trove instances to run the nova db...21:27
efriedwhich, again, this is not and never has been.21:27
*** ociuhandu has quit IRC21:27
dansmithefried: it's totally chicken and egg for nova to use it between it and glance, for whatever reason :D21:28
efriedbut at this point I need to stfu and get back to unwedging osc. I'll look for discussion on that patch I guess.21:28
dansmithyep, thanks pointing people at that soonly21:28
efriedand probably expect dansmith to start a ML thread for his people21:28
efriedyeah21:28
dansmithyeah, octavia requires a functioning nova to work against21:29
dansmithso nova using octavia to talk to glance is not a thing :)21:29
mriedemhere is the thread btw http://lists.openstack.org/pipermail/openstack-dev/2017-April/116028.html21:29
*** ociuhandu has joined #openstack-nova21:29
mriedemwhich also bled into may21:29
mriedemhttp://lists.openstack.org/pipermail/openstack-dev/2017-May/116574.html is godaddy21:30
dansmithyeah21:32
dansmithI know tripleo and similar things HA certain endpoints, but AFAIK, glance and rabbit have always been multiple servers21:32
*** JamesBenson has quit IRC21:33
dansmithespecially for fundamental things like that, it's far easier to set that kind of thing in config vs. something you need an api to do,21:33
dansmithfor the same reason that cells in the database is problematic21:33
dansmithanyway21:33
johnsomo/ Not all of the Octavia drivers use nova, but I still would not recommend using Octavia load balancers for internal API traffic. There are many chickens and eggs to fight with for that.  I.e. keystone, neutron, potentially glance/nova, etc.21:34
mriedemnote that a lot of that ML thread is about removing the endpoint_override options, which we haven't done i don't think21:34
mriedemksa has an override option21:34
mriedemhttp://lists.openstack.org/pipermail/openstack-dev/2017-May/116149.html sums up what dan was saying21:35
dansmithwell, if I'm understanding this right, once it's down in ksa then we lose the ability to retry smartly right?21:35
mriedemi'm just saying the thread diverged a bit,21:35
dansmithyup21:35
mriedembetween overriding the endpoint (don't use the catalog) and multiple api servers for load balancing21:35
mriedemsomething something openstack ironic > k8s ingress > nova|glance > profit!!!21:37
dansmithand godaddy and blair there actually mention failover more than load-balancing21:37
dansmithheh, well, k8s actually handles this as part of the infra, AFAIK, so ... yeah ;)21:37
mriedemyeah, that's blair's "I think it'd be useful if all OpenStack APIs and their clients actively handled this poor-man's HA without having to resort to haproxy"21:38
mriedemi.e. would be nice if openstack just handled this natively21:38
dansmithright21:40
sean-k-mooneyhonestly k8s with the nginx ingress controller + cert-manager + Let's Encrypt is a nice way to get free TLS and do load balancing or at least reverse proxying21:40
dansmithbecause it's kinda dumb for us to reject all kinds of pet features and then not be elastic and failure-handly ourselves21:40
sean-k-mooneyjohnsom: does octavia have any support for metallb?21:41
johnsomsean-k-mooney: not at the moment21:42
johnsomYou can use Octavia for your ingress in k8s though21:42
sean-k-mooneyyes21:42
*** ociuhandu has quit IRC21:43
sean-k-mooneybut that does seem, well, like more effort21:43
johnsomThe other thing to think about when replacing an LB tier is rate limiting, WAF, and TLS offload21:43
sean-k-mooneyyou do not necessarily want your load balancer to have to do those things21:44
mriedemhttps://docs.openstack.org/nova/latest/configuration/config.html#glance.api_servers isn't an LB tier :)21:44
sean-k-mooneybut yes21:44
johnsomI guess that is the context I missed. That seems duplicate to keystone, why not just deprecate it?21:45
mriedemkeystone doesn't have a round robin concept like what this thing is21:45
mriedemit's just a very dumb, oh endpoint 1 failed, try endpoint 221:46
johnsomDNS provides round robin21:46
dansmithDNS doesn't provide smart round robin unless you resolve the name to all the addresses and then try them yourself21:47
dansmithsmart meaning "know when to stop retrying"21:47
dansmithdumb use of DNS RR gets you poor-man's LB, which is not what we're talking about21:47
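The "resolve the name to all the addresses and then try them yourself" approach mentioned above might look something like this. It is a minimal sketch, assuming plain TCP connections; `resolve_all` and `connect_first_working` are hypothetical helper names, and only `socket.getaddrinfo` and `socket.create_connection` are real standard-library calls.

```python
import socket


def resolve_all(host, port):
    """Return every distinct address the name resolves to, in resolver order."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    seen = []
    for _family, _type, _proto, _canon, sockaddr in infos:
        if sockaddr not in seen:
            seen.append(sockaddr)
    return seen


def connect_first_working(host, port, resolve=resolve_all,
                          connect=socket.create_connection, timeout=5.0):
    """Try each resolved address once; give up after the last one fails."""
    last_error = None
    for sockaddr in resolve(host, port):
        try:
            # sockaddr[:2] is (address, port) for both IPv4 and IPv6 results.
            return connect(sockaddr[:2], timeout=timeout)
        except OSError as exc:
            last_error = exc
    # "Know when to stop retrying": each address got exactly one attempt.
    raise last_error or OSError("name resolved to no addresses")
```

The `resolve` and `connect` parameters exist only so the retry logic can be exercised without a network; a real client would likely also want to distinguish retryable errors from fatal ones.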
dansmithsince we're off on a tangent, I'm gonna drop off21:48
sean-k-mooneydansmith: so i have scrolled back to the point where you have strong opinions on how this should work. what was the original question?21:52
openstackgerritMerged openstack/nova master: Mark "block_migration" arg deprecation on pre_live_migration method  https://review.opendev.org/68296321:53
openstackgerritMerged openstack/nova master: Move Destination object tests to their own test class  https://review.opendev.org/68301721:53
openstackgerritMerged openstack/nova master: Remove 'test_cold_migrate_with_physnet_fails' test  https://review.opendev.org/68396121:53
sean-k-mooneydansmith: was it solely how to support glance active active21:54
*** ociuhandu has joined #openstack-nova21:55
sean-k-mooneyand then the load balancer conversation was a tangent from [glance]/api_servers not being a load-balancing tier or even really an HA solution21:56
eanderssonIs there a way to make nginx retry when uwsgi (nova-metadata-api) gets overwhelmed?21:56
eanderssonSeeing nova-metadata-api falling behind and causing nginx to throw 502s21:57
sean-k-mooneyum, can you just increase the process count in the uwsgi config21:58
eanderssonWe have 5x 24 nova-metadata-plus processes and feels like it should be high enough21:58
sean-k-mooneyah ok21:58
eanderssonKinda feel like 120 should be enough.21:58
sean-k-mooneyhave you turned on memcache caching for those too21:58
eanderssonI'll probably end up bumping it, but feel like I am missing something21:59
eandersson[cache] has memcached enabled, and keystone auth too21:59
donnydmelwitt: thank you :)21:59
sean-k-mooneyok because if you have lots of workers but use the normal in-memory cache you end up rebuilding stuff in each worker over and over again22:00
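The per-worker rebuild cost described above can be illustrated with a toy count. This is a simplified model, not nova's caching code: `count_builds` is a hypothetical function, a plain dict stands in for memcached, and it assumes every worker eventually serves a request for every instance.

```python
def count_builds(num_workers, instance_ids, shared):
    """Count expensive metadata builds when each worker serves each instance."""
    shared_cache = {}
    builds = 0
    for _worker in range(num_workers):
        # A shared dict stands in for memcached; otherwise each worker
        # starts with its own empty in-process cache.
        cache = shared_cache if shared else {}
        for instance_id in instance_ids:
            if instance_id not in cache:
                cache[instance_id] = f"metadata-for-{instance_id}"
                builds += 1
    return builds


# 120 workers (5 x 24, as in the discussion) and 1000 instances.
per_worker = count_builds(120, range(1000), shared=False)
shared = count_builds(120, range(1000), shared=True)
```

Under these assumptions the in-process case does one build per worker per instance, while the shared case does one build per instance in total.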
eanderssonGonna bump the worker count to like 200, but will dig deeper22:03
eanderssonMight do something like https://stackoverflow.com/questions/44581719/resource-temporarily-unavailable-using-uwsgi-nginx as well to see if that helps a bit22:03
eanderssonsince it looks like a ddos, but kinda unavoidable when you have... many many thousands vms :p22:03
*** ociuhandu has quit IRC22:04
sean-k-mooneyya it's possible you're hitting connection limits. how is memory looking on those nodes? you don't see OOM events or any other red flags?22:04
*** ociuhandu has joined #openstack-nova22:10
eanderssonWe had to beef them up due to neutron.22:10
eanderssonSo got plenty of memory :D22:10
eanderssonMight be stacking too many services. Could be hitting limits on context switching.22:11
sean-k-mooneyim glad but also sad. "whats using all the ram? our cloud software :("22:11
eanderssonYea I think I mentioned this already, but each neutron worker peaks out at 8.2gb memory per process when under heavy load :'(22:12
sean-k-mooneyya that is much more than i was expecting22:12
eanderssonI do wonder what kind of hardware people use for their larger clusters22:13
eanderssonfor the "control plane"22:13
mriedemare you sure it's using the cache properly? do you see "Using cached metadata for" in the metadata API debug logs?22:13
*** jawad_axd has joined #openstack-nova22:13
eanderssonI'll check in our lab setup sec22:14
sean-k-mooneyenabling memcache instead of the dict cache makes a huge difference so if it's not using it properly you could definitely reduce the number of workers as a result of fixing it22:15
*** ociuhandu has quit IRC22:15
mriedemfwiw cern and huawei public cloud run a metadata api per cell with direct access to the cell db rather than a huge since metadata api that spans the entire deployment22:16
mriedems/since/single/22:16
mriedemhttps://docs.openstack.org/nova/latest/user/cellsv2-layout.html#nova-metadata-api-service22:16
sean-k-mooneymriedem: ya that seems like a better way to approach scaling22:16
mriedemblizzard is only 1 cell though right?22:16
sean-k-mooneydon't you have to modify neutron to proxy to the local instance in that case too22:17
eanderssonYea seeing22:17
eandersson> Using cached metadata for X get_metadata_by_remote_address22:17
eanderssonYea one cell many regions22:18
*** jawad_axd has quit IRC22:18
mriedemsean-k-mooney: yes that's in that doc22:19
sean-k-mooneywould it make sense even in a single cell deployment to configure different neutron metadata proxies to hit different apis directly rather than load balance?22:20
*** mlavalle has quit IRC22:20
*** ociuhandu has joined #openstack-nova22:20
sean-k-mooneyeandersson: out of interest do you know how long the response from the metadata api takes when you don't get the 502s22:21
eandersson~0.00422:22
sean-k-mooneyok that is normal. we have seen up to 10s of seconds in the gate randomly before we enabled the memcache caching22:23
eanderssonI might have overtuned nginx tbh combined with the nature of metadata22:23
eanderssonI set worker count to auto in nginx which ends up creating processes based on the vcpu count22:24
eanderssonbut uwsgi worker is set to 2/vcpu22:24
*** igordc has quit IRC22:25
sean-k-mooneyso i honestly dont know how much load you have but i would try a smaller value like 822:25
sean-k-mooneyinstead of 2422:25
sean-k-mooneyi certainly would not have more than 1 per hyperthread22:25
eanderssonYea exactly got one per core now (48/2)22:25
eanderssonfor nova-api at least22:26
sean-k-mooneyya if your response time is on the order of 4ms i would bring that down as i said to like 8-16 and see if that helps rather than go from 24 to 3022:27
eanderssonWe did a lot of benchmarking, but it's difficult to tell how the system performs in real environments22:27
*** dviroel has quit IRC22:29
*** ociuhandu has quit IRC22:31
*** ociuhandu has joined #openstack-nova22:33
*** ociuhandu has quit IRC22:37
*** ociuhandu has joined #openstack-nova22:38
*** mriedem has quit IRC22:40
*** ociuhandu has quit IRC22:43
*** adriant has joined #openstack-nova22:51
*** jawad_axd has joined #openstack-nova22:55
*** markvoelker has joined #openstack-nova22:56
*** jawad_axd has quit IRC22:59
*** markvoelker has quit IRC23:00
*** tkajinam has joined #openstack-nova23:01
*** tkajinam has quit IRC23:01
*** macz has quit IRC23:01
*** tkajinam has joined #openstack-nova23:04
*** slaweq has quit IRC23:05
*** slaweq has joined #openstack-nova23:10
sean-k-mooneyartom: i left a comment for you here https://review.opendev.org/#/c/692206/3/nova/api/openstack/compute/shelve.py23:14
sean-k-mooneyartom: i think that would also fix the issue more generally23:14
sean-k-mooneybut i have not tested it23:14
sean-k-mooneyanyway night all o/23:15
*** slaweq has quit IRC23:16
*** jawad_axd has joined #openstack-nova23:16
artomsean-k-mooney, nice idea23:17
artomI'd kinda be afraid of any side effects23:17
artomIt'd be a wide-reaching change23:17
*** jawad_axd has quit IRC23:21
*** ivve has quit IRC23:21
*** jawad_axd has joined #openstack-nova23:37
*** jawad_axd has quit IRC23:41
openstackgerritMerged openstack/nova master: fixtures: Add support for security groups  https://review.opendev.org/68680223:43
*** ociuhandu has joined #openstack-nova23:43
openstackgerritMerged openstack/nova master: nova-net: Migrate 'test_floating_ips' functional tests  https://review.opendev.org/68434423:47
*** ociuhandu has quit IRC23:48
