*** erlon has joined #openstack-nova | 00:03 | |
*** slaweq has joined #openstack-nova | 00:13 | |
*** brinzhang has joined #openstack-nova | 00:22 | |
*** Dinesh_Bhor has joined #openstack-nova | 00:31 | |
*** Dinesh_Bhor has quit IRC | 00:31 | |
*** slaweq has quit IRC | 00:45 | |
*** bhagyashris has joined #openstack-nova | 00:55 | |
*** slaweq has joined #openstack-nova | 01:11 | |
*** licanwei has joined #openstack-nova | 01:35 | |
*** Dinesh_Bhor has joined #openstack-nova | 01:37 | |
*** Dinesh_Bhor has quit IRC | 01:37 | |
*** slaweq has quit IRC | 01:44 | |
*** jwongz has joined #openstack-nova | 01:52 | |
openstackgerrit | Joshua Cornutt proposed openstack/nova master: Added [key_pairs]/fingerprint_algorithm options https://review.openstack.org/615460 | 01:54 |
openstackgerrit | Joshua Cornutt proposed openstack/nova master: Added [key_pairs]/fingerprint_algorithm options https://review.openstack.org/615460 | 01:56 |
*** Dinesh_Bhor has joined #openstack-nova | 01:59 | |
*** hongbin has joined #openstack-nova | 02:05 | |
*** mhen has quit IRC | 02:10 | |
*** mhen has joined #openstack-nova | 02:11 | |
*** slaweq has joined #openstack-nova | 02:11 | |
*** erlon has quit IRC | 02:21 | |
*** Dinesh_Bhor has quit IRC | 02:31 | |
*** Dinesh_Bhor has joined #openstack-nova | 02:40 | |
*** slaweq has quit IRC | 02:45 | |
*** mrsoul has quit IRC | 02:50 | |
*** TuanDA has joined #openstack-nova | 02:55 | |
*** Cathyz has joined #openstack-nova | 03:05 | |
*** slaweq has joined #openstack-nova | 03:13 | |
*** licanwei has quit IRC | 03:38 | |
*** slaweq has quit IRC | 03:44 | |
*** Dinesh_Bhor has quit IRC | 03:47 | |
*** Cathyz has quit IRC | 03:48 | |
*** Dinesh_Bhor has joined #openstack-nova | 03:51 | |
*** dave-mccowan has quit IRC | 04:00 | |
*** slaweq has joined #openstack-nova | 04:16 | |
*** udesale has joined #openstack-nova | 04:31 | |
*** janki has joined #openstack-nova | 04:47 | |
*** slaweq has quit IRC | 04:48 | |
*** hongbin has quit IRC | 04:50 | |
*** Nel1x has quit IRC | 04:58 | |
*** kevinbenton has quit IRC | 05:00 | |
*** kevinbenton has joined #openstack-nova | 05:00 | |
*** ratailor has joined #openstack-nova | 05:07 | |
*** slaweq has joined #openstack-nova | 05:16 | |
*** pcaruana has joined #openstack-nova | 05:23 | |
*** Dinesh_Bhor has quit IRC | 05:24 | |
*** Dinesh_Bhor has joined #openstack-nova | 05:30 | |
*** pcaruana has quit IRC | 05:32 | |
*** moshele has joined #openstack-nova | 05:33 | |
*** ivve has joined #openstack-nova | 05:33 | |
*** slaweq has quit IRC | 05:44 | |
*** zul has quit IRC | 05:49 | |
*** slaweq has joined #openstack-nova | 06:11 | |
*** moshele has quit IRC | 06:18 | |
*** tetsuro has joined #openstack-nova | 06:18 | |
*** fghaas has joined #openstack-nova | 06:22 | |
*** pvc has joined #openstack-nova | 06:23 | |
pvc | hi, has anyone successfully integrated cyborg with nova? | 06:23 |
*** fghaas has quit IRC | 06:27 | |
*** hshiina has joined #openstack-nova | 06:33 | |
*** slaweq has quit IRC | 06:39 | |
*** Luzi has joined #openstack-nova | 07:03 | |
*** slaweq has joined #openstack-nova | 07:11 | |
*** dpawlik has joined #openstack-nova | 07:12 | |
*** jaosorior has joined #openstack-nova | 07:13 | |
*** slaweq has quit IRC | 07:16 | |
*** dpawlik has quit IRC | 07:16 | |
*** jaosorior has quit IRC | 07:24 | |
*** jaosorior has joined #openstack-nova | 07:27 | |
*** jangutter has joined #openstack-nova | 07:39 | |
*** dpawlik has joined #openstack-nova | 07:42 | |
*** ccamacho has joined #openstack-nova | 07:44 | |
*** dpawlik has quit IRC | 07:47 | |
*** ircuser-1 has joined #openstack-nova | 07:53 | |
*** Dinesh_Bhor has quit IRC | 07:54 | |
*** hshiina has quit IRC | 07:58 | |
*** pcaruana has joined #openstack-nova | 08:06 | |
*** ralonsoh has joined #openstack-nova | 08:14 | |
*** fghaas has joined #openstack-nova | 08:16 | |
*** dpawlik has joined #openstack-nova | 08:17 | |
*** tssurya has joined #openstack-nova | 08:20 | |
*** helenafm has joined #openstack-nova | 08:31 | |
*** fanzhang has quit IRC | 08:42 | |
*** ratailor_ has joined #openstack-nova | 08:46 | |
*** ratailor has quit IRC | 08:48 | |
*** bauwser is now known as bauzas | 08:56 | |
bauzas | good morning stackers | 08:56 |
*** jpena|off is now known as jpena | 08:56 | |
*** Dinesh_Bhor has joined #openstack-nova | 09:01 | |
*** ccamacho has quit IRC | 09:04 | |
pvc | bauzas | 09:06 |
pvc | is it possible to integrate cyborg with nova? | 09:06 |
bauzas | it's a long story, but some people are trying to do this | 09:10 |
*** ccamacho has joined #openstack-nova | 09:12 | |
jangutter | it's also unlikely to happen in the short term. | 09:13 |
*** ttsiouts has joined #openstack-nova | 09:20 | |
*** pvc has quit IRC | 09:27 | |
johnthetubaguy | bauzas: just checking, partly for pvc, but I didn't think we had any code in place to integrate cyborg in master (yet)? Apart from maybe PCI pass-through? | 09:28 |
bauzas | johnthetubaguy: AFAIK, cyborg directly modifies nova.conf for using PCI passthru, yes | 09:31 |
*** pcaruana has quit IRC | 09:31 | |
johnthetubaguy | ah, got it, thanks | 09:31 |
bauzas | johnthetubaguy: there was a YVC session | 09:31 |
johnthetubaguy | yeah, I was more checking there wasn't post PTG progress I didn't know about while I was out of it | 09:32 |
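
(For context, the kind of nova.conf PCI passthrough configuration being discussed looks roughly like the sketch below; the vendor/product IDs and alias name are illustrative placeholders, not values taken from cyborg.)

    [pci]
    # Expose matching host PCI devices to guests (IDs are placeholders).
    passthrough_whitelist = {"vendor_id": "8086", "product_id": "10ed"}
    # Name the device class so flavors can request it via the
    # pci_passthrough:alias extra spec.
    alias = {"vendor_id": "8086", "product_id": "10ed", "device_type": "type-VF", "name": "my-accel"}
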
*** pcaruana has joined #openstack-nova | 09:32 | |
*** ttsiouts has quit IRC | 09:36 | |
*** ttsiouts has joined #openstack-nova | 09:38 | |
*** slaweq has joined #openstack-nova | 09:41 | |
*** adrianc has joined #openstack-nova | 09:47 | |
*** derekh has joined #openstack-nova | 09:49 | |
*** TuanDA has quit IRC | 10:03 | |
*** panda has joined #openstack-nova | 10:05 | |
openstackgerrit | Silvan Kaiser proposed openstack/nova master: [WIP] Added Qemu libquobyte Support to the Quobyte Driver https://review.openstack.org/546500 | 10:14 |
*** ccamacho has quit IRC | 10:15 | |
*** ccamacho has joined #openstack-nova | 10:17 | |
*** takashin has joined #openstack-nova | 10:22 | |
*** cdent has joined #openstack-nova | 10:22 | |
*** bhagyashris has quit IRC | 10:25 | |
*** naichuans has joined #openstack-nova | 10:28 | |
*** davidsha has joined #openstack-nova | 10:30 | |
*** Dinesh_Bhor has quit IRC | 10:34 | |
*** dtantsur|afk is now known as dtantsur\ | 10:35 | |
*** dtantsur\ is now known as dtantsur | 10:35 | |
*** lpetrut has joined #openstack-nova | 10:36 | |
*** aarents has joined #openstack-nova | 10:41 | |
lpetrut | Hi, is it acceptable for exceptions to include another exception as message? Asking as exception_to_dict won't handle it properly: https://github.com/openstack/nova/blob/5859741f4d5e08ec15169b9c8d1aae1836442fd2/nova/compute/utils.py#L58-L87 | 10:41 |
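
(A sketch of the failure mode lpetrut describes, not nova's actual code path: when an exception object is used as another exception's message, downstream code that expects a text message misbehaves until the message is coerced to a string, which is what the fix proposed later in this log, https://review.openstack.org/615550, does.)

    # Sketch: an exception whose "message" is another exception object.
    inner = ValueError("disk not found")
    outer = RuntimeError(inner)   # exception instance passed as the message

    message = outer.args[0]       # a ValueError instance, not a str
    assert not isinstance(message, str)

    # The proposed fix: coerce to a string before serializing it into
    # the instance fault dict.
    message = str(message)
    assert message == "disk not found"
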
*** fghaas has left #openstack-nova | 10:43 | |
*** maciejjozefczyk has quit IRC | 10:50 | |
*** maciejjozefczyk has joined #openstack-nova | 10:54 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Use links to placement docs in nova docs https://review.openstack.org/614056 | 11:01 |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Remove Placement API reference https://review.openstack.org/614437 | 11:02 |
*** k_mouza has joined #openstack-nova | 11:03 | |
openstackgerrit | Takashi NATSUME proposed openstack/nova master: Fix best_match() deprecation warning https://review.openstack.org/611204 | 11:06 |
*** aarents has quit IRC | 11:18 | |
*** sean-k-mooney has joined #openstack-nova | 11:27 | |
sean-k-mooney | bauzas: can you review https://review.openstack.org/#/c/610034/ | 11:36 |
*** dpawlik has quit IRC | 11:38 | |
adrianc | sean-k-mooney: Hi, i would like to co-author the following commits as i am now working towards aligning the POC code for libvirt sriov live migration: https://review.openstack.org/#/c/607365/ https://review.openstack.org/#/c/607365/ | 11:41 |
sean-k-mooney | adrianc: hi yes that is fine, feel free to. i have been held up with other work for the last 2 weeks so have not made much progress on this since then | 11:44 |
sean-k-mooney | i'm currently addressing stephen's feedback on the spec but i should get that done today | 11:44 |
adrianc | great, thanks sean-k-mooney | 11:45 |
*** erlon has joined #openstack-nova | 11:52 | |
*** beekneemech has quit IRC | 11:53 | |
*** bnemec has joined #openstack-nova | 11:57 | |
*** dave-mccowan has joined #openstack-nova | 12:04 | |
*** janki has quit IRC | 12:05 | |
*** brinzhang has quit IRC | 12:13 | |
*** udesale has quit IRC | 12:14 | |
*** dpawlik has joined #openstack-nova | 12:16 | |
*** dpawlik has quit IRC | 12:18 | |
*** dpawlik has joined #openstack-nova | 12:18 | |
*** tetsuro has quit IRC | 12:20 | |
*** jpena is now known as jpena|lunch | 12:20 | |
*** dpawlik has quit IRC | 12:23 | |
*** dpawlik has joined #openstack-nova | 12:26 | |
*** dpawlik has quit IRC | 12:27 | |
*** dpawlik has joined #openstack-nova | 12:28 | |
openstackgerrit | Lucian Petrut proposed openstack/nova master: Convert exception messages to strings https://review.openstack.org/615550 | 12:29 |
*** dpawlik_ has joined #openstack-nova | 12:30 | |
*** dpawlik has quit IRC | 12:31 | |
*** k_mouza has quit IRC | 12:31 | |
*** dpawlik_ has quit IRC | 12:32 | |
*** dpawlik has joined #openstack-nova | 12:32 | |
*** jroll has quit IRC | 12:32 | |
*** jroll has joined #openstack-nova | 12:34 | |
*** dpawlik_ has joined #openstack-nova | 12:35 | |
*** dpawlik has quit IRC | 12:37 | |
*** dpawlik_ has quit IRC | 12:42 | |
*** dpawlik has joined #openstack-nova | 12:42 | |
*** ratailor__ has joined #openstack-nova | 12:46 | |
*** aarents has joined #openstack-nova | 12:47 | |
*** ratailor_ has quit IRC | 12:49 | |
*** dtantsur is now known as dtantsur|brb | 12:51 | |
aarents | Hi there, can someone evaluate this issue: https://bugs.launchpad.net/nova/+bug/1801702 | 12:54 |
openstack | Launchpad bug 1801702 in OpenStack Compute (nova) "Spawn may fail when cache=none on block device with logical block size > 512" [Undecided,New] | 12:54 |
bauzas | sean-k-mooney: ack, will look | 13:03 |
*** Sundar has joined #openstack-nova | 13:05 | |
*** ttsiouts has quit IRC | 13:05 | |
Sundar | cdent: Please LMK when you have some time, regarding https://review.openstack.org/#/c/603955/ | 13:06 |
cdent | Sundar: I'm unlikely to have much time to talk synchronously, a lot going on, but if you leave questions and comments on the review I will respond as soon as I can | 13:07 |
cdent | but if it is something quick? | 13:08 |
Sundar | cdent: OK. I was hoping to get and provide clarification on the general comment: "In general it seems like the resources and representations thereof are too closely tied to the specific needs of the API" | 13:08 |
Sundar | I will respond in the review itself, as best as I can | 13:09 |
Sundar | Thanks | 13:09 |
cdent | thank you | 13:09 |
*** zul has joined #openstack-nova | 13:16 | |
*** tetsuro has joined #openstack-nova | 13:19 | |
*** k_mouza has joined #openstack-nova | 13:25 | |
*** jpena|lunch is now known as jpena | 13:25 | |
bauzas | sean-k-mooney: +2d with a few nits | 13:28 |
sean-k-mooney | bauzas: thanks ill check them out | 13:28 |
*** davidsha has quit IRC | 13:31 | |
*** jistr is now known as jistr|call | 13:32 | |
*** ratailor__ has quit IRC | 13:34 | |
*** panda is now known as panda|bbl | 13:36 | |
*** ttsiouts has joined #openstack-nova | 13:43 | |
tssurya | melwitt: is there any reason why we calculate the quota for instances/ram/cores per user per project ? (https://review.openstack.org/#/c/569055/2/nova/quota.py@1342) like, do we practically use the per-user counts over the project ones anywhere ? | 13:46 |
sean-k-mooney | tssurya: i believe we used to use the per user quotas in the past, i assumed we still do | 13:47 |
tssurya | sean-k-mooney: we now do per project quotas right ? I am guessing only for keypairs we do per user ? | 13:47 |
sean-k-mooney | tssurya: i was under the impression we had both | 13:48 |
tssurya | ah okay | 13:48 |
tssurya | let me check the documentation | 13:48 |
sean-k-mooney | tssurya: personally i would consider it a fairly big regression if we drop the user quotas | 13:48 |
sean-k-mooney | it was not that uncommon for people to deploy one project per department/team in private clouds and then have a user-level quota too | 13:49 |
tssurya | sean-k-mooney: hmm correct, looks like we have user quotas also heh | 13:49 |
sean-k-mooney | so the test team project can have 100 instances but each tester can only have 10 personally | 13:50 |
tssurya | okay sounds fair enough that restricting resources on users also is valid | 13:50 |
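
(A sketch of the two checks being described; the function and argument names are illustrative, not nova's actual quota driver API. A request must fit under both the project-wide limit and the per-user-within-project limit.)

    def quota_allows(project_usage, user_usage, requested,
                     project_limit=100, per_user_limit=10):
        """True if the request fits both the project and per-user limits.

        project_usage: instances counted across the whole project
        user_usage: instances counted for this user within this project
        """
        return (project_usage + requested <= project_limit and
                user_usage + requested <= per_user_limit)

    # The test-team example above: the project may hold 100 instances,
    # but each tester personally only 10.
    assert quota_allows(project_usage=50, user_usage=9, requested=1)
    assert not quota_allows(project_usage=50, user_usage=10, requested=1)
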
sean-k-mooney | i dont know what the plans for this were with unified limits however | 13:50 |
sean-k-mooney | best to ask melwitt et al for comment | 13:50 |
tssurya | thanks sean-k-mooney, yea I don't know about the unified limits stuff although there is a spec being worked on by John | 13:51 |
* johnthetubaguy waves | 13:51 | |
* tssurya waves back | 13:51 | |
johnthetubaguy | tssurya: currently the operator can specify per user limits, there are no plans to support this post unified limits | 13:52 |
johnthetubaguy | well basically, the plan is hierarchical limits replace that feature | 13:52 |
sean-k-mooney | johnthetubaguy: really | 13:52 |
tssurya | johnthetubaguy: ah thanks, so we are going to drop user quotas altogether ? | 13:52 |
leakypipes | johnthetubaguy: I'm getting close... | 13:53 |
johnthetubaguy | tssurya: that is what I propose in the spec and my patches, yeah | 13:53 |
leakypipes | johnthetubaguy: sigh, these quota unit tests are ugly-as-f... | 13:53 |
sean-k-mooney | johnthetubaguy: so each user would have there own project in that case as a subproject of there teams project | 13:53 |
johnthetubaguy | leakypipes: not been in them yet! | 13:53 |
tssurya | johnthetubaguy: thanks I will go read the spec | 13:53 |
*** leakypipes is now known as jaypipes | 13:53 | |
jaypipes | johnthetubaguy: bring a decanter of scotch. | 13:53 |
jaypipes | you'll need it. | 13:53 |
jaypipes | hopefully I'll have the worst of them cleaned up shortly. | 13:54 |
jaypipes | 9 failures left to fix (after converting them from test.TestCase to test.NoDBTestCase) | 13:54 |
johnthetubaguy | OK, nice | 13:54 |
sean-k-mooney | jaypipes: johnthetubaguy so has there been any pushback on the fact that users with access to multiple projects will no longer have a global limit on the resources that user can use | 13:58 |
sean-k-mooney | i know you can kind of model that with per-user projects nested under a larger project but with only 2 levels of nesting that is not as flexible as what could be done before | 13:59 |
*** janki has joined #openstack-nova | 13:59 | |
johnthetubaguy | sean-k-mooney: so I should be more precise, it was per user within a project, if my memory is correct | 13:59 |
*** lpetrut has quit IRC | 13:59 | |
jaypipes | sean-k-mooney: no, no pushback. | 14:00 |
johnthetubaguy | sean-k-mooney: the only users we know about, prefer and want to move to two level project quotas (they used the user thing as a workaround) | 14:00 |
sean-k-mooney | johnthetubaguy: wait, could you have a per user qoat that was different from the user's global quota for a specific project before | 14:00 |
jaypipes | sean-k-mooney: what johnthetubaguy said. | 14:00 |
johnthetubaguy | yeah, the opposite, we have support for this direction | 14:00 |
jaypipes | sean-k-mooney: what's a "qoat"? Quota Of All Time? :) | 14:01 |
johnthetubaguy | sean-k-mooney: there was no global user quota, it was per user within a specific project. Maybe I answered tssurya's question badly | 14:01 |
sean-k-mooney | oh, then i totally misunderstood how that option worked and always gave my team members more instances than i planned lol | 14:02 |
tssurya | johnthetubaguy: no I think you answered my question correctly. we currently support per user quota within projects, after unified limits we won't right ? | 14:03 |
johnthetubaguy | tssurya: yeah, that's right | 14:03 |
sean-k-mooney | i assumed the user quota applied across all projects so when i gave them access to a second project outside their main one i used the user quota to limit their use of the unlimited project | 14:03 |
tssurya | yea so I guess sean-k-mooney was worried this would be a big bad regression ? | 14:03 |
johnthetubaguy | tssurya: the folks who used that also modified policy so you couldn't delete someone elses instance, so its all a bit strange around those parts | 14:04 |
tssurya | johnthetubaguy: oh yea that is strange if its all in the same project | 14:04 |
sean-k-mooney | tssurya: from what johnthetubaguy and jaypipes explained there is no regression that i can see so all good | 14:04 |
johnthetubaguy | sean-k-mooney: yeah, I thought that too till I reviewed melwitt's patches, turns out it doesn't do that | 14:04 |
sean-k-mooney | johnthetubaguy: well it was a private developer cloud so no harm in my case but ya | 14:05 |
sean-k-mooney | having to adjust both the project and user limits was always a bit of a pain so it sounds like it's an improvement overall | 14:06 |
johnthetubaguy | cool | 14:06 |
tssurya | sean-k-mooney: okay :) | 14:06 |
*** jistr|call is now known as jistr | 14:08 | |
*** jdillaman has quit IRC | 14:09 | |
*** jdillaman has joined #openstack-nova | 14:09 | |
*** mriedem has joined #openstack-nova | 14:11 | |
*** davidsha has joined #openstack-nova | 14:14 | |
*** panda|bbl is now known as panda | 14:15 | |
*** dtantsur|brb is now known as dtantsur | 14:19 | |
*** fried_rice is now known as efried | 14:25 | |
*** k_mouza has quit IRC | 14:25 | |
*** k_mouza has joined #openstack-nova | 14:25 | |
*** SteelyDan is now known as dansmith | 14:37 | |
*** mlavalle has joined #openstack-nova | 14:42 | |
efried | tssurya: I saw Belmiro respond on the ML; does this mean y'all have tried the patch in your env somewhere? | 14:44 |
tssurya | efried: well no, all we do is increase the config's value to a very high number, ever since we added the config in queens | 14:45 |
tssurya | so if you allow setting the config to "0" | 14:45 |
tssurya | that will disable it, and that's good for us | 14:45 |
efried | tssurya: The patch actually goes quite a bit further than that. | 14:45 |
efried | I believe with the refresh interval very high, you're still getting the inventory polls. | 14:46 |
efried | Now you won't even get those anymore. | 14:46 |
openstackgerrit | John Garbutt proposed openstack/nova-specs master: Add Unified Limits Spec https://review.openstack.org/602201 | 14:46 |
efried | ...I think. | 14:46 |
*** mvkr has quit IRC | 14:47 | |
tssurya | efried: oh just read the commit (message) | 14:47 |
*** awaugama has joined #openstack-nova | 14:47 | |
tssurya | yea we don't currently have any of those yet downstream (meaning our code is the same as upstream at the tracker level) | 14:47 |
tssurya | but it would be cool to optimize the calls to placement to the maximum | 14:48 |
sean-k-mooney | efried: by the way if you have time can you review https://review.openstack.org/#/c/610034/ | 14:49 |
tssurya | because in rocky you have more stuff than in queens right ? the way the updates are done with pulling the provider tree info and everything (meaning the new update_to_placement function)? | 14:49 |
*** hshiina has joined #openstack-nova | 14:49 | |
efried | tssurya: What I think we'll be really looking for in order to have enough confidence to merge this is for deployments such as yours and mnaser's to put this in place and verify, not so much that the number of placement calls drops off a cliff - that should be a given - but that we don't wind up with the RT getting out of sync with placement's view. | 14:49 |
efried | tssurya: I don't believe there's much more in rocky than queens, no. | 14:49 |
*** Sundar has quit IRC | 14:50 | |
*** awaugama has quit IRC | 14:50 | |
tssurya | we are also investigating on our end | 14:50 |
*** ccamacho has quit IRC | 14:50 | |
tssurya | I can see if we can test this patch in our deployment | 14:50 |
tssurya | and give feedback | 14:50 |
efried | One thing about mnaser's comments, which I will also mention in a response on the thread, is that this should not affect *allocations* at all. IIUC, it was allocations getting out of sync that mnaser observed. I will be interested to know if he or you have ever observed anything else (inventories, etc.) getting out of sync. | 14:51 |
efried | Especially you, since you've been running the long-poll in production for a while, right? | 14:51 |
*** awaugama has joined #openstack-nova | 14:51 | |
tssurya | efried: well the long poll used to only switch off traits and aggregates | 14:51 |
efried | tssurya: It would also affect inventory updates. | 14:52 |
efried | oh | 14:52 |
*** ccamacho has joined #openstack-nova | 14:52 | |
efried | yeah, I see what you're saying now. I think you're right. | 14:52 |
efried | and we're not doing much with those in q/r. | 14:52 |
tssurya | yea I am pretty sure it was only the traits and aggregates sync that was off | 14:53 |
tssurya | the inventory sync was like normal for us | 14:53 |
efried | the poll you're stretching would also update inventories, but there was an additional, separate inventory update that was happening outside of that one, so yeah. | 14:53 |
tssurya | ah yea that makes sense | 14:53 |
tssurya | because we didn't have that many out-of-sync inventory issues | 14:54 |
efried | tssurya: The theory behind the patch is that you shouldn't have those issues anyway, even if we *never* poll. | 14:55 |
efried | but | 14:55 |
efried | I'm not sure I hit the code paths that do that separate inventory update | 14:55 |
efried | sad face | 14:55 |
jaypipes | three more unit test failures to address... closing in... | 14:55 |
*** tetsuro has quit IRC | 14:55 | |
tssurya | haha, I am trying to go through your commit now... but | 14:55 |
tssurya | so the idea is an inventory refresh only if something changes, right ? | 14:56 |
efried | tssurya: well, at least in this code path, yes. Basically, if you set the refresh interval to zero, the only time you would get a refresh is if e.g. the virt driver pushes a change via update_provider_tree. | 14:57 |
efried | jaypipes: IYO, should there be a bp and/or spec for this? And should it be multiple patches? | 14:57 |
jaypipes | efried: the cache change stuff? | 14:58 |
efried | y | 14:58 |
jaypipes | efried: I think it would be useful to have a bp for tracking purposes, sure. spec, not so much. | 14:58 |
efried | ight, we'll start there. | 14:58 |
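
(Concretely, the knob under discussion is [compute]/resource_provider_association_refresh; the patch proposed below, https://review.openstack.org/615606, allows zero to disable the periodic refresh entirely.)

    [compute]
    # Seconds between forced refreshes of the cached provider tree
    # (aggregates, traits, associations). With the proposed patch, 0
    # disables the periodic refresh; the cache is then only updated when
    # e.g. the virt driver pushes a change via update_provider_tree.
    resource_provider_association_refresh = 0
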
*** k_mouza has quit IRC | 14:59 | |
*** munimeha1 has joined #openstack-nova | 15:01 | |
*** janki has quit IRC | 15:01 | |
mnaser | tssurya: I can give you a script I wrote to audit placement (I think it’s in a paste somewhere). Interesting to see if you see things get out of sync too | 15:02 |
tssurya | efried: hmm yea makes sense. as for the syncing issues, which I am sure we would hit considering our size, can't we slowly build a heal/sync tool like we have for allocations/aggregates that deployments can run when they want ? | 15:02 |
openstackgerrit | John Garbutt proposed openstack/nova master: WIP: Integrating with unified limits https://review.openstack.org/615180 | 15:03 |
tssurya | like a placement sync tool for plausible things that would go out of sync | 15:03 |
tssurya | mnaser: ah thanks so far we haven't like disabled any major sync updates | 15:04 |
*** k_mouza has joined #openstack-nova | 15:04 | |
*** takashin has left #openstack-nova | 15:04 | |
tssurya | but we surely do have out of sync issues | 15:04 |
efried | tssurya: I think healing allocations is a separate issue. If we can show that inventories/traits/aggregates don't get out of sync when we don't refresh them, that's goodness. And then separately, if we can write something to heal allocations - or better yet, to identify why they're getting out of sync in the first place and close that gap - also goodness. | 15:04 |
mnaser | tssurya: this is for allocations being out of sync tho, so it’d be interesting if this is something we have broken I guess | 15:04 |
*** jistr is now known as jistr|call | 15:04 | |
efried | mnaser: Do you have any suspicions (or better) about where the allocation drift is happening? Like, is it on instances that fail a migration or similar? Resizes? Evacuations? Or (eek) just steady state? | 15:05 |
openstackgerrit | Merged openstack/nova master: Minimal construct plumbing for nova show when a cell is down https://review.openstack.org/591658 | 15:06 |
mnaser | I think it might be around live migrations, I suspect that if live migrations fail on a machine that somehow already has issues talking to placement then it won’t be able to revert the allocation or whatever | 15:06 |
mnaser | Do the compute nodes make all the claims during a resize or live migration? | 15:07 |
mnaser | I.e is it possible scheduler does something that compute cannot revert because of an intermittent issue | 15:07 |
tssurya | efried, mnaser: there is no way we don't have out-of-sync issues; I am literally already working on a consistency tool, but I can confirm more after looking further at the statistics regarding allocations/inventories - about how much those are out of sync | 15:07 |
efried | mnaser: certainly, although I thought we had (a bunch of really ugly) code to clean up that mess. | 15:07 |
sean-k-mooney | mnaser: i was under the impression the cpu, ram and disk were still claimed in the scheduler on migration | 15:08 |
sean-k-mooney | but pci devices would be claimed by the compute nodes | 15:08 |
sean-k-mooney | well that is a bad example | 15:08 |
efried | mnaser: I can't think of a way inventories would get out of sync unless a third party is mucking with them (e.g. CLI). | 15:08 |
efried | mnaser: But failed migrations of various types - absolutely. | 15:08 |
efried | ^^ for allocations | 15:09 |
mnaser | tssurya: let me share a tool that does exactly that! | 15:09 |
mnaser | Yeah I don’t think I ever had issue with inventories | 15:09 |
tssurya | mnaser: that would be great then I can get some statistics | 15:09 |
efried | mnaser: But as we were discussing above, even if you max out the refresh interval, we are still polling (and "healing") the inventories every periodic. | 15:09 |
efried | so if the inventories did drift, we would have fixed them. | 15:10 |
tssurya | efried: true | 15:10 |
efried | So there's really no way to know if drift issues exist there. | 15:10 |
mnaser | yeah but i don't think those would drift in our use cases | 15:10 |
efried | ...until I kill that. | 15:10 |
mnaser | even if they didn't sync | 15:10 |
mnaser | :P | 15:10 |
*** jistr|call is now known as jistr | 15:10 | |
efried | I think I have a pretty major rework to do on this patch now that I'm thinking in terms of the inventory refreshes... | 15:10 |
mriedem | tssurya: https://bugs.launchpad.net/nova/+bug/1793569 | 15:11 |
openstack | Launchpad bug 1793569 in OpenStack Compute (nova) "Add placement audit commands" [Wishlist,Confirmed] | 15:11 |
tssurya | mriedem: thanks | 15:11 |
mnaser | http://paste.openstack.org/show/734146/ | 15:12 |
mnaser | easier to digest paste because of launchpad's wrapping | 15:12 |
mnaser | tssurya: ^ | 15:12 |
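
(mnaser's paste is not reproduced here, but an audit of this kind boils down to comparing placement's allocations against the consumers nova knows about. A rough sketch, assuming an authenticated keystoneauth1 Session and the standard placement REST endpoints; not mnaser's actual script.)

    def find_orphan_allocations(session, known_consumers):
        """Return (resource_provider, consumer) pairs nova doesn't know about.

        known_consumers: set of instance and migration UUIDs that nova
        believes should currently hold allocations.
        """
        kw = {'endpoint_filter': {'service_type': 'placement'},
              'headers': {'OpenStack-API-Version': 'placement 1.17'}}
        rps = session.get('/resource_providers', **kw).json()['resource_providers']
        orphans = []
        for rp in rps:
            allocs = session.get('/resource_providers/%s/allocations'
                                 % rp['uuid'], **kw).json()['allocations']
            for consumer_uuid in allocs:
                if consumer_uuid not in known_consumers:
                    orphans.append((rp['uuid'], consumer_uuid))
        return orphans
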
openstackgerrit | John Garbutt proposed openstack/nova-specs master: Add Unified Limits Spec https://review.openstack.org/602201 | 15:13 |
*** jiaopengju has quit IRC | 15:14 | |
*** jiaopengju has joined #openstack-nova | 15:14 | |
*** mvkr has joined #openstack-nova | 15:15 | |
*** k_mouza has quit IRC | 15:15 | |
tssurya | mnaser: thanks :) | 15:18 |
mriedem | efried: fwiw, i would at least split out the change to disable the refresh interval (config value of 0) | 15:18 |
mriedem | since that's pretty straight forward i imagine | 15:18 |
efried | mriedem: ack, thx | 15:18 |
*** cfriesen has joined #openstack-nova | 15:21 | |
*** d34dh0r53 has quit IRC | 15:22 | |
*** ttsiouts has quit IRC | 15:22 | |
*** d34dh0r53 has joined #openstack-nova | 15:23 | |
*** ttsiouts has joined #openstack-nova | 15:25 | |
*** lyarpwood is now known as lyarwood | 15:26 | |
*** Luzi has quit IRC | 15:32 | |
*** tbachman has joined #openstack-nova | 15:32 | |
*** ttsiouts has quit IRC | 15:33 | |
openstackgerrit | Maciej Jozefczyk proposed openstack/nova master: Force refresh instance info_cache during heal https://review.openstack.org/591607 | 15:50 |
openstackgerrit | Maciej Jozefczyk proposed openstack/nova master: Add fill_virtual_interface_list online_data_migration script https://review.openstack.org/614167 | 15:50 |
*** ttsiouts has joined #openstack-nova | 15:54 | |
openstackgerrit | Maciej Jozefczyk proposed openstack/nova master: Add fill_virtual_interface_list online_data_migration script https://review.openstack.org/614167 | 15:55 |
*** dklyle has joined #openstack-nova | 16:02 | |
*** dpawlik has quit IRC | 16:05 | |
*** dpawlik_ has joined #openstack-nova | 16:05 | |
*** artom has quit IRC | 16:06 | |
*** hshiina has quit IRC | 16:06 | |
*** pcaruana has quit IRC | 16:06 | |
*** ccamacho has quit IRC | 16:08 | |
openstackgerrit | Eric Fried proposed openstack/nova master: DNM: Trust the report client cache more https://review.openstack.org/614886 | 16:11 |
*** dpawlik_ has quit IRC | 16:12 | |
*** dpawlik has joined #openstack-nova | 16:12 | |
*** ttsiouts has quit IRC | 16:14 | |
*** ttsiouts has joined #openstack-nova | 16:14 | |
*** maciejjozefczyk has quit IRC | 16:16 | |
*** imacdonn has quit IRC | 16:16 | |
*** imacdonn has joined #openstack-nova | 16:17 | |
*** dpawlik has quit IRC | 16:17 | |
*** ttsiouts has quit IRC | 16:19 | |
*** dpawlik has joined #openstack-nova | 16:28 | |
*** k_mouza has joined #openstack-nova | 16:30 | |
*** dpawlik has quit IRC | 16:32 | |
efried | stephenfin: Does :oslo.config:option: not work in renos yet? | 16:33 |
stephenfin | efried: https://github.com/openstack/nova/blob/master/doc/source/conf.py vs. https://github.com/openstack/nova/blob/master/releasenotes/source/conf.py | 16:33 |
stephenfin | efried: It's a different build. You'd need to configure it for reno | 16:34 |
stephenfin | But I strongly advise against doing so | 16:34 |
efried | stephenfin: ack. I think I don't care that much. | 16:34 |
efried | why would you advise against it? | 16:34 |
stephenfin | If we removed/renamed the option in the future, we'd break the build | 16:34 |
efried | as we should | 16:35 |
efried | oh | 16:35 |
efried | the reno should be able to stay forever | 16:35 |
efried | got it. | 16:35 |
stephenfin | Yup | 16:35 |
efried | butbutbut | 16:35 |
sean-k-mooney | i was about to say we can't fix renos after the fact right | 16:35 |
efried | shouldn't it therefore point to the respective version of the config.html? | 16:35 |
sean-k-mooney | or at least it's non-trivial to change them after a release is done | 16:35 |
efried | sean-k-mooney: You asked me to review https://review.openstack.org/#/c/610034/ -- I'm out of my depth on that one, I'm afraid. | 16:36 |
sean-k-mooney | efried: it would if you had the commit checked out | 16:36 |
stephenfin | sean-k-mooney: We couldn't in the past but we can now (reno fixed that). That said, why make life hard for yourself? | 16:36 |
stephenfin | efried: I guess it should, yes | 16:36 |
sean-k-mooney | efried: ok no worries | 16:36 |
stephenfin | But that ship has long since sailed. All our doc links in renos point to latest | 16:37 |
efried | sean-k-mooney: You spelled "outer" wrong! -2! | 16:37 |
efried | that's about the best I can do on that one. | 16:38 |
sean-k-mooney | where in the patch | 16:38 |
efried | https://review.openstack.org/#/c/610034/10/nova/utils.py@1322 | 16:38 |
sean-k-mooney | ... | 16:38 |
sean-k-mooney | yes i did ... | 16:39 |
sean-k-mooney | that means it's incorrect in the placement version too, which is a pain. | 16:40 |
sean-k-mooney | i'll respin, hopefully for the last time | 16:40 |
sean-k-mooney | efried: thanks for finding it, it would have been even more of a pain when backporting this | 16:41 |
efried | sean-k-mooney: Well, it's really not a big deal, especially since that method isn't exposed anywhere outside of the decorator. But sure, if you're going to respin; I imagine the other non-blocking nits are worse than that :) | 16:42 |
sean-k-mooney | well i wasn't, but i can if needed | 16:42 |
sean-k-mooney | this has been part of a downstream ci blocking bug for almost a month so i would like to finally get it fixed | 16:43 |
*** ttsiouts has joined #openstack-nova | 16:47 | |
openstackgerrit | Lance Bragstad proposed openstack/nova master: WIP: experiment with oslo.limit interface https://review.openstack.org/615602 | 16:48 |
lbragstad | johnthetubaguy i took a wild crack at bringing the oslo.limit logic closer to the actual enforcement point, based on the commit you have ^ | 16:48 |
*** openstackgerrit has quit IRC | 16:48 | |
johnthetubaguy | lbragstad: ah, interesting, will have a look | 16:49 |
lbragstad | i guess i wanted to try and see if it was possible to make oslo.limit simple enough to not require a nova.limit module for dynamic limits | 16:49 |
lbragstad | but - what you have there for static limits totally makes sense | 16:49 |
lbragstad | i was just thinking that the original direction of the context manager might be hard to take advantage of if we go from api code -> limit module -> oslo limit | 16:50 |
lbragstad | since oslo.limit and the Enforcer context manager was written to be as close to the actual code consuming resources as possible | 16:50 |
johnthetubaguy | the problem is the retry check is always in another process at the moment | 16:51 |
johnthetubaguy | so not sure where the context manager will help, I am hoping jaypipes's patch will be different though | 16:51 |
lbragstad | dumb question, but what do you mean by the retry logic being in a different process? | 16:52 |
lbragstad | i was under the assumption that retry logic was in the same area of code that the resource consumption happens | 16:52 |
sean-k-mooney | what are people's feelings about adding more debug logs in the scheduler | 16:53 |
sean-k-mooney | or rather in the numa_topology_filter to be specific | 16:53 |
sean-k-mooney | i'm trying to debug a no valid host from the scheduler logs and there is nothing to go on to determine why the numa topology filter decided it was invalid | 16:54 |
johnthetubaguy | lbragstad: it isn't at the moment sadly | 16:54 |
lbragstad | is the retry logic the thing that protects against race conditions between clients/ | 16:54 |
lbragstad | or is that something else? | 16:55 |
jaypipes | sean-k-mooney: if a deployer is using the NUMATopologyFilter, they don't care about quickness of the scheduler. I say go for it. | 16:55 |
johnthetubaguy | sean-k-mooney: I prefer debug logs only on the reject path, if possible, but what jaypipes said too | 16:55 |
johnthetubaguy | lbragstad: it is that thing, let me link to an example | 16:55 |
sean-k-mooney | ya i wanted to print the host topology and requested guest topology as a debug message only on failure | 16:56 |
johnthetubaguy | lbragstad: this is the recheck for build requests: https://review.openstack.org/#/c/615180/5/nova/conductor/manager.py | 16:56 |
sean-k-mooney | at the moment we just say it did not fit | 16:57 |
sean-k-mooney | that could be a little noisy however... | 16:57 |
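
(A sketch of what a reject-path-only debug log in NUMATopologyFilter might look like; variable names approximate the real filter code rather than being a verbatim patch.)

    # Inside host_passes(), after attempting the fit:
    instance_topology = hardware.numa_fit_instance_to_host(
        host_topology, requested_topology, limits=limits,
        pci_requests=pci_requests, pci_stats=host_state.pci_stats)
    if instance_topology is None:
        # Log details only on failure, keeping the success path quiet.
        LOG.debug("%(host)s fails NUMA topology requirements. "
                  "Requested: %(want)s; host topology: %(have)s",
                  {'host': host_state.host,
                   'want': requested_topology,
                   'have': host_topology})
        return False
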
johnthetubaguy | lbragstad: the first check is in the API process, here: https://github.com/openstack/nova/blob/8d089111c8554e94e117ada3a7f51a42df59e84f/nova/compute/api.py#L868 | 16:57 |
lbragstad | ahhh | 16:57 |
johnthetubaguy | lbragstad: the recheck is in the conductor, after calling the scheduler | 16:57 |
lbragstad | interesting... so that's not API code | 16:57 |
lbragstad | i see what you mean | 16:57 |
melwitt | sean-k-mooney, tssurya: looks like this already got explained but I can also pile on and say that per user quotas are per project only. so the two-level hierarchy in unified limits in keystone gets you the same functionality from a nested quota standpoint | 16:58 |
sean-k-mooney | melwitt: ya that was news to me that user quotas are per project | 16:58 |
melwitt | gotcha | 16:58 |
johnthetubaguy | lbragstad: sadly the simplest examples of the recheck basically don't get moved to unified limits | 16:59 |
lbragstad | because it's not in nova-api? | 16:59 |
johnthetubaguy | lbragstad: well this is one that could move to unified limits with a context manager, but it doesn't really make sense for other reasons: https://review.openstack.org/#/c/615180/5/nova/api/openstack/compute/server_groups.py | 17:00 |
stephenfin | jaypipes: Out of curiosity, did you ever post your slides for "Scheduler Wars: A New Hope" anywhere? | 17:00 |
lbragstad | because it's not really a consumable resource, right? | 17:00 |
johnthetubaguy | lbragstad: yeah | 17:00 |
lbragstad | limiting server groups is more of a rate limiting thing | 17:00 |
johnthetubaguy | yeah, its a db bloat protection | 17:00 |
*** helenafm has quit IRC | 17:01 | |
lbragstad | got it | 17:01 |
*** openstackgerrit has joined #openstack-nova | 17:02 | |
openstackgerrit | Eric Fried proposed openstack/nova master: Allow resource_provider_association_refresh=0 https://review.openstack.org/615606 | 17:02 |
johnthetubaguy | lbragstad: just to confuse things, I actually propose we remove the rechecks for these rate-limit like things in the spec, since the check is being demoted let's tidy the code up a bit more if we can | 17:02 |
*** gyee has joined #openstack-nova | 17:02 | |
lbragstad | based on my super vague understanding of all this, that seems reasonable | 17:02 |
johnthetubaguy | lbragstad: I am planning on keeping the recheck for everything in unified limits, but right now one part is in API the other is in the conductor process | 17:02 |
melwitt | johnthetubaguy: is the conductor recheck the only problem area for the oslo.limit verify? because I've had a TODO in my head to move the recheck back to nova-api | 17:03 |
lbragstad | oh, nice... | 17:03 |
johnthetubaguy | melwitt: not sure it's too much of a problem really, but that is the only recheck that is relevant in the end. Isn't the issue that we need to write into the correct DB before we recheck? | 17:04 |
lbragstad | for historical context, what was the reason for moving the recheck logic to conductor? Just to have it closer to the database? | 17:06 |
melwitt | johnthetubaguy: it is, but related to the behavior change we have where multi-create causes potentially a lot of instances to fall into ERROR state if the recheck fails, I had been thinking we should instead count build requests + instances for the recheck and do that in nova-api instead of doing the conductor thing | 17:06 |
melwitt | lbragstad: because that's where the instance record is created, yes | 17:06 |
lbragstad | got it | 17:06 |
johnthetubaguy | lbragstad: would it not be OK just to make the context manager optional, because when verify=false the context manager is really strange? | 17:07 |
melwitt | I can propose a WIP patch today to show what I mean. I just hadn't gotten around to trying it out | 17:07 |
johnthetubaguy | melwitt: oh good thinking... build requests | 17:07 |
lbragstad | yeah - i guess you're right | 17:07 |
lbragstad | originally, we were thinking it would make adoption easier, but in this case it might not... | 17:07 |
lbragstad | we just didn't want to require developers to have to remember to put specific ordering in their API code (e.g., don't forget to recheck *here*) | 17:08 |
lbragstad | seemed like an interesting opportunity to encapsulate some of that behind the __exit__ of a context manager | 17:09 |
johnthetubaguy | lbragstad: if we make the recheck compulsory, I think it works (like oslo.limit has the configuration about if you recheck or not) | 17:10 |
melwitt | yeah, I suppose other people might run into a similar issue if they've got a resource create happening in a separate service/process as well. I'm sorry I missed this case at previous oslo.limit sessions :\ | 17:10 |
lbragstad | well - i think this is just one of those things where you don't really see it fall apart until you're knee-deep in the code | 17:11 |
johnthetubaguy | yeah, I have similar regrets, that PoC patch has really made me think again on a few things | 17:11 |
lbragstad | which is good, who knows, maybe the context manager is just an over-optimization at this point | 17:12 |
*** ivve has quit IRC | 17:12 | |
melwitt | I think it's definitely a nice option where it fits | 17:12 |
johnthetubaguy | yeah, I like the option of having it | 17:12 |
lbragstad | what if we (oslo.limit) give nova a public API for enforce and recheck (which you can call optionally)? | 17:12 |
jaypipes | stephenfin: yeah, they're on google drive... one sec. | 17:12 |
lbragstad | and forget the context manager for now | 17:13 |
lbragstad | in the future, we could expose a context manager that consumes enforce() on __enter__ and recheck() or reverify() on __exit__() for compatibility | 17:13 |
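
(A sketch of that shape with hypothetical names; this is not the current oslo.limit API. enforce() and verify() are plain calls usable from separate processes, e.g. nova-api and nova-conductor, and a context manager can be layered on top where both checks happen in one place.)

    import contextlib

    def enforce(project_id, deltas, usage_callback):
        """First check: raise if usage + deltas would exceed the limits."""
        ...

    def verify(project_id, resource_names, usage_callback):
        """Recheck with +0 deltas after the resource has been written."""
        return enforce(project_id, {r: 0 for r in resource_names},
                       usage_callback)

    @contextlib.contextmanager
    def claim(project_id, deltas, usage_callback):
        # enforce() on __enter__, verify() on __exit__.
        enforce(project_id, deltas, usage_callback)
        yield
        verify(project_id, list(deltas), usage_callback)
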
johnthetubaguy | lbragstad: I think that works, although it makes "claim" seem like the wrong word, it's more a resource request | 17:14 |
jaypipes | stephenfin: https://bit.ly/scheduler-wars-a-new-hope and https://bit.ly/scheduler-wars-revenge-of-the-split | 17:14 |
cfriesen | The vTPM spec is up at https://review.openstack.org/#/c/571111/ if any cores feel like taking a look. I believe all comments have been addressed. | 17:15 |
lbragstad | johnthetubaguy yeah - i'm not tied to the terminology, at the time it seemed easy to think of it as "I'm attempting to claim 4 cores on this project" | 17:15 |
stephenfin | jaypipes: Awesome. Thanks very much | 17:15 |
jaypipes | np | 17:15 |
johnthetubaguy | lbragstad: I would be tempted to say include the context manager from the beginning, but it's hard to justify if we have no one to use it in v1 | 17:17 |
lbragstad | do you think it's going to be hard to design enforce() and verify() with the hopes that there will be a context manager in the future? | 17:18 |
johnthetubaguy | lbragstad: that sounds good though, some kind of limit.check call sounds good. Not sure you need a recheck one, depending on the arguments you go for. | 17:19 |
lbragstad | doesn't nova need a verify() function that isn't associated to the context manager? | 17:19 |
lbragstad | that way nova-conductor can call it? | 17:20 |
johnthetubaguy | sorry, I just confused things there | 17:20 |
johnthetubaguy | if we have the recheck configuration moved into oslo.limit we need a verify call that takes a list of resource_names to check, only if configured to | 17:22 |
openstackgerrit | Jay Pipes proposed openstack/nova master: quota: remove QuotaEngine.register_resources() https://review.openstack.org/615613 | 17:22 |
openstackgerrit | Jay Pipes proposed openstack/nova master: quota: remove defaults kwarg in get_project_quotas https://review.openstack.org/615614 | 17:22 |
openstackgerrit | Jay Pipes proposed openstack/nova master: quota: remove get_quota_classes() driver method https://review.openstack.org/615615 | 17:22 |
openstackgerrit | Jay Pipes proposed openstack/nova master: quota: remove Context.quota_class https://review.openstack.org/615616 | 17:22 |
openstackgerrit | Jay Pipes proposed openstack/nova master: quota: remove FakeContext from quota unit tests https://review.openstack.org/615617 | 17:22 |
openstackgerrit | Jay Pipes proposed openstack/nova master: quota: remove _no_class tests https://review.openstack.org/615618 | 17:22 |
openstackgerrit | Jay Pipes proposed openstack/nova master: quota: clean up DbQuotaDriver unit tests https://review.openstack.org/615619 | 17:22 |
johnthetubaguy | so that is what I am waiting to see ^ | 17:22 |
jaypipes | johnthetubaguy: that's mostly just cleanups and test refactoring. still have the new LimitsDriver stuff to push. | 17:23 |
johnthetubaguy | jaypipes: did you see the thing I did here, for the direction I was thinking: https://review.openstack.org/#/c/615180 | 17:23 |
lbragstad | yeah - i was thinking the main reason for not using a context manager was because nova-conductor does the verification step if configured to do so | 17:23 |
jaypipes | johnthetubaguy: the series there is just slowly chipping away at the bloated quota driver interface, attempting to simplify it as much as possible before introducing further changes. | 17:24 |
johnthetubaguy | lbragstad: yeah, it's just right now we use a single call, we pass +0 extra resources in the conductor | 17:24 |
johnthetubaguy | jaypipes: yeah, my thinking was to put a new stack next to the old one, and remove the old stack later, but open to ideas | 17:24 |
johnthetubaguy | (took me all friday to work out that is what I was actually thinking) | 17:25 |
lbragstad | johnthetubaguy right - and that should still be possible if oslo.limit gives you enforce() and verify() functions, right? | 17:27 |
lbragstad | you'd just be calling them from two different places | 17:27 |
johnthetubaguy | lbragstad: yeah totally, I just was wondering if you needed the separation, I think you probably do | 17:27 |
melwitt | yeah, you're just thinking allow another option where the APIs are decoupled right? and then if they are co-located in some part of code, you could use the context manager | 17:28 |
lbragstad | ^ yeah | 17:28 |
melwitt | because enforce == first check and verify == second check with +/- 0 | 17:28 |
johnthetubaguy | yeah, +1 | 17:28 |
lbragstad | the context manager could use the same exact public APIs | 17:28 |
johnthetubaguy | totally agreed with that | 17:28 |
melwitt | that actually sounds like it fits nicely | 17:28 |
johnthetubaguy | +1 | 17:29 |
lbragstad | ok - sounds like we have some things we can improve in oslo.limit then | 17:29 |
johnthetubaguy | I wondered if you could just have a single check API that is used in both sides of the context manager, but possibly not | 17:29 |
johnthetubaguy | i.e. use check in enter and exit | 17:29 |
lbragstad | you mean make enforce() generic enough to handle all cases? | 17:30 |
johnthetubaguy | but I think I prefer oslo.limit deciding if a recheck is done or not | 17:30 |
johnthetubaguy | yeah | 17:30 |
lbragstad | yeah - i think wxy-xiyuan tried to do something like that initially | 17:30 |
* lbragstad doesn't mind having clear separation of purpose | 17:31 | |
jaypipes | johnthetubaguy: did you want a review on the patch above? | 17:31 |
jaypipes | johnthetubaguy: I'm not a fan of mixing usage and limit information in the same function/object | 17:31 |
*** davidsha has quit IRC | 17:32 | |
jaypipes | johnthetubaguy: lemme push up what I've been working on. perhaps it will make a bit more sense. | 17:32 |
johnthetubaguy | jaypipes: from a direction point of view, that would be great | 17:32 |
johnthetubaguy | jaypipes: yeah, that would be cool | 17:32 |
johnthetubaguy | I have to run now, but will catch up later | 17:33 |
johnthetubaguy | lbragstad: yeah, the clear separation is almost certainly a win over a smaller API | 17:33 |
melwitt | not to complicate things but thinking about the APIs, I do wonder how they'll look in the future when other backends are added, like the etcd locking idea as another way to deal with races | 17:37 |
lbragstad | ^ a couple people also had ideas about using etcd as a way to implement more than two levels of hierarchical checking | 17:38 |
melwitt | to be clear, just thinking to keep those in mind to avoid painting ourselves into a corner for the future. but decoupling APIs shouldn't affect that really | 17:40 |
lbragstad | ideally - the usage of etcd would be an implementation detail for oslo.limit to handle | 17:41 |
lbragstad | i would think - i can't really think of a reason why the data coming from nova would change if etcd were being used | 17:42 |
melwitt | I think it would too, just thinking where it fits. it would be in enforce right? and then verify wouldn't be used in that case? | 17:46 |
melwitt | like, is verify() valid depending on backend? | 17:47 |
*** jpena is now known as jpena|off | 17:48 | |
*** dave-mccowan has quit IRC | 17:49 | |
*** ttsiouts has quit IRC | 17:50 | |
*** dave-mccowan has joined #openstack-nova | 17:50 | |
*** ttsiouts has joined #openstack-nova | 17:51 | |
*** dave-mccowan has quit IRC | 17:55 | |
*** ttsiouts has quit IRC | 17:55 | |
efried | bauzas: LazyLoader? | 17:58 |
bauzas | efried: wat? | 18:00 |
efried | bauzas: I'm wondering if there's a reason functools.partial was necessary there. | 18:00 |
bauzas | oh about my concern? | 18:00 |
bauzas | you mean functools.wraps ? | 18:00 |
*** derekh has quit IRC | 18:00 | |
bauzas | .partial is different | 18:01 |
efried | bauzas: This is unrelated to anything | 18:01 |
bauzas | not sure I understand you | 18:01 |
efried | bauzas: I'm running into a unit test snafu due to LazyLoader because I'm trying to access self.compute.reportclient._provider_tree | 18:01 |
efried | LazyLoader is returning _provider_tree as a functools.partial | 18:02 |
efried | because it's only set up to return *callable* attributes from the thing it's shimming. | 18:02 |
efried | So I'm trying to figure out if there's a way for me to do | 18:02 |
efried | if callable: | 18:02 |
efried |     do the thing it does now | 18:02 |
efried | else | 18:02 |
efried |     just return the attribute, not a functools.partial | 18:02 |
efried | but in order to do that, I'm going to have to collapse it | 18:03 |
efried | bauzas: https://review.openstack.org/#/c/104556/9..14/nova/scheduler/client/__init__.py | 18:05 |
bauzas | ah this | 18:05 |
efried | I fully expect you to 100% remember all the reasoning that went on behind this change from four years ago. | 18:06 |
bauzas | efried: .partial() is because we don't want to pass __name | 18:07 |
bauzas | example https://stackoverflow.com/questions/15331726/how-does-the-functools-partial-work-in-python | 18:08 |
efried | bauzas: butbutbut, why wouldn't it work to just do this: | 18:08 |
efried | def __getattr__(self, name): | 18:08 |
efried |     if self.instance is None: | 18:08 |
efried |         self.instance = self.klass(*self.args, **self.kwargs) | 18:08 |
efried |     return getattr(self.instance, name) | 18:08 |
efried | (FWIW, it seems to do what I need) | 18:10 |
bauzas | yup, but then that's not a lazy loader ;) | 18:12 |
efried | bauzas: Say wha? | 18:12 |
efried | bauzas: The only difference in laziness is if a caller pulled a method but didn't call it. | 18:13 |
bauzas | efried: calling __getattr__ in your case will run getattr() of the instance, right? | 18:13 |
efried | yes | 18:13 |
bauzas | so, it's executing synchronously | 18:14 |
*** adrianc has quit IRC | 18:14 | |
bauzas | the problem is when importing the module | 18:14 |
bauzas | if you don't lazy load it, then you're looping | 18:14 |
bauzas | see why I lazy-load it below | 18:15 |
efried | You mean getattr looping on itself? | 18:15 |
bauzas | nope | 18:15 |
bauzas | we will only import the modules when we call select_destinations() | 18:15 |
bauzas | *not when we import nova.scheduler.client* :) | 18:16 |
efried | that... still happens? | 18:16 |
*** Swami has joined #openstack-nova | 18:17 | |
efried | Never mind about why it matters that we wait to import the report client... | 18:17 |
bauzas | efried: see this blogpost https://snarky.ca/lazy-importing-in-python-3-7/ | 18:17 |
bauzas | efried: anyway, test it | 18:19 |
bauzas | efried: when I wrote this, we were having an import loop | 18:19 |
bauzas | hence the lazyload | 18:19 |
efried | bauzas: Test it how? Is there test code somewhere that verifies that the importing is being done lazily? | 18:19 |
bauzas | but now we changed a lot of the client, so maybe it's not needed | 18:19 |
bauzas | efried: you can pdb it, right? | 18:20 |
efried | I would think so. | 18:20 |
bauzas | or you can just remove the lazyloader and test whether we still have the import loop | 18:20 |
bauzas | if you have concerns about this class | 18:20 |
*** ralonsoh has quit IRC | 18:20 | |
bauzas | anyway, 7:20pm here | 18:20 |
bauzas | ++ | 18:20 |
efried | def __init__(self): | 18:20 |
efried |     self.queryclient = LazyLoader(importutils.import_class( | 18:20 |
efried |         'nova.scheduler.client.query.SchedulerQueryClient')) | 18:20 |
efried |     self.reportclient = LazyLoader(importutils.import_class( | 18:20 |
efried |         'nova.scheduler.client.report.SchedulerReportClient')) | 18:20 |
efried | Pretty sure that -^ is actually doing the import right away. | 18:21 |
efried | You're calling LazyLoader with the *result* of running importutils.import_class(), which is the imported class. | 18:21 |
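
(For comparison, a version that actually defers the import: pass the dotted path in and only call importutils.import_class on first attribute access. A sketch assuming deferred module loading is the goal, not the code nova ended up with.)

    from oslo_utils import importutils

    class LazyLoader(object):
        """Defer both the import and the instantiation until first use."""

        def __init__(self, import_path, *args, **kwargs):
            self.import_path = import_path  # dotted path, not a class object
            self.args = args
            self.kwargs = kwargs
            self.instance = None

        def __getattr__(self, name):
            # Only reached when normal attribute lookup fails, i.e. for
            # attributes of the wrapped object.
            if self.instance is None:
                # The import happens here, on first use, so there is no
                # import loop at module load time.
                klass = importutils.import_class(self.import_path)
                self.instance = klass(*self.args, **self.kwargs)
            return getattr(self.instance, name)
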
lbragstad | melwitt that's a good question, verify is going to need the current usage and limit information, which means it might need to use etcd | 18:24 |
lbragstad | to get that information | 18:24 |
cdent | efried: sure looks that way to me too | 18:24 |
cdent | the reason it avoids the import loop is simply because it is called "later" | 18:24 |
cdent | but it isn't lazy | 18:24 |
efried | yuh. I guess I need to pull this part out into a separate patch, sigh. | 18:24 |
melwitt | lbragstad: I was wondering, is verify() even useful though if we are distributed locking on enforce + create? I dunno, just unhelpfully thinking aloud :P | 18:26 |
lbragstad | oh... i missed the distributed lock part | 18:26 |
lbragstad | using etcd as a distributed lock would prevent the need for verify(), then? | 18:27 |
melwitt | that's what I was thinking. this came up at... was it the forum in vancouver? someone asked why do the recheck thing to avoid races, why not use distributed locking instead, and people had some ideas about using etcd as part of distributed locking. and then we were thinking that could be just one of many potential approaches that oslo.limit could provide underneath | 18:30 |
*** jmlowe has quit IRC | 18:30 | |
melwitt | *to handle races | 18:30 |
lbragstad | oh - sure | 18:31 |
lbragstad | i haven't thought about that specifically, yet | 18:31 |
melwitt | and I was just thinking whether that affects the API we're thinking. maybe it would just make verify() optional in the case of that backend | 18:31 |
efried | cdent, bauzas: oic now. The lazy loader is deferring the *instantiation* of the class until some method is called on it. The import of the module is still happening as soon as you instantiate SchedulerClient. But the arg being passed to the LazyLoader is the class object, not an instance of that class. | 18:31 |
efried | And IMO that doesn't really save you anything over just instantiating the class right away in the SchedulerClient init. | 18:31 |
cdent | right | 18:31 |
cdent | It's not clear why you'd want to wait | 18:32 |
melwitt | lbragstad: yeah. too early to think too much about it but at the same time, throwing it out there in case it would cause a big problem with what we're thinking about for now | 18:32 |
lbragstad | in that same session i want to say we were talking about using etcd as a way to communicate events between nova (oslo.limit) and keystone | 18:32 |
melwitt | ah, gotcha | 18:32 |
lbragstad | because - once you go past the 2nd layer of projects, it becomes harder to make assumptions about the resources in the tree, i think | 18:32 |
dansmith | and if you take the locking approach, | 18:33 |
dansmith | you have to figure out how high to lock | 18:33 |
lbragstad | http://specs.openstack.org/openstack/keystone-specs/specs/keystone/rocky/strict-two-level-enforcement-model.html#limit-and-usage-awareness-across-endpoints | 18:34 |
lbragstad | dansmith yeah - that's another important bit | 18:34 |
melwitt | lbragstad: thanks for the link, I had missed that | 18:35 |
lbragstad | yeah, turns out that was an important bit to include as justification for why we decided to keep things at 2 levels | 18:36 |
*** k_mouza has quit IRC | 18:44 | |
*** jmlowe has joined #openstack-nova | 18:53 | |
*** mvkr has quit IRC | 18:55 | |
*** tssurya has quit IRC | 18:56 | |
*** ivve has joined #openstack-nova | 18:57 | |
*** tekfiabderrahman has joined #openstack-nova | 18:59 | |
*** tekfiabderrahman has quit IRC | 19:00 | |
*** dtantsur is now known as dtantsur|afk | 19:24 | |
*** zul has quit IRC | 19:24 | |
openstackgerrit | Jay Pipes proposed openstack/nova master: quota: rename arguments to clarify they are limits https://review.openstack.org/615633 | 19:30 |
lbragstad | does nova allow users to create multiple instances in multiple projects in one request? | 19:31 |
melwitt | no | 19:31 |
lbragstad | ok - cool, just wanted to double check | 19:31 |
openstackgerrit | Merged openstack/nova master: Minimal construct plumbing for nova service-list when a cell is down https://review.openstack.org/584829 | 19:33 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Update compute API.get() mocks in test_server_metadata https://review.openstack.org/615341 | 19:38 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Update compute API.get() stubs in test_serversV21 https://review.openstack.org/615342 | 19:38 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Update compute API.get() stubs in test_server_actions https://review.openstack.org/615343 | 19:38 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Update compute API.get() stubs for test_*security_groups https://review.openstack.org/615344 | 19:38 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Update compute API.get() stubs for test_disk_config https://review.openstack.org/615345 | 19:38 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Update compute API.get() stubs in test_access_ips https://review.openstack.org/615346 | 19:38 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Drop pre-cellsv2 compat in compute API.get() https://review.openstack.org/615347 | 19:38 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Remove "API Service Version" upgrade check https://review.openstack.org/615348 | 19:38 |
*** tbachman has quit IRC | 20:01 | |
mriedem | i'm pretty sure we don't cleanup allocations held by a migration record if we delete a server while it's in VERIFY_RESIZE status, but my recreate test isn't failing for that, and i don't see where in the code we'd cleanup allocations for migration records when deleting a server | 20:07 |
cdent | mriedem: I told someone internally to make a bug about that recently | 20:09 |
mriedem | it came to mind in the ML thread with mnaser about leaked/incorrect allocations | 20:09 |
mriedem | but can't seem to reproduce it | 20:09 |
cdent | (upstream bug) but guess they didn't. | 20:09 |
cdent | I think they may have convinced themselves it wasn't happening. I'll see if I can find a reference | 20:09 |
mriedem | i can't see how it *can't* be happening | 20:10 |
mriedem | if you delete while in VERIFY_RESIZE status | 20:10 |
openstackgerrit | Eric Fried proposed openstack/nova master: Remove LazyLoad of Scheduler Clients https://review.openstack.org/615641 | 20:10 |
efried | bauzas, johnthetubaguy, cdent: I couldn't find a circular import, but I guess CI will tell ^ | 20:11 |
mnaser | I’m curious if deleting an instance in ERROR state after a failed live migrate or resize doesn't clean up its allocations too | 20:12 |
mriedem | as far as i can tell, for a resize, we only cleanup the allocations held by the migration record when confirming the server (we delete the source node allocations held by the migration) or on revert we swap the allocations held by the migration record on the source node with the instance consumer and drop the allocations held by the instance on the target node | 20:12 |
mriedem | mnaser: we should always cleanup allocations held by at least the instance, even if error state, either on the compute or in the api (if the compute is down) | 20:13 |
mriedem | i'm more worried that we're leaking allocations held by the migration record | 20:13 |
melwitt | yeah, we should be ok on the delete in error state case; the local delete path will take care of the instance allocation | 20:14 |
melwitt | but I agree, I'm not seeing where we take care of the migration related allocations | 20:14 |
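(A rough sketch of the resize allocation bookkeeping mriedem describes above; the reportclient methods here are hypothetical stand-ins, not nova's actual report client API.)

    def confirm_resize_allocations(reportclient, instance, migration):
        # On confirm: the migration consumer holds the source-node
        # allocations. Deleting them frees the source node; the instance
        # keeps its own allocations on the target node.
        reportclient.delete_allocations(consumer_uuid=migration.uuid)

    def revert_resize_allocations(reportclient, instance, migration):
        # On revert: hand the source-node allocations (held by the
        # migration consumer) back to the instance consumer, replacing
        # the instance's target-node allocations, then drop the
        # migration consumer's record.
        source = reportclient.get_allocations(consumer_uuid=migration.uuid)
        reportclient.put_allocations(consumer_uuid=instance.uuid,
                                     allocations=source)
        reportclient.delete_allocations(consumer_uuid=migration.uuid)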
*** maciejjozefczyk has joined #openstack-nova | 20:14 | |
mriedem | i do not see where _rollback_live_migration cleans up allocations held by the migration record | 20:14 |
mnaser | I’ve definitely seen it log the rollback message | 20:15 |
mnaser | in terms of removing destination allocation | 20:15 |
mriedem | oh there it is for rollback | 20:19 |
*** jmlowe has quit IRC | 20:19 | |
mriedem | https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L6796 | 20:20 |
mriedem | where i'd expect migration allocations to get cleaned up is here https://github.com/openstack/nova/blob/master/nova/compute/manager.py#L761 | 20:20 |
mriedem | on server delete i mean | 20:20 |
*** maciejjozefczyk has quit IRC | 20:21 | |
melwitt | yeah. guess VERIFY_RESIZE is the only case where we'd have the situation of having outstanding migration allocations? but you said you are trying to recreate the bug and not seeing outstanding migration allocations when the instance is in VERIFY_RESIZE :\ | 20:23 |
mriedem | correct | 20:24 |
mriedem | i can push the test up | 20:24 |
mriedem | maybe my test is busted | 20:24 |
melwitt | I'm wondering if there's maybe something different about how migrations are handled in a functional test environment (like is anything being faked in a way that covers it up?). but I thought we've been able to demonstrate allocation cleanup bugs with func tests before | 20:26 |
cdent | mriedem: you could do a pure api driven integration test as a child of https://review.openstack.org/#/c/613386/ pretty easy/quick | 20:27 |
openstackgerrit | Lance Bragstad proposed openstack/nova master: WIP: experiment with oslo.limit interface https://review.openstack.org/615602 | 20:27 |
mriedem | does that have 2 nodes? | 20:27 |
mriedem | $.hypervisors.`len`: 1 | 20:28 |
mriedem | nope | 20:28 |
cdent | oh yeah, that | 20:29 |
mriedem | resize to same host will also create migration-based allocations (that's a separate bug) | 20:29 |
mriedem | so it's still probably doable with your thing, but what i've got (in functional tests with python) is easier/faster for me | 20:29 |
cdent | this was in response to melwitt suggesting that there was some chance the functional tests might be doing something odd | 20:29 |
mriedem | the functional tests assert that the source node contains the migration allocations after the resize, | 20:30 |
mriedem | so i think they are ok, | 20:30 |
mriedem | they just aren't asserting the migration allocations are removed after the server is deleted | 20:30 |
lbragstad | melwitt jaypipes johnthetubaguy https://review.openstack.org/#/c/615643/ is a quick stab at the oslo.limit changes we talked about (sans requiring a context manager in the initial implementation) | 20:30 |
mriedem | nor do i see tests that delete the server while it's in VERIFY_RESIZE state | 20:30 |
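(The shape of the missing coverage, sketched in the style of nova's functional tests; the helper names here are approximate, not the real test base class API.)

    def test_delete_server_while_in_verify_resize(self):
        server = self._resize_server_and_stop_at_verify_resize()
        migration_uuid = self._get_migration_uuid_for(server)

        # After the resize, the migration consumer should hold the
        # source node allocations.
        self.assertTrue(self._get_allocations(migration_uuid))

        self._delete_server(server)

        # The missing assertions: neither the instance nor the migration
        # consumer should have allocations left after the delete,
        # otherwise the source node resources are leaked.
        self.assertEqual({}, self._get_allocations(server['id']))
        self.assertEqual({}, self._get_allocations(migration_uuid))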
lbragstad | https://review.openstack.org/#/c/615602/ is a nova patch based on johnthetubaguy's that tries to use the new changes | 20:31 |
melwitt | cool lbragstad | 20:34 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add functional test to delete a server while in VERIFY_RESIZE https://review.openstack.org/615644 | 20:35 |
jaypipes | lbragstad: I'm not really following the oslo.limits part, frankly... | 20:35 |
jaypipes | lbragstad: one of the parts that is incredibly confusing and frustrating about the current nova quota engine is its coupling of limits with usage into the same objects (what it calls a "Quota") | 20:35 |
openstackgerrit | Eric Fried proposed openstack/nova master: SIGHUP n-cpu to refresh provider tree cache https://review.openstack.org/615646 | 20:36 |
jaypipes | lbragstad: I was thinking that oslo.limits would stick to the limits stuff and stay out of the usage bits. | 20:36 |
jaypipes | lbragstad: in short, I don't think oslo.limits needs an Enforcer thing. | 20:36 |
melwitt | jaypipes: I had the same frustration and it didn't occur to me to rename things as you have :P | 20:38 |
melwitt | I've been pedantic on any new code comments or discussions we've had on the difference between limits and usage | 20:38 |
*** jmlowe has joined #openstack-nova | 20:39 | |
melwitt | but, I think with oslo.limit the idea was to try and abstract away as much as possible and let it handle the hierarchical enforcement, and all you do is provide it a callback | 20:41 |
*** awaugama has quit IRC | 20:44 | |
lbragstad | oh - sure... | 20:46 |
lbragstad | i can see where that can get muddy | 20:46 |
lbragstad | i guess the only reason we have usage in the oslo.limits stuff is because we need to calculate enforcement, which checks what users are asking for against what they already have | 20:47 |
lbragstad | and to melwitt's point, the callback is what oslo.limit is relying on for the usage of X in a given project | 20:50 |
melwitt | I think what jaypipes is saying is that it's not technically required for oslo.limit to take care of enforcement, since enforcement could be done in projects in whatever way they want. but I thought oslo.limit was also trying to provide a common and simple interface for every project to use and let the callbacks be the only pieces each project would have to write separately. and otherwise use the common oslo.limit interface | 20:54 |
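(A toy sketch of the callback-driven interface melwitt describes; the oslo.limit API was still being designed at this point, so every name below is illustrative.)

    class Enforcer(object):
        """Illustrative stand-in for the proposed oslo.limit enforcer."""

        def __init__(self, usage_callback, get_limits):
            # usage_callback is the one piece each service writes itself;
            # get_limits stands in for fetching limits (and the project
            # hierarchy) from keystone.
            self.usage_callback = usage_callback
            self.get_limits = get_limits

        def enforce(self, project_id, deltas):
            limits = self.get_limits(project_id, list(deltas))
            usage = self.usage_callback(project_id, list(deltas))
            for resource, delta in deltas.items():
                if usage[resource] + delta > limits[resource]:
                    raise ValueError('over limit for %s' % resource)

    # A service would only supply the callbacks:
    enforcer = Enforcer(
        usage_callback=lambda project_id, names: {'instances': 8},
        get_limits=lambda project_id, names: {'instances': 10})
    enforcer.enforce('some-project-id', deltas={'instances': 2})  # 8+2 <= 10: ok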
mriedem | i think i found what cleans up the migration-based allocation during server delete of a VERIFY_RESIZE server | 20:54 |
mriedem | https://github.com/openstack/nova/blob/master/nova/compute/api.py#L2096 | 20:55 |
melwitt | mriedem: a-ha! nice find. what a sneaky code | 20:56 |
melwitt | never knew resizes were confirmed right before deleting | 20:56 |
melwitt | makes sense though | 20:56 |
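(Paraphrasing the "sneaky code" mriedem links above, with nova-ish names: on delete of a VERIFY_RESIZE server, the API confirms the resize synchronously first, and that confirm is what cleans up the migration's source-node allocations.)

    # Assumes nova's usual module-level imports (objects, vm_states, and
    # a compute_rpcapi instance); paraphrased, not the exact code.
    def _delete(context, instance):
        if instance.vm_state == vm_states.RESIZED:
            # A server in VERIFY_RESIZE has vm_state RESIZED and a
            # migration record in 'finished' status; confirming the
            # resize synchronously deletes the source-node allocations
            # held by the migration before the instance is deleted.
            migration = objects.Migration.get_by_instance_and_status(
                context, instance.uuid, 'finished')
            compute_rpcapi.confirm_resize(
                context, instance, migration, migration.source_compute,
                cast=False)
        # ... the normal delete flow continues from here ...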
jaypipes | lbragstad: I still don't see why oslo.limits needs to "calculate enforcement". | 20:58 |
melwitt | I don't think it "needs to" but if it does, it makes it so projects don't have to implement the same thing separately. they just have to write callbacks | 20:58 |
melwitt | maybe that commonized code is minimal though, which I guess is your point | 20:59 |
lbragstad | yeah - i think we were operating under the assumption that a service would give oslo.limit a project_id + resource type and oslo.limit would return a yes or no based on the hierarchy and usage | 20:59 |
lbragstad | because we didn't want to make services re-implement logic to understand this complex project hierarchy | 21:00 |
lbragstad | but - putting that aside... how do you see nova's interaction with oslo.limit jaypipes? | 21:01 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add functional test to delete a server while in VERIFY_RESIZE https://review.openstack.org/615644 | 21:01 |
jaypipes | lbragstad: I see oslo.limit as basically the client for Keystone's GET /limits API. | 21:03 |
jaypipes | lbragstad: and maybe the definition of some common objects. that's about it. | 21:03 |
lbragstad | ok | 21:04 |
lbragstad | so - to be clear, you expect oslo.limit to still handle the tree of projects from keystone and their respective limits? | 21:04 |
*** erlon has quit IRC | 21:13 | |
*** maciejjozefczyk has joined #openstack-nova | 21:16 | |
*** awaugama has joined #openstack-nova | 21:18 | |
jaypipes | lbragstad: I don't necessarily think so, no... | 21:21 |
*** maciejjozefczyk has quit IRC | 21:21 | |
jaypipes | lbragstad: all I need is a flattened dict of project limit amounts for a set of resource types | 21:21 |
jaypipes | lbragstad: I figured Keystone would do the needful when it came to flattening the returned dict of resource type to limit amount. | 21:22 |
jaypipes | lbragstad: still trying to finish up this code. I hope things will be clearer with the last patch in this series. | 21:22 |
lbragstad | ack | 21:22 |
lbragstad | we kinda do that already - http://specs.openstack.org/openstack/keystone-specs/specs/keystone/rocky/strict-two-level-enforcement-model.html#fetching-project-hierarchy | 21:23 |
lbragstad | which was required for the limit work | 21:23 |
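(For illustration, the flattened shape jaypipes is asking for; resource names and values invented.)

    # One project's effective limits, resource type -> limit amount, with
    # any parent/child hierarchy already resolved on the keystone side:
    limits = {
        'instances': 10,
        'cores': 20,
        'ram_mb': 51200,
    }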
*** itlinux has joined #openstack-nova | 21:23 | |
*** tbachman has joined #openstack-nova | 21:34 | |
*** markmcclain has quit IRC | 21:42 | |
openstackgerrit | Matt Riedemann proposed openstack/nova-specs master: Add support for emulated virtual TPM https://review.openstack.org/571111 | 21:43 |
*** ivve has quit IRC | 21:51 | |
*** k_mouza has joined #openstack-nova | 21:52 | |
cfriesen | mriedem: thanks for that tweak ^ | 21:56 |
*** awaugama has quit IRC | 22:02 | |
openstackgerrit | Eric Fried proposed openstack/nova master: WIP: update_from_provider_tree: fast fail, clear cache https://review.openstack.org/615677 | 22:10 |
mriedem | efried: cdent: jaypipes: have we found a compromise on https://review.openstack.org/#/c/570847/10/nova/rc_fields.py@45 ? | 22:11 |
mriedem | NET_BW_EGR_KILOBIT_PER_SEC and NET_BW_IGR_KILOBIT_PER_SEC ? | 22:12 |
cdent | mriedem: ooph, I had managed to forget about that | 22:14 |
mriedem | i'm rebasing his series so if we are cool with that naming then i'll make the change | 22:14 |
* cdent is rereading | 22:15 | |
*** markmcclain has joined #openstack-nova | 22:17 | |
cdent | did we consider NET_BW_EGRESS_KBPS or is that too ambiguous for bits and bytes? (although historically *bps has always been bits?) | 22:17 |
cdent | mriedem: I don't have a huge opinion on that actual name, just that the symbol and the string in the canonical location be the same thing | 22:18 |
cdent | given the canonical location is going to change soonish *shrug* | 22:18 |
mriedem | gibi mentioned the bits vs bytes thing | 22:19 |
mriedem | "I needed to include the unit of the resource as it is not trivial and as the name needs to be all upper case the bit and byte difference (the capital B in KB means byte officially) in the unit cannot be expressed if abbreviated." | 22:20 |
mriedem | i'll just change to NET_BW_EGR_KILOBIT_PER_SEC and NET_BW_IGR_KILOBIT_PER_SEC so we can shit and get off the pot | 22:21 |
cdent | wfm | 22:22 |
efried | since I think the discussion is all my fault, I might as well weigh back in. | 22:30 |
efried | I don't see why the const has to be the same as the string. | 22:30 |
efried | But jaypipes has a point that you can always alias it if you need to. | 22:30 |
efried | So effit, we can just go back to the original names and suck up literally half the line width Python allows us. | 22:30 |
mriedem | no backsies now | 22:32 |
mriedem | i've already changed it | 22:32 |
efried | idgas | 22:39 |
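(The naming compromise that was just agreed, plus the aliasing efried mentions; a sketch, not the merged rc_fields.py.)

    # Symbol and string kept identical; the unit is spelled out because,
    # per gibi's point above, an all-caps name can't preserve the
    # bit/byte distinction (Kb vs KB) if abbreviated.
    NET_BW_EGR_KILOBIT_PER_SEC = 'NET_BW_EGR_KILOBIT_PER_SEC'
    NET_BW_IGR_KILOBIT_PER_SEC = 'NET_BW_IGR_KILOBIT_PER_SEC'

    # If the long names get unwieldy at call sites, an alias is cheap:
    NET_BW_EGR = NET_BW_EGR_KILOBIT_PER_SEC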
*** munimeha1 has quit IRC | 22:42 | |
*** itlinux has quit IRC | 22:45 | |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add request_spec.RequestGroup versioned object https://review.openstack.org/568840 | 22:47 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add requested_resources field to RequestSpec https://review.openstack.org/567267 | 22:47 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Add bandwidth related standard resource classes https://review.openstack.org/570847 | 22:47 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Transfer port.resource_request to the scheduler https://review.openstack.org/567268 | 22:47 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Reject interface attach with QoS aware port https://review.openstack.org/570078 | 22:47 |
openstackgerrit | Matt Riedemann proposed openstack/nova master: Reject networks with QoS policy https://review.openstack.org/570079 | 22:47 |
*** mvkr has joined #openstack-nova | 22:51 | |
mriedem | UH OH http://logs.openstack.org/47/615347/5/check/tempest-full/bf73d04/controller/logs/screen-n-sch.txt.gz#_Nov_05_20_39_53_554408 | 23:01 |
mriedem | http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22AllocationUpdateFailed%3A%20Failed%20to%20update%20allocations%20for%20consumer%5C%22%20AND%20message%3A%5C%22Error%3A%20another%20process%20changed%20the%20consumer%5C%22%20AND%20message%3A%5C%22after%20the%20report%20client%20read%20the%20consumer%20state%20during%20the%20claim%5C%22%20AND%20tags%3A%5C%22screen-n-sch.txt%5C%22&from=10d | 23:02 |
efried | mriedem: Tell me we have a retry in place for that | 23:08 |
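(The failure in those logs is a consumer generation conflict: the report client read the consumer state, another process changed it, and the subsequent write failed. Whether nova retries here is exactly efried's question; below is only an illustrative shape for such a retry, with hypothetical names.)

    class AllocationUpdateConflict(Exception):
        """Stand-in for a generation-conflict error from placement."""

    def claim_with_retry(reportclient, consumer_uuid, allocations,
                         retries=3):
        # Re-read the consumer generation and re-attempt the write when
        # another process raced us between our read and our PUT.
        for _ in range(retries):
            generation = reportclient.get_consumer_generation(consumer_uuid)
            try:
                reportclient.put_allocations(
                    consumer_uuid, allocations,
                    consumer_generation=generation)
                return True
            except AllocationUpdateConflict:
                continue
        return False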
*** lbragstad has quit IRC | 23:09 | |
*** lbragstad has joined #openstack-nova | 23:10 | |
efried | mriedem: Is the code path for is_new_compute_node guaranteed to run only on startup? | 23:13 |
*** mlavalle has quit IRC | 23:17 | |
*** mriedem has quit IRC | 23:20 | |
openstackgerrit | Eric Fried proposed openstack/nova master: Remove LazyLoad of Scheduler Clients https://review.openstack.org/615641 | 23:56 |
openstackgerrit | Eric Fried proposed openstack/nova master: SIGHUP n-cpu to refresh provider tree cache https://review.openstack.org/615646 | 23:56 |
openstackgerrit | Eric Fried proposed openstack/nova master: WIP: Reduce calls to placement from _ensure https://review.openstack.org/615677 | 23:56 |