opendevreview | melanie witt proposed openstack/nova master: nova-manage: Add 'limits migrate_to_unified_limits' https://review.opendev.org/c/openstack/nova/+/891646 | 00:01 |
opendevreview | melanie witt proposed openstack/nova master: nova-manage: Add 'limits migrate_to_unified_limits' https://review.opendev.org/c/openstack/nova/+/891646 | 01:26 |
opendevreview | melanie witt proposed openstack/nova master: Add documentation for unified limits https://review.opendev.org/c/openstack/nova/+/893127 | 04:33 |
melwitt | bauzas: I looked at the blueprint status etherpad after reading the nova meeting notes, just fyi it's no longer WIP https://review.opendev.org/q/topic:bp%252Funified-limits-nova-tool-and-docs | 04:42 |
opendevreview | melanie witt proposed openstack/nova master: Add documentation for unified limits https://review.opendev.org/c/openstack/nova/+/893127 | 06:44 |
bauzas | melwitt: ack, will then look today :) | 06:50 |
opendevreview | Amit Uniyal proposed openstack/nova master: Reproducer for dangling bdms https://review.opendev.org/c/openstack/nova/+/889947 | 07:51 |
opendevreview | Amit Uniyal proposed openstack/nova master: Delete dangling bdms https://review.opendev.org/c/openstack/nova/+/882284 | 07:51 |
auniyal | bauzas, I have updated the patches, can you please have another look | 08:43 |
bauzas | auniyal: sure | 08:45 |
auniyal | thanks | 08:45 |
auniyal | there are very few zuul jobs run for the reproducer patch - https://review.opendev.org/c/openstack/nova/+/889947/6?tab=change-view-tab-header-zuul-results-summary | 08:46 |
auniyal | like no nova-next | 08:46 |
auniyal | bauzas, is it because the only file added/modified is in the nova/tests directory? | 08:47 |
bauzas | auniyal: indeed | 08:55 |
bauzas | hmmm, looks like we have a venv issue | 08:55 |
bauzas | https://zuul.opendev.org/t/openstack/build/b915c9886c2a420fbc686b2ca4ccece0 | 08:56 |
bauzas | most of the jobs are failing when creating the venv due to a NoneType exception | 08:56 |
bauzas | AttributeError I mean | 08:56 |
bauzas | nvm, this looks like a transient issue https://zuul.opendev.org/t/openstack/builds?job_name=openstack-tox-py39&project=openstack%2Fnova&skip=0 | 09:01 |
opendevreview | melanie witt proposed openstack/nova master: Add documentation for unified limits https://review.opendev.org/c/openstack/nova/+/893127 | 09:21 |
bauzas | auniyal: I'm now +2 on the repro patch, but please read my comment to understand the difference between annotations and default arguments, and why you should *never* use mutables as default args https://review.opendev.org/c/openstack/nova/+/889947/comment/34e10b03_a5d53174/ | 09:40 |
auniyal | bauzas, ack thanks | 09:51 |
auniyal | I knew about setting list() in the signature but missed it, thanks for the comments | 09:52 |
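For context, the pitfall referenced in bauzas's comment is Python's mutable-default-argument trap; a minimal sketch, with hypothetical names rather than the actual nova code:

```python
# Anti-pattern: the default list is created once, when the function is
# defined, and is then shared by every call that omits the argument.
def collect_bad(item, bucket=[]):
    bucket.append(item)
    return bucket

collect_bad('vda')  # ['vda']
collect_bad('vdb')  # ['vda', 'vdb'] -- state leaked between calls

# Preferred: default to None and create the list inside the call.
def collect_good(item, bucket=None):
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket
```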
auniyal | can we +W this https://review.opendev.org/c/openstack/nova/+/881457 | 09:53 |
auniyal | bauzas, regarding this https://review.opendev.org/c/openstack/nova/+/882284/47/nova/compute/manager.py#4246 | 09:55 |
auniyal | so on bdm.destroy(), the bdm object won't get updated, correct? | 09:55 |
auniyal | the bdm will be deleted from the DB | 09:56 |
auniyal | but locally the bdm object will stay the same | 09:56 |
bauzas | auniyal: not if you play with mutables | 09:59 |
bauzas | bdms is a list, right? | 10:00 |
auniyal | yes | 10:00 |
bauzas | auniyal: https://paste.opendev.org/show/bBaeGvpA8Bx5sSF3wKgZ/ | 10:04 |
bauzas | in my example, tamper_bdms is altering a mutable object, which is a list | 10:05 |
bauzas | that's semantically identical to your _delete_dangling_bdms() method | 10:05 |
bauzas | (provided you alter it) | 10:05 |
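The pastebin itself isn't reproduced here, but the point bauzas is making is that a function can alter a caller's list without returning anything, because Python passes object references; a hypothetical sketch along those lines:

```python
def tamper_bdms(bdms):
    # No return value needed: the caller's list object is mutated
    # directly, since both names refer to the same object.
    bdms.pop()

bdms = ['vda', 'vdb', 'vdc']
tamper_bdms(bdms)
print(bdms)  # ['vda', 'vdb'] -- the caller sees the change
```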
auniyal | bauzas, I might be wrong but this is what I tested - https://paste.opendev.org/show/bnMHd0uJuFYaSWSuP3Gu/ | 10:19 |
auniyal | the first obj was bdms, then I destroyed one of the bdms from it and there was no change in bdms; later, when I retrieved a new bdms list, it did not have the deleted bdm | 10:20 |
bauzas | auniyal: because you haven't deleted it from the local copy | 10:20 |
auniyal | yes, | 10:21 |
bauzas | bdm.destroy() doesn't remove the object from the list | 10:21 |
bauzas | you somehow need to either pop it from the list, or just delete it by hand | 10:21 |
auniyal | yes, that's what I was trying to say; so instead of deleting it there, we can have a fresh copy | 10:21 |
auniyal | will it affect performance? | 10:22 |
bauzas | auniyal: sure, I don't disagree, but it's just doing an extra RPC roundtrip for some calculation we already made | 10:22 |
bauzas | auniyal: if you compare your new bdm list after the cleanup with the old one, you should in theory see that only the destroyed BDMs are missing | 10:23 |
bauzas | auniyal: well, it's an optimization | 10:23 |
bauzas | you know what? I'll switch my vote to +1 and I'll leave others chime in | 10:23 |
auniyal | if it may affect performance, after destroy I can remove the bdm from the local copy and return the same one | 10:28 |
auniyal | I think only one line change | 10:28 |
bauzas | you don't need to return anything, bdms is a mutable | 10:39 |
bauzas | take my pastebin as a reference | 10:39 |
auniyal | yes, just pass bdm, update the list after bdm destroy | 10:39 |
auniyal | bauzas, TypeError: 'BlockDeviceMappingList' object does not support item deletion, and it doesn't have remove either | 11:14 |
auniyal | it has these - https://paste.opendev.org/show/bnDCZUcY00oiKOYIreMs/ | 11:14 |
auniyal | it's inherited from ObjectListBase, which in turn comes from ovoo_base.ObjectListBase | 11:16 |
auniyal | that's the oslo one, is there a way to del a bdms obj with oslo? | 11:17 |
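A small self-contained model of the behaviour auniyal hit (the real BlockDeviceMappingList lives in nova.objects; this stand-in only mimics the relevant parts):

```python
class FakeOVOList:
    """Mimics an oslo.versionedobjects list: entries live in 'objects'."""
    def __init__(self, objects):
        self.objects = objects  # the wrapped plain Python list
    def __iter__(self):
        return iter(self.objects)
    # No __delitem__ and no remove(), hence the TypeError /
    # AttributeError quoted above.

bdms = FakeOVOList(['vda', 'vdb'])
# del bdms[0]        -> TypeError: doesn't support item deletion
# bdms.remove('vda') -> AttributeError: no such method
bdms.objects.remove('vda')  # editing the inner list directly works
print(list(bdms))           # ['vdb']
```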
*** d34dh0r5- is now known as d34dh0r53 | 12:14 | |
opendevreview | Amit Uniyal proposed openstack/nova master: Delete dangling bdms https://review.opendev.org/c/openstack/nova/+/882284 | 12:23 |
auniyal | bauzas, I have updated the patch, can you please have another look | 12:32 |
opendevreview | Amit Uniyal proposed openstack/nova master: Delete dangling bdms https://review.opendev.org/c/openstack/nova/+/882284 | 12:46 |
bauzas | auniyal: sec, finding a quick alternative | 13:43 |
bauzas | sorry, I needed to leave for a sec, back | 14:04 |
bauzas | dansmith: I'm just discussing with auniyal | 14:20 |
dansmith | about removing items from the BDM list? I think he resolved it no? | 14:20 |
bauzas | dansmith: I wonder why oslo.versionedobjects doesn't support directly deleting a list attribute | 14:20 |
bauzas | is it by design or just an omission? | 14:21 |
dansmith | but I definitely agree with you that fetching again to refresh is not good, and I thought I made that point already, but that might have been *another* patch | 14:21 |
bauzas | we have many builtins but not this one | 14:21 |
dansmith | bauzas: I think it works by index, didn't he say? | 14:21 |
bauzas | dansmith: not exactly, see my comment : https://review.opendev.org/c/openstack/nova/+/882284 | 14:21 |
bauzas | shit | 14:21 |
bauzas | wrong url | 14:22 |
bauzas | dansmith: https://review.opendev.org/c/openstack/nova/+/882284/comment/8a1291a9_e0291cdc/ | 14:22 |
bauzas | I just simplified the loop | 14:22 |
sean-k-mooney | generally it's preferable to not use del, by the way | 14:22 |
bauzas | but, the fact is, we can access the index, but we can't directly del it | 14:22 |
sean-k-mooney | I know it works, but it raises if it's not present | 14:23 |
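A minimal illustration of that point, in plain Python:

```python
items = ['a', 'b']
# del items[5]       # would raise IndexError: index out of range
# items.remove('z')  # would raise ValueError: 'z' not in list
# Guard explicitly instead of letting either call raise:
if 'z' in items:
    items.remove('z')
```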
bauzas | so we need to access the somewhat hidden 'objects' internal field of the CoercedList object | 14:23 |
bauzas | sean-k-mooney: I don't disagree on the general case, but in this case, I want to prevent an extra RPC roundtrip for something easy to cleanup | 14:23 |
sean-k-mooney | the bdms is a list of objects field, right | 14:24 |
sean-k-mooney | https://github.com/openstack/oslo.versionedobjects/blob/master/oslo_versionedobjects/fields.py#L1179-L1183 | 14:24 |
dansmith | bauzas: your suggestion is to modify the list while iterating right? that's not good. | 14:24 |
bauzas | dansmith: well, yeah you're right, it's a bad pattern | 14:24 |
sean-k-mooney | we should not need to fetch again | 14:24 |
sean-k-mooney | just recreate the bdms object without the relevant bdms | 14:24 |
bauzas | okay, so I'll remove this comment and just ask auniyal to access the index the other way | 14:25 |
dansmith | the current approach that makes a list to remove and then does bdms.objects.remove() for each seems fine to me | 14:26 |
sean-k-mooney | can't we effectively just do a list comprehension and create a new bdms object by iterating over the old one and filtering out the things we don't want | 14:26 |
dansmith | yep, same same | 14:26 |
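A sketch of the two patterns being compared, using a hypothetical is_dangling() predicate on plain lists (the real code operates on the OVO list via bdms.objects):

```python
def is_dangling(bdm):
    # stand-in for the real "no cinder attachment" check
    return bdm.endswith('!')

bdms = ['vda', 'vdb!', 'vdc!']

# Unsafe: removing entries from the list being iterated skips elements,
# because the iterator's position no longer matches the shrunken list.
# for bdm in bdms:
#     if is_dangling(bdm):
#         bdms.remove(bdm)

# Safe: collect first, then remove -- the approach dansmith describes.
doomed = [bdm for bdm in bdms if is_dangling(bdm)]
for bdm in doomed:
    bdms.remove(bdm)

print(bdms)  # ['vda']
```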
bauzas | dansmith: yeah, I'll remove my comment and just leave one nit: we can iterate the bdms with 'for bdm in bdms' without needing to write 'for bdm in bdms.objects' like he did | 14:27 |
bauzas | there are many ways to do it | 14:27 |
bauzas | this is just classic python filtering over a list | 14:27 |
sean-k-mooney | honestly I would prefer if this function did not modify the passed-in bdms object in place | 14:28 |
sean-k-mooney | and instead just constructed a new one and returned that | 14:28 |
dansmith | well, that's not very good either, | 14:29 |
dansmith | because it would need to re-associate the context | 14:29 |
bauzas | yeah | 14:29 |
sean-k-mooney | sorry, trying to parse what that means | 14:29 |
dansmith | I don't remember if obj_clone() keeps the context, so if it does, then you could do that, but .. the point here is that it is _intending_ to modify the list | 14:29 |
dansmith | and save it | 14:29 |
bauzas | the workflow is: | 14:30 |
bauzas | 1/ call RPC to get the list | 14:30 |
bauzas | 2/ sanitize the list and remove any stale ones | 14:30 |
sean-k-mooney | right but we rarely use the fact that python supports pass by reference | 14:30 |
bauzas | 3/ do the rest of the bdm checking that already existed | 14:30 |
sean-k-mooney | so the fact we are modifying an input arg at all is kind of odd given our normal code style | 14:30 |
bauzas | sean-k-mooney: well, I actually like python because of its mutables | 14:30 |
bauzas | but this is just a code pattern | 14:31 |
dansmith | I don't think it's a rule or even convention that we don't modify the input arguments in these RPC calls | 14:31 |
bauzas | I just wanted originally to prevent an unnecessary extra RPC roundtrip | 14:31 |
bauzas | dansmith: well, this is after the RPC call | 14:31 |
sean-k-mooney | it's not a rule, no, I just find it tends to make the code less intuitive to read | 14:31 |
dansmith | bauzas: yes and that's good | 14:31 |
sean-k-mooney | as you can't tell from the call site whether it modifies the object or not | 14:32 |
dansmith | sean-k-mooney: if we're going to modify it and save it, then modifying the copy we got makes more sense to me anyway, even though nobody else has a pointer to this actual instance other than us :) | 14:32 |
bauzas | n-cpu gets the list, then n-cpu reduces the list | 14:33 |
sean-k-mooney | I guess either works, I just like the pattern of x = do_stuff(x) | 14:33 |
bauzas | then n-cpu can call as many saves as it wants | 14:33 |
sean-k-mooney | honestly either works | 14:33 |
sean-k-mooney | the only time I really care is if a function modifies its inputs and returns something | 14:33 |
sean-k-mooney | I really don't like that pattern, but if it does one or the other it's generally ok | 14:34 |
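The two styles under discussion, again with hypothetical names on plain lists:

```python
def is_dangling(bdm):
    # illustrative predicate only
    return bdm.startswith('stale-')

# Style A: mutate the caller's list in place and return nothing
# (roughly what the patch under review does).
def prune_dangling(bdms):
    for bdm in [b for b in bdms if is_dangling(b)]:
        bdms.remove(bdm)

# Style B: leave the input untouched and return a new list; the caller
# rebinds with the x = do_stuff(x) pattern sean-k-mooney prefers.
def without_dangling(bdms):
    return [b for b in bdms if not is_dangling(b)]

bdms = ['vda', 'stale-vdb']
prune_dangling(bdms)           # bdms is now ['vda']
bdms = without_dangling(bdms)  # a no-op here, but shows the rebind style
```

For the real OVO list, dansmith's caveat below applies: returning a fresh object would mean re-associating the context, which is one reason the in-place style was kept.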
bauzas | sean-k-mooney: yeah and that's because your brain is set on functional programming while I'm more imperative :p | 14:34 |
sean-k-mooney | this only modifies the input | 14:34 |
sean-k-mooney | bauzas: not really, I've just been bitten by bugs where inputs were modified in nova and we were not aware of it | 14:35 |
sean-k-mooney | I'm thinking of the request spec in the scheduler specifically | 14:35 |
bauzas | yeah, I do understand | 14:36 |
bauzas | here, it's a filtering case | 14:36 |
bauzas | but I'm fine with both approaches, we can ask amit to return the sanitized list | 14:36 |
bauzas | if you don't like the implicit approach of tampering with a mutable | 14:37 |
bauzas | (which I do understand is somewhat risky) | 14:37 |
sean-k-mooney | it's not that it's risky, just less common in our code base | 14:37 |
sean-k-mooney | the end result | 14:37 |
sean-k-mooney | we want in either case | 14:38 |
sean-k-mooney | is a single rpc | 14:38 |
bauzas | I just don't want us to push this feature under the FeatureFreeze bus because of a code pattern argument :) | 14:38 |
sean-k-mooney | so as long as we get there it's fine | 14:38 |
bauzas | which is the case in the current latest PS | 14:38 |
bauzas | so, +2 | 14:38 |
sean-k-mooney | ack | 14:38 |
sean-k-mooney | elodilles: by the way the os-vif ci has been pretty green https://zuul.openstack.org/builds?project=openstack%2Fos-vif&skip=0 but we also don't see many executions there either | 14:56 |
sean-k-mooney | sounds like you are concerned about the os-vif-ovs-iptables job specifically, which has had a few failures https://zuul.openstack.org/builds?job_name=os-vif-ovs-iptables&project=openstack%2Fos-vif | 14:57 |
elodilles | sean-k-mooney: yepp, os-vif-ovs-iptables failed 2 times on the generated 'ci health checking' patch, but I saw that it passed recently | 15:07 |
elodilles | I'm not really concerned, but wanted to see that the DNM patch is passing so that I can abandon it o:) | 15:08 |
sean-k-mooney | elodilles: I think these failures were mainly related to volumes | 16:02 |
sean-k-mooney | so ya, we will see what happens with the DNM | 16:03 |
sean-k-mooney | honestly we likely should disable the volume tests in os-vif | 16:03 |
sean-k-mooney | it's also possible that those tests are not configured for verification | 16:04 |
sean-k-mooney | it's not disabled explicitly, however, so I don't think this is the issue | 16:05 |
*** ralonsoh is now known as ralonsoh_ooo | 16:20 | |
bauzas | folks, I'm about to EOB, but I'd appreciate it if some cores around could look in particular at https://review.opendev.org/c/openstack/nova/+/873221 , https://review.opendev.org/q/topic:dangling-volumes+project:openstack/nova+is:open and https://review.opendev.org/q/topic:ironic-shards+project:openstack/nova+is:open , which both have my +2 (mostly, or with just nits) | 16:54 |
bauzas | as a reminder, FF is tomorrow and I'd very much appreciate it if we could somehow give those series a go | 16:54 |
opendevreview | Amit Uniyal proposed openstack/nova master: Delete dangling bdms https://review.opendev.org/c/openstack/nova/+/882284 | 16:59 |
auniyal | bauzas, dansmith, sean-k-mooney can you please have another look at the dangling patches | 17:01 |
dansmith | bauzas: there are so many typos in that bottom ironic shards patch, including doc text and renos, that I think it's worth a respin.. what do you think? | 17:07 |
sean-k-mooney | auniyal: I'm -1 on the reproducer https://review.opendev.org/c/openstack/nova/+/889947/6/nova/tests/functional/compute/test_attached_volumes.py | 18:01 |
sean-k-mooney | bauzas: dansmith: I'm conflicted on what to do with the dangling volume series | 18:08 |
dansmith | I gotta run in a sec, but I don't think it's anything approaching critical | 18:09 |
dansmith | and it doesn't yet address concerns about the cinder volume being detached and reattached outside nova, AFAIK | 18:09 |
sean-k-mooney | right, so I didn't think we wanted to support that | 18:09 |
sean-k-mooney | this would actually enable detach to work | 18:09 |
sean-k-mooney | albeit with a hard reboot | 18:10 |
sean-k-mooney | and needing a service user token | 18:10 |
dansmith | I mean it doesn't catch that case, but should | 18:10 |
sean-k-mooney | so the bit I'm conflicted by is "Verify if volume exists and cinder has an attachment ID else delete its BDMS data from nova DB and vice versa." | 18:11 |
sean-k-mooney | the "and cinder has an attachment ID" part | 18:12 |
dansmith | what about it? | 18:13 |
sean-k-mooney | I was not expecting us to delete the BDM if the volume existed but the attachment was gone | 18:13 |
dansmith | why not? that means it's been detached underneath us | 18:13 |
sean-k-mooney | which we don't allow, so we should repair it? | 18:14 |
dansmith | (/me will go poof in a few minutes without warning) | 18:14 |
sean-k-mooney | or error if you attached it to something else | 18:14 |
dansmith | we should heal ourselves and not require admin intervention | 18:14 |
dansmith | which is what is needed right now | 18:14 |
sean-k-mooney | in the case the volume is deleted, I agree | 18:15 |
sean-k-mooney | I'm uncomfortable with just the case where the attachment is deleted | 18:15 |
sean-k-mooney | perhaps that is valid | 18:15 |
dansmith | why? from our perspective it's the same | 18:15 |
dansmith | the state of the volume has been mucked with underneath us, so bail | 18:16 |
dansmith | don't try to hook it back up or recover in any other way than just avoiding our own problems in the future | 18:16 |
sean-k-mooney | so I would error in the attachment delete case personally, and say no, you screwed with this, you need to manually fix it | 18:16 |
dansmith | that's exactly what we're trying to avoid | 18:16 |
dansmith | because the user can't manually fix it | 18:17 |
dansmith | just like when we go to delete something and it's already deleted, we say "okay fair enough" and move on | 18:17 |
dansmith | these are distributed systems.. nova's DB could have been restored from ten minutes ago, etc | 18:18 |
sean-k-mooney | well, the admin can with the attachment refresh command. I was debating if we should do the refresh in the case the volume is available, but I get you don't want that complexity | 18:18 |
sean-k-mooney | I guess the fact the attachment api needs service user | 18:19 |
sean-k-mooney | tokens now | 18:19 |
sean-k-mooney | fixes most of the concerns I have | 18:19 |
sean-k-mooney | as long as we are still very clear that if you detach a volume from cinder or a port from neutron your instance is tainted | 18:19 |
sean-k-mooney | and all we are doing is a minimal effort to boot again | 18:20 |
sean-k-mooney | I guess that is ok | 18:20 |
sean-k-mooney | I'll look at this again tomorrow, but without the service user change I don't think this would have been the correct approach to take. | 18:25 |
dansmith | sean-k-mooney: right the point in this is very specifically to *not* require the admin to intervene when the only thing they can reasonably do is run a command to do the same thing, which is eliminate the BDM from nova's side | 18:35 |
dansmith | we wouldn't want the admin to "fix" it (IMHO) | 18:35 |
dansmith | and especially with the cinder policy change, the only real thing that could cause this workflow-wise would be a database restore from checkpoint such that the two sides are out of sync, or an admin operation on the cinder side that just then needs to be mirrored with admin operations on the nova side | 18:36 |
dansmith | it's not to _enable_ any sort of bad behavior by the user, it's to avoid the user being stuck and unable to get their instance back | 18:36 |
dansmith | for self-service reasons, I'd much prefer my volume to just become detached and me still be able to work on my instance without waiting in an admin's queue | 18:36 |
opendevreview | Merged openstack/nova master: Add a new NumInstancesWeigher https://review.opendev.org/c/openstack/nova/+/886232 | 18:46 |
opendevreview | melanie witt proposed openstack/nova master: nova-manage: Add 'limits migrate_to_unified_limits' https://review.opendev.org/c/openstack/nova/+/891646 | 19:23 |
opendevreview | melanie witt proposed openstack/nova master: Add documentation for unified limits https://review.opendev.org/c/openstack/nova/+/893127 | 19:23 |
opendevreview | melanie witt proposed openstack/nova master: Add documentation for unified limits https://review.opendev.org/c/openstack/nova/+/893127 | 19:34 |
sean-k-mooney | dansmith: ok, I can buy that. I kind of feel like in addition to this we should have an external event from cinder saying it's detached or something like that, to make it properly automatic, but again with the service token change that should not be required either. | 19:36 |
dansmith | to be clear, I do not think we should _ever_ tolerate detach via cinder, thus no need for an event | 19:36 |
dansmith | this is a corruption recovery mechanism and nothing else | 19:36 |
sean-k-mooney | so I'll remove my -1 on the reproducer. I still have not looked at the actual final patch, however | 19:36 |
sean-k-mooney | ack | 19:37 |