| opendevreview | minwoo seo proposed openstack/nova master: Add availability_zone support for migration and live-migration https://review.opendev.org/c/openstack/nova/+/976085 | 01:22 |
|---|---|---|
| *** mhen_ is now known as mhen | 02:24 | |
| opendevreview | minwoo seo proposed openstack/nova-specs master: Add spec for availability_zone support in migration https://review.opendev.org/c/openstack/nova-specs/+/976202 | 02:37 |
| opendevreview | minwoo seo proposed openstack/nova master: Add availability_zone support for migration and live-migration https://review.opendev.org/c/openstack/nova/+/976085 | 02:42 |
| opendevreview | minwoo seo proposed openstack/nova master: Add availability_zone support for migration and live-migration https://review.opendev.org/c/openstack/nova/+/976085 | 02:51 |
| opendevreview | Masahito Muroi proposed openstack/nova master: Add indirection_url to metadata-api wsgi app https://review.opendev.org/c/openstack/nova/+/976205 | 03:09 |
| *** seebaer is now known as seba | 04:11 | |
| *** mhen_ is now known as mhen | 04:14 | |
| opendevreview | minwoo seo proposed openstack/nova-specs master: Add spec for availability_zone support in migration https://review.opendev.org/c/openstack/nova-specs/+/976202 | 04:28 |
| opendevreview | minwoo seo proposed openstack/nova-specs master: Add spec for availability_zone support in migration https://review.opendev.org/c/openstack/nova-specs/+/976202 | 04:30 |
| opendevreview | minwoo seo proposed openstack/nova-specs master: Add spec for availability_zone support in migration https://review.opendev.org/c/openstack/nova-specs/+/976202 | 04:31 |
| *** ykarel_ is now known as ykarel | 05:40 | |
| opendevreview | minwoo seo proposed openstack/nova master: Add availability_zone support for migration and live-migration https://review.opendev.org/c/openstack/nova/+/976085 | 05:50 |
| opendevreview | minwoo seo proposed openstack/nova-specs master: Add spec for availability_zone support in migration https://review.opendev.org/c/openstack/nova-specs/+/976202 | 05:58 |
| opendevreview | minwoo seo proposed openstack/nova-specs master: Add spec for availability_zone support in migration https://review.opendev.org/c/openstack/nova-specs/+/976202 | 06:29 |
| opendevreview | Masahito Muroi proposed openstack/nova master: Add indirection_url to metadata-api wsgi app https://review.opendev.org/c/openstack/nova/+/976205 | 08:15 |
| opendevreview | Taketani Ryo proposed openstack/nova master: mem-enc: introduce a check between mem_encryption and locked_memory https://review.opendev.org/c/openstack/nova/+/971300 | 10:45 |
| gokhan | Hi everyone, | 12:40 |
| gokhan | I'm struggling with a live migration issue where post-copy mode is never triggered, even after the defined timeout, specifically on VMs with 64GB RAM or more. | 12:40 |
| gokhan | Despite having live_migration_permit_post_copy = True, the migration process stays in the "pre-copy" phase indefinitely or until it eventually fails, instead of switching to post-copy when the timeout is reached. I am using OPenStack Caracal version on ubuntu 22.04. My libvirt config is : https://paste.openstack.org/show/bZU8obGPewCaXXlMa9CO/ | 12:40 |
| gokhan | Are there specific qemu.conf or nova.conf tweaks required to ensure the force_complete action actually initiates the post-copy transition instead of just waiting? | 12:42 |
| gokhan | this is also added. sysctl vm.unprivileged_userfaultfd=1 | 13:07 |
| nicolairuckel | Could someone take a look at my test failures here? https://zuul.opendev.org/t/openstack/buildset/3a3255ffc9e64cad8afc30bb1cafc254 I think they are unrelated to my changes but I'd like to be sure before starting a recheck. | 13:55 |
| sean-k-mooney | gokhan: precopy activation is entilly contoled internally in libvirt | 14:42 |
| sean-k-mooney | gokhan: the only way to incluance that is to force complete the migration but that wont nessiarly swap to post copy mode | 14:42 |
| sean-k-mooney | gokhan: so no i bleive what you want to do would requrie a change to libvirt/qemu at least im not aware of a way to do it in libvirt today | 14:43 |
| opendevreview | Ivan Anfimov proposed openstack/os-vif master: wip https://review.opendev.org/c/openstack/os-vif/+/976271 | 15:28 |
| opendevreview | Ivan Anfimov proposed openstack/os-vif master: wip https://review.opendev.org/c/openstack/os-vif/+/976271 | 15:28 |
| opendevreview | Ivan Anfimov proposed openstack/os-vif master: Remove url tags from README https://review.opendev.org/c/openstack/os-vif/+/976271 | 15:29 |
| opendevreview | Ivan Anfimov proposed openstack/os-vif master: Drop support for Python 3.9 and add 3.13 https://review.opendev.org/c/openstack/os-vif/+/976272 | 15:30 |
| opendevreview | Ivan Anfimov proposed openstack/os-vif master: Drop support for Python 3.9 and add 3.13 https://review.opendev.org/c/openstack/os-vif/+/976272 | 15:31 |
| opendevreview | Ivan Anfimov proposed openstack/os-vif master: Drop support for Python 3.9 and add 3.13 https://review.opendev.org/c/openstack/os-vif/+/976272 | 15:31 |
| *** dviroel is now known as dviroel_lunch | 16:11 | |
| opendevreview | Masahito Muroi proposed openstack/nova master: Add indirection_url to metadata-api wsgi app https://review.opendev.org/c/openstack/nova/+/976205 | 16:30 |
| *** dviroel_lunch is now known as dviroel | 16:58 | |
| nicolairuckel | "ModuleNotFoundError: No module named 'pkg_resources'" doesn't look like something I could have touched but a problem within the build pipeline maybe | 17:17 |
| gibi | nicolairuckel: yes it is a knonw build issue independently from your change | 17:18 |
| gibi | we blame setuptools :) | 17:18 |
| nicolairuckel | There's also test_add_remove_router_interface which fails | 17:18 |
| nicolairuckel | But blaming setuptools sounds good to me | 17:18 |
| gibi | yes also a known one :) | 17:18 |
| gibi | sorry this week CI is a mess | 17:19 |
| nicolairuckel | Are there any bug reports for this that I can link to? | 17:19 |
| gibi | you can see the details of it in the Monday's nova meeting logs | 17:19 |
| nicolairuckel | thanks | 17:19 |
| gibi | I'm not sure about bug reports | 17:19 |
| nicolairuckel | is "recheck - known CI issues" good enough then? | 17:19 |
| gibi | nicolairuckel: do not recheck for a day or so it is pointless | 17:20 |
| nicolairuckel | ah, okay | 17:20 |
| gibi | the meeting logs has links to patches we need to land first | 17:20 |
| nicolairuckel | then I'll keep an eye on those | 17:20 |
| nicolairuckel | thanks a lot | 17:20 |
| gibi | but feel free to ping us tomorrow about updates | 17:20 |
| nicolairuckel | At least I can stop trying to debug this now. | 17:21 |
| gibi | yeah, you can go and do something fun instead :) | 17:22 |
| nicolairuckel | will do :D | 17:22 |
| * gibi also drops | 17:23 | |
| dansmith | bauzas: can you see my reply on https://review.opendev.org/c/openstack/nova/+/941483 before you drop? | 17:29 |
| bauzas | just saw it now | 17:31 |
| dansmith | thanks | 17:32 |
| bauzas | dansmith: well, I don't want to validate a lot of things, just making sure that we could not an wrong exception | 17:33 |
| bauzas | something like verifying indeed about the value and just have a catch if we have a problem | 17:34 |
| dansmith | ...what exception? | 17:34 |
| bauzas | this could be UnicodeDecodeError | 17:34 |
| dansmith | AIUI, it's a string encoded in bytes (bytes because it comes from C) and right now it's base64, so it will always be decode-able | 17:34 |
| bauzas | could secret.value() be None ? | 17:35 |
| dansmith | secret is checked for None above, I guess I'm not sure if secret.value could be None, but that won't generate a UnicodeDecodeError :) | 17:36 |
| dansmith | bauzas: as I pointed out, I've been concerned about this exact line a couple times in the past, so I share your concern here obviously, | 17:38 |
| dansmith | I'm just not sure what better we can do about it, and I think exploding in "check can live migrate" is going to give us the result we want, which is an answer of "no" and a useful stack trace | 17:38 |
| bauzas | you know what ? As I said, it could be a FUP, so I'll +2 it and eventually we could discuss this issue by a new change | 17:39 |
| dansmith | I just want to make sure we're not re-litigating this at this late stage, since I brought this up months ago :) | 17:39 |
| bauzas | yeah I understand your point | 17:39 |
| bauzas | hence why it has to be a follow-up | 17:39 |
| dansmith | bauzas: but what specifically are you thinking? just catch any exception and LOG and re-raise? or return something in dest_check_data? | 17:39 |
| bauzas | let's not hold this change | 17:39 |
| bauzas | yeah, something like that, I can provide a wip change if you want | 17:40 |
| bauzas | I'd prefer to return a VTPMSecretNotFound exception instead of something like an global exception | 17:40 |
| dansmith | I gave an either/or... which is it? :) | 17:40 |
| bauzas | sorry the former | 17:41 |
| dansmith | okay but NotFound is not really appropriate for either =None or a decode error | 17:41 |
| bauzas | returning a better exception | 17:41 |
| dansmith | ack | 17:41 |
| bauzas | hmmm, you're right, this could be another exception | 17:41 |
| dansmith | I'm not opposed to doing that here instead of a FUP, I just don't want to end up with this turning into another multi-month turnaround :D | 17:42 |
| bauzas | hmmm, then let me see the code | 17:46 |
| bauzas | and see what I can propose | 17:46 |
| dansmith | bauzas: I just don't want to lose some inertia here, I'm not asking you to write somethig, | 17:48 |
| dansmith | I'm just saying that a UnicodeDecodeError pointing to the line in the stack that decodes the libvirt secret is very useful | 17:49 |
| dansmith | I don't think we have a VTPMSecretMalformed exception, and I think we too often create single-use exceptions thinking it will be helpful, which is is not always | 17:49 |
| dansmith | I'm fine double-checking secret.value==None, which could be done on the same line as L10874, but it will hide the subtle difference of it being _present_ in libvirt but .. missing a value, which is a different thing, | 17:50 |
| dansmith | so I dunno. | 17:50 |
| bauzas | if you prefer to have a specific unicodeerror, I'm fine | 17:52 |
| bauzas | but yeah, maybe checking the value itself whether it's None could be better | 17:53 |
| dansmith | https://libvirt.org/html/libvirt-libvirt-secret.html#virSecretGetValue | 17:57 |
| dansmith | says NULL on error only | 17:57 |
| dansmith | I'd have to chase if that gets actually exposed through the bindings or not | 17:58 |
| melwitt | yeah I think we would want to differentiate the two so nested None check would be better probably | 17:58 |
| dansmith | but returning NotFound for that case would not be appropriate | 17:58 |
| melwitt | in my experience the way the python bindings are auto-generated is the C API returns NULL and error codes but the python will raise exceptions on any errors. I have not seen the actual code for that though, I guess it would be somewhere in their auto generate tool or the config for the tool or something | 17:59 |
| melwitt | i.e. I have not seen the actual code that translates NULL+error codes into python exceptions but it must be somewhere | 18:00 |
| bauzas | hmmmm | 18:00 |
| melwitt | and the python API is undocumented of course :) | 18:00 |
| bauzas | honestly I don't know what to say | 18:00 |
| bauzas | here, we get a bytes and we decode it | 18:00 |
| bauzas | my main concern was about the fact that given we directly return the value, if a new libvirt release was changing the value, we could have another problem, but here I don't want us to do more than just saying something like "this is bizarre" | 18:02 |
| melwitt | we decode it because we specifically encoded it to pass to libvirt because it needs the bytes type | 18:02 |
| bauzas | yup I understood it | 18:03 |
| dansmith | right, libvirt wouldn't change the encoding of the value because we provided it in the first place | 18:05 |
| melwitt | I hear your point but to me it's like ... all of the libvirt python API is like this. we could say the same thing about every libvirt-python function we call. do we check the return type of virNodeDevice for example? I might be remembering wrong but we just call methods on those returned objects without any special error handling | 18:05 |
| dansmith | it's not the bytes or not bytes that is a concern, melwitt, it's what the bytes are.. if we provide it something non-UTF-encoded then decode() will fail | 18:05 |
| dansmith | but since we provided it with something we already encoded to UTF-8 and we do that blindly from the key, there's not much else we can do right now and would have to handle it in the future if the thing we're getting/storing changed | 18:06 |
| melwitt | I guess my thinking is, we are controlling that right? we are the ones who encode it before handing it to libvirt. is the thought that libvirt might tamper with the encoding we gave it somehow? | 18:06 |
| dansmith | it can't | 18:06 |
| dansmith | and yes, we've given it encoded so shouldn't fail here, we would have failed when we set it (unless tampered-with, in which case not much we can do) | 18:07 |
| melwitt | we generate the key in nova/crypto.py, so we control that too | 18:07 |
| bauzas | you're right | 18:07 |
| dansmith | that's why I'm saying I share the concern, but I'm okay with it given the constraints | 18:08 |
| bauzas | so shouldn't be a problem with a new libvirt release | 18:08 |
| dansmith | melwitt: I know, I'm agreeing :) | 18:08 |
| bauzas | and honestly the bytes field wouldn't change | 18:08 |
| melwitt | ok, sorry I'm just not entirely following by no fault of anyone other than me | 18:08 |
| bauzas | melwitt: so, I think I forgot we *did* set it | 18:09 |
| bauzas | so we now that won't change | 18:09 |
| bauzas | so nevermind then my comments | 18:09 |
| melwitt | for me it's that we control every part of the key generation and encoding and decoding and I'm not sure what problem we expect we might have? again, maybe I am just being dumb | 18:09 |
| dansmith | melwitt: it's just a robustness thing | 18:09 |
| dansmith | if we want to try..except, and log "Failed to decode secret value" and then re-raise, that's fine, I'm just not sure that really adds much is my point | 18:10 |
| dansmith | and since this is mid-stack, let's do (or argue about) that in a FUP :D | 18:10 |
| melwitt | like to handle if someone changes the key generation in nova/crypto.py but doesn't also handle the libvirt calls at the same time sometime in the future | 18:11 |
| melwitt | fwiw I am in agreement that it doesn't seem like that would add much value but I am not opposed to doing it | 18:11 |
| dansmith | bauzas: did you see the previous discussion where melwitt pointed out that we define the secret in XML form, which means we can't have provided just any random binary because it had to be in XML? | 18:12 |
| dansmith | that to me is another layer of protection here, not only that we're generating a unicode string, but also that we had to define it in XML which can't have been random binary | 18:12 |
| bauzas | dansmith: no, but looking now the comments | 18:13 |
| bauzas | dansmith: I guess you're talking of https://review.opendev.org/c/openstack/nova/+/941483/comment/42e25da0_7ac1ff5a/ ? /me reads | 18:14 |
| dansmith | no, I'm talking about the actual comments from PS21 | 18:14 |
| dansmith | that's what I was trying to point at this mornign | 18:14 |
| * bauzas reads PS21 comment | 18:15 | |
| melwitt | well the XML doesn't have the secret value in it but the "define secret XML" API returns a secret object and then we call setValue() on that https://github.com/openstack/nova/blob/264e868d4931595140260c0f655a10b525be38f7/nova/virt/libvirt/host.py#L1225-L1227 | 18:15 |
| dansmith | oh wait, that's the opposite of what I thought | 18:16 |
| melwitt | the XML has the secret UUID but not the value | 18:16 |
| bauzas | yeah that's another libvirt API for the secret value, right? | 18:17 |
| dansmith | right the set value is how you do it if you _want_ random non-string encode-able binary | 18:17 |
| melwitt | yes I think that is what I'm seeing | 18:17 |
| dansmith | melwitt: you said this: | 18:17 |
| dansmith | It is apparently possible to create libvirt secrets that don't contain printable characters by using the virSecretSetValue API [1], but we (Nova) have always used the virSecretDefineXML API which has to be specified via XML [2]. | 18:17 |
| dansmith | which I thought implied that we _were_ using the XML-based method | 18:18 |
| dansmith | are you saying we actually use the former? | 18:18 |
| melwitt | we have to give it XML for it to create the secret but we can't give it the value _in_ the XML | 18:18 |
| dansmith | okay but then I'm not sure why the "but we have always used...XML" comment | 18:19 |
| melwitt | it is both, secret = conn.secretDefineXML(xml) creates the secret object which it returns to us and then there is a separate call to secret.setValue(keydata) | 18:19 |
| dansmith | I thought you were saying that we _could_ define random binary with setValue, but we're not because we set it via XML and thus has to be text | 18:19 |
| dansmith | right, so that means we're not getting any extra layer of enforcement of it being tet | 18:20 |
| dansmith | *text | 18:20 |
| dansmith | and that was what I was trying to point out in PS21 and I thought you were telling me no.. I guess I didn't chase it deep enough to confirm. | 18:20 |
| melwitt | yeah I'm guessing I misread our code in libvirt/host.py or something | 18:20 |
| bauzas | sorry folks, I unfortunately need to have a dinner now :( | 18:21 |
| dansmith | bauzas: either way, it doesn't change that we're just decode()ing something we already encode()ed here, I just thought we were further enforcing it as text because of the XML thing but ... apparently not | 18:21 |
| bauzas | okay, fwiw, I +2d the patch, I think we're just enough robust for now if we are sure that everything vtpm secret is set by nova | 18:23 |
| bauzas | but I don't know what else to say here | 18:23 |
| melwitt | the only reason it is ever text is because of SensitiveStringField in the object. which was what we touched on the most recent review comments | 18:24 |
| dansmith | um, that can't be true though | 18:25 |
| dansmith | bauzas: go dinner, we'll pick up later | 18:25 |
| melwitt | correction to my above message: SensitiveString is not the only reason it is text. it is text because it is a "passphrase" secret type in Barbican https://docs.openstack.org/barbican/latest/api/reference/secret_types.html i.e. we Nova create it specifically as a "passphrase" from urandom generate bytes that we Nova then base64 encoded | 19:30 |
| dansmith | yeah.. which also means that what I said recently about wanting a bytes type in the object is not right.. SensitiveString is 100% the right type for that because ^ | 19:44 |
| melwitt | good point | 19:45 |
| dansmith | commented so we don't find that later and be foncused | 19:47 |
| melwitt | smart | 20:00 |
| melwitt | *commenting on the review is smart (sometimes I type parts of words in reverse too) | 20:02 |
| dansmith | well, in this case it was intentional :) | 20:04 |
| melwitt | oh, it went completely over my head then | 20:07 |
| JayF | sean-k-mooney: (or others who may be interested) -- I'd love to chat for 30-60 minutes sometime in March, pre-PTG, and brainstorm options for improving the Nova-compute story around Ironic further. I think we could get a lot of progress out of moving placement updates (at least partially) to Ironic. | 20:16 |
| sean-k-mooney | JayF: sure. | 20:19 |
| JayF | sean-k-mooney: my availability is open rest of today PST, and on Thursday, but then I'm out for a couple of weeks -- what's your march look like? maybe I can get something on a clendar | 20:20 |
| sean-k-mooney | moving inventory reporting to placmenet is proably doable but nova will need to continue to do the creation fo the actual allcoations against those inventories for nova manged servers | 20:20 |
| sean-k-mooney | beyond st patricks day ill be workign all of march | 20:21 |
| JayF | I mean, I don't think we can get rid of having some idea of "these nodes are managed by this nova-compute" | 20:21 |
| sean-k-mooney | i can chat now fo a while or thusday if you like | 20:21 |
| JayF | now() works, you OK with a zoom? | 20:21 |
| sean-k-mooney | sure | 20:21 |
| sean-k-mooney | we can chat again latier but if you want to quickly brainstorme somting i have like 20-30 mins then im gettign dinner one way or anohter | 20:22 |
| JayF | ugh hold on | 20:23 |
| JayF | zoom client doesn't copy+paste properly | 20:23 |
| sean-k-mooney | :) | 20:23 |
| opendevreview | Zhan Zhang proposed openstack/nova-specs master: Add Additional Live Migration Features https://review.opendev.org/c/openstack/nova-specs/+/976311 | 21:44 |
| opendevreview | Zhan Zhang proposed openstack/nova-specs master: Add Additional Live Migration Features https://review.opendev.org/c/openstack/nova-specs/+/976311 | 22:00 |
| opendevreview | melanie witt proposed openstack/nova master: TPM: support live migration of `deployment` secret security https://review.opendev.org/c/openstack/nova/+/925771 | 22:21 |
| opendevreview | melanie witt proposed openstack/nova master: TPM: bump service version to enable live migration https://review.opendev.org/c/openstack/nova/+/975724 | 22:21 |
| opendevreview | melanie witt proposed openstack/nova master: TPM: test live migration between hosts with different security https://review.opendev.org/c/openstack/nova/+/952629 | 22:21 |
| opendevreview | melanie witt proposed openstack/nova master: TPM: add late check for supported TPM secret security https://review.opendev.org/c/openstack/nova/+/956975 | 22:21 |
| opendevreview | melanie witt proposed openstack/nova master: TPM: opt-in to new TPM secret security via resize https://review.opendev.org/c/openstack/nova/+/962052 | 22:21 |
| opendevreview | melanie witt proposed openstack/nova master: DNM vtpm tempest https://review.opendev.org/c/openstack/nova/+/957477 | 22:21 |
| opendevreview | melanie witt proposed openstack/nova master: TPM: fixups for live migration of `host` secret security https://review.opendev.org/c/openstack/nova/+/976316 | 22:21 |
Generated by irclog2html.py 4.0.0 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!