*** ralonsoh_ is now known as ralonsoh | 10:07 | |
opendevreview | Nisha Brahmankar proposed openstack/nova master: Retain port binding profile after live migration https://review.opendev.org/c/openstack/nova/+/936338 | 10:14 |
---|---|---|
opendevreview | Nisha Brahmankar proposed openstack/nova master: Retain port binding profile after live migration https://review.opendev.org/c/openstack/nova/+/936338 | 10:26 |
opendevreview | Ivan Tkachuk proposed openstack/nova master: Reduce calls to qemu-img for disk_info https://review.opendev.org/c/openstack/nova/+/936246 | 11:21 |
dviroel | hi there nova folks. I would like to request reviews on this nova spec: https://review.opendev.org/c/openstack/nova-specs/+/936140 | 12:52 |
dviroel | just wondering if I should add it to https://etherpad.opendev.org/p/nova-2025.1-status ? under Blueprint needs approval ? | 12:53 |
bauzas | dviroel: sure, add it in the etherpad | 13:02 |
dviroel | ty | 13:04 |
opendevreview | Stephen Finucane proposed openstack/nova master: docs: Add contributor docs for response body validation https://review.opendev.org/c/openstack/nova/+/924597 | 13:28 |
opendevreview | Stephen Finucane proposed openstack/nova master: tests: Ensure all APIs have a response body schema https://review.opendev.org/c/openstack/nova/+/924598 | 13:28 |
opendevreview | Stephen Finucane proposed openstack/nova master: api: Remove use of microversion constants https://review.opendev.org/c/openstack/nova/+/936362 | 13:28 |
opendevreview | Stephen Finucane proposed openstack/nova master: api: Allow min/max_version arguments to expected_errors https://review.opendev.org/c/openstack/nova/+/936363 | 13:28 |
opendevreview | Stephen Finucane proposed openstack/nova master: api: Allow min/max_version arguments to response https://review.opendev.org/c/openstack/nova/+/936364 | 13:28 |
opendevreview | Stephen Finucane proposed openstack/nova master: api: Adjust validation helpers for a single-method future https://review.opendev.org/c/openstack/nova/+/936365 | 13:28 |
opendevreview | Stephen Finucane proposed openstack/nova master: api: Stop using wsgi.Controller.api_version to switch between API versions https://review.opendev.org/c/openstack/nova/+/936366 | 13:28 |
opendevreview | Stephen Finucane proposed openstack/nova master: api: Add new, simpler api_version decorator https://review.opendev.org/c/openstack/nova/+/936367 | 13:28 |
opendevreview | Stephen Finucane proposed openstack/nova master: api: Only run format checks on strings https://review.opendev.org/c/openstack/nova/+/936368 | 13:28 |
opendevreview | Stephen Finucane proposed openstack/nova master: api: Simplify parameter types https://review.opendev.org/c/openstack/nova/+/936369 | 13:28 |
*** iurygregory_ is now known as iurygregory | 13:31 | |
s3rj1k | JFYI: I don't see any mentions about adding specs to etherpad in https://docs.opendev.org/opendev/infra-manual/latest/developers.html#working-on-specifications-and-blueprints | 13:38 |
opendevreview | ribaudr proposed openstack/nova master: Add share_info parameter to reboot method for each driver (driver part) https://review.opendev.org/c/openstack/nova/+/854823 | 13:41 |
opendevreview | ribaudr proposed openstack/nova master: Support rebooting an instance with shares (compute manager part) https://review.opendev.org/c/openstack/nova/+/854824 | 13:41 |
opendevreview | ribaudr proposed openstack/nova master: Add share_info parameter to resume method for each driver (driver part) https://review.opendev.org/c/openstack/nova/+/860284 | 13:41 |
opendevreview | ribaudr proposed openstack/nova master: Support resuming an instance with shares (compute manager part) https://review.opendev.org/c/openstack/nova/+/860285 | 13:41 |
opendevreview | ribaudr proposed openstack/nova master: Add helper methods to rescue/unrescue shares https://review.opendev.org/c/openstack/nova/+/860286 | 13:41 |
opendevreview | ribaudr proposed openstack/nova master: Support rescuing an instance with shares https://review.opendev.org/c/openstack/nova/+/860287 | 13:41 |
opendevreview | ribaudr proposed openstack/nova master: Allow to mount manila share using Cephfs protocol https://review.opendev.org/c/openstack/nova/+/883862 | 13:41 |
opendevreview | ribaudr proposed openstack/nova master: Check shares support (compute manager) https://review.opendev.org/c/openstack/nova/+/885751 | 13:41 |
opendevreview | ribaudr proposed openstack/nova master: Attach Manila shares via virtiofs (API) https://review.opendev.org/c/openstack/nova/+/836830 | 13:41 |
opendevreview | ribaudr proposed openstack/nova master: Add helper methods to attach/detach shares https://review.opendev.org/c/openstack/nova/+/885753 | 13:41 |
opendevreview | ribaudr proposed openstack/nova master: Add instance.share_attach notification https://review.opendev.org/c/openstack/nova/+/850501 | 13:41 |
opendevreview | ribaudr proposed openstack/nova master: Add instance.share_detach notification https://review.opendev.org/c/openstack/nova/+/851028 | 13:41 |
opendevreview | ribaudr proposed openstack/nova master: Add shares to InstancePayload https://review.opendev.org/c/openstack/nova/+/851029 | 13:41 |
opendevreview | ribaudr proposed openstack/nova master: Add instance.share_attach_error notification https://review.opendev.org/c/openstack/nova/+/860282 | 13:41 |
opendevreview | ribaudr proposed openstack/nova master: Add instance.share_detach_error notification https://review.opendev.org/c/openstack/nova/+/860283 | 13:42 |
opendevreview | ribaudr proposed openstack/nova master: Reports instance events to the DB regarding attaching and detaching a share https://review.opendev.org/c/openstack/nova/+/927088 | 13:42 |
opendevreview | ribaudr proposed openstack/nova master: Add libvirt test to ensure metadata are working. https://review.opendev.org/c/openstack/nova/+/852086 | 13:42 |
opendevreview | ribaudr proposed openstack/nova master: Add virt/libvirt error test cases https://review.opendev.org/c/openstack/nova/+/852087 | 13:42 |
opendevreview | ribaudr proposed openstack/nova master: Manila shares admin guide documentation https://review.opendev.org/c/openstack/nova/+/871642 | 13:42 |
opendevreview | ribaudr proposed openstack/nova master: Refactor test_server_shares: Mock in Base Class and trait verification https://review.opendev.org/c/openstack/nova/+/935861 | 13:42 |
sean-k-mooney | s3rj1k: that's because bauzas has an AR to update the doc :) it's not a requirement, it's just an extra way to highlight things that are available to review | 13:44 |
sean-k-mooney | s3rj1k: ildikov filed a bug to correct the docs gap recently | 13:45 |
sean-k-mooney | s3rj1k: the etherpad was added 2 releases ago, before that we used review priorities | 13:45 |
s3rj1k | and there is still an extra step of making a stub blueprint | 13:46 |
sean-k-mooney | that is documented | 13:46 |
sean-k-mooney | all specs need a blueprint | 13:46 |
sean-k-mooney | that's in the spec template too | 13:46 |
sean-k-mooney | https://bugs.launchpad.net/nova/+bug/2089325 is the bug | 13:46 |
ildikov | +1, I was just about to throw in the bug report link :) | 13:47 |
s3rj1k | yea, I mean that there is a blueprint for tracking the state of the spec and there is also the etherpad | 13:47 |
sean-k-mooney | the etherpad is purely to help publicise and prioritise reviews | 13:48 |
sean-k-mooney | it is not required at all for our process | 13:48 |
ildikov | I interpreted the etherpad process as a lightweight project management tool that helps the core team to keep track of things that are ongoing | 13:48 |
sean-k-mooney | yep | 13:49 |
sean-k-mooney | sometimes things can get "lost" in gerrit or launchpad | 13:49 |
s3rj1k | imo, I see this as not so optimal, but ok, if that works | 13:50 |
sean-k-mooney | it's mainly working around some limitations in the launchpad and gerrit UIs | 13:50 |
sean-k-mooney | to be clear, i almost exclusively ignore the etherpad | 13:50 |
sean-k-mooney | i do look at it from time to time, but i look at gerrit directly much more frequently | 13:51 |
sean-k-mooney | i try to take a look at the etherpad once a week, but i don't always find time to | 13:51 |
s3rj1k | could have just been automated away based on data from gerrit and launchpad) | 13:52 |
sean-k-mooney | no, the people that wanted it wanted it to be human qurated | 13:52 |
sean-k-mooney | *curated | 13:52 |
sean-k-mooney | if we did an automated sync we would be back to the noise problem | 13:52 |
sean-k-mooney | s3rj1k: i get that process can be frustrating, especially if it's not clear why or how it works | 13:55 |
s3rj1k | it just adds extra layer of bureaucracy and this will impact adoption for new community members | 13:56 |
sean-k-mooney | s3rj1k: how so? currently it's not required at all | 13:57 |
s3rj1k | I have a gut feeling about this, I guess in time we will see | 13:58 |
sean-k-mooney | it's actually there to allow people to engage async and to ask for reviews | 13:58 |
sean-k-mooney | so it's intended to help contributors | 13:58 |
sean-k-mooney | i.e. so that they don't have to be on irc to ask for reviews | 13:59 |
sean-k-mooney | when they are not getting feedback in gerrit | 13:59 |
sean-k-mooney | gerrit, irc and the mailing list are the 3 primary engagement points for contributors, with launchpad (bugs) for users | 13:59 |
s3rj1k | there is still RFE step that is semi-supported | 14:00 |
sean-k-mooney | s3rj1k: i'm not discounting your feedback, i'm wondering how to improve it (beyond updating the docs) | 14:00 |
sean-k-mooney | so nova does not use RFE bugs | 14:01 |
sean-k-mooney | that's a neutron-specific thing | 14:01 |
sean-k-mooney | we have experimented with adding it but it's not part of our process | 14:01 |
s3rj1k | yea, but how to propose a draft for a feature to talk about? rfe seems perfect for this | 14:01 |
sean-k-mooney | most openstack teams don't use RFE bugs; neutron uses them because you can't really have a discussion in blueprints due to the UI | 14:01 |
s3rj1k | it's not like rfe is a call to action for core devs) just a place to put text and talk in comments | 14:02 |
sean-k-mooney | right, but launchpad is not where we have those discussions in general | 14:02 |
s3rj1k | mailing lists? | 14:03 |
sean-k-mooney | gerrit, irc or the mailing list, outside of the ptg | 14:03 |
sean-k-mooney | so normally, unless it's a trivial feature, we would expect to have the design discussion in a spec review | 14:03 |
sean-k-mooney | the actual workflow is meant to be: if you have a feature request, bring it up on the mailing list or in the irc meeting | 14:04 |
sean-k-mooney | then based on the initial feedback file a spec or blueprint for it | 14:04 |
sean-k-mooney | specless blueprints are used for very small features that don't require an actual spec, but most features will need a spec | 14:05 |
*** iurygregory_ is now known as iurygregory | 14:42 | |
opendevreview | Stephen Finucane proposed openstack/nova master: api: Simplify parameter types https://review.opendev.org/c/openstack/nova/+/936369 | 15:46 |
opendevreview | Stephen Finucane proposed openstack/nova master: tests: Ensure all APIs have a response body schema https://review.opendev.org/c/openstack/nova/+/924598 | 15:46 |
opendevreview | Serhii Ivanov proposed openstack/nova-specs master: [SPEC] `Add Distributed Locking for Host Discovery`. https://review.opendev.org/c/openstack/nova-specs/+/936389 | 16:30 |
opendevreview | Takashi Kajinami proposed openstack/nova master: Fix missing oslo.policy options https://review.opendev.org/c/openstack/nova/+/936392 | 16:37 |
opendevreview | Takashi Kajinami proposed openstack/nova master: Fix missing oslo.policy options https://review.opendev.org/c/openstack/nova/+/936392 | 16:48 |
opendevreview | Serhii Ivanov proposed openstack/nova-specs master: [SPEC] `Add Distributed Locking for Host Discovery`. https://review.opendev.org/c/openstack/nova-specs/+/936389 | 16:53 |
yosef | Hi, in case of complete data loss of openstack database for example in 1 day, is there any way to restore to db, VMs created in this one day? | 17:01 |
sean-k-mooney | yosef: not really. you could try to manually create the records but there is no tooling to enable that | 17:37 |
opendevreview | Merged openstack/nova master: [codespell] Fixes for latest version https://review.opendev.org/c/openstack/nova/+/923738 | 18:15 |
opendevreview | Merged openstack/nova master: pre-commit: Bump versions https://review.opendev.org/c/openstack/nova/+/923739 | 18:15 |
s3rj1k | sean-k-mooney: Can you take a quick look on that Spec CR? https://review.opendev.org/c/openstack/nova-specs/+/936389 | 19:00 |
s3rj1k | I've put also some of the alternatives that you've mentioned in there | 19:00 |
sean-k-mooney | sure | 19:01 |
s3rj1k | thanks | 19:01 |
sean-k-mooney | i'm not convinced your primary proposal is what we should do | 19:01 |
sean-k-mooney | but we can discuss it on the spec | 19:01 |
s3rj1k | at least I hope to have this finalized in any way) best to have this solved rather than ignoring the issue | 19:02 |
s3rj1k | so any way works as long as we go forward :) | 19:02 |
sean-k-mooney | well right now you want to add a distributed lock manager and another usage of tooz | 19:02 |
sean-k-mooney | whereas we have discussed how to remove tooz from the code base | 19:02 |
sean-k-mooney | at least twice in the last year or two | 19:03 |
s3rj1k | just like in original rfe, tooz was there yes | 19:03 |
sean-k-mooney | so we may be able to proceed with this, but you're asking for us to reverse course on that | 19:03 |
s3rj1k | we can replace tooz I guess | 19:03 |
sean-k-mooney | ok, just so you're aware that this is a pretty controversial approach | 19:04 |
sean-k-mooney | we generally did not want to have a distributed lock manager in nova at all | 19:04 |
s3rj1k | that's why I invested time in alternatives) | 19:04 |
sean-k-mooney | the main concern with tooz by the way is its overall maintenance | 19:05 |
sean-k-mooney | what might be simpler to do is some form of leader election | 19:05 |
sean-k-mooney | rather than a distributed lock | 19:05 |
s3rj1k | that won't cover CLI | 19:06 |
sean-k-mooney | well to be clear it does not have to | 19:06 |
sean-k-mooney | currently the side effect does not break anything | 19:06 |
sean-k-mooney | the db enforces consistency, so the worst case side effect is that the command exits with an error, but it does not break anything | 19:07 |
s3rj1k | yea, I guess so, if periodics can be safely fixed for concurrency | 19:07 |
sean-k-mooney | so that's ok on the cli | 19:07 |
opendevreview | Stephen Finucane proposed openstack/nova master: api: Allow min/max_version arguments to expected_errors https://review.opendev.org/c/openstack/nova/+/936363 | 19:07 |
opendevreview | Stephen Finucane proposed openstack/nova master: api: Allow min/max_version arguments to response https://review.opendev.org/c/openstack/nova/+/936364 | 19:07 |
opendevreview | Stephen Finucane proposed openstack/nova master: api: Adjust validation helpers for a single-method future https://review.opendev.org/c/openstack/nova/+/936365 | 19:07 |
opendevreview | Stephen Finucane proposed openstack/nova master: api: Stop using wsgi.Controller.api_version to switch between API versions https://review.opendev.org/c/openstack/nova/+/936366 | 19:07 |
opendevreview | Stephen Finucane proposed openstack/nova master: api: Add new, simpler api_version decorator https://review.opendev.org/c/openstack/nova/+/936367 | 19:07 |
opendevreview | Stephen Finucane proposed openstack/nova master: api: Only run format checks on strings https://review.opendev.org/c/openstack/nova/+/936368 | 19:07 |
opendevreview | Stephen Finucane proposed openstack/nova master: api: Simplify parameter types https://review.opendev.org/c/openstack/nova/+/936369 | 19:07 |
opendevreview | Stephen Finucane proposed openstack/nova master: tests: Ensure all APIs have a response body schema https://review.opendev.org/c/openstack/nova/+/924598 | 19:07 |
s3rj1k | no reason to force CLI with cronjobs | 19:07 |
sean-k-mooney | what i was originally thinking was to use a rendezvous hash to select the leader from the active schedulers and only run the periodic from that one | 19:08 |
sean-k-mooney | that way we don't actually need an external dependency (distributed lock manager) or any additional rpcs etc. | 19:09 |
sean-k-mooney | i'm just not sure if that will work with our current system | 19:09 |
sean-k-mooney | a concern with a dlm is making sure the lock is released if any of the pods get restarted for example | 19:10 |
s3rj1k | btw, is there a formal doc on tooz deprecation? | 19:10 |
sean-k-mooney | it's not deprecated as a lib | 19:10 |
sean-k-mooney | so the only usage of it in nova is the ironic driver and we deprecated the feature that uses it in that | 19:11 |
s3rj1k | so for nova removal in this case | 19:11 |
sean-k-mooney | currently the ironic driver can use manual sharding or a dynamic hash ring (provided by tooz) | 19:11 |
sean-k-mooney | the dynamic hash ring is deprecated for removal, which would have removed tooz as a side effect | 19:12 |
sean-k-mooney | s3rj1k: each scheduler is a member of a service group | 19:15 |
sean-k-mooney | https://github.com/openstack/nova/blob/master/nova/scheduler/manager.py#L69 | 19:15 |
sean-k-mooney | there are 2 service group drivers, the db and memcache | 19:15 |
sean-k-mooney | i'm wondering if we can reuse that to do leader election and then have only one of the members run the periodic if it's currently the leader | 19:17 |
sean-k-mooney | the problem is the current service group api does not really have a way to do that | 19:19 |
sean-k-mooney | it has a way to test if a member of a service group is up | 19:20 |
sean-k-mooney | but not a way to list the members and/or select a leader | 19:20 |
opendevreview | ribaudr proposed openstack/nova-specs master: WIP: Allow vfio gpu support with Vendor-Specific VFIO framework https://review.opendev.org/c/openstack/nova-specs/+/936407 | 19:22 |
opendevreview | Merged openstack/nova stable/2024.1: Handle neutron-client conflict https://review.opendev.org/c/openstack/nova/+/927732 | 19:43 |
s3rj1k | > but not a way to list the members and/or select a leader - so that would mean that we would first need to extend that and after that try to use it | 19:45 |
sean-k-mooney | for the db driver it's pretty simple | 19:51 |
sean-k-mooney | that's basically a tweaked version of https://github.com/openstack/nova/blob/master/nova/db/main/api.py#L492-L503 | 19:51 |
sean-k-mooney | i.e. get service by binary | 19:51 |
sean-k-mooney | in the case of memcache it's a little more annoying | 19:52 |
sean-k-mooney | we could just ignore the service groups and just use the service api | 19:53 |
sean-k-mooney | *service db tables not api | 19:53 |
sean-k-mooney | basically i'm trying to see what would result in the minimum number of additional rpc/db/rest calls to enable this | 19:54 |
s3rj1k | > minimum number of additional rpc/db/rest calls - solution by documentation :) | 19:58 |
sean-k-mooney | maybe, or https://paste.opendev.org/show/bFFLh0pVeBZAOYXoNjKs/ | 20:05 |
sean-k-mooney | we would just add that here https://github.com/openstack/nova/blob/master/nova/scheduler/manager.py#L111 | 20:06 |
s3rj1k | hmm, if that works that would be nice, can be combined with docs extension | 20:07 |
sean-k-mooney | sure | 20:07 |
sean-k-mooney | basically that's one additional db call on each invocation of the periodic. but it's a net win because instead of 3 of the periodics doing the work in parallel, only 1 of them will execute the rest of the code | 20:08 |
sean-k-mooney | so overall that should reduce the db load | 20:08 |
s3rj1k | in that case warning about concurrent CLIs and extension to how periodics work | 20:09 |
s3rj1k | one thing, in case of AA how would leader election work? like fallback? | 20:10 |
sean-k-mooney | so the heartbeat would expire on the failed scheduler | 20:10 |
s3rj1k | and alive one would be elected? | 20:10 |
sean-k-mooney | and then it would be eliminated from the set and we would select a new leader | 20:10 |
sean-k-mooney | right, so if we sort on the uuid, or the hostname, then that would be stable | 20:11 |
sean-k-mooney | so if one host/pod dies then it will get removed from the set and we just take the first from the list | 20:11 |
sean-k-mooney | it's not true leader election, in that when the down host comes back it takes over as the leader | 20:12 |
s3rj1k | this is fine | 20:12 |
sean-k-mooney | whereas normally you would only do an election when the current one exits, but ya it should work for this case | 20:12 |
s3rj1k | we can sort by uuid + uptime | 20:13 |
s3rj1k | if we can get time info | 20:13 |
sean-k-mooney | well we don't really want it to change often | 20:13 |
sean-k-mooney | and that would be unstable, i.e. two different schedulers checking 1 second apart | 20:13 |
sean-k-mooney | that could lead to both thinking the other is the leader | 20:14 |
s3rj1k | I mean prefer ones that run longer | 20:14 |
sean-k-mooney | but i don't think the unbiased load is a problem in this case | 20:14 |
sean-k-mooney | maybe i would keep it simple initially | 20:15 |
sean-k-mooney | we could always tweak it later if needed | 20:15 |
s3rj1k | but simple uuid sort would work as this is not very critical place | 20:15 |
sean-k-mooney | by the way, up is not directly a property on the service (we should make it one); we have a function that takes the last time we saw it and a config value to decide if it's still up | 20:16 |
sean-k-mooney | but that's pretty easy to handle | 20:16 |
sean-k-mooney | we do that computation and synthesize the up value in the api when we do a service list | 20:17 |
sean-k-mooney | basically we have https://github.com/openstack/nova/blob/master/nova/objects/service.py#L316 | 20:20 |
sean-k-mooney | which tells us when the last heartbeat was received | 20:20 |
sean-k-mooney | and we just need to compare that to the interval and allow for missed heartbeats | 20:21 |
sean-k-mooney | or actually we can just check https://docs.openstack.org/nova/latest/configuration/config.html#DEFAULT.service_down_time directly | 20:22 |
s3rj1k | I see report_count in there | 20:22 |
sean-k-mooney | up = datetime.now() - timedelta(seconds=CONF.service_down_time) < last_seen_up | 20:23 |
sean-k-mooney | something like ^ would work | 20:24 |
s3rj1k | report_count is per-service right? | 20:24 |
sean-k-mooney | yes | 20:24 |
s3rj1k | so if thing goes down and up it gets nulled and count get started over? | 20:25 |
sean-k-mooney | i don't recall off the top of my head | 20:25 |
sean-k-mooney | in this context we can ignore it i think | 20:25 |
sean-k-mooney | anyway i better finish for today | 20:26 |
s3rj1k | so basically there are solution how to make that selection more clever | 20:26 |
sean-k-mooney | yep if needed | 20:26 |
s3rj1k | thanks for your input | 20:26 |
sean-k-mooney | if we were trying to distibute load | 20:26 |
sean-k-mooney | we might want a more clever solution | 20:26 |
sean-k-mooney | but since we are trying to make sure only one runs i don't think that is needed | 20:27 |
sean-k-mooney | we might want a minimum uptime before the leader can change | 20:27 |
sean-k-mooney | i.e. double the periodic interval or something like that | 20:27 |
sean-k-mooney | just to prevent issues on start up | 20:28 |
s3rj1k | even of we miss, next run will fix that | 20:28 |
sean-k-mooney | or perhaps double the service downtime interval | 20:28 |
s3rj1k | s/of/if/ | 20:29 |
sean-k-mooney | but yes i think we can do this in a lock free way and it will self heal | 20:29 |
sean-k-mooney | actually dropping now o/ | 20:29 |
s3rj1k | have a good night | 20:30 |
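
The lock-free approach discussed above (filter schedulers by heartbeat, sort on uuid, take the first, and only run the periodic on that one) could be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the content of the paste linked in the log and not nova code: `SchedulerRecord`, `is_up`, `select_leader`, `should_run_periodic` and `SERVICE_DOWN_TIME` are hypothetical names, and `SERVICE_DOWN_TIME` merely stands in for `CONF.service_down_time`, treated here as a number of seconds.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Iterable, Optional

# Assumption: stand-in for CONF.service_down_time, expressed in seconds.
SERVICE_DOWN_TIME = timedelta(seconds=60)


@dataclass(frozen=True)
class SchedulerRecord:
    """Hypothetical view of one scheduler row from the services table."""
    uuid: str
    host: str
    last_seen_up: datetime


def is_up(record: SchedulerRecord, now: datetime,
          down_time: timedelta = SERVICE_DOWN_TIME) -> bool:
    # Same comparison as in the log: now - service_down_time < last_seen_up.
    return now - down_time < record.last_seen_up


def select_leader(records: Iterable[SchedulerRecord], now: datetime,
                  down_time: timedelta = SERVICE_DOWN_TIME
                  ) -> Optional[SchedulerRecord]:
    # Drop schedulers whose heartbeat has expired, then sort on uuid so that
    # every scheduler looking at the same rows picks the same leader.
    live = sorted((r for r in records if is_up(r, now, down_time)),
                  key=lambda r: r.uuid)
    return live[0] if live else None


def should_run_periodic(my_uuid: str, records: Iterable[SchedulerRecord],
                        now: datetime) -> bool:
    # Each scheduler calls this at the top of the periodic and returns early
    # unless it is the current leader.
    leader = select_leader(records, now)
    return leader is not None and leader.uuid == my_uuid
```

As noted in the conversation, this is not true leader election: a scheduler with a lower uuid takes over again as soon as its heartbeat is fresh. A minimum uptime before the leader can change, as suggested above, would be an extra filter in `select_leader` and is left out of this sketch.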