16:00:51 #startmeeting nova
16:00:51 Meeting started Tue Nov 2 16:00:51 2021 UTC and is due to finish in 60 minutes. The chair is bauzas. Information about MeetBot at http://wiki.debian.org/MeetBot.
16:00:51 Useful Commands: #action #agreed #help #info #idea #link #topic #startvote.
16:00:51 The meeting name has been set to 'nova'
16:01:07 #link https://wiki.openstack.org/wiki/Meetings/Nova#Agenda_for_next_meeting
16:01:16 good day, everyone
16:01:19 o/
16:01:27 o/
16:01:59 \o
16:02:02 as discussed before, I will exceptionally only be able to chair this meeting for 30 mins
16:02:12 so I'll let gibi co-chair
16:02:15 #chair gibi
16:02:15 Current chairs: bauzas gibi
16:02:29 * gibi accepts the challenge
16:02:39 let's start while people join
16:02:46 #topic Bugs (stuck/critical)
16:02:50 No critical bugs
16:03:13 #link 20 new untriaged bugs (+0 since the last meeting): #link https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New
16:03:31 thanks to anybody who triaged a few of them
16:04:11 #help any help is appreciated with bug triage and we have a how-to https://wiki.openstack.org/wiki/Nova/BugTriage#Tags
16:04:35 32 open stories (+0 since the last meeting) in Storyboard for Placement #link https://storyboard.openstack.org/#!/project/openstack/placement
16:05:20 so, maybe we closed one or more stories in Storyboard, but I don't think so
16:06:00 yeah, last story was written on Oct 21st
16:06:19 any bug to discuss in particular?
16:07:00 ok, I guess no, moving on
16:07:05 #topic Gate status
16:07:11 Nova gate bugs #link https://bugs.launchpad.net/nova/+bugs?field.tag=gate-failure
16:07:25 nothing new
16:07:33 Placement periodic job status #link https://zuul.openstack.org/builds?project=openstack%2Fplacement&pipeline=periodic-weekly
16:07:45 no issues so far ^
16:08:01 just the usual reminder,
16:08:03 Please look at the gate failures, file a bug, and add an elastic-recheck signature in the opendev/elastic-recheck repo (example: #link https://review.opendev.org/#/c/759967)
16:08:13 that's it for gate status
16:08:15 this is a gate fix that needs a second core https://review.opendev.org/c/openstack/nova/+/814036
16:08:20 oh right
16:08:44 gibi: I'll look at it while we speak
16:09:12 thanks
16:09:26 (already looked at it, but need one last glance)
16:09:56 moving on, or any gate failure to mention besides the above one?
16:10:54 #topic Release Planning
16:10:58 Yoga-1 is due Nov 18th #link https://releases.openstack.org/yoga/schedule.html#y-1
16:11:10 (3 weeks from now)
16:11:20 err, 2 weeks
16:11:24 +2d
16:11:37 which means, typey typey your specs
16:12:25 https://review.opendev.org/q/project:openstack/nova-specs+is:open is not that large
16:12:48 which makes me ask:
16:12:56 #startvote Spec review day proposal on Tuesday Nov 16th ? (yes, no)
16:13:22 yes
16:13:26 #yes
16:13:30 (how to vote?)
16:13:30 dang, the meetbot doesn't tell what to say
16:13:39 #vote yes
16:13:43 you have a space in front
16:13:43 #vote yes
16:13:47 of the startvote
16:13:48 #startvote Spec review day proposal on Tuesday Nov 16th ? (yes, no)
16:13:48 Begin voting on: Spec review day proposal on Tuesday Nov 16th ? Valid vote options are , yes, no, .
16:13:48 Vote using '#vote OPTION'. Only your last vote counts.
16:13:54 dansmith: huzzah
16:13:54 #vote yes
16:13:54 gibi: yes is not a valid option. Valid options are , yes, no, .
16:14:02 yes
16:14:04 #vote yes, !
16:14:04 gibi: yes, ! is not a valid option.
Valid options are , yes, no, .
16:14:06 #vote yes,
16:14:06 gibi: yes, is not a valid option. Valid options are , yes, no, .
16:14:09 lol
16:14:13 oh man
16:14:21 this is an absolute fail.
16:14:22 #vote yes,
16:14:22 dansmith: yes, is not a valid option. Valid options are , yes, no, .
16:14:31 #vote yes, no
16:14:31 dansmith: yes, no is not a valid option. Valid options are , yes, no, .
16:14:33 #vote ,
16:14:33 gibi: , is not a valid option. Valid options are , yes, no, .
16:14:38 #vote yes, no
16:14:38 dansmith: yes, no is not a valid option. Valid options are , yes, no, .
16:14:41 let's just assume we are ok with the 16th and move on
16:14:43 * bauzas facepalms
16:14:44 #vote yes, no,
16:14:44 dansmith: yes, no, is not a valid option. Valid options are , yes, no, .
16:14:45 this is bot abuse!
16:14:53 #endvote
16:14:53 Voted on "Spec review day proposal on Tuesday Nov 16th ?" Results are
16:14:53 #vote meh
16:14:55 that is fun
16:15:23 ok I guess this was epic but we should leave this bot quiet again for 5 years
16:15:38 anyway,
16:15:51 #agreed Spec review day happening on Nov 16th
16:15:56 voilà
16:15:59 moving on
16:16:03 #topic Review priorities
16:16:15 #link https://review.opendev.org/q/status:open+(project:openstack/nova+OR+project:openstack/placement)+label:Review-Priority%252B1
16:16:16 also leading space
16:16:45 dansmith: good catch, the copy/paste makes me mad
16:16:50 #topic Review priorities
16:16:57 #undo
16:16:57 Removing item from minutes: #topic Review priorities
16:17:25 fun, the meetbot isn't announcing new topics
16:17:33 anyway, next point
16:17:35 it doesn't on oftc I think
16:17:42 #action bauzas to propose a documentation change by this week as agreed at the PTG
16:17:47 but if you don't do it #properly it won't record them either
16:18:13 for adding a gerrit ACL to let contributors +1
16:18:31 didn't have time to formalize it yet
16:18:49 #topic Stable Branches
16:18:59 elodilles: floor is yours
16:20:02 I guess he's not around
16:20:05 so I'll paste
16:20:14 stein and older stable branches are blocked, they need the setuptools pinning patch to unblock: https://review.opendev.org/q/I26b2a14e0b91c0ab77299c3e4fbed5f7916fe8cf
16:20:37 we need a second stable core especially on https://review.opendev.org/c/openstack/nova/+/813451
16:21:00 Ussuri Extended Maintenance transition is scheduled for next week (Nov 12)
16:21:07 the list of open and unreleased patches: https://etherpad.opendev.org/p/nova-stable-ussuri-em
16:22:18 I guess we need to make a few efforts before ussuri becomes EM
16:22:31 elodilles: again, I offer my help if you ping me
16:22:44 patches that need one +2 on ussuri: https://review.opendev.org/q/project:openstack/nova+branch:stable/ussuri+is:open+label:Code-Review%253E%253D%252B2
16:23:08 (I'll skim this list)
16:23:22 oh, sorry, DST :S
16:23:24 last but not least: https://review.opendev.org/806629 patch (stable/train) needed 14 rechecks, I was pinged with the question whether testing should be reduced in train to avoid this amount of rechecks (mostly volume detach issue)
16:23:54 elodilles: hah, I warned about it in the channel :p
16:24:06 are the detach issues due to the qemu version we have in bionic?
16:24:28 I assume train is not on focal?
16:24:36 yes, train is on bionic
16:24:50 (just like ussuri)
16:26:34 hmmm, technically, Train is EM
16:26:53 I'm somewhat tempted to say maybe move it to centos 8 or focal, but we could disable the volume tests
16:26:59 yes it is
16:27:11 I'd prefer us fixing the gate issues rather than reducing the test coverage, but this depends on any actions we can take
16:27:24 so, let's be pragmatic
16:28:17 well the first question would be: does train have gibi's event-based waiting patch or is it still using the retry loop
16:28:39 gibi's patch isn't merged yet, right?
16:28:44 could it help?
16:28:57 the only options really to fix this are to change the qemu version or backport gibi's patch
16:29:10 (I'll have to leave gibi chair in the next 2 mins but dansmith has a point I'm interested in)
16:29:37 I also have to go sooner
16:29:43 sean-k-mooney[m]: we can try to backport gibi's patch and see whether that helps
16:29:45 maybe we could swap open and libvirt?
16:29:52 dansmith: I'll
16:29:59 I don't think there is anything in the libvirt topic
16:30:02 lyarwood is out now
16:30:09 okay
16:30:21 okay, elodilles I'll propose to wait for gibi's patch to land in master and then be backported
16:30:32 bauzas: it is backported down to wallaby
16:30:33 bauzas: ack
16:30:34 and punt the decision to reduce the test coverage until we get better ideas
16:30:41 if we are talking about https://review.opendev.org/q/topic:bug/1882521
16:31:09 gibi: then we need to backport it down to train
16:31:13 I don't think it will be a piece of cake to bring that back train
16:31:20 *to train
16:31:32 gibi: (apologies, I confused it with the vnic types waiting patch)
16:31:59 I have to leave, but can we hold this one discussion and go straight to dansmith's point
16:32:01 ?
16:32:09 anyhow we can take that outside when lyarwood is back
16:32:09 so I and dansmith can leave
16:32:21 let's go to that
16:32:25 #topic Sub/related team Highlights
16:32:27 nothing to tell
16:32:32 #topic Open discussion
16:32:37 Bring default overcommit ratios into sanity (dansmith / yuriys)
16:32:39 So, I think we all know the 16x cpu overcommit default is insane
16:32:41 exciting
16:32:54 we've got reports that some operators are USING those defaults because they think we're recommending them
16:32:59 yes it is
16:33:04 I am so used to Slack and Discord for drop-a-paragraph-level communication, so pardon all the incoming spam! I prewrote stuff.
16:33:04 hah
16:33:08 yuriys is here to offer guidance and help work on this,
16:33:21 but I think we need to move those defaults to something sane, both in code and update the docs
16:33:28 I guess this can be workload dependent, right?
16:33:29 it basically should not be set over 4x
16:33:39 4:1 for cpu, 1:1 for mem.
16:33:44 yep
16:33:48 I think we started to document things based on workloads
16:33:56 yes
16:33:58 bauzas: it's completely workload dependent, but we should not be recommending really anything, and thus I think the default needs to be closer to 1:1 with docs saying why you may increase it (or not)
16:33:58 but we never achieved this
16:34:01 it's VERY use case specific
16:34:17 it is
16:34:26 I'm referring to https://docs.openstack.org/nova/latest/user/feature-classification.html
16:34:28 Ideally the documentation is restructured on how to scale up these overcommits to match the desired use case, density and performance and start at sane values/defaults. Engineers can then template out the necessary config values after they've gotten to know system capabilities. Instead of working backwards from chaos and mayhem, giving future admins the opportunity to reach desired state through scaling up from a stable system should be the goal of arch design.
16:35:07 so can we agree that we'll change the defaults to something that seems reasonable, and modify the docs that just say "these are the defaults" to have big flashing warnings that defaults will never be universal in this case?
16:35:09 the 16:1 number originally was assuming webhosting as the main use case or similar workloads
16:35:13 specifically: https://docs.openstack.org/arch-design/design-compute/design-compute-overcommit.html
16:35:19 that does not fit with how openstack is typically used
16:35:28 dansmith: this sounds like a reasonable change
16:35:45 cool, specless bp or bug?
16:36:00 my only worry is how this is wired at the object level, but we can take the opportunity to lift this
16:36:12 4:1 for cpu is the highest ratio I would generally consider usable in production
16:36:32 dansmith: there are a few upgrade concerns with placement as IIRC this is set by the model itself
16:36:32 I think we already moved towards something better when we moved the defaults to placement right? but we need to do something
16:36:54 well we have the initial allocation ratios now
16:36:56 I said 4 to be reasonable haha, I think I've set about 3:1 with nova-scheduler and placement randomization elements.
16:36:59 bauzas: yeah I think placement now has explicit defaults right?
16:37:02 I'm pretty sure we have some default value in the placement DB that says "16"
16:37:03 but it still defaults to 16
16:37:07 sean-k-mooney[m]: right
16:37:17 I think we can decrease initial to 4 for cpu and 1 for memory
16:37:22 so I think we can just move those and reno that operators who took those defaults years ago should likely change them
16:37:39 +1
16:37:39 I don't wanna go procedural
16:37:51 so a specless BP could work for me but,
16:37:57 we need renos
16:38:01 sean-k-mooney[m]: yeah I think 4:1 CPU and 1:1 memory is fine for a default, we might need to up it for devstack I guess but that's where insane defaults should be :)
16:38:03 yep I was thinking the same
16:38:10 + we need to ensure we consider the DB impact before
16:38:15 cool, specless bp and renos..
sounds good
16:38:37 if that becomes debatable in the reviews, we could go drafting more
16:38:42 bauzas: I don't think there will be any
16:38:59 but here, we're talking of changing defaults, not changing existing deployments
16:39:00 if we are just changing the initial values it won't affect existing RPs
16:39:02 yeah I think it'll be straightforward, but we can always revise the plan if needed
16:39:07 right
16:39:18 OK, looks to me we have a plan
16:39:24 #micdrop
16:39:40 #agreed changing overcommit CPU ratio to <16.0 can be a specless BP
16:39:49 yuriys: typey typey
16:40:16 and ping me on IRC once you have the Launchpad BP up so I can approve it
16:40:23 * bauzas needs to drop
16:40:29 OK
16:40:37 is there anything else for today?
16:40:43 no idea what that means, but sounds good?
16:41:03 I'll just coordinate through dan I suppose
16:41:16 yuriys: you need to file a blueprint here https://blueprints.launchpad.net/nova/
16:41:33 so we can track the work
16:41:41 yuriys: with just an overview of what we said, no big deal
16:41:46 yepp
16:42:08 Ah sounds good. Dang, I thought I was going to have to, like, do a whole speech and everything
16:42:14 just to win over votes
16:42:15 haha
16:42:18 :)
16:42:26 yuriys: I told you it wouldn't be a big deal :)
16:42:44 it is not a big deal if dansmith is on your side ;)
16:42:45 yeah, I think the expandability still needs to be part of that doc btw
16:42:50 for dollar reasons
16:43:00 but I'll throw up a BP and we'll go from there
16:43:11 cool
16:43:28 is there anything else for today? I don't see other topics on the agenda
16:44:12 it seems not
16:44:19 so then I have the noble job to close the meeting :)
16:44:29 thank you all for joining today
16:44:40 #endmeeting
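
Note on the overcommit discussion: to put rough numbers on the 16:1 versus 4:1 CPU debate, here is a minimal Python sketch. It assumes placement's usual inventory arithmetic, capacity = (total - reserved) * allocation_ratio. The host size (32 CPU threads, 256 GiB RAM, nothing reserved) and the helper name `schedulable_capacity` are hypothetical, picked only for illustration; the ratios are the ones mentioned in the meeting.

```python
def schedulable_capacity(total, reserved, allocation_ratio):
    """Capacity the scheduler may hand out: (total - reserved) * ratio.

    Hypothetical helper for illustration; it mirrors the inventory
    arithmetic described above and is not Nova or placement code.
    """
    return int((total - reserved) * allocation_ratio)


# Hypothetical compute host, values chosen only for this example.
HOST_CPU_THREADS = 32
HOST_RAM_MB = 256 * 1024

# CPU: the long-standing 16:1 default versus the 4:1 proposed in the meeting.
for label, cpu_ratio in [("16:1 default", 16.0), ("4:1 proposed", 4.0)]:
    vcpus = schedulable_capacity(HOST_CPU_THREADS, 0, cpu_ratio)
    print(f"{label}: {vcpus} vCPUs schedulable on {HOST_CPU_THREADS} threads")

# Memory: the 1:1 ratio proposed in the meeting means no overcommit at all.
ram_mb = schedulable_capacity(HOST_RAM_MB, 0, 1.0)
print(f"1:1 memory: {ram_mb} MB schedulable")
```

Under these assumptions the same host advertises four times fewer vCPUs at 4:1 than at 16:1, which is the density-versus-stability trade-off raised in the discussion. As noted in the meeting, changing only the initial ratios would not affect existing resource providers.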