Thursday, 2020-05-07

*** jamesmcarthur has joined #openstack-meeting-300:43
*** jamesmcarthur has quit IRC01:10
*** jamesmcarthur has joined #openstack-meeting-301:12
*** jamesmcarthur has quit IRC01:36
*** jamesmcarthur has joined #openstack-meeting-301:48
*** jamesmcarthur has quit IRC01:48
*** jamesmcarthur has joined #openstack-meeting-301:49
*** jamesmcarthur has quit IRC01:53
*** apetrich has quit IRC02:09
*** jamesmcarthur has joined #openstack-meeting-302:10
*** jamesmcarthur has quit IRC02:20
*** jamesmcarthur has joined #openstack-meeting-302:33
*** psachin has joined #openstack-meeting-303:37
*** jamesmcarthur has quit IRC03:37
*** jamesmcarthur has joined #openstack-meeting-303:40
*** jamesmcarthur has quit IRC05:53
*** jamesmcarthur has joined #openstack-meeting-305:59
*** jamesmcarthur has quit IRC06:03
*** jamesmcarthur has joined #openstack-meeting-306:10
*** jamesmcarthur has quit IRC06:27
*** belmoreira has joined #openstack-meeting-306:27
*** slaweq has joined #openstack-meeting-306:47
*** ralonsoh has joined #openstack-meeting-307:26
*** belmoreira has quit IRC08:03
*** belmoreira has joined #openstack-meeting-308:15
*** diablo_rojo has quit IRC09:24
*** irclogbot_3 has quit IRC09:41
*** irclogbot_0 has joined #openstack-meeting-309:42
*** apetrich has joined #openstack-meeting-309:44
*** yamamoto has joined #openstack-meeting-311:26
*** artom has quit IRC11:52
*** raildo has joined #openstack-meeting-312:25
*** yamamoto has quit IRC12:54
*** yamamoto has joined #openstack-meeting-312:54
*** psachin has quit IRC13:20
*** artom has joined #openstack-meeting-313:45
*** artom has quit IRC13:46
*** artom has joined #openstack-meeting-313:46
*** belmoreira has quit IRC14:00
*** belmoreira has joined #openstack-meeting-314:00
*** yamamoto has quit IRC14:02
*** yamamoto has joined #openstack-meeting-314:02
*** yamamoto has quit IRC14:07
*** slaweq_ has joined #openstack-meeting-314:14
*** slaweq has quit IRC14:14
*** yamamoto has joined #openstack-meeting-314:18
*** slaweq_ has quit IRC14:32
*** slaweq has joined #openstack-meeting-314:34
*** bbowen has joined #openstack-meeting-315:17
*** jamesmcarthur has joined #openstack-meeting-315:32
gibi#startmeeting nova16:00
openstackMeeting started Thu May  7 16:00:05 2020 UTC and is due to finish in 60 minutes.  The chair is gibi. Information about MeetBot at http://wiki.debian.org/MeetBot.16:00
openstackUseful Commands: #action #agreed #help #info #idea #link #topic #startvote.16:00
*** openstack changes topic to " (Meeting topic: nova)"16:00
openstackThe meeting name has been set to 'nova'16:00
gibio/16:00
artom~o~16:00
bauzas\o16:00
gmanno/16:00
dansmith.16:00
melwitto/16:01
gibi#topic Last meeting16:01
*** openstack changes topic to "Last meeting (Meeting topic: nova)"16:01
gibi#link Minutes from last meeting: http://eavesdrop.openstack.org/meetings/nova/2020/nova.2020-04-30-16.00.log.html16:01
gibiis there anything to bring back from the last meeting?16:01
dansmithI keep seeing that topic and getting falsely excited that *this* is the last of these meetings :)16:02
gibi:) no it is not16:02
gibi#topic Bugs (stuck/critical)16:02
*** openstack changes topic to "Bugs (stuck/critical) (Meeting topic: nova)"16:02
gibiNo Critical bugs16:02
gibi#link 31 new untriaged bugs (-7 since the last meeting): https://bugs.launchpad.net/nova/+bugs?search=Search&field.status=New16:02
bauzasthanks gibi16:03
gibiwe are still on a downward trend but slowing down16:03
bauzasI will help next week16:03
gibiI want to reach 0 in the next couple of weeks if possible16:03
bauzaswe have a PTG discussion for this16:03
gibibauzas: thanks16:03
gibiI'm not tracking any RC critical bug at the moment16:04
gibi#link https://bugs.launchpad.net/nova/+bugs?field.tag=ussuri-rc-potential16:04
gibiany bug we need to discuss today?16:04
gibi#topic Release Planning16:05
*** openstack changes topic to "Release Planning (Meeting topic: nova)"16:05
gibiWe cut RC2 this week to include the fix https://review.opendev.org/#/q/topic:bug/1875418+(status:open+OR+status:merged)16:05
gibiI don't see anything that is blocking a GA now so I assume RC2 will be the GA code16:05
gibiplease raise any issue with the ussuri release basically now as the RC deadline is today16:06
gibianything else to discuss about the release?16:06
bauzas#link https://releases.openstack.org/ussuri/schedule.html16:06
bauzasGA is next week16:07
gibiyepp16:07
bauzasso unless we have a very large regression, I think we can hold16:07
gibiand next week there will be a community call to present Ussuri for the world16:07
gibiI will talk 5 minutes about what we did in the last cycle, like a really mini project update16:07
gibi#link http://lists.openstack.org/pipermail/openstack-discuss/2020-May/014676.html16:09
gibithese are the details of the community call ^^16:09
gibi#topic Stable Branches16:09
*** openstack changes topic to "Stable Branches (Meeting topic: nova)"16:09
gibiI did not see any major event on the stable branch16:09
gibilyarwood: if you are around, do you have any news?16:10
gibiI guess he is not around16:11
gibi#topic Sub/related team Highlights16:12
*** openstack changes topic to "Sub/related team Highlights (Meeting topic: nova)"16:12
gibiAPI (gmann)16:12
gmanni have not checked the API related spec for V cycle yet16:12
gmannone thing going on is healthcheck #link https://review.opendev.org/#/c/724684/16:12
gibigmann: do we need a bp for that?16:12
gmanni have added this to discuss in PTG also, discussion going in review too.16:13
gmanni asked for a spec so we have a complete list of things we can do now and things we want to do later, so that we can design this without breaking it when we add other things later16:13
gmannlike unauth, enable/disable options etc16:13
gibispec is even better especially if there are multiple steps16:14
gmannyeah. we can ship a minimum set of things for now and i am checking if adding things is possible as a config option or not16:15
gmannmain concern is that when we add new things, we can add them in a compatible way. like on-demand deeper checks16:16
artom"we can ship a minimum set of things for now" + 1 to that16:16
artomAre we discussing this in detail now? One idea I had was make it authenticatable from the start, but for now just return the basic 200 OK for everything, authenticated or not16:16
gmannbut i have not checked with a poc yet whether that works with oslo.middleware or we need to add an extra filter for that.16:16
artomAnd then we can spec out the "deep status" healthcheck16:17
melwittyeah I wanted to ask gmann if starting out unauth'ed and then upgrading to auth later, would that pose an issue from the API perspective?16:17
artomAnd zigo makes a good point in the review that it needs to be fast, because haproxy will be hitting it every second16:17
dansmithpresumably this isn't going to be versioned as strictly as the rest of the API right?16:18
gmannmelwitt: it will, as many load balancers use it without auth and if they need a token then it will break them16:18
artomSo it's probably a bad idea to try authentication on every request16:18
artomThere should be a "were authentication headers sent? No --> quick 200 OK" mechanism16:18
gibiartom: nothing heavy on the agenda so I think it is OK to have a sneak peek of the feature to draw attention16:18
bnemecI don't think things like haproxy are going to be able to auth, so if we add auth we still need to have a basic healthcheck that is unauth'd.16:19
dansmithwe could pretty easily build the healthcheck data from authenticated requests16:19
gmanntrue16:19
dansmithunauth'd healthchecks include very coarse information, which may be up to date if there are auth'd requests keeping it fresh, and if not, it's no worse than a basic check16:19
zigoNot even *one* haproxy hitting it every second, but in most case, 3, so 3 queries per second, constantly.16:20
artombnemec, almost like we need different URLs, one for load balancers, one for humans or other more advanced monitoring solutions16:20
gmannzigo: yeah, default of healthcheck can be a fast responding thing.16:20
dansmithzigo: ack, yeah and if we have three cells, that's five databases per check, three mqs per check, which is a good reason to build that information in a cache and just return it from healthchecks16:20
gmannanyways all these things to discuss so spec can be better16:20
bnemecartom: That would probably be the simplest.16:20
gibifeels like we have plenty of things for the spec. lets continue there16:21
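The caching idea dansmith describes above (coarse health data kept fresh by real requests, with the unauthenticated probe only reading it) could be sketched roughly as follows. This is a minimal illustration, not nova or oslo.middleware code: HealthCache, probe_status and the max_age parameter are hypothetical names, and the fall back to 200 when there is no fresh data is an assumption based on the "no worse than a basic check" remark.

```python
import threading
import time


class HealthCache:
    """Holds the latest coarse health result, refreshed outside the probe path."""

    def __init__(self, max_age=30.0, clock=time.monotonic):
        self._lock = threading.Lock()
        self._clock = clock
        self._max_age = max_age
        self._ok = None       # last recorded result, None until first record()
        self._stamp = None    # clock value at the last record()

    def record(self, ok):
        """Called from regular (authenticated) request handling."""
        with self._lock:
            self._ok = bool(ok)
            self._stamp = self._clock()

    def snapshot(self):
        """Return (ok, fresh); missing or stale data degrades to plain liveness."""
        with self._lock:
            if self._stamp is None or self._clock() - self._stamp > self._max_age:
                return True, False
            return self._ok, True


def probe_status(cache):
    """HTTP status for an unauthenticated probe; never touches a DB or MQ."""
    ok, _fresh = cache.snapshot()
    return 200 if ok else 503
```

The probe only reads an in-memory value under a lock, so several load balancers polling every second add no database or message queue load.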
gibigmann: any other API related thing you want to mention?16:21
gmannthat's all for today from me16:22
gibicool, thanks16:22
gibiLibvirt (bauzas)16:22
zigoDoes everyone agree that the current healthcheck can still be approved, in the meanwhile?16:22
gibizigo: we need to know that our future plans with the healthcheck as an extension of the current simple API16:23
gibiare viable16:23
gmannzigo: yeah so that we do not need to change the currently proposed healthcheck usage16:24
gmanni mean discuss in spec first and then do current proposed one16:24
dansmithdefinitely discuss in spec first16:24
bnemecFor reference, there was a previous healthcheck spec with a bunch of discussion: https://review.opendev.org/#/c/53145616:25
dansmithyeah, I remember,16:25
gmannbnemec: thanks that will be good ref to check too16:25
dansmithplenty of fodder there for needing a wider discussion16:26
zigoFWIW: the same type of patch has already been approved for Neutron, Heat and Cinder, so it's kind of weird that we aren't getting things cross-project this way.16:26
bnemecOh, this also has a great list of previous discussions: https://storyboard.openstack.org/#!/story/200143916:27
dansmithzigo: omg, I'm convinced.. best argument ever16:27
zigo:)16:27
dansmith:P16:27
bauzasgibi: sorry was off16:27
gibibauzas: no worries I call you again16:28
bauzasnothing to say, but aarents asked for some changes16:28
bauzashttps://etherpad.opendev.org/p/nova-libvirt-subteam16:28
bauzaswill try to review them soon16:28
gibibauzas: cool thanks16:28
bauzasthat's it16:28
bauzaskashyap also has a point about q35 but he's not around16:28
*** belmoreira has quit IRC16:29
gibilets quickly finish the agenda and then we can get back to the healthcheck discussion in the Open16:29
gibi#topic Stuck Reviews16:29
*** openstack changes topic to "Stuck Reviews (Meeting topic: nova)"16:29
gibinothing on the agenda. Does anybody have a stuck review to bring up?16:29
gibi#topic Virtual PTG planning16:30
*** openstack changes topic to "Virtual PTG planning (Meeting topic: nova)"16:30
gibiCurrent nova schedule is on the top of the etherpad #link https://etherpad.opendev.org/p/nova-victoria-ptg16:31
gibiCyborg also wants to talk with us about SmartNic and that discussion is now scheduled for June 5 Friday 14:00 UTC - 15:00 UTC16:31
gibianything else about the virtual PTG ?16:31
gmanndo we want to move healthcheck topic with oslo as cross project?16:32
gmanni added at L 179 for now16:32
artomNot sure it's oslo cross-project... It's already merged in other projects (ex: https://review.opendev.org/#/c/724676/), so if we want cross-project uniformity (which I think is important), our hands are kinda tied in that sense16:33
gibigmann: If you feel bnemec or other folks from oslo would be good to join to that discussion then lets try to have some dedicated time for an oslo-nova cross session16:33
gibibundled with the policy discussion16:34
artomLike, making it authenticatable and future-proof are important, but it'd be bad form to go off and do our own thing entirely.16:34
gmannok16:34
bnemecI think it's important to keep in mind that there are two things here: enabling the existing simple healthcheck, and designing the next-gen fancy healthcheck16:34
bnemecThe latter should not block the former IMHO.16:34
gibi#topic Open discussion16:35
*** openstack changes topic to "Open discussion (Meeting topic: nova)"16:35
artombnemec, agreed. I guess the point is, if we want to have them both on the same URL (which is debatable in my mind), we need to build in things the latter might need from the start16:36
gibiwe can continue the healthcheck discussion now in the Open16:36
gibi(as nothing else on the agenda for Open)16:36
dansmithartom: yeah, that's the thing I'd want to know16:36
dansmithI don't want to have /healthcheck, /useful_healthcheck, /no_serously_this_one, etc16:37
bnemecIf having them both on the same URL blocks having any healthcheck for the next two years then I think that's a bad approach.16:37
artomdansmith, well, yeah, but realistically how many are we going to have?16:37
bnemecI note that https://storyboard.openstack.org/#!/story/2001439 mentioned possibly different behavior for GET vs HEAD.16:37
artomdansmith, one simple, unauthenticated, unversioned, one "fancy", authenticated, versioned16:37
dansmithwe've already identified several levels..16:38
bnemecI have no idea if that's an API no-no though.16:38
bauzashonestly, I co-contributed to this change, but I'm not opinionated a single bit.16:38
gmanni think it should be doable with same url with extra 'backends' to check for oslo? but need to try16:38
dansmithartom: honestly, what does the simple unauth'd one tell you? that apache and mod_wsgi is working right?16:38
dansmithartom: is there any difference between hitting that check vs just the version manifest?16:38
artomdansmith, there isn't16:39
gmannextra configured 'backends'16:39
artomdansmith, the argument from operators is having every project have a common URL for that16:39
artomAnd not nova with /versions, neutron with /healthcheck, cinder with /status or whatever16:39
artom(I made up the last one)16:40
bauzashonestly, if we have different URLs between services, we don't need the healthcheck one16:40
dansmithcan't you hit the / on everyone's api and get the same result?16:40
artomdansmith, I dunno, can you?16:40
gmannnot all services have a / (versions) url ?16:40
artomzigo ^^ ?16:40
* zigo reads the backlog16:40
dansmithI don't really know what the oslo base bit gives us... I thought we could provide a function to generate the report or something. is that the case or not?16:40
dansmithgmann: don't they all redirect to something like the version doc? anyway, I'm not really suggesting that as an alternative, I'm just saying a "hello world" seems pointless to me16:41
bauzasdansmith: the only thing that would be nice for ops is that they can disable the healthcheck on their wishes16:41
dansmithbauzas: sorry, what?16:42
zigodansmith: You won't get the same result, no, you get a "300 multiple choice", that's not what operators need.16:42
zigoWe need a "200 ok" ...16:42
bauzasdansmith: the healthcheck API can return 'sorry, 503' if a file is provided16:42
gmanndansmith: we can implement extra plugins (other than the default one of file existence check) to generate the report and add in oslo a check of all plugins added for the healthcheck app16:42
dansmithokay I don't understand either of those fully16:43
bauzasthat's the only single bit that can help HAProxy more than just checking a port16:43
gmanndansmith: current default plugins are file checks.16:43
bauzasbut honestly, as a support engineer years ago, I wasn't trusting healthchecks16:43
zigoAnd the idea behind the file is so one can turn off the API in a nice way: tell Haproxy, I'm going to turn off the API... then really do it.16:43
gmannand yes, port with file16:43
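The file-based switch bauzas and zigo describe (answer 503 while a marker file exists so HAProxy drains the backend before the API is really stopped) can be illustrated with a plain WSGI app. This is a standard-library sketch only and not the oslo.middleware implementation; make_healthcheck_app and the disable_path argument are hypothetical names, and the response bodies are illustrative.

```python
import os


def make_healthcheck_app(disable_path):
    """Return a WSGI app: 503 while disable_path exists, 200 otherwise."""

    def app(environ, start_response):
        if os.path.exists(disable_path):
            # Operator created the marker file: tell the load balancer to drain.
            status, body = "503 Service Unavailable", b"DISABLED BY FILE"
        else:
            status, body = "200 OK", b"OK"
        headers = [
            ("Content-Type", "text/plain"),
            ("Content-Length", str(len(body))),
        ]
        start_response(status, headers)
        return [body]

    return app
```

Creating the marker file flips the answer to 503 without restarting anything, and removing it puts the backend back in rotation.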
dansmithbauzas: right because they tell you nothing?:)16:43
bauzasI preferred homemade checks based on logics16:43
bauzasfor my haproxy backends16:43
dansmithif the goal is really to have a completely pointless not-really-health-related common url across all projects then whatever16:44
zigodansmith: The point is having something to query for haproxy, nothing more, nothing less.16:44
gmannexactly, it should be 'yes healthy' means your request should be success (as per  general checks we did for minimum required things)16:45
zigoIf we're capable of providing more than that, great, but this shouldn't wait for spec, design, doc, test, implementation, etc.16:45
zigoMy original patch barely activated a feature we already have...16:46
gmannzigo: and if providing more leads to change the existing used one then also fine ?16:46
zigoYeah, great too ! :)16:46
bnemecAlso worth noting that the /healthcheck endpoint is already enabled for some services, so even if we decide to completely redesign it we can't ignore the existing one.16:46
zigoIf it becomes more reliable, that's bonus points.16:46
dansmithgmann: for zigo's use case, but people will write nagios plugins and other monitoring infra against this of course, so while zigo and others only care about the "200 OK" the devil is in the details, like it always is16:47
zigodansmith: Operators do know that this is not enough for monitoring.16:47
zigoI could send you my scripts if you like! :)16:48
dansmithsuper unfortunate that we called it /healthcheck don't you think?16:48
artomdansmith, which is why documenting what this actually is and its limitations is important, but I don't see that as a reason to not do it. It's an unobtrusive change.16:48
dansmithartom: of course, the code isn't the obtrusive part :)16:48
zigoWe can call it "/my-http-api-server-is-alive-and-haproxy-can-query-it" but that's a bit long to type ...16:48
artomWhat is? The time we're spending debating this? ;)16:48
artomdansmith, plus, it means you'll get to write another massively influential blog post about /healthcheck vs /ping vs /status, like your evacuate one ;)16:49
zigodansmith: For the monitoring, what we do with nova-api is actually querying https://${HOSTNAME}:8774/v2.1/servers and see if the monitoring instance is in the list for that project.16:50
zigoThat's much better than just checking /healthcheck of course.16:50
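The deeper monitoring check zigo mentions (list servers and confirm the monitoring instance is present) comes down to inspecting the JSON body returned by GET /v2.1/servers. A rough sketch of that last step, assuming the body has already been fetched with a valid token; instance_listed and the instance name used in the usage note are hypothetical, not zigo's actual script.

```python
import json


def instance_listed(response_body, instance_name):
    """True if a server with the given name appears in a list-servers response."""
    data = json.loads(response_body)
    # The list-servers response wraps its entries in a top-level "servers" array.
    servers = data.get("servers", [])
    return any(server.get("name") == instance_name for server in servers)
```

A monitoring plugin would raise an alert when instance_listed(body, "monitoring-vm") is false or when the request itself fails.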
dansmithI think what artom is saying is that any change that has few lines of code isn't worth discussing regardless of the actual impact16:50
gibimy opinion: consistency across services is good so I'm +1 on /healthcheck as of today returning a plain 200 OK. But let's have an agreement in a spec on how we extend the /healthcheck API if we want to extend that 200 OK with more information. I'm now OK to have the unauthed vs authed switch between simple 200 OK and complex healthcheck result16:50
artomdansmith, that's completely false and you know it :P16:50
artomThis is *adding* an *independent* thing that operators can use or not, at their leisure16:51
bnemecI guess I don't understand the huge drawback of having people write monitoring checks against a /healthcheck designed for such a thing versus them writing hacky checks against / that doesn't behave the way they want.16:51
artomThough I'll grant that the concern about evolving it is a valid one16:52
melwittif it's called /healthcheck, operators are going to expect it to check health to some extent. and not just be a liveness check (like checking for an open port or something)16:54
melwittso if that was not the intention, I agree the name choice is unfortunate16:54
zigomelwitt: In simple words: *no* ! :)16:54
gmannyeah and that is what i thought it was when i first saw it. i was not aware of the previous oslo spec discussion.16:55
gmannor until i saw the oslo code16:55
melwittzigo: what are you saying "no" about?16:55
zigoAs an operator, we do all sorts of things to check if everything is up, not just checking /healthcheck. If that is your concern, then we can further document that this is not (yet?) what it is for.16:55
artommelwitt, so put a .. warning:: in the documentation saying this is just making sure that the HTTP service is operational16:56
zigoartom: Right.16:56
zigo:)16:56
gibi3 minutes left, lets try to wrap it up here but continue it on #openstack-nova and/or in a spec16:56
melwittright, and so what is /healthcheck giving you beyond other checks like whether something is listening on port 8774 or that nova-api responds to http request?16:56
melwittwell, anyway, I think my point is clear. we can wrap it16:57
artommelwitt, the '200 OK' status - / (or /versions?) is "300 multiple choice"16:57
zigomelwitt: If you don't give haproxy some URL to query, it's going to connect to the port, then disconnect, which is very ugly.16:57
zigoSo we got to give it an URL, and that URL must reply "200 ok".16:58
zigoThat's what the /healthcheck is for ...16:58
melwittI understand that, just saying if this is not a healthcheck a different name would have been more appropriate16:58
melwittthis is implying that health is being checked, obviously16:58
artommelwitt, fair point16:59
gmannhow about /healthcheck -> all deeper checks and /healthcheck?https-only-check -> minimum check as proposed16:59
zigo"Thu May  7 16:58:59 2020 - SIGPIPE: writing to a closed pipe/socket/fd (probably the client disconnected) !!!"16:59
zigoThat's what I get constantly in my logs if I don't activate healthcheck stuff.16:59
artommelwitt, but in the interest of cross-project uniformity, and because we can't go back in time and other projects have merged this (for better or worse), our hands are kinda tied16:59
gmannand default is former one, do all deeper checks as this endpoint name suggest17:00
zigo(in my case, that's when using uwsgi)17:00
gibiOK. thank you folks. continue it on #openstack-nova17:00
gibi#endmeeting17:00
*** openstack changes topic to "OpenStack Meetings || https://wiki.openstack.org/wiki/Meetings/"17:00
openstackMeeting ended Thu May  7 17:00:25 2020 UTC.  Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4)17:00
openstackMinutes:        http://eavesdrop.openstack.org/meetings/nova/2020/nova.2020-05-07-16.00.html17:00
openstackMinutes (text): http://eavesdrop.openstack.org/meetings/nova/2020/nova.2020-05-07-16.00.txt17:00
openstackLog:            http://eavesdrop.openstack.org/meetings/nova/2020/nova.2020-05-07-16.00.log.html17:00
gmannartom: not sure what all projects merged and after a common agreement, i will say that was too early to get that in if that is merged especially when it was for common use case of all openstack services and its API.17:01
*** gmann is now known as gmann_afk17:25
*** ralonsoh has quit IRC17:51
*** jamesmcarthur has quit IRC17:53
*** jamesmcarthur has joined #openstack-meeting-317:59
*** jamesmcarthur has quit IRC18:03
*** jamesmcarthur has joined #openstack-meeting-318:39
*** e0ne has joined #openstack-meeting-318:40
*** e0ne has quit IRC18:56
*** gmann_afk is now known as gmann19:06
*** jamesmcarthur_ has joined #openstack-meeting-319:58
*** jamesmcarthur_ has quit IRC19:59
*** jamesmcarthur_ has joined #openstack-meeting-319:59
*** jamesmcarthur has quit IRC20:00
*** e0ne has joined #openstack-meeting-320:57
*** e0ne has quit IRC20:59
*** raildo has quit IRC21:45
*** jamesmcarthur_ has quit IRC22:13
*** jamesmcarthur has joined #openstack-meeting-322:13
*** slaweq has quit IRC22:14
*** jamesmcarthur has quit IRC22:19
*** slaweq has joined #openstack-meeting-322:20
*** slaweq has quit IRC22:24
*** jamesmcarthur has joined #openstack-meeting-322:33
*** jamesmcarthur has quit IRC22:37
*** jamesmcarthur has joined #openstack-meeting-322:38
*** jamesmcarthur has quit IRC22:42
*** jamesmcarthur has joined #openstack-meeting-323:01
*** jamesmcarthur has quit IRC23:04
*** jamesmcarthur has joined #openstack-meeting-323:05
*** jamesmcarthur has quit IRC23:11
*** jamesmcarthur has joined #openstack-meeting-323:11
*** spotz has quit IRC23:16
*** jamesmcarthur has quit IRC23:38
*** jamesmcarthur has joined #openstack-meeting-323:38
