Friday, 2021-10-08

opendevreview	Merged openstack/nova master: tests: Walk database migrations in correct order https://review.opendev.org/c/openstack/nova/+/810291	00:30
opendevreview	Merged openstack/nova master: tests: Address some nits with database migration series https://review.opendev.org/c/openstack/nova/+/810856	00:31
opendevreview	Merged openstack/nova master: tests: Silence noise from database tests https://review.opendev.org/c/openstack/nova/+/810857	00:31
opendevreview	melanie witt proposed openstack/nova master: DNM Run against unmerged oslo.limit changes https://review.opendev.org/c/openstack/nova/+/812236	00:51
opendevreview	melanie witt proposed openstack/nova master: Add logic to enforce local api and db limits https://review.opendev.org/c/openstack/nova/+/712139	00:51
opendevreview	melanie witt proposed openstack/nova master: Enforce api and db limits https://review.opendev.org/c/openstack/nova/+/712142	00:51
opendevreview	melanie witt proposed openstack/nova master: Update quota_class APIs for db and api limits https://review.opendev.org/c/openstack/nova/+/712143	00:51
opendevreview	melanie witt proposed openstack/nova master: Update limit APIs https://review.opendev.org/c/openstack/nova/+/712707	00:51
opendevreview	melanie witt proposed openstack/nova master: Update quota sets APIs https://review.opendev.org/c/openstack/nova/+/712749	00:51
opendevreview	melanie witt proposed openstack/nova master: Tell oslo.limit how to count nova resources https://review.opendev.org/c/openstack/nova/+/713301	00:51
opendevreview	melanie witt proposed openstack/nova master: Enforce resource limits using oslo.limit https://review.opendev.org/c/openstack/nova/+/615180	00:51
opendevreview	melanie witt proposed openstack/nova master: Add legacy limits and usage to placement unified limits https://review.opendev.org/c/openstack/nova/+/713498	00:51
opendevreview	melanie witt proposed openstack/nova master: Update quota apis with keystone limits and usage https://review.opendev.org/c/openstack/nova/+/713499	00:51
opendevreview	melanie witt proposed openstack/nova master: Add reno for unified limits https://review.opendev.org/c/openstack/nova/+/715271	00:51
opendevreview	melanie witt proposed openstack/nova master: WIP Enable unified limits in the nova-next job https://review.opendev.org/c/openstack/nova/+/789963	00:51
gmann	melwitt: its green now https://review.opendev.org/c/openstack/devstack/+/812092/4	01:22
melwitt	gmann: \o/	01:22
opendevreview	Merged openstack/nova master: Reproduce bug 1945310 https://review.opendev.org/c/openstack/nova/+/811394	01:25
opendevreview	Merged openstack/nova master: Update min supported service version for Yoga https://review.opendev.org/c/openstack/nova/+/809932	02:36
opendevreview	pengzhang proposed openstack/nova master: the vif_name error leads virsh domifstat cant work https://review.opendev.org/c/openstack/nova/+/813119	03:25
opendevreview	pengzhang proposed openstack/nova master: the vif_name error leads virsh domifstat cant work https://review.opendev.org/c/openstack/nova/+/813119	05:46
opendevreview	norman shen proposed openstack/nova master: Recreate mdev devices according to placement https://review.opendev.org/c/openstack/nova/+/810220	07:33
bauzas	good (late) morning Nova	08:12
lyarwood	\o	08:15
gibi	o/	08:17
opendevreview	Dmitriy Rabotyagov proposed openstack/nova master: Ensure MAC addresses characters are in the same case https://review.opendev.org/c/openstack/nova/+/811947	09:16
opendevreview	Elod Illes proposed openstack/nova stable/train: Reject open redirection in the console proxy https://review.opendev.org/c/openstack/nova/+/791807	09:38
opendevreview	Elod Illes proposed openstack/nova stable/train: address open redirect with 3 forward slashes https://review.opendev.org/c/openstack/nova/+/806629	09:38
gibi	sean-k-mooney: what is the reason of not waiting for network-vif-plugged events during hard reboot? I hard reboot does unplug / plug but does not wait for plugged event. Compared that to boot where we do plug and then wait for plugged event before we let the VM to run	10:20
gibi	in normal ovs case I see that during hard reboot neutron eventually sends the unplugged and plugged events to us we just not waiting for it	10:20
sean-k-mooney	you are correct it will in most cases	10:24
sean-k-mooney	technically linux bridge might miss the removal and not send it	10:24
sean-k-mooney	for ovn and odl it wont	10:24
sean-k-mooney	starnard ml2/ovs it will	10:24
sean-k-mooney	so we dont wait because we need to know which ml2 driver bound the port and the behaivor of that driver	10:25
sean-k-mooney	gibi: i likely will be looking at reqorkign how we do this in yoga its one of the things i want to talk about at the ptg	10:25
sean-k-mooney	in train neutron started teilling us what driver bound the port	10:26
gibi	OK I see, thanks	10:26
sean-k-mooney	so we can actully no start using that info	10:26
sean-k-mooney	we just need to add it to our vif objects and use it where it makes sense	10:26
sean-k-mooney	e.g. wait for event on hard rebot if its ml2/ovs for example	10:26
sean-k-mooney	but not if its odl	10:26
gibi	I see	10:26
sean-k-mooney	gibi: the short answer why we dont wait is we dont rebind the ports on hard reboot so we only get event for backend that send plug time event and since beforfe train we could not tell if ovs port were plug time or bind time we could not wait safly	10:28
sean-k-mooney	gibi: are you seeing a issue wiht not waiting currently?	10:28
sean-k-mooney	or was this just courisity	10:28
gibi	I have a downstream case (on pike) where the nova unplug / replug behavior at openstack server start causes problems in ODL	10:29
gibi	but you said that odl won't even send us plugtime events	10:29
gibi	so waiting won't be the solution	10:30
gibi	wondering if for backends not supporting plug time events we need to rebind	10:30
sean-k-mooney	ah ml2/odl used to send events at bind time yes not at plug time but i think later version might have adress that	10:30
gibi	ohh, good info, then I have to figure out the odl version too	10:30
sean-k-mooney	give me to second and ill see if i can find that in the odl neutron driver	10:31
sean-k-mooney	basically to support bind time event they had to open a websock that odl could call back to to send event to neutron which neuton could then send to nova	10:32
sean-k-mooney	that was not part of the orginal ml2 driver	10:32
gibi	you mean for plug time event	10:32
sean-k-mooney	so in the legacy driver they just hardcoded the port as active as soon as the port was bound https://opendev.org/openstack/networking-odl/src/branch/master/networking_odl/ml2/legacy_port_binding.py#L58-L61	10:33
sean-k-mooney	which resulted in network-vif-plugged beign sent at bind time only	10:33
sean-k-mooney	then latere they added a way to get port status updates form odl https://opendev.org/openstack/networking-odl/commit/3432e0f875f8243a2c1febf2012bd0f58863c56b	10:34
sean-k-mooney	https://opendev.org/openstack/networking-odl/src/branch/master/networking_odl/ml2/port_status_update.py#L91-L95	10:34
sean-k-mooney	and then they would correctly send plug time events	10:34
sean-k-mooney	gibi: but pike might predate that	10:35
gibi	cool, I can check the dates then	10:35
gibi	sean-k-mooney: thanks a lot	10:35
gibi	so the solution is fresh enough networking-odl to have plug time event, and the in yoga think about waiting for plug time event at reboot as well	10:36
sean-k-mooney	no worries assuming we can confirm this works today we could add odl also to the list of "send plug events" or whatever and then start waiting	10:36
sean-k-mooney	yep	10:36
gibi	does neutron have odl jobs in the upstream ci?	10:36
sean-k-mooney	so we dont really need to wait for the ptg to discuss this i just have not had time to work on a spec	10:36
sean-k-mooney	good question im not sure	10:37
gibi	ok I will check that too	10:37
sean-k-mooney	i suspect not	10:37
sean-k-mooney	networking-odl should have jobs however	10:37
gibi	ahh good point that is a separate repo with it own jobs	10:37
sean-k-mooney	https://opendev.org/openstack/networking-odl/src/branch/master/.zuul.d/project.yaml#L51-L52	10:38
sean-k-mooney	it also has multi node versions	10:38
gibi	cool then we can have test coverage upstream about the change	10:38
sean-k-mooney	am on thing i have been talking to artom about is possible adding some other network toplogies to nova's gate as a periodic	10:39
sean-k-mooney	so right now we have ovs with iptables in nova-next, we have ovn for most things and linux bridge for some file changes	10:39
sean-k-mooney	it might be nice to add a weekly multi node odl and ml2/ovs job	10:40
sean-k-mooney	weekly multi-node job that is to tesst migration and resize ectra	10:40
gibi	I would expect that some of this coverage is already running on the neutron side	10:41
gibi	but if not then I agree we can add it	10:41
sean-k-mooney	that will tell them if we break things but only if the project is activly beign developed	10:41
sean-k-mooney	it looks like networking-odl for example only gets comiits every month or so	10:42
gibi	yeah, that is a good point too	10:42
gibi	then we need the periodic to trigger testing even if networking-odl is not changed	10:43
sean-k-mooney	yep just to make sure we did not break something. but really weekly is more then enought i think	10:43
sean-k-mooney	we can review it like the placment jobs	10:43
gibi	yepp, agree	10:44
sean-k-mooney	gibi: i would also like to add dpdk at some point but i have not been maintaining networking-ovs-dpdk so i will need to add devstack support for dpdk or fix it before thats even posible at this poing	10:45
sean-k-mooney	*point	10:45
gibi	I support that idea too. Downstream we use dpdk a lot, so would be nice to get coverage for it upstream.	10:46
sean-k-mooney	fortunetly on the dpdk front its support by distros now so its now just a matter of turning it on and seting a few config values but still need to sit down and do that	10:46
gibi	let me know if I can help	10:46
sean-k-mooney	sure will do. when i get around to workign on the devstack patches ill let you know	10:47
opendevreview	sean mooney proposed openstack/nova master: Ensure MAC addresses characters are in the same case https://review.opendev.org/c/openstack/nova/+/811947	11:09
sean-k-mooney	noonedeadpunk: hopefully you dont mind me updateing ^	11:14
sean-k-mooney	gibi: can you take a look at ^ when you have time	11:15
gibi	ack I will try	11:16
* gibi gathered a big backlog of task this week, craxy		11:16
noonedeadpunk	sure I'm not!	11:19
opendevreview	Lee Yarwood proposed openstack/nova-specs master: WIP libvirt: Allow Manila shares to be directly attached to instances https://review.opendev.org/c/openstack/nova-specs/+/813180	12:20
*** lbragstad_ is now known as lbragstad		13:51
lyarwood	stupid Friday question but can anyone think of an API where we've changed the return code in a microversion? I want to fix the POST os-volume_attachments method to return 202 but I can't think of the best way to do this	14:37
lyarwood	ah nvm I see just duplicate the actual method and decorate it	14:39
bauzas	lyarwood: correct, I was about to point it to you	15:00
bauzas	(sorry for the late ping, kids taxi)	15:00
bauzas	lyarwood: do you want to discuss on your spec ?	15:00
stephenfin	lyarwood: There are quite a few APIs that suffer from that issue. Are you just going to do the one or multiple?	15:00
bauzas	lyarwood: fwiw, I'm leaving in 25 mins for the weekend	15:04
lyarwood	bauzas: sorry was just getting a drink, happy to chat now if you had any questions etc	15:06
bauzas	I briefly looked at your spec	15:06
lyarwood	stephenfin: just os-volume_attachments for now but I'll likely start to address others once I've worked this out	15:07
lyarwood	I'm not sure if people would prefer a single microversion for everything etc	15:07
dansmith	so, I'm pretty sure that we originally said we wouldn't do microversions for things like this	15:08
bauzas	if 500, yes	15:08
lyarwood	that would make my life easier	15:08
dansmith	because nobody is going to opt-in to a new return code, or opt out (back?) to the old one	15:08
sean-k-mooney	dont we requrie a microversion if you are chaning the reponcoe code to one that is not currently used	15:08
bauzas	I mean, 500 to something clearer, not a microversion	15:08
dansmith	if you're changing other things about the api, then doing the return code as part of it makes sense	15:08
dansmith	but not just a "2.98: change a bunch of 200s to 202s"	15:08
bauzas	sec, giving you the API rules	15:08
sean-k-mooney	well 200 to 202 is blocking to async	15:09
lyarwood	I wonder why we have TODOs everywhere for this then	15:09
dansmith	sean-k-mooney: yeah, I'm not saying do it without a microversion, I'm saying don't do it unless there's something else changing	15:09
lyarwood	oh sorry I see	15:09
sean-k-mooney	https://developer.mozilla.org/en-US/docs/Web/HTTP/Status/202	15:09
lyarwood	well tbh it's never going to get done in that case	15:09
sean-k-mooney	lyarwood: you want to move this to be an async api yes?	15:10
dansmith	if we	15:10
bauzas	https://specs.openstack.org/openstack/api-wg/guidelines/microversion_specification.html for the API general rules	15:10
lyarwood	sean-k-mooney: it already is	15:10
sean-k-mooney	oh then 200 is wrong ya	15:10
bauzas	https://docs.openstack.org/nova/latest/contributor/microversions.html#when-do-i-need-a-new-microversion for nova specifics	15:10
lyarwood	yes	15:10
dansmith	are moving it from sync to async, then of course it makes sense to change, but just a microversion for changing the return code but not the behavior is kinda silly	15:11
sean-k-mooney	right	15:11
sean-k-mooney	im more ok with not haveing a microversion	15:11
sean-k-mooney	if we are not changing behavior	15:11
sean-k-mooney	just the return code	15:11
dansmith	I'm not saying we should do that	15:11
bauzas	technically, we agreed on asking for a microversion https://docs.openstack.org/nova/latest/_images/graphviz-a4c9682faf139450eed50fdcc7dd06df5677c197.png	15:11
bauzas	unless it was changing from HTTP500	15:12
dansmith	I'm saying returning 200 for an async api that has always been async is unfortunate, but not that big of a deal	15:12
sean-k-mooney	dansmith: you are say until we modify tese again dont change it	15:12
dansmith	sean-k-mooney: yes	15:12
dansmith	nobody is going to opt-in to "no new behavior but a more accurate return value"	15:12
sean-k-mooney	i think that is also reasonable becasue of ^	15:12
bauzas	but we mention something :	15:12
bauzas	"	15:12
bauzas	The exception to not needing a microversion when returning a previously unspecified error code is the 400, 403, 404 and 415 cases. This is considered OK to return even if previously unspecified in the code since it’s implied given keystone authentication can fail with a 403 and API validation can fail with a 400 for invalid json request body. Request to url/resource that does not exist always fails with 404. Invalid content typ	15:12
bauzas	es are handled before API methods are called which results in a 415.	15:12
bauzas	"	15:12
lyarwood	right but at least it's fixed later when they opt-in to some other change	15:12
dansmith	bauzas: that has nothing to do with this case	15:13
bauzas	nothing tells us to not ask for a microversion if changing from 200	15:13
bauzas	dansmith: sure, I was just looking at what we wrote :)	15:13
bauzas	to see whether we said "cool if 200 -> 202 and async"	15:13
sean-k-mooney	bauzas: under that policy we sould have to do a microversion but there is no untily in changing the repssonce doe and making it opt in	15:13
dansmith	lyarwood: right but all of those kinds of death-by-1000-cuts things impact people, which is why we didn't do a massive cleanup early on	15:13
dansmith	lyarwood: right now if you know that api returns 200 and you check for 200, then returning 202 could break client code even though absolutely nothing has changed	15:14
dansmith	it'd be bad client code, granted, but.. it's pain for pretty much no gain	15:14
bauzas	that's the point ^	15:14
lyarwood	dansmith: and how does that change anything by including it with other changes later on?	15:14
lyarwood	dansmith: bad clients will still break	15:14
sean-k-mooney	lyarwood: in the ohter api change later the clinet needed to be updated anyway	15:15
bauzas	if clients expect a 200 on a call, we can't leave them wondering now why they get something else	15:15
sean-k-mooney	e.g. there was already work for them to do	15:15
dansmith	lyarwood: well, right, I'm saying it's not worth changing like ever, unless some other behavior changes that would require a behavioral change anyway	15:15
dansmith	it's a developer purity thing, like many of our TODOs everywhere, but unless it impacts proper client behavior, it's more likely to break things than fix them, IMHO	15:16
lyarwood	at least this way we could change all of them in one go and fix the same issue in the clients in one go	15:16
lyarwood	instead you're suggesting different people fix them one by one at different times	15:16
dansmith	when we started out do the v3 API, it was all this purity stuff, and we literally unrolled it all when we realized we were just making pain for people	15:16
dansmith	lyarwood: again, I'm not	15:16
lyarwood	okay it's Friday I'm likely missing something but given people don't want this I'm going to stop and move on	15:18
sean-k-mooney	lyarwood: when we have done this clean up in the past we have paired it with other changes with the logic being a client tha want to use that new feature via a microversion need to be updated anyway so they can adapt to both change at the same time.	15:18
dansmith	let me ask this way: what client behavior is going to be wrong if we return 200 that will be right if we return 202?	15:18
dansmith	if there's something there, then it's worth changing	15:18
dansmith	sean-k-mooney: right	15:18
lyarwood	tbh with 200 to 202 it's not our clients that I'm worried about	15:19
lyarwood	it's other callers outside of them and their understanding of the API	15:19
dansmith	lyarwood: right, but ... what's the behavior or understanding that will be wrong?	15:20
lyarwood	the CSI plugin in k8s for OpenStack for example has been misusing it for a while	15:20
sean-k-mooney	your assumeing they are reading the reponce code but not the docs that say this is an async api	15:20
lyarwood	dansmith: assuming it's sync and not polling	15:20
dansmith	meaning, generally 202 means "if you query this thing right after the PUT/POST then you will not see it updated so don't expect it to be" .. is it that? or is it "this operation takes a while?"	15:20
dansmith	lyarwood: but is the actual API operations sync and just not the attachment?	15:21
lyarwood	we return after we cast to the compute to do the actual attachment	15:22
sean-k-mooney	dansmith: i think k8s has been assuming that the attachment was completed when it got the 200	15:22
lyarwood	so it takes a while, but the CSI driver would start polling Cinder instead of Nova and that has caused various issues	15:22
lyarwood	because they assumed the attachment in Nova was done at that point	15:22
dansmith	I'm not sure that	15:23
dansmith	is reasonable behavior to assume from either 200 or 202 though	15:23
dansmith	you're giving it attachment information, which it is storing, and that triggers some behavior right?	15:23
dansmith	like, IIRC, 200 is for "I've stored your object, here's the ID you can use to get it again" and 202 is for "I will store that later, but I can't necessarily give you its identifier yet" right?	15:24
sean-k-mooney	for a 202 they cannot assuem its complete and should expect to poll nova for the compleation	15:24
lyarwood	dansmith: yeah but my understanding with 200 was that processing was complete with the return	15:25
sean-k-mooney	for 200 generally you would assume it blocks until its completed but its valid to retrun before that	15:25
lyarwood	dansmith: and as sean-k-mooney said, 202 requires additional processing	15:25
dansmith	I think that's true of the REST object and not necessarily the actions implied from the thing you're storing	15:25
sean-k-mooney	lyarwood: has the k8s csi plugin be fixed to pool nova by the way	15:26
dansmith	but again, as sean-k-mooney said, this is really a result of them not reading the docs that say it's async and implying which part of the operation is synchronous from the 200	15:26
sean-k-mooney	regardeless of if we cahnge the api	15:26
lyarwood	sean-k-mooney: https://github.com/kubernetes/cloud-provider-openstack/issues/1645 not yet	15:26
sean-k-mooney	ok but they are aware of it and it will get eventually fixed	15:27
lyarwood	yup sure, just outlining my motivation to fix this	15:27
* lyarwood looks at the Nova bug backlog and laughs		15:27
lyarwood	;)	15:27
sean-k-mooney	:)	15:27
sean-k-mooney	i assume we have a customer hitting this so that normally seed up fixing things	15:28
dansmith	to be clear, making it 202 won't fix this, we're just assuming the human that wrote the code would have assumed they should do something different if the code was 202 right?	15:28
lyarwood	yup that's the assumption, it wouldn't fix anything in CSI	15:28
lyarwood	I got triggered by the TODO today and wanted to take a swing a killing it	15:29
sean-k-mooney	if we wnated to fix CSI issue in nova we would have to block	15:29
dansmith	so, I mean.. we're like changing something that could easily break other clients that did read the docs, based on the assumption that someone who didn't would have done the right thing if it was different	15:29
sean-k-mooney	wich i dont think we want to do	15:29
dansmith	pretty meh :)	15:29
bauzas	ok, sorry guys, I need to bail out from now	15:30
lyarwood	bauzas: \o if you want to chat on Monday ping me an email, I should have time to chat once I'm on London	15:31
lyarwood	in*	15:31
bauzas	lyarwood: don't worry, I'll use the power of community brainstorm to move on with your spec ;)	15:32
dansmith	I think we've all put on a few pandemic pounds, but being "on london" would require a substantial increase in waist size, I think	15:32
* lyarwood looks at 16 month old child		15:32
lyarwood	dansmith: it has been a rough 16 months, it wouldn't take much more	15:33
dansmith	heh	15:33
opendevreview	Julia Kreger proposed openstack/nova master: Ignore plug_vifs on the ironic driver https://review.opendev.org/c/openstack/nova/+/813263	22:23
*** tbachman is now known as Guest2264		23:35

Generated by irclog2html.py 2.17.2 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!