Wednesday, 2025-10-22

opendevreview	Matthias Runge proposed openstack/governance master: Appoint jlarriba to Telemetry PTL https://review.opendev.org/c/openstack/governance/+/964516	07:35
opendevreview	Matthias Runge proposed openstack/governance master: Propose jlarriba to Telemetry PTL https://review.opendev.org/c/openstack/governance/+/964516	07:51
frickler	I cannot find any information about whether a PTL stepping down should be replaced via appointment or whether there should be an extra election. this is kind of vague "If an unexpected event occurs that doesn’t give you sufficient time to dedicate to the items above, it is your responsibility to step down and allow someone with more time to take over."	08:15
mnasiadka	Usually it was an appointment from what I see, but I agree we should make it more detailed.	08:21
opendevreview	Michal Nasiadka proposed openstack/governance master: Propose jlarriba to Telemetry PTL https://review.opendev.org/c/openstack/governance/+/964516	08:38
*** iurygregory_ is now known as iurygregory		10:57
fungi	in theory the tc members could decide to hold a special election and then appoint the winner, if they so chose	15:24
gouthamr	https://lists.openstack.org/archives/list/openstack-discuss@lists.openstack.org/thread/SZXCQIETIU7VP43EGYCXPWM4HIYENWX6/#SZXCQIETIU7VP43EGYCXPWM4HIYENWX6	16:57
gouthamr	https://review.opendev.org/c/openstack/governance/+/625537	16:58
gouthamr	^ we didn't go through an "appointment" as such - just a handover like mrunge is doing	16:58
gouthamr	^ sorry, my context message got eaten up by bad copy paste	16:59
gouthamr	we did have a similar situation in ~2018, when the mistral PTL needed to step down, and they handed the responsibilities over to a past PTL	16:59
gouthamr	i don't think we need a midterm election, i think it feels a bit heavy handed, we could ask the core reviewers to look at, and object to the governance change if they need to	17:01
fungi	agreed, unless there's more than one party expressing interest in being the interim ptl, a special election seems like overkill to me	17:04
sean-k-mooney	cardoe: i don't have much experience with it but i think gunicorn can be used instead of uwsgi in most cases	17:51
cardoe	That's my thought too.	18:09
cardoe	So uWSGI advertises it's on life support and not getting updates or features or eyes. The company behind it talks about other tech (like ASGI servers and Rust).	18:09
cardoe	I just don't want it to become another eventlet since the signaling is pretty clear.	18:10
cardoe	We've just got uWSGI recommended all over the place and in use all over the place.	18:10
clarkb	one approach could be to stop recommending any specific wsgi server and just pick something for our own upstream development needs.	18:16
clarkb	In theory these servers are interchangeable (though I know that there are sometimes different expectations about how the web service is loaded into the wsgi server)	18:16
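To illustrate the interchangeability point: a minimal sketch, assuming the service exposes a plain PEP 3333 callable named `application`. The module name `app` in the comments is hypothetical, and the `call_app` helper exists only to exercise the callable without starting a server.

```python
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """A server-agnostic WSGI callable (PEP 3333)."""
    body = b"hello from a server-agnostic WSGI app\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def call_app(app, path="/"):
    """Invoke a WSGI app in-process and return (status, headers, body)."""
    environ = {}
    setup_testing_defaults(environ)  # fills in the required WSGI keys
    environ["PATH_INFO"] = path
    captured = {}

    def start_response(status, response_headers, exc_info=None):
        captured["status"] = status
        captured["headers"] = dict(response_headers)

    body = b"".join(app(environ, start_response))
    return captured["status"], captured["headers"], body


# Typical ways to serve the same callable, assuming it lives in app.py:
#   gunicorn app:application
#   uwsgi --http :8080 --wsgi-file app.py
```

Nothing in the callable refers to the server that loads it, which is what makes swapping servers possible in principle. The differences clarkb mentions show up in how each server imports the module and what it passes at startup.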
cardoe	Well the "bug" with switching them around right now is with oslo_config and reading of config files.	18:17
cardoe	Cause service/project (maybe got the variables wrong) are optional to oslo_config initialization	18:18
cardoe	So some projects have got uWSGI specific start up to make it go back to loading /etc/$project/$project.conf	18:18
cardoe	If oslo_config isn't initialized with a service/project variable it uses /proc/self/name for $project in the above.	18:19
cardoe	I've been trying to submit changes to not have that.	18:20
clarkb	sure all the more reason to stop recommending one specific thing so that we stop writing wsgi specific server code	18:22
frickler	where is it actually getting recommended? in docs? IMO the most important step would be to make devstack support something else	18:46
frickler	which would likely require some volunteer to actually implement that	18:47
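The config-file problem cardoe describes above can be sketched as follows. This is a simplified, hypothetical model and not oslo_config's actual code: it assumes the fallback name is derived from the running program (here `sys.argv[0]`) and that only `/etc` locations are searched. The function name `candidate_config_files` is made up for illustration.

```python
import posixpath
import sys


def candidate_config_files(project=None, prog=None):
    """Return the config paths a simplified default search would try.

    Hypothetical model: the real library searches more locations.
    """
    if prog is None:
        # Assumed fallback: the name of the running program. Under a WSGI
        # server this is the server binary, not the service being loaded.
        prog = posixpath.basename(sys.argv[0])

    names = [name for name in (project, prog) if name]
    dirs = []
    if project:
        dirs.append(posixpath.join("/etc", project))
    dirs.append("/etc")

    paths = []
    for directory in dirs:
        for name in names:
            path = posixpath.join(directory, name + ".conf")
            if path not in paths:
                paths.append(path)
    return paths
```

With `project="nova"` the search includes `/etc/nova/nova.conf`. With no project and a program name of `uwsgi`, the only candidate is `/etc/uwsgi.conf`, which is why a service started without an explicit project needs server-specific startup code to find its own file.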
JayF	spotz[m]: https://bugs.launchpad.net/ironic/+bug/2129596 :(	20:01
clarkb	JayF: might be worth booting the upstream image to see if it is a problem there or just in the dib updated version (dib does change some things like the filesystem from xfs to ext4 by default iirc)	20:45
clarkb	JayF: it almost looks like the backing disk isn't big enough for the disk image. Could it be the size of the image grew and now you're over some implicit limit?	20:49
JayF	yeah, like I said in the ticket I think it could be a "needs more ram", but the image itself doesn't appear to have changed size	20:49
clarkb	not more ram. More disk	20:49
JayF	oh, so you're saying it's not corrupting on load while extracting at boot	20:50
JayF	that it's an invalid image because it's hitting disk space before?	20:50
clarkb	as a theory anyway	20:50
JayF	if that was the case, then it's more likely to be a bug that we didn't go "boom" in the build process	20:50
JayF	the real answer is that I don't have time to dig this deeper anytime soon, and it's another piece of evidence pointing towards centos being a less-stable platform to test against... or at least that's what it seems like now	20:51
clarkb	hrm though it's failing on op write which in initramfs is all in memory?	20:53
clarkb	so ya probably unlikely that the actual disk is too small	20:54
JayF	yeah, if it's out of "space" it's out of ram	20:54
JayF	and that feels unlikely given the disk image is the same size	20:54
clarkb	JayF: maybe check syslog logs on the job node to see if oomkiller or similar was invoked	20:55
clarkb	maybe the host just couldn't supply as much memory as you think it would	20:55
clarkb	(and if you were close to the edge and the distro updated pushing things over the limit...)	20:56
JayF	https://review.opendev.org/c/openstack/ironic-tempest-plugin/+/955799 is the job in question that Clif was invested in, and why I started digging, checking syslog	20:57
JayF	nothing of real interest in the syslog afaict	20:57
JayF	I will try a ram bump	20:57
JayF	pushed a ram bump and depended that patch on it	20:59
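The syslog check clarkb suggests above could look roughly like this. It is a minimal sketch that assumes the usual Linux kernel messages ("invoked oom-killer", "Out of memory: Killed process"); the function name `find_oom_lines` is hypothetical, and where the syslog file sits in the job logs is not specified here.

```python
import re

# Kernel messages commonly emitted when the OOM killer runs. Older kernels
# log "Kill process", newer ones "Killed process".
OOM_PATTERN = re.compile(
    r"invoked oom-killer|Out of memory: Kill(?:ed)? process",
    re.IGNORECASE,
)


def find_oom_lines(lines):
    """Return (line_number, text) for every line that mentions the OOM killer."""
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if OOM_PATTERN.search(line)
    ]
```

Usage is `find_oom_lines(open(path, errors="replace"))` against a downloaded syslog. An empty result only shows the kernel did not log an OOM kill on that host; it does not rule out memory pressure inside the guest.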
clarkb	JayF: are those loop devices the actual backingdisk for the baremetal node though? the ipa image is running in memory to write out the underlying disk image (cirros in this case I think) and that is the write to the loopback device that contains the disk mapping that fails?	21:15
JayF	it's never even booting into anaconda	21:16
JayF	that driver should boot the ramdisk into anaconda, which does an install	21:16
JayF	we're not even getting that far, we can't boot the kernel/ramdisk	21:16
clarkb	what is loop2 then?	21:16
JayF	I don't know fully. I'd buy that there's something else going on, but I haven't ID'd why it works on one and not the other	21:17
clarkb	looks like anaconda fetches a stage 2 image. Maybe loop2 is where it tries to put that?	21:18
JayF	the job that bumps the ram seems like it may be in better shape, but it's hard to tell until it hard passes or fails so I can get to all the logs	21:19
clarkb	side note your logs make zuul unhappy for some reason so I can't deep link into them	21:19
JayF	if that's the end case, it's still not great that we spent time on this kinda mid-release churn	21:19
clarkb	not sure why	21:19
JayF	oh?	21:19
clarkb	maybe it's the binary data in the log?	21:20
clarkb	though that looks like utf8 maybe that I don't have codepoints for	21:20
JayF	try the no-ansi log link	21:20
JayF	the filtered copy	21:20
JayF	or download the unfiltered and pipe it thru `less -R` (note: this is dangeresque for untrusted inputs)	21:20
clarkb	https://zuul.opendev.org/t/openstack/build/143ce59633184515a372f2cf6f500f80/log/controller/logs/ironic-bm-logs/node-0_no_ansi_2025-10-20-19%3A11%3A32_log.txt	21:21
JayF	oh, you mean that UI. I always click raw	21:21
JayF	I kinda hate the built-in log viewer	21:21
clarkb	right you can't deep link doing that	21:21
JayF	https://d17f8c35fcd162676eb8-c684a40c384a27e613f3f0e997584032.ssl.cf5.rackcdn.com/openstack/143ce59633184515a372f2cf6f500f80/controller/logs/ironic-bm-logs/node-0_no_ansi_2025-10-20-19%3A11%3A32_log.txt this works, yeah?	21:21
clarkb	I wish more people would link to it because from there you can easily go back to the change or you can go to raw etc. But if you start at raw you have to work extra hard to work back to the other zuul info and if you start at the change it's super ambiguous what is actually breaking and where to look	21:22
clarkb	yes the raw link works but that doesn't allow deep linking to specific lines	21:22
JayF	oh, I get what you're saying. Yeah that's never functionality I've used before and tbh didn't know existed until now	21:22
JayF	usually I pull it down and grep or load the raw and ^f then copy or screenshot relevant bits to share	21:23
clarkb	(just as an example you linked to a change above and said this is the job in question, but in reality it's a change that has run many many jobs a few of which have failed so I have to do detective work to figure out what you're looking at)	21:23
JayF	well, any the ironic-standalone-anaconda* job since 10/20 has failed	21:23
JayF	^ of	21:23
clarkb	yes I was able to figure it out. But it's like 8 extra steps for me to see actual build logs when given a change link	21:24
clarkb	I think it bugs me because everyone does it. I ask for build logs and get links to changes	21:24
clarkb	they aren't the same :)	21:24
JayF	doing devops through a web browser straw is painful no matter what :D	21:25
clarkb	the stage2 appears to be a centos 9 stream image fwiw	21:27
clarkb	(so potentially multiple locations a centos stream image update could impact things?)	21:29
JayF	ram bump is showing benefits early on	22:22
JayF	we must have been right at the cusp if that took us over the edge	22:22
