Wednesday, 2025-11-12

clarkbcardoe: got it. I suspect that we can change that to "please use an external wsgi server" or something along those lines and be less uwsgi specific00:05
clarkbwhich is probably a good first step if we think we'll be moving away from uwsgi (which I think is being forced on us)00:05
cardoeI mean it does the wrong thing for modern kernels and the answer is to locally patch.00:17
clarkboh I was only aware of it being bugfix-only with slow python support updates00:17
clarkbI didn't realize that it was actively harmful to use at this point.00:17
cardoeSo it’s really in a k8s environment. It reads the wrong stuff for memory. So it’ll OOM00:34
cardoeIt also poorly handles being killed after failed health checks. It stops servicing loops.00:37
clarkbis it not treating reload-on-rss properly because it reads some incorrect rss value?00:49
clarkbor maybe limit-as doesn't work correctly00:49
cardoeI haven't had the cycles to chase it down specifically.01:40
cardoeThe other issue we've got is, let's say for glance there are 10 workers in a process. Someone kicks off a bunch of image uploads and that one pod gets 10 of those. Now there's nothing answering the k8s health check, since it doesn't have any out-of-band health check, so it fails health checks. k8s will restart the pod, which is fine, but do it after the workers finish. nah. let's stop calling epoll() and drop everything on the ground.01:42
fricklerfun. though in that case I'd argue that the k8s check is bogus06:19
sean-k-mooneyclarkb: gunicorn is a good drop-in replacement for our current use of uwsgi07:49
sean-k-mooneyclarkb: i have been putting off trying to swap devstack to it because of other things on my plate but that is the immediate replacement i see for us07:50
sean-k-mooneyi think with oslo.wsgi we have the opportunity to recommend a more modern and maintained stack07:51
mnasiadkasean-k-mooney: last time I tried using gunicorn with Nova I had problems with Nova properly parsing CLI arguments - but that was like two cycles ago08:29
mnasiadkaBut yes, I agree we should have a supported alternative for uwsgi - that also can support ASGI08:31
sean-k-mooneymnasiadka: that's because using cli arguments is not technically supported by the wsgi spec and is something you should not do08:33
sean-k-mooneysupport for it in other wsgi servers is all non standard and not interoperable08:33
sean-k-mooneyin general you do not need to use cli args for nova's api08:34
mnasiadkaWell, we only use CLI arguments to point to config files, so I guess that’s not a problem08:34
mnasiadkaBut at that point in time we started moving everything to a "standard" Ansible role that generates config for uWSGI - since Devstack uses that and we didn’t want to reinvent the wheel08:34
sean-k-mooneyya so if you put the files in the default location it will just work and you may also be able to set the path via env vars08:35
sean-k-mooneyi think i added that but i would need to check08:35
mnasiadkaBut the good thing is that it’s now easy to add support for gunicorn - once devstack uses that for testing and we’re sure we’re not going to send operators to a bad place with using something else than uWSGI08:36
sean-k-mooneyhttps://github.com/openstack/nova/blob/master/nova/api/openstack/wsgi_app.py#L43-L5108:36
sean-k-mooneyso ya https://github.com/openstack/nova/commit/73fe84fa0ea6f7c7fa55544f6bce5326d87743a608:37
sean-k-mooneyit turns out that the changes i made were not really required in the end because we already supported OS_NOVA_CONFIG_DIR and oslo.config will already search for a /etc/nova/nova.conf.d/ directory and load all files there08:38
sean-k-mooneybut this gives you a little more control if you really need it08:38
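The environment-variable lookup described above can be sketched roughly as follows. This is a minimal illustration, not nova's code: only OS_NOVA_CONFIG_DIR and the /etc/nova default come from the discussion, while the function name and the default file list are hypothetical.

```python
import os

# Hypothetical defaults for illustration only; not copied from nova.
DEFAULT_CONFIG_DIR = "/etc/nova"
DEFAULT_CONFIG_FILES = ["nova.conf"]


def resolve_config_files(env=None):
    """Return the config file paths a WSGI app could load at import time.

    The directory comes from OS_NOVA_CONFIG_DIR when it is set, so no
    command line arguments need to pass through the WSGI server.
    """
    env = os.environ if env is None else env
    dirname = env.get("OS_NOVA_CONFIG_DIR", DEFAULT_CONFIG_DIR).strip()
    return [os.path.join(dirname, name) for name in DEFAULT_CONFIG_FILES]
```

Because the lookup happens inside the application, the same code path works regardless of which WSGI server hosts it.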
cardoefrickler: what other check other than seeing if the pod is still listening on the port it's supposed to be servicing would make sense?13:22
cardoesean-k-mooney: so for OpenStack Helm I've gotta pass paths right now and I've gotta come up with a better answer. The helm chart generates /etc/nova/nova.conf, the user can mount overrides into /etc/nova/nova.conf.d/, the init containers generate stuff into /tmp/nova/nova.conf.d/13:26
sean-k-mooneycardoe: if you're putting them in /etc/nova and /etc/nova/nova.conf.d13:32
cardoeThey're read-only in the pod13:33
sean-k-mooneyyou do not need to pass anything to nova to read those13:33
sean-k-mooneyyep that is fine13:33
sean-k-mooneywe do the same 13:33
cardoeSo the init container will write a config snippet that'll have like myip set.13:33
sean-k-mooneyyour issue is that you can't copy the content from /tmp to them13:33
cardoeYes. /etc is read-only.13:34
sean-k-mooneyya it probably should not be13:34
sean-k-mooneymyip does not need to be set by the way13:34
cardoeBad example then. I dunno what the nova one does.13:34
sean-k-mooneyit can be but you only need to set it to override the default13:34
cardoeIt's a projected volume so it's gonna be read-only in the pod.13:35
sean-k-mooneyprojected from a config map?13:35
cardoeone or more config maps and one or more secrets13:35
sean-k-mooneywhat we do is we use kolla's init system so we project the readonly config map to /var/lib/openstack/nova/config and then copy it to /etc/nova.*13:36
cardoeWe did it this way so that the user can have restart-less changes like debug level by editing the configmap.13:37
sean-k-mooneythat still needs a SIGHUP13:37
sean-k-mooneyto the pod to have it take effect13:37
sean-k-mooneywell to the nova process13:37
sean-k-mooneyin any case for the nova api and metadata you can define an environment variable to have it read additional config files or directories13:38
cardoeYeah but not everything is consistent.13:39
sean-k-mooneysure13:39
cardoeNot even all services read /etc/$service/$service.conf.d when run via WSGI servers still13:40
sean-k-mooneybut using cli options is not wsgi compliant13:40
cardoeI don't disagree.13:40
sean-k-mooneycardoe: they should if they are using oslo.config13:40
sean-k-mooneythat is done automatically13:40
cardoeIt's not13:40
sean-k-mooneyit is if they have initialized oslo.config properly. so swift won't13:41
cardoeIt's done automatically by reading /proc/self/procname13:41
cardoeThey have to call oslo.config and pass project="nova" or whatever project it is.13:41
sean-k-mooneyyep or have procname set properly 13:41
cardoeI've been submitting patches the past 2 cycles13:41
cardoeThen some of them parse sys.argv (cause oslo.config defaults to that) of what the WSGI server had sent to it.13:44
sean-k-mooneyyep nova will do that if it can13:44
sean-k-mooneybut again that's not standards compliant from a pure wsgi point of view13:44
sean-k-mooneybut it works for uwsgi and maybe apache13:45
cardoeyeah except when the args aren't intended for nova but the wsgi server itself and the flag is the same for nova and the wsgi server but with a different value or meaning13:45
sean-k-mooneywe are unfortunately still using apache https://github.com/openstack-k8s-operators/nova-operator/blob/main/templates/novaapi/config/httpd.conf but i'm hoping we can eventually use something lighter weight13:45
sean-k-mooneywe do at least use apache for tls termination so it's not entirely unhelpful13:46
sean-k-mooneycardoe: the reason that command line args were ever supported was really because of the eventlet webserver and the associated console script13:48
sean-k-mooneysince we were shipping a wsgi application and its hosting server as a single entry point we could forward arguments to the application13:49
cardoeyeah it makes sense13:49
cardoeJust the number of rando bug reports for starting up a process over the past few cycles has been surprising to me.13:50
sean-k-mooneyfor gunicorn you can pass -n to set the application name for what it's worth13:50
sean-k-mooneyhttps://docs.gunicorn.org/en/latest/run.html#commonly-used-arguments13:50
cardoeOpenStack Helm lets the user pick their own for uwsgi and gunicorn so that's part of the problem.13:50
sean-k-mooneyack, i think kolla semi recently moved from apache2 to uwsgi in the last 2-3 years13:51
sean-k-mooneyi think they only support one however now 13:51
sean-k-mooneyas in they decided they were porting and only supported both as a temporary measure13:51
sean-k-mooneyalthough i could be wrong13:52
sean-k-mooneyoh they still have both https://github.com/openstack/kolla-ansible/blob/master/ansible/roles/nova/templates/nova-api.json.j213:52
cardoeas far as the HUP, I've done something ghetto in a few services which is touch a file early on. Then with bash if the modification time of configs are newer, touch the magic file again and send PID 1 a HUP.13:54
cardoeSince k8s doesn't have a signaling path13:54
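The touch-and-HUP trick described above was done in bash; a rough Python equivalent might look like the sketch below. The function names are hypothetical, and the marker file path and the list of config paths are left to the caller. The `kill` parameter exists only so the signal delivery can be replaced when experimenting.

```python
import os
import signal


def configs_newer_than(marker, config_paths):
    """True if any config file was modified after the marker file."""
    marker_mtime = os.stat(marker).st_mtime_ns
    return any(os.stat(p).st_mtime_ns > marker_mtime for p in config_paths)


def check_and_reload(marker, config_paths, pid=1, kill=os.kill):
    """Touch the marker and send SIGHUP to pid when a config changed.

    Returns True when a reload signal was sent, False otherwise.
    """
    if not configs_newer_than(marker, config_paths):
        return False
    os.utime(marker, None)  # "touch": set the marker mtime to now
    kill(pid, signal.SIGHUP)
    return True
```

Run periodically from a sidecar or wrapper script, this signals PID 1 only when a projected config file has changed since the last check.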
sean-k-mooneycardoe: ya so for GMR you can just use a path based trigger13:55
sean-k-mooneythere is no reason oslo.config could not support it for config reloading13:55
cardoeGMR?13:56
sean-k-mooneyGuru meditation reports13:57
sean-k-mooneyhttps://docs.openstack.org/oslo.reports/latest/user/history.html#id3813:57
sean-k-mooneyin 1.11.0 oslo.reports gained the ability to trigger on file modification events13:58
sean-k-mooneythere is no reason that oslo.config could not do the same in principle13:58
sean-k-mooneyeither on a specific file or by watching all of the config files for changes13:58
sean-k-mooneyin our downstream installer we configure gmr to trigger on touching /var/lib/nova 13:59
sean-k-mooneyhttps://github.com/openstack-k8s-operators/nova-operator/blob/main/templates/nova.conf#L37813:59
sean-k-mooneyi think that uses inotify or similar behind the scenes14:00
cardoeYeah14:05
cardoeThat would make sense14:05
sean-k-mooneyfor what it's worth we do have a similar desire to allow some fields to update without a restart, mainly for password rotation, that sort of thing14:09
sean-k-mooneyright now we hash the config map and do a full pod restart anytime the content changes14:09
sean-k-mooneybut long term it would be nice if the reload was just built into oslo.config14:10
cardoeYeah that would make sense.14:16
cardoeSo hashing the configmap and restarting is what most of the OpenStack Helm (OSH) stuff does but I'm trying to make it possible to change some things without that.14:16
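The hash-and-restart approach mentioned by both deployers can be illustrated with a small sketch. It assumes the config map is available as a plain mapping of file name to file content; the function names are hypothetical and this is not taken from either project.

```python
import hashlib


def config_hash(data):
    """Return a stable SHA-256 digest over a mapping of name -> content.

    Keys are sorted so the digest does not depend on insertion order,
    and a NUL separator keeps adjacent fields from running together.
    """
    digest = hashlib.sha256()
    for key in sorted(data):
        digest.update(key.encode("utf-8"))
        digest.update(b"\0")
        digest.update(data[key].encode("utf-8"))
        digest.update(b"\0")
    return digest.hexdigest()


def needs_restart(previous_hash, data):
    """True when the content no longer matches the recorded hash."""
    return config_hash(data) != previous_hash
```

Any change to any key changes the digest, which is why this approach forces a full restart even for settings that could be reloaded in place.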
-opendevstatus- NOTICE: Zuul job log URLs for storage.*.cloud.ovh.net are temporarily returning an access denied/payment required error, but the provider has been engaged and is working to correct it14:50
clarkbcardoe: you could have uwsgi listen on a second port that isn't exposed maybe? I'm not arguing for uwsgi but usually there are workarounds like that with webservers in particular15:44
cardoeclarkb: all http-sockets run in the same loop so it won't service anymore17:21
cardoeThe best approach that folks have suggested is to talk to it via the master-fifo which is a named pipe. They're not wrong.17:22
cardoeuWSGI 2.1 will bring that master-fifo as a regular socket... but that version's been canceled.17:23
clarkbinteresting. Looks like there is also snmp support17:24
clarkbbut also looks like sigint is expected to be less bad than sigterm. Not sure if the differences are meaningful for say the glance upload problem17:24
cardoeI will say one of my bigger complaints with that glance problem was finally fixed in 2.0.31 so less of an argument out of me now.17:30
cardoeThe fix that sat in their PR queue for that was merged 30 days ago.17:33
cardoeIt had been authored by someone years ago.17:33
cardoeIt had been brought up on openstack-discuss a number of times.17:33
cardoeMy only point is that it seems very very passively maintained and I don't want to see us tied to another thing that we're going to struggle to migrate away from like eventlet in the future.17:35
cardoeThe "official" source download URL had its SSL certificate expired for nearly 3 weeks when I brought this up in the TC before. OpenStack projects switched to fetching it from files.pythonhosted.org17:36
clarkbthe main issue opendev has run into is that the compilation is not reliable on arm64. We ended up dropping arm64 as a workaround for uwsgi container image builds. But I also have some changes up to switch to a different wsgi server17:37
clarkbcardoe: I thought openstack consumes uwsgi from distro packages?17:37
clarkbanyway I'm the last person that would advocate for uwsgi. I'm actively trying to remove it from opendev17:37
cardoeThere were a couple things fetching it and building it.17:38
clarkbgranian is what I am experimenting with because it supports rsgi and asgi too so seems flexible. But frickler rightly called out it too appears to be maintained by a single person and could go the way of uwsgi quickly (though I think it is far more actively maintained today)17:39
cardoeyeah that sounds more flexible for the future.17:41
cardoeuwsgi also only has commits from one person cause the commercial entity behind it seems to be gone and their company website is just a redirect to the uwsgi github project page.17:42
cardoeThat one committer also very clearly works at a company that's doing AI stuff around opentelemetry17:42
clarkbyes I think that is why uwsgi is on life support now. granian is strictly better but may not be appropriate for openstack17:42
cardoeWhen the entity behind uwsgi used to be a web hosting company.17:43
clarkbgunicorn was mentioned earlier and is probably more similar to what we thought uwsgi was17:43
cardoeI'm still using uwsgi as well. I've not gone to anything else.17:43
clarkbI've just sent email about this, but I think we often confuse/conflate decisions made to achieve goals as static entities. Having a goal along the lines of "Openstack should be compatible with a well tested and performant wsgi server" is probably still a good goal to have. Uwsgi being that server is probably no longer addressing the goal.17:48
clarkball that to say I think it might be healthy for us to shift to a mindset where we consider goals first and whether or not the goals are still relevant and then whether or not the decisions we've made are still in alignment with those goals and adjust accordingly when making big changes17:48
clarkbwhen all we see is "uwsgi bad" or "prefer pytest" it is really easy to overlook the reasons that we may do something one way or another in the first place and create new unexpected problems when implementing changes unilaterally or without broader consideration for community needs. FWIW I think the uwsgi replacement process has been collaborative and open in the community so may17:50
clarkbbe a bad example17:50
clarkbbut also I think if we become more open to reevaluation then just like I can git revert a commit that goes sideways we build in an expectation of more agility in the first place. Which for a decision like "should we allow pytest" is probably super low risk to revert or pivot later17:52
sean-k-mooneyclarkb: nice email by the way :)18:23
sean-k-mooneyclarkb: on uwsgi replacement i'm not actually aware of any real effort yet to move off uwsgi18:24
sean-k-mooneywe have talked about it18:24
clarkbsean-k-mooney: ya I think most of the effort has been in evaluating potential alternatives18:24
sean-k-mooneybut i don't think anyone has actually made concrete proposals to do it other than using apache18:24
sean-k-mooneythe wsgi server in general should be replaceable18:25
gouthamr> clarkb: nice email by the way :) 18:25
gouthamr++18:25
sean-k-mooneyso i think this is more a case of PoCing it, adding support to devstack and just choosing a new server18:25
clarkbyes, though the code you write to glue the wsgi server to the application sometimes differs by wsgi implementation. https://review.opendev.org/c/opendev/system-config/+/944806/5/playbooks/roles/lodgeit/templates/docker-compose.yaml.j2 sort of points at that18:26
clarkbI suspect that is actually part of the issue here as we've long tried to ship that glue as a one size fits all (at least for uwsgi)18:26
sean-k-mooneyya so we used to use pbr to standardise that18:26
sean-k-mooneybut obviously that's not going to be a thing we continue to do18:27
clarkbya pressure from both sides forcing us to give up on that18:27
sean-k-mooneywe could perhaps centralise that in oslo.wsgi if we were to create that18:27
clarkbwhich is maybe a good sign we shouldn't bother18:27
sean-k-mooneyya maybe, i think for the most part the binding we use18:28
sean-k-mooneywon't really vary from service to service18:28
sean-k-mooneybut it can from server to server so even having an example for each might be enough18:28
mnaserfrom an operator side, it "shouldn't" matter -- but it ends up mattering somehow.  for example, neutron-server had some stuff where out of the box you couldn't just .. switch to wsgi because the old eventlet based server started up other tasks in the background/etc18:28
mnaseri am gonna guess switching to uwsgi helped us uncover all of that mess18:29
sean-k-mooneywell neutron was a bit special18:29
sean-k-mooneybecause they didnt have a pure api18:29
mnaseryeah so i think we are past that period, so having a pure wsgi entrypoint should be easy now18:29
sean-k-mooneythe neutron server was both the rest api and the conductor of long running and periodic tasks18:29
mnaserit would be a matter of flipping it from one server to another..18:29
sean-k-mooneyright but it should be now18:29
sean-k-mooneythey have now actually split the wsgi app and the rpc/conductor process18:30
mnaserand now i guess it can/could be as easy as a configure_wsgi api in devstack that would simply (maybe) use a different server depending on a localrc config18:31
mnaserand it should just work(tm)18:31
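The idea of one configure step that picks a server from a single setting could look roughly like this. It is only a sketch under stated assumptions: the function name, the `server` setting and the `myservice.wsgi:application` module path are hypothetical, and only commonly documented uwsgi and gunicorn command line options are used.

```python
def build_wsgi_command(server, app, host="127.0.0.1", port=8080, workers=2):
    """Return an argv list that starts `app` under the chosen WSGI server.

    `app` is a "module:callable" string. The application itself receives
    no command line arguments; everything here is addressed to the server.
    """
    if server == "uwsgi":
        return [
            "uwsgi", "--master",
            "--http-socket", f"{host}:{port}",
            "--module", app,
            "--processes", str(workers),
        ]
    if server == "gunicorn":
        return [
            "gunicorn",
            "--bind", f"{host}:{port}",
            "--workers", str(workers),
            app,
        ]
    raise ValueError(f"unsupported WSGI server: {server}")
```

For example, `build_wsgi_command("gunicorn", "myservice.wsgi:application")` yields a gunicorn invocation, and switching the first argument to `"uwsgi"` is the only change needed to swap servers, provided the service exposes a pure WSGI entry point.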
sean-k-mooneynow that they did the engineering work to separate it out then ya18:31
sean-k-mooneyi'm not sure if there are other projects that don't have the separation however that would have to do the same exercise18:32
sean-k-mooneywe have seen glance, watcher and others all have to fix the fact that when they ran under the eventlet server18:32
sean-k-mooneythey could do long running background tasks18:32
sean-k-mooneythat didn't actually fit with the wsgi request lifecycle18:33
mnaserbut that means that in theory they already dont work "with uwsgi" anyways18:33
sean-k-mooneycorrect18:33
sean-k-mooneythey did not18:33
sean-k-mooneyglance had many features that were just flat out broken18:33
sean-k-mooneyit took a while to fix it18:33
sean-k-mooneyneutron didn't really support running under uwsgi at all18:34
sean-k-mooneyor apache really18:34
sean-k-mooneythe eventlet removal caused many projects to reevaluate their architecture and fix things like this18:34
sean-k-mooneyhere is an example that we are fixing in watcher https://github.com/openstack/watcher/blob/master/watcher/api/scheduling.py18:35
clarkband going back to the big picture I think what I'm ultimately advocating for is a more proactive approach to identifying issues that we may face in the future with the decisions we've made in the past so that we can more proactively take action before it is super painful when you start making quick changes out of necessity18:36
sean-k-mooney+118:36
sean-k-mooneynot having all these decisions be urgent because we need to address them would be a nice change18:37
fungithe mailman community seems to prefer gunicorn overall, even though the container images we base our deployment on still rely on uwsgi19:22
