tonyb | Thanks to sean-k-mooney[m] I managed to reduce the list of stale/stuck/corrupt VMs from 28 to 12 (It took a while because I was *very* careful with my db manipulations) | 00:31 |
---|---|---|
tonyb | Now I have 12 instances that are in the building/scheduling vm/task state. | 00:32 |
tonyb | Some have entries in nova_api.instance_mappings pointing at cell0, some at the real cell (which has a NULL name?) | 00:33 |
tonyb | AFAICT the instances don't have any allocations in placement | 00:34 |
tonyb | What is the "most correct"/"least horrible" DB update I can do to let nova know those instances are gone? | 00:35 |
tonyb | I was thinking of something like update instances set vm_state='ERROR', task_state=???, updated_at=now(), deleted_at=now() where uuid in (....) ? | 00:39 |
clarkb | tonyb: I wonder if a delete would work if we just set vm_state to active | 02:17 |
tonyb | I didn't try that. | 03:16 |
opendevreview | Merged openstack/nova stable/2023.2: Updates glance fixture for create image https://review.opendev.org/c/openstack/nova/+/906088 | 05:10 |
opendevreview | Merged openstack/nova stable/2023.2: Fixes: bfv vm reboot ends up in an error state. https://review.opendev.org/c/openstack/nova/+/906089 | 06:40 |
sean-k-mooney[m] | tonyb: to mark an instance as deleted in nova you set deleted=id | 08:37 |
tonyb | thanks that's basically what I did. | 08:38 |
sean-k-mooney[m] | so there is no way to take an instance from cell0 and do anything other than delete it | 08:39 |
sean-k-mooney[m] | cell0 is basically a graveyard for instances we were unable to boot | 08:39 |
sean-k-mooney[m] | and they should not have any allocations in placement if they are in cell0 | 08:39 |
sean-k-mooney[m] | for the ones in scheduling they have not had a host selected so they will not be in any cell db | 08:40 |
sean-k-mooney[m] | they will just have a build request in the api db | 08:40 |
sean-k-mooney[m] | the ones in the real cell in building likely failed because of the rabbit outage | 08:41 |
sean-k-mooney[m] | i assume you just want them gone so you can boot new vms with nodepool | 08:41 |
tonyb | yup that's the aim. I did get rid of them | 08:42 |
sean-k-mooney[m] | if they are in building and are in a real cell then they likely have allocations, so you would mark them as deleted in the cell db then delete their allocations in placement | 08:42 |
sean-k-mooney[m] | ok in that case are they all now cleaned up? | 08:43 |
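The steps above can be sketched in SQL. This is a hedged sketch only: the column list follows nova's soft-delete convention and this conversation, so verify it against your nova release (and back up the database) before running anything; the uuids are placeholders.

```sql
-- In the cell database: soft-delete the stuck instances.
-- nova's convention, as sean-k-mooney notes, is deleted = id.
UPDATE instances
   SET deleted = id,
       deleted_at = NOW(),
       updated_at = NOW()
 WHERE uuid IN ('<uuid-1>', '<uuid-2>');
```

Then remove any leftover placement allocations for those instances, e.g. with the osc-placement plugin's `openstack resource provider allocation delete <instance-uuid>` (invocation assumed; check `openstack resource provider allocation --help`).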
tonyb | yup. I did discover some other nodes that have allocations but are gone. | 08:43 |
tonyb | I'll figure that out tomorrow, but it should be easy to find them in placement | 08:44 |
sean-k-mooney[m] | we have two related commands: nova-manage placement heal_allocations and audit | 08:44 |
sean-k-mooney[m] | the heal_allocations command creates allocations that are missing | 08:44 |
sean-k-mooney[m] | the audit command removes allocations that should no longer exist | 08:45 |
tonyb | oh! cool. | 08:45 |
sean-k-mooney[m] | and has a dry run mode by default that just prints them | 08:45 |
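Usage sketch for the two commands just mentioned (invocations and flags assumed from recent nova releases; confirm with `nova-manage placement --help` on your deployment):

```
nova-manage placement heal_allocations          # create allocations that are missing
nova-manage placement audit --verbose           # dry run by default: only report orphaned allocations
nova-manage placement audit --verbose --delete  # actually remove the orphaned allocations
```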
tonyb | that's perfect for tomorrow! | 08:45 |
tonyb | sean-k-mooney[m]: thanks for all your help | 08:45 |
sean-k-mooney[m] | https://github.com/openstack/nova/blob/stable/victoria/nova/cmd/manage.py#L2695 | 08:46 |
tonyb | sean-k-mooney[m]++ | 08:52 |
gibi | elodilles, sean-k-mooney[m]: could you check these backports please https://review.opendev.org/q/topic:%22power-mgmt-fixups%22+branch:stable/2023.2 | 09:14 |
gibi | the master revert landed yesterday (finally) | 09:15 |
sean-k-mooney[m] | sure just reviewing one of mels patchs ill look at those next | 09:15 |
gibi | thanks | 09:19 |
bauzas | sean-k-mooney: gibi: got some devstack issue with ovn, because the package is missing | 09:28 |
bauzas | (RHEL9.3 here) | 09:28 |
sean-k-mooney[m] | it's in a separate repo called fast datapath on rhel | 09:29 |
bauzas | when looking at https://github.com/openstack/devstack/blob/master/tools/fixup_stuff.sh I see that centos9 installs centos-release-openstack-victoria package | 09:29 |
sean-k-mooney[m] | or you can just enable the rdo repos | 09:29 |
sean-k-mooney[m] | i can give you the command for the rdo repos one sec | 09:29 |
bauzas | I can install rdo-ovn | 09:29 |
bauzas | it says this is a wrapper for OVN | 09:30 |
sean-k-mooney[m] | https://github.com/openstack-k8s-operators/edpm-ansible/blob/28f80d2303497972a1d3b493760c4ff1557b973f/roles/edpm_nova/molecule/default/prepare.yml#L40 | 09:31 |
sean-k-mooney[m] | if you don't have the repos then i believe that will fix it for you | 09:31 |
sean-k-mooney[m] | change -b antelope to master | 09:32 |
bauzas | https://paste.opendev.org/show/b95bS56avBrLTeoma0QU/ | 09:32 |
bauzas | that's what I have | 09:32 |
sean-k-mooney[m] | ok then devstack already enabled rdo for you | 09:33 |
bauzas | correct | 09:33 |
sean-k-mooney[m] | how did it fail exactly | 09:33 |
sean-k-mooney[m] | you could just go back to ml2/ovs by the way | 09:34 |
sean-k-mooney[m] | but if you want to keep ovn | 09:34 |
bauzas | https://paste.opendev.org/show/bkHbTVtFyFTMA34VV1B7/ | 09:34 |
sean-k-mooney[m] | i can take a look at the error with you in a few minutes | 09:34 |
bauzas | yeah I think I'll change my local.conf and play with ml2/ovs | 09:34 |
bauzas | (my local.conf is pretty simple, nothing is told about the networking) | 09:35 |
sean-k-mooney[m] | oh it's the metadata agent, not ovn | 09:35 |
sean-k-mooney[m] | you can always just turn that off and use config drive | 09:36 |
bauzas | that's my local.conf https://paste.opendev.org/show/bwqKtvfeZhUT9aL1kFoZ/ | 09:37 |
sean-k-mooney[m] | https://github.com/openstack/nova/blob/master/.zuul.yaml#L192-L201 | 09:37 |
sean-k-mooney[m] | that is how to swap to ml2/ovs | 09:37 |
bauzas | q-ovn-metadata-agent.service that's the service which failed | 09:38 |
sean-k-mooney[m] | oh you need https://github.com/openstack/nova/blob/master/.zuul.yaml#L186-L189 as well | 09:39 |
bauzas | oh wait, I have ovn installed | 09:39 |
sean-k-mooney[m] | yes | 09:39 |
sean-k-mooney[m] | it's the neutron metadata agent for ovn that failed | 09:39 |
sean-k-mooney[m] | not the ovn install | 09:39 |
bauzas | yeah | 09:40 |
bauzas | I can try to dig into the logs of that agent, if I'm able to find them | 09:40 |
sean-k-mooney[m] | they are in the journal like all the rest | 09:40 |
sean-k-mooney[m] | so just journalctl -u devstack@whatever | 09:40 |
sean-k-mooney[m] | im stepping away for 5-10 minutes to grab coffee but when im back if you want me to ssh in and take a look with you i can | 09:41 |
bauzas | yeah | 09:41 |
bauzas | and thanks | 09:41 |
bauzas | ahah, found the culprit https://paste.opendev.org/show/bFigNVupZWRpjlmuzXZ9/ | 09:48 |
sahid | o/ regarding live migrations, we are agree that we can live-migrate vm from compute node that are N-1 to N ? | 09:50 |
sahid | in a situation of upgrading version of openstack | 09:50 |
bauzas | sure, not the other way | 09:53 |
sahid | ack thanks | 09:58 |
sean-k-mooney | bauzas: ah the privsep helper | 10:12 |
sean-k-mooney | bauzas: so that issue is that it's not on your path | 10:13 |
sean-k-mooney | bauzas: i put a path fix in devstack to fix that at one point | 10:13 |
bauzas | [stack@lenovo-sr655-01 devstack]$ which privsep-helper | 10:14 |
bauzas | /usr/bin/which: no privsep-helper in (/opt/stack/.local/bin:/opt/stack/bin:/sbin:/bin:/usr/sbin:/usr/bin:/usr/local/sbin:/usr/local/bin) | 10:14 |
bauzas | hmmmm | 10:14 |
sean-k-mooney | it will be in the venv bin dir | 10:16 |
sean-k-mooney | if you have it disabled | 10:16 |
bauzas | I disabled the GLOBAL venv indeed | 10:16 |
sean-k-mooney | then it should be in /usr/local/bin but apparently not | 10:16 |
bauzas | I have oslo.privsep | 10:17 |
bauzas | lemme see where it's landed | 10:17 |
sean-k-mooney | https://github.com/openstack/devstack/blob/5c1736b78256f5da86a91c4489f43f8ba1bce224/stack.sh#L838 this is the workaround for using the venv | 10:17 |
sean-k-mooney | so you just need to do the same with where ever it is | 10:17 |
bauzas | I suspect that privsep got installed in the venv as I first started to stack.sh with this flag | 10:18 |
sean-k-mooney | ah ya so there are a bunch of things you have to manually clean up | 10:18 |
sean-k-mooney | if you did that | 10:18 |
sean-k-mooney | and removing those symlinks form /usr/local/bin is one of them | 10:19 |
bauzas | and after that, even when asking devstack to reclone, I think it got a problem | 10:19 |
sean-k-mooney | reclone is not enough | 10:19 |
sean-k-mooney | so you need to delete the venv dir | 10:19 |
sean-k-mooney | and then remove all the broken symlinks from /usr/local/bin | 10:19 |
bauzas | yeah | 10:20 |
bauzas | [stack@lenovo-sr655-01 devstack]$ ll /usr/local/bin/privsep-helper | 10:20 |
bauzas | lrwxrwxrwx. 1 root root 39 Jan 31 11:10 /usr/local/bin/privsep-helper -> /opt/stack/data/venv/bin/privsep-helper | 10:20 |
sean-k-mooney | yep | 10:20 |
sean-k-mooney | but neutron is probably not in the venv | 10:20 |
bauzas | okay, I'll reinstall the package | 10:20 |
bauzas | and I'll look at other symlinks in /usr/local/bin | 10:21 |
sean-k-mooney | as i said nuke the venv first then remove any broken links | 10:21 |
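The "nuke the venv, then prune the broken symlinks" step can be sketched like this. It is demonstrated against a scratch directory rather than the real /usr/local/bin, so nothing on the system is touched; the privsep-helper name is taken from the discussion above, and the dangling target path is a demo assumption.

```shell
# Scratch directory standing in for /usr/local/bin (demo assumption).
BIN_DIR=$(mktemp -d)
ln -s "$BIN_DIR/venv/bin/privsep-helper" "$BIN_DIR/privsep-helper"  # dangles: the "venv" was deleted
ln -s /bin/sh "$BIN_DIR/sh"                                         # a healthy link

# GNU find: -xtype l matches symlinks whose target does not exist
find "$BIN_DIR" -xtype l            # lists only the broken privsep-helper link
find "$BIN_DIR" -xtype l -delete    # removes it; healthy links are left alone
```

For the real cleanup, delete the devstack venv first (/opt/stack/data/venv in this discussion), then point the `find` at /usr/local/bin, and review the listing before adding `-delete`.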
sean-k-mooney | i didn't document the process; it took me about an hour to do properly on centos, but it's doable to make work | 10:22 |
bauzas | yeah | 10:23 |
bauzas | thanks for the help | 10:23 |
sean-k-mooney | note i had to remove some egg link files too | 10:24 |
sean-k-mooney | what i basically did was unstack.sh; clean.sh | 10:24 |
bauzas | yup, https://github.com/openstack/devstack/blob/master/tools/fixup_stuff.sh says it | 10:24 |
sean-k-mooney | remove all openstack related things from /usr/local/bin | 10:24 |
sean-k-mooney | and remove all openstack packages via pip and/or egg-link files | 10:25 |
sean-k-mooney | after that i could just stack again without the venv and it was fine | 10:26 |
sean-k-mooney | the venv is a huge improvement but going between each mode is a pain | 10:27 |
bauzas | yeah and clean.sh doesn't do all the cleanup :) | 10:27 |
bauzas | I recommend everyone stacking a new env every quarter :) | 10:27 |
sean-k-mooney | it apparently does not uninstall ovs and ovn, which i could have bet money on it doing in the past | 10:27 |
sean-k-mooney | that bit me a few days ago | 10:27 |
sean-k-mooney | i almost added that to clean but i had other things to do | 10:28 |
bauzas | sean-k-mooney: I wonder, I guess we can't easily run the zuul jobs on our local machines? | 10:28 |
sean-k-mooney | well... i can kind of | 10:29 |
bauzas | it would be simpler for a new developer to just hook their dev machine into some zuul registry so they could run a specific zuul job for installing devstack and its knobs | 10:29 |
sean-k-mooney | i wrote an ansible molecule scenario | 10:29 |
bauzas | yeah I remember that | 10:29 |
sean-k-mooney | that uses the upstream zuul job roles | 10:30 |
sean-k-mooney | https://github.com/SeanMooney/ard/blob/master/molecule/default/molecule.yml | 10:30 |
bauzas | ideally, I'd just use the .zuul.yaml definitions | 10:30 |
sean-k-mooney | ya that is not possible today | 10:30 |
sean-k-mooney | there was a proposal to do that with a thing called zuul-runner | 10:30 |
sean-k-mooney | clarkb: ^ that never actually happened right | 10:30 |
bauzas | interesting | 10:31 |
bauzas | I mean, don't get me wrong | 10:31 |
sean-k-mooney | bauzas: the idea was you would point it at zuul, give it the project name etc. and some resources to use, and it would run the zuul executor logic locally and run the job against those resources | 10:31 |
bauzas | the usecase I see here is that if dev contributors start using zuul roles for deploying their devstacks, they'd spend less effort configuring and installing devstack, but they could also give back by contributing to the role definitions | 10:32 |
sean-k-mooney | bauzas: if you have never done it, zuul is actually pretty simple to run on your laptop with a single docker-compose up | 10:32 |
bauzas | for a newcomer, trying to untangle devstack (if their install failed) is somewhat tricky | 10:32 |
*** elodilles is now known as elodilles_afk | 10:33 |
sean-k-mooney | it is but i still think it's an exercise they should do to understand how things work | 10:33 |
sean-k-mooney | but that's why i created the ard stuff | 10:33 |
bauzas | the more they'd stick to 'standard validated upstream roles', the better chances they have to get their nodes ready | 10:33 |
sean-k-mooney | did you know that devstack used to have a vagrant file | 10:33 |
bauzas | yup | 10:33 |
bauzas | I remember it | 10:33 |
sean-k-mooney | it might still exist | 10:33 |
sean-k-mooney | i never got it to work for what i wanted | 10:34 |
sean-k-mooney | so the stuff i have in the ard repo can be pointed at any host and it will give you a multi-node devstack | 10:34 |
sean-k-mooney | i used it to provision some hosts in china from ireland for gibi when he first joined redhat | 10:35 |
sean-k-mooney | i was looking at a zuul job yesterday when the timestamps suddenly went 5 hours into the past | 10:36 |
sean-k-mooney | when i realised zuul was running on the us east coast and the vms were in europe | 10:36 |
sean-k-mooney | i was pretty chuffed that zuul/ansible works that well across an ocean | 10:36 |
*** ravlew is now known as Guest1201 | 10:42 | |
*** ravlew1 is now known as ravlew | 10:43 | |
opendevreview | sean mooney proposed openstack/nova master: [WIP] add libvirt connection healtcheck https://review.opendev.org/c/openstack/nova/+/907424 | 12:58 |
opendevreview | Merged openstack/nova stable/2023.2: Revert "[pwmgmt]ignore missin governor when cpu_state used" https://review.opendev.org/c/openstack/nova/+/905672 | 13:02 |
opendevreview | Merged openstack/nova stable/2023.2: cpu: make governors to be optional https://review.opendev.org/c/openstack/nova/+/905673 | 13:02 |
bauzas | sean-k-mooney: fwiw, straight devstack install with GLOBAL_VENV=False worked like a charm with Centos 9 Stream | 13:59 |
bauzas | (ovn and all the likes) | 13:59 |
bauzas | no need to tweak anything | 13:59 |
sean-k-mooney | yep | 14:00 |
sean-k-mooney | centos 9 stream is a supported distro | 14:00 |
sean-k-mooney | rhel is not | 14:00 |
bauzas | and good news, I checked that the nvidia GRID driver correctly works with the latest C9S release | 14:00 |
sean-k-mooney | a lot of the checks for rpm distros explicitly don't check for rhel | 14:00 |
bauzas | that was my main concern ^ | 14:00 |
sean-k-mooney | ya | 14:00 |
sean-k-mooney | the kernels are close enough that it should install | 14:00 |
sean-k-mooney | but you never know | 14:01 |
bauzas | with mdev live migration, you need recent qemu and libvirt versions plus a recent kernel too | 14:01 |
bauzas | so I'm doublechecking the versions now with the Jan 30th c9s build | 14:01 |
sean-k-mooney | i think we should be fine. centos 9 stream has newer versions of qemu and libvirt than rhel | 14:01 |
sean-k-mooney | but if you need even newer you can enable the fedora virt preview repos | 14:02 |
sean-k-mooney | those have c9s builds too now | 14:02 |
sean-k-mooney | https://copr.fedorainfracloud.org/coprs/g/virtmaint-sig/virt-preview/ | 14:02 |
bauzas | libvirt-9.10 | 14:03 |
bauzas | qemu-kvm-common-8.2 | 14:03 |
sean-k-mooney | libvirt 10 came out 2 weeks ago but 9.10 is from november | 14:03 |
sean-k-mooney | libvirt 10 is in the copr repo | 14:04 |
sean-k-mooney | 10.0.0-3 | 14:04 |
sean-k-mooney | and qemu 2:8.2.0-6 | 14:04 |
sean-k-mooney | i think you should be good with the version you have to be honest | 14:05 |
gibi | elodilles_afk: sean-k-mooney: thanks for the +2s on the powermgmt backports. This is the (hopefully) last set this time to Antelope https://review.opendev.org/q/topic:%22power-mgmt-fixups%22+branch:stable/2023.1 | 14:20 |
bauzas | sean-k-mooney: sorry was in meeting but yeah we're fine https://specs.openstack.org/openstack/nova-specs/specs/2024.1/approved/libvirt-mdev-live-migrate.html#dependencies | 14:48 |
*** elodilles_afk is now known as elodilles | 15:21 | |
opendevreview | Fabian Wiesel proposed openstack/nova master: vmware: Integer division Python 2 -> 3 fix https://review.opendev.org/c/openstack/nova/+/907444 | 15:54 |
opendevreview | Sylvain Bauza proposed openstack/nova master: Augment the LibvirtLiveMigrateData object https://review.opendev.org/c/openstack/nova/+/904175 | 17:12 |
opendevreview | Sylvain Bauza proposed openstack/nova master: check both source and dest compute libvirt versions for mdev lv https://review.opendev.org/c/openstack/nova/+/904176 | 17:12 |
opendevreview | Sylvain Bauza proposed openstack/nova master: Check if destination can support the src mdev types https://review.opendev.org/c/openstack/nova/+/904177 | 17:12 |
opendevreview | Sylvain Bauza proposed openstack/nova master: Reserve mdevs to return to the source https://review.opendev.org/c/openstack/nova/+/904209 | 17:12 |
opendevreview | Sylvain Bauza proposed openstack/nova master: Modify the mdevs in the migrate XML https://review.opendev.org/c/openstack/nova/+/904258 | 17:12 |
bauzas | dansmith: just updated my series | 17:13 |
bauzas | about the edge cases of a failing migration that would leak the dict, I'm a bit puzzled on how to correctly test that | 17:13 |
bauzas | for sure I can add another functional test that would force a migration to break, maybe that would help but that would only check the case I already planned | 17:14 |
bauzas | actually this sounds doable, I'll come up with something | 17:15 |
sean-k-mooney | gibi: dansmith: related to the health check conversation: i already have a timestamp for when each healthcheck result was recorded, so i'm just going to add one more timestamp for the overall response | 17:27 |
dansmith | sean-k-mooney: cool | 17:27 |
dansmith | bauzas: ack | 17:27 |
sean-k-mooney | you could figure that out yourself on the client side, but it seems pretty trivial to do and it allows you one extra data point as a client: how long did the response take to get to me | 17:28 |
sean-k-mooney | we expect this to mainly run on localhost so that should be instant but if its not then it might be useful | 17:29 |
sean-k-mooney | melwitt: did you push an update to your series by the way. i wont get to it today but ill try and take a look again tomorrow | 17:30 |
melwitt | sean-k-mooney: no not yet :( I will later today | 17:31 |
sean-k-mooney | no rush | 17:31 |
opendevreview | Sylvain Bauza proposed openstack/nova master: check we don't leak in the func test https://review.opendev.org/c/openstack/nova/+/907465 | 17:45 |
bauzas | dansmith: made the functest, proving we don't leak ^ | 17:45 |
bauzas | I'll rebase my series by squashing that one with the change in question | 17:46 |
opendevreview | Sylvain Bauza proposed openstack/nova master: Reserve mdevs to return to the source https://review.opendev.org/c/openstack/nova/+/904209 | 17:47 |
opendevreview | Sylvain Bauza proposed openstack/nova master: Modify the mdevs in the migrate XML https://review.opendev.org/c/openstack/nova/+/904258 | 17:47 |
bauzas | there we go | 17:47 |
* bauzas bails out now | 17:49 | |
clarkb | sean-k-mooney: no there isn't a good way to run zuul jobs locally | 18:06 |
clarkb | sean-k-mooney: this is why tools like tox and nox and make etc are valuable and honestly good ideas regardless of your CI system | 18:06 |
sean-k-mooney | clarkb: well we would not remove them even if we could run zuul jobs locally | 18:23 |
clarkb | sure, but if those tools work properly you reduce a lot of problems | 18:24 |
clarkb | and the problems with not disabling selinux don't go away running a zuul job locally | 18:24 |
sean-k-mooney | hehe well that is just because bauzas is using an os that is not supported by the tool he was trying to use | 18:24 |
sean-k-mooney | the devstack is_fedora check i think works for centos and rhel but we have other checks that check for fedora and centos by name | 18:25 |
sean-k-mooney | there are docs that tell you to use specific functions in devstack to do this but people don't always follow the docs even if they wrote them :) | 18:25 |
clarkb | more generally there is a balancing act between being so tied to your CI system that the software only really works there and ensuring the software is generally deployable | 18:28 |
clarkb | a common place we see this is people updating devstack job vars rather than setting appropriate defaults in devstack proper | 18:28 |
clarkb | and we should try and avoid those tendencies and ensure the software can stand on its own | 18:28 |
sean-k-mooney | selinux disabling is done here https://github.com/openstack/devstack/blob/5c1736b78256f5da86a91c4489f43f8ba1bce224/tools/fixup_stuff.sh#L38-L43 guarded by is_fedora, so that should have worked | 18:29 |
clarkb | where "run this zuul job locally" is potentially useful is determining why builds have failed | 18:30 |
clarkb | it is less useful as a "run our software" option | 18:30 |
sean-k-mooney | clarkb: so the issue is i have been using devstack for so long now that i rarely have issues and i can fix them pretty quickly when i do | 18:32 |
clarkb | yup, we end up with blinders and biases. I'm just saying that in this situation I feel like the problem exists in devstack and not in a lack of tooling in zuul | 18:33 |
sean-k-mooney | zuul jobs can be written so they are runnable locally | 18:39 |
sean-k-mooney | they are just ansible after all | 18:39 |
sean-k-mooney | the only thing that makes that hard with our devstack ones is basically a meta playbook to run all the pre/post playbooks in the right order | 18:40 |
sean-k-mooney | if you could trivially generate that and the inventory, you could just update the ips and run it against your own vms | 18:41 |
sean-k-mooney | in the molecule scenario and playbooks i wrote in my ard repo i basically just did that | 18:41 |
sean-k-mooney | i reused the roles and statically combined them to give me something pretty close to the upstream jobs | 18:42 |
clarkb | right. I'm basically asserting that needing to resort to that for users to run devstack is a bug though | 18:48 |
clarkb | and we should try and make the tool easier to use rather than make CI a stand in | 18:48 |
sean-k-mooney | right but devstack is not hard to use | 19:21 |
sean-k-mooney | all the issues bauzas hit were from trying to use it in unsupported ways (an os that is not supported, after starting in a mode that is not supported on its nearest supported cousin) | 19:22 |
sean-k-mooney | dansmith: i have a terrible idea. | 19:42 |
dansmith | yeah? | 19:43 |
sean-k-mooney | the dbcounter uses a plugin https://opendev.org/openstack/devstack/src/branch/master/tools/dbcounter/dbcounter.py | 19:43 |
sean-k-mooney | do you think i could create one for the healthchecks to detect if a connection resets and is re-established | 19:43 |
sean-k-mooney | would that be an insane thing to even look into, or something to consider | 19:44 |
dansmith | yeah I think that's a bad idea, | 19:45 |
dansmith | because it has to be in the [db]/connection string to work | 19:45 |
sean-k-mooney | oh ok ya i didn't know that part | 19:45 |
sean-k-mooney | never mind then. i just saw it printing stats since i have journalctl open | 19:46 |
sean-k-mooney | and then i remembered it was a plugin | 19:46 |
sean-k-mooney | the example is literally just registering an event with a callback | 19:49 |
sean-k-mooney | https://docs.sqlalchemy.org/en/14/core/connections.html#sqlalchemy.engine.CreateEnginePlugin | 19:49 |
sean-k-mooney | but i missed the url part | 19:50 |
sean-k-mooney | although you can enable it a different way | 19:50 |
sean-k-mooney | anyway i was meant to finish a while ago so i'll leave that for now | 19:51 |
dansmith | feels like a monkeypatch to me :) | 19:53 |
sean-k-mooney | kind of but via a supported interface | 19:54 |
sean-k-mooney | as in it's an intentional extension point but | 19:54 |
JayF | I'll note Ironic uses this event interface directly to set PRAGMA on sqlite usages. | 19:54 |
sean-k-mooney | the db is the last part on my list | 19:54 |
JayF | So you could potentially get the same behavior without having to monkey patch | 19:54 |
sean-k-mooney | JayF: im not planning to monkey patch | 19:54 |
sean-k-mooney | JayF: what i'm trying to figure out is a clean way to detect if a database connection is lost and re-established | 19:55 |
JayF | Oh yeah, I figured that, but just pointing out that event interface is generally useful for stuff like that: https://github.com/openstack/ironic/blob/master/ironic/db/sqlalchemy/__init__.py#L28 | 19:55 |
sean-k-mooney | without having to do that everywhere we interact with the db | 19:55 |
JayF | Yeah, I'm saying: register events in the same way they do in that CreatePlugin | 19:55 |
JayF | that example in Ironic is actually pretty darn close to half of what you need in terms of registering the connect event | 19:55 |
sean-k-mooney | ya so i looked at doing that too | 19:56 |
sean-k-mooney | but it's a little tricky to ensure we do that given how we manage our engine facades | 19:56 |
JayF | Note that doesn't have a handle on anything | 19:57 |
JayF | Do you not have a place, ala __init__.py in a relevant module, you're guaranteed to be run in every thread? | 19:57 |
JayF | the dbapi_connection is provided for you on a ConnectionEvent | 19:57 |
sean-k-mooney | simple answer: no, we have at least 2 places like that, possibly more | 19:59 |
sean-k-mooney | for the cell db i believe all engine creation goes through this function https://github.com/openstack/nova/blob/master/nova/db/main/api.py#L138 | 20:00 |
sean-k-mooney | for the api db here https://github.com/openstack/nova/blob/master/nova/db/api/api.py#L50 | 20:01 |
sean-k-mooney | but there is a lot of code to review | 20:01 |
JayF | I'm saying I think your assumption that it has to be done *at* engine creation time is inaccurate | 20:01 |
JayF | that it can be done anytime *before* that engine creation, as well | 20:01 |
JayF | which is likely a much easier problem to solve | 20:01 |
JayF | but I'm not super familiar with nova, so there may be a different dimension here I'm missing | 20:02 |
sean-k-mooney | probably | 20:02 |
sean-k-mooney | i just need to review the options and then see where to go from there | 20:02 |
sean-k-mooney | i was looking at how we optionally enable osprofiler.sqlalchemy | 20:02 |
sean-k-mooney | as well | 20:02 |
sean-k-mooney | it has a handle_error function | 20:03 |
sean-k-mooney | https://opendev.org/openstack/osprofiler/src/branch/master/osprofiler/sqlalchemy.py#L99 | 20:03 |
sean-k-mooney | right now im just trying to see what the options are in this space | 20:04 |
sean-k-mooney | it uses https://github.com/openstack/oslo.db/blob/master/oslo_db/sqlalchemy/enginefacade.py#L792 append_on_engine_create | 20:05 |
JayF | oh, nice! I imagine we didn't use that because we only wanted to catch sqlite connections | 20:07 |
sean-k-mooney | https://docs.sqlalchemy.org/en/20/core/pooling.html#custom-legacy-pessimistic-ping | 20:07 |
sean-k-mooney | there are docs for how to listen for the engine_connect event | 20:08 |
sean-k-mooney | so i was wondering if i could combine some of those approaches to ensure that when an engine is created we register a listener for engine_connect | 20:09 |
sean-k-mooney | that just sets a db_connection healthcheck to pass/fail | 20:09 |
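A minimal sketch of the idea being discussed, using plain SQLAlchemy rather than nova's oslo.db enginefacade (an assumption for the demo; `_on_connect` and `connect_count` are illustrative names): register a pool-level "connect" listener on the engine, so every new DBAPI connection, including one re-established after an outage, is observed and could flip a health flag.

```python
from sqlalchemy import create_engine, event, text

connect_count = {"n": 0}
engine = create_engine("sqlite://")

@event.listens_for(engine, "connect")
def _on_connect(dbapi_connection, connection_record):
    # Fires once per fresh DBAPI connection; a second firing after the
    # pool was torn down means the connection was re-established.
    connect_count["n"] += 1

with engine.connect() as conn:
    conn.execute(text("SELECT 1"))

engine.dispose()  # drop all pooled connections (stand-in for an outage)

with engine.connect() as conn:
    conn.execute(text("SELECT 1"))

print(connect_count["n"])  # the initial connect plus the reconnect
```

A real health check would also hook events like "invalidate"/"close" to mark the check failed when the connection drops, not just count reconnects.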
sean-k-mooney | anyway ill let this simmer in the back of my mind and come back in a few days | 20:10 |
sean-k-mooney | thanks for the ironic link | 20:10 |
JayF | In Ironic, we don't have a handle on engine at all when we setup that event listener; so I'm a little confused as to if it really needs to be done per engine | 20:10 |
sean-k-mooney | ya im not sure either | 20:10 |
sean-k-mooney | i should probably just ask zeek about it | 20:11 |
JayF | My current belief is it is gloriously simple -- just make sure the listener is registered and you don't need to worry about doing something "per engine". If you find that is not correct, please let me know if you remember :D | 20:13 |
sean-k-mooney | i mean i can give it a try and just see what happens | 20:19 |
sean-k-mooney | it's either going to work or not | 20:19 |
sean-k-mooney | JayF: oh i see what's happening in the example: it's calling listens_for on an instance of a class | 20:27 |
sean-k-mooney | JayF: but ironic is calling it with a type in this case ConnectionEvents | 20:28 |
JayF | Yes, different types give you different args as well | 20:28 |
JayF | so a ConnectionEvent gives you a dbapi_connection object | 20:28 |
sean-k-mooney | yep ok this makes sense now | 20:28 |
sean-k-mooney | trying to register an event function for the engine_connect event that just raises a RuntimeError | 20:49 |
sean-k-mooney | does nothing, so either there is no engine_connect event when the nova conductor starts | 20:50 |
sean-k-mooney | or https://github.com/openstack/ironic/blob/master/ironic/db/sqlalchemy/__init__.py#L28 didn't seem to work for me | 20:50 |
sean-k-mooney | i would guess we had not used the db connection, if it was not for | 20:51 |
sean-k-mooney | DEBUG dbcounter [-] [2076032] Writing DB stats nova_cell1:SELECT=4,nova_cell1:UPDATE=1 { | 20:51 |
tonyb | sean-k-mooney[m]: Thanks again for all your help. If you're interested here's somewhat of a summary | 22:55 |
*** tosky_ is now known as tosky | 23:14 |