| *** ysandeep|out is now known as ysandeep|rover | 04:42 | |
|---|---|---|
| noonedeadpunk | mornings | 07:12 |
| jrosser | good morning | 07:38 |
| foutatoro | hello jrosser | 07:57 |
| jrosser | hello | 07:57 |
| damiandabrowski[m] | morning folks! | 07:58 |
| *** ysandeep|rover is now known as ysandeep|rover|lunch | 08:49 | |
| admin1 | morning | 09:20 |
| *** ysandeep|rover|lunch is now known as ysandeep|rover | 09:47 | |
| mgariepy | good morning everyone | 11:23 |
| *** dviroel|afk is now known as dviroel | 11:28 | |
| opendevreview | Marc GariƩpy proposed openstack/openstack-ansible-os_tempest master: [DNM] testing if all the tests are still passing. https://review.opendev.org/c/openstack/openstack-ansible-os_tempest/+/841257 | 12:22 |
| mgariepy | noonedeadpunk, uca doesnt have tempest plugin package. jammy does have some but i guess it's only in universe and won't stay up to date anyway. | 12:23 |
| noonedeadpunk | we were using source install for ubuntu tempest regardless of whether it's a source or distro install | 12:25 |
| noonedeadpunk | I thought we had something that prevented this from failing | 12:26 |
| noonedeadpunk | maybe we can disable building wheels on ubuntu, when it's distro install | 12:26 |
| mgariepy | i think the hard-coded "source" install does fix it | 12:26 |
| mgariepy | we will see. | 12:27 |
| noonedeadpunk | but we don't have repo container when rest is distro install? do we? | 12:27 |
| mgariepy | indeed we do not. | 12:27 |
| mgariepy | let's see if it passes. if not i'll debug it. | 12:28 |
| noonedeadpunk | then it should fail on attempt to get constraints file from repo container | 12:29 |
| mgariepy | tempest was installing from source on distro install test a couple weeks ago. | 12:29 |
| noonedeadpunk | well. I dropped some things maybe :D | 12:30 |
| mgariepy | lol. maybe that's why i'm re-testing the role haha | 12:30 |
| noonedeadpunk | like with https://review.opendev.org/c/openstack/openstack-ansible/+/837845 | 12:31 |
| noonedeadpunk | But I don't see what would result in the issues... | 12:31 |
| noonedeadpunk | Maybe we also fixed not deploying repo container for distro installs when were merging gluster | 12:32 |
| mgariepy | well me neither. | 12:32 |
| mgariepy | let's wait for the test result. | 12:32 |
| noonedeadpunk | but eventually this fails only for tempest role | 12:32 |
| noonedeadpunk | which is really interesting | 12:32 |
| mgariepy | we do have another role that doesn't support distro install. | 12:32 |
| mgariepy | gnocchi. | 12:33 |
| noonedeadpunk | I guess we should just drop distro support there? | 12:36 |
| noonedeadpunk | or whole telemetry does support it? | 12:36 |
| mgariepy | i have no idea if there is gnocchi in uca or not. it's getting really hard i think to have all our roles patched at the same time | 12:37 |
| mgariepy | there are always one or 2 or 4 that are left behind. | 12:37 |
| lowercase | I'm finally getting to a place where I feel comfortable uploading my work with fluentd, openstack and loki into a public repo. What's the repo where this would all go. The one that had the elk configurations and such. | 12:45 |
| jrosser | lowercase: openstack-ansible-ops is the repo for this sort of thing | 12:47 |
| mgariepy | foutatoro, did you recover your cluster? | 12:48 |
| foutatoro | mgariepy, good morning. | 12:56 |
| foutatoro | mgariepy: not yet, I have a really strange issue. previous VM disks seem to be in the ceph vms pool but I can't list them nor attach them to the appropriate VM | 12:58 |
| foutatoro | https://paste.opendev.org/show/b1Mhw44QtAMH8qlQVTBL/ | 12:58 |
| lowercase | 4 in (since 6M) | 12:59 |
| lowercase | did you just recover an osd? | 12:59 |
| mgariepy | 6M i guess it month./ | 12:59 |
| mgariepy | it's 6 months** | 13:00 |
| lowercase | then why are the pgs degraded? | 13:00 |
| lowercase | if he didn't lose an osd | 13:00 |
| foutatoro | lowercase: I've 4 osds, this is a pre-prod | 13:00 |
| mgariepy | was an osd out for a long time ? | 13:01 |
| foutatoro | mgariepy: no | 13:01 |
| lowercase | what happened 14 hours ago | 13:01 |
| foutatoro | lowercase: due to an incident all infra hosts were restarted | 13:02 |
| lowercase | do your infra hosts also host osds? | 13:02 |
| foutatoro | yes | 13:03 |
| lowercase | okay, so all osds were offline 14 hours ago | 13:03 |
| foutatoro | exactly | 13:03 |
| lowercase | which makes a 6 month uptime for that osd impossible. So what happened 6 minutes ago? | 13:04 |
| foutatoro | since the restart the cinder-volume service state is down also | 13:04 |
| jrosser | the backend is down, not the service | 13:04 |
| foutatoro | nothing happens 6 minutes ago | 13:04 |
| lowercase | your ceph health status says otherwise | 13:04 |
| foutatoro | lowercase mgariepy: is there a way to download rbd objects as qcow2? | 13:08 |
| mgariepy | foutatoro, you can copy the image from ceph yes. | 13:09 |
| lowercase | okay, i was wrong. mgariepy was correct. 12 osds: 11 up (since 2m), 12 in (since 10w) | 13:10 |
| lowercase | i restarted an osd in my dev cluster just to confirm. | 13:10 |
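The `N up (since …) / N in (since …)` counters quoted above come from `ceph -s`; a hedged sketch of the commands to pin down which OSD is affected (a live cluster and admin keyring are assumed):

```shell
# Overall cluster health, including the osd "up/in (since ...)" counters
ceph -s

# Show only OSDs currently down, with their host placement in the CRUSH tree
ceph osd tree down

# Compact OSD count summary, e.g. "12 osds: 11 up, 12 in"
ceph osd stat
```

`up` means the OSD daemon is running and talking to the monitors; `in` means it still holds data in the CRUSH map — an OSD restart resets the `up (since …)` timer without touching the `in` one, which is exactly what lowercase observed.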
| mgariepy | foutatoro, https://paste.openstack.org/show/bj72ZpWyVrbgBcSaswTU/ | 13:11 |
| mgariepy | simple command with a few args easy to remember by heart | 13:12 |
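Since the paste link may rot: a minimal sketch of pulling a VM disk out of Ceph as qcow2, assuming the pool is called `vms` (the image name here is a placeholder, not from the paste):

```shell
# Find the disk image backing the VM (pool name 'vms' is an assumption)
rbd ls vms

# Export the RBD image to a local raw file
rbd export vms/<image-name> /tmp/disk.raw

# Convert the raw export to qcow2 with qemu-img
qemu-img convert -f raw -O qcow2 /tmp/disk.raw /tmp/disk.qcow2
```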
| mgariepy | but you really should try to see why cinder is not starting. | 13:13 |
| jrosser | cinder volume backends can be down because of rabbitmq trouble | 13:13 |
| mgariepy | foutatoro, ok first thing, can you list projects and users (this will tell you if keystone works) | 13:14 |
| foutatoro | yes I can list projects, users, neutron networks, previous instance names ... | 13:14 |
| mgariepy | for rabbitmq, what does `rabbitmqctl cluster_status` tell you | 13:15 |
| foutatoro | `rabbitmqctl cluster_status`: https://paste.openstack.org/show/b3GkAwR515Mfs9unC40K/ | 13:17 |
| mgariepy | ok seems ok i guess. | 13:18 |
| mgariepy | now cinder did you restart it after you fixed the galera cluster ? | 13:18 |
| foutatoro | yes, I restarted the containers and all services with 'systemctl restart cinder*' | 13:19 |
| mgariepy | and the cinder api is online in your haproxy ? | 13:20 |
| mgariepy | hatop -s /var/run/haproxy.stat | 13:20 |
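If hatop isn't installed, the same stats socket can be read directly; a sketch assuming the socket path from the hatop command above:

```shell
# Dump backend/server status from the haproxy stats socket as CSV,
# keeping the proxy name, server name, and status (UP/DOWN) columns
echo "show stat" | socat stdio /var/run/haproxy.stat \
  | cut -d, -f1,2,18 | column -s, -t
```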
| foutatoro | https://paste.openstack.org/show/byhWiTt4sJ5FNdEegIry/ | 13:23 |
| foutatoro | cinder-api is not marked as UP | 13:23 |
| mgariepy | cinder_api-back seems UP. | 13:24 |
| mgariepy | in the cinder container | 13:25 |
| mgariepy | what does cinder log looks like ? | 13:25 |
| mgariepy | `journalctl -u cinder.slice -f` | 13:25 |
| mgariepy | `systemctl status cinder.slice` | 13:27 |
| foutatoro | https://paste.openstack.org/show/bFSZYN9I2hL5DkDAJLfo/ | 13:27 |
| jrosser | `cinder service-list` | 13:29 |
| lowercase | This might help narrow down: `journalctl -u cinder-volume -p 3` or `journalctl -u cinder.slice -f -p 3`, -p 3 only shows logs that are marked as errors. | 13:31 |
| mgariepy | nice about -p3 | 13:32 |
| mgariepy | i usually do `-n 10000|grep something` :D lol | 13:32 |
| mgariepy | or --since with some quick google haha | 13:32 |
| lowercase | another favorite is --no-pager, which makes journalctl not view in a bad ... well, pager. | 13:34 |
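Putting the flags from this thread together (unit names are assumed to match the deployment):

```shell
# Last 100 error-or-worse messages from everything in the cinder slice
journalctl -u cinder.slice -p 3 -n 100 --no-pager

# Errors since the incident window, printed straight to stdout
# (-p err is the symbolic equivalent of -p 3)
journalctl -u cinder-volume -p err --since "14 hours ago" --no-pager
```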
| foutatoro | cinder service-list returns a Bad Gateway; it tries to reach the server running on 8776 | 13:35 |
| mgariepy | `openstack volume service list` | 13:35 |
| foutatoro | https://paste.openstack.org/show/bc2YxdSQWJpCnle4a1xw/ | 13:35 |
| foutatoro | https://paste.openstack.org/show/b6BpHCcSaf8FfG0Uf8Eh/ | 13:36 |
| foutatoro | `openstack volume service list`: https://paste.openstack.org/show/b6BpHCcSaf8FfG0Uf8Eh/ | 13:36 |
| foutatoro | I'm restarting the scheduler | 13:37 |
| lowercase | cinder-api is prob offline | 13:37 |
| lowercase | you don't even have a cinder-api? | 13:37 |
| jrosser | i don't think it appears in that list anyway | 13:38 |
| mgariepy | indeed it doesnt | 13:39 |
| lowercase | it sure doesn't | 13:39 |
| lowercase | huh | 13:39 |
| mgariepy | systemctl restart cinder.slice | 13:39 |
| mgariepy | or the status before. | 13:40 |
| mgariepy | just to see. | 13:40 |
| jrosser | i keep saying that the up/down there isnt about the service running or not :) | 13:40 |
| jrosser | it's the backend | 13:40 |
| foutatoro | cinder.slice status: https://paste.openstack.org/show/bmHEid7kcMNSyWILsUU6/ | 13:41 |
| lowercase | `journalctl -n 100 -p 3 -u cinder.slice` command please | 13:43 |
| jrosser | i do not think it is correct to have both rbd:volumes@RBD and infra2-cinder-volumes-container-bdc12de4@RBD cinder-volume services listed | 13:44 |
| jrosser | that is a sign that there is something wrong with the active/active parts of the config | 13:44 |
| mgariepy | or it wasn't cleaned up ? | 13:45 |
| jrosser | yes, or that | 13:45 |
| mgariepy | if it was installed in the good old days and never cleaned up, it can still be there. | 13:46 |
| foutatoro | https://paste.openstack.org/show/b84ZTb019u7lU7cDudwI/ | 13:46 |
| jrosser | looks like at least rabbitmq trouble there | 13:49 |
| jrosser | you can use netstat or something to see if there are any actual connections working | 13:50 |
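A hedged example of the connection check jrosser suggests, assuming the default rabbitmq and mysql/galera ports:

```shell
# Established TCP connections to rabbitmq (5672) and mysql/galera (3306)
ss -tnp state established '( dport = :5672 or dport = :3306 )'

# netstat fallback if ss is not available in the container
netstat -tnp 2>/dev/null | grep -E ':(5672|3306)'
```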
| lowercase | pymysql.err.OperationalError: (2013, 'Lost connection to MySQL server during query') | 13:53 |
| lowercase | yeah, both rabbit and mysql issues. | 13:53 |
| lowercase | a whole log of mysql issues. | 13:54 |
| foutatoro | I see, but those errors were at 2022-05-10 08:03:24.397 | 13:59 |
| foutatoro | and I restarted services after that | 13:59 |
| lowercase | Is the time on the server the same time as your timezone? | 14:00 |
| lowercase | cause mine are set to UTC and that screws me up all the time lol | 14:00 |
| mgariepy | i prefer to have utc everywhere .. here we have some day light saving ( +1 / -1 every 6 months) | 14:01 |
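For reference, checking and pinning a host to UTC as mgariepy suggests:

```shell
# Show the current timezone and NTP sync state
timedatectl status

# Set the host to UTC so logs and DB timestamps line up across nodes
timedatectl set-timezone UTC
```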
| mgariepy | noonedeadpunk, tempest still passes with the static install_method. | 14:09 |
| noonedeadpunk | oh, ok | 14:09 |
| mgariepy | good enough for me :D haha | 14:11 |
| mgariepy | foutatoro, https://paste.openstack.org/show/bmHEid7kcMNSyWILsUU6/ seems to be missing the api service | 14:22 |
| mgariepy | unless i'm mistaken. | 14:22 |
| mgariepy | ha. it's not in the cinder slice :/ | 14:24 |
| mgariepy | it's in the uwsgi slice... | 14:25 |
| mgariepy | fun. | 14:25 |
| foutatoro | so I have to run 'openstack-ansible os-cinder-install.yml' ? | 14:26 |
| mgariepy | systemctl status cinder-api.service | 14:27 |
| mgariepy | `journalctl -u cinder-api.service -p 3 -n 100` | 14:29 |
| mgariepy | foutatoro, i don't think running playbook will help you there. | 14:33 |
| mgariepy | it's better to try to find the root cause | 14:33 |
| mgariepy | even more since this is a pre-prod system. you can take the time to debug it. it's not like when the production cluster has issues | 14:34 |
| foutatoro | cinder-api status: https://paste.openstack.org/show/bDXX5EswGiW5HqJ1Ietk/ | 14:39 |
| foutatoro | mgariepy: I don't know why I get this message "http://infra2-cinder-api-container-632dfbb4:8776/ returned with HTTP 300" but "wget http://infra2-cinder-api-container-632dfbb4:8776/" works fine on both containers | 14:43 |
| mgariepy | 300 is multiple choice from haproxy check i think. | 14:43 |
| lowercase | http 300 just means multiple choice, meaning that the url isn't a terminating url and there are multiple url paths that it can follow. i would try curl -L <url> and see if you get a 200 from that | 14:44 |
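A sketch of lowercase's suggestion, using the container URL from the log:

```shell
# Follow redirects and print only the final HTTP status code
curl -s -L -o /dev/null -w '%{http_code}\n' \
  http://infra2-cinder-api-container-632dfbb4:8776/

# Look at the 300 response headers themselves, without following
curl -sI http://infra2-cinder-api-container-632dfbb4:8776/
```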
| lowercase | but can you perform the same command with -p 3 appended to it | 14:45 |
| lowercase | your api server is clearly running, processing requests. However, since the issue is a rabbit or database issue, i want to see if the api service is complaining about either of those. | 14:46 |
| foutatoro | lowercase: right, curl -L works but adding -p 3 makes the request not terminate | 14:52 |
| lowercase | -p 3 to the journactl command lol | 14:52 |
| foutatoro | my bad | 14:52 |
| lowercase | `journalctl -u cinder-api.service -p 3 -n 100` | 14:52 |
| *** dviroel is now known as dviroel|lunch|afk | 14:53 | |
| foutatoro | lowercase: the journal only shows logs from yesterday | 14:55 |
| foutatoro | https://paste.openstack.org/show/bgQAuugtt5ERJVxobuIi/ | 14:55 |
| noonedeadpunk | #startmeeting openstack_ansible_meeting | 15:00 |
| opendevmeet | Meeting started Tue May 10 15:00:29 2022 UTC and is due to finish in 60 minutes. The chair is noonedeadpunk. Information about MeetBot at http://wiki.debian.org/MeetBot. | 15:00 |
| opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 15:00 |
| opendevmeet | The meeting name has been set to 'openstack_ansible_meeting' | 15:00 |
| noonedeadpunk | #topic rollcall | 15:00 |
| noonedeadpunk | o/ | 15:00 |
| noonedeadpunk | well, I'm actually semi-around | 15:01 |
| mgariepy | hey o/ | 15:01 |
| jrosser | hello o/ | 15:02 |
| noonedeadpunk | #topic office hours | 15:05 |
| ebbex | o/ | 15:05 |
| noonedeadpunk | I will be honest - I've done nothing. I can't even recall what I was doing all week... | 15:05 |
| noonedeadpunk | Likely side-effect after moving to the new place... | 15:06 |
| noonedeadpunk | jrosser: you had some issues with merging repo stuff - should we discuss it? | 15:06 |
| jrosser | oh yes, i left it alone for a few days | 15:06 |
| jrosser | but i think it's got all a bit circular | 15:07 |
| noonedeadpunk | We can always disable CI to land that... | 15:07 |
| jrosser | well maybe a couple of things to look at first | 15:07 |
| damiandabrowski[m] | hi! | 15:08 |
| jrosser | the glusterfs filesystem does not exist until we merge this https://review.opendev.org/c/openstack/openstack-ansible/+/837589/13/playbooks/repo-install.yml | 15:08 |
| jrosser | the repo_install playbook needs updating to create it, as the current use of serial: breaks the installation | 15:08 |
| jrosser | the tasks cannot be serial for those parts | 15:09 |
| jrosser | but then logically the next patch to merge (until I thought about it) was this https://review.opendev.org/c/openstack/openstack-ansible-repo_server/+/839411 | 15:09 |
| jrosser | but i don't think thats ever going to pass without the first one | 15:09 |
| jrosser | as the fs will not exist | 15:09 |
| jrosser | i think rather than circular patches, i mean it's very hard to get everything to pass in CI without making it circular | 15:11 |
| noonedeadpunk | I also left comment for https://review.opendev.org/c/openstack/openstack-ansible/+/837589/13/ansible-collection-requirements.yml#40 just in case :) | 15:12 |
| jrosser | ah yes i saw that | 15:12 |
| jrosser | i got kind of diverted by playing with skyline | 15:13 |
| jrosser | but we should try to get this gluster stuff merged because it is a big change and needs some testing for real | 15:13 |
| noonedeadpunk | yeah | 15:13 |
| jrosser | deleting / re-creating repo server containers has some subtleties now, for example | 15:13 |
| noonedeadpunk | Btw regarding the rbac topic - I guess there's no real need to do changes this release since cinder/heat are still not ready | 15:14 |
| noonedeadpunk | But I'd rather introduce the service role anyway, despite discussions about it still being ongoing | 15:14 |
| noonedeadpunk | we can suggest dropping all repo containers at once for example as well | 15:15 |
| noonedeadpunk | as that should be fine I guess? | 15:15 |
| jrosser | i mount /openstack/glusterfs in the current patches | 15:15 |
| noonedeadpunk | As there's nothing _really_ important anyway | 15:15 |
| jrosser | as there is a UUID that needs to be preserved, else you can't re-create/join the cluster properly | 15:15 |
| jrosser | so sometimes you need to keep that, sometimes you need to delete it | 15:15 |
| jrosser | depends if you want to destroy the fs and start again, or to keep it | 15:16 |
| noonedeadpunk | I actually thought it will get removed with force_containers_data_destroy ? | 15:16 |
| jrosser | that does not seem to understand whatever bind mounts get made | 15:16 |
| noonedeadpunk | But not sure | 15:16 |
| jrosser | however in this case, i think that preserving it is the right thing to do for multinode | 15:17 |
| jrosser | there is also an impact on re-deploying an infra node | 15:17 |
| jrosser | having said all this - i really would like other eyes / opinions on it | 15:18 |
| noonedeadpunk | yep, fair | 15:20 |
| noonedeadpunk | another thing - do we want to have a presentation about project updates? | 15:20 |
| noonedeadpunk | There's no dedicated event during the summit for that, but marketing still has some plan for how to promote these | 15:20 |
| noonedeadpunk | Basically they asked for a video, 10 mins tops, to talk about changes that were made lately | 15:21 |
| jrosser | i guess we would have to look back over the etherpads to see what we did / did not do | 15:22 |
| noonedeadpunk | yup, agree | 15:23 |
| noonedeadpunk | I will try to put something into the other etherpad so we can review topics next week | 15:23 |
| damiandabrowski[m] | okok, great | 15:24 |
| noonedeadpunk | ok, what else we have on plate? | 15:26 |
| noonedeadpunk | Except tons of stuff that needs to land? | 15:27 |
| jrosser | hmmm yes - reviews / merging of lots of things | 15:27 |
| jrosser | i should also say that i have done a proof-of-concept with the alternative dashboard, skyline | 15:28 |
| jrosser | and it's ummmm - interesting to deploy | 15:28 |
| noonedeadpunk | I can imagine, as it's nodejs iirc? | 15:29 |
| noonedeadpunk | at least frontend part of it | 15:29 |
| jrosser | there is a python part 'apiserver' then nodejs things for 'console' | 15:30 |
| damiandabrowski[m] | i will spend some time on reviews tomorrow | 15:30 |
| jrosser | and imho the code is very docker / kolla centric | 15:30 |
| jrosser | and also confuses the service code with deployment tooling, as there's an executable to generate the required nginx config /o\ | 15:30 |
| jrosser | but i think this is an opportunity to influence the skyline development to support wider tools and deployments | 15:32 |
| jrosser | debugging this is really on the edge of my understanding though, so if anyone is interested with web development skills then please help out :) | 15:33 |
| noonedeadpunk | tool to generate nginx config sounds as it sounds ofc... | 15:42 |
| noonedeadpunk | And also I saw there's no SSO support atm | 15:42 |
| noonedeadpunk | So I'd say they have plenty of gaps as of today | 15:42 |
| noonedeadpunk | but you're right, we'd better chime in earlier rather than later | 15:43 |
| jrosser | there's kind of two parts i think - ansible'ing up the deployment, pretty much whatever-it-takes to make it work | 15:44 |
| jrosser | then work on tidying that all up and making it all more OSA-like | 15:45 |
| noonedeadpunk | #endmeeting | 16:00 |
| opendevmeet | Meeting ended Tue May 10 16:00:34 2022 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 16:00 |
| opendevmeet | Minutes: https://meetings.opendev.org/meetings/openstack_ansible_meeting/2022/openstack_ansible_meeting.2022-05-10-15.00.html | 16:00 |
| opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/openstack_ansible_meeting/2022/openstack_ansible_meeting.2022-05-10-15.00.txt | 16:00 |
| opendevmeet | Log: https://meetings.opendev.org/meetings/openstack_ansible_meeting/2022/openstack_ansible_meeting.2022-05-10-15.00.log.html | 16:00 |
| *** ysandeep|rover is now known as ysandeep|out | 16:23 | |
| opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_octavia master: Make octavia_provider_network better configurable https://review.opendev.org/c/openstack/openstack-ansible-os_octavia/+/787336 | 16:46 |
| *** dviroel|lunch|afk is now known as dviroel\ | 18:53 | |
| *** dviroel\ is now known as dviroel | 18:53 | |
| *** dviroel is now known as dviroel|out | 21:22 | |
Generated by irclog2html.py 2.17.3 by Marius Gedminas - find it at https://mg.pov.lt/irclog2html/!