*** ysandeep|out is now known as ysandeep|rover | 04:42 | |
noonedeadpunk | mornings | 07:12 |
---|---|---|
jrosser | good morning | 07:38 |
foutatoro | hello jrosser | 07:57 |
jrosser | hello | 07:57 |
damiandabrowski[m] | morning folks! | 07:58 |
*** ysandeep|rover is now known as ysandeep|rover|lunch | 08:49 | |
admin1 | morning | 09:20 |
*** ysandeep|rover|lunch is now known as ysandeep|rover | 09:47 | |
mgariepy | good morning everyone | 11:23 |
*** dviroel|afk is now known as dviroel | 11:28 | |
opendevreview | Marc Gariépy proposed openstack/openstack-ansible-os_tempest master: [DNM] testing if all the tests are still passing. https://review.opendev.org/c/openstack/openstack-ansible-os_tempest/+/841257 | 12:22 |
mgariepy | noonedeadpunk, uca doesn't have a tempest plugin package. jammy does have some but i guess they're only in universe and won't stay up to date anyway. | 12:23 |
noonedeadpunk | we were using source install for ubuntu tempest regardless if it's source or distro install | 12:25 |
noonedeadpunk | I thought we had something that prevented this from failures | 12:26 |
noonedeadpunk | maybe we can disable building wheels on ubuntu, when it's distro install | 12:26 |
mgariepy | i think the hard-coded "source" install does fix it | 12:26 |
mgariepy | we will see. | 12:27 |
noonedeadpunk | but we don't have repo container when rest is distro install? do we? | 12:27 |
mgariepy | indeed we do not. | 12:27 |
mgariepy | let's see if it passes. if not i'll debug it. | 12:28 |
noonedeadpunk | then it should fail on attempt to get constraints file from repo container | 12:29 |
mgariepy | tempest was installing from source on distro install test a couple weeks ago. | 12:29 |
noonedeadpunk | well. I dropped some things maybe :D | 12:30 |
mgariepy | lol. maybe that's why i'm re-testing the role haha | 12:30 |
noonedeadpunk | like with https://review.opendev.org/c/openstack/openstack-ansible/+/837845 | 12:31 |
noonedeadpunk | But I don't see what would result in the issues... | 12:31 |
noonedeadpunk | Maybe we also fixed not deploying repo container for distro installs when were merging gluster | 12:32 |
mgariepy | well me neither. | 12:32 |
mgariepy | let's wait for the test result. | 12:32 |
noonedeadpunk | but eventually this fails only for tempest role | 12:32 |
noonedeadpunk | which is really interesting | 12:32 |
mgariepy | we do have another role that doesn't support distro install. | 12:32 |
mgariepy | gnocchi. | 12:33 |
noonedeadpunk | I guess we should just drop distro support there? | 12:36 |
noonedeadpunk | or whole telemetry does support it? | 12:36 |
mgariepy | i have no idea if there is gnocchi in uca or not. it's getting really hard i think to have all our roles patched at the same time | 12:37 |
mgariepy | there are always one or 2 or 4 that are left behind. | 12:37 |
lowercase | I'm finally getting to a place where I feel comfortable uploading my work with fluentd, openstack and loki into a public repo. What's the repo where this would all go. The one that had the elk configurations and such. | 12:45 |
jrosser | lowercase: openstack-ansible-ops is the repo for this sort of thing | 12:47 |
mgariepy | foutatoro, did you recover your cluster? | 12:48 |
foutatoro | mgariepy, good morning. | 12:56 |
foutatoro | mgariepy: not yet, I have a really strange issue. previous VM disks seem to be in the ceph vms pool but I can't list them nor attach them to the appropriate VM | 12:58 |
foutatoro | https://paste.opendev.org/show/b1Mhw44QtAMH8qlQVTBL/ | 12:58 |
lowercase | 4 in (since 6M) | 12:59 |
lowercase | did you just recover an osd? | 12:59 |
mgariepy | 6M i guess it month./ | 12:59 |
mgariepy | it's 6 months** | 13:00 |
lowercase | then why are the pgs degraded? | 13:00 |
lowercase | if he didn't lose an osd | 13:00 |
foutatoro | lowercase: I have 4 osds, this is a pre-prod | 13:00 |
mgariepy | was an osd out for a long time ? | 13:01 |
foutatoro | mgariepy: no | 13:01 |
lowercase | what happened 14 hours ago | 13:01 |
foutatoro | lowercase: due to an incident all infra hosts were restarted | 13:02 |
lowercase | do your infra hosts also host osds? | 13:02 |
foutatoro | yes | 13:03 |
lowercase | okay, so all osds were offline 14 hours ago | 13:03 |
foutatoro | exact | 13:03 |
lowercase | which makes a 6 month uptime for that osd impossible. So what happened 6 minutes ago? | 13:04 |
foutatoro | since the restart the cinder-volume service state is down also | 13:04 |
jrosser | the backend is down, not the service | 13:04 |
foutatoro | nothing happened 6 minutes ago | 13:04 |
lowercase | your ceph health status says otherwise | 13:04 |
foutatoro | lowercase mgariepy: is there a way to download rbd objects as qcow2 ? | 13:08 |
mgariepy | foutatoro, you can copy the image from ceph yes. | 13:09 |
lowercase | okay, i was wrong. mgariepy was correct. 12 osds: 11 up (since 2m), 12 in (since 10w) | 13:10 |
lowercase | i restarted an osd in my dev cluster just to confirm. | 13:10 |
mgariepy | foutatoro, https://paste.openstack.org/show/bj72ZpWyVrbgBcSaswTU/ | 13:11 |
mgariepy | simple command with a few args easy to remember by heart | 13:12 |
mgariepy | but you really should try to see why cinder is not starting. | 13:13 |
jrosser | cinder volume backends can be down because of rabbitmq trouble | 13:13 |
mgariepy | foutatoro, ok first thing, can you list projects and users? (this will tell you if keystone works) | 13:14 |
foutatoro | yes I can list projects, users, neutron networks, previous instances names ... | 13:14 |
mgariepy | for rabbitmq, what does `rabbitmqctl cluster_status` tell you? | 13:15 |
foutatoro | `rabbitmqctl cluster_status`: https://paste.openstack.org/show/b3GkAwR515Mfs9unC40K/ | 13:17 |
mgariepy | ok seems ok i guess. | 13:18 |
mgariepy | now cinder did you restart it after you fixed the galera cluster ? | 13:18 |
foutatoro | yes, I restarted the containers and all services with 'systemctl restart cinder*' | 13:19 |
mgariepy | and the cinder api is online in your haproxy ? | 13:20 |
mgariepy | hatop -s /var/run/haproxy.stat | 13:20 |
foutatoro | https://paste.openstack.org/show/byhWiTt4sJ5FNdEegIry/ | 13:23 |
foutatoro | cinder-api is not marked as UP | 13:23 |
mgariepy | cinder_api-back seems UP. | 13:24 |
mgariepy | in the cinder container | 13:25 |
mgariepy | what does the cinder log look like ? | 13:25 |
mgariepy | `journalctl -u cinder.slice -f` | 13:25 |
mgariepy | `systemctl status cinder.slice` | 13:27 |
foutatoro | https://paste.openstack.org/show/bFSZYN9I2hL5DkDAJLfo/ | 13:27 |
jrosser | `cinder service-list` | 13:29 |
lowercase | This might help narrow down: `journalctl -u cinder-volume -p 3` or `journalctl -u cinder.slice -f -p 3`, -p 3 only shows logs at error priority or worse. | 13:31 |
mgariepy | nice about -p3 | 13:32 |
mgariepy | i usually do `-n 10000|grep something` :D lol | 13:32 |
mgariepy | or --since with some quick google haha | 13:32 |
lowercase | another favorite is --no-pager, which makes journalctl not view in a bad ... well pager. | 13:34 |
foutatoro | cinder service-list returns a Bad Gateway, it tries to reach the server running on 8776 | 13:35 |
mgariepy | `openstack volume service list` | 13:35 |
foutatoro | https://paste.openstack.org/show/bc2YxdSQWJpCnle4a1xw/ | 13:35 |
foutatoro | https://paste.openstack.org/show/b6BpHCcSaf8FfG0Uf8Eh/ | 13:36 |
foutatoro | `openstack volume service list`: https://paste.openstack.org/show/b6BpHCcSaf8FfG0Uf8Eh/ | 13:36 |
foutatoro | I'm restarting the scheduler | 13:37 |
lowercase | cinder-api is prob offline | 13:37 |
lowercase | you don't even have a cinder-api? | 13:37 |
jrosser | i don't think it appears in that list anyway | 13:38 |
mgariepy | indeed it doesnt | 13:39 |
lowercase | it sure doesn't | 13:39 |
lowercase | huh | 13:39 |
mgariepy | systemctl restart cinder.slice | 13:39 |
mgariepy | or the status before. | 13:40 |
mgariepy | just to see. | 13:40 |
jrosser | i keep saying that the up/down there isnt about the service running or not :) | 13:40 |
jrosser | it's the backend | 13:40 |
foutatoro | cinder.slice status: https://paste.openstack.org/show/bmHEid7kcMNSyWILsUU6/ | 13:41 |
lowercase | `journalctl -n 100 -p 3 -u cinder.slice` command please | 13:43 |
jrosser | i do not think it is correct to have both rbd:volumes@RBD and infra2-cinder-volumes-container-bdc12de4@RBD cinder-volume services listed | 13:44 |
jrosser | that is a sign that there is something wrong with the active/active parts of the config | 13:44 |
mgariepy | or it wasn't cleaned up ? | 13:45 |
jrosser | yes, or that | 13:45 |
mgariepy | if it was installed in the good old days. and it was never cleaned up it can be there. | 13:46 |
foutatoro | https://paste.openstack.org/show/b84ZTb019u7lU7cDudwI/ | 13:46 |
jrosser | looks like at least rabbitmq trouble there | 13:49 |
jrosser | you can use netstat or something to see if there are any actual connections working | 13:50 |
lowercase | pymysql.err.OperationalError: (2013, 'Lost connection to MySQL server during query') | 13:53 |
lowercase | yeah, both rabbit and mysql issues. | 13:53 |
lowercase | a whole log of mysql issues. | 13:54 |
foutatoro | I see but those errors were at 2022-05-10 08:03:24.397 | 13:59 |
foutatoro | and I restarted the services after that | 13:59 |
lowercase | Is the time on the server the same time as your timezone? | 14:00 |
lowercase | cause mine are set to UTC and that screws me up all the time lol | 14:00 |
mgariepy | i prefer to have utc everywhere .. here we have daylight saving ( +1 / -1 every 6 months) | 14:01 |
mgariepy | noonedeadpunk, tempest still passes with the static install_method. | 14:09 |
noonedeadpunk | oh, ok | 14:09 |
mgariepy | good enough for me :D haha | 14:11 |
mgariepy | foutatoro, https://paste.openstack.org/show/bmHEid7kcMNSyWILsUU6/ seems to be missing the api service | 14:22 |
mgariepy | unless i'm mistaken. | 14:22 |
mgariepy | ha. it's not in the cinder slice :/ | 14:24 |
mgariepy | it's in the uwsgi slice... | 14:25 |
mgariepy | fun. | 14:25 |
foutatoro | so I have to run 'openstack-ansible os-cinder-install.yml' ? | 14:26 |
mgariepy | systemctl status cinder-api.service | 14:27 |
mgariepy | `journalctl -u cinder-api.service -p 3 -n 100` | 14:29 |
mgariepy | foutatoro, i don't think running playbook will help you there. | 14:33 |
mgariepy | it's better to try to find the root cause | 14:33 |
mgariepy | even more since this is a pre-prod system. you can take the time to debug it. it's not like when the production cluster has issues | 14:34 |
foutatoro | cinder-api status: https://paste.openstack.org/show/bDXX5EswGiW5HqJ1Ietk/ | 14:39 |
foutatoro | mgariepy: I don't know why this message "http://infra2-cinder-api-container-632dfbb4:8776/ returned with HTTP 300" but "wget http://infra2-cinder-api-container-632dfbb4:8776/" works fine on both containers | 14:43 |
mgariepy | 300 is multiple choice from haproxy check i think. | 14:43 |
lowercase | http 300 just means multiple choice, meaning that the url isn't a terminating url and there are multiple url paths that it can follow. i would try curl -L <url> and see if you get a 200 from that | 14:44 |
lowercase | but can you perform the same command with -p 3 appended to it | 14:45 |
lowercase | your api server is clearly running, processing requests. However, since the issue is a rabbit or database issue, i want to see if the api service is complaining about either of those. | 14:46 |
foutatoro | lowercase: right, curl -L works but adding -p 3 makes the request not terminate | 14:52 |
lowercase | -p 3 to the journalctl command lol | 14:52 |
foutatoro | my bad | 14:52 |
lowercase | `journalctl -u cinder-api.service -p 3 -n 100` | 14:52 |
*** dviroel is now known as dviroel|lunch|afk | 14:53 | |
foutatoro | lowercase: journal shows logs of yesterday | 14:55 |
foutatoro | https://paste.openstack.org/show/bgQAuugtt5ERJVxobuIi/ | 14:55 |
noonedeadpunk | #startmeeting openstack_ansible_meeting | 15:00 |
opendevmeet | Meeting started Tue May 10 15:00:29 2022 UTC and is due to finish in 60 minutes. The chair is noonedeadpunk. Information about MeetBot at http://wiki.debian.org/MeetBot. | 15:00 |
opendevmeet | Useful Commands: #action #agreed #help #info #idea #link #topic #startvote. | 15:00 |
opendevmeet | The meeting name has been set to 'openstack_ansible_meeting' | 15:00 |
noonedeadpunk | #topic rollcall | 15:00 |
noonedeadpunk | o/ | 15:00 |
noonedeadpunk | well, I'm actually semi-around | 15:01 |
mgariepy | hey o/ | 15:01 |
jrosser | hello o/ | 15:02 |
noonedeadpunk | #topic office hours | 15:05 |
ebbex | o/ | 15:05 |
noonedeadpunk | I will be honest - I've done nothing. I can't even recall what I was doing the whole week... | 15:05 |
noonedeadpunk | Likely side-effect after moving to the new place... | 15:06 |
noonedeadpunk | jrosser: you had some issues with merging repo stuff - should we discuss it? | 15:06 |
jrosser | oh yes, i left it alone for a few days | 15:06 |
jrosser | but i think it's got all a bit circular | 15:07 |
noonedeadpunk | We can always disable CI to land that... | 15:07 |
jrosser | well maybe a couple of things to look at first | 15:07 |
damiandabrowski[m] | hi! | 15:08 |
jrosser | the glusterfs filesystem does not exist until we merge this https://review.opendev.org/c/openstack/openstack-ansible/+/837589/13/playbooks/repo-install.yml | 15:08 |
jrosser | the repo_install playbook needs updating to create it, as the current use of serial: breaks the installation | 15:08 |
jrosser | the tasks cannot be serial for those parts | 15:09 |
jrosser | but then logically the next patch to merge (until I thought about it) was this https://review.opendev.org/c/openstack/openstack-ansible-repo_server/+/839411 | 15:09 |
jrosser | but i don't think thats ever going to pass without the first one | 15:09 |
jrosser | as the fs will not exist | 15:09 |
jrosser | i think rather than circular patches, i mean its very hard to get everything to pass in CI without making it circular | 15:11 |
noonedeadpunk | I also left comment for https://review.opendev.org/c/openstack/openstack-ansible/+/837589/13/ansible-collection-requirements.yml#40 just in case :) | 15:12 |
jrosser | ah yes i saw that | 15:12 |
jrosser | i got kind of diverted by playing with skyline | 15:13 |
jrosser | but we should try to get this gluster stuff merged because it is a big change and needs some testing for real | 15:13 |
noonedeadpunk | yeah | 15:13 |
jrosser | deleting / re-creating repo server containers has some subtleties now, for example | 15:13 |
noonedeadpunk | Btw regarding rbac topic - I guess there's no real need to do changes this release since cinder/heat are still not ready | 15:14 |
noonedeadpunk | But I'd rather introduced service role anyway, despite discussions about it are still ongoing | 15:14 |
noonedeadpunk | we can suggest dropping all repo containers at once for example as well | 15:15 |
noonedeadpunk | as that should be fine I guess? | 15:15 |
jrosser | i mount /openstack/glusterfs in the current patches | 15:15 |
noonedeadpunk | As there's nothing _really_ important anyway | 15:15 |
jrosser | as there is a UUID that needs to be preserved, else you can't re-create/join the cluster properly | 15:15 |
jrosser | so sometimes you need to keep that, sometimes you need to delete it | 15:15 |
jrosser | depends if you want to destroy the fs and start again, or to keep it | 15:16 |
noonedeadpunk | I actually thought it will get removed with force_containers_data_destroy ? | 15:16 |
jrosser | that does not seem to understand whatever bind mounts get made | 15:16 |
noonedeadpunk | But not sure | 15:16 |
jrosser | however in this case, i think that preserving it is the right thing to do for multinode | 15:17 |
jrosser | there is also an impact on re-deploying an infra node | 15:17 |
jrosser | having said all this - i really would like other eyes / opinions on it | 15:18 |
noonedeadpunk | yep, fair | 15:20 |
noonedeadpunk | another thing - do we want to have a presentation about project updates? | 15:20 |
noonedeadpunk | There's no dedicated event during summit for that, but still marketing has some plan for how to promote these | 15:20 |
noonedeadpunk | Basically they asked for a video 10mins tops to say about changes that were made lately | 15:21 |
jrosser | i guess we would have to look back over the etherpads to see what we did / did not do | 15:22 |
noonedeadpunk | yup, agree | 15:23 |
noonedeadpunk | I will try to put smth into other etherpad so we could review topics next week | 15:23 |
damiandabrowski[m] | okok, great | 15:24 |
noonedeadpunk | ok, what else we have on plate? | 15:26 |
noonedeadpunk | Except tons of stuff that needs to land? | 15:27 |
jrosser | hmmm yes - reviews / merging of lots of things | 15:27 |
jrosser | i should also say that i have done a proof-of-concept with the alternative dashboard, skyline | 15:28 |
jrosser | and it's ummmm - interesting to deploy | 15:28 |
noonedeadpunk | I can imagine, as it's nodejs iirc? | 15:29 |
noonedeadpunk | at least frontend part of it | 15:29 |
jrosser | there is a python part 'apiserver' then nodejs things for 'console' | 15:30 |
damiandabrowski[m] | i will spend some time on reviews tomorrow | 15:30 |
jrosser | and imho the code is very docker / kolla centric | 15:30 |
jrosser | and also confuses the service code with deployment tooling, as there's an executable to generate the required nginx config /o\ | 15:30 |
jrosser | but i think this is an opportunity to influence the skyline development to support wider tools and deployments | 15:32 |
jrosser | debugging this is really on the edge of my understanding though, so if anyone is interested with web development skills then please help out :) | 15:33 |
noonedeadpunk | tool to generate nginx config sounds as it sounds ofc... | 15:42 |
noonedeadpunk | And also I saw there's no SSO support atm | 15:42 |
noonedeadpunk | So I'd say they have plenty of gaps as of today | 15:42 |
noonedeadpunk | but you're right, we'd better chime-in earlier than later | 15:43 |
jrosser | there's kind of two parts i think - ansible'ing up the deployment, pretty much whatever-it-takes to make it work | 15:44 |
jrosser | then work on tidying that all up and making it all more OSA-like | 15:45 |
noonedeadpunk | #endmeeting | 16:00 |
opendevmeet | Meeting ended Tue May 10 16:00:34 2022 UTC. Information about MeetBot at http://wiki.debian.org/MeetBot . (v 0.1.4) | 16:00 |
opendevmeet | Minutes: https://meetings.opendev.org/meetings/openstack_ansible_meeting/2022/openstack_ansible_meeting.2022-05-10-15.00.html | 16:00 |
opendevmeet | Minutes (text): https://meetings.opendev.org/meetings/openstack_ansible_meeting/2022/openstack_ansible_meeting.2022-05-10-15.00.txt | 16:00 |
opendevmeet | Log: https://meetings.opendev.org/meetings/openstack_ansible_meeting/2022/openstack_ansible_meeting.2022-05-10-15.00.log.html | 16:00 |
*** ysandeep|rover is now known as ysandeep|out | 16:23 | |
opendevreview | Dmitriy Rabotyagov proposed openstack/openstack-ansible-os_octavia master: Make octavia_provider_network better configurable https://review.opendev.org/c/openstack/openstack-ansible-os_octavia/+/787336 | 16:46 |
*** dviroel|lunch|afk is now known as dviroel\ | 18:53 | |
*** dviroel\ is now known as dviroel | 18:53 | |
*** dviroel is now known as dviroel|out | 21:22 |
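The ceph exchange in the log (ending with lowercase's `12 osds: 11 up (since 2m), 12 in (since 10w)`) turned on how to read the "since" values: they track when the up or in count last changed, so a restarted OSD resets the "up" figure while "in" can stay months old. Below is a minimal Python sketch that pulls those fields out of such a status line. The helper name `parse_osd_stat` and the unit table are illustrative assumptions (the unit meanings follow the reading agreed in the chat: `m` for minutes, `w` for weeks, `M` for months); this is not Ceph's own code.

```python
import re

# Unit letters as read in the chat discussion; this table is an assumption
# made for illustration and is not copied from Ceph's source.
UNIT_NAMES = {
    "s": "seconds", "m": "minutes", "h": "hours",
    "d": "days", "w": "weeks", "M": "months", "y": "years",
}

# Matches the osd summary fragment quoted in the log.
OSD_STAT = re.compile(
    r"(?P<total>\d+) osds: "
    r"(?P<up>\d+) up \(since (?P<up_since>\d+[smhdwMy])\), "
    r"(?P<in>\d+) in \(since (?P<in_since>\d+[smhdwMy])\)"
)


def describe(span):
    """Turn a compact span such as '10w' into '10 weeks'."""
    return f"{span[:-1]} {UNIT_NAMES[span[-1]]}"


def parse_osd_stat(line):
    """Extract OSD counts and the time since each count last changed."""
    match = OSD_STAT.search(line)
    if match is None:
        raise ValueError(f"not an osd status line: {line!r}")
    return {
        "total": int(match["total"]),
        "up": int(match["up"]),
        "in": int(match["in"]),
        "up_changed": describe(match["up_since"]),
        "in_changed": describe(match["in_since"]),
    }


status = parse_osd_stat("12 osds: 11 up (since 2m), 12 in (since 10w)")
print(status)
```

A mismatch between `total` and `up`, together with a short `up_changed`, points at a recent OSD state change, which is what lowercase confirmed by restarting an OSD in a dev cluster.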
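lowercase's `-p 3` tip works because journalctl priorities follow the syslog levels, numbered 0 (emerg) through 7 (debug), and a single value selects that level plus everything more severe. The Python sketch below only illustrates that selection rule; `levels_shown` is a hypothetical helper written for this note, not part of any systemd API.

```python
# Syslog priority names in order, index 0 being the most severe.
SYSLOG_LEVELS = [
    "emerg", "alert", "crit", "err",
    "warning", "notice", "info", "debug",
]


def levels_shown(priority):
    """Return the level names a single `-p <priority>` value would include.

    Accepts either the numeric level (0-7) or its name, mirroring the two
    spellings journalctl accepts for -p.
    """
    if isinstance(priority, str):
        priority = SYSLOG_LEVELS.index(priority)  # ValueError if unknown
    if not 0 <= priority <= 7:
        raise ValueError(f"priority out of range: {priority}")
    # Everything from the most severe level down to the requested one.
    return SYSLOG_LEVELS[: priority + 1]


print(levels_shown(3))
```

So `journalctl -u cinder-api.service -p 3 -n 100` from the log keeps err, crit, alert and emerg entries and drops warnings and below.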
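On the timestamp confusion in the log (errors stamped `2022-05-10 08:03:24.397`, with lowercase noting that servers set to UTC are easy to misread): a small Python sketch that converts a UTC log timestamp to a local offset before comparing it with the time of a restart. Both the assumption that the log is in UTC and the UTC-4 offset used here are hypothetical values for illustration; substitute the real ones.

```python
from datetime import datetime, timedelta, timezone

# Timestamp layout as it appears in the pasted service logs.
LOG_FORMAT = "%Y-%m-%d %H:%M:%S.%f"


def utc_log_to_local(stamp, local_offset_hours):
    """Interpret `stamp` as UTC and return it shifted to a fixed local offset."""
    utc_time = datetime.strptime(stamp, LOG_FORMAT).replace(tzinfo=timezone.utc)
    local_zone = timezone(timedelta(hours=local_offset_hours))
    return utc_time.astimezone(local_zone)


# Hypothetical operator timezone of UTC-4, chosen only for the example.
local = utc_log_to_local("2022-05-10 08:03:24.397", -4)
print(local.isoformat())
```

A fixed offset keeps the sketch free of timezone-database dependencies; for daylight saving changes like the ones mgariepy mentions, a named zone from the standard `zoneinfo` module would be the more robust choice.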