*** cheng1 has joined #airshipit | 01:07 | |
*** cheng1 has quit IRC | 03:21 | |
*** cheng1 has joined #airshipit | 03:58 | |
*** juhak has quit IRC | 04:51 | |
*** juhak has joined #airshipit | 04:52 | |
openstackgerrit | Tin Lam proposed openstack/airship-pegleg master: trivial: fix yapf/pep8 interaction failing on logical operator https://review.openstack.org/619936 | 05:07 |
openstackgerrit | Smruti Soumitra Khuntia proposed openstack/airship-shipyard master: User context tracing through logging https://review.openstack.org/633873 | 06:10 |
openstackgerrit | Smruti Soumitra Khuntia proposed openstack/airship-drydock master: End user logging for audit traceabilty https://review.openstack.org/638115 | 06:11 |
openstackgerrit | Smruti Soumitra Khuntia proposed openstack/airship-shipyard master: User context tracing through logging https://review.openstack.org/633873 | 06:31 |
openstackgerrit | Smruti Soumitra Khuntia proposed openstack/airship-in-a-bottle master: Document End user optional header https://review.openstack.org/642999 | 06:33 |
openstackgerrit | Smruti Soumitra Khuntia proposed openstack/airship-in-a-bottle master: Document End user optional header https://review.openstack.org/642999 | 06:33 |
*** cheng1 has quit IRC | 06:38 | |
*** cheng1 has joined #airshipit | 06:38 | |
*** cheng1_ has joined #airshipit | 07:23 | |
*** cheng1 has quit IRC | 07:25 | |
*** licanwei has joined #airshipit | 08:23 | |
*** cheng1_ has quit IRC | 08:37 | |
*** roman_g has joined #airshipit | 08:42 | |
*** juhak has quit IRC | 08:54 | |
*** juhak has joined #airshipit | 08:54 | |
openstackgerrit | Smruti Soumitra Khuntia proposed openstack/airship-shipyard master: User context tracing through logging https://review.openstack.org/633873 | 09:44 |
openstackgerrit | Smruti Soumitra Khuntia proposed openstack/airship-shipyard master: User context tracing through logging https://review.openstack.org/633873 | 09:47 |
*** cheng1_ has joined #airshipit | 09:58 | |
*** cheng1_ has quit IRC | 12:06 | |
*** juhak has quit IRC | 12:54 | |
*** juhak has joined #airshipit | 12:54 | |
*** aaronsheffield has joined #airshipit | 12:57 | |
*** irclogbot_3 has joined #airshipit | 13:27 | |
*** altlogbot_0 has quit IRC | 13:31 | |
*** altlogbot_0 has joined #airshipit | 13:31 | |
*** irclogbot_3 has quit IRC | 13:38 | |
*** irclogbot_2 has joined #airshipit | 13:38 | |
*** kranthikirang has joined #airshipit | 13:41 | |
*** michael-beaver has joined #airshipit | 13:48 | |
evgenyl | kranthikirang: Thanks for submitting a patch! I've checked it, just a few small comments. | 14:01 |
evgenyl | kranthikirang: Yes, you will need to configure the vlans manually for genesis | 14:01 |
openstackgerrit | Dimitrios Markou proposed openstack/airship-in-a-bottle master: Add bgp peering in virtual airship https://review.openstack.org/642171 | 14:10 |
openstackgerrit | Aaron Sheffield proposed openstack/airship-deckhand master: Updating Docker Gate use of zuul.newrev https://review.openstack.org/645825 | 14:21 |
openstackgerrit | Merged openstack/airship-armada master: Support in Armada for locking Tiller https://review.openstack.org/632483 | 14:29 |
openstackgerrit | kranthi kiran guttikonda proposed openstack/airship-treasuremap master: Fix install 4.15.0-34-generic https://review.openstack.org/645951 | 14:42 |
kranthikirang | evgenyl: I am seeing an exception between Armada and Tiller while running the ./genesis.sh script; the script keeps trying to deploy mariadb, rabbitmq and ingress, failing or timing out, and then trying the same thing over and over | 14:48 |
kranthikirang | http://paste.openstack.org/show/748322/ | 14:48 |
openstackgerrit | Evgeniy L proposed openstack/airship-in-a-bottle master: Add a comment to clarify ingress requirements for MariaDB https://review.openstack.org/647496 | 14:49 |
openstackgerrit | Smruti Soumitra Khuntia proposed openstack/airship-shipyard master: User context tracing through logging https://review.openstack.org/633873 | 14:50 |
mattmceuen | kranthikirang: what do you see in the mariadb logs or pod descriptions? | 14:50 |
kranthikirang | mattmceuen: When I checked on Friday the pods were running, but it took a while for them to come up | 14:50 |
kranthikirang | mattmceuen: I will try running genesis.sh again | 14:51 |
evgenyl | Hi everyone, please help with reviews/merges for AIAB gate fix https://review.openstack.org/#/c/644634/ | 14:52 |
mattmceuen | kranthikirang: sounds good. It'll either be a case of "mariadb is taking a long time because the environment is a bit slow", in which case maybe you should increase timeouts; or, it'll be "taking a long time because something is wrong" in which case the root cause will just need to be troubleshot | 14:53 |
kranthikirang | mattmceuen: How can I increase the timeouts? I am using HP Gen9 v4 servers | 14:53 |
mattmceuen | All the timeout / waiting type stuff lives in the armada charts for the different deployed components. I think this is the one you'd need to tweak: https://github.com/openstack/airship-treasuremap/blob/master/global/software/charts/ucp/core/mariadb.yaml#L57 | 14:56 |
kranthikirang | mattmceuen: since I have already generated the collected documents and the bundle, I guess I have to modify airship-treasuremap.yaml | 14:59 |
mattmceuen | kranthikirang: Since Airship is a declarative platform, the best thing to do is to modify the original source documents (or override at the type or site level), and then re-collect them fresh, and generate a new genesis.sh. Modifying the collected documents should work fine, but then the deployed site won't match the declarative intent in your git repo | 15:03 |
kranthikirang | mattmceuen: totally agree; I see that genesis creates an /etc/genesis/armada folder where it keeps all the manifests as well; I guess updating the collected manifest alone will not resolve it; | 15:05 |
kranthikirang | mattmceuen: Will update the original site documents as well | 15:05 |
mattmceuen | Awesome - let me know how the timeout update goes | 15:06 |
openstackgerrit | Lev Morgan proposed openstack/airship-pegleg master: Fix multiple I/O issues in cert generation https://review.openstack.org/643678 | 15:09 |
openstackgerrit | Merged openstack/airship-in-a-bottle master: Fix AIAB gate Heat test & MariaDB failures https://review.openstack.org/644634 | 15:10 |
openstackgerrit | Alexander Hughes proposed openstack/airship-pegleg master: PKI Cert generation and check updates https://review.openstack.org/639414 | 15:14 |
kranthikirang | mattmceuen: re-running ./genesis.sh seems very costly; it restarts Docker, which restarts the whole container workload; What is the flow you follow to update anything in the middle? Build the collected documents and the bundle, then re-run ./genesis.sh? | 15:22 |
openstackgerrit | Lev Morgan proposed openstack/airship-pegleg master: [DNM] Added cleartext option to passphrase generation https://review.openstack.org/645017 | 15:28 |
openstackgerrit | Rick Bartra proposed openstack/airship-divingbell master: Update documentation based on change to using unprivileged containers https://review.openstack.org/647510 | 15:33 |
openstackgerrit | Lev Morgan proposed openstack/airship-pegleg master: Added document wrapping command https://review.openstack.org/644637 | 15:42 |
openstackgerrit | Aaron Sheffield proposed openstack/airship-deckhand master: [WIP] Updating Docker Gate use of zuul.newrev https://review.openstack.org/645825 | 16:09 |
openstackgerrit | Merged openstack/airship-pegleg master: Set salt when generating genesis bundle https://review.openstack.org/642848 | 16:16 |
openstackgerrit | Merged openstack/airship-pegleg master: trivial: fix yapf/pep8 interaction failing on logical operator https://review.openstack.org/619936 | 16:21 |
openstackgerrit | PRATEEK REDDY DODDA proposed openstack/airship-armada master: Implement Security Context for Armada https://review.openstack.org/639207 | 16:22 |
mattmceuen | kranthikirang: yep, if genesis needs to be re-run with updated manifests, then we run those steps again (via automation). Once the genesis process has completed, though, you can generally push additional changes to the site via update_site or update_software APIs on Shipyard | 16:34 |
openstackgerrit | Merged openstack/airship-in-a-bottle master: Add a comment to clarify ingress requirements for MariaDB https://review.openstack.org/647496 | 16:44 |
openstackgerrit | PRATEEK REDDY DODDA proposed openstack/airship-divingbell master: Implement Security Context for Divingbell https://review.openstack.org/641706 | 17:16 |
openstackgerrit | Evgeniy L proposed openstack/airship-in-a-bottle master: [WIP][DNM] debug patch https://review.openstack.org/647567 | 18:06 |
openstackgerrit | Dimitrios Markou proposed openstack/airship-in-a-bottle master: Add bgp peering in virtual airship https://review.openstack.org/642171 | 18:07 |
*** ukk1985 has joined #airshipit | 18:11 | |
openstackgerrit | PRATEEK REDDY DODDA proposed openstack/airship-divingbell master: Implement Security Context for Divingbell https://review.openstack.org/641706 | 18:34 |
openstackgerrit | Dimitrios Markou proposed openstack/airship-in-a-bottle master: Add bgp peering in virtual airship https://review.openstack.org/642171 | 18:35 |
openstackgerrit | PRATEEK REDDY DODDA proposed openstack/airship-divingbell master: Implement Security Context for Divingbell https://review.openstack.org/641706 | 18:45 |
openstackgerrit | PRATEEK REDDY DODDA proposed openstack/airship-armada master: Implement Security Context for Armada https://review.openstack.org/639207 | 19:04 |
*** rihbb has joined #airshipit | 19:07 | |
rihbb | Hi, when I use the update_software shipyard script to make updates to my current site, the ceph-rgw chart seems to fail ("Failed to apply manifest: Exception deploying charts: ['tenant-ceph-rgw']"). Is there some Ceph-related cleanup that one needs to do before running update_software? Thanks! | 19:10 |
evgenyl | rihbb: I don't think that there is anything specific that needs to be run, I did quite a few updates, and have not seen problems with tenant-ceph-rgw, check around this message in armada-api logs, you may be able to find more details, and also check rgw pods in tenant-ceph namespace, maybe you will be able to identify some of them being stuck in init/error states. | 19:14 |
openstackgerrit | PRATEEK REDDY DODDA proposed openstack/airship-divingbell master: Implement Security Context for Divingbell https://review.openstack.org/641706 | 19:14 |
openstackgerrit | PRATEEK REDDY DODDA proposed openstack/airship-armada master: Implement Security Context for Armada https://review.openstack.org/639207 | 19:17 |
*** sthussey has joined #airshipit | 19:18 | |
rihbb | evgenyl: Hi, the armada logs show timeout-related errors - https://paste.ubuntu.com/p/XFjFBwYPKG/. There are no ceph-rgw pods in the tenant-ceph namespace; however, the ceph-rgw pods in the openstack namespace seem to be restarting with no error https://paste.ubuntu.com/p/MfVhTfhHXC/. All the other pods in the tenant-ceph namespace seem to be running & deployed properly. Any idea of what could have gone wrong? | 19:18 |
rihbb | All that I changed before running update_software are the versions of the openstack component images (I didn't touch any Ceph-related config). | 19:20 |
evgenyl | rihbb: Can you show the output of `helm history airship-tenant-ceph-rgw`? | 19:24 |
evgenyl | rihbb: I suspect this is just some timeout related error which just requires re-apply, but let's first check a few things. | 19:24 |
rihbb | REVISION UPDATED STATUS CHART DESCRIPTION | 19:25 |
rihbb | 1 Mon Mar 25 09:41:21 2019 FAILED ceph-rgw-0.1.0 Release "airship-tenant-ceph-rgw" failed: timed out waiti... | 19:25 |
openstackgerrit | Rahul Khiyani proposed openstack/airship-drydock master: Drydock: Add pod/container security context https://review.openstack.org/639197 | 19:26 |
evgenyl | rihbb: And now `helm history -o yaml airship-tenant-ceph-rgw` to see a complete description of the problem. | 19:27 |
rihbb | evgenyl: That also shows timeout condition: | 19:28 |
rihbb | description: 'Release "airship-tenant-ceph-rgw" failed: timed out waiting for the | 19:28 |
rihbb | condition' | 19:28 |
rihbb | revision: 1 | 19:28 |
rihbb | status: FAILED | 19:28 |
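
The `helm history` check above can also be scripted. The following is a minimal sketch, assuming the output of `helm history -o json airship-tenant-ceph-rgw` carries the same `revision`, `status` and `description` fields that the YAML output in this log shows; the helper name `failed_revisions` and the sample data are illustrative only and not part of Airship or Helm.

```python
import json
from typing import List


def failed_revisions(history_json: str) -> List[dict]:
    """Return the release history entries whose status is FAILED.

    `history_json` is assumed to be a JSON list of objects, one per
    revision, as printed by `helm history -o json <release>`.
    """
    entries = json.loads(history_json)
    return [e for e in entries if str(e.get("status", "")).upper() == "FAILED"]


# Sample built from the values quoted in the log; the JSON field layout
# is an assumption that mirrors the YAML output shown above.
SAMPLE_HISTORY = json.dumps([
    {
        "revision": 1,
        "status": "FAILED",
        "chart": "ceph-rgw-0.1.0",
        "description": 'Release "airship-tenant-ceph-rgw" failed: '
                       "timed out waiting for the condition",
    },
])

if __name__ == "__main__":
    for entry in failed_revisions(SAMPLE_HISTORY):
        print(entry["revision"], entry["description"])
```

A failed first revision with a "timed out waiting for the condition" description only says that the chart's resources did not become ready in time; the pod-level checks that follow in the conversation are still needed to find out why.
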
evgenyl | rihbb: `kubectl get pods -o wide --all-namespaces | grep rgw` | 19:30 |
rihbb | evgenyl: https://paste.ubuntu.com/p/m94RV47try/ | 19:34 |
rihbb | logs of the ceph-rgw pod in CrashLoopBackOff state: https://paste.ubuntu.com/p/MfVhTfhHXC/ | 19:35 |
evgenyl | rihbb: Are these all the logs? Can you try to run `kubectl logs` with the `-f` flag and wait until the next round of crash happens? | 19:36 |
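
The `kubectl get pods ... | grep rgw` step can be turned into a small filter that reports only the pods that are not healthy. This is a rough sketch, assuming the default table layout of `kubectl get pods --all-namespaces` (NAMESPACE, NAME, READY, STATUS, RESTARTS as the first five columns); the function name, the restart count and the `ceph-mon-abcde` pod are made up for illustration, and only the `ceph-rgw-679c47b9dd-klwm5` name comes from the log.

```python
from typing import List, Tuple


def unhealthy_pods(kubectl_output: str, name_filter: str = "") -> List[Tuple[str, str, str, int]]:
    """Return (namespace, name, status, restarts) for pods that are not ready.

    Pods in the Completed state (finished jobs) are skipped, since they
    are expected to show 0/N ready containers.
    """
    results = []
    lines = kubectl_output.strip().splitlines()
    for line in lines[1:]:  # skip the header row
        fields = line.split()
        if len(fields) < 5:
            continue
        namespace, name, ready, status, restarts = fields[:5]
        if name_filter and name_filter not in name:
            continue
        if status == "Completed":
            continue
        ready_now, _, ready_total = ready.partition("/")
        if status != "Running" or ready_now != ready_total:
            results.append((namespace, name, status, int(restarts)))
    return results


# Hypothetical output; the real pastes from the log are not reproduced here.
SAMPLE_OUTPUT = """
NAMESPACE     NAME                        READY   STATUS             RESTARTS   AGE
openstack     ceph-rgw-679c47b9dd-klwm5   0/1     CrashLoopBackOff   120        9h
tenant-ceph   ceph-mon-abcde              1/1     Running            0          9h
"""

if __name__ == "__main__":
    for pod in unhealthy_pods(SAMPLE_OUTPUT, name_filter="rgw"):
        print(pod)
```
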
evgenyl | rihbb: Just to make sure that we catch an actual error that caused the crash. | 19:37 |
rihbb | evgenyl: kubectl logs command exits with the same log ^ when the new restart happens; but kubectl describe shows Warning Unhealthy 14m (x1872 over 9h) kubelet, node-4 Readiness probe failed: Get http://10.97.232.89:8088/: dial tcp 10.97.232.89:8088: getsockopt: connection refused | 19:41 |
openstackgerrit | Rahul Khiyani proposed openstack/airship-maas master: Maas: Add pod/container security context https://review.openstack.org/639200 | 19:51 |
evgenyl | rihbb: Hmm, so it fails on `ceph version 13.2.2` without any other messages? | 19:52 |
evgenyl | ^ I mean on printing its version? | 19:53 |
rihbb | evgenyl: Yes (2019-03-25 19:49:51.716 7fa3387988c0 0 ceph version 13.2.2 (02899bfda814146b021136e9d8e80eba494e1126) mimic (stable), process radosgw, pid 17) | 19:54 |
evgenyl | rihbb: Can you try to force a recreate of e.g. the `ceph-rgw-679c47b9dd-klwm5` pod by deleting it with `kubectl delete pod -n openstack ceph-rgw-679c47b9dd-klwm5`, waiting until it gets recreated under a different name, and following the logs when it starts? | 19:59 |
openstackgerrit | Rahul Khiyani proposed openstack/airship-drydock master: Drydock: Add pod/container security context https://review.openstack.org/639197 | 20:02 |
openstackgerrit | Rahul Khiyani proposed openstack/airship-maas master: Maas: Add pod/container security context https://review.openstack.org/639200 | 20:03 |
rihbb | evgenyl: This is what the logs look like once it gets created for the first time: https://paste.ubuntu.com/p/wpYY62QPMD/. | 20:04 |
evgenyl | rihbb: Has it crashed? | 20:05 |
rihbb | evgenyl: yes | 20:05 |
evgenyl | rihbb: Can you now show the output of `kubectl describe ...`? | 20:05 |
openstackgerrit | Alexander Hughes proposed openstack/airship-pegleg master: PKI Cert generation and check updates https://review.openstack.org/639414 | 20:07 |
rihbb | evgenyl: Sure, https://paste.ubuntu.com/p/H8bjtDbPrQ/ | 20:09 |
openstackgerrit | Dimitrios Markou proposed openstack/airship-in-a-bottle master: Add bgp peering in virtual airship https://review.openstack.org/642171 | 20:12 |
openstackgerrit | PRATEEK REDDY DODDA proposed openstack/airship-armada master: Implement Security Context for Armada https://review.openstack.org/639207 | 20:12 |
openstackgerrit | Alexander Hughes proposed openstack/airship-pegleg master: PKI Cert generation and check updates https://review.openstack.org/639414 | 20:26 |
evgenyl | rihbb: It's hard to tell what is going on, I don't think that failed readiness probe causes crashes, I'm wondering if we can increase logging level for rgw, but I'm surprised that it fails silently with no messages. | 20:27 |
evgenyl | rihbb: Can you provide a bit more detail on what changes you have applied to the manifests? | 20:27 |
openstackgerrit | Stas Egorov proposed openstack/airship-in-a-bottle master: Fixed sudo env vars for apt https://review.openstack.org/647598 | 20:34 |
openstackgerrit | Rahul Khiyani proposed openstack/airship-shipyard master: Shipyard and Airflow: Add pod/container security context https://review.openstack.org/639195 | 20:40 |
*** mfuller_ has quit IRC | 20:42 | |
*** mcfuller has joined #airshipit | 20:42 | |
openstackgerrit | Rahul Khiyani proposed openstack/airship-shipyard master: Shipyard and Airflow: Add pod/container security context https://review.openstack.org/639195 | 20:43 |
mcfuller | Hello, I was curious about the absence of a global tempest chart in treasuremap. Is the preferred method for tempest testing to set the run_tempest flag as a value override for individual helm / armada charts? | 20:44 |
rihbb | evgenyl: For this case, I had modified versions.yaml - https://paste.ubuntu.com/p/s3JTFBmwGB/ | 20:51 |
*** ukk1985 has quit IRC | 20:52 | |
evgenyl | rihbb: The only idea I have right now is to try to increase the logging level. You can do that using `kubectl edit configmap -n openstack ceph-rgw-etc`; you will need to add `debug_rgw = 10/5` to the `[global]` section, see http://docs.ceph.com/docs/mimic/rados/troubleshooting/log-and-debug/ for details. | 20:56 |
rihbb | evgenyl: Thanks, will try that. | 21:00 |
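
The configuration change suggested above amounts to adding one key to the `[global]` section of the Ceph configuration stored in the configmap. Below is a minimal sketch of that edit on the INI text itself, using Python's standard `configparser`; it assumes the text has already been extracted from the `ceph-rgw-etc` configmap by some other means, and the sample configuration and the helper name `set_rgw_debug` are hypothetical. Note that `configparser` drops comments when it rewrites the file.

```python
import configparser
import io


def set_rgw_debug(conf_text: str, level: str = "10/5") -> str:
    """Return `conf_text` with `debug_rgw = <level>` set in [global]."""
    parser = configparser.ConfigParser(interpolation=None, strict=False)
    parser.optionxform = str  # keep option names exactly as written
    parser.read_string(conf_text)
    if not parser.has_section("global"):
        parser.add_section("global")
    parser.set("global", "debug_rgw", level)
    out = io.StringIO()
    parser.write(out)
    return out.getvalue()


# Hypothetical configuration text, not taken from the site in the log.
SAMPLE_CONF = """[global]
cephx = true

[osd]
journal_size = 10240
"""

if __name__ == "__main__":
    print(set_rgw_debug(SAMPLE_CONF))
```

In the `10/5` value, the first number is the log level written to the log file and the second is the in-memory level, as described in the Ceph logging documentation linked above.
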
rihbb | evgenyl: The logs after increasing logging level (& before entering restart mode) look like this - https://paste.ubuntu.com/p/QCYZFwyq8N/. | 21:15 |
rihbb | No error message as such in the entire log. | 21:16 |
openstackgerrit | Lev Morgan proposed openstack/airship-pegleg master: Added DeploymentData document generation https://review.openstack.org/647615 | 21:21 |
evgenyl | rihbb: This is very strange :) Can you try checking /var/log/syslog (on the node where rgw fails) and see if there is anything interesting related to rgw pod? | 21:23 |
openstackgerrit | Scott Hussey proposed openstack/airship-in-a-bottle master: (multinode) Make disk layout flexible https://review.openstack.org/638040 | 21:30 |
openstackgerrit | Scott Hussey proposed openstack/airship-in-a-bottle master: Network enhancements for gate-multinode https://review.openstack.org/634837 | 21:30 |
kranthikirang | mattmceuen: I have increased the timeouts to 600 for mariadb and rabbit and observed the same failure; After inspecting the logs I see two reasons for the failure; | 21:34 |
kranthikirang | mattmceuen: http://paste.openstack.org/show/748340/ - rabbitmq logs | 21:35 |
kranthikirang | mattmceuen: http://paste.openstack.org/show/748341/ - mariadb logs | 21:36 |
kranthikirang | mattmceuen: I also see the readinessProbe failing for both pods; they didn't become ready within 600 seconds, hence Armada reports failures; Can you help me find the root cause for these two? Also, how do I change the readinessProbe value for a chart? I have deployed rabbitmq directly using openstack-helm-infra charts but never encountered these failures | 21:37 |
kranthikirang | mattmceuen: probably my HP Gen9 v4 isn't sufficient, but that's weird since it's a 56-CPU host with 256GB memory | 21:39 |
openstackgerrit | Scott Hussey proposed openstack/airship-in-a-bottle master: [WIP] Network enhancements for gate-multinode https://review.openstack.org/634837 | 21:48 |
*** rihbb has left #airshipit | 21:49 | |
openstackgerrit | PRATEEK REDDY DODDA proposed openstack/airship-divingbell master: Implement Security Context for Divingbell https://review.openstack.org/641706 | 21:56 |
openstackgerrit | Georg Kunz proposed openstack/airship-in-a-bottle master: [WIP] Configuration for testing DPDK in multi-node AIAB https://review.openstack.org/634207 | 21:58 |
*** michaelbeaver has joined #airshipit | 22:00 | |
*** michael-beaver has quit IRC | 22:04 | |
openstackgerrit | Rahul Khiyani proposed openstack/airship-promenade master: Add pod/container security context https://review.openstack.org/639189 | 22:06 |
*** michaelbeaver has quit IRC | 22:06 | |
openstackgerrit | Rahul Khiyani proposed openstack/airship-shipyard master: Shipyard and Airflow: Add pod/container security context https://review.openstack.org/639195 | 22:18 |
openstackgerrit | Anthony Bellino proposed openstack/airship-divingbell master: [WIP] Initial Ansible Daemonset https://review.openstack.org/640539 | 22:38 |
*** kranthikirang has quit IRC | 22:39 | |
openstackgerrit | Rahul Khiyani proposed openstack/airship-maas master: Maas: Add pod/container security context https://review.openstack.org/639200 | 22:42 |
*** aaronsheffield has quit IRC | 22:56 | |
*** sthussey has quit IRC | 23:57 |