openstackgerrit | Adrian Turjak proposed openstack/project-config master: Fork Gnocchi back to openstack as Farfalle https://review.opendev.org/744592 | 00:06 |
---|---|---|
*** xiaolin has joined #opendev | 01:32 | |
*** stephenfin has quit IRC | 02:04 | |
*** stephenfin has joined #opendev | 02:08 | |
*** xiaolin has quit IRC | 05:27 | |
*** ysandeep is now known as ysandeep|afk | 05:52 | |
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder master: Copy apt gpg keys directly into trusted.gpg.d https://review.opendev.org/747810 | 05:59 |
*** DSpider has joined #opendev | 07:21 | |
*** zbr has quit IRC | 07:23 | |
*** zbr has joined #opendev | 07:24 | |
*** tosky has joined #opendev | 07:35 | |
*** moppy has quit IRC | 08:01 | |
*** moppy has joined #opendev | 08:01 | |
*** hashar has joined #opendev | 08:03 | |
*** ysandeep|afk is now known as ysandeep | 08:15 | |
*** kevinz has joined #opendev | 08:32 | |
kevinz | ianw: Thanks for letting me know https://review.opendev.org/#/c/747063/ , I will take a look at that | 08:32 |
*** ysandeep is now known as ysandeep|lunch | 09:32 | |
*** stephenfin has quit IRC | 09:55 | |
*** stephenfin has joined #opendev | 09:56 | |
*** stephenfin has quit IRC | 10:40 | |
*** stephenfin has joined #opendev | 10:41 | |
*** stephenfin has quit IRC | 10:47 | |
*** xiaolin has joined #opendev | 10:47 | |
*** stephenfin has joined #opendev | 10:56 | |
*** ysandeep|lunch is now known as ysandeep | 11:01 | |
*** stephenfin has quit IRC | 11:02 | |
*** stephenfin has joined #opendev | 11:02 | |
*** stephenfin has quit IRC | 11:07 | |
*** stephenfin has joined #opendev | 11:09 | |
*** stephenfin has quit IRC | 11:16 | |
*** stephenfin has joined #opendev | 11:21 | |
*** stephenfin has quit IRC | 11:25 | |
*** stephenfin has joined #opendev | 11:38 | |
*** stephenfin has quit IRC | 11:43 | |
openstackgerrit | Sorin Sbarnea (zbr) proposed zuul/zuul-jobs master: Partial address ansible-lint E208 https://review.opendev.org/748480 | 11:49 |
openstackgerrit | Sorin Sbarnea (zbr) proposed zuul/zuul-jobs master: WIP: E208 work https://review.opendev.org/748606 | 11:50 |
*** xiaolin has quit IRC | 11:57 | |
*** stephenfin has joined #opendev | 11:59 | |
*** stephenfin has quit IRC | 12:04 | |
openstackgerrit | Adrian Turjak proposed openstack/project-config master: Fork Gnocchi back to openstack as Farfalle https://review.opendev.org/744592 | 12:06 |
*** stephenfin has joined #opendev | 12:11 | |
*** stephenfin has quit IRC | 12:15 | |
*** stephenfin has joined #opendev | 12:23 | |
*** stephenfin has quit IRC | 12:28 | |
openstackgerrit | Sorin Sbarnea (zbr) proposed zuul/zuul-jobs master: Partial address ansible-lint E208 https://review.opendev.org/748480 | 12:46 |
openstackgerrit | Sorin Sbarnea (zbr) proposed zuul/zuul-jobs master: Partial address ansible-lint E208 https://review.opendev.org/748480 | 13:12 |
openstackgerrit | Adrian Turjak proposed openstack/project-config master: Fork Gnocchi back to openstack as Farfalle https://review.opendev.org/744592 | 13:22 |
*** ysandeep is now known as ysandeep|brb | 13:23 | |
*** ysandeep|brb is now known as ysandeep | 13:26 | |
openstackgerrit | Sorin Sbarnea (zbr) proposed zuul/zuul-jobs master: WIP: E208 work https://review.opendev.org/748606 | 13:26 |
openstackgerrit | Adrian Turjak proposed openstack/project-config master: Fork Gnocchi back to openstack as Farfalle https://review.opendev.org/744592 | 14:02 |
*** stephenfin_ has joined #opendev | 14:14 | |
*** stephenfin_ is now known as stephenfin | 14:37 | |
*** ysandeep is now known as ysandeep|away | 14:38 | |
yoctozepto | morning | 14:41 |
yoctozepto | any idea why https://review.opendev.org/711601 says failed when all tasks are green? | 14:41 |
AJaeger | yoctozepto: I see " kolla-ansible-centos-source-c7-8-migration FAILURE in 2h 12m 59s" on 711601 | 14:43 |
AJaeger | yoctozepto: Oh, you mean tasks inside? | 14:44 |
AJaeger | yoctozepto: I see no "run" playbook logs, only post and pre | 14:44 |
openstackgerrit | Denis proposed zuul/zuul-jobs master: [terraform role] Add option to set the color arg on terraform commands https://review.opendev.org/748679 | 14:46 |
openstackgerrit | Sorin Sbarnea (zbr) proposed zuul/zuul-jobs master: Add managed jobs to periodic https://review.opendev.org/748682 | 14:47 |
AJaeger | yoctozepto: I'm not actually seeing the problem ;8 | 14:47 |
clarkb | usually that means the run playbook timed out | 14:49 |
clarkb | check the job-output.txt | 14:49 |
yoctozepto | clarkb: ah, thanks, though I expected TIMED_OUT ;d | 14:50 |
yoctozepto | AJaeger: clarkb seems to be right based on the total run time | 14:50 |
yoctozepto | but it is confusing | 14:50 |
AJaeger | indeed | 14:52 |
openstackgerrit | Dmitriy Rabotyagov (noonedeadpunk) proposed openstack/project-config master: Add openstack-ansible/os_senlin role https://review.opendev.org/748683 | 14:53 |
yoctozepto | it still says | 14:53 |
yoctozepto | 2020-08-26 16:19:58.267731 | RUN END RESULT_NORMAL: [untrusted : opendev.org/openstack/kolla-ansible/tests/run.yml@stable/train] | 14:53 |
yoctozepto | in the log | 14:53 |
yoctozepto | but yeah, it's crippled: | 14:54 |
yoctozepto | 2020-08-26 16:19:58.141215 | TASK [Run check-logs.sh script executable=/bin/bash, chdir={{ kolla_ansible_src_dir }}, _raw_params=check-logs.sh] *** | 14:54 |
yoctozepto | 2020-08-26 16:19:58.267731 | RUN END RESULT_NORMAL: [untrusted : opendev.org/openstack/kolla-ansible/tests/run.yml@stable/train] | 14:54 |
yoctozepto | the timeout is set to 9000s so it has not been hit - I guess it's some opendev-wide mechanism to kill long jobs? clarkb | 14:57 |
*** qchris has quit IRC | 14:57 | |
clarkb | I forget what the ceiling is but there is one | 14:58 |
clarkb | run end result normal doesn't look like a timeout though | 14:58 |
*** hashar has quit IRC | 15:01 | |
openstackgerrit | Sorin Sbarnea (zbr) proposed opendev/system-config master: Add zuul-jobs-failures list https://review.opendev.org/748688 | 15:01 |
*** qchris has joined #opendev | 15:09 | |
openstackgerrit | Dmitriy Rabotyagov (noonedeadpunk) proposed openstack/project-config master: Add openstack-ansible/os_senlin role https://review.opendev.org/748683 | 15:10 |
openstackgerrit | Dmitriy Rabotyagov (noonedeadpunk) proposed openstack/project-config master: Add os_senlin to zuul projects https://review.opendev.org/748693 | 15:15 |
openstackgerrit | Ken Giusti proposed openstack/project-config master: Retire the devstack-plugin-zmq project https://review.opendev.org/745344 | 15:16 |
openstackgerrit | Ken Giusti proposed openstack/project-config master: Retire devstack-plugin-pika project https://review.opendev.org/745342 | 15:17 |
openstackgerrit | Sorin Sbarnea (zbr) proposed opendev/system-config master: Refreshed ansible-lint config https://review.opendev.org/748695 | 15:22 |
openstackgerrit | Sorin Sbarnea (zbr) proposed openstack/project-config master: Assure periodic-weekly emails to zuul-jobs-failures https://review.opendev.org/748699 | 15:34 |
openstackgerrit | Sorin Sbarnea (zbr) proposed openstack/project-config master: Assure periodic-weekly emails to zuul-jobs-failures https://review.opendev.org/748699 | 15:42 |
openstackgerrit | Ken Giusti proposed openstack/project-config master: Retire devstack-plugin-pika project https://review.opendev.org/748705 | 15:48 |
openstackgerrit | Dmitriy Rabotyagov (noonedeadpunk) proposed openstack/project-config master: Add openstack-ansible/os_senlin role https://review.opendev.org/748683 | 15:52 |
*** fressi has joined #opendev | 15:56 | |
*** fressi has left #opendev | 16:05 | |
openstackgerrit | Ken Giusti proposed openstack/project-config master: Retire devstack-plugin-pika project https://review.opendev.org/748712 | 16:07 |
openstackgerrit | Sorin Sbarnea (zbr) proposed zuul/zuul-jobs master: More E208 mode fixes https://review.opendev.org/748498 | 16:12 |
openstackgerrit | Sorin Sbarnea (zbr) proposed opendev/base-jobs master: Set file modes explicitly https://review.opendev.org/748478 | 16:56 |
*** SotK has quit IRC | 17:03 | |
*** tosky has quit IRC | 17:03 | |
*** SotK has joined #opendev | 17:03 | |
*** tosky has joined #opendev | 17:04 | |
openstackgerrit | Sorin Sbarnea (zbr) proposed opendev/base-jobs master: Set file modes explicitly https://review.opendev.org/748478 | 17:12 |
openstackgerrit | Ken Giusti proposed openstack/project-config master: Retire the devstack-plugin-zmq project https://review.opendev.org/745344 | 17:31 |
openstackgerrit | Ken Giusti proposed openstack/project-config master: Retire the devstack-plugin-zmq project https://review.opendev.org/748724 | 17:31 |
*** tosky has quit IRC | 17:44 | |
openstackgerrit | Sorin Sbarnea (zbr) proposed opendev/base-jobs master: Set file modes explicitly https://review.opendev.org/748478 | 17:50 |
clarkb | #status log Restarted zuul-web and zuul-fingergw to pick up web updates, primarily the one that fixes keyboard scrolling. | 17:58 |
openstackstatus | clarkb: finished logging | 17:58 |
clarkb | popping out for a bike ride now that it appears zuul-web is stable | 18:45 |
openstackgerrit | Sorin Sbarnea (zbr) proposed zuul/zuul-jobs master: Remove dependency on pkg_resources https://review.opendev.org/748737 | 18:45 |
openstackgerrit | Ghanshyam Mann proposed opendev/irc-meetings master: Update Technical Committee office hours new time https://review.opendev.org/748740 | 18:57 |
openstackgerrit | Sorin Sbarnea (zbr) proposed zuul/zuul-jobs master: Remove dependency on pkg_resources https://review.opendev.org/748737 | 19:00 |
smcginnis | Hmm, latest upgrade of zuul may have some issues. Jobs in queue that have hit retry_limit redirect to the zuul tenants page. | 21:24 |
clarkb | smcginnis: do they work if you search them through the builds page? | 21:25 |
clarkb | (wondering how widespread that problem is) | 21:25 |
smcginnis | Doesn't even show up if I try to filter by it on the Builds tab. | 21:27 |
smcginnis | As an example, looking at tempest-integrated-storage for patch 740384 in the gate queue right now. | 21:28 |
clarkb | they won't be available in builds until they report | 21:28 |
clarkb | (that's normal) | 21:28 |
clarkb | so I think it may not be the zuul-web update | 21:30 |
clarkb | the retry limits are pretty widespread, I think what may be happening is something is failing early enough in the job run that it doesn't really have a uuid? | 21:30 |
clarkb | er not that it doesn't have a uuid but that the zuul_return that sets log url doesn't run yet | 21:31 |
clarkb | which means it falls back on a default which isn't valid and ends up at the tenant page | 21:31 |
smcginnis | Several jobs failing with retry_limit for all the jobs in gate right now, so something's definitely up. But I guess we'll see once they finally report. | 21:31 |
*** tosky has joined #opendev | 21:33 | |
clarkb | I'm trying to pull up console logs and see if I can catch one | 21:34 |
corvus | i'm looking in logs | 21:34 |
corvus | the one i picked at random just failed with this task in a pre-playbook: TASK [Run bootstrap-aio script chdir=src/opendev.org/openstack/openstack-ansible, executable=/bin/bash, _raw_params=scripts/bootstrap-aio.sh] *** | 21:36 |
corvus | that was build eec9a18aa0e1452891eadd56e7dfba44 of openstack-ansible-deploy-aio_distro_metal-opensuse-15 | 21:37 |
clarkb | it seems to be affecting a large variety of jobs which would make a specific thing like that weird, unless it's like network connectivity? | 21:37 |
clarkb | https://zuul.opendev.org/t/openstack/build/82fcb124e6c24e178f6ffd82b3ae5597/logs I'm going to try and find that one on an executor | 21:38 |
clarkb | "took 1 sec" | 21:39 |
corvus | ze08 has a read only fs | 21:39 |
corvus | i may have just had the bad luck to find a real failure on my first attempt; the second sent me to ze08 where i found the readonly fs | 21:39 |
clarkb | that would do it | 21:39 |
clarkb | corvus: I guess stop zuul-executor there then reboot? | 21:40 |
clarkb | let me see if there are any emails about that host | 21:40 |
corvus | yeah, i'll stop ze08 now, will hold for more investigation before rebooting | 21:40 |
smcginnis | Darn clouds. Just can't trust em. | 21:40 |
clarkb | ze04 has a message from a couple days ago, I don't see anything from rax about ze08 | 21:41 |
clarkb | smcginnis: thank you for reporting | 21:41 |
corvus | can't docker-compose down because of read-only fs | 21:42 |
clarkb | that's a fun behavior | 21:42 |
corvus | but the process is stopped | 21:42 |
smcginnis | I just have good timing I guess. | 21:42 |
corvus | i think it may have done the kill but not the cleanup | 21:42 |
clarkb | ah | 21:42 |
corvus | will be interesting to see what happens on boot | 21:42 |
clarkb | since we don't see anything from rax I would say rebooting is fine? it's also easy enough to completely rebuild the host if it comes to that | 21:42 |
corvus | [857571.914327] print_req_error: I/O error, dev xvde, sector 162962769 | 21:43 |
corvus | [857571.914441] Buffer I/O error on dev xvde2, logical block 18370346, lost sync page write | 21:44 |
clarkb | xvde is the ephemeral device (not a cinder volume) | 21:44 |
corvus | that's /var/lib/zuul | 21:44 |
corvus | both it and / are ro | 21:44 |
clarkb | I expect they are both on the hypervisor | 21:45 |
corvus | swap also affected (but it's also xvde) | 21:45 |
clarkb | corvus: did you down or stop the docker-compose containers ? I think stop will restart them on boot. We may want to use down if not yet done so that we can check the fs's before starting services? | 21:45 |
corvus | clarkb: i did down which failed | 21:45 |
corvus | so i'm real fuzzy on what will happen on boot :/ | 21:45 |
clarkb | I'm 99% sure down means the containers are removed entirely so there is nothing to start on boot | 21:46 |
clarkb | stop stops running the containers but they are still registered with docker so when docker starts they will start | 21:46 |
corvus | clarkb: right, but it failed | 21:46 |
clarkb | oh good point | 21:46 |
corvus | i have no other ideas of anything to do now before rebooting | 21:47 |
smcginnis | With ro, probably not much you can do at this point. | 21:47 |
clarkb | ya I think we should reboot and then check and see what it looks like when up again | 21:47 |
corvus | doing | 21:47 |
corvus | looks like it started the executor container/process | 21:50 |
clarkb | mount output looks good though | 21:50 |
corvus | it's cleaning up stale jobdirs | 21:50 |
corvus | starting | 21:55 |
clarkb | it seems happy? I haven't seen anything scroll by in the logs that jumps out yet at least | 21:57 |
corvus | ditto | 21:58 |
clarkb | #status log ze08 ended up with read only filesystems and has been rebooted. This resulted in many retry_limit errors. | 22:09 |
openstackstatus | clarkb: finished logging | 22:09 |
clarkb | corvus: do you think we should do a status notice that jobs can be rechecked? | 22:09 |
corvus | clarkb: maybe so, looks like there were a lot that hit the limit | 22:10 |
clarkb | what about #status notice A zuul server ended up with read only filesystems which caused many jobs to hit retry_limit. The server has been rebooted and appers happy. Jobs can be rechecked. | 22:11 |
corvus | clarkb: typo 'appers' but otherwise lgtm | 22:11 |
clarkb | #status notice A zuul server ended up with read only filesystems which caused many jobs to hit retry_limit. The server has been rebooted and appears happy. Jobs can be rechecked. | 22:12 |
openstackstatus | clarkb: sending notice | 22:12 |
clarkb | fixed the typi | 22:12 |
clarkb | and made a new one :) | 22:12 |
-openstackstatus- NOTICE: A zuul server ended up with read only filesystems which caused many jobs to hit retry_limit. The server has been rebooted and appears happy. Jobs can be rechecked. | 22:12 | |
clarkb | smcginnis: are there any openstack release related jobs we need to reenqueue into zuul? | 22:15 |
openstackstatus | clarkb: finished sending notice | 22:15 |
openstackgerrit | Merged opendev/irc-meetings master: Update Technical Committee office hours new time https://review.opendev.org/748740 | 22:20 |
clarkb | corvus: before you call it a week have a moment for https://review.opendev.org/#/c/748040/ I think that will get limestone working again for us | 22:50 |
corvus | done | 23:01 |
clarkb | tyty | 23:02 |
*** tosky has quit IRC | 23:13 | |
openstackgerrit | Merged opendev/system-config master: Update the limestone cert in our clouds.yaml https://review.opendev.org/748040 | 23:23 |
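
The ze08 incident above (21:39 onward) came down to `/` and `/var/lib/zuul` being mounted read-only. As a rough illustration only, and not something anyone in the channel ran, a check for that condition could parse the Linux `/proc/mounts` format as sketched below. The function name `read_only_mounts` and the default list of watched mount points are hypothetical, chosen for this example.

```python
def read_only_mounts(mounts_text, watched=("/", "/var/lib/zuul")):
    """Return the watched mount points that are currently mounted read-only.

    mounts_text is expected to be in /proc/mounts format:
    device, mount point, fs type, comma-separated options, dump, pass.
    The watched tuple is an assumption for this sketch.
    """
    read_only = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point = fields[1]
        options = fields[3].split(",")
        # "ro" must be a whole option; "errors=remount-ro" does not count
        if mount_point in watched and "ro" in options:
            read_only.append(mount_point)
    return read_only
```

On a Linux host the function would be fed the contents of `/proc/mounts`; an empty list means none of the watched filesystems has been remounted read-only.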