Friday, 2020-08-28

openstackgerritAdrian Turjak proposed openstack/project-config master: Fork Gnocchi back to openstack as Farfalle  https://review.opendev.org/74459200:06
*** xiaolin has joined #opendev01:32
*** stephenfin has quit IRC02:04
*** stephenfin has joined #opendev02:08
*** xiaolin has quit IRC05:27
*** ysandeep is now known as ysandeep|afk05:52
openstackgerritIan Wienand proposed openstack/diskimage-builder master: Copy apt gpg keys directly into trusted.gpg.d  https://review.opendev.org/74781005:59
*** DSpider has joined #opendev07:21
*** zbr has quit IRC07:23
*** zbr has joined #opendev07:24
*** tosky has joined #opendev07:35
*** moppy has quit IRC08:01
*** moppy has joined #opendev08:01
*** hashar has joined #opendev08:03
*** ysandeep|afk is now known as ysandeep08:15
*** kevinz has joined #opendev08:32
kevinzianw: Thanks for letting me know  https://review.opendev.org/#/c/747063/ , I will take a look at that08:32
*** ysandeep is now known as ysandeep|lunch09:32
*** stephenfin has quit IRC09:55
*** stephenfin has joined #opendev09:56
*** stephenfin has quit IRC10:40
*** stephenfin has joined #opendev10:41
*** stephenfin has quit IRC10:47
*** xiaolin has joined #opendev10:47
*** stephenfin has joined #opendev10:56
*** ysandeep|lunch is now known as ysandeep11:01
*** stephenfin has quit IRC11:02
*** stephenfin has joined #opendev11:02
*** stephenfin has quit IRC11:07
*** stephenfin has joined #opendev11:09
*** stephenfin has quit IRC11:16
*** stephenfin has joined #opendev11:21
*** stephenfin has quit IRC11:25
*** stephenfin has joined #opendev11:38
*** stephenfin has quit IRC11:43
openstackgerritSorin Sbarnea (zbr) proposed zuul/zuul-jobs master: Partial address ansible-lint E208  https://review.opendev.org/74848011:49
openstackgerritSorin Sbarnea (zbr) proposed zuul/zuul-jobs master: WIP: E208 work  https://review.opendev.org/74860611:50
*** xiaolin has quit IRC11:57
*** stephenfin has joined #opendev11:59
*** stephenfin has quit IRC12:04
openstackgerritAdrian Turjak proposed openstack/project-config master: Fork Gnocchi back to openstack as Farfalle  https://review.opendev.org/74459212:06
*** stephenfin has joined #opendev12:11
*** stephenfin has quit IRC12:15
*** stephenfin has joined #opendev12:23
*** stephenfin has quit IRC12:28
openstackgerritSorin Sbarnea (zbr) proposed zuul/zuul-jobs master: Partial address ansible-lint E208  https://review.opendev.org/74848012:46
openstackgerritSorin Sbarnea (zbr) proposed zuul/zuul-jobs master: Partial address ansible-lint E208  https://review.opendev.org/74848013:12
openstackgerritAdrian Turjak proposed openstack/project-config master: Fork Gnocchi back to openstack as Farfalle  https://review.opendev.org/74459213:22
*** ysandeep is now known as ysandeep|brb13:23
*** ysandeep|brb is now known as ysandeep13:26
openstackgerritSorin Sbarnea (zbr) proposed zuul/zuul-jobs master: WIP: E208 work  https://review.opendev.org/74860613:26
openstackgerritAdrian Turjak proposed openstack/project-config master: Fork Gnocchi back to openstack as Farfalle  https://review.opendev.org/74459214:02
*** stephenfin_ has joined #opendev14:14
*** stephenfin_ is now known as stephenfin14:37
*** ysandeep is now known as ysandeep|away14:38
yoctozeptomorning14:41
yoctozeptoany idea why https://review.opendev.org/711601 says failed when all tasks are green?14:41
AJaegeryoctozepto: I see "    kolla-ansible-centos-source-c7-8-migration FAILURE in 2h 12m 59s" on 71160114:43
AJaegeryoctozepto: Oh, you mean tasks inside?14:44
AJaegeryoctozepto: I see no "run" playbook logs, only post and pre14:44
openstackgerritDenis proposed zuul/zuul-jobs master: [terraform role] Add option to set the color arg on terraform commands  https://review.opendev.org/74867914:46
openstackgerritSorin Sbarnea (zbr) proposed zuul/zuul-jobs master: Add managed jobs to periodic  https://review.opendev.org/74868214:47
AJaegeryoctozepto: I'm not actually seeing the problem ;814:47
clarkbusually that means the run playbook timed out14:49
clarkbcheck the job-output.txt14:49
yoctozeptoclarkb: ah, thanks, though I expected TIMED_OUT ;d14:50
yoctozeptoAJaeger: clarkb seems to be right based on the total run time14:50
yoctozeptobut it is confusing14:50
AJaegerindeed14:52
openstackgerritDmitriy Rabotyagov (noonedeadpunk) proposed openstack/project-config master: Add openstack-ansible/os_senlin role  https://review.opendev.org/74868314:53
yoctozeptoit still says14:53
yoctozepto2020-08-26 16:19:58.267731 | RUN END RESULT_NORMAL: [untrusted : opendev.org/openstack/kolla-ansible/tests/run.yml@stable/train]14:53
yoctozeptoin the log14:53
yoctozeptobut yeah, it's crippled:14:54
yoctozepto2020-08-26 16:19:58.141215 | TASK [Run check-logs.sh script executable=/bin/bash, chdir={{ kolla_ansible_src_dir }}, _raw_params=check-logs.sh] ***14:54
yoctozepto2020-08-26 16:19:58.267731 | RUN END RESULT_NORMAL: [untrusted : opendev.org/openstack/kolla-ansible/tests/run.yml@stable/train]14:54
yoctozeptothe timeout is set to 9000s so it has not been hit - I guess it's some opendev-wide mechanism to kill long jobs? clarkb14:57
*** qchris has quit IRC14:57
clarkbI forget what the ceiling is but there is one14:58
clarkbrun end result normal doesn't look like a timeout though14:58
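Note: the check clarkb suggests, looking in job-output.txt for how the run playbook ended, can be sketched as below. This is only a minimal illustration, assuming the `timestamp | RUN END RESULT_...: [playbook]` line shape quoted above; the name `run_end_results` is hypothetical and not part of Zuul.

```python
import re

# Matches lines shaped like the one quoted in the log:
# <timestamp> | RUN END RESULT_NORMAL: [untrusted : <playbook>@<branch>]
RUN_END = re.compile(r"\| RUN END (RESULT_\w+): \[(.+)\]\s*$")


def run_end_results(lines):
    """Return (result, playbook) pairs for every RUN END line found."""
    found = []
    for line in lines:
        match = RUN_END.search(line)
        if match:
            found.append((match.group(1), match.group(2)))
    return found


# Sample lines copied from the messages above.
sample = [
    "2020-08-26 16:19:58.141215 | TASK [Run check-logs.sh script executable=/bin/bash, chdir={{ kolla_ansible_src_dir }}, _raw_params=check-logs.sh] ***",
    "2020-08-26 16:19:58.267731 | RUN END RESULT_NORMAL: [untrusted : opendev.org/openstack/kolla-ansible/tests/run.yml@stable/train]",
]

for result, playbook in run_end_results(sample):
    print(result, playbook)
```

A result other than a timeout on the run playbook, as here, suggests the failure came from a task inside the playbook rather than from the job timeout.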
*** hashar has quit IRC15:01
openstackgerritSorin Sbarnea (zbr) proposed opendev/system-config master: Add zuul-jobs-failures list  https://review.opendev.org/74868815:01
*** qchris has joined #opendev15:09
openstackgerritDmitriy Rabotyagov (noonedeadpunk) proposed openstack/project-config master: Add openstack-ansible/os_senlin role  https://review.opendev.org/74868315:10
openstackgerritDmitriy Rabotyagov (noonedeadpunk) proposed openstack/project-config master: Add os_senlin to zuul projects  https://review.opendev.org/74869315:15
openstackgerritKen Giusti proposed openstack/project-config master: Retire the devstack-plugin-zmq project  https://review.opendev.org/74534415:16
openstackgerritKen Giusti proposed openstack/project-config master: Retire devstack-plugin-pika project  https://review.opendev.org/74534215:17
openstackgerritSorin Sbarnea (zbr) proposed opendev/system-config master: Refreshed ansible-lint config  https://review.opendev.org/74869515:22
openstackgerritSorin Sbarnea (zbr) proposed openstack/project-config master: Assure periodic-weekly emails to zuul-jobs-failures  https://review.opendev.org/74869915:34
openstackgerritSorin Sbarnea (zbr) proposed openstack/project-config master: Assure periodic-weekly emails to zuul-jobs-failures  https://review.opendev.org/74869915:42
openstackgerritKen Giusti proposed openstack/project-config master: Retire devstack-plugin-pika project  https://review.opendev.org/74870515:48
openstackgerritDmitriy Rabotyagov (noonedeadpunk) proposed openstack/project-config master: Add openstack-ansible/os_senlin role  https://review.opendev.org/74868315:52
*** fressi has joined #opendev15:56
*** fressi has left #opendev16:05
openstackgerritKen Giusti proposed openstack/project-config master: Retire devstack-plugin-pika project  https://review.opendev.org/74871216:07
openstackgerritSorin Sbarnea (zbr) proposed zuul/zuul-jobs master: More E208 mode fixes  https://review.opendev.org/74849816:12
openstackgerritSorin Sbarnea (zbr) proposed opendev/base-jobs master: Set file modes explicitly  https://review.opendev.org/74847816:56
*** SotK has quit IRC17:03
*** tosky has quit IRC17:03
*** SotK has joined #opendev17:03
*** tosky has joined #opendev17:04
openstackgerritSorin Sbarnea (zbr) proposed opendev/base-jobs master: Set file modes explicitly  https://review.opendev.org/74847817:12
openstackgerritKen Giusti proposed openstack/project-config master: Retire the devstack-plugin-zmq project  https://review.opendev.org/74534417:31
openstackgerritKen Giusti proposed openstack/project-config master: Retire the devstack-plugin-zmq project  https://review.opendev.org/74872417:31
*** tosky has quit IRC17:44
openstackgerritSorin Sbarnea (zbr) proposed opendev/base-jobs master: Set file modes explicitly  https://review.opendev.org/74847817:50
clarkb#status log Restarted zuul-web and zuul-fingergw to pick up web updates, primarily the one that fixes keyboard scrolling.17:58
openstackstatusclarkb: finished logging17:58
clarkbpopping out for a bike ride now that it appears zuul-web is stable18:45
openstackgerritSorin Sbarnea (zbr) proposed zuul/zuul-jobs master: Remove dependency on pkg_resources  https://review.opendev.org/74873718:45
openstackgerritGhanshyam Mann proposed opendev/irc-meetings master: Update Technical Committee office hours new time  https://review.opendev.org/74874018:57
openstackgerritSorin Sbarnea (zbr) proposed zuul/zuul-jobs master: Remove dependency on pkg_resources  https://review.opendev.org/74873719:00
smcginnisHmm, latest upgrade of zuul may have some issues. Jobs in queue that have hit retry_limit redirect to the zuul tenants page.21:24
clarkbsmcginnis: do they work if you search for them through the builds page?21:25
clarkb(wondering how wide spread that problem is)21:25
smcginnisDoesn't even show up if I try to filter by it on the Builds tab.21:27
smcginnisAs an example, looking at tempest-integrated-storage for patch 740384 in the gate queue right now.21:28
clarkbthey won't be available in builds until they report21:28
clarkb(that's normal)21:28
clarkbso I think it may not be the zuul-web update21:30
clarkbthe retry limits are pretty widespread, I think what may be happening is something is failing early enough in the job run that it doesn't really have a uuid?21:30
clarkber not that it doesn't have a uuid but that the zuul_return that sets log url doesn't run yet21:31
clarkbwhich means it falls back on a default which isn't valid and ends up at the tenant page21:31
smcginnisSeveral jobs failing with retry_limit for all the jobs in gate right now, so something's definitely up. But I guess we'll see once they finally report.21:31
*** tosky has joined #opendev21:33
clarkbI'm trying to pull up console logs and see if I can catch one21:34
corvusi'm looking in logs21:34
corvusthe one i picked at random just failed with this task in a pre-playbook: TASK [Run bootstrap-aio script chdir=src/opendev.org/openstack/openstack-ansible, executable=/bin/bash, _raw_params=scripts/bootstrap-aio.sh] ***21:36
corvusthat was build eec9a18aa0e1452891eadd56e7dfba44 of openstack-ansible-deploy-aio_distro_metal-opensuse-1521:37
clarkbit seems to be affecting a large variety of jobs which would make a specific thing like that weird, unless it's like network connectivity?21:37
clarkbhttps://zuul.opendev.org/t/openstack/build/82fcb124e6c24e178f6ffd82b3ae5597/logs I'm going to try and find that one on an executor21:38
clarkb"took 1 sec"21:39
corvusze08 has a read only fs21:39
corvusi may have just had the bad luck to find a real failure on my first attempt; the second sent me to ze08 where i found the readonly fs21:39
clarkbthat would do it21:39
clarkbcorvus: I guess stop zuul-executor there then reboot?21:40
clarkblet me see if there are any emails about that host21:40
corvusyeah, i'll stop ze08 now, will hold for more investigation before rebooting21:40
smcginnisDarn clouds. Just can't trust em.21:40
clarkbze04 has a message from a couple days ago, I don't see anything from rax about ze0821:41
clarkbsmcginnis: thank you for reporting21:41
corvuscan't docker-compose down because of read-only fs21:42
clarkbthat's a fun behavior21:42
corvusbut the process is stopped21:42
smcginnisI just have good timing I guess.21:42
corvusi think it may have done the kill but not the cleanup21:42
clarkbah21:42
corvuswill be interesting to see what happens on boot21:42
clarkbsince we don't see anything from rax I would say rebooting is fine? it's also easy enough to completely rebuild the host if it comes to that21:42
corvus[857571.914327] print_req_error: I/O error, dev xvde, sector 16296276921:43
corvus[857571.914441] Buffer I/O error on dev xvde2, logical block 18370346, lost sync page write21:44
clarkbxvde is the ephemeral device (not a cinder volume)21:44
corvusthat's /var/lib/zuul21:44
corvusboth it and / are ro21:44
clarkbI expect they are both on the hypervisor21:45
corvusswap also affected (but it's also xvde)21:45
clarkbcorvus: did you down or stop the docker-compose containers ? I think stop will restart them on boot. We may want to use down if not yet done so that we can check the fs's before starting services?21:45
corvusclarkb: i did down which failed21:45
corvusso i'm real fuzzy on what will happen on boot :/21:45
clarkbI'm 99% sure down means the containers are removed entirely so there is nothing to start on boot21:46
clarkbstop stops running the containers but they are still registered with docker so when docker starts they will start21:46
corvusclarkb: right, but it failed21:46
clarkboh good point21:46
corvusi have no other ideas of anything to do now before rebooting21:47
smcginnisWith ro, probably not much you can do at this point.21:47
clarkbya I think we should reboot and then check and see what it looks like when up again21:47
corvusdoing21:47
corvuslooks like it started the executor container/process21:50
clarkbmount output looks good though21:50
corvusit's cleaning up stale jobdirs21:50
corvusstarting21:55
clarkbit seems happy? I haven't seen anything scroll by in the logs that jumps out yet at least21:57
corvusditto21:58
clarkb#status log ze08 ended up with read only filesystems and has been rebooted. This resulted in many retry_limit errors.22:09
openstackstatusclarkb: finished logging22:09
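Note: a read-only remount like the one found on ze08 is visible in the mount options. Below is a minimal sketch for spotting it, assuming the Linux /proc/mounts layout (device, mount point, filesystem type, comma-separated options). The function name `read_only_mounts` and the sample text are hypothetical and were not captured from ze08.

```python
def read_only_mounts(mounts_text):
    """Return the mount points whose option list contains 'ro'."""
    read_only = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 4:
            continue  # skip blank or malformed lines
        mount_point = fields[1]
        options = fields[3].split(",")
        if "ro" in options:
            read_only.append(mount_point)
    return read_only


# Hypothetical sample; device names and options are illustrative only.
sample = """\
/dev/xvda1 / ext4 ro,relatime 0 0
/dev/xvde2 /var/lib/zuul ext4 ro,relatime 0 0
proc /proc proc rw,nosuid,nodev,noexec,relatime 0 0
"""

print(read_only_mounts(sample))

# On a live Linux host (assumption: /proc/mounts is readable):
# with open("/proc/mounts") as handle:
#     print(read_only_mounts(handle.read()))
```

Some pseudo-filesystems are mounted read-only by design, so in practice the result would likely be compared against the paths that are expected to be writable, such as / and /var/lib/zuul here.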
clarkbcorvus: do you think we should do a status notice that jobs can be rechecked?22:09
corvusclarkb: maybe so, looks like there were a lot that hit the limit22:10
clarkbwhat about #status notice A zuul server ended up with read only filesystems which caused many jobs to hit retry_limit. The server has been rebooted and appers happy. Jobs can be rechecked.22:11
corvusclarkb: typo 'appers' but otherwise lgtm22:11
clarkb#status notice A zuul server ended up with read only filesystems which caused many jobs to hit retry_limit. The server has been rebooted and appears happy. Jobs can be rechecked.22:12
openstackstatusclarkb: sending notice22:12
clarkbfixed the typi22:12
clarkband made a new one :)22:12
-openstackstatus- NOTICE: A zuul server ended up with read only filesystems which caused many jobs to hit retry_limit. The server has been rebooted and appears happy. Jobs can be rechecked.22:12
clarkbsmcginnis: are there any openstack release related jobs we need to reenqueue into zuul?22:15
openstackstatusclarkb: finished sending notice22:15
openstackgerritMerged opendev/irc-meetings master: Update Technical Committee office hours new time  https://review.opendev.org/74874022:20
clarkbcorvus: before you call it a week have a moment for https://review.opendev.org/#/c/748040/ I think that will get limestone working again for us22:50
corvusdone23:01
clarkbtyty23:02
*** tosky has quit IRC23:13
openstackgerritMerged opendev/system-config master: Update the limestone cert in our clouds.yaml  https://review.opendev.org/74804023:23
