sean-k-mooney | what is the smallest job that actually acquires a node? does the noop job do that? | 00:00 |
---|---|---|
pabelanger | noop jobs don't use any nodes, you likely want something tox related | 00:00 |
pabelanger | or create a hello world job | 00:01 |
sean-k-mooney | if not i might create a simple hello world job that just | 00:01 |
sean-k-mooney | ya that is what i was thinking | 00:01 |
clarkb | re ironic if they come asking for access to the node I held it is | 0011007873 | rax-ord | ubuntu-bionic | 0326030d-851e-41bd-86df-d5acb9191f7b | 23.253.173.73 | 2001:4801:7825:104:be76:4eff:fe10:4061 | hold | 00:05:29:02 | unlocked | | 00:01 |
sean-k-mooney | i can disable all jobs except the canary job and run check experimental on it a few times in a row | 00:01 |
fungi | clarkb: that could also warrant a brief e-mail to openstack-discuss | 00:01 |
sean-k-mooney | that will skip the queue more or less | 00:02 |
clarkb | fungi: good point (I did file a bug but email will hopefully get eyeballs on it) | 00:02 |
sean-k-mooney | although it's 1 am so maybe tomorrow | 00:02 |
clarkb | sean-k-mooney: you should sleep :) | 00:02 |
sean-k-mooney | yes that is what my brain is currently telling me | 00:03 |
sean-k-mooney | thanks for your help o/ | 00:03 |
*** ociuhandu has joined #openstack-infra | 00:10 | |
*** factor has joined #openstack-infra | 00:12 | |
clarkb | johnsom: and I haven't forgotten about your dns problems. 680340 is close to being able to address the next debugging steps for that I think as will my cleanup-run playbook | 00:13 |
johnsom | clarkb Cool, thanks. | 00:13 |
*** aaronsheffield has quit IRC | 00:14 | |
*** ociuhandu has quit IRC | 00:15 | |
ianw | clarkb: i just had a query on executor jobs with the cleanup stuff | 00:17 |
*** goldyfruit_ has joined #openstack-infra | 00:20 | |
rm_work | hey, we just noticed hacking checks stopped working after the upgrade from `hacking!=0.13.0,<0.14,>=0.12.0` to 1.1.0 | 00:22 |
rm_work | anyone aware of any changes necessary to get them to run on the new version? did we miss a migration? | 00:22 |
*** mattw4 has quit IRC | 00:23 | |
rm_work | looks like between 1.0.0 and 1.1.0 | 00:28 |
donnyd | so i restarted the l3-agent. I am looking in the logs now to see if there is an issue | 00:29 |
donnyd | it wasn't happy when i brought it back online.. so I am thinking maybe a service didn't start correctly or something | 00:30 |
logan- | re-enabling limestone host aggregate, will monitor for any issues as we start accepting jobs again | 00:31 |
donnyd | sean-k-mooney: I use bgp for routing, but that subnet already has an entry. I route the whole /64 for that tenant at my edge... so it may as well be static | 00:33 |
donnyd | every now and again the l3-agent on my side sticks... I don't have a better way to explain it other than that.. | 00:35 |
donnyd | so maybe give it another go when you get back on tomorrow | 00:35 |
logan- | sean-k-mooney: catching up on backscroll.. i'd be happy to support numa labels on limestone. also happy to discuss our experience keeping nested virt operational. | 00:37 |
donnyd | Also I would just like to point out we didn't really seem to have this issue when we had a separate pool for the custom stuff. Maybe that is also worth a go | 00:39 |
* donnyd goes to get some food and then sleep. be back in the am | 00:39 | |
*** michael-beaver has quit IRC | 00:40 | |
*** prometheanfire has quit IRC | 00:42 | |
*** gyee has quit IRC | 00:42 | |
*** prometheanfire has joined #openstack-infra | 00:43 | |
*** markvoelker has joined #openstack-infra | 00:46 | |
*** markvoelker has quit IRC | 00:48 | |
*** markvoelker has joined #openstack-infra | 00:49 | |
rm_work | ah figured out the hacking-checks issue | 00:52 |
*** diablo_rojo has quit IRC | 00:57 | |
*** markvoelker has quit IRC | 00:59 | |
*** markvoelker has joined #openstack-infra | 00:59 | |
*** markvoelker has quit IRC | 01:04 | |
*** nicolasbock has quit IRC | 01:04 | |
*** nicolasbock has joined #openstack-infra | 01:04 | |
*** markvoelker has joined #openstack-infra | 01:07 | |
*** yamamoto has joined #openstack-infra | 01:15 | |
*** markvoelker has quit IRC | 01:17 | |
*** markvoelker has joined #openstack-infra | 01:18 | |
*** bobh has joined #openstack-infra | 01:19 | |
*** markvoelker has quit IRC | 01:23 | |
*** hongbin has joined #openstack-infra | 01:35 | |
*** rfolco has quit IRC | 01:41 | |
*** markvoelker has joined #openstack-infra | 01:48 | |
*** jamesmcarthur has joined #openstack-infra | 01:48 | |
*** nicolasbock has quit IRC | 02:01 | |
*** apetrich has quit IRC | 02:10 | |
*** spsurya has joined #openstack-infra | 02:18 | |
*** yamamoto has quit IRC | 02:19 | |
*** markvoelker has quit IRC | 02:20 | |
*** markvoelker has joined #openstack-infra | 02:22 | |
*** bobh has quit IRC | 02:23 | |
*** icarusfactor has joined #openstack-infra | 02:25 | |
*** ykarel|away has joined #openstack-infra | 02:25 | |
*** factor has quit IRC | 02:27 | |
*** jamesmcarthur has quit IRC | 02:30 | |
*** larainema has joined #openstack-infra | 02:33 | |
*** roman_g has quit IRC | 02:34 | |
*** yamamoto has joined #openstack-infra | 03:03 | |
*** hamzy_ has quit IRC | 03:06 | |
*** rlandy|bbl has quit IRC | 03:12 | |
*** igordc has quit IRC | 03:31 | |
*** rh-jelabarre has quit IRC | 03:34 | |
*** dave-mccowan has quit IRC | 03:35 | |
*** armax has quit IRC | 03:41 | |
*** exsdev0 has joined #openstack-infra | 03:43 | |
*** exsdev has quit IRC | 03:44 | |
*** exsdev0 is now known as exsdev | 03:44 | |
*** xarses has quit IRC | 03:44 | |
clarkb | fwiw no new fn network errors at launch for the last several hours | 03:45 |
clarkb | I've rechecked sean-k-mooney's change so hopefully there are results for sean-k-mooney when back at it | 03:45 |
*** xarses has joined #openstack-infra | 03:45 | |
*** hongbin has quit IRC | 03:46 | |
clarkb | hrm the first attempt at booting that numa label failed :/ maybe the quietness was a fluke | 03:49 |
*** ykarel|away has quit IRC | 03:53 | |
*** ramishra has joined #openstack-infra | 03:54 | |
clarkb | ianw: responded to your question. And now I call it a night | 03:56 |
AJaeger | config-core, https://review.opendev.org/678356 and https://review.opendev.org/678357 are ready for review, dependencies merged - those remove now unused publish jobs and update promote jobs. Please review | 04:06 |
*** udesale has joined #openstack-infra | 04:07 | |
*** ykarel|away has joined #openstack-infra | 04:08 | |
*** ykarel|away is now known as ykarel | 04:09 | |
openstackgerrit | Merged opendev/base-jobs master: Add cleanup phase to base(-test) https://review.opendev.org/681100 | 04:12 |
*** snapiri has joined #openstack-infra | 04:19 | |
*** ociuhandu has joined #openstack-infra | 04:30 | |
*** ociuhandu has quit IRC | 04:34 | |
*** jtomasek has joined #openstack-infra | 04:38 | |
*** igordc has joined #openstack-infra | 04:44 | |
*** markvoelker has quit IRC | 04:48 | |
*** ricolin has joined #openstack-infra | 05:00 | |
*** kjackal has joined #openstack-infra | 05:17 | |
*** markvoelker has joined #openstack-infra | 05:26 | |
openstackgerrit | Merged zuul/zuul master: Web: rely on new attributes when determining task failure https://review.opendev.org/680498 | 05:28 |
*** jtomasek has quit IRC | 05:29 | |
*** markvoelker has quit IRC | 05:30 | |
*** diga has joined #openstack-infra | 05:33 | |
*** raukadah is now known as chandankumar | 05:34 | |
*** ralonsoh has joined #openstack-infra | 05:38 | |
*** pots has quit IRC | 05:42 | |
*** soniya29 has joined #openstack-infra | 05:57 | |
AJaeger | ianw: what's your timeframe for https://etherpad.openstack.org/p/static-services ? I can write the job updates for publishing/promote (just added myself to etherpad for that) | 06:00 |
openstackgerrit | Ian Wienand proposed zuul/zuul master: [wip] Test and expand documentation for executor-only jobs https://review.opendev.org/679184 | 06:00 |
ianw | AJaeger: ummm ... when it gets done? :) the hard bit will be moving the publishing to afs i guess? | 06:00 |
AJaeger | That should be easy, I can propose all the AFS publishing jobs... | 06:01 |
AJaeger | Just tell me when - and whether you want them as single changes or one large one... | 06:01 |
AJaeger | (easish for me - so, I volunteer) | 06:01 |
ianw | AJaeger: well i mean any time you like! the ideal would be to get to the point that static.o.o just isn't doing anything | 06:02 |
ianw | the redirects shouldn't be too hard. we didn't really come to a conclusion if we should spin up a new service or move them over to files.openstack.org's apache ... we can probably push for more of an answer in meeting if that's becoming the holdup | 06:03 |
AJaeger | let me work the next days on the jobs and then we can discuss how to split them up. | 06:03 |
AJaeger | I only volunteer for any publish/promote jobs - not the redirects ;) | 06:03 |
ianw | that's fine :) i mean i have changes out that handle all the redirects via haproxy, if we want to do that | 06:04 |
AJaeger | ok | 06:04 |
*** dtantsur|afk is now known as dtantsur | 06:15 | |
dtantsur | clarkb: I don't think we're creating anything on test nodes (only in VMs inside test nodes) | 06:16 |
*** dklyle has quit IRC | 06:20 | |
*** dklyle has joined #openstack-infra | 06:21 | |
ianw | mirror.opensuse | 06:24 |
ianw | Release failed: VOLSER: Problems encountered in doing the dump ! | 06:25 |
ianw | hrm ... | 06:25 |
*** kopecmartin|off is now known as kopecmartin | 06:26 | |
*** gfidente has joined #openstack-infra | 06:27 | |
ianw | Tue Sep 10 06:18:38 2019 1 Volser: ReadVnodes: IH_CREATE: File exists - restore aborted | 06:28 |
ianw | Tue Sep 10 06:18:38 2019 Scheduling salvage for volume 536871010 on part /vicepa over FSSYNC | 06:28 |
ianw | i think that must have been it | 06:28 |
ianw | Starting ForwardMulti from 536871010 to 536871010 on afs02.dfw.openstack.org (full release). | 06:28 |
ianw | it was | 06:28 |
frickler | infra-root: the kernel panics on bionic could be this one https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1842447 , hit us in several production nodes. kernel -62 should be fine | 06:31 |
openstack | Launchpad bug 1842447 in linux (Ubuntu) "Kernel Panic with linux-image-4.15.0-60-generic when specifying nameserver in docker-compose" [Undecided,Confirmed] | 06:31 |
AJaeger | ianw, could you review https://review.opendev.org/#/c/678356/ and https://review.opendev.org/#/c/678357/6, please? | 06:31 |
ianw | this looks like the same volume corruption we saw and i posted about @ https://lists.openafs.org/pipermail/openafs-devel/2018-May/020491.html | 06:36 |
ianw | we will have to try a similar salvage operation. i won't do anything until the current batch of "vos unlock / vos release" operations complete (running in root screen on afs01.dfw) | 06:37 |
ianw | so far mirror.opensuse is the only volume to have issues releasing, which is good at least | 06:37 |
ianw | it still has | 06:38 |
ianw | mirror.ubuntu-ports | 06:38 |
ianw | mirror.ubuntu | 06:38 |
ianw | mirror.yum-puppetlabs | 06:38 |
ianw | to go | 06:38 |
*** slaweq has joined #openstack-infra | 06:41 | |
dirk | roman_g: AJaeger: fungi: regarding the opensuse 15.0 mirror issues, I see it is current on files.openstack.org - so where is the issue? also, please switch your jobs away from opensuse-150 nodeset to opensuse-15 (which is 15.1 right now) to reduce maintenance burden going forward | 06:42 |
openstackgerrit | Merged openstack/project-config master: Remove now unused publish jobs https://review.opendev.org/678356 | 06:42 |
*** pgaxatte has joined #openstack-infra | 06:44 | |
*** igordc has quit IRC | 07:00 | |
*** tesseract has joined #openstack-infra | 07:05 | |
*** rcernin has quit IRC | 07:09 | |
*** ociuhandu has joined #openstack-infra | 07:14 | |
*** kjackal has quit IRC | 07:17 | |
icey | anybody else seeing issues with ipv6 + review.opendev.org? | 07:17 |
*** kjackal has joined #openstack-infra | 07:18 | |
*** threestrands has quit IRC | 07:20 | |
*** ociuhandu has quit IRC | 07:21 | |
*** yamamoto has quit IRC | 07:22 | |
*** apetrich has joined #openstack-infra | 07:24 | |
*** xenos76 has joined #openstack-infra | 07:26 | |
*** rpittau|afk is now known as rpittau | 07:28 | |
*** yamamoto has joined #openstack-infra | 07:30 | |
*** jpena|off is now known as jpena | 07:33 | |
*** yamamoto has quit IRC | 07:34 | |
*** apetrich has quit IRC | 07:39 | |
*** ykarel is now known as ykarel|lunch | 07:39 | |
*** apetrich has joined #openstack-infra | 07:40 | |
ianw | dirk: per some of my prior messages; all mirroring is currently paused while we try to recover all the volumes to full replication status, and it looks like the opensuse volume will need a little help to get recovered | 07:41 |
ianw | icey: i connect via ipv6 and not seeing any issues | 07:41 |
icey | ianw: weird, I can `ping6 google.com`, but not review.opendev.org | 07:42 |
ianw | 64 bytes from review01.openstack.org (2001:4800:7819:103:be76:4eff:fe04:9229): icmp_seq=1 ttl=48 time=280 ms | 07:42 |
*** xenos76 has quit IRC | 07:42 | |
icey | I suspect it's something wonky on my side, which is all I need to know (/me keeps digging!) | 07:43 |
*** pkopec has joined #openstack-infra | 07:43 | |
*** xenos76 has joined #openstack-infra | 07:44 | |
*** apetrich has quit IRC | 07:45 | |
*** happyhemant has joined #openstack-infra | 07:46 | |
*** trident has quit IRC | 07:50 | |
*** pgaxatte has quit IRC | 07:50 | |
*** apetrich has joined #openstack-infra | 07:52 | |
*** lpetrut has joined #openstack-infra | 07:55 | |
*** pgaxatte has joined #openstack-infra | 07:57 | |
*** trident has joined #openstack-infra | 08:01 | |
*** dchen has quit IRC | 08:04 | |
*** e0ne has joined #openstack-infra | 08:06 | |
*** jtomasek has joined #openstack-infra | 08:06 | |
*** priteau has joined #openstack-infra | 08:06 | |
* AJaeger will be offline for a bit... | 08:17 | |
*** AJaeger has left #openstack-infra | 08:17 | |
*** AJaeger has quit IRC | 08:17 | |
*** ociuhandu has joined #openstack-infra | 08:20 | |
*** tkajinam has quit IRC | 08:22 | |
*** panda|rover has quit IRC | 08:23 | |
*** panda has joined #openstack-infra | 08:24 | |
*** ociuhandu has quit IRC | 08:25 | |
*** ociuhandu has joined #openstack-infra | 08:26 | |
*** roman_g has joined #openstack-infra | 08:29 | |
*** ociuhandu has quit IRC | 08:30 | |
*** ociuhandu has joined #openstack-infra | 08:30 | |
openstackgerrit | Merged openstack/project-config master: Add allowed-projects to static publish jobs https://review.opendev.org/678357 | 08:32 |
*** derekh has joined #openstack-infra | 08:33 | |
*** soniya29 has quit IRC | 08:34 | |
*** sshnaidm|afk is now known as sshnaidm|ruck | 08:37 | |
*** ociuhandu has quit IRC | 08:40 | |
*** ociuhandu has joined #openstack-infra | 08:41 | |
*** ociuhandu has quit IRC | 08:44 | |
openstackgerrit | Bogdan Dobrelya (bogdando) proposed zuul/zuul-jobs master: Fix delegating nodepool_ip and switch_ip facts https://review.opendev.org/681182 | 08:49 |
*** yamamoto has joined #openstack-infra | 08:56 | |
*** ykarel|lunch is now known as ykarel| | 09:01 | |
*** ykarel| is now known as ykarel | 09:01 | |
*** jaosorior has joined #openstack-infra | 09:11 | |
*** rfolco has joined #openstack-infra | 09:14 | |
*** yamamoto has quit IRC | 09:17 | |
*** soniya29 has joined #openstack-infra | 09:22 | |
*** kaiokmo has joined #openstack-infra | 09:24 | |
*** rcernin has joined #openstack-infra | 09:38 | |
openstackgerrit | Bogdan Dobrelya (bogdando) proposed zuul/zuul-jobs master: Fix delegating nodepool_ip and switch_ip facts https://review.opendev.org/681182 | 09:50 |
*** happyhemant has quit IRC | 09:56 | |
openstackgerrit | Fabien Boucher proposed zuul/zuul master: Pagure - handle Pull Request tags (labels) metadata https://review.opendev.org/681050 | 09:59 |
*** dklyle has quit IRC | 10:03 | |
*** dklyle has joined #openstack-infra | 10:04 | |
openstackgerrit | Fabien Boucher proposed zuul/zuul master: Pagure - handle initial comment change event https://review.opendev.org/680310 | 10:06 |
*** udesale has quit IRC | 10:09 | |
*** apetrich has quit IRC | 10:10 | |
*** udesale has joined #openstack-infra | 10:10 | |
openstackgerrit | Fabien Boucher proposed zuul/zuul master: Pagure - handle Pull Request tags (labels) metadata https://review.opendev.org/681050 | 10:12 |
*** markvoelker has joined #openstack-infra | 10:16 | |
*** xenos76 has quit IRC | 10:16 | |
*** markvoelker has quit IRC | 10:21 | |
*** xenos76 has joined #openstack-infra | 10:26 | |
*** AJaeger has joined #openstack-infra | 10:43 | |
jrosser | i think this is wrong using version_compare as a filter when it is a test https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/configure-mirrors/templates/etc/yum.repos.d/fedora-updates.repo.j2#L5 | 10:46 |
*** markvoelker has joined #openstack-infra | 10:55 | |
*** jpena is now known as jpena|lunch | 10:59 | |
*** ociuhandu has joined #openstack-infra | 11:00 | |
*** markvoelker has quit IRC | 11:00 | |
*** ociuhandu has quit IRC | 11:01 | |
*** nicolasbock has joined #openstack-infra | 11:06 | |
*** ociuhandu has joined #openstack-infra | 11:07 | |
*** rcosnita has joined #openstack-infra | 11:11 | |
*** rcosnita has quit IRC | 11:11 | |
*** udesale has quit IRC | 11:14 | |
*** larainema has quit IRC | 11:15 | |
*** pgaxatte has quit IRC | 11:19 | |
*** rh-jelabarre has joined #openstack-infra | 11:29 | |
*** rh-jelabarre has quit IRC | 11:30 | |
*** rh-jelabarre has joined #openstack-infra | 11:30 | |
*** priteau has quit IRC | 11:38 | |
*** prometheanfire has quit IRC | 11:41 | |
*** prometheanfire has joined #openstack-infra | 11:41 | |
zbr | ianw: can you help me with https://review.opendev.org/#/c/680962/ ? | 11:48 |
*** ykarel is now known as ykarel|afk | 11:50 | |
*** pgaxatte has joined #openstack-infra | 11:50 | |
*** jamesmcarthur has joined #openstack-infra | 11:54 | |
zbr | since we switched to the new dashboard view for logs the custom success/failure-urls are no longer working, is that a known bug? | 11:59 |
AJaeger | zbr: can you give an example, please? | 12:04 |
frickler | zbr: that is being called a feature | 12:04 |
*** markvoelker has joined #openstack-infra | 12:04 | |
frickler | infra-root: devstack fedora-"latest", i.e. F28, jobs are failing with this mirror issue, is there anything we can do? ianw was working on moving to F29 or F30, but that's not ready yet IIUC https://zuul.opendev.org/t/openstack/build/8dd1004470124f9ba95a677c7d95995e/log/job-output.txt#2321 | 12:05 |
*** jamesmcarthur has quit IRC | 12:08 | |
sean-k-mooney | logan-: thanks for the offer of trying to provide a multi numa flavor with nested virt via limestone. it would be very helpful if you could. in general having several sources of the label would likely help mitigate any network issue we hit or individual error in a single provider. | 12:09 |
sean-k-mooney | frickler: just so you are aware a future qemu is going to break oslo utils | 12:09 |
*** jpena|lunch is now known as jpena | 12:10 | |
openstackgerrit | Bogdan Dobrelya (bogdando) proposed zuul/zuul-jobs master: Fix evaluating nodepool_ip and switch_ip facts https://review.opendev.org/681182 | 12:10 |
sean-k-mooney | i had a job that was trying to use the virt-preview repo with fedora 28 and i found that the output of qemu image info or something like that in the future version has changed | 12:10 |
sean-k-mooney | so going to fedora 30 might hit the same issue | 12:10 |
sean-k-mooney | it's a relatively easy fix but i have not done it yet | 12:11 |
*** apetrich has joined #openstack-infra | 12:12 | |
*** njohnston has joined #openstack-infra | 12:13 | |
*** jamesmcarthur has joined #openstack-infra | 12:13 | |
AJaeger | frickler: FYI, devstack is the only remaining f28 user - everybody else got moved to F29 already | 12:16 |
*** goldyfruit_ has quit IRC | 12:18 | |
openstackgerrit | Bogdan Dobrelya (bogdando) proposed zuul/zuul-jobs master: Fix evaluating nodepool_ip and switch_ip facts https://review.opendev.org/681182 | 12:19 |
*** rlandy has joined #openstack-infra | 12:23 | |
*** jamesmcarthur has quit IRC | 12:24 | |
*** gfidente has quit IRC | 12:24 | |
*** Lucas_Gray has joined #openstack-infra | 12:25 | |
*** jamesmcarthur has joined #openstack-infra | 12:26 | |
openstackgerrit | Andy Ladjadj proposed zuul/zuul master: Fix: prevent usage of hashi_vault https://review.opendev.org/681041 | 12:28 |
*** jamesmcarthur has quit IRC | 12:29 | |
*** Lucas_Gray has quit IRC | 12:32 | |
*** Wryhder has joined #openstack-infra | 12:32 | |
*** Wryhder is now known as Lucas_Gray | 12:33 | |
*** apetrich has quit IRC | 12:33 | |
zbr | AJaeger: frickler yes, here is the example: click the openstack-tox-molecule job on https://review.opendev.org/#/c/669223/6 | 12:34 |
*** soniya29 has quit IRC | 12:34 | |
zbr | this job was supposed to load the reports.html file from tox, based on https://opendev.org/openstack/openstack-zuul-jobs/src/branch/master/zuul.d/jobs.yaml#L212-L213 | 12:34 |
zbr | it was doing this, but not anymore | 12:35 |
zbr | i think it stopped doing this when we switched to the new way of browsing the logs | 12:35 |
*** pgaxatte has quit IRC | 12:35 | |
zbr | the report.html is still there, but user needs to dig to find it. | 12:36 |
AJaeger | zbr: yes, indeed. To make the reports easily available, you need to follow the changes we did for docs | 12:37 |
*** derekh has quit IRC | 12:43 | |
*** pgaxatte has joined #openstack-infra | 12:43 | |
*** derekh has joined #openstack-infra | 12:43 | |
zbr | AJaeger: I am not sure what you did for docs, tried to find but failed. Clearly the output file is collected, what is missing is the change URLs. | 12:47 |
zbr | if it is no longer possible to change the success-url, that is ok as long as I can make the report visible on the log main page, maybe like the "console" tab? | 12:48 |
openstackgerrit | Bogdan Dobrelya (bogdando) proposed zuul/zuul-jobs master: Fix evaluating nodepool_ip and switch_ip facts https://review.opendev.org/681182 | 12:48 |
zbr | clearly that file is one of the most important outcomes (artifacts) of that build, like the html on docs too. and we need to make it easily accessible. | 12:49 |
AJaeger | zbr: https://opendev.org/zuul/zuul-jobs/src/branch/master/roles/fetch-sphinx-tarball/tasks/html.yaml#L31 shows what was done - zuul_return is the magic. | 12:52 |
openstackgerrit | Andy Ladjadj proposed zuul/zuul master: Fix: prevent usage of hashi_vault https://review.opendev.org/681041 | 12:54 |
zbr | AJaeger: thanks, this is clearly what i was looking for. One more bit: do I still need to get the file again here, even if I know that is collected correctly? | 12:55 |
*** mriedem has joined #openstack-infra | 12:55 | |
AJaeger | zbr: not sure, better ask corvus later | 12:55 |
*** AJaeger has quit IRC | 12:55 | |
zbr | AJaeger: i will try anyway, i will learn by doing it. thanks again, really helpful. | 12:55 |
*** aaronsheffield has joined #openstack-infra | 12:56 | |
*** Goneri has joined #openstack-infra | 12:57 | |
*** udesale has joined #openstack-infra | 12:57 | |
*** hamzy_ has joined #openstack-infra | 12:59 | |
*** kaiokmo has quit IRC | 13:02 | |
*** gfidente has joined #openstack-infra | 13:03 | |
*** kaiokmo has joined #openstack-infra | 13:05 | |
*** ykarel|afk is now known as ykarel | 13:10 | |
*** eharney has joined #openstack-infra | 13:12 | |
openstackgerrit | Sorin Sbarnea proposed openstack/openstack-zuul-jobs master: openstack-tox-molecule: replace success-url and failure-url https://review.opendev.org/681251 | 13:13 |
openstackgerrit | Bogdan Dobrelya (bogdando) proposed zuul/zuul-jobs master: Fix evaluating nodepool_ip and switch_ip facts https://review.opendev.org/681182 | 13:16 |
*** sthussey has joined #openstack-infra | 13:18 | |
openstackgerrit | Fabien Boucher proposed zuul/zuul master: Pagure - reference pipelines add open: True requirement https://review.opendev.org/681252 | 13:22 |
*** goldyfruit_ has joined #openstack-infra | 13:23 | |
openstackgerrit | Matt Riedemann proposed opendev/elastic-recheck master: Add query for nova functional test race bug 1843433 https://review.opendev.org/681256 | 13:28 |
openstack | bug 1843433 in OpenStack Compute (nova) "functional test test_migrate_server_with_qos_port fails intermittently due to race condition" [Medium,Confirmed] https://launchpad.net/bugs/1843433 - Assigned to Balazs Gibizer (balazs-gibizer) | 13:28 |
fungi | amorin: looks like our ovh account is still working since you fixed it on sunday... do you think we're safe to assume the bot isn't going to automatically turn that account off again at this point? | 13:37 |
amorin | fungi: yes | 13:38 |
amorin | we are safe now | 13:38 |
fungi | thanks! we've been waiting to turn swift log storage back on there since it's not as robust against such things as nodepool | 13:38 |
fungi | i'll approve change 680855 now in that case | 13:38 |
*** eharney has quit IRC | 13:42 | |
*** rcernin has quit IRC | 13:45 | |
*** AJaeger has joined #openstack-infra | 13:46 | |
openstackgerrit | Corey Bryant proposed openstack/project-config master: Retire charm-neutron-api-genericswitch https://review.opendev.org/681259 | 13:47 |
openstackgerrit | Merged opendev/elastic-recheck master: Add query for nova functional test race bug 1843433 https://review.opendev.org/681256 | 13:48 |
openstack | bug 1843433 in OpenStack Compute (nova) "functional test test_migrate_server_with_qos_port fails intermittently due to race condition" [Medium,In progress] https://launchpad.net/bugs/1843433 - Assigned to Balazs Gibizer (balazs-gibizer) | 13:48 |
openstackgerrit | Merged opendev/base-jobs master: Revert "Stop storing logs on OVH" https://review.opendev.org/680855 | 13:51 |
*** ramishra has quit IRC | 13:51 | |
*** ramishra has joined #openstack-infra | 13:51 | |
*** eharney has joined #openstack-infra | 13:55 | |
zbr | what are the supported metadata types on zuul artifacts? I tried to find some docs but apparently they are just a free form text field. | 13:56 |
*** ociuhandu has quit IRC | 13:57 | |
openstackgerrit | Corey Bryant proposed openstack/project-config master: End project gating for charm-neutron-api-genericswitch https://review.opendev.org/681259 | 13:57 |
*** ociuhandu has joined #openstack-infra | 13:58 | |
*** jtomasek has quit IRC | 14:00 | |
openstackgerrit | Sorin Sbarnea proposed openstack/openstack-zuul-jobs master: openstack-tox-molecule: replace success-url and failure-url https://review.opendev.org/681251 | 14:01 |
*** jtomasek has joined #openstack-infra | 14:01 | |
fungi | zbr: i don't know, but i wouldn't be surprised if that's left up to the service being used to store and serve artifacts, so in our case, swift | 14:02 |
*** tkajinam has joined #openstack-infra | 14:02 | |
openstackgerrit | Corey Bryant proposed openstack/project-config master: End project gating for charm-neutron-api-genericswitch https://review.opendev.org/681259 | 14:03 |
*** ociuhandu has quit IRC | 14:03 | |
corvus | zbr: yes, free form | 14:05 |
zbr | corvus: optional? | 14:05 |
corvus | zbr: yes; not currently used by zuul. actually the entire metadata dict is optional and free-form. | 14:08 |
zbr | corvus: thanks. | 14:08 |
corvus | zbr: the promote jobs use those fields to find the right kind of artifact to promote | 14:08 |
fungi | ahh, so we don't pass the mime type info along in the swift upload to instruct it what mime type to claim when serving the files? | 14:09 |
corvus | fungi: we do, but not manually, that's handled automatically by the upload-logs-swift role | 14:09 |
fungi | oh, got it, so we use a magic number lib of some sort to infer mime types rather than whatever the job has claimed? | 14:10 |
openstackgerrit | Corey Bryant proposed openstack/project-config master: End project gating for charm-neutron-api-genericswitch https://review.opendev.org/681259 | 14:10 |
corvus | fungi: yep. there isn't actually any direct link between an artifact and a logfile anyway; usually it points to the logserver, but you can (and we do) link to other urls such as zuul-preview urls | 14:11 |
fungi | that makes sense. thanks! | 14:12 |
openstackgerrit | Corey Bryant proposed openstack/project-config master: Remove charm-neutron-api-genericswitch from infra https://review.opendev.org/681270 | 14:15 |
*** Lucas_Gray has quit IRC | 14:24 | |
*** Lucas_Gray has joined #openstack-infra | 14:25 | |
*** dtantsur is now known as dtantsur|afk | 14:25 | |
donnyd | clarkb: should we re-enable swift logs for FN... the propane guy didn't even show up yesterday | 14:28 |
openstackgerrit | Bogdan Dobrelya (bogdando) proposed zuul/zuul master: Normalize/alias nodepool inventory IPs in executor https://review.opendev.org/681273 | 14:28 |
clarkb | donnyd: sure, I think we are going to reenable ovh too based on emails so probably want to stack on AJaeger's change for that if it hasnt merged yet | 14:29 |
donnyd | yea that makes sense to me | 14:29 |
*** Lucas_Gray has quit IRC | 14:30 | |
*** armax has joined #openstack-infra | 14:31 | |
*** ociuhandu has joined #openstack-infra | 14:32 | |
*** ociuhandu has quit IRC | 14:33 | |
fungi | clarkb: i replied to that e-mail and we also discussed it in here at 13:37z, so the change merged to reenable logs in ovh at 13:51z | 14:33 |
*** ociuhandu has joined #openstack-infra | 14:33 | |
clarkb | thanks! | 14:34 |
donnyd | clarkb: you have the review handy for that? | 14:34 |
openstackgerrit | Corey Bryant proposed openstack/project-config master: End project gating for charm-neutron-api-genericswitch https://review.opendev.org/681259 | 14:34 |
clarkb | donnyd: sounds like it merged so just base your change on current master | 14:34 |
donnyd | ok | 14:34 |
openstackgerrit | Corey Bryant proposed openstack/project-config master: End project gating for charm-neutron-api-genericswitch https://review.opendev.org/681259 | 14:35 |
*** ianychoi_ is now known as ianychoi | 14:37 | |
*** Lucas_Gray has joined #openstack-infra | 14:37 | |
openstackgerrit | Corey Bryant proposed openstack/project-config master: End project gating for charm-neutron-api-genericswitch https://review.opendev.org/681259 | 14:37 |
openstackgerrit | Bogdan Dobrelya (bogdando) proposed zuul/zuul master: Normalize/alias nodepool inventory IPs in executor https://review.opendev.org/681273 | 14:37 |
*** ociuhandu has quit IRC | 14:37 | |
openstackgerrit | Donny Davis proposed opendev/base-jobs master: Re-enable FN swift logs - electrical work complete https://review.opendev.org/681275 | 14:38 |
openstackgerrit | Jeremy Stanley proposed openstack/project-config master: Report openstack/election repo changes in IRC https://review.opendev.org/681276 | 14:38 |
*** ociuhandu has joined #openstack-infra | 14:39 | |
openstackgerrit | Tristan Cacqueray proposed zuul/zuul master: spec: add a zuul-runner cli https://review.opendev.org/681277 | 14:40 |
*** ykarel is now known as ykarel|afk | 14:42 | |
openstackgerrit | James E. Blair proposed zuul/zuul master: WIP: Add support for the Gerrit checks plugin https://review.opendev.org/680778 | 14:42 |
openstackgerrit | James E. Blair proposed zuul/zuul master: WIP: Add no-jobs reporter action https://review.opendev.org/681278 | 14:42 |
openstackgerrit | Corey Bryant proposed openstack/project-config master: End project gating for charm-neutron-api-genericswitch https://review.opendev.org/681259 | 14:43 |
*** e0ne has quit IRC | 14:44 | |
*** e0ne has joined #openstack-infra | 14:44 | |
*** jamesmcarthur has joined #openstack-infra | 14:47 | |
openstackgerrit | Fabien Boucher proposed zuul/zuul master: Pagure - handles pull-request.closed event https://review.opendev.org/681279 | 14:49 |
openstackgerrit | Bogdan Dobrelya (bogdando) proposed zuul/zuul-jobs master: Fix evaluating nodepool_ip and switch_ip facts https://review.opendev.org/681182 | 14:50 |
*** Lucas_Gray has quit IRC | 14:52 | |
fungi | tristanC: this is not urgent, but i noticed that since you originally created the #openstack-election irc channel on freenode, you're the only one with administrative control over it. when you get time, can i trouble you to run this? `/msg chanserv access #openstack-election add openstackinfra +AFRefiorstv` | 14:56 |
openstackgerrit | Corey Bryant proposed openstack/project-config master: Remove charm-neutron-api-genericswitch from infra https://review.opendev.org/681270 | 14:58 |
*** udesale has quit IRC | 15:00 | |
*** udesale has joined #openstack-infra | 15:00 | |
*** diablo_rojo has joined #openstack-infra | 15:01 | |
*** spsurya has quit IRC | 15:05 | |
openstackgerrit | Merged opendev/base-jobs master: Re-enable FN swift logs - electrical work complete https://review.opendev.org/681275 | 15:05 |
clarkb | donnyd: ^ fyi | 15:05 |
openstackgerrit | Merged zuul/zuul master: Overriding max. starting builds. https://review.opendev.org/670461 | 15:05 |
*** pgaxatte has quit IRC | 15:06 | |
*** tkajinam has quit IRC | 15:13 | |
donnyd | clarkb: i just tested swift to make sure it's actually working. I don't think we had any issues with it before, did we? | 15:13 |
clarkb | donnyd: I had not heard of any issues before | 15:14 |
openstackgerrit | Corey Bryant proposed openstack/project-config master: End project gating for charm-neutron-api-genericswitch https://review.opendev.org/681259 | 15:14 |
openstackgerrit | Corey Bryant proposed openstack/project-config master: Remove charm-neutron-api-genericswitch from infra https://review.opendev.org/681270 | 15:14 |
clarkb | johnsom: fyi I think https://review.opendev.org/680340 is ready now to fix worlddump in devstack which will hopefully aid in debugging your dns problems | 15:15 |
openstackgerrit | Bogdan Dobrelya (bogdando) proposed zuul/zuul-jobs master: Fix evaluating nodepool_ip and switch_ip facts https://review.opendev.org/681182 | 15:16 |
donnyd | clarkb: I am also working on a faster router.. I think it is possible that FN is having an issue there and having something that can keep up would probably help greatly | 15:18 |
*** ociuhandu has quit IRC | 15:19 | |
donnyd | we keep seeing the occasional DNS can't be found or ssh inbound not working kinda things | 15:19 |
*** ociuhandu has joined #openstack-infra | 15:19 | |
donnyd | And currently that device *is* the bottleneck | 15:19 |
donnyd | so I am working on something that should be orders of magnitude better | 15:20 |
*** ociuhandu has quit IRC | 15:22 | |
*** ociuhandu has joined #openstack-infra | 15:23 | |
openstackgerrit | Merged zuul/zuul master: Update heuristing of parallel starting builds. https://review.opendev.org/671702 | 15:23 |
*** ginopc has joined #openstack-infra | 15:24 | |
*** gfidente has quit IRC | 15:25 | |
*** ociuhandu has quit IRC | 15:28 | |
*** ociuhandu has joined #openstack-infra | 15:31 | |
*** gyee has joined #openstack-infra | 15:35 | |
*** ociuhandu has quit IRC | 15:38 | |
openstackgerrit | Jeremy Stanley proposed openstack/project-config master: Report openstack/election repo changes in IRC https://review.opendev.org/681276 | 15:38 |
openstackgerrit | Bogdan Dobrelya (bogdando) proposed zuul/zuul-jobs master: Fix evaluating nodepool_ip and switch_ip facts https://review.opendev.org/681182 | 15:38 |
openstackgerrit | Clark Boylan proposed opendev/base-jobs master: Remove explicit prints from cleanup playbook https://review.opendev.org/681291 | 15:38 |
clarkb | infra-root ^ testing shows those prints are not required. If we can get that in I'll rerun tests then push up a change to apply to base and base-minimal if it still looks good | 15:38 |
fungi | i fast-approved it | 15:41 |
*** apetrich has joined #openstack-infra | 15:41 | |
*** ykarel|afk is now known as ykarel|away | 15:42 | |
*** factor has joined #openstack-infra | 15:45 | |
*** icarusfactor has quit IRC | 15:45 | |
*** ykarel|away has quit IRC | 15:47 | |
clarkb | tyty | 15:47 |
*** ociuhandu has joined #openstack-infra | 15:47 | |
*** mattw4 has joined #openstack-infra | 15:48 | |
corvus | clarkb: do you even need the register? | 15:48 |
clarkb | corvus: oh no I don't | 15:48 |
openstackgerrit | Clark Boylan proposed opendev/base-jobs master: Remove explicit prints from cleanup playbook https://review.opendev.org/681291 | 15:49 |
fungi | unapproved in that case | 15:49 |
clarkb | let's go ahead and fix that now so that we have what we want for base and base-minimal when happy with the results | 15:49 |
corvus | clarkb: did you have a test change i can look at? | 15:49 |
clarkb | corvus: I do https://zuul.opendev.org/t/zuul/build/0ee6b77daafd40e3ac410e6f308801c8 that one | 15:49 |
corvus | (just curious to see what that looks like) | 15:49 |
clarkb | corvus: due to when cleanup runs we don't get that in published logs though. You have to grep that uuid on the executor (ze04) | 15:49 |
corvus | clarkb: ah right :) | 15:50 |
openstackgerrit | Bogdan Dobrelya (bogdando) proposed zuul/zuul-jobs master: Fix evaluating nodepool_ip and switch_ip facts https://review.opendev.org/681182 | 15:54 |
*** ginopc has quit IRC | 15:55 | |
*** ociuhandu has quit IRC | 15:58 | |
*** ociuhandu has joined #openstack-infra | 15:58 | |
*** ociuhandu has quit IRC | 16:02 | |
*** ociuhandu has joined #openstack-infra | 16:03 | |
openstackgerrit | Merged opendev/base-jobs master: Remove explicit prints from cleanup playbook https://review.opendev.org/681291 | 16:03 |
clarkb | johnsom: the mirror hasn't been updated yet. ianw was working on that overnight (relative to my location) | 16:04 |
clarkb | johnsom: I'll see if I can't figure out where that left off | 16:04 |
johnsom | Thank you! | 16:05 |
*** ykarel|away has joined #openstack-infra | 16:06 | |
*** tesseract has quit IRC | 16:06 | |
*** rpittau is now known as rpittau|afk | 16:06 | |
*** eernst has joined #openstack-infra | 16:08 | |
clarkb | I think ianw's script completed (I don't see it running anymore and the only locked volume isn't in his script's list) | 16:09 |
clarkb | corvus: fungi ^ did you want to double check that? but if you agree I think we can turn the mirror-update servers back on and have it start updating things again? | 16:09 |
clarkb | and that should get us an up to date mirror? | 16:09 |
clarkb | I guess we also need to know if ianw's script successfully vos released those volumes before we let the cron at it which may timeout? | 16:10 |
clarkb | /afs/.openstack.org/mirror/ubuntu/timestamp.txt and /afs/openstack.org/mirror/ubuntu/timestamp.txt differ. Maybe that means we need to do a vos release | 16:11 |
*** eernst has quit IRC | 16:11 | |
corvus | clarkb: yes, and the 'last update' timestamps on mirror.ubuntu and mirror.ubuntu.readonly differ | 16:12 |
*** lpetrut has quit IRC | 16:12 | |
corvus | clarkb: (under 'vos examine mirror.ubuntu' and 'vos examine mirror.ubuntu.readonly') | 16:13 |
clarkb | corvus: are we able to check if there is a vos release running for a volume without running ps everywhere? maybe examine would tell us that? | 16:13 |
clarkb | because if we know that vos release isn't running for ubuntu then maybe we just run it? | 16:13 |
*** mattw4 has quit IRC | 16:14 | |
corvus | clarkb: 'vos listvldb' says mirror.wheel.bionicx64 is the only release in progress | 16:14 |
corvus | clarkb: and yes, vos examine shows the same info | 16:14 |
*** gfidente has joined #openstack-infra | 16:15 | |
clarkb | here we go. root@afs01.dfw.o.o:/root/unlock.log and unlock.sh | 16:15 |
clarkb | the unlock.log file shows that ubuntu released successfully ~2.5 hours ago | 16:15 |
clarkb | except the RW and RO volumes differ in content | 16:16 |
*** kopecmartin is now known as kopecmartin|off | 16:16 | |
corvus | did a mirror update somehow run after the release started? | 16:17 |
clarkb | corvus: I don't think so unless the mirror-update servers booted again | 16:18 |
*** e0ne has quit IRC | 16:18 | |
clarkb | ianw shut down the update servers in order to run this script unchallenged | 16:18 |
*** mattw4 has joined #openstack-infra | 16:25 | |
fungi | they're not responding to icmp ping, at least | 16:25 |
mnaser | i have a question -- how do we feel about enabling CR for all of infra images and installing py3 in there by default? | 16:27 |
mnaser | making all of our images py3 native | 16:27 |
mnaser | s/infra/nodepool/ | 16:27 |
clarkb | CR? | 16:27 |
clarkb | fungi: corvus I need to find breakfast but can keep digging on the afs stuff after back in a bit | 16:27 |
clarkb | fungi: corvus: thinking maybe we do another vos release just to rule it out and then maybe we need to investigate if the volume needs salvaging? | 16:28 |
tristanC | clarkb: I guess mnaser is referring to https://wiki.centos.org/AdditionalResources/Repositories/CR | 16:30 |
corvus | clarkb: another vos release sounds prudent. the timestamps only differ by like 1 minute which suggests to me some minor change | 16:30 |
mnaser | correct, using CR there | 16:31 |
clarkb | tristanC: mnaser: I think one of our goals with our testing is to test what people will find in the wild. If you install centos 7 you don't get the CR repos by default. If you want contents from that repo then your jobs can update the node as necessary to make that happen; then it is also documented and automatable for anyone in the real world trying to do the same | 16:31 |
clarkb | I don't think we should update the base images we use to do that | 16:32 |
mnaser | clarkb: but we also develop what you don't see in the wild yet (for master) | 16:32 |
mnaser | and whatever is in CR will become centos 7.7 in the next few days | 16:32 |
tristanC | mnaser: we do enable CR in some of our jobs as a pre-tasks | 16:32 |
fungi | yeah, testing openstack on centos7 with cr suggests we expect users will actually deploy that combination | 16:32 |
*** mattw4 has quit IRC | 16:33 | |
pabelanger | python3 is in CR for centos7 now? | 16:34 |
mnaser | yep | 16:34 |
pabelanger | wow | 16:34 |
pabelanger | still holding out hope for centos 8 :p | 16:36 |
pabelanger | https://wiki.centos.org/About/Building_8 seems to imply things are pretty much done | 16:37 |
fungi | the easy 99% is done, the nearly impossible 1% is all that's left now? ;) | 16:38 |
mnaser | there's an interesting thread on centos-devel | 16:39 |
*** igordc has joined #openstack-infra | 16:39 | |
clarkb | mnaser: specifically we test our software's future. If we want to have specific jobs that combine our future with another future we can do that. But I don't think we want that as default | 16:39 |
*** jpena is now known as jpena|off | 16:39 | |
clarkb | corvus: ok I'm running `vos release -v mirror.ubuntu -localauth` on afs01.dfw in root screen window 1 | 16:43 |
*** igordc has quit IRC | 16:44 | |
*** derekh has quit IRC | 16:44 | |
*** kjackal has quit IRC | 16:46 | |
* fungi disappears for a bit to try and reverse the boarding-up of windows | 16:46 | |
clarkb | vos release is done and the timestamps match now | 16:47 |
clarkb | checking if the kernel update is present | 16:48 |
clarkb | kernel update is not present | 16:48 |
clarkb | I think that means next step is turning the mirror update server back on and doing an ubuntu mirror update | 16:48 |
*** gfidente has quit IRC | 16:49 | |
corvus | ++ | 16:49 |
*** ykarel|away has quit IRC | 16:54 | |
clarkb | server is booted. the /var/run/reprepro dir does not exist. I seem to recall /var/run with systemd is an fs not backed on disk and we use puppet to create that dir? | 16:55 |
clarkb | I'll manually create it once I figure that out to speed things up | 16:55 |
clarkb | hrm no maybe that isn't how it is set up? where does /var/run/reprepro come from? | 16:57 |
clarkb | ah it is in puppet just a different manifest | 16:57 |
clarkb | ubuntu mirror update is now running in screen on mirror-update01.openstack.org | 16:59 |
*** ociuhandu_ has joined #openstack-infra | 17:00 | |
*** ociuhandu has quit IRC | 17:03 | |
*** ykarel|away has joined #openstack-infra | 17:03 | |
*** ociuhandu_ has quit IRC | 17:05 | |
*** udesale has quit IRC | 17:12 | |
clarkb | nb01 and nb02 are not running builders currently. Looks like they both OOMed on august 27 | 17:12 |
clarkb | I'm going to clean their disks then reboot them (may as well get updates since they aren't doing anything anyway) | 17:12 |
openstackgerrit | James E. Blair proposed zuul/zuul master: WIP: Add support for the Gerrit checks plugin https://review.opendev.org/680778 | 17:14 |
clarkb | ubuntu mirror is updated and released. Now just need to get the builders going | 17:16 |
*** jamesmcarthur has quit IRC | 17:18 | |
clarkb | nb01 has been asked to reboot but it has not come back yet :/ | 17:29 |
clarkb | I'll give it 5 more minutes then use the api to re-reboot it | 17:30 |
*** njohnston is now known as njohnston|lunch | 17:30 | |
*** ralonsoh has quit IRC | 17:32 | |
*** nicolasbock has quit IRC | 17:35 | |
*** jtomasek has quit IRC | 17:37 | |
clarkb | hard reboot fixed it. Now cleaning up the disk and then will reenable nodepool-builder and reboot again (to ensure that reboots not after an OOM work) | 17:37 |
*** ralonsoh has joined #openstack-infra | 17:39 | |
*** nicolasbock has joined #openstack-infra | 17:40 | |
*** igordc has joined #openstack-infra | 17:44 | |
*** trident has quit IRC | 17:46 | |
*** e0ne has joined #openstack-infra | 17:46 | |
*** igordc has quit IRC | 17:50 | |
clarkb | nb01 is up and running and building images | 17:53 |
*** kjackal has joined #openstack-infra | 17:54 | |
*** diablo_rojo has quit IRC | 17:55 | |
*** igordc has joined #openstack-infra | 17:57 | |
*** trident has joined #openstack-infra | 17:59 | |
openstackgerrit | Clark Boylan proposed opendev/base-jobs master: Add cleanup playbook to all base jobs https://review.opendev.org/681322 | 18:05 |
clarkb | infra-root ^ I think that is ready. I also pasted logs from ze10 for a base-test tested job that ran there | 18:06 |
*** igordc has quit IRC | 18:06 | |
*** igordc has joined #openstack-infra | 18:06 | |
AJaeger | clarkb: can we avoid the duplication? Why not use the same playbook everywhere? | 18:09 |
AJaeger | clarkb: you made a pasto, I commented | 18:10 |
clarkb | AJaeger: that's how those jobs are done today with the duplication | 18:10 |
clarkb | I believe so that they can be modified without unintended side effects | 18:10 |
openstackgerrit | James E. Blair proposed zuul/zuul master: WIP: Add support for the Gerrit checks plugin https://review.opendev.org/680778 | 18:12 |
openstackgerrit | James E. Blair proposed zuul/zuul master: WIP: Add report time to item model https://review.opendev.org/681323 | 18:12 |
openstackgerrit | James E. Blair proposed zuul/zuul master: WIP: Add Item.formatStatusUrl https://review.opendev.org/681324 | 18:12 |
AJaeger | clarkb: the others mainly include roles, but yes, it's duplicated. | 18:13 |
corvus | clarkb, AJaeger: when we correct the typo, can we put the playbook in 'base/' rather than base-minimal? | 18:13 |
corvus | (so the base-minimal job will refer to the base/ playbook, not the other way around) | 18:14 |
corvus | oh wait i have misunderstood | 18:14 |
paladox | corvus you'll be happy with https://github.com/dburm/pg-test-result-plugin ! | 18:14 |
corvus | one sec | 18:14 |
paladox | live demo at https://gerrit.git.wmflabs.org/r/c/testing/test/+/3101 | 18:14 |
corvus | clarkb, AJaeger: forget i said anything :) | 18:15 |
*** igordc has quit IRC | 18:15 | |
corvus | paladox: cool, that looks like a polygerrit version of our hacky hideci script | 18:16 |
paladox | yeh | 18:16 |
corvus | paladox: so our upgrade path can be 2.13 ---> 2.16, with hideci on the gwt ui, and pg-test-results on the polygerrit ui, pause for a bit, then -> 3.0 with checks | 18:17 |
paladox | yup | 18:17 |
*** eharney has quit IRC | 18:17 | |
paladox | i plan on installing that for wikimedia | 18:18 |
corvus | paladox: nice, i think that'll work great, thanks! | 18:18 |
paladox | if there's no objection | 18:18 |
*** jamesmcarthur has joined #openstack-infra | 18:18 | |
corvus | noted in https://etherpad.openstack.org/p/gerrit-upgrade | 18:18 |
paladox | :) | 18:18 |
paladox | (just noting it works for only the latest patchset) | 18:19 |
paladox | it basically looks through messages to match a pattern. | 18:19 |
corvus | paladox: that's mostly what hideci does too (though it counts runs from previous patchsets to tell you how many there were; we can live without that i think) | 18:19 |
paladox | ah | 18:19 |
corvus | or, it's probably a trivial pr to the pg plugin to add that if it sounds useful | 18:20 |
paladox | yup | 18:20 |
paladox | corvus https://gerrit.wikimedia.org/r/plugins/gitiles/operations/software/gerrit/plugins/wikimedia/+/master/gr-wikimedia/gr-wikimedia-custom-buttons.html is what i used to add a "Recheck" button if you want to copy that! | 18:22 |
clarkb | corvus: AJaeger do you think maybe base-test should be a separate playbook since we tend to test that independently but then I could have base and base-minimal share a common playbook? | 18:23 |
clarkb | I do think having base-test be separate is likely a good thing to avoid test stuff leaking into production unexpectedly | 18:23 |
openstackgerrit | Clark Boylan proposed opendev/base-jobs master: Add cleanup playbook to all base jobs https://review.opendev.org/681322 | 18:24 |
clarkb | AJaeger: corvus ^ that fixes the typo | 18:24 |
clarkb | #status log Rebooted and cleaned up /opt/dib_tmp on nb01 and nb02 after their builders stopped running due to OOMs | 18:28 |
openstackstatus | clarkb: finished logging | 18:28 |
clarkb | johnsom: ^ I've got the entire image build toolchain running again. Now we just need ubuntu-bionic to have its turn in the queue and get uploaded | 18:33 |
johnsom | Ok, let me know when I should fire up a test | 18:34 |
*** ociuhandu has joined #openstack-infra | 18:39 | |
*** ykarel|away has quit IRC | 18:39 | |
*** ociuhandu has quit IRC | 18:44 | |
AJaeger | clarkb: sharing base and base-minimal should be ok - and yes, having base-test separately helps. But we can keep that separate as well... | 18:44 |
*** ralonsoh has quit IRC | 18:46 | |
fungi | well, base-test is for manually testing the base job. base-minimal is for using an analog of the base job in tests of jobs, right? | 18:48 |
clarkb | fungi: ya, mostly base-minimal replaces pre-run playbook stuff though so that test jobs can test those playbooks directly | 18:49 |
*** goldyfruit___ has joined #openstack-infra | 18:49 | |
*** gyee has quit IRC | 18:49 | |
fungi | the main consumer, according to codesearch, is https://opendev.org/zuul/zuul-jobs/src/branch/master/zuul-tests.d/general-roles-jobs.yaml#L103-L106 | 18:50 |
*** gyee has joined #openstack-infra | 18:50 | |
fungi | so if the desire is to separate the base job from tests of the base job or its contents, then separating base, base-test and base-minimal is desirable | 18:50 |
*** goldyfruit_ has quit IRC | 18:52 | |
fungi | also jobs like https://opendev.org/openstack/openstack-zuul-jobs/src/branch/master/zuul.d/jobs.yaml#L6-L19 | 18:52 |
*** panda is now known as panda|rover|off | 18:54 | |
*** jamesmcarthur has quit IRC | 18:57 | |
*** ricolin has quit IRC | 18:58 | |
*** kjackal has quit IRC | 19:03 | |
*** kjackal has joined #openstack-infra | 19:08 | |
ianw | ok, mirroring update -- it looks like everything but mirror.opensuse released itself and is back in sync | 19:09 |
clarkb | ianw: ya I turned mirror-update.openstack.org back on | 19:09 |
ianw | i have run the salvage described in https://lists.openafs.org/pipermail/openafs-devel/2018-May/020493.html | 19:09 |
clarkb | ianw: in order to get ubuntu updates with kernel that fixes ovs panic | 19:09 |
clarkb | and that all seems happy | 19:09 |
ianw | i'm re-running the mirror.opensuse release again (in root screen on afs01) and we can see if that volume is fixed by that | 19:10 |
clarkb | ianw: sounds great | 19:10 |
*** eharney has joined #openstack-infra | 19:11 | |
Shrews | ianw: since you're around now, do you still need those two held nodes? couple of weeks old now | 19:17 |
*** pcaruana has quit IRC | 19:20 | |
ianw | Shrews: ... um which ones sorry? | 19:20 |
Shrews | ianw: i don't remember. i just cleaned several yesterday. yours were the only 2 left | 19:20 |
clarkb | (I've since added a couple debugging network issues in testing) | 19:20 |
Shrews | ianw: the comment is "ianw: redirs" if that helps | 19:22 |
clarkb | johnsom: bionic is building now so hopefully in about an hour. Will confirm with you before you should recheck though | 19:22 |
ianw | Shrews: oh right yeah, sorry just pulling up status page ... no that can be deleted if you have a console up, thanks, or i can do it | 19:22 |
Shrews | ianw: i'm there now, i'll delete them | 19:22 |
ianw | thanks | 19:22 |
Shrews | done. np | 19:23 |
*** pkopec has quit IRC | 19:28 | |
*** mattw4 has joined #openstack-infra | 19:29 | |
*** mattw4 has quit IRC | 19:33 | |
*** panda|rover|off has quit IRC | 19:37 | |
openstackgerrit | James E. Blair proposed opendev/system-config master: Add docs for recovering an OpenAFS fileserver https://review.opendev.org/681338 | 19:38 |
*** panda has joined #openstack-infra | 19:38 | |
johnsom | clarkb Ok, thanks | 19:42 |
clarkb | infra-root https://review.opendev.org/#/c/681322/ cleanup playbook in production is ready for review now I think | 20:01 |
clarkb | johnsom: bionic image just finished building and is being uploaded to clouds now | 20:02 |
ianw | infra-root: also if one other wants to check f30 node support with -> https://review.opendev.org/680919 ... i can monitor | 20:02 |
johnsom | Nice | 20:03 |
ianw | once this opensuse volume releases (assuming it does now after the salvage ...) i might take the chance to switch on auditing and try a fedora volume release before re-enabling | 20:05 |
*** e0ne has quit IRC | 20:19 | |
clarkb | mgagne: are you about? curious if you know what the status of infra using inap is. I think you mentioned someone else was working on it now? | 20:35 |
*** ociuhandu has joined #openstack-infra | 20:37 | |
openstackgerrit | Merged zuul/zuul master: Record handler tasks in json job output https://review.opendev.org/680726 | 20:38 |
clarkb | cmurphy: did you manage to get the timeout fixes in for keystone? I noticed that the lower constraints job hits that too (and I'm guessing upper constraints) if they haven't been updated yet | 20:48 |
cmurphy | clarkb: yeah we did but i missed that one https://review.opendev.org/681161 | 20:49 |
cmurphy | if you want to make the gate a little healthier you could consider promoting that one, it's been waiting for almost 6 hours | 20:50 |
clarkb | cool. Also I think openstack/requirements runs your unittests on dep updates | 20:50 |
cmurphy | oof :/ | 20:50 |
clarkb | should be able to make a similar update in requirements | 20:50 |
clarkb | as for promoting I'm not opposed though we've got cinder also flapping. It would probably be helpful if someone could take a cross project view and start identifying these changes as priorities then we can promote them all | 20:51 |
*** kjackal has quit IRC | 20:52 | |
*** ociuhandu has quit IRC | 20:52 | |
*** ociuhandu has joined #openstack-infra | 20:53 | |
clarkb | cmurphy: I enqueued 681161 into the gate (but didn't promote it to head since it looks like things ahead of it might be stable ish for now?) | 20:53 |
clarkb | cinder in particular has been flapping on unittests and the cinder ahead passed those | 20:54 |
clarkb | if I catch it resetting again I'll promote | 20:54 |
cmurphy | thanks clarkb | 20:55 |
clarkb | smcginnis: diablo_rojo_phon ttx ^ thinking out loud here the release team might want to act like ATC for changes that fix the gate? | 20:55 |
clarkb | (qa seems a bit hands off these days, but maybe someone from qa is willing to help too?) | 20:56 |
fungi | i'm not sure what "act like atc" means | 20:56 |
clarkb | fungi: track changes that fix issues in our testing and ensure they get enqueued/promoted as appropriate when ready | 20:57 |
*** ociuhandu has quit IRC | 20:57 | |
clarkb | and to prioritize these efforts across projects | 20:57 |
clarkb | basically the situation we are in now is a large backlog because we are all fighting against each other with flaky testing | 20:57 |
clarkb | rather than focusing on ensuring that fixes get attention first | 20:57 |
smcginnis | clarkb: I can push the cinder fix through. It's one of those stupid false UT failures. | 20:58 |
fungi | ahh, okay, i mainly just didn't know what "atc" meant in that context | 20:58 |
clarkb | johnsom: bionic is updated on all clouds that run jobs right now | 20:58 |
clarkb | johnsom: I think you are clear to check if octavia jobs are happier now | 20:58 |
johnsom | clarkb Awesome. I will give it a go | 20:59 |
clarkb | fungi: for example that keystone fix has sat queued for ~5 hours in check because there are a ton of nova neutron cinder keystone etc changes all up too | 20:59 |
fungi | maybe it's just that i don't know what "atc" is an abbreviation of there. not "active technical contributor" but something else | 20:59 |
clarkb | fungi: air traffic control | 21:00 |
clarkb | sorry | 21:00 |
fungi | got it. thanks! | 21:00 |
*** Goneri has quit IRC | 21:00 | |
fungi | i thought you were suggesting that they should open voting for release management ptl to anyone who fixed gate bugs (which is also an interesting suggestion, but entirely different) | 21:00 |
clarkb | oh no mostly we need someone (some group) to basically do air traffic control for changes since things are really unhappy and we are theoretically in a stabilization period and have RCs in ~2 weeks | 21:01 |
fungi | yes, having release and/or qa folks help track and coordinate major gate-obstructing fixes in openstack projects would be a huge help | 21:01 |
smcginnis | We may want to push through https://review.opendev.org/#/c/681318/ | 21:01 |
smcginnis | It's approved, but is going to take a long time yet to land. | 21:01 |
smcginnis | I think this was the main cinder issue causing resets. | 21:02 |
clarkb | smcginnis: I've enqueued it too | 21:03 |
smcginnis | Thanks! | 21:03 |
clarkb | the concern I have is that a long ish gate that resets often represents significant wasted effort | 21:04 |
smcginnis | Definitely. | 21:05 |
clarkb | and that then slows down check | 21:05 |
fungi | yes, and significant delays | 21:05 |
fungi | that | 21:05 |
fungi | slows down everything, not *just* check | 21:05 |
clarkb | ya | 21:05 |
clarkb | I've got the promote command typed up for those two and will run it if I catch a reset | 21:07 |
clarkb | smcginnis: note the top of gate is a cinder change that failed in unittests and legacy-grenade-dsvm-cinder-mn-sub-volbak | 21:12 |
clarkb | looks like an error with nova cells in that job | 21:14 |
clarkb | mriedem: ^ https://48dfe966566e2a08c129-e099c5c03695c7198c297e75ec3f8d05.ssl.cf1.rackcdn.com/680838/1/gate/legacy-grenade-dsvm-cinder-mn-sub-volbak/ab25959/logs/grenade.sh.txt.gz do you understand why that may happen? | 21:14 |
mriedem | looks like a subnode isn't getting discovered, could be a race in how the job is setup | 21:17 |
mriedem | i've never heard of that job before though - is it voting? | 21:17 |
clarkb | mriedem: yes it and a unittest failure are the cause of the latest gate reset | 21:18 |
mriedem | not really a cells issue, just a failure to map the host issue | 21:18 |
clarkb | we think we've got a change up to fix the unittest issue. Next up fixing that grenade job I guess | 21:18 |
clarkb | mriedem: that's the thing that d-g runs after all the stack.sh runs right? | 21:18 |
clarkb | it's from d-g/tool/something.sh ? | 21:18 |
clarkb | (it's a legacy job so should be using d-g) | 21:18 |
mriedem | yeah https://github.com/openstack/devstack-gate/blob/401e7535c6c29f9ba814058610ef6cefc952678b/devstack-vm-gate.sh#L251 | 21:19 |
mriedem | btw https://review.opendev.org/#/c/681238/ is a known nova functional test race bug fix as well | 21:19 |
clarkb | mriedem: that change is a bit more involved than the other two I just enqueued to the gate so I don't trust myself to evaluate it. you however have approved it. Are we reasonably sure it will pass testing as is without having run check tests first? | 21:21 |
clarkb | (I don't want to enqueue something that will make the gate worse) | 21:21 |
mriedem | i was going to rebase it into the middle of another series that needs to use some code off it so it can sit out | 21:22 |
clarkb | k | 21:22 |
mriedem | will be doing so from the comfort of a dance class lobby while my kid is in class... | 21:22 |
mriedem | i see this Creating host mapping for compute host 'ubuntu-xenial-ovh-gra1-0011108653': e6914d05-6244-4b1f-86dc-b73a4cc5ce3c | 21:24 |
mriedem | in that one job | 21:24 |
clarkb | gate just reset so I'm promoting those two | 21:25 |
mriedem | i think that's the control node | 21:25 |
clarkb | https://d494348350733031166c-4e71828f84900af50a9a26357b84a827.ssl.cf1.rackcdn.com/676138/16/gate/openstack-tox-py27/9628b8b/job-output.txt caused this reset, nova db migration tests | 21:26 |
mriedem | i think that's an old one http://status.openstack.org/elastic-recheck/#1823251 | 21:26 |
mriedem | or http://status.openstack.org/elastic-recheck/#1793364 | 21:26 |
mriedem | in that legacy-grenade-dsvm-cinder-mn-sub-volbak failure i can't see where discover_hosts was run from d-g on the subnode | 21:27 |
mriedem | it's supposed to generate a log but i don't see that getting archived | 21:27 |
mriedem | we must not archive $WORKSPACE/logs/devstack-gate-discover-hosts.txt ? | 21:27 |
mriedem | oh nvm i see why, | 21:28 |
mriedem | for grenade we don't create that log..https://github.com/openstack/devstack-gate/blob/401e7535c6c29f9ba814058610ef6cefc952678b/devstack-vm-gate.sh#L714 | 21:29 |
clarkb | and we only run it if the branch in devstack has it for the old side | 21:29 |
mriedem | it's been around since ocata so it'll be there | 21:30 |
clarkb | ok the change is against queens https://review.opendev.org/#/c/680838/ so we would be looking at pike | 21:31 |
mriedem | this is what that log should look like https://776a947f0924bf63aed6-b3fb5ecd7a4ff2f6244fae5211e9579f.ssl.cf5.rackcdn.com/673990/18/check/nova-live-migration/4d7d637/logs/devstack-gate-discover-hosts.txt.gz | 21:31 |
clarkb | https://opendev.org/openstack/devstack/src/branch/stable/pike/tools/discover_hosts.sh and it does exist | 21:31 |
clarkb | is it possible that nova-manage does not exist so it noops? | 21:32 |
mriedem | nova-manage exists, | 21:32 |
mriedem | otherwise you can't migrate the db schema | 21:32 |
clarkb | is it possible it doesn't exist on the subnode? | 21:33 |
clarkb | oh we run that only on the controller so ya should be fine | 21:33 |
mriedem | yeah, | 21:33 |
mriedem | so what i think is probably happening, | 21:33 |
*** sshnaidm|ruck is now known as sshnaidm|afk | 21:33 | |
mriedem | is in pike devstack might not have a patch to wait properly/longer for nova-compute to come up on the subnode before stacking is done, | 21:34 |
clarkb | mriedem: grep discover_hosts.sh in https://48dfe966566e2a08c129-e099c5c03695c7198c297e75ec3f8d05.ssl.cf1.rackcdn.com/680838/1/gate/legacy-grenade-dsvm-cinder-mn-sub-volbak/ab25959/logs/grenade.sh.txt.gz | 21:34 |
mriedem | i think that's something mnaser fixed in devstack but i'm not sure how far back that fix went | 21:34 |
mriedem | if discover_hosts runs before the subnode nova-compute service is created in the db it won't discover it | 21:34 |
clarkb | ah | 21:34 |
clarkb | 2019-09-10 20:22:55.309 | Found 1 unmapped computes in cell: 7ba51989-bbc4-42fe-93e2-7f52bf2b2828 is what it says | 21:34 |
clarkb | so I guess what this might boil down to is stop running multinode grenade on queens? | 21:35 |
*** jamesmcarthur has joined #openstack-infra | 21:35 | |
mriedem | or backport the devstack fixes - i remember this showed up after the meltdown/spectre patches started hitting nodepool providers and slowing things down enough that we'd see the race | 21:36 |
mriedem | i'm having trouble finding the patch though | 21:36 |
mriedem | https://review.opendev.org/#/q/I0cd7f193589a1a0776ae76dc30cecefe7ba9e5db | 21:37 |
mriedem | bingo | 21:37 |
mriedem | not in pike | 21:37 |
mriedem | brotha | 21:37 |
mriedem | oh shit it is | 21:37 |
mriedem | heh well i'm out of ideas o-) | 21:37 |
mriedem | i have to run to this class, will be online later | 21:37 |
*** rh-jelabarre has quit IRC | 21:37 | |
*** mriedem has quit IRC | 21:37 | |
*** exsdev0 has joined #openstack-infra | 21:42 | |
*** exsdev has quit IRC | 21:43 | |
*** exsdev0 is now known as exsdev | 21:43 | |
*** rh-jelabarre has joined #openstack-infra | 21:43 | |
clarkb | donnyd: thinking out loud here re your swift graphs. Might be good to show a rate of growth (even aggregate is fine) that way we can see if we are trending always upward or if we plateau or trend down etc | 21:47 |
clarkb | I'm working to finally get around to filing that gitea bug and noticed https://github.com/go-gitea/gitea/issues/491 | 21:53 |
clarkb | tl;dr gitea upstream seems aware of their performance issues, there is someone maintaining a branch that addresses it but they haven't upstreamed it :/ | 21:53 |
fungi | that sounds familiar | 21:54 |
*** iokiwi has quit IRC | 21:59 | |
*** adriant has quit IRC | 21:59 | |
*** mriedem has joined #openstack-infra | 22:00 | |
mriedem | clarkb: so on that failed queens grenade job, | 22:05 |
mriedem | the last discover_hosts.sh runs at: | 22:05 |
mriedem | 2019-09-10 20:32:13.462 | + ./post-stack.sh:main:8 : /opt/stack/old/devstack/tools/discover_hosts.sh | 22:05 |
mriedem | the compute node record for the subnode that was not discovered is created after that: | 22:05 |
mriedem | Sep 10 20:32:24.538778 ubuntu-xenial-ovh-gra1-0011108654 nova-compute[26738]: INFO nova.compute.resource_tracker [None req-09805f53-608d-4e2a-bfeb-1c3222e9b451 None None] Compute node record created for ubuntu-xenial-ovh-gra1-0011108654:ubuntu-xenial-ovh-gra1-0011108654 with uuid: b5a32e0d-b32d-4516-a21b-cb5ac5952aba | 22:05 |
clarkb | https://github.com/go-gitea/gitea/issues/8146 has been filed | 22:06 |
clarkb | mriedem: k so that was what you thought was fixed and backported? | 22:06 |
ianw | ok, i guess the whole-server salvage worked and opensuse mirror r/o is synced now | 22:06 |
mriedem | well what's weird is that devstack log on the subnode finds the compute service in the api earlier: | 22:07 |
mriedem | 2019-09-10 20:32:07.888 | ++ :: : openstack --os-cloud devstack-admin --os-region RegionOne compute service list --host ubuntu-xenial-ovh-gra1-0011108654 --service nova-compute -c ID -f value | 22:07 |
ianw | i was hoping mirror.wheel.bionicx64 was just in progress, but it seems not. i'll recover that too | 22:08 |
mriedem | clarkb: so no that patch i linked earlier about the timeout isn't the same thing, | 22:11 |
mriedem | looks like the subnode devstack says the compute is ready b/c the service record is in the API, | 22:11 |
mriedem | before the compute node record is created which is what discover_hosts is looking for, | 22:11 |
mriedem | which doesn't really make sense to me, but if that node is pike...it's been a long time since i've looked at what might be buggy there still | 22:11 |
mriedem | http://logstash.openstack.org/#dashboard/file/logstash.json?query=message%3A%5C%22is%20not%20mapped%20to%20any%20cell%5C%22%20AND%20tags%3A%5C%22console%5C%22&from=7d | 22:13 |
*** jamesmcarthur has quit IRC | 22:14 | |
mriedem | predominantly on ovh-gra1 nodes, i wonder if there is some subtle timing issue on those nodes | 22:14 |
*** jamesmcarthur has joined #openstack-infra | 22:15 | |
*** aaronsheffield has quit IRC | 22:16 | |
*** jamesmcarthur has quit IRC | 22:17 | |
mriedem | looking at nova pike code, the service record that devstack is waiting on is created before the compute node record that discover_hosts is looking for, | 22:18 |
mriedem | so that's a race | 22:18 |
ianw | with no releases happening, i'm going to enable logging on the afs servers so we can hopefully get to the bottom of this slow release issue | 22:18 |
clarkb | ianw: k | 22:18 |
*** jamesmcarthur has joined #openstack-infra | 22:18 | |
clarkb | mriedem: I guess we have to decide how valuable it is to keep supporting pike then? | 22:18 |
clarkb | mriedem: I only called it out because it caused a gate reset but there are other ways of dealing with those | 22:19 |
mriedem | looking at logstash results this is pretty rare it looks like | 22:19 |
mriedem | i would make legacy-grenade-dsvm-cinder-mn-sub-volbak non-voting in stable/queens first, though it looks like that job hasn't made its way to cinder yet so it's still branchless | 22:20 |
mriedem | although couldn't you make it non-voting in the cinder stable/pike .zuul.yaml? https://github.com/openstack/cinder/blob/stable/queens/.zuul.yaml#L57 | 22:21 |
*** slaweq has quit IRC | 22:21 | |
*** jamesmcarthur has quit IRC | 22:21 | |
clarkb | ya I think you can | 22:21 |
clarkb | basically list it there and set nonvoting to true | 22:21 |
clarkb | seems like this was the same trick used to increase keystone's unittest timeouts | 22:22 |
openstackgerrit | Merged openstack/project-config master: Add Fedora 30 nodes https://review.opendev.org/680919 | 22:24 |
*** jamesmcarthur has joined #openstack-infra | 22:24 | |
ianw | ok, interesting; i've enabled auditing and am doing a "vos release mirror.fedora" now. since nothing has written to this since it was released (mirror-update.opendev.org is off ...) i'd expect this would be a zero-delta update | 22:25 |
ianw | ... it does not appear to think it is ... | 22:25 |
mriedem | i'm puzzled why this isn't an issue after pike nodes but nothing is coming to mind right now | 22:25 |
mriedem | maybe it is and we're just lucky | 22:25 |
clarkb | mnaser: re https://review.opendev.org/#/c/662300/ we are planning project renames on monday. Now would be a good time to update that if it is still something you want otherwise maybe abandon? | 22:26 |
*** panda has quit IRC | 22:26 | |
clarkb | mriedem: that is often the case particularly with timing issues | 22:26 |
mriedem | hmm, when did devstack change to systemd? | 22:26 |
clarkb | mriedem: swap out hardware and all of a sudden no longer lucky | 22:26 |
clarkb | mriedem: sdague did it so a while back | 22:26 |
clarkb | I think after the first PTG | 22:27 |
clarkb | whenever that was | 22:27 |
mriedem | the service record and compute node record should be created by the time the service is reported as running/active by systemctl | 22:27 |
mriedem | first ptg was pike | 22:27 |
mriedem | in hot lanta | 22:27 |
mriedem | so, this might be a case of pre-systemd devstack not waiting long enough for the service to be fully started and records created | 22:27 |
clarkb | in the legionnaires hotel | 22:27 |
*** panda has joined #openstack-infra | 22:28 | |
clarkb | fungi: is it valid for a project to be in two storyboard groups? https://review.opendev.org/#/c/669298/3/gerrit/projects.yaml | 22:30 |
*** ociuhandu has joined #openstack-infra | 22:30 | |
mriedem | fwiw systemd in devstack was made the default in queens https://review.opendev.org/#/c/499186/ | 22:31 |
fungi | clarkb: absolutely | 22:31 |
fungi | a project can be in as many groups as make sense | 22:32 |
*** prometheanfire has quit IRC | 22:33 | |
*** prometheanfire has joined #openstack-infra | 22:33 | |
clarkb | remote: https://review.opendev.org/681353 project rename records | 22:33 |
clarkb | I think https://etherpad.openstack.org/p/project-renames-2019-09-10 and its related changes are ready for review now | 22:34 |
clarkb | fungi: thanks for confirming | 22:34 |
*** ociuhandu has quit IRC | 22:35 | |
johnsom | clarkb So far looking good. Will know for sure in a few hours. | 22:35 |
*** jamesmcarthur has quit IRC | 22:37 | |
*** mriedem has quit IRC | 22:41 | |
clarkb | corvus: fungi ianw do we want to apply cleanups globally? https://review.opendev.org/#/c/681322/ | 22:43 |
*** jamesmcarthur has joined #openstack-infra | 22:44 | |
fungi | ianw and i both just +2'd it but would appreciate corvus's input so i did not approve | 22:47 |
clarkb | k | 22:47 |
fungi | since it's a change to the base job, i feel extra caution is warranted | 22:47 |
fungi | also i'm knocking off for the day... it's been a long one | 22:48 |
corvus | fungi, clarkb: +2 | 22:49 |
corvus | clarkb: interested in making that happen only on failure? | 22:49 |
corvus | i think it's possible | 22:49 |
clarkb | corvus: hrm can I check a zuul var for that? | 22:50 |
clarkb | zuul_success says grep | 22:50 |
corvus | yep | 22:50 |
clarkb | sure | 22:50 |
corvus | that's the one i was thinking of | 22:50 |
clarkb | do you think that should go through base-test too (probably not a bad idea) | 22:50 |
corvus | yes | 22:50 |
*** asettle has quit IRC | 22:51 | |
openstackgerrit | Clark Boylan proposed opendev/base-jobs master: Add cleanup playbook to all base jobs https://review.opendev.org/681322 | 22:54 |
openstackgerrit | Clark Boylan proposed opendev/base-jobs master: Update cleanup tasks to only happen on failure https://review.opendev.org/681354 | 22:54 |
clarkb | 681354 is parent of 681322 and modifies base-test | 22:54 |
*** mattw4 has joined #openstack-infra | 22:55 | |
clarkb | I'll test that on both successful and failed job cases | 22:55 |
*** jamesmcarthur has quit IRC | 22:57 | |
clarkb | ianw: is it possible that something other than rsync is touching files (maybe some verification step though I didn't think we had those for the rsynced repos) | 22:59 |
*** rcernin has joined #openstack-infra | 22:59 | |
ianw | clarkb: so right now i'm just testing with rsync completely out of the loop. it just finished "vos release mirror.fedora" and now i am re-running it | 23:00 |
*** mattw4 has quit IRC | 23:00 | |
*** shachar has joined #openstack-infra | 23:01 | |
ianw | i'm watching the rxdebug stats on afs02, right now it seems that the thread handling this has received 3110647592 bytes | 23:02 |
*** tkajinam has joined #openstack-infra | 23:03 | |
ianw | we'll see where it ends up at ... but multiple gigabytes for a volume that has had no updates doesn't seem right | 23:03 |
*** snapiri has quit IRC | 23:03 | |
ianw | we have the fileaudit logs too, so can see nothing touched the underlying r/w volume | 23:03 |
*** exsdev has quit IRC | 23:10 | |
*** slaweq has joined #openstack-infra | 23:11 | |
*** rcosnita has joined #openstack-infra | 23:13 | |
*** exsdev has joined #openstack-infra | 23:13 | |
*** slaweq has quit IRC | 23:15 | |
*** goldyfruit___ has quit IRC | 23:17 | |
*** rcosnita has quit IRC | 23:18 | |
*** jamesmcarthur has joined #openstack-infra | 23:18 | |
*** jamesmcarthur has quit IRC | 23:21 | |
openstackgerrit | James E. Blair proposed zuul/zuul master: WIP: Add support for the Gerrit checks plugin https://review.opendev.org/680778 | 23:22 |
*** jamesmcarthur has joined #openstack-infra | 23:23 | |
johnsom | clarkb Yes, confirmed. No more retries. | 23:24 |
johnsom | clarkb ianw Thank you for all your help. | 23:25 |
*** xenos76 has quit IRC | 23:25 | |
*** jamesmcarthur has quit IRC | 23:26 | |
openstackgerrit | Merged zuul/nodepool master: Fix node failures when at volume quota https://review.opendev.org/671704 | 23:29 |
clarkb | johnsom: great one less issue to worry about I wonder if other jobs will be more stable too | 23:30 |
johnsom | Any job actually putting traffic through floating IPs should have been impacted. | 23:30 |
johnsom | Doesn't seem to take much traffic either | 23:31 |
*** dchen has joined #openstack-infra | 23:31 | |
*** jamesmcarthur has joined #openstack-infra | 23:31 | |
clarkb | the traffic would need to fragment too | 23:35 |
clarkb | which if I'm remembering correctly from way back when we didn't always have mtus set correctly doesn't always happen | 23:35 |
johnsom | I know we do for the amphora, but this was out at the neutron level, so... don't know | 23:35 |
*** iokiwi has joined #openstack-infra | 23:37 | |
*** adriant has joined #openstack-infra | 23:39 | |
*** jamesmcarthur has quit IRC | 23:41 | |
*** mtreinish has quit IRC | 23:43 | |
*** sthussey has quit IRC | 23:46 |
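
The race mriedem describes at 22:11 and 22:18 (devstack waits for the compute service record, while discover_hosts needs the compute node record, which is created slightly later) can be illustrated with a small simulation. This is only a sketch under stated assumptions: `FakeSubnode`, `wait_for`, and `discover_hosts` below are hypothetical stand-ins, not devstack or nova code, and the three-tick delay is invented for illustration.

```python
import time


def wait_for(predicate, timeout=5.0, interval=0.05,
             sleep=time.sleep, clock=time.monotonic):
    """Poll predicate() until it returns True or `timeout` seconds pass."""
    deadline = clock() + timeout
    while True:
        if predicate():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


class FakeSubnode:
    """Hypothetical model of nova-compute starting on the subnode.

    The service record is visible immediately; the compute node record
    only appears after a few ticks, mirroring the ordering in the log.
    """

    def __init__(self, ticks_until_node_record=3):
        self._ticks = 0
        self._needed = ticks_until_node_record

    def tick(self):
        # Stands in for time passing while the service finishes starting.
        self._ticks += 1

    def service_record_exists(self):
        return True

    def compute_node_record_exists(self):
        return self._ticks >= self._needed


def discover_hosts(subnode):
    """Return how many hosts get mapped; only existing node records count."""
    return 1 if subnode.compute_node_record_exists() else 0


# Racy ordering: wait on the service record, then discover right away.
racy = FakeSubnode()
wait_for(racy.service_record_exists, sleep=lambda _: racy.tick())
racy_mapped = discover_hosts(racy)

# Safer ordering: wait on the record that discovery actually needs.
safe = FakeSubnode()
wait_for(safe.compute_node_record_exists, sleep=lambda _: safe.tick())
safe_mapped = discover_hosts(safe)

print(racy_mapped, safe_mapped)
```

In the racy ordering the wait returns at once and discovery maps nothing; in the safer ordering the wait blocks until the node record exists. Whether a real fix belongs in devstack's wait step or elsewhere is left open in the discussion above.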