*** mattw4 has quit IRC | 00:22 | |
*** Goneri has quit IRC | 00:31 | |
*** georgk has quit IRC | 00:31 | |
*** georgk has joined #openstack-infra | 00:32 | |
*** diablo_rojo has quit IRC | 00:35 | |
*** michael-beaver has quit IRC | 00:35 | |
*** markvoelker has joined #openstack-infra | 00:45 | |
*** yamamoto has joined #openstack-infra | 00:45 | |
*** jamesmcarthur has quit IRC | 00:46 | |
fungi | ianw: thanks for sending out tomorrow's meeting agenda! | 00:48 |
---|---|---|
*** markvoelker has quit IRC | 00:50 | |
*** gregoryo has joined #openstack-infra | 00:52 | |
*** auristor has quit IRC | 00:56 | |
*** jiaopengju has joined #openstack-infra | 01:04 | |
*** ricolin has joined #openstack-infra | 01:05 | |
*** auristor has joined #openstack-infra | 01:05 | |
*** hongbin has joined #openstack-infra | 01:32 | |
*** hwoarang has quit IRC | 01:36 | |
*** hwoarang has joined #openstack-infra | 01:38 | |
*** yamamoto has quit IRC | 01:42 | |
*** yamamoto has joined #openstack-infra | 01:42 | |
*** markvoelker has joined #openstack-infra | 01:46 | |
*** markvoelker has quit IRC | 01:50 | |
*** rajinir has quit IRC | 01:52 | |
*** goldyfruit has quit IRC | 01:57 | |
*** apetrich has quit IRC | 01:57 | |
*** dychen has joined #openstack-infra | 02:06 | |
*** dchen has quit IRC | 02:08 | |
*** dingyichen has joined #openstack-infra | 02:10 | |
*** dychen has quit IRC | 02:12 | |
*** lseki has quit IRC | 02:22 | |
*** markvoelker has joined #openstack-infra | 02:46 | |
*** bhavikdbavishi has joined #openstack-infra | 02:48 | |
*** mgoddard has quit IRC | 02:48 | |
*** markvoelker has quit IRC | 02:51 | |
*** bhavikdbavishi has quit IRC | 02:52 | |
*** whoami-rajat has joined #openstack-infra | 03:01 | |
*** mgoddard has joined #openstack-infra | 03:01 | |
*** tristanC has quit IRC | 03:01 | |
*** tristanC has joined #openstack-infra | 03:04 | |
*** xw19 has joined #openstack-infra | 03:05 | |
*** ramishra has joined #openstack-infra | 03:06 | |
*** mattw4 has joined #openstack-infra | 03:09 | |
*** bhavikdbavishi has joined #openstack-infra | 03:10 | |
*** markvoelker has joined #openstack-infra | 03:47 | |
*** hongbin has quit IRC | 03:47 | |
*** psachin has joined #openstack-infra | 03:50 | |
openstackgerrit | Merged opendev/system-config master: epel: mirror also aarch64 https://review.opendev.org/663973 | 03:51 |
*** markvoelker has quit IRC | 03:52 | |
*** udesale has joined #openstack-infra | 04:00 | |
*** weifan has joined #openstack-infra | 04:03 | |
*** threestrands has joined #openstack-infra | 04:05 | |
*** weifan has quit IRC | 04:06 | |
*** yboaron_ has joined #openstack-infra | 04:21 | |
*** yboaron_ has quit IRC | 04:25 | |
*** mattw4 has quit IRC | 04:29 | |
*** janki has joined #openstack-infra | 04:35 | |
*** xw19 has quit IRC | 04:41 | |
openstackgerrit | Merged openstack/project-config master: Add airship/election https://review.opendev.org/664641 | 04:44 |
*** markvoelker has joined #openstack-infra | 04:48 | |
*** markvoelker has quit IRC | 04:52 | |
*** ykarel|away has joined #openstack-infra | 04:53 | |
*** pcaruana has joined #openstack-infra | 04:56 | |
openstackgerrit | Joshua Hesketh proposed zuul/zuul master: Expose date time as facts https://review.opendev.org/664674 | 05:06 |
*** zzzeek has quit IRC | 05:07 | |
*** zzzeek has joined #openstack-infra | 05:08 | |
*** kjackal has quit IRC | 05:10 | |
*** raukadah is now known as chandankumar | 05:21 | |
*** rtjure has joined #openstack-infra | 05:24 | |
*** ykarel|away has quit IRC | 05:26 | |
*** MarkMaglana has quit IRC | 05:36 | |
*** masayukig has quit IRC | 05:36 | |
*** johnsom has quit IRC | 05:36 | |
*** csatari has quit IRC | 05:36 | |
*** MarkMaglana has joined #openstack-infra | 05:36 | |
*** rpioso has quit IRC | 05:36 | |
*** rpittau|afk has quit IRC | 05:36 | |
*** hogepodge has quit IRC | 05:36 | |
*** rpittau|afk has joined #openstack-infra | 05:37 | |
*** rpioso has joined #openstack-infra | 05:37 | |
*** masayukig has joined #openstack-infra | 05:37 | |
*** csatari has joined #openstack-infra | 05:38 | |
*** hogepodge has joined #openstack-infra | 05:38 | |
*** johnsom has joined #openstack-infra | 05:38 | |
*** ykarel|away has joined #openstack-infra | 05:39 | |
*** ykarel|away is now known as ykarel | 05:40 | |
*** dtantsur|afk is now known as dtantsur | 05:42 | |
*** weifan has joined #openstack-infra | 05:43 | |
*** weifan has quit IRC | 05:43 | |
*** e0ne has joined #openstack-infra | 05:44 | |
*** markvoelker has joined #openstack-infra | 05:49 | |
*** udesale has quit IRC | 05:51 | |
*** udesale has joined #openstack-infra | 05:52 | |
*** ykarel_ has joined #openstack-infra | 05:52 | |
*** markvoelker has quit IRC | 05:54 | |
*** ykarel has quit IRC | 05:55 | |
*** pcrews has quit IRC | 05:57 | |
yoctozepto | > <fungi> i don't think this is related to our mirrors | 06:00 |
ianw | i think we have an odd problem on nb01 at least, kswapd is using up a cpu and nodepool-builder seems very busy leaking memory ... i think it might be in a thread | 06:01 |
yoctozepto | > <fungi> none of our mirrors have bionic-updates/multiverse amd64 Packages, the passing job fetched it from archive.ubuntu.com instead | 06:02 |
yoctozepto | nope, the both use the mirrors | 06:02 |
yoctozepto | they* both | 06:02 |
yoctozepto | the first apt run uses the non-mirrored but this is irrelevant | 06:02 |
yoctozepto | hrw is doing right by disabling the multiverse and restricted | 06:03 |
yoctozepto | but the problem is that the two mirrors I posted act differently | 06:03 |
ianw | # pwd | 06:04 |
ianw | root@nb01:/proc/1103/task# ls -ltrah | wc -l | 06:04 |
ianw | 259 | 06:04 |
ianw | ... leaking threads | 06:04 |
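A minimal sketch of the thread-count check ianw ran above, assuming a Linux `/proc` filesystem. Note that `ls -ltrah | wc -l` also counts the `total` line and the `.` and `..` entries, so 259 lines corresponds to 256 task entries. The function name `count_threads` and the `proc_root` parameter are hypothetical, added for illustration only.

```python
import os


def count_threads(pid, proc_root="/proc"):
    """Count the threads of a process via its /proc/<pid>/task directory.

    Each thread appears as a numerically named subdirectory, so only
    those entries are counted (hypothetical helper, not from the log).
    """
    task_dir = os.path.join(proc_root, str(pid), "task")
    return sum(1 for name in os.listdir(task_dir) if name.isdigit())


if __name__ == "__main__":
    # Report the thread count of the current interpreter as a smoke test.
    print(count_threads(os.getpid()))
```

Sampling this periodically for the nodepool-builder pid would show whether the count keeps growing, which is what a thread leak would look like.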
yoctozepto | sorry for being too lazy in words :D | 06:04 |
ianw | yoctozepto: does it look like the Packages files are different sizes? | 06:04 |
yoctozepto | ianw: did not check but | 06:06 |
yoctozepto | The repository 'http://mirror.iad.rax.opendev.org/ubuntu bionic-updates Release' does not have a Release file. | 06:06 |
yoctozepto | this line is the difference between two mirror | 06:06 |
yoctozepto | mirrors* | 06:06 |
yoctozepto | damn, I am eating letters today | 06:06 |
ianw | it does though right? http://mirror.iad.rax.opendev.org/ubuntu/dists/bionic-updates/Release | 06:07 |
yoctozepto | ianw: it does now, though it must have not had it then | 06:08 |
yoctozepto | it's not a last minute job :D | 06:08 |
openstackgerrit | OpenStack Proposal Bot proposed openstack/project-config master: Normalize projects.yaml https://review.opendev.org/665869 | 06:09 |
yoctozepto | I was just letting you know that our failure was not due to multiverse/restricted as those were just warnings and we had them for some time since we switched back to using mirror | 06:09 |
yoctozepto | but now there were some random issues | 06:09 |
*** slaweq has joined #openstack-infra | 06:10 | |
ianw | yoctozepto: ok; well we currently have 3 things in flight WRT mirrors -- mirror.iad.rax.opendev.org is running kafs, mirror.dfw.rax.opendev.org is running openafs 1.8.3 and mirror.*.rax.openSTACK.org is still running openafs 1.6 | 06:10 |
yoctozepto | :DDD | 06:10 |
ianw | if something is missing in openDEV.org, just switching to openSTACK.org and confirming the file *is* there would be very helpful | 06:10 |
ianw | that would mean that somehow, the new kafs or openafs 1.8 clients have become out of sync | 06:10 |
yoctozepto | ianw: ok, I will be looking at it this way | 06:11 |
ianw | thanks; we can investigate but having something clearly out of sync would make it much easier to iterate and debug on :) | 06:11 |
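The cross-check ianw suggests (is a file missing on an opendev.org mirror but present on the openstack.org one?) could be sketched roughly as follows. This is an illustration, not a tool the team uses: `head_status` and `compare_mirrors` are hypothetical names, and the concrete openstack.org hostname in the usage section is an assumed expansion of the `mirror.*.rax.openSTACK.org` pattern.

```python
import urllib.error
import urllib.request


def head_status(url, timeout=30):
    """Return (HTTP status, Content-Length) for url, or (None, None) if unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status, response.headers.get("Content-Length")
    except urllib.error.HTTPError as err:
        # The server answered, but with an error such as 404.
        return err.code, None
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout and similar.
        return None, None


def compare_mirrors(hosts, path, probe=head_status):
    """Probe the same path on every host and report whether all results agree."""
    results = {
        host: probe("http://%s/%s" % (host, path.lstrip("/"))) for host in hosts
    }
    in_sync = len(set(results.values())) == 1
    return results, in_sync


if __name__ == "__main__":
    hosts = [
        "mirror.iad.rax.opendev.org",    # kafs
        "mirror.dfw.rax.opendev.org",    # openafs 1.8.3
        "mirror.iad.rax.openstack.org",  # assumed hostname, openafs 1.6
    ]
    results, in_sync = compare_mirrors(hosts, "ubuntu/dists/bionic-updates/Release")
    for host, result in results.items():
        print(host, result)
    print("in sync:", in_sync)
```

The `probe` argument is injectable so the comparison logic can be exercised without network access.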
*** janki has quit IRC | 06:15 | |
yoctozepto | ianw: sure, I was late with the debug | 06:15 |
openstackgerrit | Merged openstack/diskimage-builder master: Update test coverage for openSUSE/-minimal to 15.1 https://review.opendev.org/660137 | 06:21 |
*** e0ne has quit IRC | 06:26 | |
*** rpittau|afk is now known as rpittau | 06:28 | |
*** ykarel__ has joined #openstack-infra | 06:32 | |
*** igordc has quit IRC | 06:32 | |
*** ykarel_ has quit IRC | 06:34 | |
*** e0ne has joined #openstack-infra | 06:36 | |
*** pgaxatte has joined #openstack-infra | 06:43 | |
openstackgerrit | Merged openstack/diskimage-builder master: Remove the rhel 8 check for xfs https://review.opendev.org/663998 | 06:49 |
openstackgerrit | Merged openstack/diskimage-builder master: Use architecture-specific grub2 RPMs on RHEL8 https://review.opendev.org/663693 | 06:49 |
openstackgerrit | Merged openstack/diskimage-builder master: Move pypi to dib-python https://review.opendev.org/664240 | 06:49 |
*** markvoelker has joined #openstack-infra | 06:50 | |
*** dpawlik has joined #openstack-infra | 06:51 | |
*** markvoelker has quit IRC | 06:54 | |
*** trident has quit IRC | 06:57 | |
*** trident has joined #openstack-infra | 06:59 | |
*** apetrich has joined #openstack-infra | 07:05 | |
*** ginopc has joined #openstack-infra | 07:08 | |
*** udesale has quit IRC | 07:11 | |
*** udesale has joined #openstack-infra | 07:11 | |
*** e0ne has quit IRC | 07:13 | |
*** tosky has joined #openstack-infra | 07:15 | |
*** jpich has joined #openstack-infra | 07:22 | |
*** pcrews has joined #openstack-infra | 07:23 | |
*** tesseract has joined #openstack-infra | 07:24 | |
*** owalsh has quit IRC | 07:35 | |
*** owalsh has joined #openstack-infra | 07:35 | |
*** threestrands has quit IRC | 07:36 | |
*** iurygregory has joined #openstack-infra | 07:36 | |
*** ykarel__ is now known as ykarel|lunch | 07:40 | |
dirk | does anyone know how diskimage-builder gets released? I think I need a release with https://review.opendev.org/660137 included soonish | 07:40 |
*** xek has joined #openstack-infra | 07:41 | |
evrardjp | dirk: doesn't it follow standard releases? | 07:42 |
evrardjp | as _independent I mean | 07:42 |
dirk | doesn't look like. | 07:42 |
dirk | there is historical data in there but it doesn't have commits for the recent releases | 07:42 |
evrardjp | yeah I see | 07:43 |
evrardjp | mmm | 07:43 |
*** jpena|off is now known as jpena | 07:43 | |
*** kjackal has joined #openstack-infra | 07:45 | |
openstackgerrit | Tobias Urdin proposed opendev/system-config master: Mirror Train packages from UCA https://review.opendev.org/665896 | 07:46 |
*** kopecmartin|off is now known as kopecmartin | 07:49 | |
*** ralonsoh has joined #openstack-infra | 07:50 | |
*** markvoelker has joined #openstack-infra | 07:50 | |
*** markvoelker has quit IRC | 07:55 | |
*** spsurya has joined #openstack-infra | 08:01 | |
*** gregoryo has quit IRC | 08:02 | |
*** savihou has joined #openstack-infra | 08:03 | |
*** dingyichen has quit IRC | 08:14 | |
*** jpena has quit IRC | 08:15 | |
*** pkopec has joined #openstack-infra | 08:23 | |
*** e0ne has joined #openstack-infra | 08:23 | |
*** psachin has quit IRC | 08:24 | |
*** ociuhandu has quit IRC | 08:25 | |
*** lucasagomes has joined #openstack-infra | 08:26 | |
*** lucasagomes has quit IRC | 08:27 | |
*** dklyle has quit IRC | 08:35 | |
*** david-lyle has joined #openstack-infra | 08:35 | |
*** ykarel|lunch is now known as ykarel| | 08:35 | |
*** ykarel| is now known as ykarel | 08:36 | |
openstackgerrit | Iury Gregory Melo Ferreira proposed openstack/project-config master: Adding CI for ironic-prometheus-exporter https://review.opendev.org/665910 | 08:37 |
*** priteau has joined #openstack-infra | 08:38 | |
*** imacdonn has quit IRC | 08:40 | |
*** imacdonn has joined #openstack-infra | 08:40 | |
*** ociuhandu has joined #openstack-infra | 08:41 | |
iurygregory | good morning | 08:41 |
iurygregory | fungi, you around? About project creation, I've pushed the separate change for the CI job, should I use Depends-On based on the first change? | 08:43 |
*** jpena has joined #openstack-infra | 08:44 | |
*** gfidente has joined #openstack-infra | 08:45 | |
*** ociuhandu has quit IRC | 08:46 | |
*** derekh has joined #openstack-infra | 08:47 | |
*** tkajinam has quit IRC | 08:50 | |
*** markvoelker has joined #openstack-infra | 08:51 | |
*** lucasagomes has joined #openstack-infra | 08:55 | |
*** markvoelker has quit IRC | 08:56 | |
*** bhavikdbavishi has quit IRC | 09:01 | |
*** sshnaidm|afk is now known as sshnaidm | 09:04 | |
ianw | dirk: i can do a release now, was just waiting for those bits to merge | 09:04 |
ianw | hrw: also the epel release should be done with the new arch stuff | 09:05 |
*** ginopc has quit IRC | 09:06 | |
*** ginopc has joined #openstack-infra | 09:07 | |
*** ginopc has quit IRC | 09:12 | |
*** ginopc has joined #openstack-infra | 09:17 | |
ianw | mordred: see note on leaking threads in nodepool-builder -> https://review.opendev.org/#/c/665014/1 | 09:17 |
*** ociuhandu has joined #openstack-infra | 09:20 | |
ianw | dirk: 2.24.0 should be making its way through | 09:23 |
*** yamamoto has quit IRC | 09:23 | |
*** yamamoto has joined #openstack-infra | 09:24 | |
*** Adri2000 has quit IRC | 09:26 | |
openstackgerrit | Merged openstack/project-config master: Stop building control plane images https://review.opendev.org/665014 | 09:28 |
*** yamamoto has quit IRC | 09:29 | |
*** panda|off is now known as panda | 09:30 | |
*** yboaron_ has joined #openstack-infra | 09:43 | |
*** savihou has quit IRC | 09:43 | |
openstackgerrit | Fabien Boucher proposed zuul/zuul master: Add missing doc for pipeline start-message https://review.opendev.org/665930 | 09:45 |
*** rcernin has quit IRC | 09:57 | |
openstackgerrit | Fabien Boucher proposed zuul/zuul master: Add missing start-message in pipeline config schema https://review.opendev.org/665936 | 10:12 |
openstackgerrit | Fabien Boucher proposed zuul/zuul master: Add missing doc for pipeline start-message https://review.opendev.org/665930 | 10:13 |
*** ykarel is now known as ykarel|afk | 10:27 | |
*** pkopec has quit IRC | 10:38 | |
*** pkopec has joined #openstack-infra | 10:39 | |
*** yamamoto has joined #openstack-infra | 10:41 | |
*** derekh has quit IRC | 10:42 | |
*** derekh has joined #openstack-infra | 10:42 | |
*** psachin has joined #openstack-infra | 10:48 | |
*** jpich has quit IRC | 10:49 | |
*** yamamoto has quit IRC | 10:50 | |
*** jpich has joined #openstack-infra | 10:50 | |
*** udesale has quit IRC | 10:57 | |
*** yamamoto has joined #openstack-infra | 10:59 | |
*** Lucas_Gray has joined #openstack-infra | 11:01 | |
*** ykarel|afk is now known as ykarel | 11:19 | |
*** bhavikdbavishi has joined #openstack-infra | 11:28 | |
*** jpena is now known as jpena|lunch | 11:32 | |
*** bhavikdbavishi has quit IRC | 11:35 | |
*** bhavikdbavishi has joined #openstack-infra | 11:42 | |
*** Adri2000 has joined #openstack-infra | 11:46 | |
*** yamamoto has quit IRC | 11:48 | |
*** yamamoto has joined #openstack-infra | 11:49 | |
*** yamamoto has quit IRC | 11:53 | |
*** rfolco has joined #openstack-infra | 11:59 | |
*** Chosimba1 has joined #openstack-infra | 12:03 | |
*** Chosimba1 has quit IRC | 12:03 | |
*** ykarel is now known as ykarel|afk | 12:04 | |
*** lpetrut has joined #openstack-infra | 12:05 | |
*** tdasilva has joined #openstack-infra | 12:10 | |
fungi | yoctozepto: sorry, there was no way to refer to specific lines in that log you linked, but the end of the file claimed errors retrieving the packages file for multiverse. can you quote the line with the actual error on it? | 12:12 |
*** goldyfruit has joined #openstack-infra | 12:15 | |
fungi | dirk: ianw: looks like https://pypi.org/project/diskimage-builder/ has 2.24.0 now | 12:15 |
*** jcoufal has joined #openstack-infra | 12:17 | |
yoctozepto | fungi: the line is: The repository 'http://mirror.iad.rax.opendev.org/ubuntu bionic-updates Release' does not have a Release file. | 12:17 |
*** _erlon_ has joined #openstack-infra | 12:17 | |
yoctozepto | but it does now | 12:17 |
yoctozepto | so I'll let you know when it stops again :D | 12:18 |
fungi | ahh, okay | 12:19 |
openstackgerrit | Jean-Philippe Evrard proposed zuul/zuul master: Expose date time as facts https://review.opendev.org/664674 | 12:19 |
fungi | going back over the log you linked now to see if i would have spotted it under other circumstances (the lines about the multiverse packages file were "e" lines not "w" lines) | 12:20 |
fungi | yoctozepto: and the one you're saying is the problem is a "w" line, not an "e" line | 12:21 |
fungi | implying it's just a warning | 12:21 |
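The distinction fungi draws here relies on apt prefixing error lines with `E:` and warning lines with `W:`. A rough sketch of separating the two when reading a job log, with the helper name `split_apt_messages` and the sample lines in the usage section being hypothetical:

```python
def split_apt_messages(lines):
    """Split apt output into (errors, warnings) based on the E:/W: prefixes."""
    errors, warnings = [], []
    for line in lines:
        text = line.strip()
        if text.startswith("E: "):
            errors.append(text[3:])
        elif text.startswith("W: "):
            warnings.append(text[3:])
    return errors, warnings


if __name__ == "__main__":
    # Illustrative input only, not copied from the job log under discussion.
    sample = [
        "Get:1 http://mirror.example/ubuntu bionic InRelease",
        "W: example warning text",
        "E: example error text",
    ]
    errors, warnings = split_apt_messages(sample)
    print("errors:", errors)
    print("warnings:", warnings)
```

Job logs usually carry a timestamp or task prefix before the apt text, so a real filter would likely need to strip that first.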
*** rh-jelabarre has joined #openstack-infra | 12:21 | |
yoctozepto | fungi: yup, but that E thing should be only a W if only it had a Release for that repo | 12:21 |
yoctozepto | but it did not | 12:21 |
yoctozepto | see the other warnings | 12:22 |
yoctozepto | and in the "working" case | 12:22 |
yoctozepto | they are warnings at most | 12:22 |
yoctozepto | because blah blah does not have this set of stuff | 12:22 |
yoctozepto | worse when it does not tell you what it has :-) | 12:22 |
fungi | yeah, but in the working case you linked it didn't retrieve those files from our mirror. it clearly says it was fetching those files from archive.ubuntu.com | 12:24 |
fungi | which is *not* our mirror | 12:24 |
*** sthussey has joined #openstack-infra | 12:25 | |
fungi | ahh, i do see elsewhere in that log where it also fetched package metadata from the mirror.dfw.rax | 12:26 |
*** rh-jelabarre has quit IRC | 12:27 | |
*** rh-jelabarre has joined #openstack-infra | 12:27 | |
fungi | and the failing one also gets package metadata from archive.ubuntu.com | 12:27 |
fungi | okay, now i see why i was confused | 12:27 |
fungi | so the job is using both our mirror and ubuntu's servers | 12:28 |
pgaxatte | m | 12:29 |
yoctozepto | fungi: yes, tried to told ya | 12:30 |
yoctozepto | tell* ya | 12:30 |
fungi | "both the passing and failing runs use a mix of archive.ubuntu.com and the opendev mirrors" is the bit i missed you saying (if you said it). thanks for bearing with me | 12:32 |
*** udesale has joined #openstack-infra | 12:32 | |
*** pkopec has quit IRC | 12:32 | |
fungi | also dirk had indicated earlier in the day that kolla's jobs were *broken* because of the missing restricted and multiverse on our mirrors, so i had thought they weren't going to pass until his change merged to remove those | 12:34 |
openstackgerrit | Fabien Boucher proposed zuul/zuul master: Add support for item.change for pipeline start-message formater https://review.opendev.org/665968 | 12:35 |
fungi | but maybe they weren't actually broken, just logging stray warnings? | 12:35 |
*** yamamoto has joined #openstack-infra | 12:35 | |
dirk | fungi: ianw: great, thanks | 12:36 |
fungi | er, not dirk but hrw (sorry dirk!) | 12:36 |
dirk | fungi: thanks for clarifying (I was still digging through my fading memories on what you're referring to .. ;-) ) | 12:37 |
fungi | heh, nope, i'm just insufficiently caffeinated at this time of morning | 12:37 |
dirk | can we W+1 https://review.opendev.org/#/c/660131/ now? | 12:37 |
dirk | I'd like to see the 15.1 leap images being in a testable state soon | 12:38 |
*** goldyfruit has quit IRC | 12:38 | |
*** lpetrut has quit IRC | 12:39 | |
*** rlandy has joined #openstack-infra | 12:40 | |
*** jpena|lunch is now known as jpena | 12:40 | |
fungi | dirk: i'll check the builders now to see if the release has propagated to all of them | 12:42 |
fungi | yep, all three have diskimage-builder 2.24.0 now, so approving | 12:43 |
openstackgerrit | Fabien Boucher proposed zuul/zuul master: Add change replacement field in doc for start-message https://review.opendev.org/665974 | 12:47 |
*** priteau has quit IRC | 12:49 | |
*** markvoelker has joined #openstack-infra | 12:55 | |
openstackgerrit | Mark Meyer proposed zuul/zuul master: Extend event reporting https://review.opendev.org/662134 | 12:57 |
openstackgerrit | Merged openstack/project-config master: Add openSUSE 15.1 to the nodepool building as opensuse-15 https://review.opendev.org/660131 | 12:58 |
*** ykarel|afk is now known as ykarel | 12:59 | |
*** markvoelker has quit IRC | 13:00 | |
*** goldyfruit has joined #openstack-infra | 13:00 | |
*** pkopec has joined #openstack-infra | 13:02 | |
*** Goneri has joined #openstack-infra | 13:02 | |
*** mriedem has joined #openstack-infra | 13:06 | |
*** roman_g has quit IRC | 13:08 | |
*** aaronsheffield has joined #openstack-infra | 13:16 | |
*** priteau has joined #openstack-infra | 13:19 | |
*** roman_g has joined #openstack-infra | 13:20 | |
goldyfruit | fungi, they fixed it the way you mentioned yesterday: https://review.opendev.org/#/c/665939 :) | 13:29 |
*** kjackal has quit IRC | 13:35 | |
mnaser | hmm | 13:36 |
*** kjackal has joined #openstack-infra | 13:36 | |
mnaser | http://zuul.openstack.org/stream/b0f02e7d547b41078a8a4ede9f6dd94a?logfile=console.log | 13:36 |
mnaser | this is about to timeout | 13:36 |
mnaser | been stuck at fetch-zuul-cloner for a while | 13:36 |
*** pcaruana has quit IRC | 13:37 | |
*** bhavikdbavishi has quit IRC | 13:39 | |
fungi | goldyfruit: that's great, thanks for following up! | 13:39 |
goldyfruit | np :) | 13:40 |
fungi | mnaser: the good news is that monday we'll be removing fetch-zuul-cloner from our base job | 13:40 |
mnaser | fungi: heh, /me hopes they have removed all references | 13:41 |
*** lseki has joined #openstack-infra | 13:41 | |
*** yamamoto has quit IRC | 13:41 | |
fungi | well, legacy-base and similar abstract jobs will still have it | 13:42 |
*** yamamoto has joined #openstack-infra | 13:42 | |
fungi | but this way non-legacy jobs don't need to incur the setup overhead | 13:42 |
*** yamamoto has quit IRC | 13:42 | |
mordred | morning fungi ! | 13:43 |
*** yamamoto has joined #openstack-infra | 13:43 | |
mordred | fungi: anything I should look at or start wrapping my head around? | 13:43 |
fungi | nothing's on fire | 13:43 |
fungi | hoping it stays that way | 13:44 |
openstackgerrit | Merged openstack/os-testr master: Deprecate ostestr command https://review.opendev.org/573636 | 13:44 |
mordred | fungi: yay! it's wonderful when things are not on fire. at least, unless the things in question are things you want to be on fire, like if you're trying to start a fire | 13:46 |
*** psachin has quit IRC | 13:46 | |
*** udesale has quit IRC | 13:48 | |
*** udesale has joined #openstack-infra | 13:49 | |
*** ykarel is now known as ykarel|away | 13:50 | |
*** rkukura has quit IRC | 13:53 | |
*** liuyulong has joined #openstack-infra | 13:53 | |
*** pcaruana has joined #openstack-infra | 13:54 | |
mnaser | fungi: fyi.. Connection to mirror.iad.rax.opendev.org timed out. (connect timeout=60.0) | 13:55 |
mnaser | that's how fetch-zuul-cloner failed | 13:55 |
mnaser | so not sure if that is the vm or the mirror | 13:56 |
*** markvoelker has joined #openstack-infra | 13:56 | |
*** michael-beaver has joined #openstack-infra | 13:56 | |
mnaser | http://zuul.openstack.org/stream/081d904f92b946399f69d44135d81bac?logfile=console.log | 13:56 |
mnaser | this is another rax iad that is about to fail the same way | 13:56 |
mnaser | maybe someone needs to check in on that mirror? | 13:56 |
*** ekultails has joined #openstack-infra | 13:58 | |
fungi | yeah, logging in | 14:05 |
*** ykarel|away has quit IRC | 14:06 | |
fungi | it's entirely unresponsive to ssh and even ping | 14:06 |
corvus | maybe it got sacked | 14:06 |
fungi | v6 or v4 | 14:07 |
*** bhavikdbavishi has joined #openstack-infra | 14:07 | |
*** liuyulong has quit IRC | 14:07 | |
fungi | openstack server show indicates "status SHUTOFF" | 14:08 |
fungi | ip addresses match so that's got to be the one | 14:09 |
corvus | cacti's last reading was at 09:55 | 14:09 |
fungi | rebooting it now | 14:09 |
mordred | well - with status SHUTOFF, it is appropriate for it to be non-functional | 14:09 |
*** chandankumar is now known as raukadah | 14:10 | |
fungi | did a `openstack server start ...` on it | 14:10 |
fungi | i'm getting ping responses again | 14:11 |
fungi | able to ssh in now. digging into its logs in a sec | 14:11 |
corvus | Jun 18 07:30:06 mirror01 kernel: [375159.154029] apache2: page allocation failure: order:0, mode:0x90c00(GFP_NOIO|__GFP_NORETRY|__GFP_NOMEMALLOC), nodemask=(null),cpuset=/,mems_allowed=0 | 14:11 |
corvus | last entry in syslog is: Jun 18 10:00:01 mirror01 CRON[29035]: (root) CMD (/usr/bin/flock -n /var/run/htcacheclean.lock /usr/bin/htcacheclean -n -p /var/cache/apache2/proxy -t -l 70200M > /dev/null) | 14:12 |
fungi | that's nifty | 14:12 |
fungi | this is our linux 5.2-rc4 with kafs, for those following along | 14:12 |
yoctozepto | > <fungi> but maybe they weren't actually broken, just logging stray warnings? | 14:13 |
*** mattmceuen has left #openstack-infra | 14:13 | |
yoctozepto | we don't use them so they should only produce warnings | 14:13 |
yoctozepto | we are removing them obviously so that they actually don't | 14:13 |
yoctozepto | though in this case it would just fail but with a different message | 14:14 |
fungi | yoctozepto: right, hrw had implied yesterday (or else i'd misunderstood him) that kolla's builds broke after switching to the mirrors because we're not mirroring restricted and multiverse and they were in the sources lists | 14:14 |
*** liuyulong has joined #openstack-infra | 14:15 | |
*** markvoelker has quit IRC | 14:15 | |
yoctozepto | fungi: yeah, well, we had the same change applied to stein recently so maybe he shifted the timeline, but for master they worked for some time | 14:16 |
yoctozepto | that's why I investigated further 8-) | 14:17 |
*** rajinir has joined #openstack-infra | 14:17 | |
*** jcoufal has quit IRC | 14:17 | |
yoctozepto | https://www.youtube.com/watch?v=mznsEcZlM2I | 14:17 |
fungi | Jun 18 09:15:46 mirror01 kernel: [381498.906721] [apache] vnode modified 3b6 on {20000038:3} [exp 3b5] FS.FetchStatus(vnode) | 14:19 |
fungi | wish i had the faintest idea what that meant... presumably kafs related? | 14:20 |
fungi | seems to happen with some regularity | 14:21 |
fungi | there were several page allocation failures for apache2 in yesterday's syslog | 14:23 |
*** icarusfactor has joined #openstack-infra | 14:23 | |
fungi | and a few in thursday's syslog | 14:23 |
fungi | no smoking gun here | 14:24 |
fungi | maybe the htcacheclean cron freaked it out somehow? no clue | 14:25 |
*** icarusfactor has quit IRC | 14:25 | |
*** factor has quit IRC | 14:25 | |
corvus | fungi: yeah, vnode is an afs term | 14:25 |
*** icarusfactor has joined #openstack-infra | 14:25 | |
zbr|ruck | fungi: do you remember why removal of fetch-zuul-cloner was delayed? https://review.opendev.org/#/c/663151/1 | 14:26 |
*** ianychoi_ is now known as ianychoi | 14:26 | |
*** lucasagomes has quit IRC | 14:26 | |
*** dpawlik has quit IRC | 14:27 | |
*** e0ne has quit IRC | 14:28 | |
*** priteau has quit IRC | 14:29 | |
*** bhavikdbavishi has quit IRC | 14:29 | |
*** lucasagomes has joined #openstack-infra | 14:30 | |
zbr|ruck | so iad mirror is messed... time to go for another coffee | 14:30 |
fungi | zbr|ruck: turns out it had been baked into numerous tools/tox_install.sh scripts (i'm unclear on why that meant we had to have it installed for ci jobs though), and AJaeger along with others worked to eradicate those wrapper scripts from projects | 14:31 |
fungi | zbr|ruck: we think the iad mirror is back up and working as of ~14:15z | 14:31 |
*** jamesmcarthur has joined #openstack-infra | 14:31 | |
fungi | something caused the instance for it to spontaneously go into shutoff state | 14:32 |
fungi | investigation is still underway | 14:32 |
fungi | #status log mirror.iad.rax.opendev.org started again at 14:10z after mysteriously entering shutoff state at 10:00z | 14:33 |
openstackstatus | fungi: finished logging | 14:33 |
fungi | we don't have any new tickets from the provider indicating a reason it might have been stopped externally | 14:39 |
*** pgaxatte has quit IRC | 14:40 | |
corvus | i guess we let it continue to run and see if the new kernel is more likely to crash like that, or if that was just a cosmic ray? | 14:41 |
corvus | fungi, mordred: is there console access to shut-off machines in rax? (ie, if this happens again, can we see the last thing that was printed to the console before shutoff?) | 14:42 |
mordred | corvus: I don't know | 14:43 |
mordred | I know console access there is weird in general | 14:43 |
zbr|ruck | fungi: at 14:26 it was still down based on http://logs.openstack.org/81/665981/1/check/openstack-tox-cover/5354bb1/job-output.txt.gz#_2019-06-18_14_26_33_694064 | 14:44 |
*** priteau has joined #openstack-infra | 14:44 | |
corvus | fungi: we may need to manually remount afs? | 14:44 |
fungi | hmm, maybe | 14:45 |
corvus | this is in history: mount -t afs "#openstack.org:root.afs." /afs | 14:46 |
fungi | in ianw's shell history? | 14:46 |
corvus | yes | 14:46 |
fungi | cool, i was digging for references to mount in there but hadn't found that one yet | 14:47 |
corvus | oh, there's no kafs mod loaded yet | 14:47 |
fungi | oh! | 14:47 |
corvus | modprobe kafs rootcell=openstack.org:104.130.136.20:23.253.200.228 | 14:47 |
corvus | that's also in history | 14:47 |
*** jcoufal has joined #openstack-infra | 14:47 | |
corvus | so how about i run those? | 14:48 |
fungi | we probably need something like that in a file under /etc/modules-load.d/ so it survives reboots | 14:48 |
fungi | yes, please | 14:48 |
corvus | http://mirror.iad.rax.opendev.org/ubuntu/dists/bionic/universe/binary-amd64/Packages works now | 14:48 |
fungi | the manual mount may or may not be necessary... it sounded like kafs might automount volumes but not sure if the root /afs mount has to be done explicitly first | 14:49 |
corvus | #status log ran "modprobe kafs rootcell=openstack.org:104.130.136.20:23.253.200.228" and "mount -t afs "#openstack.org:root.afs." /afs" on mirror01.iad.rax.opendev.org after reboot | 14:49 |
openstackstatus | corvus: finished logging | 14:49 |
corvus | fungi: i did an 'ls /afs' between the two and nada | 14:49 |
fungi | k | 14:50 |
fungi | sounds like it's needed after all | 14:50 |
corvus | fungi: yeah, i think even if we switch to the dynamic root option, we still need to mount /afs | 14:50 |
corvus | there was a command in the docs like "mount -t afs none /afs" | 14:50 |
openstackgerrit | Mark Meyer proposed zuul/zuul master: Extend event reporting https://review.opendev.org/662134 | 14:51 |
*** e0ne has joined #openstack-infra | 14:52 | |
fungi | i guess that's so the kernel knows where on your filesystem tree you want to graft the afs root | 14:53 |
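One possible way to make the kafs module load survive a reboot, following fungi's `/etc/modules-load.d/` suggestion, is sketched below. It assumes the usual split where a `modules-load.d` file lists only module names and the `rootcell=` parameter goes in a `modprobe.d` `options` line. The file names and the helper `render_kafs_boot_config` are hypothetical, and the `/afs` mount itself is not covered.

```python
def render_kafs_boot_config(rootcell):
    """Return {path: contents} for files that would load kafs at boot.

    Both paths are illustrative choices, not files known to exist on
    the mirror host.
    """
    return {
        # modules-load.d files list one module name per line.
        "/etc/modules-load.d/kafs.conf": "kafs\n",
        # Module parameters are supplied through a modprobe.d options line.
        "/etc/modprobe.d/kafs.conf": "options kafs rootcell=%s\n" % rootcell,
    }


if __name__ == "__main__":
    # rootcell value as used in the manual modprobe command above.
    config = render_kafs_boot_config(
        "openstack.org:104.130.136.20:23.253.200.228"
    )
    for path, contents in sorted(config.items()):
        print("# %s" % path)
        print(contents, end="")
```

Keeping the rendering separate from writing the files makes it easy to review the contents before anything lands on the host, for example through configuration management.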
*** mattw4 has joined #openstack-infra | 14:54 | |
*** mattw4 has quit IRC | 14:54 | |
*** mattw4 has joined #openstack-infra | 14:55 | |
*** priteau has quit IRC | 14:58 | |
*** jamesmcarthur_ has joined #openstack-infra | 14:59 | |
*** mattw4 has quit IRC | 15:01 | |
*** spsurya has quit IRC | 15:01 | |
*** diablo_rojo has joined #openstack-infra | 15:02 | |
*** jamesmcarthur has quit IRC | 15:02 | |
*** eharney has quit IRC | 15:06 | |
*** gyee has joined #openstack-infra | 15:07 | |
*** bhavikdbavishi has joined #openstack-infra | 15:09 | |
*** markvoelker has joined #openstack-infra | 15:12 | |
*** jiaopengju has quit IRC | 15:12 | |
*** tdasilva has quit IRC | 15:19 | |
*** tdasilva has joined #openstack-infra | 15:19 | |
*** whoami-rajat has quit IRC | 15:21 | |
openstackgerrit | jacky06 proposed openstack/diskimage-builder master: Sync Sphinx requirement https://review.opendev.org/666023 | 15:26 |
*** iurygregory has quit IRC | 15:26 | |
openstackgerrit | jacky06 proposed openstack/diskimage-builder master: Sync Sphinx requirement https://review.opendev.org/666023 | 15:27 |
*** e0ne has quit IRC | 15:28 | |
openstackgerrit | jacky06 proposed openstack/diskimage-builder master: Sync Sphinx requirement https://review.opendev.org/666023 | 15:30 |
*** markvoelker has quit IRC | 15:31 | |
*** markmcd has quit IRC | 15:31 | |
*** lucasagomes has quit IRC | 15:31 | |
*** zbr|ruck is now known as zbr|brb | 15:34 | |
*** hamzy_ has joined #openstack-infra | 15:38 | |
*** yboaron_ has quit IRC | 15:38 | |
*** michael-beaver has quit IRC | 15:39 | |
*** mgagne has quit IRC | 15:39 | |
*** d34dh0r53 has quit IRC | 15:39 | |
*** rajinir has quit IRC | 15:40 | |
*** masayukig has quit IRC | 15:40 | |
*** hamzy has quit IRC | 15:40 | |
*** d34dh0r53 has joined #openstack-infra | 15:40 | |
*** rajinir has joined #openstack-infra | 15:40 | |
*** michael-beaver has joined #openstack-infra | 15:40 | |
*** masayukig has joined #openstack-infra | 15:40 | |
*** mgagne has joined #openstack-infra | 15:42 | |
*** ekultails has quit IRC | 15:42 | |
*** goldyfruit has quit IRC | 15:43 | |
mwhahaha | how do we retire a project from storyboard? | 15:44 |
*** mattw4 has joined #openstack-infra | 15:44 | |
*** gfidente has quit IRC | 15:45 | |
*** mattw4 has quit IRC | 15:46 | |
fungi | mwhahaha: we have a toggle in the stories table for is_active which ought to do the trick. which project is it? | 15:47 |
mwhahaha | openstack/tripleo-ui | 15:47 |
fungi | i've set openstack/tripleo-ui to inactive... let's see if that has the effect we desire | 15:48 |
*** hamzy_ is now known as hamzy | 15:48 | |
fungi | still able to find the project in the jump-to search | 15:50 |
mwhahaha | yea it still shows up in the project group and has stories still | 15:51 |
fungi | still gives me the option to add a story for it too... i'll do a bit more research, though to remove it from the project group you just get rid of the groups entry in gerrit/projects.yaml | 15:51 |
openstackgerrit | James E. Blair proposed zuul/nodepool master: WIP: new devstack-based functional job https://review.opendev.org/665023 | 15:52 |
mwhahaha | we have a patch up to do that i think as well | 15:52 |
mwhahaha | i was going to just go through all the stories and mark them as invalid but the client lacks functionality and trying to craft the api calls isn't high on my list | 15:52 |
fungi | mwhahaha: in launchpad, you can close down reporting of new bugs for a project (though you can't prevent bugtasks being added to other bugs as affecting that project, or people updating the existing bugs)... i'm assuming we would like something more comprehensive than that at least | 15:53 |
mwhahaha | right but i can bulk act on the bugs for a project via the lp client | 15:54 |
mwhahaha | doesn't seem like that's currently available from storyboardclient | 15:54 |
fungi | it's likely we've never discussed what behavior we expect for inactive projects in sb | 15:54 |
fungi | but this provides a good opportunity to do so | 15:54 |
fungi | mwhahaha: which lp client? launchpadlib? | 15:55 |
mwhahaha | yea | 15:55 |
*** goldyfruit has joined #openstack-infra | 15:55 | |
mwhahaha | cause i can get all the tasks for my project and set them all to wontfix or whatever | 15:55 |
mwhahaha | there's also not a wontfix state | 15:55 |
fungi | got it, i never knew you could tell it to take an action on multiple bugs matching a particular query | 15:55 |
mwhahaha | well i can query them all | 15:55 |
mwhahaha | there is no search implementation in the python client | 15:55 |
mwhahaha | nor does there seem to be a way to fetch them all. I can fetch *all* stories just not by project | 15:56 |
mwhahaha | which is less than ideal | 15:56 |
fungi | for won't fix, would invalid status suffice? | 15:56 |
mwhahaha | kind of | 15:56 |
mwhahaha | but not really because they have two different meanings | 15:56 |
mwhahaha | i was going to set them all to invalid but only because there isn't a better option | 15:57 |
*** _erlon_ has quit IRC | 15:57 | |
*** mattw4 has joined #openstack-infra | 15:58 | |
*** pcaruana has quit IRC | 16:01 | |
*** igordc has joined #openstack-infra | 16:03 | |
*** liuyulong has quit IRC | 16:04 | |
*** tdasilva_ has joined #openstack-infra | 16:10 | |
*** tdasilva has quit IRC | 16:12 | |
*** rpittau is now known as rpittau|afk | 16:14 | |
*** tdasilva_ is now known as tdasilva | 16:17 | |
*** ekultails has joined #openstack-infra | 16:17 | |
yoctozepto | fungi: http://logs.openstack.org/17/664217/1/check/kolla-build-ubuntu-binary/f9b833e/ara-report/ | 16:19 |
yoctozepto | here is the failure with this mirror now | 16:19 |
yoctozepto | forbidden now | 16:19 |
fungi | yoctozepto: wow, that suggests apache felt it lacked read permission to that file i suspect | 16:20 |
fungi | -rw-r--r-- 1 10004 root 7481056 Apr 26 2018 /afs/openstack.org/mirror/ubuntu/dists/bionic/main/binary-amd64/Packages | 16:21 |
fungi | that's from the mirror server in question | 16:21 |
fungi | so seems world-readable | 16:21 |
corvus | was that during our downtime? | 16:22 |
fungi | and i can cat the file as an unprivileged user | 16:22 |
fungi | 2019-06-18 14:44:02.016899 | 16:22 |
fungi | i think that's when /afs wasn't mounted yet | 16:22 |
corvus | yes... roughly 14:30 - 14:45 was our downtime due to not having afs mounted | 16:22 |
corvus | and 403 forbidden was the error for that | 16:22 |
yoctozepto | k then | 16:23 |
yoctozepto | doing recheck | 16:23 |
corvus | yoctozepto: so that was a transient infra error, sorry. you can recheck | 16:23 |
fungi | yoctozepto: if you see gets which happened after 14:45z hitting something similar please do let us know | 16:23 |
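For anyone triaging similar 403s, a minimal sketch of the check described above: does a job-output timestamp fall inside the roughly 14:30 to 14:45 window when /afs was not mounted? The window bounds come from the discussion; the function name `in_downtime` and treating the bounds as inclusive UTC times are assumptions made for illustration.

```python
from datetime import datetime

# Downtime window quoted in the channel ("roughly 14:30 - 14:45"), assumed UTC.
DOWNTIME_START = datetime(2019, 6, 18, 14, 30)
DOWNTIME_END = datetime(2019, 6, 18, 14, 45)


def in_downtime(stamp: str) -> bool:
    """Return True if a job-output style timestamp falls inside the window.

    `stamp` is expected in the "YYYY-MM-DD HH:MM:SS.ffffff" form that the
    job logs quoted in this discussion use.
    """
    when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S.%f")
    return DOWNTIME_START <= when <= DOWNTIME_END


if __name__ == "__main__":
    # Timestamp of the failure discussed above.
    print(in_downtime("2019-06-18 14:44:02.016899"))
```

A failure stamped inside the window can simply be rechecked; one stamped after 14:45z is the kind worth reporting.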
yoctozepto | ok, staying vigilant 8-) | 16:23 |
fungi | appreciated! | 16:23 |
fungi | we want to be able to use newer ubuntu for our mirror servers, which means we need working afs on arm64 (for our arm clouds), which means we'd really like for kafs to work. so far no solid evidence of it being broken on linux 5.2, but we do have a couple of suspicious incidents and no real pattern yet | 16:25 |
yoctozepto | fungi: I see, thanks for explanation | 16:27 |
*** markvoelker has joined #openstack-infra | 16:28 | |
*** jpich has quit IRC | 16:32 | |
fungi | the missing http://mirror.iad.rax.opendev.org/ubuntu/dists/bionic-updates/Release for http://logs.openstack.org/21/665621/3/check/kolla-build-ubuntu-binary/a84b8fb/job-output.txt.gz#_2019-06-17_18_11_17_213998 was ~4.5 hours prior to the "Date: Mon, 17 Jun 2019 22:41:00 UTC" stamp in the current existing file | 16:33 |
fungi | i'm going to see if i can tell when reprepro updated that | 16:34 |
*** weifan has joined #openstack-infra | 16:34 | |
fungi | i think that was updated in the vos release which completed at 2019-06-17T22:55:59 | 16:37 |
*** weifan has quit IRC | 16:38 | |
*** zbr|brb is now known as zbr|ruck | 16:38 | |
fungi | the last vos release before that failure completed at 2019-06-17T16:39:07 | 16:43 |
fungi | a new reprepro run started at 2019-06-17T18:29:01 which is 17m44s after the failure, so unless the node's clock was waaay off that's not related | 16:44 |
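The timing comparison above can be reproduced with a short sketch. The three timestamps are the ones quoted in the discussion (the failure time is taken from the job-output anchor); the helper name `gap` and dropping the failure's microseconds before subtracting are choices made for illustration.

```python
from datetime import datetime

# Failure time from the job-output anchor, truncated to whole seconds.
failure = datetime.strptime(
    "2019-06-17_18_11_17_213998", "%Y-%m-%d_%H_%M_%S_%f"
).replace(microsecond=0)

# Times quoted in the channel for the mirror update machinery.
last_release_before = datetime.fromisoformat("2019-06-17T16:39:07")
reprepro_start = datetime.fromisoformat("2019-06-17T18:29:01")


def gap(later: datetime, earlier: datetime) -> str:
    """Format the difference between two times as '<minutes>m<seconds>s'."""
    seconds = int((later - earlier).total_seconds())
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}m{secs:02d}s"


if __name__ == "__main__":
    # The failure sits between the previous vos release and the next reprepro run.
    print(last_release_before < failure < reprepro_start)
    print(gap(reprepro_start, failure))
```

Since the failure falls after the previous vos release finished and before the next reprepro run began, neither appears to have been in progress at the time.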
*** pcaruana has joined #openstack-infra | 16:45 | |
fungi | time on the mirror-update server seems to be accurate at least | 16:45 |
*** ricolin has quit IRC | 16:46 | |
*** markvoelker has quit IRC | 16:46 | |
openstackgerrit | Merged zuul/zuul master: Change default job_dir location https://review.opendev.org/665186 | 16:47 |
openstackgerrit | James E. Blair proposed ttygroup/gertty master: Add prev/next patchset keys to diff view https://review.opendev.org/666049 | 16:49 |
*** igordc has quit IRC | 16:49 | |
openstackgerrit | Merged zuul/zuul master: Make git repo leak check advisory in TestExecutor https://review.opendev.org/665764 | 16:50 |
*** e0ne has joined #openstack-infra | 16:51 | |
aspiers | stephenfin: regarding https://review.opendev.org/#/c/660732/, can you suggest where I should get a sample lower-constraints.txt from? And how to prevent it rapidly bit-rotting? | 16:52 |
aspiers | stephenfin: feel free to respond to the same questions on the review itself if you prefer | 16:53 |
*** ramishra has quit IRC | 16:55 | |
*** panda has quit IRC | 16:56 | |
*** jamesmcarthur_ has quit IRC | 16:57 | |
*** weifan has joined #openstack-infra | 16:58 | |
*** jamesmcarthur has joined #openstack-infra | 16:59 | |
*** altlogbot_2 has quit IRC | 17:00 | |
*** panda has joined #openstack-infra | 17:01 | |
*** altlogbot_2 has joined #openstack-infra | 17:01 | |
*** irclogbot_3 has quit IRC | 17:01 | |
*** irclogbot_3 has joined #openstack-infra | 17:02 | |
*** jamesmcarthur has quit IRC | 17:04 | |
*** diablo_rojo has quit IRC | 17:04 | |
openstackgerrit | Merged zuul/zuul master: Pagure driver - https://pagure.io/pagure/ https://review.opendev.org/604404 | 17:05 |
openstackgerrit | Michael Johnson proposed openstack/diskimage-builder master: Add ubuntu-minimal-kvm element https://review.opendev.org/666063 | 17:06 |
*** corvus is now known as thecount | 17:06 | |
*** thecount is now known as corvus | 17:06 | |
*** ykarel|away has joined #openstack-infra | 17:10 | |
*** udesale has quit IRC | 17:11 | |
mnaser | 2019-06-18 14:39:41.814080 | ubuntu-bionic | E: Failed to fetch http://mirror.iad.rax.opendev.org/ubuntu/dists/bionic/main/binary-amd64/Packages 403 Forbidden [IP: 2001:4802:7802:104:be76:4eff:fe20:4b35 80] | 17:13 |
openstackgerrit | Dirk Mueller proposed openstack/openstack-zuul-jobs master: Add and switch to the newly created opensuse-15 nodeset https://review.opendev.org/666066 | 17:13 |
mnaser | ... can i recheck? | 17:13 |
fungi | mnaser: yep, that was between when we got the server back up and when we mounted /afs on it | 17:13 |
*** Lucas_Gray has quit IRC | 17:18 | |
mnaser | fungi: cool thanks | 17:18 |
*** bhavikdbavishi has quit IRC | 17:19 | |
fungi | just to confirm, i checked the syslog on the dfw mirror and apache's not hitting allocation failures there | 17:21 |
fungi | need to pop out and get groceries, but should return well before the infra meeting | 17:22 |
*** pkopec has quit IRC | 17:23 | |
*** pkopec has joined #openstack-infra | 17:23 | |
*** ociuhandu_ has joined #openstack-infra | 17:24 | |
*** dtantsur is now known as dtantsur|afk | 17:25 | |
*** ociuhandu has quit IRC | 17:26 | |
*** jamesmcarthur has joined #openstack-infra | 17:28 | |
*** ociuhandu_ has quit IRC | 17:28 | |
*** jamesmcarthur has quit IRC | 17:33 | |
*** slaweq has quit IRC | 17:34 | |
*** igordc has joined #openstack-infra | 17:35 | |
*** whoami-rajat has joined #openstack-infra | 17:37 | |
*** kopecmartin is now known as kopecmartin|off | 17:37 | |
*** jpena is now known as jpena|off | 17:43 | |
*** eharney has joined #openstack-infra | 17:44 | |
openstackgerrit | Merged zuul/zuul master: Remove unused user_id from github client https://review.opendev.org/665504 | 17:46 |
*** mattmceuen has joined #openstack-infra | 17:50 | |
mattmceuen | hello! would someone be able to add me to the new `airship-election-core` gerrit group? I can add the rest of the airship election officials after that | 17:51 |
*** jamesmcarthur has joined #openstack-infra | 17:52 | |
mordred | mattmceuen: done | 17:56 |
mattmceuen | thanks mordred! | 17:56 |
*** ralonsoh has quit IRC | 17:56 | |
*** pkopec has quit IRC | 17:58 | |
*** ykarel|away has quit IRC | 17:59 | |
*** markmcclain has quit IRC | 18:00 | |
*** weifan has quit IRC | 18:00 | |
*** weifan has joined #openstack-infra | 18:02 | |
*** markmcclain has joined #openstack-infra | 18:02 | |
*** dpawlik has joined #openstack-infra | 18:06 | |
*** rkukura has joined #openstack-infra | 18:08 | |
fungi | looks like the iad mirror stayed up while i was down at the shops | 18:12 |
fungi | no apache allocation failures logged yet since the reboot | 18:12 |
*** e0ne has quit IRC | 18:13 | |
openstackgerrit | Merged zuul/zuul master: Rework GitHub rate limit handling https://review.opendev.org/665505 | 18:16 |
openstackgerrit | Merged zuul/zuul master: Switch getPullBySha to using the search api https://review.opendev.org/665469 | 18:27 |
*** e0ne has joined #openstack-infra | 18:29 | |
*** weifan has quit IRC | 18:30 | |
*** ociuhandu has joined #openstack-infra | 18:30 | |
*** slaweq has joined #openstack-infra | 18:31 | |
*** guimaluf has joined #openstack-infra | 18:31 | |
fungi | are 665506,665507,665511 all we're waiting to merge before we restart opendev's zuul to burn things in? | 18:31 |
*** e0ne has quit IRC | 18:33 | |
*** pkopec has joined #openstack-infra | 18:34 | |
fungi | is 663015 also safe to approve right now? | 18:34 |
fungi | er, i meant to ask those questions in #zuul... *sigh* | 18:35 |
*** ociuhandu has quit IRC | 18:39 | |
*** derekh has quit IRC | 18:40 | |
*** rkukura has quit IRC | 18:43 | |
dmsimard | Seeing a lot of POST_FAILURES with no logs and only finger URLs. Watched a console and saw an error about the version of python: http://paste.openstack.org/raw/753154/ | 18:43 |
*** pcaruana has quit IRC | 18:45 | |
dmsimard | 10f3ab875c524a99b93d4191aff4eb4d on ze03 and da0a27a4f5e649e59bcdf9a25bb3ee0b on ze02 both failed with that same error | 18:45 |
openstackgerrit | James E. Blair proposed zuul/nodepool master: WIP: new devstack-based functional job https://review.opendev.org/665023 | 18:47 |
fungi | "Current version: 3.4.3" | 18:48 |
fungi | i think that's trusty? | 18:48 |
dmsimard | It looks like Ansible is complaining that the version of python on logs.o.o is insufficient. | 18:48 |
dmsimard | The "create log directories" task is delegated to logs.o.o | 18:49 |
corvus | neat... did we just upgrade ansible 2.7 on the executors? | 18:49 |
fungi | yeah, i guess this will be solved by the move to xenial for static.o.o | 18:49 |
fungi | -rwxr-xr-x 1 root root 5863 Mar 18 22:57 /usr/lib/zuul/ansible/2.7/bin/ansible | 18:50 |
fungi | that's on ze01 | 18:50 |
pabelanger | we should be able to force python2.7 on logs.o.o | 18:51 |
pabelanger | we likely need to update the add_host command | 18:51 |
fungi | so ansible 2.7 hasn't been upgraded on ze01 at least since march | 18:51 |
pabelanger | and set ansible_python_interpreter | 18:51 |
pabelanger | but not sure why it would start complaining now | 18:51 |
dmsimard | pabelanger: yes, I'm trying to find out why all of a sudden | 18:52 |
openstackgerrit | Merged zuul/zuul master: Add central retry handling for github requests https://review.opendev.org/665506 | 18:52 |
fungi | just checked all 12 executors and the ansible 2.7 executable on all of them is dated march 18 | 18:53 |
pabelanger | is job maybe using ansible 2.8? | 18:53 |
fungi | so we haven't accidentally upgraded a subset either | 18:53 |
pabelanger | or other ansible_version | 18:53 |
dmsimard | pabelanger: I haven't declared an ansible_version on the failing jobs | 18:54 |
fungi | the executors have ansible 2.8 builds from may 16 | 18:54 |
*** slaweq has quit IRC | 18:54 | |
fungi | so it's not as if they were just added today | 18:54 |
fungi | been there over a month now | 18:55 |
corvus | well, even that is sorta weird, isn't it? 2.7.11 was released on may 23 | 18:55 |
dmsimard | Would it be possible that passing the ansible_python_interpreter var is somehow leaking into the job ? That's not new but I'm doing it since fedora stopped shipping /usr/bin/python and the py3 fix came with ansible 2.8 | 18:55 |
fungi | `/usr/lib/zuul/ansible/2.7/bin/pip list` says ansible 2.7.9 | 18:55 |
corvus | dmsimard: is this only affecting some jobs? | 18:56 |
dmsimard | Seeing failures across ubuntu and fedora, 665323 is still running on zuul.o.o | 18:56 |
corvus | dmsimard: is it only affecting one change? | 18:56 |
ianw | reminder we'll have the infra meeting in 5 minutes in #openstack-meeting | 18:56 |
fungi | the ansible 2.5, 2.6 and 2.7 installs on the executors are all from the day we added multi-ansible support. they've never upgraded spontaneously that i've noticed | 18:57 |
dmsimard | corvus: I've seen the post failures reproduced against this one change, yes | 18:57 |
dmsimard | on different executors | 18:57 |
corvus | okay, i vastly misunderstood the scope of this problem | 18:57 |
fungi | dmsimard: which change is it? | 18:58 |
*** weifan has joined #openstack-infra | 18:58 | |
corvus | if this is affecting one change, let's start by looking at that change rather than the entire system. | 18:58 |
* fungi puts away fire extinguisher | 18:58 | |
ianw | fungi: i had a chat with dhowells in #openafs about both those vnode messages and memory bt's ... i don't think that channel is logged. short story is we didn't find an exact smoking gun | 18:58 |
dmsimard | fungi: this one: https://review.opendev.org/#/c/665323/ | 18:58 |
fungi | ianw: also today at ~10:00z the server spontaneously entered shutoff state. no indication why | 18:59 |
fungi | ianw: we discovered once rebooting that it lacks instructions to modprobe kafs at boot, and to mount /afs | 18:59 |
fungi | but more concerning is why it just died | 18:59 |
ianw | fungi: yeah ... changes on that ready for review :) | 19:00 |
fungi | anyway, to the meeting channel! | 19:00 |
ianw | anyway, can discuss in meeting, i have a topic :) | 19:00 |
*** eharney has quit IRC | 19:01 | |
pabelanger | corvus: fungi: dmsimard: I suspect https://opendev.org/recordsansible/ara/src/branch/master/.zuul.d/zuul.yaml#L69 might have something to do with it, that will be passed as extra-var to ansible | 19:03 |
pabelanger | which, is going to override logs.o.o I believe | 19:03 |
dmsimard | pabelanger: yeah, that's why I mentioned it but it's been there for a long, long time | 19:03 |
pabelanger | however, it is odd we are just noticing it now, if that's the case | 19:03 |
pabelanger | dmsimard: I guess we can find out the last time the job worked, and cross reference zuul releases. | 19:04 |
*** pcaruana has joined #openstack-infra | 19:05 | |
dmsimard | I'll figure out what I can and report back :D | 19:06 |
fungi | well, restarts anyway | 19:06 |
*** jamesmcarthur has quit IRC | 19:08 | |
*** diablo_rojo has joined #openstack-infra | 19:13 | |
openstackgerrit | Merged zuul/zuul master: Cleanup specialized retry in _process_event https://review.opendev.org/665507 | 19:16 |
*** kjackal has quit IRC | 19:19 | |
*** weifan has quit IRC | 19:23 | |
*** raissa has joined #openstack-infra | 19:25 | |
*** raissa has quit IRC | 19:27 | |
*** panda has quit IRC | 19:33 | |
*** raissa has joined #openstack-infra | 19:37 | |
*** _erlon_ has joined #openstack-infra | 19:38 | |
*** e0ne has joined #openstack-infra | 19:38 | |
openstackgerrit | Merged zuul/zuul master: Eliminate two github requests per _updateChange https://review.opendev.org/665511 | 19:38 |
*** kjackal has joined #openstack-infra | 19:40 | |
*** raissa has quit IRC | 19:42 | |
*** raissa has joined #openstack-infra | 19:43 | |
*** panda has joined #openstack-infra | 19:43 | |
*** raissa has quit IRC | 19:44 | |
*** raissa has joined #openstack-infra | 19:44 | |
*** raissa has quit IRC | 19:44 | |
*** slaweq has joined #openstack-infra | 19:45 | |
openstackgerrit | Merged zuul/zuul master: Expose date time as facts https://review.opendev.org/664674 | 19:52 |
*** tdasilva has quit IRC | 19:52 | |
*** gfhellma has joined #openstack-infra | 19:53 | |
*** weifan has joined #openstack-infra | 19:53 | |
*** raissa has joined #openstack-infra | 19:53 | |
*** gfhellma_ has joined #openstack-infra | 19:58 | |
*** raissa has quit IRC | 19:59 | |
*** jamesmcarthur has joined #openstack-infra | 20:01 | |
*** gfhellma has quit IRC | 20:02 | |
ianw | so last night i had a loop in the background reading the kafs stats every minute; last 5 minutes of life seem to be http://paste.openstack.org/show/753156/ | 20:03 |
*** weifan has quit IRC | 20:03 | |
fungi | ianw: is it possible what you saw with vcsrepo was failure in the face of attempts to clone through an http redirect? | 20:03 |
fungi | i'm starting to suspect that's the cause of what i'm seeing | 20:04 |
fungi | ianw: those kafs stats look pretty steady to me, though i'm not 100% sure what some of the abbreviations mean | 20:05 |
ianw | fungi: now i really think about it, i think i may have been conflating problems with network access and gpg key import in the rspec jobs that were triggering CI failures; this was in puppet-apt. it was sort of similarly not getting remote stuff correctly | 20:07 |
fungi | ahh | 20:07 |
fungi | well, anyway, i have a theory to work from at any rate | 20:08 |
ianw | on the stats, for the record : | 20:08 |
ianw | on the dir-mgmt line, inval= indicates the number of times a directory got invalidated | 20:08 |
ianw | I really should split that, but there are three causes: (1) a page being reclaimed from the pagecache, (2) a page missing in the pagecache and (3) a vnode being found to have a different DV on the server | 20:09 |
ianw | the first two are perfectly normal and come about due to memory pressure or the directory being extended on the server and requiring an extra page | 20:09 |
ianw | the third case is the interesting one and should be split from the others | 20:09 |
ianw | reval= is the number of times a cached directory entry gets out of date and needs rechecking | 20:09 |
ianw | that happens if the version number is bumped on the server | 20:09 |
ianw | ^ all via dhowells in openafs channel (not logged, afaik) | 20:10 |
fungi | thanks for rehashing | 20:10 |
fungi | so the inval was a steady 27123 the final 5 minutes before the server died | 20:10 |
ianw | but yeah, it doesn't seem like from that it went crazy | 20:10 |
fungi | and didn't increment at all | 20:10 |
fungi | same for reval holding at 242346 | 20:11 |
ianw | the file is /opt/kafs-stats/afs-stats.txt on iad | 20:11 |
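A rough sketch of the "did the counters move?" check described above, for use against saved samples. The sample line layout (`dir-mgmt: inval=... reval=...`) is an assumption for illustration, not the confirmed format of the kafs stats output; only the counter names and the values 27123 and 242346 come from the discussion. The function names are hypothetical.

```python
import re


def parse_counters(line: str) -> dict:
    """Extract name=value integer pairs from one stats line."""
    return {name: int(value) for name, value in re.findall(r"(\w+)=(\d+)", line)}


def unchanged(samples, keys=("inval", "reval")) -> bool:
    """Return True if every sample reports the same value for each key."""
    parsed = [parse_counters(sample) for sample in samples]
    return all(p[key] == parsed[0][key] for p in parsed for key in keys)


if __name__ == "__main__":
    # Five one-minute samples in an assumed layout, with the values quoted above.
    samples = ["dir-mgmt: inval=27123 reval=242346"] * 5
    print(unchanged(samples))
```

Steady counters across the final samples are consistent with the reading above that directory invalidation did not spike before the server went down.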
*** raissa has joined #openstack-infra | 20:12 | |
openstackgerrit | Merged zuul/zuul master: Mount tmpfs on ansible tmp dir https://review.opendev.org/663015 | 20:12 |
openstackgerrit | Jeremy Stanley proposed opendev/puppet-mediawiki master: Canonicalize clone URLs https://review.opendev.org/666162 | 20:16 |
*** raissa has quit IRC | 20:18 | |
*** weifan has joined #openstack-infra | 20:19 | |
*** gfhellma has joined #openstack-infra | 20:20 | |
*** gfhellma_ has quit IRC | 20:23 | |
*** harlowja has joined #openstack-infra | 20:23 | |
*** weifan has quit IRC | 20:25 | |
*** kjackal has quit IRC | 20:27 | |
*** jesusaur has quit IRC | 20:30 | |
*** eharney has joined #openstack-infra | 20:35 | |
*** e0ne has quit IRC | 20:36 | |
ianw | i've started up the stats collection on iad.rax mirror again, this time with "free" included to see if memory pressure arises | 20:38 |
ianw | one thing i could do is configure netconsole to a remote host, which should give us any oops that might kill the machine | 20:39 |
*** jcoufal_ has joined #openstack-infra | 20:40 | |
ianw | might wait until we see an issue | 20:41 |
ianw | fyi in discussions about some custom kernel tracepoints to look into that vnode error we see too | 20:41 |
*** Goneri has quit IRC | 20:42 | |
*** weifan has joined #openstack-infra | 20:43 | |
*** jtomasek has quit IRC | 20:43 | |
*** jcoufal has quit IRC | 20:44 | |
*** weifan has quit IRC | 20:44 | |
*** weifan has joined #openstack-infra | 20:44 | |
*** dpawlik has quit IRC | 20:52 | |
*** whoami-rajat has quit IRC | 20:56 | |
fungi | well, whatever oopsed (if something did) the machine lacked enough sanity to record it to disk, so whether it might be able to emit anything over netconsole? anybody's guess | 20:58 |
fungi | if it retains functional networking longer than it retains functional block devices, then maybe | 20:59 |
*** rfolco has quit IRC | 21:01 | |
*** dtroyer has quit IRC | 21:09 | |
*** dtroyer has joined #openstack-infra | 21:10 | |
*** xek has quit IRC | 21:15 | |
*** jesusaur has joined #openstack-infra | 21:15 | |
*** rh-jelabarre has quit IRC | 21:16 | |
*** jamesmcarthur has quit IRC | 21:17 | |
*** ekultails has quit IRC | 21:25 | |
fungi | ianw: backup01.ord.rax.ci.openstack.org is still our primary (only?) bup destination for now, right? | 21:41 |
ianw | fungi: yep, as mentioned will bring up a new server in mtl | 21:46 |
*** mriedem is now known as mriedem_away | 21:47 | |
*** tesseract has quit IRC | 21:47 | |
fungi | cool, just making sure the docs are current before i go adding some backups | 21:48 |
*** markvoelker has joined #openstack-infra | 21:48 | |
fungi | ianw: have you added a backup recently following https://docs.openstack.org/infra/system-config/sysadmin.html#backups | 21:55 |
fungi | if so, i have some questions about the second quoteblock there | 21:55 |
*** markvoelker has quit IRC | 22:00 | |
*** mattw4 has quit IRC | 22:00 | |
*** gfhellma_ has joined #openstack-infra | 22:00 | |
*** mattw4 has joined #openstack-infra | 22:01 | |
*** gfhellma has quit IRC | 22:04 | |
ianw | fungi: i think i set up the ask.o.o replacement recently ... | 22:06 |
*** calbers has quit IRC | 22:06 | |
ianw | the process is codified in the new change @ https://review.opendev.org/#/c/662657/25/playbooks/roles/backup-server/tasks/user.yaml too | 22:08 |
*** calbers has joined #openstack-infra | 22:08 | |
ianw | (bib ... school run) | 22:08 |
*** slaweq has quit IRC | 22:08 | |
*** slaweq has joined #openstack-infra | 22:11 | |
*** JpMaxMan has quit IRC | 22:12 | |
*** evgenyl has quit IRC | 22:12 | |
*** sparkycollier has quit IRC | 22:15 | |
*** rajinir has quit IRC | 22:15 | |
*** slaweq has quit IRC | 22:16 | |
*** sparkycollier has joined #openstack-infra | 22:18 | |
openstackgerrit | James E. Blair proposed zuul/nodepool master: Use the DIB installed in the virtualenv if running there https://review.opendev.org/666177 | 22:18 |
*** rajinir has joined #openstack-infra | 22:19 | |
*** evgenyl has joined #openstack-infra | 22:19 | |
openstackgerrit | James E. Blair proposed zuul/nodepool master: WIP: new devstack-based functional job https://review.opendev.org/665023 | 22:20 |
*** JpMaxMan has joined #openstack-infra | 22:20 | |
*** pkopec has quit IRC | 22:20 | |
*** gfhellma_ has quit IRC | 22:25 | |
*** gfhellma_ has joined #openstack-infra | 22:28 | |
*** Lucas_Gray has joined #openstack-infra | 22:29 | |
*** icarusfactor has quit IRC | 22:34 | |
*** icarusfactor has joined #openstack-infra | 22:34 | |
*** Goneri has joined #openstack-infra | 22:36 | |
*** Goneri has quit IRC | 22:43 | |
*** diablo_rojo has quit IRC | 22:44 | |
*** Goneri has joined #openstack-infra | 22:47 | |
*** tkajinam has joined #openstack-infra | 23:01 | |
*** jcoufal_ has quit IRC | 23:04 | |
openstackgerrit | Matt McEuen proposed openstack/project-config master: New project request: airship/docs https://review.opendev.org/666190 | 23:05 |
*** rtjure has quit IRC | 23:07 | |
*** sthussey has quit IRC | 23:15 | |
*** rcernin has joined #openstack-infra | 23:16 | |
*** lseki has quit IRC | 23:16 | |
*** rcernin has quit IRC | 23:17 | |
*** _erlon_ has quit IRC | 23:17 | |
*** rcernin has joined #openstack-infra | 23:18 | |
*** rlandy has quit IRC | 23:21 | |
*** shachar has quit IRC | 23:22 | |
*** shachar has joined #openstack-infra | 23:23 | |
*** calbers has quit IRC | 23:25 | |
*** jklare has quit IRC | 23:25 | |
*** calbers has joined #openstack-infra | 23:26 | |
*** jklare has joined #openstack-infra | 23:26 | |
*** Lucas_Gray has quit IRC | 23:28 | |
*** dchen has joined #openstack-infra | 23:31 | |
*** gfhellma_ has quit IRC | 23:31 | |
fungi | ianw: my main question, which also isn't clear to me from the playbook, is whether bup init really needs to be run on both ends or just one | 23:35 |
fungi | the docs suggest it does, but there's a flow control issue with the example for the backup server side which makes me wonder whether the bup init there is just a cut-and-paste error | 23:36 |
fungi | reading through bup's official documentation now to see if i can confirm one way or the other | 23:41 |
clarkb | it's both | 23:42 |
*** gfhellma_ has joined #openstack-infra | 23:42 | |
clarkb | it was just the remote | 23:42 |
clarkb | but then an update changed that | 23:42 |
fungi | i see, their readme also confirms | 23:45 |
fungi | i'll try to correct our documented process in a couple places in that case | 23:45 |
*** gfhellma_ has quit IRC | 23:49 | |
*** gfhellma_ has joined #openstack-infra | 23:50 | |
ianw | huh, yeah and that's pointed out in the review, i'll get to that | 23:51 |
openstackgerrit | Jeremy Stanley proposed opendev/system-config master: Streamline documented bup setup process https://review.opendev.org/666194 | 23:57 |
*** gfhellma_ has quit IRC | 23:57 | |
*** mattw4 has quit IRC | 23:58 | |
*** michael-beaver has quit IRC | 23:59 |