Thursday, 2018-06-14

*** eernst has quit IRC00:00
*** felipemonteiro has joined #openstack-infra00:00
*** yamamoto has quit IRC00:00
*** eernst has joined #openstack-infra00:00
*** heyongli has quit IRC00:01
*** heyongli has joined #openstack-infra00:01
*** dingyichen has joined #openstack-infra00:03
*** SumitNaiksatam has quit IRC00:03
<corvus> that job is now running  00:06
*** dhill_ has quit IRC00:07
<openstackgerrit> Tristan Cacqueray proposed openstack-infra/zuul master: web: add /{tenant}/job/{job_name} route  https://review.openstack.org/550978  00:07
<openstackgerrit> Tristan Cacqueray proposed openstack-infra/zuul master: web: add /{tenant}/projects and /{tenant}/project/{project} routes  https://review.openstack.org/550979  00:08
*** r-daneel has quit IRC00:08
*** felipemonteiro has quit IRC00:09
<clarkb> corvus: do you know how early the failures were happening?  00:09
<ianw> fungi / clarkb: i've updated the project references in hiera; luckily fungi templated them out differently to start with. let's see if that makes future puppet runs happier...  00:09
<corvus> clarkb: i think we need a devstack-multinode job for that  00:10
<clarkb> corvus: oh  00:10
<corvus> (i.e. the new thing; i'm guessing this is the old one, otherwise everything would have been broken?)  00:11
*** heyongli has quit IRC00:11
<fungi> ianw: oh, no need for a template change then? excellent!  00:11
<clarkb> hrm, no, I think we only run multinode on a small subset of stuff and it's mostly non-voting?  00:11
*** heyongli has joined #openstack-infra00:11
<clarkb> but I will get a link to devstack-multinode too  00:11
<clarkb> https://zuul.openstack.org/stream.html?uuid=959ee2a9b7f142298f39f0166a834b48&logfile=console.log devstack-multinode against the same change as above  00:12
*** rlandy is now known as rlandy|afk00:13
*** felipemonteiro has joined #openstack-infra00:14
*** felipemonteiro_ has joined #openstack-infra00:16
<ianw> does --os-project-name on the command line not override the values in clouds.yaml?  00:18
<clarkb> ianw: probably not if it is set in clouds.yaml explicitly  00:19
<clarkb> I think those flags are mostly for selecting the right cloud in clouds.yaml  00:19
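For reference, clarkb's point is that a cloud entry which pins the project explicitly tends to win over command-line flags; a minimal sketch of such a clouds.yaml entry (the cloud name, credentials, and project are all illustrative, not taken from the actual infra config):

```yaml
# Hypothetical clouds.yaml entry; all names and credentials are made up.
clouds:
  example-cloud:
    auth:
      auth_url: https://keystone.example.org:5000/v3
      username: infra-admin
      password: secret
      # With project_name pinned here, per clarkb's reading, the tools use
      # this value; --os-cloud mostly selects which entry to load.
      project_name: open-infra
      user_domain_name: Default
      project_domain_name: Default
    region_name: RegionOne
```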
*** felipemonteiro has quit IRC00:20
*** heyongli has quit IRC00:21
*** heyongli has joined #openstack-infra00:21
*** rossella_s has quit IRC00:21
<ianw> i'm deleting the servers out of the open-infra project (including the mirror)  00:22
*** aeng has quit IRC00:24
*** Swami has quit IRC00:25
*** Swami_ has quit IRC00:25
*** r-daneel has joined #openstack-infra00:25
*** felipemonteiro_ has quit IRC00:25
*** rossella_s has joined #openstack-infra00:25
<clarkb> I'm going to start dinner prep; ping me if zuul needs attention  00:25
<clarkb> corvus: fungi ianw ^  00:26
*** aeng has joined #openstack-infra00:26
*** sthussey has quit IRC00:26
<fungi> k  00:27
*** r-daneel_ has joined #openstack-infra00:29
*** r-daneel has quit IRC00:29
*** r-daneel_ is now known as r-daneel00:29
*** heyongli has quit IRC00:31
*** heyongli has joined #openstack-infra00:32
*** aeng has quit IRC00:32
*** rossella_s has quit IRC00:33
*** rossella_s has joined #openstack-infra00:35
*** heyongli has quit IRC00:42
*** heyongli has joined #openstack-infra00:42
<openstackgerrit> Merged openstack/diskimage-builder master: Add log directory option to functional tests  https://review.openstack.org/570095  00:43
*** annp has joined #openstack-infra00:48
*** rossella_s has quit IRC00:50
*** r-daneel has quit IRC00:50
*** heyongli has quit IRC00:52
*** heyongli has joined #openstack-infra00:52
<spsurya> clarkb: hi...  00:53
<clarkb> spsurya: hello  00:53
*** rossella_s has joined #openstack-infra00:53
<spsurya> clarkb: currently in the Infra CI/CD, infra deploys services into VMs, not into containers.  00:53
<spsurya> right?  00:53
<spsurya> if in VMs, is any discussion going on in the infra team about deploying and testing with containerized services?  00:53
<clarkb> spsurya: there are probably two separate but possibly overlapping things here. The first is our control plane and the other is the test instances that run our tests  00:54
<clarkb> spsurya: the control plane runs on VMs, though there is early work to spec out running control-plane services in containers  00:55
<clarkb> spsurya: the test instances also run in VMs, but we give you root in those instances and you can use them how you like, including running containers  00:55
<clarkb> many people do this  00:55
*** yamamoto has joined #openstack-infra00:56
<clarkb> as for running tests directly in containers, I think that may still be a way off as we add functionality to zuul for that. It also complicates test isolation and raises security concerns  00:56
<clarkb> spsurya: hopefully that helps  00:57
<clarkb> spsurya: is there something specific you are looking to do?  00:57
<openstackgerrit> Tristan Cacqueray proposed openstack-infra/zuul master: web: add initial GraphQL controller  https://review.openstack.org/574625  00:57
<openstackgerrit> Tristan Cacqueray proposed openstack-infra/zuul master: sql: use a declarative base model  https://review.openstack.org/575275  00:57
*** eharney has quit IRC00:58
<spsurya> clarkb: I am asking about making gate jobs run faster  00:58
<spsurya> thanks for the detailed info  00:59
<fungi> i'm not sure how containers would suddenly make jobs run faster  00:59
<spsurya> clarkb: does zuul currently have its CI/CD containerized?  01:01
*** aeng has joined #openstack-infra01:01
<spsurya> fungi: thanks for weighing in on that doubt  01:02
*** yamamoto has quit IRC01:02
<clarkb> spsurya: there is a crio driver iirc, but not yet merged. Work is in progress to make sure we support containers that look like VMs and more application-container-like systems like k8s  01:02
*** heyongli has quit IRC01:02
<clarkb> and making sure it all works together  01:02
*** heyongli has joined #openstack-infra01:02
<spsurya> fungi: AFAIK we use infra resources by booting VMs in some way  01:03
<spsurya> I also wanted to reduce the use of resources  01:04
<spsurya> if we do it with containers  01:04
<fungi> yes, the goal is to support using zuul/nodepool in environments which have container management systems rather than virtual machine management systems, though i'm not really sure the efficiency will be much different either way  01:04
*** pahuang has quit IRC01:04
<fungi> virtual machines and containers have converged quite a lot on performance, as containers have realized the need for better isolation and virtualization hypervisors have found improved efficiencies  01:05
<spsurya> clarkb: thank you very much for the detailed info; may I get the WIP link?  01:06
*** aeng has quit IRC01:07
*** rossella_s has quit IRC01:08
<ianw> fungi: ok, i'm close to getting stuck on what networks should be set up in these projects  01:08
<clarkb> https://review.openstack.org/#/c/560136/  01:09
<mnaser> gerrit seems kinda slow  01:09
<clarkb> https://review.openstack.org/#/c/565550/  01:10
*** rossella_s has joined #openstack-infra01:10
<clarkb> spsurya: those two changes are the two specs in progress  01:10
*** rlandy|afk is now known as rlandy01:10
<fungi> ianw: i don't see where we were provided with any additional network detail. i assumed (no doubt in error) that shade would be able to figure it out  01:11
<spsurya> fungi: thanks, I understand your point about it not making much difference in efficiency. But we could reduce boot time and infra resource use by starting containers in place of VMs; please correct me if my understanding is wrong  01:11
<ianw> fungi: no, the new projects don't have a network associated. i'm trying to copy what open-infra has set up, to see if it works...  01:11
<spsurya> clarkb: thanks for providing the link  01:12
*** heyongli has quit IRC01:12
*** heyongli has joined #openstack-infra01:13
<fungi> spsurya: depends on the difference between your vm boot time and container start time, but that ultimately just translates to some percentage of overhead on your overall quota, since nodepool already has provisions to keep nodes prepared in advance of assignment  01:14
<fungi> so under ideal circumstances the vm or container is already up and prepared to assign to a build by the time it's requested  01:15
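fungi's point about nodes prepared in advance of assignment corresponds to nodepool's min-ready setting; a sketch of the idea (label, provider, and flavor names here are illustrative, not the real infra configuration):

```yaml
# Hypothetical nodepool.yaml fragment; all names are made up.
labels:
  - name: ubuntu-xenial
    # Keep this many booted, ready nodes on standby so a build can be
    # handed one immediately instead of waiting for a fresh boot.
    min-ready: 2

providers:
  - name: example-provider
    cloud: example-cloud
    pools:
      - name: main
        max-servers: 50
        labels:
          - name: ubuntu-xenial
            flavor-name: general1-8
            diskimage: ubuntu-xenial
```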
*** pahuang has joined #openstack-infra01:16
*** heyongli has quit IRC01:23
*** heyongli has joined #openstack-infra01:23
<ianw> sometimes these tools feel like they're about half a step away from raw sql queries  01:25
<fungi> ianw: it's almost like you've seen the true face of openstack?  01:26
*** rpioso is now known as rpioso|afk01:26
*** r-daneel has joined #openstack-infra01:32
*** heyongli has quit IRC01:33
*** heyongli has joined #openstack-infra01:33
<spsurya> fungi: understood, thanks for the info. Maybe I need to go through the current WIPs on control plane containerization; the specification for using containers as build resources looks interesting: https://review.openstack.org/#/c/560136/  01:33
*** rlandy has quit IRC01:35
*** r-daneel has quit IRC01:37
*** r-daneel has joined #openstack-infra01:40
*** rossella_s has quit IRC01:40
*** gyee has quit IRC01:41
*** rossella_s has joined #openstack-infra01:42
*** heyongli has quit IRC01:43
*** heyongli has joined #openstack-infra01:43
*** rossella_s has quit IRC01:47
*** rossella_s has joined #openstack-infra01:49
*** boris_42_ has quit IRC01:50
*** bobh has joined #openstack-infra01:51
*** heyongli has quit IRC01:53
*** heyongli has joined #openstack-infra01:54
*** s-shiono has joined #openstack-infra01:58
*** yamamoto has joined #openstack-infra01:58
*** hongbin has joined #openstack-infra02:01
<mnaser> is it possible to use `project-template` but in combination with manually defined jobs?  02:02
<mnaser> e.g. a project template that defines a set of generic jobs, but inside a project, overriding a job to make it non-voting  02:03
*** yamahata has quit IRC02:03
*** heyongli has quit IRC02:04
*** heyongli has joined #openstack-infra02:04
*** yamamoto has quit IRC02:04
*** iyamahat_ has quit IRC02:10
*** felipemo_ has joined #openstack-infra02:12
*** heyongli has quit IRC02:14
*** heyongli has joined #openstack-infra02:14
*** hemna_ has quit IRC02:15
*** ramishra has joined #openstack-infra02:22
*** neiloy has joined #openstack-infra02:23
<openstackgerrit> Merged openstack/diskimage-builder master: Rename output log files  https://review.openstack.org/570096  02:23
<openstackgerrit> Merged openstack/diskimage-builder master: Don't install zypper on bionic  https://review.openstack.org/570500  02:23
*** rossella_s has quit IRC02:24
*** pahuang has quit IRC02:24
*** heyongli has quit IRC02:24
*** heyongli has joined #openstack-infra02:24
*** kjackal has joined #openstack-infra02:25
*** heyongli has quit IRC02:34
*** heyongli has joined #openstack-infra02:35
*** pahuang has joined #openstack-infra02:36
<openstackgerrit> Nguyen Van Trung proposed openstack/diskimage-builder master: Fix 'Operation not supported' issue for setfiles  https://review.openstack.org/575315  02:43
*** heyongli has quit IRC02:45
*** heyongli has joined #openstack-infra02:45
*** pbourke has quit IRC02:47
*** pbourke has joined #openstack-infra02:48
*** bobh has quit IRC02:50
<corvus> mnaser: yes -- any jobs added directly to a project pipeline will be merged with the same jobs added via the template, so the voting attribute would override what's in the template  02:53
<mnaser> corvus: wonderful. Thank you.  02:54
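A sketch of what corvus describes (the template, project, and job names are illustrative): the project's own pipeline entry is merged with the template's entry for the same job, so a single attribute such as voting can be overridden without redefining the job:

```yaml
# Hypothetical Zuul config; template and job names are made up.
- project-template:
    name: generic-jobs
    check:
      jobs:
        - unit-tests
        - integration-tests

- project:
    templates:
      - generic-jobs
    check:
      jobs:
        # Merged with the template's entry for the same job; only the
        # voting attribute is overridden here.
        - integration-tests:
            voting: false
```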
*** heyongli has quit IRC02:55
*** heyongli has joined #openstack-infra02:55
<openstackgerrit> Nguyen Van Trung proposed openstack/diskimage-builder master: Add iscsi-boot element  https://review.openstack.org/511494  02:57
<openstackgerrit> Nguyen Van Trung proposed openstack/diskimage-builder master: Add iscsi-boot element for CentOS images  https://review.openstack.org/542708  02:58
<gmann> corvus: a gate job run against lib repo patches always fetches that lib from source plus the patch under test, right? not from pypi. meaning we do not need to add the current repo (where the job is running, not where it is defined) to the required-projects list?  02:59
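gmann's question goes unanswered in this log; for context, a hedged sketch of how required-projects is typically used (the job and project names are illustrative) -- the repository the triggering change belongs to is prepared with the speculative change automatically, while other repos that should be installed from source need to be listed:

```yaml
# Hypothetical Zuul job definition; names are illustrative.
- job:
    name: tempest-with-lib-from-source
    parent: devstack
    required-projects:
      # Checked out at the speculative state so this lib is installed
      # from source rather than from pypi.
      - openstack/oslo.messaging
    # The repo the triggering change belongs to is prepared by zuul
    # regardless, so it does not need to be listed here.
```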
*** yamamoto has joined #openstack-infra03:00
*** heyongli has quit IRC03:05
*** heyongli has joined #openstack-infra03:05
*** yamamoto has quit IRC03:05
*** SumitNaiksatam has joined #openstack-infra03:13
*** heyongli has quit IRC03:15
*** heyongli has joined #openstack-infra03:16
*** heyongli has quit IRC03:26
*** heyongli has joined #openstack-infra03:26
<openstackgerrit> Tristan Cacqueray proposed openstack-infra/zuul master: sql: use a declarative base model  https://review.openstack.org/575275  03:26
<openstackgerrit> Tristan Cacqueray proposed openstack-infra/zuul master: web: add initial GraphQL controller  https://review.openstack.org/574625  03:26
*** yamahata has joined #openstack-infra03:35
*** heyongli has quit IRC03:36
*** heyongli has joined #openstack-infra03:36
*** yamamoto has joined #openstack-infra03:38
*** yamahata has quit IRC03:43
*** dave-mccowan has quit IRC03:45
*** heyongli has quit IRC03:46
*** heyongli has joined #openstack-infra03:46
*** sree has joined #openstack-infra03:48
*** yamahata has joined #openstack-infra03:51
*** udesale has joined #openstack-infra03:56
*** heyongli has quit IRC03:56
*** heyongli has joined #openstack-infra03:57
*** andreww has quit IRC03:57
*** xarses has joined #openstack-infra03:58
*** annp has quit IRC04:00
*** annp has joined #openstack-infra04:00
*** heyongli has quit IRC04:07
*** heyongli has joined #openstack-infra04:07
*** dhajare-brb has joined #openstack-infra04:10
*** germs has quit IRC04:12
*** ykarel|away has joined #openstack-infra04:16
*** ykarel|away is now known as ykarel04:16
*** heyongli has quit IRC04:17
*** heyongli has joined #openstack-infra04:17
*** hongbin has quit IRC04:21
*** eernst has quit IRC04:21
*** lifeless_ has quit IRC04:22
*** stakeda has joined #openstack-infra04:23
*** heyongli has quit IRC04:27
*** heyongli has joined #openstack-infra04:27
*** agopi has quit IRC04:31
*** heyongli has quit IRC04:37
*** heyongli has joined #openstack-infra04:38
*** threestrands has quit IRC04:47
*** heyongli has quit IRC04:48
*** heyongli has joined #openstack-infra04:48
*** e0ne has joined #openstack-infra04:53
*** janki has joined #openstack-infra04:57
*** heyongli has quit IRC04:58
*** heyongli has joined #openstack-infra04:58
*** e0ne has quit IRC05:00
*** links has joined #openstack-infra05:00
*** felipemo_ has quit IRC05:07
*** heyongli has quit IRC05:08
*** heyongli has joined #openstack-infra05:08
*** pcaruana has quit IRC05:09
*** dhajare-brb has quit IRC05:12
*** lifeless has joined #openstack-infra05:17
*** heyongli has quit IRC05:18
*** heyongli has joined #openstack-infra05:19
*** dhajare-brb has joined #openstack-infra05:29
*** heyongli has quit IRC05:29
*** heyongli has joined #openstack-infra05:29
*** kzaitsev_pi has quit IRC05:33
*** heyongli has quit IRC05:39
*** heyongli has joined #openstack-infra05:39
*** pcichy has quit IRC05:40
*** jaosorior has quit IRC05:41
*** heyongli has quit IRC05:49
*** heyongli has joined #openstack-infra05:49
*** slaweq has quit IRC05:53
*** cshastri has joined #openstack-infra05:55
*** pcichy has joined #openstack-infra05:55
*** heyongli has quit IRC05:59
*** heyongli has joined #openstack-infra06:00
*** dhajare-brb has quit IRC06:02
*** hjensas has quit IRC06:05
<AJaeger> mnaser: keep in mind that we have voting jobs in both check and gate -- and non-voting jobs only in the check queue. So, while you can override a voting job to non-voting this way, it's not nice  06:07
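The "nicer" arrangement AJaeger alludes to keeps a non-voting job out of the gate queue entirely rather than letting it run there; a sketch (job names are illustrative):

```yaml
# Hypothetical project stanza; job names are made up.
- project:
    check:
      jobs:
        # Non-voting while the job is being stabilized...
        - integration-tests:
            voting: false
        - unit-tests
    gate:
      jobs:
        # ...and simply not listed in gate, rather than voting there
        # while non-voting in check.
        - unit-tests
```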
*** heyongli has quit IRC06:10
*** heyongli has joined #openstack-infra06:10
*** jesslampe has joined #openstack-infra06:12
*** pcaruana has joined #openstack-infra06:17
*** mtreinish has quit IRC06:19
*** heyongli has quit IRC06:20
*** heyongli has joined #openstack-infra06:20
*** rajinir has quit IRC06:22
*** AJaeger has quit IRC06:24
*** threestrands has joined #openstack-infra06:27
*** dhajare has joined #openstack-infra06:28
*** anteaya has quit IRC06:29
*** heyongli has quit IRC06:30
*** heyongli has joined #openstack-infra06:30
*** jesslampe has quit IRC06:31
*** jesslampe has joined #openstack-infra06:31
*** aojea has joined #openstack-infra06:33
*** hashar has joined #openstack-infra06:33
*** jesslampe has quit IRC06:36
*** shardy has joined #openstack-infra06:38
*** dingyichen has quit IRC06:39
*** heyongli has quit IRC06:40
*** heyongli has joined #openstack-infra06:41
*** iyamahat has joined #openstack-infra06:48
*** heyongli has quit IRC06:51
*** heyongli has joined #openstack-infra06:51
*** ccamacho has joined #openstack-infra06:54
*** slaweq has joined #openstack-infra06:56
*** AJaeger has joined #openstack-infra06:57
*** mtreinish has joined #openstack-infra06:58
*** heyongli has quit IRC07:01
*** heyongli has joined #openstack-infra07:01
*** jcoufal has joined #openstack-infra07:07
*** heyongli has quit IRC07:11
*** heyongli has joined #openstack-infra07:11
*** evrardjp_ is now known as evrardjp07:12
*** amoralej|off is now known as amoralej07:12
*** pblaho has quit IRC07:13
*** tesseract has joined #openstack-infra07:16
*** heyongli has quit IRC07:21
*** heyongli has joined #openstack-infra07:22
*** pblaho has joined #openstack-infra07:23
*** efried has quit IRC07:26
*** efried has joined #openstack-infra07:27
*** rcernin has quit IRC07:27
*** lifeless has quit IRC07:28
*** gfidente has joined #openstack-infra07:29
*** gfidente has joined #openstack-infra07:29
*** jpena|off is now known as jpena07:30
*** heyongli has quit IRC07:32
*** heyongli has joined #openstack-infra07:32
*** lifeless has joined #openstack-infra07:35
*** tosky has joined #openstack-infra07:37
<openstackgerrit> Tobias Henkel proposed openstack-infra/zuul master: Move zuul_log_id injection to command action plugin  https://review.openstack.org/575351  07:38
<openstackgerrit> Tobias Henkel proposed openstack-infra/zuul master: Fix log streaming for delegated hosts  https://review.openstack.org/575352  07:38
<openstackgerrit> Tobias Henkel proposed openstack-infra/zuul master: Revert "Temporarily override Ansible linear strategy"  https://review.openstack.org/575353  07:38
<openstackgerrit> Tobias Henkel proposed openstack-infra/zuul master: Remove extra argument when logging logger timeout  https://review.openstack.org/575354  07:38
*** heyongli has quit IRC07:42
*** heyongli has joined #openstack-infra07:42
*** zoli is now known as zoli|wfh07:43
*** zoli|wfh is now known as zoli07:43
*** flaper87 has quit IRC07:49
*** armaan has joined #openstack-infra07:49
*** heyongli has quit IRC07:52
*** janki has quit IRC07:52
*** heyongli has joined #openstack-infra07:52
*** jpich has joined #openstack-infra07:57
*** flaper87 has joined #openstack-infra07:58
*** kzaitsev_pi has joined #openstack-infra07:58
*** florianf has joined #openstack-infra08:00
*** heyongli has quit IRC08:02
*** heyongli has joined #openstack-infra08:03
*** kamren has quit IRC08:03
*** janki has joined #openstack-infra08:09
*** heyongli has quit IRC08:13
*** heyongli has joined #openstack-infra08:13
*** hamzy_ has joined #openstack-infra08:18
*** ykarel is now known as ykarel|lunch08:19
*** hamzy has quit IRC08:20
*** owalsh has joined #openstack-infra08:21
*** heyongli has quit IRC08:23
*** heyongli has joined #openstack-infra08:23
*** armaan has quit IRC08:24
*** armaan has joined #openstack-infra08:26
*** bdodd_ has joined #openstack-infra08:28
*** owalsh has quit IRC08:28
*** alexchadin has joined #openstack-infra08:29
*** bdodd has quit IRC08:30
*** armaan has quit IRC08:30
*** sree has quit IRC08:32
*** electrofelix has joined #openstack-infra08:33
*** heyongli has quit IRC08:33
*** heyongli has joined #openstack-infra08:33
*** armaan has joined #openstack-infra08:34
*** ianychoi has quit IRC08:39
*** owalsh has joined #openstack-infra08:40
*** armaan has quit IRC08:41
*** heyongli has quit IRC08:43
*** heyongli has joined #openstack-infra08:44
*** derekh has joined #openstack-infra08:45
*** iyamahat has quit IRC08:51
*** heyongli has quit IRC08:54
*** heyongli has joined #openstack-infra08:54
*** yamahata has quit IRC08:54
*** armaan has joined #openstack-infra08:57
*** s-shiono has quit IRC09:02
*** heyongli has quit IRC09:04
*** heyongli has joined #openstack-infra09:04
*** ykarel|lunch is now known as ykarel09:09
*** armaan has quit IRC09:11
*** armaan has joined #openstack-infra09:13
*** heyongli has quit IRC09:14
*** heyongli has joined #openstack-infra09:14
*** jaosorior has joined #openstack-infra09:15
*** dougsz has joined #openstack-infra09:22
*** zhangfei has joined #openstack-infra09:23
<dougsz> Any ideas why there is no tarball for http://tarballs.openstack.org/monasca-thresh/ ?  09:23
<dougsz> (the main thing that differentiates it is that it's a Java project (!))  09:24
*** heyongli has quit IRC09:24
*** heyongli has joined #openstack-infra09:25
*** kamren has joined #openstack-infra09:27
*** e0ne has joined #openstack-infra09:28
*** udesale_ has joined #openstack-infra09:30
*** chkumar246 has joined #openstack-infra09:30
*** dhajare_ has joined #openstack-infra09:30
*** dtantsur|afk is now known as dtantsur09:31
*** cshastri_ has joined #openstack-infra09:32
*** links has quit IRC09:33
*** links has joined #openstack-infra09:33
*** chandankumar has quit IRC09:33
*** udesale__ has joined #openstack-infra09:33
*** udesale has quit IRC09:34
*** cshastri has quit IRC09:34
*** dhajare has quit IRC09:34
<frickler> dougsz: might well be that the release jobs are broken for it, but since the last release seems to have been 8 weeks ago, the logs have expired, so it is difficult to check  09:34
*** heyongli has quit IRC09:35
*** heyongli has joined #openstack-infra09:35
<frickler> dougsz: the existence of this directory makes it very likely to me that something is broken with the release jobs: http://tarballs.openstack.org/monasca-thresh/$ZUUL_SHORT_PROJECT_NAME/  09:35
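A directory literally named `$ZUUL_SHORT_PROJECT_NAME` usually means the job emitted the variable without expanding it, classically by quoting it with single quotes; a minimal illustration (the paths and helper variable are made up, not the actual job code):

```shell
# The literal directory name suggests the upload job used the variable in
# single quotes (or in a context where it was undefined), so the shell
# never expanded it. Paths here are illustrative.
project="monasca-thresh"

echo 'tarballs/$ZUUL_SHORT_PROJECT_NAME/'   # single quotes: emitted literally (the bug seen on the server)
echo "tarballs/${project}/"                 # double quotes: expands as intended
```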
*** chkumar246 has quit IRC09:35
*** dhajare_ has quit IRC09:36
*** udesale_ has quit IRC09:36
*** chandankumar has joined #openstack-infra09:39
<AJaeger> dougsz, frickler: looking at the job configuration, I see "legacy-monasca-thresh-localrepo-upload" as the post job  09:39
<AJaeger> we had many broken legacy upload jobs; I will just assume this never worked and needs porting to the new way of uploading...  09:40
<frickler> AJaeger: yes, according to zuul it ran with status success on 2018-04-10T15:39:43, which seems to be when the last release was tagged, but one would need the logs in order to dig deeper, I think  09:41
<frickler> but I'm not sure how to really debug this other than tagging some new release and possibly repeating that until one can fix it  09:42
<AJaeger> frickler: it's a post job -- it runs after each merge  09:42
<AJaeger> frickler: so, just merge something ;)  09:42
<AJaeger> and the last merge was the 10th of April...  09:42
<AJaeger> frickler: I suggest just rewriting that job from scratch -- but I cannot help. Perhaps mordred can?  09:43
*** kamren has quit IRC09:44
<openstackgerrit> Tobias Henkel proposed openstack-infra/zuul master: Move zuul_log_id injection to command action plugin  https://review.openstack.org/575351  09:44
<openstackgerrit> Tobias Henkel proposed openstack-infra/zuul master: Fix log streaming for delegated hosts  https://review.openstack.org/575352  09:44
<openstackgerrit> Tobias Henkel proposed openstack-infra/zuul master: Revert "Temporarily override Ansible linear strategy"  https://review.openstack.org/575353  09:44
<openstackgerrit> Tobias Henkel proposed openstack-infra/zuul master: Remove extra argument when logging logger timeout  https://review.openstack.org/575354  09:44
<frickler> AJaeger: oh, indeed, I just saw the last tag was created 8 weeks ago and assumed that that was the event  09:44
*** heyongli has quit IRC09:45
*** heyongli has joined #openstack-infra09:45
*** dhajare_ has joined #openstack-infra09:49
*** threestrands has quit IRC09:54
*** heyongli has quit IRC09:55
*** heyongli has joined #openstack-infra09:56
*** heyongli has quit IRC10:05
*** heyongli has joined #openstack-infra10:06
<e0ne> hi. could anybody please help me with configuring a new grenade job?  10:08
<e0ne> it fails, and there are almost no logs :(  10:08
<e0ne> e.g.: http://logs.openstack.org/15/575115/6/check/grenade-vitrage/25b6457/  10:08
<dougsz> frickler, AJaeger: thanks for the insight. It's good to at least know there should be a tarball  10:15
*** heyongli has quit IRC10:16
*** heyongli has joined #openstack-infra10:16
*** kjackal has quit IRC10:22
<dougsz> perhaps I can use the jar from the Maven build for now; I just need to find where it's published. (I'm adding support for deploying monasca-thresh to kolla.)  10:22
*** hjensas has joined #openstack-infra10:24
<jamespage> morning - is there a way to see why https://review.openstack.org/#/c/573217/ did not post-commit publish to https://docs.openstack.org/project-deploy-guide/charm-deployment-guide/latest/ ?  10:25
*** heyongli has quit IRC10:26
*** gnuoy has joined #openstack-infra10:26
*** heyongli has joined #openstack-infra10:26
<openstackgerrit> Tobias Henkel proposed openstack-infra/zuul master: Move zuul_log_id injection to command action plugin  https://review.openstack.org/575351  10:28
<openstackgerrit> Tobias Henkel proposed openstack-infra/zuul master: Fix log streaming for delegated hosts  https://review.openstack.org/575352  10:28
<openstackgerrit> Tobias Henkel proposed openstack-infra/zuul master: Revert "Temporarily override Ansible linear strategy"  https://review.openstack.org/575353  10:28
<openstackgerrit> Tobias Henkel proposed openstack-infra/zuul master: Remove extra argument when logging logger timeout  https://review.openstack.org/575354  10:28
<AJaeger> e0ne: better to ask in #openstack-qa  10:35
<e0ne> AJaeger: got it, thanks!  10:35
*** heyongli has quit IRC10:36
*** heyongli has joined #openstack-infra10:36
<AJaeger> jamespage: https://wiki.openstack.org/wiki/Infrastructure_Status shows that a couple of hours afterwards we restarted zuul; it might be that the post job was lost.  10:37
*** nicolasbock has joined #openstack-infra10:37
<AJaeger> but let's check...  10:37
<jamespage> ta  10:37
<AJaeger> jamespage: http://zuul.openstack.org/builds.html?job_name=publish-deploy-guide&project=openstack%2Fcharm-deployment-guide  10:39
<AJaeger> so, no runs in June - probably the restart  10:39
<jamespage> AJaeger: OK - best to shove through another commit?  10:40
<AJaeger> jamespage: yeah, if possible. An infra-root can also manually trigger it if needed  10:40
<AJaeger> jamespage: remove the index from the main page - http://logs.openstack.org/17/573217/3/check/build-openstack-deploy-guide/a3e26ef/html/genindex.html gives a 404 ;)  10:41
<jamespage> AJaeger: ack, will do  10:43
<stephenfin> mtreinish: How difficult would it be for stestr to accept a pytest-style path to a function instead of a Python module path? e.g. nova/tests/unit/virt/test_hardware.py::VirtNUMATopologyCellUsageTestCase  10:44
<stephenfin> (asking here as I don't know where to ask these questions)  10:44
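stestr selects tests by dotted test id (matched as a regex) rather than by file path today; the pytest-style selector stephenfin mentions maps to one mechanically. A sketch of that mapping (the nova path is the one from the question above; the shell helper itself is just an illustration, not part of stestr):

```shell
# Convert a pytest-style selector (path/to/file.py::Class) into the dotted
# test id that stestr's selection regex understands.
pytest_path="nova/tests/unit/virt/test_hardware.py::VirtNUMATopologyCellUsageTestCase"

module_path="${pytest_path%%.py::*}"   # drop ".py::Class", keep the file path
class_name="${pytest_path##*::}"       # keep what follows "::"
test_id="${module_path//\//.}.${class_name}"

echo "$test_id"   # nova.tests.unit.virt.test_hardware.VirtNUMATopologyCellUsageTestCase
# stestr run "$test_id"   # would then select just that test case
```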
*** kjackal has joined #openstack-infra10:46
*** heyongli has quit IRC10:46
*** heyongli has joined #openstack-infra10:47
*** dtantsur is now known as dtantsur|brb10:50
<openstackgerrit> boden proposed openstack-infra/project-config master: add lower constraints to vmware-nsx gate pipeline  https://review.openstack.org/575399  10:53
*** heyongli has quit IRC10:57
*** jpena is now known as jpena|lunch10:57
*** heyongli has joined #openstack-infra10:57
*** annp has quit IRC11:02
*** yamamoto has quit IRC11:02
*** heyongli has quit IRC11:07
*** heyongli has joined #openstack-infra11:07
*** zoli is now known as zoli|lunch11:09
*** alexchadin has quit IRC11:11
*** armaan has quit IRC11:15
*** heyongli has quit IRC11:17
*** heyongli has joined #openstack-infra11:18
*** heyongli has quit IRC11:27
*** heyongli has joined #openstack-infra11:28
*** ykarel_ has joined #openstack-infra11:31
*** ykarel has quit IRC11:34
*** ykarel_ is now known as ykarel11:34
*** ldnunes has joined #openstack-infra11:35
*** heyongli has quit IRC11:38
*** heyongli has joined #openstack-infra11:38
<mnaser> AJaeger: ok, interesting. I was thinking it could be a quick temporary fix, but it's all part of one big long-term effort of improving OSA CI  11:39
*** nicolasbock has quit IRC11:42
*** udesale_ has joined #openstack-infra11:42
<AJaeger> mnaser: often those temporary hacks stay forever ;/ - but yes, it could be used as a temporary band-aid ;)  11:44
*** yamamoto has joined #openstack-infra11:44
<mnaser> AJaeger: my goal is for OSA to use only a project-template, so that disabling a job in one repo disables it everywhere and we will always notice that it must be fixed  11:45
<mnaser> and when fixing it, it will be in one place only rather than scattered and inconsistent  11:45
*** udesale__ has quit IRC11:45
<AJaeger> mnaser: good plan  11:46
* AJaeger loves templates ;)  11:46
*** dtantsur|brb is now known as dtantsur11:46
<mnaser> you and I both :)  11:46
*** heyongli has quit IRC11:48
*** heyongli has joined #openstack-infra11:48
*** yamamoto has quit IRC11:48
*** udesale_ has quit IRC11:48
*** sthussey has joined #openstack-infra11:50
*** gfidente has quit IRC11:53
*** roman_g has joined #openstack-infra11:54
*** gfidente has joined #openstack-infra11:56
*** heyongli has quit IRC11:58
*** heyongli has joined #openstack-infra11:58
*** rlandy has joined #openstack-infra12:01
*** boden has joined #openstack-infra12:01
*** jpena|lunch is now known as jpena12:03
<boden> AJaeger: hi.. thanks for the review on https://review.openstack.org/#/c/575399/ but your comments don't match the discussion I had the other day with other infra folks: http://eavesdrop.openstack.org/irclogs/%23openstack-infra/%23openstack-infra.2018-06-07.log.html#t2018-06-07T17:35:41  12:03
<boden> so admittedly I'm confused  12:03
*** amoralej is now known as amoralej|lunch12:03
<AJaeger> boden: I'm talking about the vmware job only - it should really have been done in your tree  12:03
<boden> AJaeger: I understand; I'm just telling you I was told to put them in projects.yaml  12:04
<AJaeger> boden: I expect that clarkb was not aware that the job is in your tree  12:04
*** yamamoto has joined #openstack-infra12:06
<frickler> AJaeger: boden: oh, then this was already done wrong in https://review.openstack.org/#/c/573386/1/zuul.d/projects.yaml  12:08
<boden> frickler: yeah, I'll submit a patch to remove that as per AJaeger's comments  12:08
*** heyongli has quit IRC12:08
<AJaeger> frickler: yes, agreed  12:09
*** heyongli has joined #openstack-infra12:09
*** yamamoto has quit IRC12:10
*** nicolasbock has joined #openstack-infra12:11
<openstackgerrit> boden proposed openstack-infra/project-config master: remove lower constraints to vmware-nsx gate pipeline  https://review.openstack.org/575399  12:12
<boden> frickler ^  12:12
<AJaeger> boden: thanks  12:12
<boden> AJaeger: thanks.. sorry for the confusion, I should've seen the local defs for those jobs  12:13
<AJaeger> boden: happy to review your change for vmware-nsx - feel free to CC me on it.  12:14
<AJaeger> boden: found it ;)  12:15
*** heyongli has quit IRC12:19
*** heyongli has joined #openstack-infra12:19
*** tpsilva has joined #openstack-infra12:21
*** kgiusti has joined #openstack-infra12:21
*** dhajare_ has quit IRC12:22
<tosky> frickler: regarding that issue with orchestrate-devstack, I confirm that it's solved now  12:23
*** felipemonteiro has joined #openstack-infra12:26
*** trown|outtypewww is now known as trown12:27
*** heyongli has quit IRC12:29
*** heyongli has joined #openstack-infra12:29
<pabelanger> heads up, 2 weeks until SSL certs expire: Jun 30 23:59:59 2018 GMT, according to my inbox  12:33
*** zoli|lunch is now known as zoli|wfh12:35
*** zoli|wfh is now known as zoli12:35
*** myoung|off is now known as myoung12:36
<fungi> yup  12:38
<fungi> i believe clarkb is preparing to replace them in the next week-ish  12:38
*** heyongli has quit IRC12:39
*** heyongli has joined #openstack-infra12:39
*** yamamoto has joined #openstack-infra12:40
*** dhajare has joined #openstack-infra12:42
*** yamamoto has quit IRC12:45
*** ianychoi has joined #openstack-infra12:46
*** felipemonteiro has quit IRC12:47
*** lifeless has quit IRC12:48
*** lifeless has joined #openstack-infra12:49
*** heyongli has quit IRC12:49
*** gfidente has quit IRC12:49
*** heyongli has joined #openstack-infra12:50
<openstackgerrit> Merged openstack-infra/zuul master: Allow zuul_return in untrusted jobs  https://review.openstack.org/575173  12:51
*** gfidente has joined #openstack-infra12:51
*** gfidente has joined #openstack-infra12:51
*** cshastri_ has quit IRC12:53
*** edmondsw has joined #openstack-infra12:54
*** florianf has quit IRC12:55
*** esarault has joined #openstack-infra12:56
*** florianf has joined #openstack-infra12:58
*** heyongli has quit IRC13:00
*** heyongli has joined #openstack-infra13:00
*** camunoz has joined #openstack-infra13:00
*** mriedem has joined #openstack-infra13:03
*** iyamahat has joined #openstack-infra13:04
*** lihi has quit IRC13:06
*** lihi has joined #openstack-infra13:06
openstackgerritTobias Henkel proposed openstack-infra/zuul master: Remove extra argument when logging logger timeout  https://review.openstack.org/57535413:07
flaper87fungi: I re-ran the tripleo-ci-centos-7-scenario009-multinode-oooq job for 574233 this morning to test another PS. Is the autohold still on? Or does it have to be enabled on demand?13:08
flaper87you can release the 2 servers you held for me yday13:09
mnaserclarkb: is there a way to force a recheck on the kata repos, i'd like to iterate on your work13:09
*** dave-mccowan has joined #openstack-infra13:09
*** felipemonteiro has joined #openstack-infra13:09
*** eernst has joined #openstack-infra13:09
*** heyongli has quit IRC13:10
*** heyongli has joined #openstack-infra13:10
*** armaan has joined #openstack-infra13:15
mnaseris there a way for us to get zuul to hold a vm the next time a job fails?13:15
mnaserthe `openstack-ansible-deploy-ceph-ubuntu-xenial` job non-deterministically fails and we've gathered a lot of data but it still doesn't seem to be enough :(13:16
*** felipemonteiro has quit IRC13:16
mnaserending up with "IOError: [Errno 28] No space left on device" even though there is plenty of space left on the device according to df13:16
fungiflaper87: i set it for a single count. looking to see if it held any more but i wouldn't expect it to until i delete the old ones13:17
flaper87fungi: understood. Let me know, worst case, I'll recheck it13:17
fungiflaper87: okay, old held nodes have been deleted and a fresh autohold has been set. recheck as needed13:19
flaper87fungi: done, thanks!13:20
*** heyongli has quit IRC13:20
*** heyongli has joined #openstack-infra13:20
*** Goneri has joined #openstack-infra13:21
*** hemna_ has joined #openstack-infra13:21
*** dhill_ has joined #openstack-infra13:22
*** yamamoto has joined #openstack-infra13:22
*** jamesdenton has joined #openstack-infra13:23
*** r-daneel has quit IRC13:26
*** agopi has joined #openstack-infra13:27
*** yamamoto has quit IRC13:27
*** armaan has quit IRC13:29
*** armaan has joined #openstack-infra13:29
fricklermnaser: that's for project "openstack-ansible" correct?13:29
*** amoralej|lunch is now known as amoralej13:29
*** eernst has quit IRC13:29
*** eernst has joined #openstack-infra13:30
*** heyongli has quit IRC13:30
*** heyongli has joined #openstack-infra13:31
*** armaan has quit IRC13:32
*** armaan has joined #openstack-infra13:32
*** eernst has quit IRC13:34
*** armaan has quit IRC13:37
*** yamamoto has joined #openstack-infra13:38
*** florianf has quit IRC13:40
*** heyongli has quit IRC13:41
*** heyongli has joined #openstack-infra13:41
fricklermnaser: I did set an autohold and it caught a node almost immediately. so with https://github.com/mnaser.keys you should be able to access root@104.130.163.27 now for further debugging13:41
*** armaan has joined #openstack-infra13:42
*** florianf has joined #openstack-infra13:42
*** yamamoto has quit IRC13:42
*** iyamahat has quit IRC13:43
fricklermnaser: which is a bit strange, because I cannot see results for this job yet ... hmm13:44
*** armaan has quit IRC13:46
*** shaner has quit IRC13:46
*** shaner has joined #openstack-infra13:46
*** eharney has joined #openstack-infra13:49
fungifrickler: was the job perhaps already running for another change? did you limit the autohold by change as well as project and job?13:49
*** heyongli has quit IRC13:51
*** heyongli has joined #openstack-infra13:51
*** ccamacho has quit IRC13:51
fricklerfungi: mnaser didn't mention a change, so I was assuming that the failure was happening for any job. but I was still thinking that I should find the job for the held node at http://zuul.openstack.org/builds.html?job_name=openstack-ansible-deploy-ceph-ubuntu-xenial13:52
fungiahh13:52
*** dave-mccowan has quit IRC13:52
*** Tahvok has quit IRC13:54
fungi`grep 0000123384 /var/log/zuul/debug.log` on zuul0113:55
*** jesslampe has joined #openstack-infra13:55
fungilooks like it was running for 559452,113:56
*** r-daneel has joined #openstack-infra13:56
EmilienMdid anything outstanding changed in the doc jobs lately? no job has been running on tripleo-docs repo for the last 3 days13:57
EmilienMI've checked project-config, nothing much on that regard lately13:57
*** linkmark has quit IRC13:57
fungiEmilienM: could it be due to the files vs irrelevant-files behavior change in zuul?13:58
EmilienMI guess it could13:58
EmilienMweshay|ruck, mwhahaha ^ fyi13:58
*** r-daneel_ has joined #openstack-infra13:59
fungithat went into effect late monday13:59
EmilienMI guess it's related14:00
*** r-daneel has quit IRC14:00
*** r-daneel_ is now known as r-daneel14:00
EmilienMbut we don't have zuul layout in the repo14:00
fungirm_work: weshay|ruck: mwhahaha: http://lists.openstack.org/pipermail/openstack-dev/2018-June/131304.html14:00
fungifor a refresher14:00
*** shardy has quit IRC14:01
*** heyongli has quit IRC14:01
*** rajinir has joined #openstack-infra14:01
*** heyongli has joined #openstack-infra14:01
fricklerfungi: oh, so the builds list only gets updated when all jobs for a change have finished it seems14:02
*** Tahvok has joined #openstack-infra14:02
fungifrickler: i guess so, as the db inserts are via a reporter14:03
fricklermnaser: sadly this one seems to have failed early with a different error: http://logs.openstack.org/52/559452/1/check/openstack-ansible-deploy-ceph-ubuntu-xenial/42af3d3/job-output.txt.gz#_2018-06-14_13_30_04_62543514:03
fungie-mail reporter sends a message when all builds complete, gerrit/github reporters leave a comment when all builds complete, so i suppose the mysql reporter performs db inserts once all builds complete14:03
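fungi's description above — reporters fire once per buildset, not once per build — can be sketched as a small buffer-and-flush pattern. This is an illustrative sketch only; it is not zuul's actual reporter code, and the class and method names are invented:

```python
class BuildSetReporter:
    """Collect per-build results and report exactly once, when every
    expected build in the set has finished (hypothetical sketch of the
    pattern, not zuul's implementation)."""

    def __init__(self, expected_jobs):
        self.expected = set(expected_jobs)
        self.results = {}
        self.reported = None  # set once, when the buildset completes

    def on_build_complete(self, job, result):
        self.results[job] = result
        if set(self.results) >= self.expected:
            # all builds done: this is where a db insert, gerrit
            # comment, or e-mail would be emitted, once per buildset
            self.reported = dict(self.results)
```

This explains why frickler saw no row in the builds list while one job in the buildset was still running.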
openstackgerritMonty Taylor proposed openstack-infra/project-config master: Don't follow symlinks when setting log permissions  https://review.openstack.org/57543914:04
*** jcoufal has quit IRC14:04
fricklerfungi: makes sense, yes, just was a bit unexpected for me14:04
*** ykarel is now known as ykarel|away14:04
fungime too!14:04
mordredfungi, frickler: ^^ found that in tracking down why https://review.openstack.org/#/c/551989/ keeps failing in post14:04
fungii hadn't considered it14:04
mordredoh - my commit message is wrong14:05
fungimordred: random behavior change in a minor ansible release? did you find it in the release notes, or is it more likely a regression?14:05
mordredfungi: it's a change - I was looking at the 2.4 docs on the default by mistake14:06
mordredhttps://docs.ansible.com/ansible/latest/modules/file_module.html14:06
*** r-daneel_ has joined #openstack-infra14:06
mordredshows that the default changed in 2.514:06
openstackgerritMonty Taylor proposed openstack-infra/project-config master: Don't follow symlinks when setting log permissions  https://review.openstack.org/57543914:07
mordredfungi: fixed commit message to be accurate14:07
fungithis seems likely to catch a lot of jobs off guard14:07
*** r-daneel has quit IRC14:07
*** r-daneel_ is now known as r-daneel14:07
fungii'm surprised we haven't seen more reports of it before now14:07
mordredpossibly - although you kind of have to work at it to make symlinks in the logs that do sensible things14:08
*** yamamoto has joined #openstack-infra14:08
* mordred has some in the new zuul multi-tenant web dashboard job to simulate what apache rewrite rules would be doing :)14:08
mordredthat said - yes - I'm also surprised14:08
fungimordred: did you see ianw's e-mail about shade behavior with network guessing in our packethost environment?14:09
fungicurious whether you have any ideas there14:10
mordredfungi: https://docs.ansible.com/ansible/latest/porting_guides/porting_guide_2.5.html#noteworthy-module-changes does list the change, so there is that14:10
mordredfungi: I did not - which list?14:10
mordredoh - just direct14:10
mordredone sec14:10
fungiyeah. was just a private e-mail to me, clarkb and studarus, cc'd to you14:11
funginot sure why he was concerned about putting that on the infra ml14:11
*** heyongli has quit IRC14:11
*** heyongli has joined #openstack-infra14:12
mordredWELL14:12
*** yamamoto has quit IRC14:12
mordredI have never seen the values "Internal" and "External" for router:external before - those are usually true or false14:12
mordredslaweq: ^^ is this a neutron change?14:12
mordredslaweq: (seeing a cloud that returns Internal or External for router:external)14:13
*** links has quit IRC14:15
*** jesslampe has quit IRC14:15
*** jesslampe has joined #openstack-infra14:16
fricklermordred: that sounds broken to me, do you really see that in the api response?14:18
*** yamamoto has joined #openstack-infra14:20
*** yamamoto has quit IRC14:20
fricklermordred: OSC does seem to translate the bool, though14:21
mordredfrickler: yes. ianw got it from packethost14:21
mordredfrickler: really?14:21
mordredfrickler: *headdesk*14:21
*** jesslampe has quit IRC14:21
mordredthat's actually extremely unhelpful14:21
funginutty14:21
frickleropenstackclient/network/v2/network.py:    return 'External' if item else 'Internal'14:22
mordredfungi: in any case, I think we should just do with packethost the same thing we do for internap - mark the networks internal and external14:22
*** heyongli has quit IRC14:22
mordredfrickler: that is upsetting to me14:22
*** heyongli has joined #openstack-infra14:22
mordredfrickler: I guess it's been there since 2015 though - so such is life14:23
AJaegermordred: could you review https://review.openstack.org/#/c/570260/ , please? It looks sane to me but I know nothing about npm...14:23
mordredthe thing is - it's not even an accurate translation - since the router:external attribute does not actually mean internal/external14:23
mordredfrickler: so that translation actually increases the confusion about what that means14:23
openstackgerritThierry Carrez proposed openstack-infra/irc-meetings master: Add anchors to link to specific parts of index  https://review.openstack.org/57544514:24
fungimordred: so was this a misconfiguration in the deployment itself? or a bug in neutron? (or the latter making the former possible?)14:25
mordredfungi: I now no longer know why shade can't figure it out14:25
*** dave-mccowan has joined #openstack-infra14:25
mordredand will have to debug further14:25
EmilienMinfra-core: mwhahaha and myself are looking at our gate issues and would like to know if you have hits on docker mirror provided on 8081 port. e.g. mirror.mtl01.inap.openstack.org:8081 - our goal is to make sure we use the infra mirror and not docker.io14:26
mordredbut we can still put in entries to clouds.yaml similar to internap to unblock us14:26
*** hongbin has joined #openstack-infra14:26
fricklermordred: I agree, if you compare it to the description in the api-ref, "External" is ok-ish, but to name the complement "Internal" is pretty misleading14:26
frickler"Indicates whether the network has an external routing facility that’s not managed by the networking service."14:26
EmilienMe.g. #2: http://mirror.mtl01.inap.openstack.org:8081/registry-1.docker/14:26
mordredAJaeger: lgtm14:27
mordredfrickler: yah14:27
mordredfrickler: "router:external" really means "can have a neutron router attached to it, oh, and also btw implies shared=True"14:27
mordredwhat it decidedly does NOT mean is "this network is external"14:28
*** owalsh has quit IRC14:29
mordredso router:external = True can be used to determine that a network is probably to be used for routing externally, but router:external = False actually cannot be counted on to communicate the same thing, as the network in question could be, for instance, a provider network to be used for external traffic14:30
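mordred's point — `router:external = True` suggests an external network, but `False` (or an unexpected string) proves nothing — is exactly where a strict boolean comparison matters. A hedged sketch of the pitfall, not shade's actual logic:

```python
def guess_external(network):
    """Return True only for a literal boolean router:external=True;
    anything else (False, missing, or odd string values such as the
    'External'/'Internal' labels seen from one cloud) is treated as
    inconclusive rather than proof that the network is internal."""
    value = network.get("router:external")
    if value is True:
        return True
    return None  # cannot conclude anything from False or a string
```

A cloud returning the string "External" falls through the `is True` check, which is consistent with the guessing failure ianw reported.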
AJaegerthanks, mordred14:31
fungiEmilienM: 198.72.124.176 - - [14/Jun/2018:14:26:36 +0000] "GET /registry-1.docker/v2/tripleomaster/centos-binary-nova-compute-ironic/blobs/sha256:d82d6152d6a63608951fcc3c36dd01a66d12bc2ba41e8d4ab034ba4cc3d05806 HTTP/1.1" 307 736 "-" "docker/1.13.1 go/go1.9.4 kernel/3.10.0-862.3.2.el7.x86_64 os/linux arch/amd64 UpstreamClient(Docker-Client/1.13.1 \\(linux\\))"14:31
fungimwhahaha: ^14:31
*** heyongli has quit IRC14:32
*** heyongli has joined #openstack-infra14:32
*** zhangfei has quit IRC14:35
mnaserfrickler: thanks for the node hold, it was a different failure but one that seems to repeat pretty often14:35
mnaserare there problems with our xenial mirrors?14:36
mnaserhttp://paste.openstack.org/show/723477/14:36
mtreinishstephenfin: you can use --no-discover/-n on stestr run to specify running a specific file by path: https://stestr.readthedocs.io/en/latest/MANUAL.html#running-tests14:36
* mordred hugs mtreinish14:37
mtreinishit doesn't take a class arg iirc because the test runner just loads the module14:37
stephenfinmtreinish: That's almost exactly what I was looking for. I assume we can't specify a test class that way14:37
*** ramishra has quit IRC14:37
stephenfinBeat me to it14:37
mwhahahafungi: so 307 is a cache hit? is there any way to see, like, the number of requests/cache hits for docker fetches?14:38
mtreinishwe probably could look at adding that, but it might mean we need to patch the underlying subunit.run command14:38
stephenfinmtreinish: Eh, this gets me 80% of where I need to go14:38
ttxHi! Did y'all have any plans to normalize IRC configuration ? It's a bit spread out onto lots of files today. I'm interested as I'm looking into publishing a reference list of  current IRC channels on eavesdrop.o.o14:38
mordredttx: only in as much as there is a plan to consolidate from multiple bots to a single bot14:39
mordredttx: but there is no current configuration normalization plan that I am aware of14:39
ttxmordred: any decision as to which bot that would be ?14:40
mordredttx: yes - there is a spec even ... one sec14:40
* ttx likes gerritbot config file better than that hiera file statusbot uses for channel list14:40
mordredttx: http://specs.openstack.org/openstack-infra/infra-specs/specs/irc.html14:41
fricklermnaser: that kind of error tends to happen if the mirrors are not up to date and your image has newer packages pre-installed than what the mirror has.14:42
fricklermnaser: maybe some other infra-root can continue here, /me needs to leave soon14:42
mnaserfrickler: darn, okay, also i don't need `104.130.163.27` anymore14:42
ttxmordred: thanks!14:42
mnaserbut if we can keep an autohold for the next failure in line :(14:42
*** heyongli has quit IRC14:42
*** heyongli has joined #openstack-infra14:42
fricklermnaser: o.k., deleted the old node and did set an autohold for another three, just in case14:43
mnaserfrickler: thank you very much14:43
fungimwhahaha: on a provider-by-provider basis we could probably run some numbers by manually analyzing apache logs. what specifically are you looking for?14:43
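the manual log analysis fungi mentions can start with a few lines of Python; the regex below assumes the standard combined log format shown earlier, where the status code sits right after the quoted request:

```python
import re
from collections import Counter

# matches the first '" NNN ' after the quoted request line
STATUS_RE = re.compile(r'" (\d{3}) ')

def count_statuses(log_lines):
    """Tally HTTP status codes from combined-format access log lines."""
    counts = Counter()
    for line in log_lines:
        match = STATUS_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts
```

Feeding this a day of mirror logs would show, for instance, the ratio of 307 redirects to direct 200 responses for the dockerhub proxy paths.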
*** hamzy_ is now known as hamzy14:44
fungimwhahaha: also, i don't know that tripleo jobs are the only ones using the dockerhub proxy14:44
mwhahahafungi: we were seeing increased time loading containers over the last week and want to make sure the caching is still working and if we had added additional strain by switching to a containerized undercloud14:44
*** ociuhandu_ has joined #openstack-infra14:45
*** iyamahat has joined #openstack-infra14:45
mwhahahafungi: we've confirmed that we're still using the proxy, so we're working our way back to the source to try and determine what's going on14:45
fungimwhahaha: it may be useful for me to see if i can check the churn rate on the cache. we only set aside ~50gb of space for apache to cache things on our mirrors, if memory serves, so if the variety of what's being cached is too great we might be overrunning that and making the cache basically useless14:46
mtreinishstephenfin: fwiw, you could use the python path (so '.' separated) and specify the class that way (not sure if it works down to a single method though), but for the file path interpolation it only works to the module level I think14:46
mwhahahayea that's what i'm afraid of14:46
mtreinishstephenfin: and please feel free to open issues for quirks or improvment suggestions on this. Anything suggestions on making it easier to use would be appreciated14:47
stephenfinmtreinish: Yeah, that's what I have been doing (tox -e py27 nova.tests.unit.network.neutronv2.SomeClass.test_something) but it's tedious converting paths to python paths14:47
fungiand unfortunately, increasing that cache space is counter-productive even if we do have space to do it, because we reach the point where apache can't remove less-frequently-accessed content as fast as new content is being requested (it does this in an asynchronous fashion) so we risk filling up the filesystem14:47
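the ~50GB ceiling and the asynchronous eviction fungi describes correspond to htcacheclean running in daemon mode against the mod_cache_disk store; a hedged sketch of such an invocation (the path and interval are assumptions, not our actual mirror configuration):

```shell
# prune the disk cache toward a ~50GB ceiling, re-checking every
# 30 minutes; -n lowers priority, -t removes empty directories
htcacheclean -d30 -n -t -p /var/cache/apache2/mod_cache_disk -l51200M
```

Because the pruning is periodic rather than inline with requests, a burst of new content can outrun it, which is the filesystem-filling risk mentioned above.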
stephenfinmtreinish: will do (y)14:47
mwhahahafungi: ideally having a local container registry that we could push to in each cloud would reduce this requirement, but i'm not aware if this is on anyone's radar14:48
*** iyamahat has quit IRC14:48
*** r-daneel_ has joined #openstack-infra14:48
mtreinishstephenfin: ah, that's slightly different pattern. With '-n' you're giving it a python object and it's run without doing a discovery. Without any args it does a regex match on the string after doing discovery (which imports all the modules to build a list)14:49
*** iyamahat has joined #openstack-infra14:49
*** r-daneel has quit IRC14:49
*** r-daneel_ is now known as r-daneel14:49
mtreinishstephenfin: if you don't mind taking the discovery hit (which can take a few secs depending on your system's io) you can give it a smaller string that uniquely identifies the test you care about14:49
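the file-path-to-dotted-path conversion stephenfin calls tedious is mechanical enough to script; a tiny hypothetical helper (the function name is made up, and it only handles the module-level case discussed above):

```python
def path_to_test_id(path):
    """Turn a file path like nova/tests/unit/foo.py into the dotted
    python path (nova.tests.unit.foo) usable as a test selection
    string -- a class or method can then be appended with dots."""
    if path.endswith(".py"):
        path = path[:-3]
    return path.replace("/", ".")
```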
fungimwhahaha: even at that, our test environment isn't really tuned to support nodes requesting many gigabytes of data over the network, whether it's hosted nearby or not14:49
fungiwe discussed in the past (now long past) the possibility of preparing, snapshotting, cloning and attaching block devices to test nodes to reduce the network load for such things, but in some of our providers available block storage bandwidth may not be any better than local network bandwidth (and in fact they're often one and the same anyway because of relying on iscsi)14:52
*** heyongli has quit IRC14:52
mtreinishmordred: on the topic of irc bots, want to take a look at: https://review.openstack.org/#/q/status:open+branch:master+topic:even-more-firehose :)14:53
*** heyongli has joined #openstack-infra14:53
mwhahahaso i think that's a different solution than being proposed, a docker registry would be no different than the existing distro mirrors today14:53
mwhahahabecause it's not a single file, it's a bunch of layers14:54
*** rpioso|afk is now known as rpioso14:54
fungisure, but also constrained by many of the same challenges we have with our existing mirroring solutions14:54
openstackgerritMerged openstack-infra/zuul-jobs master: Collect the coverage report for npm test jobs  https://review.openstack.org/57026014:54
*** Swami has joined #openstack-infra14:54
mwhahahathe existing distro mirrors seem to be fine, so i'm unsure why you think a local in-DC repository wouldn't be an improvement14:55
fungiwe can do synchronous mirror updates via afs, but it's not really fit for replicating many gigabytes of high-churn data halfway around the world14:55
mordredwe looked at local in-dc docker repo before doing the apache route14:55
* mwhahaha shrugs14:56
mordredit is, unfortunately, more complicated to get it going and working properly than was feasible to deal with at that point in time14:56
mordredthere are some other changes and things coming down the pipeline that should improve that and make it reasonable to re-assess that14:56
mordredbut nothing that would make much improvement this week14:56
fungimwhahaha: any idea what the turnover rate is on those docker images? and how much space we'd need to store them? just trying to get a feel for the scale of the problem14:56
mwhahahafungi: at most once every 8 hours i think14:57
mwhahahafungi: but we haven't had any new ones in some time14:57
fungiif it's 3 or 4 gigabytes changing every week or two that seems doable14:57
mwhahahathe problem is likely that it's more than just one branch because it's master/queens/pike14:57
mwhahahai need to get some actual number on the total size of a container set for a release14:58
fungiif it's on the order of tens of gigabytes and changing daily (or faster) then that would need some significant engineering to solve14:58
mwhahahai don't think so because we don't necessarily need infra to solve the replication14:58
mwhahahawe could do the pushes and coordination of tagging ourselves14:58
mwhahahaso i think it can be solved simply but we don't have a location to push to14:58
fungipushes to where? directly into the mirrors? and what's the plan to deal with them getting out of sync?14:59
mwhahahaso we just need a stable base to start from, as long as it's in sync within a region there's not a problem14:59
mwhahahafrom our stand point we query the registry for the latest15:00
mwhahahaso if we're pushing and don't update the tag until we're done,  the jobs would continue to function15:00
mordredfungi: so that if there was a docker registry in each cloud-region that had per-project space in it, they could have a publication job that pushed content into all of the mirrors15:00
mwhahahawe're handling patching the containers in the jobs which includes updating them15:00
fungimore concerned about the troubleshooting required if uploads to one or more regions get stuck/fail and you end up with jobs running against different versions of images for a significant period of time depending on where they're running15:01
mwhahahabut that's on the project15:01
mwhahahawe already have processes for uploading containers15:01
mordredthe challenge there is having a per-region registry that supports enough namespacing that a given project could have space to push things without affecting other projects15:01
mwhahahaso for us it's just additional locations and folks solve the failures as they happen15:02
fungiyeah, i really don't want to engineer "the tripleo docker registry network"15:02
mwhahahafor that it would be quota setting and i'm not sure what's available from the various registry solutions15:02
*** heyongli has quit IRC15:03
*** e0ne has quit IRC15:03
mwhahahabut it's also the kolla registry network15:03
fungiif we build something, it would need to be generalized for any project that wants to put things in it15:03
*** heyongli has joined #openstack-infra15:03
mordredright. one of the challenges so far is that all of the more advanced registry solutions all assume one is going to run your container registry system in containers itself15:03
mordredso far we do not use containers to run any services - although I am working on a spec about opening that up15:03
mordredthis is what I was getting at before - when we looked at this before the rabbit hole got pretty deep ... but we're getting close to the point where some of the previous blockers may have solutions15:04
mwhahahai know dmsimard has had decent luck with the atomic registry15:04
mordredyes. that is one of the ones I believe we'd entertain15:05
mwhahahaso maybe he has some input on this and the feasibility of offering something15:05
mordredbut atomic registry assumes you are running it in containers15:05
dmsimardatomic registry is unfortunately not really a thing anymore15:05
mordredwhich we do not currently support15:05
clarkbdo we have a list of issues that the current caching proxies don't solve that running a new service everywhere would?15:05
dmsimardat least last I know15:05
clarkbalso tumbleweed deleted xmonad so I'm like a fish out of water right now15:05
dmsimardmordred: we are running the RDO openshift (standalone registry implementation) on a single virtual machine15:05
mordreddmsimard: awesome15:05
dmsimardmordred: it's not containerized15:05
fungiclarkb: sounds like it's probably mostly related to wanting faster and more reliable access to very large files which change very frequently15:05
mwhahahadmsimard: oh sorry i thought it was the atomic registry, it's the openshift one?15:06
clarkbfungi: right, I don't expect a new service would address that15:06
openstackgerritMerged openstack-infra/project-config master: Don't follow symlinks when setting log permissions  https://review.openstack.org/57543915:06
mwhahahathey aren't very large files15:06
mwhahahacontainers aren't a single file15:06
fungilarge sets of data then15:06
clarkbnetwork bw is the issue15:06
dmsimardmwhahaha: atomic registry is an implementation of openshift standalone registry but it's deprecated afaik15:07
clarkband running a new service won't fix that15:07
mwhahahathe issue is network bw out of a DC15:07
mordredclarkb: I think the thing a registry in each region would solve that passthrough caching doesn't is the ability to choose which things are put into the registries (the things people care about - or maybe also built artifacts)15:07
mwhahahaand a new service in dc would fix that15:07
mwhahahaor at least improve it15:07
clarkbmwhahaha: you still have to copy it to all DCs15:07
fricklermordred: ianw: fungi: with this patch to the clouds yaml I was able to successfully start an instance running /home/ianw/start.py http://paste.openstack.org/show/723478/15:07
mordredpassthrough caching can get blown out by contention and random queries for less important things15:07
mwhahahahaving to go to external origin because the caching rate is not sufficient seems to be the issue15:07
clarkbthat's WAN and effectively the same issue15:08
mwhahahawe're trying to be smart about the loading of the data because we know what we need15:08
fricklerbbl15:08
dmsimardmordred: fwiw openshift has pull-through caching support https://docs.openshift.com/container-platform/3.9/install_config/registry/extended_registry_configuration.html#middleware-repository-pullthrough15:08
mwhahahathe caching leaves that up to disruption15:08
clarkbmordred: that implies you'll find terabytes of local storage for the registry in each cloud? otherwise you still have that problem15:08
mwhahahawe don't need terabytes15:08
mwhahahait's a rotating set of probably 10-20g15:09
clarkbmwhahaha: you have 100gb today iirc15:09
dmsimardmordred: so for example, you can automatically mirror the "centos" image in the openshift registry and then pull that15:09
mordredclarkb: I don't think the request is for terabytes - I think the request is for a specific set of base images to always be in the cache and never get expired15:09
mwhahahaclarkb: 100gbs where?15:09
clarkbmwhahaha: of dockerhub cache in each region15:09
mwhahahaif you're refering to the apache stuff, that's shared between containers, rpms, etc15:09
mordredbecause if they get expired, then the pullthrough cache needs to refresh from the open internet15:09
*** iyamahat has quit IRC15:09
mwhahahaclarkb: so what's the cache hit rate on that15:10
clarkbmordred: right, so that's sort of what I was trying to get at: if the problem is how we cache, we can address that pretty easily15:10
clarkbif the problem is network we can't15:10
clarkbmwhahaha: I haven't looked in a long time so not sure15:10
*** dizquierdo has joined #openstack-infra15:11
mordredclarkb: yes - the problem is how we cache15:11
mordredthe problem is not network15:11
*** dhajare has quit IRC15:11
mordredor - the hypothesis is that the problem is how we cache15:11
clarkbmordred: we can increase the length of time we keep objects (I think it is 24 hours today)15:11
mwhahahaso there are two issues15:11
mwhahaha(at least)15:12
mwhahahathe caching of stuff for 24 hours may not be correct15:12
mwhahahabecause the rotation of the images may occur more frequently15:12
mordredclarkb: in a perfect world, projects would be able to say "we never want this set of objects to be expired from cache"15:12
mwhahahawhich is why i asked about being able to push to a registry because we would handle the next cache loading in the container build process15:12
mordredclarkb: but the logistics of that become obviously complicated15:13
*** heyongli has quit IRC15:13
mwhahahawith caching, we're at the mercy of other things as well because the stale content may continue to live long after it's no longer valid reducing the viability of the cache15:13
*** heyongli has joined #openstack-infra15:13
fungialso if the files all update at once, then you get a thundering herd sort of problem15:13
mwhahahaalso misses are more painful because you end up with multiple origin hits15:13
clarkbmwhahaha: stale data in this context shouldn't be a big problem since it is all sha256 addressed15:14
clarkband it acts as a fifo15:14
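clarkb's point about sha256 addressing: docker blobs are keyed by the digest of their content, so a cached entry can be old but never wrong — changed bytes mean a different key entirely. A minimal illustration:

```python
import hashlib

def blob_key(data: bytes) -> str:
    """Content-addressed key in the docker registry style: the digest
    of the bytes is the identity, so a 'stale' cache entry still
    serves exactly the content its key promises."""
    return "sha256:" + hashlib.sha256(data).hexdigest()
```

What rotates is therefore only the tag-to-digest mapping, not the blobs themselves, which is why FIFO eviction of blobs is safe if sometimes wasteful.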
openstackgerritPaul Belanger proposed openstack-infra/zuul-jobs master: Add buildset-artifacts-location  https://review.openstack.org/53067915:14
mwhahahaok it would be nice to have some visibility on the health of these caches to understand increased requests/misses15:14
fungiwould exposing apache mod_status help, i wonder?15:18
mwhahahapossibly to start15:20
clarkbI was trying to find if something like that would report cache stats15:20
*** yamamoto has joined #openstack-infra15:20
clarkbI don't think status does15:21
Diabelkohello again, I've stumbled upon "parent-change-enqueued" event and seems like a great idea to solve my problem (B depends on A, both got -1 because of failure in A, A is fixed and B is not getting a re-run), but I don't see it configured anywhere in your check or gate pipelines in openstack-infra/project-config15:22
Diabelkois there a tricky part there somewhere?15:22
Diabelkosome unintended behavior?15:22
*** heyongli has quit IRC15:23
*** lpetrut has joined #openstack-infra15:23
*** heyongli has joined #openstack-infra15:23
*** pcaruana has quit IRC15:24
*** yamamoto has quit IRC15:25
*** krtaylor has quit IRC15:25
clarkblooks like you can add cache status logging to the log format15:25
fungiDiabelko: we've taken the stance in the past that actions on a change should be necessary to trigger new jobs in independent pipelines like our check or experimental pipelines (new patchset, change restored, explicit recheck comment). also parent-change-enqueued only gets you that behavior for explicit change series but not zuul's cross-repository or cross-connection dependencies15:29
fungiDiabelko:15:30
fungier, sorry, stray carriage return15:31
*** ykarel_ has joined #openstack-infra15:31
fungioh, you're talking about a zuul internal event, i was completely misunderstanding and confusing that with one of the gerrit event stream events15:32
fungiso i think we do rely on that to enqueue in dependent pipelines15:32
fungiand it does get you cross-repo/conn dependencies15:33
*** heyongli has quit IRC15:33
*** ykarel|away has quit IRC15:34
*** heyongli has joined #openstack-infra15:34
*** eernst has joined #openstack-infra15:34
fungioh, actually no that behavior is simply implicit in dependent pipelines, as zuul evaluates the entire set of dependent changes to see which are ready to enqueue when it gets an enqueuing event for any one of them15:35
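the behavior fungi lands on — a dependent pipeline evaluates the whole dependency set and enqueues a change only when everything it depends on is ready — can be sketched recursively. Illustrative only, not zuul's implementation; the field names are invented:

```python
def ready_to_enqueue(change_id, changes):
    """A change is ready for a dependent pipeline only if it is
    approved and every change it depends on is itself ready
    (hypothetical data model, ignoring dependency cycles)."""
    change = changes[change_id]
    if not change.get("approved"):
        return False
    return all(ready_to_enqueue(dep, changes)
               for dep in change.get("depends_on", []))
```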
*** anteaya has joined #openstack-infra15:35
fungilooks like we actually added that event type in https://review.openstack.org/112411 nearly 4 years ago15:38
*** lpetrut has quit IRC15:38
*** krtaylor has joined #openstack-infra15:40
*** hashar is now known as hasharAway15:40
*** e0ne has joined #openstack-infra15:40
*** krtaylor has quit IRC15:42
fungiDiabelko: so anyway, yes matching on parent-change-enqueued to enqueue changes for retesting may make sense in some environments. i think in ours i wouldn't want to see that because we have rather a lot of churn and often very long dependent series where forcing retesting could lead to a lot of additional utilization, but it's worth entertaining15:43
clarkbya the internet seems to think that %{Age} is the way to track this via the log15:43
clarkbthen we can produce a report like we do for docs 404s likely15:43
*** heyongli has quit IRC15:44
*** krtaylor has joined #openstack-infra15:44
*** krtaylor has quit IRC15:44
*** heyongli has joined #openstack-infra15:44
clarkbor cache_status https://httpd.apache.org/docs/2.4/mod/mod_cache.html#status15:45
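[editor's note: mod_cache exports its decision in the cache-status environment variable, so the logging clarkb and corvus land later in this log plausibly looks something like the sketch below; the log path and format name are made up]

```apache
# %{cache-status}e logs mod_cache's decision (hit / miss / revalidate);
# %{Age}o would log the Age response header instead.
LogFormat "%h %l %u %t \"%r\" %>s %b \"%{cache-status}e\"" cache_combined
CustomLog /var/log/apache2/mirror_access.log cache_combined
```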
*** krtaylor has joined #openstack-infra15:45
*** krtaylor has quit IRC15:49
*** lihi has quit IRC15:50
*** lpetrut has joined #openstack-infra15:50
*** dtantsur is now known as dtantsur|afk15:51
*** kgiusti has left #openstack-infra15:52
*** germs has joined #openstack-infra15:52
*** germs has quit IRC15:52
*** germs has joined #openstack-infra15:52
*** lihi has joined #openstack-infra15:52
fungithat seems like it would be useful15:53
*** linkmark has joined #openstack-infra15:53
openstackgerritEd Leafe proposed openstack-infra/project-config master: Rename the API-WG to API-SIG  https://review.openstack.org/57547815:54
*** heyongli has quit IRC15:54
*** heyongli has joined #openstack-infra15:54
openstackgerritClark Boylan proposed openstack-infra/system-config master: Log cache misses on caching mirror proxies  https://review.openstack.org/57547915:54
clarkbmwhahaha: fungi ^ start with something like that maybe15:55
*** sshnaidm has quit IRC15:55
*** dizquierdo has quit IRC15:56
pabelangerI'm just looking into nodepool errors in nl03, and seeing: shade.exc.OpenStackCloudTimeout: Timeout waiting for the server to come up for limestone. Wonder if we need to bump boot-timeout a little here15:58
pabelangerwas originally looking to see the OVH-gra1 failures15:59
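[editor's note: a hedged sketch of the nodepool knob pabelanger means — the provider-level boot-timeout (seconds); the provider name and value are illustrative]

```yaml
providers:
  - name: limestone-regionone
    boot-timeout: 180   # how long to wait for a server to become ACTIVE
```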
Diabelkofungi: oh, interesting. Our dependencies usually don't go deeper than 5 reviews, but then again I think the only situation where it can bite you is if change B (C, D..) requires modifications as well16:01
logan-pabelanger: timestamps line up with the spike here? http://grafana.openstack.org/dashboard/db/nodepool-limestone?panelId=13&fullscreen16:01
DiabelkoI'm interested that mostly because of the users though16:01
Diabelkogot asked multiple times why CI didn't run on a change X16:01
pabelangerlogan-: 2018-06-14 11:37:20,798 ERROR nodepool.NodeLauncher-0000121748: Launch attempt 1/3 failed for node 0000121748:16:02
pabelangerlogan-: seems so16:02
pabelangerlogan-: did we upload new images?16:03
pabelangerand hitting force raw convert?16:03
*** krtaylor has joined #openstack-infra16:03
*** myoung is now known as myoung|lunch16:04
*** heyongli has quit IRC16:04
*** heyongli has joined #openstack-infra16:05
*** krtaylor has quit IRC16:06
*** panda is now known as panda|off16:06
*** lifeless has quit IRC16:07
*** jpich has quit IRC16:08
*** lifeless has joined #openstack-infra16:08
logan-pabelanger: yeah either that or the daily osa deploy run is what i was thinking16:09
pabelangerk16:10
logan-thanks was just curious. osa deploy was 8:15 - 9:50 so it must have been new images16:12
*** jesslampe has joined #openstack-infra16:13
*** heyongli has quit IRC16:14
pabelangerlogan-: can you confirm if nova is converting them to raw?16:14
*** heyongli has joined #openstack-infra16:15
logan-ya the images in /var/lib/nova/instances/_base appear to be raws based on qemu-img info output16:16
logan-http://paste.openstack.org/raw/723482/16:17
pabelangerk, that might explain it16:17
pabelangerlogan-: do they need to be raw?16:18
pabelangerif so, we could upload raw directly16:18
pabelangerotherwise, we should force qcow2 in nova.conf16:18
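[editor's note: a sketch of the nova.conf settings being weighed here; the option names are nova's standard ones, but treat the exact values as assumptions]

```ini
[DEFAULT]
# True (the default) converts downloaded qcow2 images to raw in
# /var/lib/nova/instances/_base, which is the conversion cost
# discussed above; False keeps the base images qcow2.
force_raw_images = False

[libvirt]
images_type = qcow2
```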
*** gyee has joined #openstack-infra16:19
*** udesale has joined #openstack-infra16:20
*** jesslampe has quit IRC16:22
*** yamamoto has joined #openstack-infra16:22
logan-they don't need to be raw, but iirc the defaults are the way they are because theres a performance hit to the instance disk i/o when the base is not raw? uploading raw would probably not be preferable because you would take a hit on boot times downloading images from glance then.16:22
logan-im thinking youre right to increase the ready timeout16:23
*** krtaylor has joined #openstack-infra16:23
pabelangerlogan-: which would take longer, downloading raw from glance, or all compute nodes converting to raw?16:24
*** krtaylor has quit IRC16:24
*** jesslampe has joined #openstack-infra16:24
*** heyongli has quit IRC16:25
pabelangerlogan-: 14GB for .raw / 8.5GB for .qcow216:25
*** heyongli has joined #openstack-infra16:25
logan-depends how many nodes we have :) i have no idea w/ the current setup but as you add more nodes its not like the glance bw throughput will scale along with the hv count. if you want to upload raws and find out, im not opposed to that16:27
*** yamamoto has quit IRC16:27
*** pbourke has quit IRC16:28
pabelangerclarkb: corvus: Shrews: we seem to have a fair bit (10) ready locked nodes in nodepool right now, all seem to be above 15 hours. Only mention since node-requests are climbing today and could use all the nodes when possible.16:28
pabelanger1 seems to be 2 days16:28
*** udesale has quit IRC16:28
*** jesslampe has quit IRC16:29
fungialso raw means much longer upload times to glance, and greater opportunity they're disrupted and have to be retried16:31
fungiand greater bandwidth utilization16:32
corvuspabelanger, clarkb, Shrews: it would be useful if someone has the time to track down what holds the locks and why.  unfortunately, i don't at the moment.16:32
*** germs has quit IRC16:32
ShrewsI'm afk for lunch16:33
corvusi plan on performing a full zuul restart today (a fuul restart?), so we'll probably lose that data soon.  on the plus side, we'll probably release the locks.16:34
pabelangerI might be able to dig more in later this evening, but not right now. I just wanted to point it out as I was looking into nodepool failures16:34
*** Swami has quit IRC16:35
openstackgerritEd Leafe proposed openstack-infra/project-config master: Migrate the API-SIG to StoryBoard  https://review.openstack.org/57512016:35
*** heyongli has quit IRC16:35
*** heyongli has joined #openstack-infra16:35
corvusthough, iirc, we may still have an open bug (from even before the 3.0 release) about builds/nodes not being cleaned up correctly if an executor dies; there could still be an edge case in there somewhere.16:35
corvuswe've had a lot of executor restarts lately16:35
*** SumitNaiksatam has quit IRC16:36
*** iyamahat has joined #openstack-infra16:36
pabelangerYah, there is also a bug if zuul does a reload, and removes a job, we leak the node and don't properly clean up.16:37
*** zzzeek has quit IRC16:38
*** dhajare has joined #openstack-infra16:39
*** lyarwood has quit IRC16:39
*** lyarwood has joined #openstack-infra16:39
*** sshnaidm has joined #openstack-infra16:40
*** germs has joined #openstack-infra16:41
*** germs has quit IRC16:41
*** germs has joined #openstack-infra16:41
*** krtaylor has joined #openstack-infra16:42
*** zzzeek has joined #openstack-infra16:44
*** heyongli has quit IRC16:45
*** heyongli has joined #openstack-infra16:45
*** krtaylor has quit IRC16:46
*** germs has quit IRC16:47
*** e0ne has quit IRC16:48
clarkbwe should be able to manually delete the nodes though right?16:53
clarkbespecially if they are that old16:53
*** dougsz has quit IRC16:55
mnaserhave we ever mirrored git repos locally?16:55
pabelangerclarkb: no, lock held by zuul, can't do it via CLI. Maybe we should add a --force flag16:55
mnaserspice-html5 decided "fu github" and is now running on an awful gitlab instance that is working half the time16:55
*** heyongli has quit IRC16:55
*** heyongli has joined #openstack-infra16:56
clarkbmnaser: no, one idea around that was to add the repos to zuul and it would push things into the test nodes16:56
clarkb(and maintain the cache on the zuul nodes)16:56
*** krtaylor has joined #openstack-infra16:58
*** zoli is now known as zli|gone16:59
*** zli|gone is now known as zoli|gone16:59
*** zoli|gone is now known as zoli16:59
*** e0ne has joined #openstack-infra17:00
*** derekh has quit IRC17:00
*** trown is now known as trown|lunch17:01
mnaserhttps://gitlab.freedesktop.org/spice/spice-html5 like this was 503ing a whole bunch of times17:02
*** SumitNaiksatam has joined #openstack-infra17:02
pabelangerisn't it published to npm?17:02
mnaserooo17:03
mnaserthat's a good idea17:03
*** krtaylor has quit IRC17:03
pabelangeryah, you'll get the regional reverse proxy cache that way17:03
mnasera whole bunch of commits missing from master tho17:03
mnasergr17:03
mnasermaybe they publish snapshots to npm hm17:04
clarkbmnaser: thanks for the comments about docker install yesterday, looking at the log now and http://logs.openstack.org/74/74/c356c2eeb28c1dcd4deb2b00fcd896c57d66284c/third-party-check/kata-runsh/bae0973/job-output.txt.gz#_2018-06-13_22_39_08_695464 looks suspiciously unhappy17:04
clarkbcorvus: looking at that log path I think the 74/74/sha1/ maybe in error? I think we want 74/c3/sha1 ?17:04
mnaserclarkb: yeah, docker-ce would make a difference, can i iterate on the job you were building out?17:05
mnaserthe prep script is the same they use across all vms17:05
clarkbmnaser: definitely, but also docker-ce doesn't have bionic package :/17:05
mnaseryou have to add a repo17:05
clarkbmnaser: ya setup.sh is running but there are only older ubuntu packages17:05
clarkbthis is why I manually install docker as part of the job from the system17:05
*** heyongli has quit IRC17:06
clarkbmnaser: they have a bionic repo but no packages in it17:06
mnaserright but you have to add the docker.io repos17:06
mnaserOH17:06
mnaserOH17:06
clarkbyes I know, those are empty :)17:06
mnaseri'm sorry17:06
mnaserokay, now i get it17:06
mnasersorry potato brain17:06
*** heyongli has joined #openstack-infra17:06
clarkbya I don't know if that is just lag or if they don't publish packages while the distro has up to date packages itself or what17:06
pabelangerhttps://github.com/docker/for-linux/issues/29017:06
mnaseri think there in another path17:06
clarkbwe could try the xenial packages17:06
pabelangeror artful?17:07
mnaseror maybe we could run their ci on xenial vms?17:08
clarkbmnaser: ya I think the other path works as far as "oh docker is already installed" but doesn't work for the configure docker stuff17:08
clarkbmnaser: ya we could do that, though I figured starting on newer distro would be nice for other reasons (but docker-ce sort of gets in the way I guess)17:08
mnaserinvolves changing up the nodesets but it is a lot less 'breaking' their ci17:08
mnaserthey do testing on 16.04, 17.10 (dont ask me), centos 7 and fedora 2717:09
clarkbwas thinking newer kernel and everything else may be beneficial to them but maybe worry about that later17:10
*** e0ne has quit IRC17:10
mnaser++17:10
clarkblet me put up some patches to xenial17:11
*** myoung|lunch is now known as myoung17:12
*** e0ne has joined #openstack-infra17:13
*** jpena is now known as jpena|off17:14
openstackgerritClark Boylan proposed openstack-infra/project-config master: Add xenial node for vexxhost kata testing  https://review.openstack.org/57550217:15
clarkbmnaser: ^ that is first step17:15
openstackgerritMerged openstack-infra/project-config master: Remove glance legacy job  https://review.openstack.org/55101617:15
mnaserlgtm17:16
*** heyongli has quit IRC17:16
*** heyongli has joined #openstack-infra17:16
pabelanger+317:16
*** amoralej is now known as amoralej|off17:16
dhellmanncould I get 1 more reviewer to take a look at https://review.openstack.org/574842 please? I would like to be able to test out that new check job but I can't do it speculatively because the change is in project-config17:17
openstackgerritClark Boylan proposed openstack-infra/openstack-zuul-jobs master: Improve kata-runsh job  https://review.openstack.org/57374817:18
clarkbpabelanger: mnaser ^ that is the consumption of it, I have just been testing that in a self testing manner though17:18
clarkbso doesn't need review yet17:18
*** krtaylor has joined #openstack-infra17:19
corvusdhellmann, AJaeger: why is that in project-config instead of ozj?17:19
dhellmannbecause it uses part of the tarball playbook and the real publish job needs to be in project-config because it uses secrets17:19
*** e0ne has quit IRC17:19
corvusthx17:19
mnaserclarkb: this is the ready script right now in jenkins that is working - http://paste.openstack.org/show/723489/17:20
dhellmannkeeping the 2 things together felt like the less confusing way to do it, rather than duplicating the playbook and role17:20
dhellmann*2 jobs together17:20
mnaseryou can skip the java part obviously because that's for the jenkins slave stuff17:20
mnaseralso the whole unattended stuff is probably useless in our case17:20
*** krtaylor has quit IRC17:21
clarkbmnaser: ya also setup.sh does the docker install if no docker installed so I think it has that covered. Also we don't need java for jenkins17:21
mnaseroh the setup.sh does docker install? ok interesting didnt know that17:21
mnaseri copypasta'd old config17:21
clarkbmnaser: ya, it was one of the first things that failed for me because of the lack of bionic packages17:21
mnaseraaaah17:22
clarkbgot past that then ran into the nested virt problem and it wouldn't run past that17:22
*** krtaylor has joined #openstack-infra17:22
*** krtaylor has quit IRC17:22
mnasergotcha17:23
clarkbbut what is old is new again now that nested virt is addressed for their use case17:23
*** yamamoto has joined #openstack-infra17:23
*** e0ne has joined #openstack-infra17:24
clarkbmnaser: https://github.com/kata-containers/proxy/pull/74 is the pull request I have been pushing to to test changes to https://review.openstack.org/573748 if you want to also try that feel free. Though we have to wait for the xenial stuff to get to nodepool first17:24
mnaserooh that's how you've been doing it17:25
mnaseri see17:25
clarkbmnaser: right now only that one project in github talks to zuul17:25
mnaserare you manually re-enquing or 'recheck'17:25
*** felipemonteiro has joined #openstack-infra17:25
clarkbmnaser: I'm pushing new commits to the PR because I don't think recheck will work there until they accept the updated perms requirements17:25
clarkbmnaser: usually I just edit the commit message then push :)17:26
*** heyongli has quit IRC17:26
mnaserok cool17:26
*** heyongli has joined #openstack-infra17:26
mnaseronce we get the base going i think it might be relatively easy to move things across17:27
clarkbya shouldn't be too bad. The biggest thing will be adding a tenant for them in zuul if they want to use zuul17:27
*** e0ne has quit IRC17:27
*** yamamoto has quit IRC17:29
*** tesseract has quit IRC17:30
*** dave-mccowan has quit IRC17:30
clarkbmwhahaha: so that I understand properly, the symptom of failure you are observing is that multiple docker image pulls take long enough to cause jobs to timeout? Also, you are pulling the same images in all cases so they should be cached?17:30
clarkbor rather pulling the same images within a single job so subsequent pulls should be quicker17:31
mwhahahayea17:31
corvusdhellmann: i feel like there's probably a way to either use inheritance or ansible roles so that the bulk of the job is in ozj, and then there's a smaller thing in project-config which adds the secret into the mix for the "real" job.  but probably the best thing to do is to +3 that change and then think about refactoring later (since it shouldn't be hard) rather than blocking on perfect design.  :)17:31
*** janki has quit IRC17:31
clarkbmwhahaha: cool thanks for confirming. In that case I agree checking cache usage is going to be helpful. I think causes of that could be not using the cache at all (we have checked that we are using the proxies right?), the proxies not caching or not using cached data for some reason, or we are caching but network bandwidth (possibly disk io?) are the bottleneck17:32
clarkbmy change to the apache mirror config should help address the second item there17:32
mwhahahaclarkb: we checked the logs and it's referencing the mirrors so it should be going through the proxies17:33
mwhahahaso i'm just trying to work through the flow to understand where there might be issues. i know that the transit to the origin would be a problem. It's not consistent enough to look like we're hitting something there but may be related to efficiency of the caches17:34
clarkbmwhahaha: is any one region worse than the others (that could point to network/disk bottlenecks)17:35
mwhahahanot really17:35
mwhahahaweshay|ruck had more data around that17:35
mwhahahawe were seeing issues in rax and in limestone17:35
mwhahahaand maybe vexxhost17:35
mwhahahaat least thats what he mentioned to me yesterday but it wasn't consistent17:36
weshay|ruckwhat's up17:36
mwhahahai noticed there seemed to be a slight pattern in timeouts but it wasn't exactly every N hours17:36
*** heyongli has quit IRC17:36
mwhahahaweshay|ruck: looking into container fetching and caching efficiencies17:36
weshay|ruckk17:37
*** heyongli has joined #openstack-infra17:37
corvusi'm going to directly re-enqueue some zuul gate changes (starting at 575351) so that we can restart it with them today17:38
*** akhilaki has joined #openstack-infra17:38
clarkbok17:38
openstackgerritsebastian marcet proposed openstack-infra/openstackid-resources master: Added endpoint to get current selection plan by status  https://review.openstack.org/57550917:38
openstackgerritMerged openstack-infra/openstackid-resources master: Added endpoint to get current selection plan by status  https://review.openstack.org/57550917:39
clarkbI'm getting htcacheclean to print everything we have in the cache on limestone17:42
*** florianf has quit IRC17:43
*** pcaruana has joined #openstack-infra17:43
clarkblimestone has 6718 docker images cached17:44
clarkbor at least that many /cloudfront/registry-v2/docker/registry/v2/blobs/$sha256 objects17:45
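[editor's note: roughly how that count can be produced — htcacheclean -a lists one cached URL per line, so counting blob entries is a grep; the listing below is a self-contained stand-in for the real cache, with illustrative paths]

```shell
# Stand-in for: htcacheclean -a -p /var/cache/apache2/proxy
cache_listing() {
  printf '%s\n' \
    'http://mirror.example.org/cloudfront/registry-v2/docker/registry/v2/blobs/sha256/aa/data' \
    'http://mirror.example.org/cloudfront/registry-v2/docker/registry/v2/blobs/sha256/bb/data' \
    'http://mirror.example.org/opendaylight/distribution.tar.gz'
}

# Count the docker registry blob objects in the cache
cache_listing | grep -c '/registry/v2/blobs/'   # prints 2
```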
*** e0ne has joined #openstack-infra17:45
clarkbthat implies we are at least caching them17:45
*** heyongli has quit IRC17:47
*** heyongli has joined #openstack-infra17:47
corvusclarkb: do we log cache status?  https://httpd.apache.org/docs/2.4/mod/mod_cache.html#status17:47
clarkbcorvus: not yet https://review.openstack.org/575479 is up to start doing that17:49
clarkbthe biggest object in the cache is 429MB and is an opendaylight tarball. The next biggest objects are in the 330MB range and are all container images17:50
*** diablo_rojo has joined #openstack-infra17:51
openstackgerritMerged openstack-infra/project-config master: add a job to check the metadata for python packages  https://review.openstack.org/57484217:52
openstackgerritMerged openstack-infra/project-config master: remove publish-openstack-python-tarball  https://review.openstack.org/57485917:52
openstackgerritJames E. Blair proposed openstack-infra/system-config master: Add cache status into mirror access log  https://review.openstack.org/57551417:54
corvusclarkb: ^ i was thinking of an alternate approach, what do you think?17:54
clarkb43.9GB of 52GB of cache is just docker body content17:54
clarkbcorvus: looking17:55
clarkbcorvus: ya that will get us more info. I was wanting to avoid needing to sift too much data but that can all be done after the fact +217:56
EmilienMis gerrit only slow for me?17:56
clarkbEmilienM: it wasn't slow for me reviewing that change just now. Is it the web ui that is slow or pushing code? maybe both?17:56
EmilienMboth17:56
EmilienMit's probably canadian mega firewalls17:56
openstackgerritDavid Shrewsbury proposed openstack-infra/nodepool master: Fix paused handler exception handling  https://review.openstack.org/57551517:57
*** heyongli has quit IRC17:57
clarkbwe aren't garbage collecting according to melody17:57
*** heyongli has joined #openstack-infra17:57
*** ykarel_ has quit IRC17:58
clarkbmwhahaha: corvus I was also wrong about having 100GB available for the cache. We do have at least that but then htcacheclean is set to trim at 50GB because it lags behind apache17:58
clarkbwe may be able to increase that number, cache cleaning performance appears much better after we changed the number of levels in the cache17:58
clarkb80GB maybe17:58
clarkbmwhahaha: weshay|ruck if you can find the sha256sum of one of your images that should not have changed recently I can double check if it is in the cache too18:00
dhellmanncorvus : I stepped away from lunch, so I'm just coming back to your comment. I'll be happy to work with you on a refactoring once I have the job working.18:01
*** pcaruana has quit IRC18:01
clarkbdockerhub search doesn't work on sha256s apparently18:03
*** krtaylor has joined #openstack-infra18:04
*** jesslampe has joined #openstack-infra18:04
*** krtaylor has quit IRC18:06
openstackgerritMerged openstack-infra/zuul-jobs master: Add buildset-artifacts-location  https://review.openstack.org/53067918:06
clarkbthey don't print the sha256sum either18:06
*** pcaruana has joined #openstack-infra18:06
*** krtaylor has joined #openstack-infra18:07
*** heyongli has quit IRC18:07
clarkbaha as of docker1.10 the layers are sha256 addressable18:07
*** heyongli has joined #openstack-infra18:08
clarkband no longer 1:1 mapped with images18:08
clarkbstill would be nice to be able to look things up this way18:08
clarkb(but understand why it is more difficult if everyone shares a layer)18:08
openstackgerritEd Leafe proposed openstack-infra/project-config master: Rename the API-WG to API-SIG  https://review.openstack.org/57547818:08
clarkbmwhahaha: weshay|ruck but ya if you can docker inspect one of your images that hasn't updated recently then I can check for whether or not the layers are cached18:09
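[editor's note: a sketch of the inspect incantation being asked for; the image reference is a placeholder. Note that .RootFS.Layers are local uncompressed diff IDs, which may differ from the registry's compressed blob digests, while RepoDigests gives the manifest digest the registry is actually queried with]

```shell
# local layer diff IDs (may not match registry blob digests exactly)
docker inspect --format '{{range .RootFS.Layers}}{{println .}}{{end}}' <image:tag>

# manifest digest as pulled from the registry
docker inspect --format '{{index .RepoDigests 0}}' <image:tag>
```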
*** krtaylor has quit IRC18:11
*** owalsh has joined #openstack-infra18:12
*** aojea has quit IRC18:14
*** lpetrut has quit IRC18:14
*** eharney has quit IRC18:15
*** heyongli has quit IRC18:17
*** heyongli has joined #openstack-infra18:18
*** electrofelix has quit IRC18:19
*** felipemonteiro has quit IRC18:19
*** trown|lunch is now known as trown18:19
*** eharney has joined #openstack-infra18:20
openstackgerritBrianna Poulos proposed openstack-infra/openstack-zuul-jobs master: Remove glance legacy job  https://review.openstack.org/55101918:21
openstackgerritClark Boylan proposed openstack-infra/system-config master: Increase apache mirror cache to 70GB  https://review.openstack.org/57552018:23
clarkbcorvus: mwhahaha ^ something like that may also help18:23
*** sshnaidm is now known as sshnaidm|off18:26
*** e0ne has quit IRC18:27
*** heyongli has quit IRC18:28
*** r-daneel has quit IRC18:28
*** heyongli has joined #openstack-infra18:28
*** r-daneel has joined #openstack-infra18:28
*** yamamoto has joined #openstack-infra18:34
openstackgerritMerged openstack-infra/zuul master: Move zuul_log_id injection to command action plugin  https://review.openstack.org/57535118:36
openstackgerritMerged openstack-infra/zuul master: Fix log streaming for delegated hosts  https://review.openstack.org/57535218:36
openstackgerritMerged openstack-infra/zuul master: Revert "Temporarily override Ansible linear strategy"  https://review.openstack.org/57535318:36
openstackgerritMerged openstack-infra/zuul master: Remove extra argument when logging logger timeout  https://review.openstack.org/57535418:36
*** heyongli has quit IRC18:38
*** heyongli has joined #openstack-infra18:38
*** yamamoto has quit IRC18:39
*** germs has joined #openstack-infra18:43
*** germs has quit IRC18:43
*** germs has joined #openstack-infra18:43
*** germs has quit IRC18:47
*** heyongli has quit IRC18:48
*** heyongli has joined #openstack-infra18:48
*** yamamoto has joined #openstack-infra18:49
*** camunoz has quit IRC18:50
*** Goneri has quit IRC18:52
*** dave-mccowan has joined #openstack-infra18:52
*** yamamoto has quit IRC18:53
clarkbmwhahaha: corvus my reading of the cache timestamps is that we have plenty of older cached data in the cache. So increasing the size may not actually help18:54
*** camunoz has joined #openstack-infra18:55
clarkb(that implies we are keeping old data then checking if it has been modified since and not deleting it due to disk pressure)18:57
clarkbthe logging change from corvus should tell us more though18:57
*** yamamoto has joined #openstack-infra18:58
*** heyongli has quit IRC18:58
*** heyongli has joined #openstack-infra18:59
*** sthussey has quit IRC18:59
*** gfidente is now known as gfidente|afk19:00
dhellmanncorvus : I expected this patch to run the new test release job and I don't know why it didn't. Is there some way to ask zuul? https://review.openstack.org/#/c/574916/19:02
dhellmannoh, nevermind, I see why19:03
dhellmannmissing a depends-on19:03
*** yamamoto has quit IRC19:07
hogepodgeSorry for the silly question, but how do I switch a gate job from optional to required?19:08
*** heyongli has quit IRC19:09
clarkbhogepodge: depends on what you mean by optional. Is it currently non voting? or does it only run when certain files are modified?19:09
fungihogepodge: not silly, but i'm having trouble parsing it. is the job in question being run now but reported as "non-voting" with the result?19:09
*** heyongli has joined #openstack-infra19:09
*** eernst has quit IRC19:09
hogepodgeclarkb: fungi: I want to change the job loci-requirements from non-voting to voting https://review.openstack.org/#/c/575174/19:10
hogepodgeThe rest can remain non-voting19:10
*** eernst has joined #openstack-infra19:11
clarkbhogepodge: https://git.openstack.org/cgit/openstack/loci/tree/.zuul.d/base.yaml#n6 sets that voting value to false globally in the loci jobs. You can override that at https://git.openstack.org/cgit/openstack/loci/tree/.zuul.d/requirements.yaml#n12 to force that one job to be voting19:12
clarkbjust add a voting: True19:14
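[editor's note: concretely, something like the following in the job definition; only the voting line is the actual change, and the parent name is an assumption about the loci repo's layout]

```yaml
- job:
    name: loci-requirements
    parent: loci-base
    voting: true   # overrides the non-voting default inherited from the base job
```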
*** e0ne has joined #openstack-infra19:15
*** felipemonteiro has joined #openstack-infra19:15
hogepodgeThanks clarkb.19:15
*** yamamoto has joined #openstack-infra19:16
*** yamamoto has quit IRC19:16
*** eernst has quit IRC19:16
*** eernst has joined #openstack-infra19:16
*** heyongli has quit IRC19:19
*** heyongli has joined #openstack-infra19:19
*** sthussey has joined #openstack-infra19:23
clarkbany other infra-root willing to review https://review.openstack.org/#/c/575514/1 ?19:25
clarkbadds cache status logging to our caching mirror proxies19:25
*** bobh has joined #openstack-infra19:27
*** heyongli has quit IRC19:29
*** heyongli has joined #openstack-infra19:29
prometheanfiredhellmann: might as well ask here19:30
prometheanfirehow long is the hound re-index expected to take?19:31
clarkbprometheanfire: I want to say just a few minutes, like 5-1019:32
fungi2018/06/14 14:57:51 Rebuilding airship-drydock for b4a31a79de5096ffb39c21c52544da515a7e06be19:36
fungi2018/06/14 14:57:51 open /tmp/csearch078190157: too many open files19:36
fungifrom the end of /var/log/hound.log19:36
prometheanfireah19:36
fungilooks like it may have been stuck partway through a reindex for the past ~5.5 hours19:37
*** lifeless has quit IRC19:37
*** eharney has quit IRC19:37
*** heyongli has quit IRC19:39
*** salv-orlando has joined #openstack-infra19:40
*** heyongli has joined #openstack-infra19:40
dhellmannheh, I assumed it was just taking a long time because we have a lot of data19:41
fungilooking for how to kick it manually, though i want to say we've hit "too many open files" with hound in the past just don't remember if we made any adjustments which we've now started to outgrow19:43
mtreinishfungi: sounds like ulimit19:44
clarkbmtreinish: agreed19:44
*** salv-orlando has quit IRC19:44
*** eernst has quit IRC19:45
fungirestarting the hound service seems to be how we update it19:46
fungitailing the log now19:46
fungiand yeah, we likely made some ulimit changes in the initscript i'm just checking to see what19:47
fungiulimit -n 204819:47
fungiin the start function19:47
fungiso i guess we've outgrown it19:47
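[editor's note: i.e. the initscript fragment presumably looks something like this; 2048 is the value quoted above, the replacement value is a guess]

```sh
start() {
    # raise the per-process open file limit before launching hound;
    # 2048 has been outgrown, 8192 is an assumed new value
    ulimit -n 8192
    ...
}
```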
fungii'll get a patch together after my next conference call19:48
*** heyongli has quit IRC19:50
fungi2018/06/14 19:50:08 All indexes built!19:50
*** heyongli has joined #openstack-infra19:50
*** kjackal has quit IRC19:53
*** lifeless has joined #openstack-infra19:53
fungiquite a number of "too many open files" errors in hound.log going back at least a week (the extent of our log retention on it)19:53
*** jesslampe has quit IRC19:54
*** eernst has joined #openstack-infra19:54
*** eernst has quit IRC19:55
*** eernst has joined #openstack-infra19:55
clarkbAJaeger: dirk I managed to rescue my xmonad install on tumbleweed by using https://build.opensuse.org/project/show/devel:languages:haskell:lts:11 the joys of a rolling release I guess. I'm pinging you because I can't find why the xmonad packages were removed from the distro proper, any idea how I figure that out? I'd be willing to maintain packages if that is what is needed19:55
mtreinishclarkb: heh, its just not hipster enough for xmonad :)19:58
*** heyongli has quit IRC20:00
*** heyongli has joined #openstack-infra20:00
*** dpawlik has quit IRC20:03
*** agopi has quit IRC20:09
*** agopi has joined #openstack-infra20:09
*** heyongli has quit IRC20:10
*** heyongli has joined #openstack-infra20:10
*** iyamahat has quit IRC20:14
clarkbmtreinish: at this point it feels less hipster and silly old functional programming people related20:16
*** felipemonteiro has quit IRC20:16
clarkbwhich is fine by me20:16
*** yamamoto has joined #openstack-infra20:16
*** heyongli has quit IRC20:20
*** heyongli has joined #openstack-infra20:21
*** yamamoto has quit IRC20:22
*** e0ne_ has joined #openstack-infra20:26
*** e0ne has quit IRC20:29
*** iyamahat has joined #openstack-infra20:29
*** heyongli has quit IRC20:31
*** heyongli has joined #openstack-infra20:31
openstackgerritMerged openstack-infra/project-config master: Add xenial node for vexxhost kata testing  https://review.openstack.org/57550220:33
*** esarault has quit IRC20:34
mtreinishclarkb: heh, ok20:38
*** salv-orlando has joined #openstack-infra20:40
*** heyongli has quit IRC20:41
*** heyongli has joined #openstack-infra20:41
clarkbmtreinish: are you in the i3 camp? I seem to recall you had something going on your laptop too20:44
*** salv-orlando has quit IRC20:45
*** eernst has quit IRC20:45
mtreinishI run openbox on my laptop and desktop20:47
mtreinishI could get into the full tiling wm thing20:47
*** eernst has joined #openstack-infra20:47
*** eernst has joined #openstack-infra20:47
* fungi still uses raptioson on all his x11 sessions20:47
mtreinishI did want to put i3 on my gemini, but the linux side of that is still a mess so I couldnt get it working20:48
fungier, ratpoison20:48
fungino idea how my fingers turned that into raptioson20:48
zigoHi there !20:48
zigoAny idea why https://review.openstack.org/#/c/575168/ isn't launching its tests?20:48
zigoDid I do something wrong?20:48
fungizigo: http://zuul.openstack.org/ shows it enqueued in the check pipeline for right at 2 hours now20:50
zigofungi: So, it's just busy infra?20:50
fungiwe're just busy at the moment, yeah20:50
zigoOk.20:51
*** heyongli has quit IRC20:51
zigoI just found a wrong test in tempest...20:51
zigohttp://logs.openstack.org/62/575262/2/check/puppet-openstack-integration-4-scenario003-tempest-debian-stable/fb52e26/job-output.txt.gz#_2018-06-14_14_47_34_90877120:51
zigoLooks like " instead of ' ... :P20:51
fungilooks like there are changes just a few minutes ahead of it getting node assignments in check now, so it'll probably start running jobs momentarily20:51
*** heyongli has joined #openstack-infra20:51
zigoIf I fix that one, then I get a 2nd puppet-openstack scenario passing ! :)20:52
fungiexcellent20:52
zigofungi: https://review.openstack.org/#/c/575262/ <--- The first scenario (the one that everyone cares about) is now green there...20:52
*** dhajare has quit IRC20:53
zigoI'm not sure what's going on with fwaas though, but it's definitely the cause of the last issue.20:53
fungiand looks like 575168,5 has node assignments rolling in now20:57
*** caphrim007 has joined #openstack-infra20:58
openstackgerritMerged openstack-infra/grafyaml master: fix tox python3 overrides  https://review.openstack.org/57433321:00
*** eernst has quit IRC21:00
*** eernst has joined #openstack-infra21:00
ianwmordred: are you thinking the clouds.yaml would look like -> http://paste.openstack.org/show/723511/ ?21:00
ianwcause i still can't get it to automatically choose :/21:01
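The paste linked above is ephemeral, so purely as a hypothetical reconstruction: a multi-cloud clouds.yaml of the shape being discussed might look like the following (every name, URL and credential here is invented). With more than one cloud defined, openstacksdk-based tools generally will not pick one automatically; the cloud has to be named via OS_CLOUD or --os-cloud.

```yaml
# Hypothetical clouds.yaml; all values invented for illustration.
clouds:
  packethost:
    auth:
      auth_url: https://keystone.example.org/v3
      project_name: openstackci
      username: ci
      password: secret
    region_name: region-1
  othercloud:
    auth:
      auth_url: https://identity.example.net/v3
      project_name: ci
      username: ci
      password: secret
```

Selecting one then looks like `openstack --os-cloud packethost server list`, or `export OS_CLOUD=packethost` for the session.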
clarkbmnaser: I just pushed to my kata proxy PR so we should queue up a job on xenial21:01
ianwi feel slightly better that it was at least not trivially obvious21:01
*** heyongli has quit IRC21:01
*** heyongli has joined #openstack-infra21:02
mnaserclarkb: oh cool i'll keep an eye out21:02
*** eernst has quit IRC21:02
*** eernst has joined #openstack-infra21:06
*** gfidente|afk has quit IRC21:07
ianwok, i think it's the default we want21:08
ianwwhich leads to "The cloud returned multiple addresses, and none of them seem to work. That might be what you wanted, but we have no clue what's going on, so we just picked one at random"21:09
ianwwhich sounds like a mordred error message to me :)  but it comes up21:09
ianwfungi / clarkb : want to check out 147.75.38.146 for sanity as a testing node; if it's good we're closer to getting packethost up21:10
*** bobh has quit IRC21:10
*** eernst has quit IRC21:10
fungii can certainly reach it21:10
fungi75gb rootfs with 63gb available might be tight21:11
clarkbfungi: I think that is what we get in other clouds21:11
fungii thought it was 80, but maybe close enough21:12
*** SumitNaiksatam has quit IRC21:12
fungi4 vcpus21:12
*** heyongli has quit IRC21:12
*** heyongli has joined #openstack-infra21:12
ianwoohh, i might have selected m1.large actually21:12
ianwi think there's a zuul flavor21:12
clarkbfungi: I think once you account for GiB vs GB and fs overhead we end up with ~75GB usable then our git caches fill up a ton of space alongside the distro21:12
fungisure21:13
clarkbbut ya 4vcpus looks potentially short of what we want, but if we aren't oversubscribed or those cpus are fast it could be ok21:13
ianw548f2da8-edb4-440f-8f64-b661223f572c | zuul-flavor |  8192 |   80 |         0 |     8 | True21:13
clarkbosic was 4vcpu for a while21:13
ianwclarkb: ^ yep, the zuul flavor has 8, but is otherwise the same21:13
clarkbianw: that looks closer to what we want21:13
clarkbianw: maybe run tests on both flavors21:13
fungiand yeah, no other block devices besides the rootfs as far as i can see21:13
clarkbif 4vcpu is enough we use that otherwise use 821:13
ianwi can bring up a zuul flavor ... hang on21:13
*** iyamahat has quit IRC21:13
fungi/dev/vda1  *     2048 167772126 167770079  80G 83 Linux21:14
fungiyou're right clarkb21:14
fungiDisk /dev/vda: 80 GiB, 85899345920 bytes, 167772160 sectors21:15
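The 80-vs-~75 discrepancy above is just binary versus decimal units plus filesystem overhead; a quick Python sanity check on the fdisk numbers:

```python
# Arithmetic behind the "80 GiB" / "~75 GB usable" discussion above.
disk_bytes = 85_899_345_920             # from fdisk: "80 GiB, 85899345920 bytes"

gib = disk_bytes / 2**30                # binary gibibytes, as fdisk reports
gb = disk_bytes / 10**9                 # decimal gigabytes, as flavors are often quoted

print(f"{gib:.0f} GiB == {gb:.1f} GB")  # 80 GiB == 85.9 GB
```

Filesystem metadata and reserved blocks then eat a few more GB, which is how an 80 GiB disk ends up with roughly 75 GB actually available.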
*** ldnunes has quit IRC21:15
ianw147.75.38.147 is a zuul-flavor21:15
fungiyeah, looks basically the same except for double the vcpu count21:16
clarkbya I would say run some representative test on it (tempest has been good in the past) and we can compare them and go from there21:17
fungii suppose we can try m1.large and if jobs run too slowly we switch to the zuul flavor?21:17
clarkbfungi: in the past I've run tempest on flavors until we find one that works21:17
clarkbits a bit more difficult now that we don't have a reproduce.sh that works out of the box21:17
clarkbbut if you modify the zuul ref stuff you can get it to still function I think21:17
clarkb(you just point it at master)21:18
ianwwell we have enough quota to run 100 of the zuul flavor, so i'd say go with that21:18
clarkbianw: that works too21:18
ianwotherwise we run out of ram before cpu21:18
clarkbgood point21:18
clarkbin that case we can just test zuul-flavor21:18
clarkbmake sure it gets jobs done reasonably quickly and then add it to the pool21:18
openstackgerritMerged openstack-infra/system-config master: Add cache status into mirror access log  https://review.openstack.org/57551421:18
clarkbianw: I want to say if you take a current devstack-gate reproduce.sh and hack it to have a zuul cloner and update the zuul refs it will still work21:19
*** yamamoto has joined #openstack-infra21:19
*** roman_g has quit IRC21:19
zigofungi: The day we get our public cloud deployed, I'll make sure we give a few VMs to the infra. What's the requirement? Is there a minimum? I guess it's a nice stress test to run infra jobs, no? :)21:19
*** trown is now known as trown|outtypewww21:20
fungizigo: it's a very effective stress-test, we've been told21:20
zigo:)21:20
zigoAnd then we get free monitoring by real humans ... :P21:21
fungia minimum of 25 nodes worth of quota i think we've said in the past, but preferably around 100 nodes of quota or more21:21
fungi25 is where it starts becoming more work for us to track than we benefit from21:21
openstackgerritIan Wienand proposed openstack-infra/system-config master: Add default network to packethost  https://review.openstack.org/57554721:21
zigoOk.21:22
*** heyongli has quit IRC21:22
clarkbianw: also ansible-clouds.yaml or whatever that file is called21:22
zigofungi: And for us, it's just giving out a tenant, right?21:22
*** heyongli has joined #openstack-infra21:22
clarkbit is in the same dir as all-clouds.yaml21:22
fungizigo: basically, yes. details are at https://docs.openstack.org/infra/system-config/contribute-cloud.html21:23
*** eernst has joined #openstack-infra21:23
fungizigo: and if we can get 100 nodes of quota then we list the provider at https://www.openstack.org/foundation/companies/#infra-donors as well as on posters and slides in common areas and between talks at some of our conferences/events21:24
*** yamamoto has quit IRC21:24
openstackgerritIan Wienand proposed openstack-infra/system-config master: Add default network to packethost  https://review.openstack.org/57554721:25
ianwclarkb: ^ now with more overrides :)21:25
fungizigo: and we track and graph our interactions with per-provider dashboards like http://grafana.openstack.org/dashboard/db/nodepool-rackspace too so you can see what we're seeing21:25
clarkbianw: +2 thanks21:26
zigok21:26
*** eernst has quit IRC21:27
fungithe "time to ready" graph for example shows how long it's taking from when we put in a nova boot call until the node is available and reachable21:27
fungiand the various api operations graphs show response times for the api to get back to us with responses for each of those kinds of calls21:28
zigoWe're kind of planning on super fast infra...21:29
zigo4 x 10 Gbits bgp to the host...21:29
*** slaweq has quit IRC21:29
zigoThe only thing is that I haven't found so many docs on bgp-to-the-host setups.21:30
zigo:/21:30
*** slaweq has joined #openstack-infra21:30
*** eernst has joined #openstack-infra21:31
*** aeng has joined #openstack-infra21:32
*** heyongli has quit IRC21:32
*** heyongli has joined #openstack-infra21:32
fungithat would be pretty awesome. ibgp i'm assuming, not ebgp21:33
fungithough hopefully bgp621:33
fungii never really dealt with ibgp back in my provider days since we were pretty firmly entrenched in ospf (and some inherited eigrp) for our igp, though did a lot of ebgp at least21:35
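For anyone following along: BGP-to-the-host typically means running a routing daemon such as FRR or BIRD on each machine and peering with the top-of-rack switches. A hypothetical FRR bgpd fragment advertising a host route upstream might look like this; every ASN and address below is invented for illustration.

```
! Hypothetical FRRouting config; ASNs and addresses are illustrative only.
router bgp 65010
 neighbor 192.0.2.1 remote-as 65000
 !
 address-family ipv4 unicast
  network 198.51.100.10/32
 exit-address-family
```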
*** caphrim007 has quit IRC21:35
*** eernst has quit IRC21:35
*** eernst has joined #openstack-infra21:37
*** hasharAway has quit IRC21:37
*** camunoz has quit IRC21:38
fungii can imagine some very interesting global redundancy services you could sell to customers by advertising their same ip addresses from different servers in different facilities21:39
*** salv-orlando has joined #openstack-infra21:41
ianwfungi / clarkb : i'm feeling like if we merge the default network thing, we just hand-hack in a max servers of 5 or something to get a few jobs and see?21:41
clarkbianw: probably not the worst thing21:42
*** eernst has quit IRC21:42
*** heyongli has quit IRC21:42
*** heyongli has joined #openstack-infra21:43
*** eernst has joined #openstack-infra21:43
*** prometheanfire has quit IRC21:43
*** dave-mccowan has quit IRC21:45
*** diablo_rojo has quit IRC21:45
*** myoung is now known as myoung|off21:45
*** salv-orlando has quit IRC21:46
*** slaweq has quit IRC21:46
fungiwfm21:48
*** eernst has quit IRC21:48
openstackgerritEd Leafe proposed openstack-infra/project-config master: Migrate the API-SIG to StoryBoard  https://review.openstack.org/57512021:49
openstackgerritEd Leafe proposed openstack-infra/project-config master: Rename the API-WG to API-SIG  https://review.openstack.org/57547821:49
*** eernst has joined #openstack-infra21:49
*** lifeless has quit IRC21:49
*** sthussey has quit IRC21:49
*** lifeless has joined #openstack-infra21:50
*** eernst has quit IRC21:51
*** heyongli has quit IRC21:53
*** eernst has joined #openstack-infra21:53
*** heyongli has joined #openstack-infra21:53
*** edmondsw has quit IRC21:54
*** eernst has quit IRC21:55
*** eernst has joined #openstack-infra21:55
fungiwe're finally back down under 100 node requests21:59
fungier, 100021:59
*** diablo_rojo has joined #openstack-infra22:01
fungilooks like we're spending good chunks of time with no executors accepting, though we're managing to saturate our quotas so i don't guess we need more executors yet22:02
*** heyongli has quit IRC22:03
*** heyongli has joined #openstack-infra22:03
*** jbadiapa_ has joined #openstack-infra22:03
*** jesslampe has joined #openstack-infra22:04
*** jbadiapa has quit IRC22:06
openstackgerritColleen Murphy proposed openstack-infra/puppet-openstack_infra_spec_helper master: Use system-config script to install puppet  https://review.openstack.org/48194322:06
*** akhilaki has quit IRC22:11
*** heyongli has quit IRC22:13
*** heyongli has joined #openstack-infra22:14
ianwyeah a new cloud would be nice about now :)22:16
fungiheh22:16
fungishould we bypass check for 575547?22:16
johnsomI have a question about a gate run. We got ERROR Unable to find playbook , though it's there in the master branch but this patch was not rebased onto that yet.  Is it the case that the playbooks must be in patch history? It seems odd that it would find the job but not the playbook.22:18
johnsomThis is the patch: https://review.openstack.org/#/c/558962/22:18
*** iyamahat has joined #openstack-infra22:19
*** rcernin has joined #openstack-infra22:20
clarkbjohnsom: where is the change that added the job and the one that adds the playbook22:20
*** yamamoto has joined #openstack-infra22:20
johnsomclarkb https://review.openstack.org/54965422:20
ianwfungi: will the check actually check anything? i'm not sure we even run a syntax check over those files22:21
clarkbjohnsom: its possible the branches specifier may be confusing things since this is a branched repo22:22
corvusinfra-root: any objection to me restarting zuul now?  (full restart)22:23
fungiianw: other than maybe a yaml syntax check, doubtful. regardless we're just going to run all the same jobs again in the gate so skipping check doesn't lose us anything22:23
fungicorvus: no objection22:23
*** heyongli has quit IRC22:23
johnsomclarkb Did we do this wrong? branches: ^(?!stable/(ocata|queens)).*$  Both patches were on the master branch22:23
clarkbjohnsom: looking at time stamps it may also be a race in config generation possibly22:24
*** heyongli has joined #openstack-infra22:24
clarkbjohnsom: in general you don't need branches: specifiers on repos with branches. Instead you just have per branch configs on each branch22:24
*** e0ne_ has quit IRC22:24
clarkbjohnsom: this allows your config to evolve on master but be stable on stable branches just by virtue of git branch branching22:24
clarkbcorvus: not from me22:24
*** yamamoto has quit IRC22:25
clarkbjohnsom: the behavior of branches: can get really confusing as you branch over time and try to update things22:25
johnsomclarkb I wondered about that. So really we don't need those "branches" config lines?  That makes more sense to me really22:25
clarkbzuul applies it deterministically but it does so in a way that people don't expect unless they consider all branches together22:25
clarkbjohnsom: ya, instead you just configure the jobs you want in each branch's config22:26
johnsomclarkb Golden, thanks! I will clean those up at some point.22:26
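To illustrate the per-branch approach clarkb describes (the job name below is hypothetical):

```yaml
# On master, .zuul.yaml just lists the job with no branch matcher:
- project:
    check:
      jobs:
        - octavia-scenario-job   # hypothetical job name
# Each stable branch carries its own copy of .zuul.yaml, so
# stable/ocata and stable/queens simply omit the job instead of
# using a "branches: ^(?!stable/(ocata|queens)).*$" regex.
```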
*** boden has quit IRC22:27
*** hongbin has quit IRC22:27
clarkbjohnsom: as for the error you saw, I think it is possible that there was a race between merging trees for the executors and merging trees for the scheduler. The scheduler merged stuff and added the new job. executor merged stuff and didn't add new job and new playbook22:27
corvusclarkb: the executor should merge exactly what the scheduler did22:28
clarkbcorvus: https://review.openstack.org/#/c/558962/ last comment from zuul there is what we are discussing fwiw22:29
clarkbcorvus: there definitely appears to be a mismatch22:29
clarkbhttps://review.openstack.org/#/c/549654/37 is the change that added the job and playbook22:29
clarkbnot seeing anything other than the branches: specifiers that looks out of place. The paths seem to match up22:31
corvusclarkb: oh i understand what you're saying.  yes, there was likely a mismatch between the scheduler's config and the change itself.  if the repo state for the change was frozen by the scheduler, then the change that added the job landed, then we might expect to see this error.22:32
*** ociuhandu has joined #openstack-infra22:32
*** ociuhandu_ has quit IRC22:33
corvusclarkb: (the repo states are still consistent between what the scheduler originally merged for the change and what it executed.  the delta is between both of those states and the running config.  interestingly, a change which touches .zuul.yaml is probably immune to this fault)22:33
*** heyongli has quit IRC22:34
*** heyongli has joined #openstack-infra22:34
clarkbin that case I think the remedy for now is to recheck22:34
corvusyep22:34
clarkbjohnsom: ^22:34
*** nicolasbock has quit IRC22:34
johnsomOk22:35
corvusif we wanted to treat this as a bug, we might be able to fix it by resetting the buildset when jobs are added to a buildset, and those jobs are defined in a repo in the current dependency chain.22:36
*** iyamahat has quit IRC22:42
*** salv-orlando has joined #openstack-infra22:42
*** heyongli has quit IRC22:44
*** heyongli has joined #openstack-infra22:44
corvusokay, so i think what i'm going to do this time is shut the executors and mergers down, wait until they're stopped, then save queues and restart the scheduler (and friends), then start mergers and executors22:44
corvusi think that will minimize perceived downtime while avoiding the issue of the scheduler timing out during startup22:45
fungiwhat were the circumstances of the scheduler timeout last time?22:45
clarkbok, and full restart is for pre release burn in?22:45
corvusfungi: if it takes more than 5m for a merger to pick up an initial "cat" job, the scheduler will stop.22:46
*** salv-orlando has quit IRC22:46
*** jbadiapa_ has quit IRC22:46
fungiaha, right, and the mergers/executors were still all stopping when the scheduler got restarted?22:47
corvus(i could stop everything, then start the mergers along with the scheduler but not the executor.  i think that would work just as well)22:47
clarkbI think that is how I've done it before22:47
corvusyeah, i think fundamentally the mistake was not starting the mergers with the scheduler.  that's minimally required.  executors are optional.22:48
fungianyway, sounds safe enough. thanks for the reminder!22:48
*** boris_42_ has joined #openstack-infra22:48
corvusoh, actually, let me tweak this a bit22:49
fungiseparately, i wonder if it would make sense for the scheduler to bring a merger to the party the way each executor does22:49
corvushttps://etherpad.openstack.org/p/aIJSh0cGks22:49
*** tosky has quit IRC22:49
clarkbthat lgtm22:50
corvusokay, two procedures there.  i think they both would work (i tweaked my original suggestion slightly)22:52
corvuswith the first procedure, we still want to keep the mergers online as long as possible so that zuul can continue to process events and add new items/jobs to the queues22:53
clarkbprocedure 2 is the one I've used in the past and seemed to work well enough22:54
*** heyongli has quit IRC22:54
fungithe main difference with #2 is that the executors are continuing to stop while you're bringing everything else back up i guess?22:54
*** heyongli has joined #openstack-infra22:54
fungithat means less time wasted waiting since you can wait for them to stop while you wait for the config to get loaded by the scheduler?22:55
corvusyes, and the scheduler will come online slower with #2, which means more new events will end up first in the queue ahead of the things you saved22:55
fungigood point22:55
fungithose people just win the lottery, that's all ;)22:55
corvuswell, i think either way, the total time taken is going to be driven by the executor restart cycle22:55
*** threestrands has joined #openstack-infra22:56
*** threestrands has quit IRC22:56
*** threestrands has joined #openstack-infra22:56
*** dklyle has quit IRC22:56
corvusi think the event ordering is the only real substantial difference (even with only the mergers, the scheduler will come online before the executors are stopped)22:56
corvusi'm going to try #1 just to try it out22:56
fungiwfm22:57
*** dklyle has joined #openstack-infra22:57
clarkbsounds good22:57
*** threestrands has quit IRC22:57
corvusi gave a heads up to -release22:57
*** threestrands has joined #openstack-infra22:57
corvusstopping executors22:58
corvuswhile i'm looking at the post pipeline... clarkb you may be interested in reviewing https://review.openstack.org/57193222:59
fungioh! i didn't see that one come in23:00
clarkboh indeed.23:01
corvuswe've gone 6 years with only two pipeline managers.  it will be exciting to have a third :)23:01
*** jbadiapa_ has joined #openstack-infra23:02
*** heyongli has quit IRC23:04
*** heyongli has joined #openstack-infra23:05
pabelangerOooh, looking forward to ^23:11
clarkbcorvus: while you were restarting zuul I left some comments, one of which I think may need attention (at the very least to get the logging right)23:12
corvuslooks like all executors have stopped, so i'm proceeding now23:12
*** heyongli has quit IRC23:15
*** heyongli has joined #openstack-infra23:15
corvusokay, a downside of process #1 is extra nodepool thrashing23:15
corvusafter stopping the executors, the scheduler recycles all the nodes, but then once the scheduler restarts, they all have to be recycled again23:17
corvus(sorry cloud providers)23:17
corvusthat might be a reason to favor #223:17
fungiyeah23:17
*** yamamoto has joined #openstack-infra23:21
EmilienMis the queue going to come back?23:23
corvusfungi, clarkb: responded to comments; can you check me on that?23:23
fungiEmilienM: yes23:24
EmilienMok23:24
corvusEmilienM: yep, just started re-enqueuing (was waiting for zuul to come online)23:24
*** heyongli has quit IRC23:25
*** heyongli has joined #openstack-infra23:25
fungicorvus: thanks for the detailed response. makes sense and i hadn't considered how the queue ordering actually worked23:26
*** yamamoto has quit IRC23:26
corvusfungi: me neither -- at least not with respect to starvation23:26
fungiit's certainly an interesting dynamic23:27
fungion the other hand, at least in our model, the items should only enqueue as fast as new merges can trigger them23:27
corvusi think maybe a way things could be made more fair (for supercedent and dependent) would be to have the queue processor order the queues based on the enqueue time of the head of each queue.23:28
*** prometheanfire has joined #openstack-infra23:28
fungiyeah, i'm trying to think back to fair queuing algorithms i've had familiarity with in the past and whether any are relevant to this case23:29
fungithe packet forwarding field is full of fair queuing designs23:29
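corvus's suggestion can be sketched as a tiny head-of-queue ordering; the data structures below are toys for illustration, not Zuul's actual implementation:

```python
# Toy sketch: process shared change queues in order of the enqueue
# time of each queue's head item, so old heads aren't starved by
# whichever queue happens to come first in iteration order.
import heapq

def queue_order(queues):
    """Yield non-empty queues ordered by their head item's enqueue time."""
    heads = [(q[0]["enqueued"], i, q) for i, q in enumerate(queues) if q]
    heapq.heapify(heads)
    while heads:
        _, _, q = heapq.heappop(heads)
        yield q

queues = [
    [{"item": "A1", "enqueued": 30}],
    [{"item": "B1", "enqueued": 10}, {"item": "B2", "enqueued": 40}],
    [{"item": "C1", "enqueued": 20}],
]
print([q[0]["item"] for q in queue_order(queues)])  # ['B1', 'C1', 'A1']
```

The queue whose head has waited longest (B, enqueued at 10) gets processed first, regardless of its position in the list.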
clarkbcorvus: aha, I think that may be worth a comment as it's non-obvious if just looking at it like queue operations. Of course that could be the allergy headache talking23:30
clarkbcorvus: basically 0th and last are special23:30
fungiclarkb: my left ear has been stopped up since vancouver. i feel for you :/23:30
*** jesslampe has quit IRC23:30
clarkbfungi: the last few days have been really bad, yesterday was ok though. We had a reasonably dry May, then rain and now sun again and I think that triggered whatever my body hates to add pollen to the air23:31
clarkbthe rain then sun combo is so bad23:31
fungiunless you're grass23:31
clarkbI have a hunch its the what I think is lilac in the backyard23:32
openstackgerritJames E. Blair proposed openstack-infra/zuul master: Add supercedent pipeline manager  https://review.openstack.org/57193223:32
clarkbbut I don't want to take the tree out until I am sure because it's a nice tree23:32
*** heyongli has quit IRC23:35
*** heyongli has joined #openstack-infra23:35
*** r-daneel has quit IRC23:37
*** jesslampe has joined #openstack-infra23:39
*** lifeless_ has joined #openstack-infra23:42
*** salv-orlando has joined #openstack-infra23:43
*** lifeless has quit IRC23:43
*** heyongli has quit IRC23:45
*** heyongli has joined #openstack-infra23:46
*** markvoelker has quit IRC23:46
*** salv-orlando has quit IRC23:47
*** bobh has joined #openstack-infra23:52
*** caphrim007 has joined #openstack-infra23:52
*** rpioso is now known as rpioso|afk23:54
*** heyongli has quit IRC23:56
*** heyongli has joined #openstack-infra23:56
*** caphrim007 has quit IRC23:57
*** caphrim007 has joined #openstack-infra23:58

Generated by irclog2html.py 2.15.3 by Marius Gedminas - find it at mg.pov.lt!