fungi | on the other hand, there are also lots more of them, and it's a stateful service not a command-line utility | 00:00 |
---|---|---|
notmorgan | fungi: that metric makes me sad. "faster than nova's unit tests" is not the metric i'd measure things by | 00:00 |
notmorgan | :P | 00:00 |
*** dizquierdo has joined #openstack-infra | 00:01 | |
*** jamielennox|away is now known as jamielennox | 00:02 | |
*** rbrndt has quit IRC | 00:02 | |
*** camunoz has quit IRC | 00:02 | |
Sam-I-Am | hmm... is something wrong with r.o.o? | 00:03 |
Sam-I-Am | seems to be hanging, eventually returning service unavailable | 00:03 |
Sam-I-Am | oo, now a 500 error | 00:03 |
fungi | notmorgan: it was a... tongue-in-cheek metric | 00:03 |
fungi | Sam-I-Am: database backups kick off at midnight utc | 00:04 |
*** pahuang_ has quit IRC | 00:05 | |
notmorgan | fungi: >.> | 00:05 |
Sam-I-Am | fungi: ah ha | 00:06 |
*** krotscheck1 has quit IRC | 00:08 | |
*** Sukhdev has quit IRC | 00:09 | |
*** rere has joined #openstack-infra | 00:09 | |
*** rere has left #openstack-infra | 00:09 | |
openstackgerrit | Merged openstack-infra/nodepool: Add launch-timeout validator into config-validation. https://review.openstack.org/232974 | 00:10 |
*** Sukhdev has joined #openstack-infra | 00:10 | |
*** zz_dimtruck is now known as dimtruck | 00:11 | |
*** Daisy_ has joined #openstack-infra | 00:13 | |
*** Guest80711 is now known as med_ | 00:15 | |
*** med_ has quit IRC | 00:15 | |
*** med_ has joined #openstack-infra | 00:15 | |
*** dizquierdo has quit IRC | 00:15 | |
*** Daisy_ has quit IRC | 00:16 | |
openstackgerrit | Ian Wienand proposed openstack-infra/nodepool: Add ipv6-preferred into config-validation https://review.openstack.org/232981 | 00:17 |
*** camunoz has joined #openstack-infra | 00:19 | |
prometheanfire | neat, gate is stuck | 00:21 |
jeblair | prometheanfire: can you elaborate? | 00:21 |
*** pahuang_ has joined #openstack-infra | 00:22 | |
prometheanfire | nah, just noticed the wait time is almost 12hr now | 00:22 |
jeblair | prometheanfire: okay. the words you use have a specific meaning to some people that apparently is not the meaning you intended. :) | 00:22 |
prometheanfire | ah | 00:22 |
fungi | the gate is "busy" | 00:23 |
*** yamamoto_ has joined #openstack-infra | 00:24 | |
ianw | greghaynes: maybe it's obvious to everyone else looking at https://review.openstack.org/#/c/273703 , but i wouldn't mind some comments to convince me how you came to the conclusion this is the right thing to do... | 00:24 |
jeblair | words like 'stuck', 'wedged', 'completely broken' tend to make us drop what we're doing and fix things. words like 'slow because tests are flakey' not so much. | 00:24 |
*** thorst has joined #openstack-infra | 00:24 | |
*** abitha has joined #openstack-infra | 00:25 | |
*** diana_clarke has joined #openstack-infra | 00:25 | |
jeblair | fungi, whereiskrotscheckanyway: http://mirror.iad.rax.openstack.org/wheel/ exists | 00:25 |
fungi | i agree, and even more importantly, apache did not fall over! | 00:25 |
fungi | so we're safe to proceed | 00:25 |
nibalizer | jeblair: did you see my timing comparisons for infracloud this morning? | 00:26 |
jeblair | i'm not going to claim the lock on building servers right now if someone else wants to; but will do so later (possibly tomorrow) if no one else has | 00:26 |
jeblair | nibalizer: no | 00:26 |
nibalizer | 1883.0000 - 1144 = 739 | 00:27 |
jeblair | oh yeah i did see that | 00:27 |
jeblair | i recognize those numbers :) | 00:27 |
nibalizer | http://paste.openstack.org/show/485697/ | 00:27 |
*** sdake has joined #openstack-infra | 00:27 | |
nibalizer | 739 seconds slower, so like 12 minutes | 00:27 |
jeblair | nibalizer: what's 1144? | 00:29 |
jeblair | nibalizer: and is this ~= gate-tempest-dsvm-full ? | 00:29 |
nibalizer | you linked a console-log | 00:29 |
jeblair | nibalizer: yeah, but it was a random one just to show you where the number was | 00:29 |
nibalizer | from a random test and that timing was 1144 | 00:29 |
jeblair | i don't even know what job or branch it was :) | 00:30 |
*** yamamoto_ has quit IRC | 00:30 | |
openstackgerrit | greghaynes proposed openstack-infra/nodepool: Handle dib image deletion during periodic cleanup https://review.openstack.org/273703 | 00:30 |
*** yamamoto_ has joined #openstack-infra | 00:30 | |
greghaynes | ianw: hah, I was currently in the process of writing a new commit msg for that :) | 00:31 |
*** annegentle has joined #openstack-infra | 00:31 | |
jeblair | nibalizer: i'm looking at some recent nova runs now, and i think that may be well within our current variation | 00:31 |
nibalizer | http://logs.openstack.org/82/259082/5/check/gate-tempest-dsvm-full/3cbbf9b/console.html looks like the right job | 00:31 |
nibalizer | right job right branch | 00:31 |
*** thorst has quit IRC | 00:32 | |
nibalizer | jeblair: thats good news | 00:32 |
*** ybathia has joined #openstack-infra | 00:32 | |
jeblair | nibalizer: yeah, so we may be in the ovh/internap range but not as fast as rax | 00:32 |
nibalizer | can we take the exploratory step of setting up nodepool creds for omfra then? give it a node? start up a mirror? | 00:32 |
jeblair | nibalizer: yeah, i think so. | 00:33 |
*** jed56 has quit IRC | 00:33 | |
jeblair | nibalizer: here's a much slower internap node: http://logs.openstack.org/55/275155/1/gate/gate-tempest-dsvm-full/466a8de// | 00:33 |
nibalizer | :D | 00:33 |
crinkle | yay | 00:33 |
fungi | another infra-cloud | 00:33 |
fungi | success | 00:34 |
jeblair | nibalizer: and ovh looks closer to omfra at 1900 http://logs.openstack.org/41/269841/2/gate/gate-tempest-dsvm-full/fe70eca/console.html | 00:34 |
openstackgerrit | Joshua Harlow proposed openstack/requirements: Bump up eventlet to at least 0.18.1 https://review.openstack.org/275452 | 00:34 |
jeblair | i think 'close to ovh performance' is good :) | 00:34 |
*** yamamot__ has joined #openstack-infra | 00:34 | |
*** yamamoto_ has quit IRC | 00:34 | |
*** annegentle has quit IRC | 00:36 | |
* nibalizer will make with the patches | 00:37 | |
nibalizer | crinkle: any objection to me putting 15.184.52.4 controller01.hpuswest.ic.openstack.org into dns? | 00:38 |
fungi | i thought i already had? | 00:38 |
*** jamielennox is now known as jamielennox|away | 00:38 | |
crinkle | fungi: controller00 is in dns | 00:38 |
crinkle | nibalizer: there isn't anything on controller01 | 00:38 |
fungi | ah! right | 00:38 |
crinkle | afaik | 00:38 |
nibalizer | huh | 00:38 |
fungi | finally someone who, like me, enjoys starting host numbers at 0 and not 1 ;) | 00:39 |
clarkb | I do think we should try to get https://review.openstack.org/#/c/274821/1 as that seems to get stuff working on bluebox | 00:39 |
*** jamielennox|away is now known as jamielennox | 00:39 | |
clarkb | it is the switch to vxlan from gre | 00:39 |
crinkle | fungi: i think that started with SpamapS :P | 00:39 |
* fungi is not surprised in the least | 00:39 | |
nibalizer | ah yes I am using controller00 in all my openstack config files | 00:40 |
clarkb | bluebox floating ip listings look good still | 00:40 |
crinkle | these machines are almost definitely going to be moved and probably renumbered so i wouldn't spend a lot of effort dnsing them yet | 00:40 |
fungi | clarkb: seems a bit unfortunate to have to do that, but i guess it's an okay workaround while we try to get someone to figure out if bbox is just missing a conntrack plugin or something | 00:40 |
clarkb | I do not think it was a persistent issue whatever was causing the leak | 00:40 |
*** thorst has joined #openstack-infra | 00:40 | |
SpamapS | crinkle: aye, 'twas the only way I knew. :) | 00:41 |
clarkb | fungi: ya I don't want to have to wait on that if we can make it work with an acceptable alternative | 00:41 |
fungi | SpamapS: numbers start at 0 | 00:41 |
fungi | of course | 00:41 |
clarkb | fungi: I just keep looking at the health dashboard and bluebox fail rate is really high compared to everything else | 00:42 |
*** thorst has quit IRC | 00:42 | |
clarkb | a large chunk of that appears to be multinode | 00:42 |
openstackgerrit | Joshua Harlow proposed openstack/requirements: Bump up eventlet to at least 0.18.1 https://review.openstack.org/275452 | 00:42 |
*** lucasagomes has quit IRC | 00:42 | |
SpamapS | do the multiple nodes communicate via the floating ips? | 00:43 |
fungi | clarkb: it seems like gate-tempest-dsvm-multinode-full consistently failed that change. what's up with that, any idea? | 00:43 |
clarkb | SpamapS: yes, tempest sshes to the floating IPs and may have to do so to a VM on the other hypervisor | 00:43 |
clarkb | fungi: I think we are masking other issues with the consistent fail to ssh | 00:43 |
clarkb | fungi: it does pass on other regions | 00:43 |
SpamapS | That shouldn't be super terrible, but it is definitely susceptible to slowdowns on the l3 agent. | 00:43 |
clarkb | fungi: https://jenkins03.openstack.org/job/gate-tempest-dsvm-multinode-full/2689/console is running on ovh now and I expect that to pass | 00:44 |
clarkb | SpamapS: it all works great when the floating IP is local :) | 00:44 |
SpamapS | meaning a not-floating IP? | 00:44 |
clarkb | no a floating IP NAT'd on the local host | 00:45 |
clarkb | because then you just hit networking in the kernel and never jump machines and it just works (tm) | 00:45 |
*** alivigni has quit IRC | 00:45 | |
clarkb | but if that floating IP is NAT'd on the other hypervisor (as with nova net multihost or neutron DVR) you need a working overlay network | 00:45 |
SpamapS | oh, so some rule that DNAT's to 127.0.0.1? | 00:45 |
*** dchen has joined #openstack-infra | 00:46 | |
clarkb | SpamapS: floating IPs NAT 172.24.5.0/24 (floating IPs) to the libvirt 10.whatever addresses | 00:46 |
clarkb | if iptables is doing that on the same machine as tempest it works | 00:46 |
*** yamamot__ has quit IRC | 00:46 | |
SpamapS | Oh I see, it's the floating ip.. of the VM | 00:46 |
SpamapS | of the vm inside the vm | 00:46 |
clarkb | yes | 00:46 |
clarkb | ya | 00:46 |
SpamapS | https://www.youtube.com/watch?v=Wfg1c8dyZYM | 00:47 |
SpamapS | ^^ exactly how I feel right now | 00:47 |
*** lucasagomes has joined #openstack-infra | 00:49 | |
*** bpokorny_ has quit IRC | 00:50 | |
clarkb | there are many turtles involved | 00:51 |
clarkb | but tl;dr is outside of neutron we set up an overlay to route the floating IPs independent of the external world's networking | 00:51 |
*** bpokorny has joined #openstack-infra | 00:51 | |
clarkb | we have been using GRE but bluebox seems to filter those packets | 00:51 |
clarkb | vxlan on the other hand works except that tests still fail but they don't fail due to ssh | 00:51 |
SpamapS | filter, or MTU fail? | 00:51 |
clarkb | pretty sure it isn't an mtu fail because they are tiny pings | 00:52 |
*** ashleighfarnham has quit IRC | 00:53 | |
clarkb | and we have set mtus to 1450 | 00:53 |
clarkb | which should be plenty small for both gre and vxlan | 00:53 |
*** jpr has quit IRC | 00:54 | |
*** bhunter71 has quit IRC | 00:55 | |
SpamapS | weird | 00:55 |
greghaynes | Sounds like they have an explicit list of l4 protos allowed | 00:56 |
*** pvaneck has quit IRC | 00:56 | |
SpamapS | greghaynes the question is.. why? | 00:57 |
greghaynes | yep | 00:57 |
clarkb | I also confirmed that it wasn't just arp that was broken by manually populating arp tables | 00:57 |
*** mriedem_machell is now known as mriedem_afk | 00:58 | |
*** eil397 has quit IRC | 00:58 | |
*** ybathia has quit IRC | 01:01 | |
*** SumitNaiksatam has quit IRC | 01:02 | |
fungi | greghaynes: mgagne mentioned that if you're neutron-based then iptables explicitly wants to state-track all communication through the interfaces, but if you don't explicitly load the conntrack plugin for gre then it can't set up state and doesn't allow the responses back in | 01:02 |
openstackgerrit | Tony Breeds proposed openstack-infra/yaml2ical: Add functionality to batch the meetings list https://review.openstack.org/275459 | 01:02 |
greghaynes | fungi: hrm, we should see the first outbound packet get there then | 01:03 |
clarkb | greghaynes: not on our vm though | 01:04 |
clarkb | and I can't tcpdump the hypervisors | 01:04 |
greghaynes | oh | 01:04 |
greghaynes | turtles | 01:04 |
nibalizer | infra-root, so it looks like we have 2 tenants per cloud? one that has 'jenkins' in the name and is the nodepool, and one that does not and we use to hold mirrors etc? | 01:04 |
fungi | well, we might since there's no session establishment for gre? | 01:04 |
fungi | or... who knows | 01:04 |
fungi | nibalizer: correcto | 01:04 |
clarkb | nibalizer: correct | 01:04 |
nibalizer | woot | 01:04 |
*** rajinir has quit IRC | 01:05 | |
fungi | nibalizer: that way we don't have to entrust nodepoold with the keys to the tenant which houses non-nodepool-managed hosts | 01:05 |
jeblair | nibalizer: given a choice, we try to call them openstackjenkins and openstackci | 01:05 |
*** kzaitsev_mb has quit IRC | 01:05 | |
jeblair | and yeah, that's probably going to live long after we stop using jenkins :) | 01:05 |
nibalizer | heh | 01:05 |
fungi | at least it's not openstackhudson? | 01:06 |
jeblair | whew | 01:06 |
openstackgerrit | Tony Breeds proposed openstack-infra/irc-meetings: WIP: Try a boostrap grid to list all the meetings https://review.openstack.org/241522 | 01:06 |
openstackgerrit | Tony Breeds proposed openstack-infra/irc-meetings: Exploit the new batch_meetings function https://review.openstack.org/275461 | 01:06 |
openstackgerrit | Yushiro FURUKAWA proposed openstack-infra/project-config: Add python3-job and check-requirements for networking-fujitsu https://review.openstack.org/275462 | 01:06 |
jeblair | it barely missed being called that :) | 01:06 |
clarkb | fungi: we had one fail with volume backed live migration | 01:06 |
clarkb | doesn't appear to be network related? | 01:06 |
clarkb | I want to say that this is something that nova semi expects | 01:07 |
jeblair | fungi: did we decide on a name for the wheel mirror slave? | 01:07 |
jeblair | maybe we should call it wheel-mirror-trustyx64.slave.o.o to use the same shortened form as the afs volume | 01:08 |
fungi | jeblair: i thought krotscheck had theoretically come up with a scheme... he was asking earlier today about underscores (and i referred him to the appropriate rfcs on allowed characters in hostnames as a result) | 01:11 |
fungi | but i don't know what he arrived at, if anything | 01:12 |
*** Swami_ has quit IRC | 01:12 | |
fungi | jeblair: in the interest of maintaining my personal no-bikeshed zone, i'm inclined to just agree to your first suggestion so we can move along | 01:15 |
jeblair | ok. i'm still not starting the launch (and am in fact going afk), so it's still up for grabs :) | 01:17 |
jeblair | probably looking at tomorrow if it's me | 01:17 |
fungi | noted. i have beer here so likely not going to get to it until tomorrow either | 01:18 |
*** sputnik13 has quit IRC | 01:19 | |
* nibalizer prepping omfracloud patches | 01:21 | |
*** gongysh has joined #openstack-infra | 01:22 | |
*** pradk has quit IRC | 01:22 | |
*** sridhar_ram has quit IRC | 01:26 | |
*** abitha has quit IRC | 01:30 | |
*** jamielennox is now known as jamielennox|away | 01:30 | |
*** shashank_hegde has quit IRC | 01:32 | |
*** achanda has joined #openstack-infra | 01:32 | |
greghaynes | fungi: cheers | 01:35 |
*** thorst has joined #openstack-infra | 01:36 | |
*** Daisy has joined #openstack-infra | 01:38 | |
*** Sukhdev has quit IRC | 01:39 | |
*** flwang has quit IRC | 01:39 | |
*** jsavak has joined #openstack-infra | 01:40 | |
*** weshay has quit IRC | 01:40 | |
*** dimtruck is now known as zz_dimtruck | 01:40 | |
*** baoli has joined #openstack-infra | 01:42 | |
*** baoli has quit IRC | 01:43 | |
*** SumitNaiksatam has joined #openstack-infra | 01:44 | |
*** thorst has quit IRC | 01:44 | |
*** thorst has joined #openstack-infra | 01:45 | |
*** mtanino has quit IRC | 01:45 | |
*** jsavak has quit IRC | 01:45 | |
*** jsavak has joined #openstack-infra | 01:45 | |
*** zz_dimtruck is now known as dimtruck | 01:47 | |
*** krotscheck1 has joined #openstack-infra | 01:50 | |
*** jpr has joined #openstack-infra | 01:52 | |
*** flwang has joined #openstack-infra | 01:53 | |
*** thorst has quit IRC | 01:53 | |
openstackgerrit | Merged openstack-infra/nodepool: Add ipv6-preferred into config-validation https://review.openstack.org/232981 | 01:57 |
openstackgerrit | Paul Belanger proposed openstack-infra/project-config: Always clone openstack/windmill for jobs https://review.openstack.org/275471 | 01:58 |
*** ok_delta has joined #openstack-infra | 01:59 | |
*** ok___delta has joined #openstack-infra | 01:59 | |
*** ybathia has joined #openstack-infra | 02:00 | |
*** ybathia has quit IRC | 02:01 | |
*** rossella_s has quit IRC | 02:02 | |
*** rossella_s has joined #openstack-infra | 02:03 | |
*** sdake has quit IRC | 02:05 | |
*** ok___delta has quit IRC | 02:07 | |
*** ok_delta has quit IRC | 02:07 | |
*** camunoz has quit IRC | 02:08 | |
*** salv-orl_ has joined #openstack-infra | 02:09 | |
*** pahuang_ has quit IRC | 02:10 | |
*** bpokorny has quit IRC | 02:11 | |
*** jpr has quit IRC | 02:11 | |
openstackgerrit | Merged openstack-infra/irc-meetings: Remove glance-drivers meeting https://review.openstack.org/275209 | 02:11 |
*** salv-orlando has quit IRC | 02:12 | |
* nibalizer adds omfracloud credentials to hiera | 02:14 | |
*** EricGonczer_ has joined #openstack-infra | 02:14 | |
nibalizer | (in other words, I am taking the hiera lock) | 02:14 |
openstackgerrit | Merged openstack-infra/irc-meetings: Move Glance artifacts meeting https://review.openstack.org/274806 | 02:15 |
nibalizer | and done | 02:17 |
openstackgerrit | Tony Breeds proposed openstack-infra/irc-meetings: Change chair for Glance's meeting https://review.openstack.org/275210 | 02:18 |
*** tiswanso has joined #openstack-infra | 02:19 | |
*** tiswanso has quit IRC | 02:19 | |
*** tiswanso has joined #openstack-infra | 02:20 | |
*** camunoz has joined #openstack-infra | 02:20 | |
*** zhurong has joined #openstack-infra | 02:21 | |
*** Jeffrey4l has joined #openstack-infra | 02:22 | |
openstackgerrit | Paul Belanger proposed openstack-infra/project-config: Always clone openstack/windmill for jobs https://review.openstack.org/275471 | 02:23 |
*** pahuang_ has joined #openstack-infra | 02:23 | |
openstackgerrit | Merged openstack-infra/irc-meetings: Change chair for Glance's meeting https://review.openstack.org/275210 | 02:24 |
*** jsavak has quit IRC | 02:24 | |
*** amotoki has joined #openstack-infra | 02:24 | |
*** jsavak has joined #openstack-infra | 02:24 | |
angdraug | AJaeger: the promised plugins post: http://lists.openstack.org/pipermail/openstack-dev/2016-February/085636.html | 02:29 |
openstackgerrit | Spencer Krum proposed openstack-infra/system-config: Add OmfraCloud to puppetmaster_clouds https://review.openstack.org/275477 | 02:30 |
*** jamielennox|away is now known as jamielennox | 02:31 | |
fungi | clarkb: oh hey! see that update on the osic environment being ready by friday? | 02:31 |
clarkb | fungi: I hadn't but will check email | 02:32 |
*** kzaitsev_mb has joined #openstack-infra | 02:32 | |
fungi | not urgent. nothing new needed from us | 02:32 |
*** angdraug has quit IRC | 02:32 | |
*** woodster_ has joined #openstack-infra | 02:37 | |
*** tphummel has quit IRC | 02:37 | |
*** Somay has joined #openstack-infra | 02:37 | |
*** EricGonczer_ has quit IRC | 02:37 | |
*** EricGonczer_ has joined #openstack-infra | 02:37 | |
*** jsavak has quit IRC | 02:38 | |
*** Thelo has quit IRC | 02:39 | |
*** dims_ has quit IRC | 02:40 | |
*** Apoorva has quit IRC | 02:41 | |
*** dimtruck is now known as zz_dimtruck | 02:41 | |
*** Jeffrey4l has quit IRC | 02:42 | |
*** zz_dimtruck is now known as dimtruck | 02:42 | |
*** FallenPegasus has joined #openstack-infra | 02:43 | |
*** Somay has quit IRC | 02:43 | |
*** EricGonczer_ has quit IRC | 02:44 | |
*** yamamoto has joined #openstack-infra | 02:46 | |
*** yamamoto has quit IRC | 02:49 | |
*** thorst has joined #openstack-infra | 02:52 | |
*** bhunter71 has joined #openstack-infra | 02:54 | |
*** sam_wan has joined #openstack-infra | 02:54 | |
*** thorst has quit IRC | 02:58 | |
sam_wan | hello, someone please tell me the email alias of openstack-infra team | 02:58 |
sam_wan | I need to enable my CI account | 02:59 |
sam_wan | thanks | 02:59 |
*** flwang1 has left #openstack-infra | 03:01 | |
*** sdake has joined #openstack-infra | 03:06 | |
*** links has joined #openstack-infra | 03:06 | |
nibalizer | sam_wan: what ci system do you run | 03:08 |
sam_wan | EMC isilon | 03:09 |
tonyb | Did jenkins get confused earlier today? All of the reviews against nova kilo went into merge conflict at the same time. | 03:10 |
tonyb | It's not impossible but it is improbable, and the 2 I've checked manually just fast-forward | 03:11 |
*** kzaitsev_mb has quit IRC | 03:11 | |
*** Daisy has quit IRC | 03:11 | |
*** aeng has joined #openstack-infra | 03:12 | |
sam_wan | nibalizer: do you know who i should contact? | 03:12 |
nibalizer | sam_wan: I can do it hold on | 03:14 |
tonyb | sam_wan: email openstack-infra@lists.openstack.org with all the details | 03:14 |
nibalizer | why was your system shutdown? | 03:14 |
sam_wan | nibalizer: thanks | 03:14 |
nibalizer | sam_wan: i'm not finding the email we sent out when disabling your account | 03:15 |
nibalizer | do you have it handy? | 03:15 |
sam_wan | it's disabled by Jim because we checked devstack-gate project | 03:15 |
sam_wan | I've altered my zuul configuration so we don't check devstack-gate again | 03:16 |
nibalizer | this review | 03:16 |
nibalizer | https://review.openstack.org/#/c/188436/ | 03:16 |
nibalizer | ? | 03:16 |
sam_wan | I don't mean a specific patch, i enabled check for devstack-gate project before | 03:18 |
sam_wan | because I once uploaded a patch for devstack-gate and I thought maybe I need one of our CI's to check it | 03:19 |
*** yuanying_ has quit IRC | 03:20 | |
sam_wan | I've got an email from 'James E. Blair <corvus@inaugust.com>' on 1/29 | 03:20 |
openstackgerrit | Joshua Hesketh proposed openstack-infra/zuul: Fix memory leak reloading triggers https://review.openstack.org/275483 | 03:22 |
jhesketh | jeblair, fungi: ^ that should fix our memory leaks | 03:22 |
nibalizer | sam_wan: ok | 03:22 |
sam_wan | thanks | 03:22 |
nibalizer | I will re-enable your account | 03:22 |
jhesketh | sorry it took so long to track down! | 03:22 |
nibalizer | what is the gerrit name? | 03:23 |
nibalizer | hi jhesketh | 03:23 |
sam_wan | emc.isilon.ci@emc.com | 03:23 |
jhesketh | Hey nibalizer | 03:23 |
sam_wan | thanks nibalizer | 03:23 |
nibalizer | how are you? are you at lca? | 03:24 |
jhesketh | nibalizer: I'm good thanks. Indeed I am, currently sitting in a talk no less ;-) | 03:24 |
bkero | lca++ | 03:24 |
bkero | nibalizer: we should go next year. It's in Tasmania! | 03:24 |
*** yuanying has joined #openstack-infra | 03:24 | |
mtreinish | jhesketh: learning about power? | 03:25 |
jhesketh | mtreinish: trying ;-) | 03:25 |
jhesketh | bkero: that's my home town :-) | 03:25 |
bkero | You mean Hobart? | 03:25 |
openstackgerrit | Merged openstack/requirements: Exclude xvfbwrapper 0.2.8 from global requirements https://review.openstack.org/275155 | 03:26 |
fungi | jhesketh: convenient ;) | 03:26 |
jhesketh | bkero: correct | 03:26 |
nibalizer | taking the hiera lock again | 03:26 |
fungi | also, nice find on tracking down the leak! | 03:26 |
jhesketh | thanks | 03:26 |
jhesketh | it's a trivial fix, but was less trivial to find | 03:26 |
fungi | those sorts of bugs often are | 03:27 |
jhesketh | although we should probably see how it goes before saying it's fixed | 03:27 |
bkero | jhesketh: Nice. I've been to a few LCAs, but skipped this year and hoped to double down for LCA Hobart next year. :) | 03:27 |
nibalizer | lock released | 03:27 |
openstackgerrit | Spencer Krum proposed openstack-infra/system-config: Add omfracloud to nodepool https://review.openstack.org/275485 | 03:28 |
jhesketh | bkero: cool, hope to see you there then! | 03:28 |
*** aeng has quit IRC | 03:28 | |
*** gyee has quit IRC | 03:28 | |
*** camunoz has quit IRC | 03:29 | |
bkero | jhesketh: Did you take the ferry over from Devonport? | 03:30 |
jhesketh | bkero: nah I flew over.. it's much quicker and in some ways cheaper | 03:31 |
jhesketh | particularly since I don't need a car here | 03:31 |
bkero | Yeah, I guess the only reason you'd do it is car. | 03:31 |
bkero | But having a car in Geelong is probably a good thing. I remember I was thankful to have a car in Ballarat. | 03:31 |
jhesketh | bkero: the campus is in the city so it's not quite the same as ballarat | 03:32 |
jhesketh | the student accom is a while away and a car there would probably be handy, but there are buses | 03:32 |
jhesketh | and I'm staying in the city anyway | 03:32 |
bkero | jhesketh: My partner is there, and I hear the accom is like a 2 hour walk away. | 03:32 |
bkero | I told them to stay in the student housing, and the first two nights they didn't have hot water. I'm kind of in hot water over that. -_- | 03:33 |
bkero | I thought Geelong was supposed to be hosted at some kind of Casino, not a Uni like usual | 03:33 |
jhesketh | yep, it's not perfect but there are buses on loops | 03:33 |
*** pahuang_ has quit IRC | 03:33 | |
jhesketh | oh I didn't hear about no hot water | 03:33 |
jhesketh | that sucks :-( | 03:33 |
bkero | Super grumpy, especially since one has to be up long beforehand to take the bus to the venue | 03:34 |
jhesketh | bkero: Hobart will be at a convention centre at a hotel which happens to have a casino | 03:34 |
jhesketh | ie next year | 03:34 |
bkero | Oooh, that's the casino thing. Thanks for setting me straight. | 03:34 |
*** jamespd_ is now known as jamespd | 03:34 | |
*** doug-fish has quit IRC | 03:34 | |
*** yamamoto has joined #openstack-infra | 03:34 | |
*** doug-fish has joined #openstack-infra | 03:35 | |
jlvillal | tonyb, I changed title on: https://review.openstack.org/#/c/241522 So it no longer says WIP :) | 03:35 |
*** sam_wan has quit IRC | 03:35 | |
*** doug-fish has quit IRC | 03:36 | |
tonyb | jlvillal: okay. | 03:36 |
tonyb | jlvillal: I was hoping that things would "just happen" but .... | 03:37 |
tonyb | jlvillal: anyway things are somewhat moving now | 03:37 |
*** sam_wan has joined #openstack-infra | 03:37 | |
jlvillal | tonyb, I think it looks better and other people seem to agree. | 03:37 |
jlvillal | Maybe someone can make it look even better after this :) | 03:37 |
tonyb | jlvillal: Maybe, TBH I find the whole page too busy so rather than look for what I need I just do in page search | 03:38 |
tonyb | jlvillal: did you see my alternate sort version? | 03:38 |
jlvillal | tonyb, I should probably do that | 03:38 |
*** gyee has joined #openstack-infra | 03:38 | |
jlvillal | I have not | 03:38 |
*** pavel_bondar has quit IRC | 03:39 | |
tonyb | jlvillal: http://bakeyournoodle.com/~tony/OpenStack_Meetings-index.html | 03:39 |
nibalizer | sam_wan: i think i have reactivated your account | 03:39 |
tonyb | jlvillal: it's basically your work I just altered the way the lists are generated | 03:39 |
sam_wan | hi nibalizer, yes, thanks a lot | 03:39 |
nibalizer | cool | 03:40 |
jlvillal | tonyb, Looks good to me. Oh great. I like going down the columns! | 03:40 |
jlvillal | tonyb, I didn't know how to do that. And didn't spend time trying to figure it out. | 03:40 |
jlvillal | I like that much better. Maybe only downside is maybe on phones. But not sure. | 03:41 |
tonyb | jlvillal: https://review.openstack.org/#/c/275459/1/yaml2ical/index.py that's how you do it :) | 03:41 |
jlvillal | :) | 03:41 |
tonyb | jlvillal: If you're using that page on a phone you've already lost ;P | 03:41 |
tonyb | jlvillal: I *wanted* to do it all in jinja2 but that'll do | 03:42 |
*** shashank_hegde has joined #openstack-infra | 03:42 | |
jlvillal | tonyb, I like it. Thanks | 03:42 |
tonyb | jlvillal: at least it isn't tightly coupled | 03:42 |
*** camunoz has joined #openstack-infra | 03:42 | |
tonyb | jlvillal: we'll see if ttx agrees. | 03:42 |
jlvillal | :D | 03:42 |
tonyb | jlvillal: If he does there will probably be some rebasing etc but I'll do that for you. | 03:43 |
jlvillal | Thank you. | 03:43 |
*** sam_wan has quit IRC | 03:43 | |
*** maishsk has quit IRC | 03:43 | |
*** sam_wan has joined #openstack-infra | 03:44 | |
*** unicell1 has quit IRC | 03:46 | |
*** sam_wan has quit IRC | 03:46 | |
openstackgerrit | Spencer Krum proposed openstack-infra/project-config: Use omfracloud in nodepool https://review.openstack.org/275491 | 03:46 |
*** esp_ has joined #openstack-infra | 03:47 | |
*** deva_ has joined #openstack-infra | 03:47 | |
*** hichihara has joined #openstack-infra | 03:47 | |
*** tphummel has joined #openstack-infra | 03:48 | |
*** NobodyCa1 has joined #openstack-infra | 03:48 | |
*** pahuang_ has joined #openstack-infra | 03:49 | |
openstackgerrit | Spencer Krum proposed openstack-infra/system-config: Add omfracloud to nodepool https://review.openstack.org/275485 | 03:50 |
*** esp_ has quit IRC | 03:51 | |
*** pavel_bondar has joined #openstack-infra | 03:51 | |
*** deva_ has quit IRC | 03:53 | |
*** hdd has quit IRC | 03:53 | |
*** NobodyCa1 has quit IRC | 03:53 | |
*** mrmartin has joined #openstack-infra | 03:53 | |
*** thorst has joined #openstack-infra | 03:56 | |
openstackgerrit | Jamie Lennox proposed openstack-infra/project-config: Run the identity v3 only jobs as part of integrated gate https://review.openstack.org/271128 | 03:59 |
*** gongysh has quit IRC | 04:00 | |
*** FallenPegasus has quit IRC | 04:00 | |
*** thorst has quit IRC | 04:03 | |
*** yuanying has quit IRC | 04:06 | |
*** gyee has quit IRC | 04:07 | |
*** yuanying has joined #openstack-infra | 04:07 | |
*** kzaitsev_mb has joined #openstack-infra | 04:08 | |
*** jamielennox is now known as jamielennox|away | 04:09 | |
*** sdake has quit IRC | 04:10 | |
*** deepakcs has joined #openstack-infra | 04:21 | |
*** mriedem_afk has quit IRC | 04:22 | |
deepakcs | sdague: Hi, reg: https://review.openstack.org/273326 | 04:23 |
deepakcs | sdague: is it possible that the reason for the higher fail rate is because the job also runs against the plugin project, and might fail for patches against them as well | 04:23 |
cody-somerville | tonyb: You could use some CSS so that the long strings are truncated (nicely) and don't cause that ugly irregular whitespace. | 04:24 |
cody-somerville | also, alternating row colouring along with a little padding might also make it easier to read, allowing for one to easily keep track as they eye vertically along the dataset. | 04:26 |
*** achanda has quit IRC | 04:28 | |
*** dimtruck is now known as zz_dimtruck | 04:29 | |
*** Daisy has joined #openstack-infra | 04:38 | |
*** FallenPegasus has joined #openstack-infra | 04:41 | |
*** achanda has joined #openstack-infra | 04:43 | |
*** Daisy has quit IRC | 04:43 | |
*** amotoki has quit IRC | 04:57 | |
*** kzaitsev_mb has quit IRC | 04:58 | |
*** maishsk has joined #openstack-infra | 04:59 | |
*** tphummel has quit IRC | 05:00 | |
*** camunoz has quit IRC | 05:01 | |
*** thorst has joined #openstack-infra | 05:01 | |
*** dchen has quit IRC | 05:01 | |
*** esker has joined #openstack-infra | 05:06 | |
*** thorst has quit IRC | 05:08 | |
*** esker has quit IRC | 05:11 | |
*** camunoz has joined #openstack-infra | 05:13 | |
*** dchen has joined #openstack-infra | 05:15 | |
*** abitha has joined #openstack-infra | 05:16 | |
*** amotoki has joined #openstack-infra | 05:18 | |
*** FallenPegasus has quit IRC | 05:18 | |
*** Sukhdev has joined #openstack-infra | 05:23 | |
*** dchen has quit IRC | 05:24 | |
*** rguillebert has quit IRC | 05:27 | |
*** amotoki_ has joined #openstack-infra | 05:30 | |
*** amotoki has quit IRC | 05:33 | |
*** vgridnev has joined #openstack-infra | 05:44 | |
*** maishsk has quit IRC | 05:46 | |
*** abitha has quit IRC | 05:47 | |
openstackgerrit | Merged openstack/requirements: bump python-designateclient to 2.0.0 https://review.openstack.org/273681 | 05:47 |
*** jtomasek__ has joined #openstack-infra | 05:51 | |
*** amotoki_ has quit IRC | 05:52 | |
*** kzaitsev_mb has joined #openstack-infra | 05:55 | |
*** SumitNaiksatam has quit IRC | 05:57 | |
*** SumitNaiksatam has joined #openstack-infra | 05:57 | |
*** Jeffrey4l has joined #openstack-infra | 06:02 | |
*** rossella_s has quit IRC | 06:02 | |
*** rossella_s has joined #openstack-infra | 06:03 | |
*** amotoki has joined #openstack-infra | 06:05 | |
*** thorst has joined #openstack-infra | 06:07 | |
*** mrmartin has quit IRC | 06:09 | |
*** mrmartin has joined #openstack-infra | 06:09 | |
*** armax_ has joined #openstack-infra | 06:11 | |
*** unicell has joined #openstack-infra | 06:11 | |
*** armax has quit IRC | 06:12 | |
*** armax_ is now known as armax | 06:12 | |
*** jtomasek__ has quit IRC | 06:13 | |
*** thorst has quit IRC | 06:13 | |
*** mrmartin has quit IRC | 06:15 | |
*** mrmartin has joined #openstack-infra | 06:16 | |
*** abregman has joined #openstack-infra | 06:16 | |
*** woodster_ has quit IRC | 06:16 | |
*** mrmartin has quit IRC | 06:16 | |
*** FallenPegasus has joined #openstack-infra | 06:17 | |
*** amotoki has quit IRC | 06:18 | |
*** FallenPegasus has quit IRC | 06:22 | |
*** HeOS has joined #openstack-infra | 06:24 | |
*** SumitNaiksatam has quit IRC | 06:24 | |
*** abregman has quit IRC | 06:25 | |
*** abregman has joined #openstack-infra | 06:25 | |
*** amotoki has joined #openstack-infra | 06:26 | |
*** jaypipes has quit IRC | 06:26 | |
*** SumitNaiksatam has joined #openstack-infra | 06:28 | |
openstackgerrit | Masahito Muroi proposed openstack-infra/project-config: Adds a devstack test job using Congress new architecture https://review.openstack.org/275514 | 06:28 |
openstackgerrit | Masahito Muroi proposed openstack-infra/project-config: Adds a tox test job for Congress new architecture https://review.openstack.org/275515 | 06:28 |
*** eliqiao has quit IRC | 06:31 | |
*** eliqiao has joined #openstack-infra | 06:32 | |
*** zul has joined #openstack-infra | 06:33 | |
*** sflanigan has joined #openstack-infra | 06:34 | |
*** sflanigan has joined #openstack-infra | 06:34 | |
*** dkehn has quit IRC | 06:43 | |
*** amotoki has quit IRC | 06:46 | |
*** oomichi has joined #openstack-infra | 06:47 | |
*** dkehn has joined #openstack-infra | 06:49 | |
*** zul has quit IRC | 06:51 | |
*** armax has quit IRC | 06:52 | |
openstackgerrit | OpenStack Proposal Bot proposed openstack-infra/project-config: Normalize projects.yaml https://review.openstack.org/275527 | 06:54 |
openstackgerrit | OpenStack Proposal Bot proposed openstack/requirements: Updated from generate-constraints https://review.openstack.org/273907 | 06:58 |
openstackgerrit | Andreas Jaeger proposed openstack-infra/project-config: Cleanup translation scripts https://review.openstack.org/275225 | 07:06 |
openstackgerrit | Andreas Jaeger proposed openstack-infra/project-config: Re-enable repos for translation https://review.openstack.org/275140 | 07:06 |
openstackgerrit | Andreas Jaeger proposed openstack-infra/project-config: Re-Enable django_openstack_auth/designate-dashboard translations https://review.openstack.org/274832 | 07:06 |
*** YorikSar_ is now known as YorikSar | 07:06 | |
*** amotoki has joined #openstack-infra | 07:06 | |
*** gildub has quit IRC | 07:08 | |
*** kzaitsev_mb has quit IRC | 07:08 | |
openstackgerrit | Mark McLoughlin proposed openstack-infra/zuul: Fix tiny typo in scheduler comment https://review.openstack.org/275531 | 07:09 |
*** esker has joined #openstack-infra | 07:11 | |
*** thorst has joined #openstack-infra | 07:13 | |
AJaeger | amotoki: could you look at https://jenkins.openstack.org/job/glance-propose-translation-update/135/console , please? | 07:14 |
*** infra-red has joined #openstack-infra | 07:14 | |
openstackgerrit | Ian Wienand proposed openstack-infra/puppet-openstackci: nodepool : add flag to install diskimage-builder from git https://review.openstack.org/275535 | 07:15 |
*** esker has quit IRC | 07:16 | |
*** lbragstad has quit IRC | 07:16 | |
AJaeger | jeblair: we're soon publishing *all* documents every time now - the building of only a few documents was an optimization we did for DocBook building (and glossary is last XML in openstack-doc, we can build it every time if that helps). RST building is faster, so we build all... So, option 1 is the one to go... | 07:16 |
openstackgerrit | venkatamahesh proposed openstack-infra/project-config: Fix some word spellings https://review.openstack.org/275536 | 07:17 |
*** fungi has quit IRC | 07:17 | |
*** thorst has quit IRC | 07:18 | |
AJaeger | amotoki: same problem: https://jenkins.openstack.org/job/glance-upstream-translation-update/460/console - is setup.cfg broken? | 07:19 |
*** yamahata has quit IRC | 07:19 | |
*** diana_clarke has quit IRC | 07:19 | |
amotoki | AJaeger: looking | 07:20 |
AJaeger | thanks, amotoki | 07:20 |
openstackgerrit | Ian Wienand proposed openstack-infra/system-config: Use diskimage-builder from git for nodepool https://review.openstack.org/275006 | 07:21 |
*** lbragstad has joined #openstack-infra | 07:21 | |
ianw | pabelanger: ^ yeah, prior change was wrong ... i think i untangled it! | 07:22 |
AJaeger | ianw, yolanda: Could you put the stack starting at https://review.openstack.org/274832 on your review list for today/tomorrow, please? | 07:22 |
*** pahuang_ has quit IRC | 07:23 | |
AJaeger | ianw: that's a simple but important change - and more comments than code ;) | 07:23 |
*** kushal has joined #openstack-infra | 07:24 | |
amotoki | AJaeger: glance setup.cfg does not contain [files] packages entry | 07:24 |
AJaeger | amotoki: so, make the script more robust *and* fix glance? | 07:25 |
yolanda | hi AJaeger, will take a look today | 07:25 |
AJaeger | thanks, yolanda | 07:25 |
amotoki | AJaeger: I think so too. | 07:25 |
amotoki | AJaeger: i will send infra patch soon. | 07:25 |
AJaeger | same problem for glance_store I guess. Then let me tackle those two setup.cfgs... | 07:26 |
AJaeger | amotoki: ironic-ui is treated as python project: https://jenkins.openstack.org/job/ironic-ui-propose-translation-update/7/console - could you check that one as well, please? | 07:26 |
AJaeger | Thanks! | 07:26 |
*** _nadya_ has joined #openstack-infra | 07:27 | |
AJaeger | amotoki: ah, only glance... https://review.openstack.org/275539 | 07:29 |
AJaeger | flaper87: could you put 275539 on a speed path, please? ^ | 07:29 |
amotoki | AJaeger: thanks. will double check | 07:29 |
*** ianw has quit IRC | 07:29 | |
*** ianw has joined #openstack-infra | 07:30 | |
AJaeger | amotoki: quite a few repos with broken setup.cfg - but most are not translated... | 07:31 |
*** infra-red has quit IRC | 07:32 | |
*** fungi has joined #openstack-infra | 07:32 | |
*** infra-red has joined #openstack-infra | 07:32 | |
*** esker has joined #openstack-infra | 07:32 | |
AJaeger | amotoki: ignore ironic-ui, that's an old log | 07:33 |
AJaeger | amotoki: so, only glance was broken | 07:34 |
amotoki | AJaeger: in my local test, ironic-ui works well. | 07:34 |
AJaeger | amotoki: log file was from 18th January ;( My error | 07:34 |
*** pahuang_ has joined #openstack-infra | 07:35 | |
*** rlandy has quit IRC | 07:35 | |
*** esker has quit IRC | 07:37 | |
amotoki | AJaeger: no problem :) | 07:37 |
amotoki | AJaeger: are you checking failure jobs in jenkins.o.o? | 07:38 |
openstackgerrit | Andreas Jaeger proposed openstack-infra/infra-manual: WIP: Add information about enabling translations https://review.openstack.org/273759 | 07:39 |
AJaeger | amotoki: yes, I do - https://jenkins.openstack.org/ - and then check the first list for failures | 07:39 |
AJaeger | and also for success whether those really worked as they should... | 07:39 |
AJaeger | amotoki: I updated the infra manual to mention the packages entry in 273759 | 07:40 |
amotoki | AJaeger: nice! | 07:40 |
*** coreycb` is now known as coreycb | 07:40 | |
*** rkukura_ has joined #openstack-infra | 07:41 | |
*** rkukura has quit IRC | 07:41 | |
*** rkukura_ is now known as rkukura | 07:41 | |
tonyb | cody-somerville: Once I get the base infrastructure in place I'll poke you about doing that. | 07:42 |
*** abregman has quit IRC | 07:47 | |
openstackgerrit | Akihiro Motoki proposed openstack-infra/project-config: get-modulename: Handle no packages entry in setup.cfg https://review.openstack.org/275546 | 07:49 |
amotoki | AJaeger: ^^ | 07:49 |
*** esker has joined #openstack-infra | 07:53 | |
openstackgerrit | ice4o@hotmail.com proposed openstack-infra/jenkins-job-builder: Add HockeyApp Plugin support. https://review.openstack.org/275304 | 07:56 |
*** Sukhdev has quit IRC | 07:57 | |
openstackgerrit | Merged openstack-infra/project-config: Normalize projects.yaml https://review.openstack.org/275527 | 07:58 |
*** esker has quit IRC | 07:58 | |
*** SumitNaiksatam has quit IRC | 08:00 | |
openstackgerrit | ice4o@hotmail.com proposed openstack-infra/jenkins-job-builder: Add HockeyApp Plugin support. https://review.openstack.org/275304 | 08:03 |
AJaeger | amotoki: thanks. Looks fine and handles glance. I'm just wondering whether we need a warning message or not. | 08:04 |
*** kzaitsev_mb has joined #openstack-infra | 08:05 | |
yolanda | AJaeger, about https://review.openstack.org/#/c/274832/, you say "re-enable django translations.." but all i see in that patch is removal of liberty and translation functions. Can you clarify that to me? | 08:05 |
*** jcoufal has joined #openstack-infra | 08:06 | |
AJaeger | yolanda, https://review.openstack.org/#/c/274832/4/jenkins/scripts/propose_translation_update.sh removes these | 08:07 |
AJaeger | django_openstack_auth|designate-dashboard | 08:07 |
amotoki | AJaeger: good idea even though it is not easy to find it :-( | 08:07 |
AJaeger | and the upstream_translation file has the same | 08:07 |
ttx | tonyb: looks good. | 08:07 |
AJaeger | amotoki: yeah, I know. | 08:07 |
yolanda | ah i see, it removes the project disabled case | 08:08 |
ttx | tonyb: we should probably refactor the yaml2ical-core list to include you, but only infra-core can | 08:08 |
amotoki | AJaeger: i noticed the stdout is used in the translation shell script. stderr needs to be used. is it okay? | 08:08 |
AJaeger | amotoki: yes, seems the only option.. | 08:09 |
AJaeger | yolanda: thanks for review | 08:09 |
*** salv-orlando has joined #openstack-infra | 08:10 | |
ttx | tonyb: because currently we are a bit stuck on second approval in yaml2ical | 08:10 |
openstackgerrit | Evgeny Antyshev proposed openstack-infra/ciwatch: Ability to select specific CI results https://review.openstack.org/274412 | 08:10 |
*** salv-orl_ has quit IRC | 08:12 | |
*** ifarkas has joined #openstack-infra | 08:13 | |
*** jistr has joined #openstack-infra | 08:14 | |
openstackgerrit | Akihiro Motoki proposed openstack-infra/project-config: get-modulename: Handle no packages entry in setup.cfg https://review.openstack.org/275546 | 08:14 |
*** salv-orlando has quit IRC | 08:14 | |
*** esker has joined #openstack-infra | 08:14 | |
*** salv-orlando has joined #openstack-infra | 08:14 | |
*** ihrachys has joined #openstack-infra | 08:15 | |
*** thorst has joined #openstack-infra | 08:16 | |
AJaeger | amotoki: thanks. yolanda, could you review the change above as well, please? ^ That one is a bug fix for glance | 08:16 |
*** jcoufal_ has joined #openstack-infra | 08:16 | |
*** rubasov has quit IRC | 08:16 | |
*** esikachev has joined #openstack-infra | 08:16 | |
*** boris-42 has joined #openstack-infra | 08:18 | |
*** jcoufal has quit IRC | 08:19 | |
*** esker has quit IRC | 08:19 | |
*** amaretskiy has joined #openstack-infra | 08:20 | |
*** markus_z has joined #openstack-infra | 08:21 | |
*** thorst has quit IRC | 08:22 | |
*** jistr is now known as jistr|mtg | 08:22 | |
*** Hal has joined #openstack-infra | 08:23 | |
*** Hal is now known as Guest89472 | 08:23 | |
*** mikelk has joined #openstack-infra | 08:25 | |
*** shashank_hegde has quit IRC | 08:25 | |
*** amotoki has quit IRC | 08:25 | |
*** nikhil has joined #openstack-infra | 08:25 | |
yolanda | done | 08:26 |
AJaeger | thanks a lot! | 08:27 |
*** nmagnezi has joined #openstack-infra | 08:28 | |
*** nikhil_k has quit IRC | 08:28 | |
*** mrmartin has joined #openstack-infra | 08:29 | |
*** chaitu has quit IRC | 08:29 | |
*** ihrachys has quit IRC | 08:30 | |
*** rockyg has joined #openstack-infra | 08:30 | |
yolanda | no problem | 08:31 |
*** vgridnev has quit IRC | 08:32 | |
*** rubasov has joined #openstack-infra | 08:32 | |
*** Jeffrey4l has quit IRC | 08:33 | |
nibalizer | hiya yolanda AJaeger | 08:33 |
nibalizer | i put up some patches on the infra-cloud topic to start using it in nodepool | 08:33 |
nibalizer | id appreciate reviews | 08:33 |
yolanda | ah cool | 08:33 |
yolanda | is that fix for ansible-puppet in place? | 08:33 |
yolanda | nibalizer, also infra-cloud is still being refactored, for example in terms of dhcp, glean, the bridge... how are we going to live with that? | 08:35 |
esikachev | ttx: flaper87 hi! is this patch ready for merge https://review.openstack.org/#/c/259392/? | 08:35 |
*** esker has joined #openstack-infra | 08:35 | |
*** jtomasek__ has joined #openstack-infra | 08:35 | |
*** fhubik has joined #openstack-infra | 08:36 | |
*** ihrachys has joined #openstack-infra | 08:36 | |
*** hichihara has quit IRC | 08:37 | |
nibalizer | yolanda: the infra cloud region i have access to works great | 08:37 |
nibalizer | has for a long time | 08:37 |
nibalizer | and with hpcloud gone we are seeing long queues | 08:37 |
yolanda | nibalizer yes, but for example the dhcp needs to be refactored, we are using elements there, and not glean to configure the network for example | 08:37 |
yolanda | we can test that changes in east anyway | 08:38 |
*** ihrachys has quit IRC | 08:38 | |
yolanda | also as i saw from Colleen yesterday, we are going to change rabbitmq to work with https | 08:38 |
*** oomichi is now known as oomichi_away | 08:38 | |
AJaeger | nibalizer: any special topic to review? | 08:39 |
yolanda | so there could be some changes, i guess some downtimes won't hurt us so much... | 08:39 |
*** bnemec has joined #openstack-infra | 08:39 | |
openstackgerrit | Merged openstack-infra/project-config: get-modulename: Handle no packages entry in setup.cfg https://review.openstack.org/275546 | 08:39 |
nibalizer | AJaeger: infra-cloud is the topic i think | 08:39 |
*** jaosorior has joined #openstack-infra | 08:39 | |
nibalizer | yolanda: yes if we have to take a downtime to change it we can | 08:39 |
*** yaume has joined #openstack-infra | 08:40 | |
*** zul has joined #openstack-infra | 08:40 | |
nibalizer | but thats no reason not to add it to puppetmaster and nodepool | 08:40 |
AJaeger | esikachev: AFAIK new repos for governance need PTL+1, so ask the sahara PTL to +1 it - but your change depends-on another change that is not merged, so your change cannot go in. | 08:40 |
*** esker has quit IRC | 08:41 | |
nibalizer | set up a mirror, etc | 08:41 |
*** abregman has joined #openstack-infra | 08:41 | |
AJaeger | nibalizer: what is with 268363 and 268366? Both merge conflict and same subject? | 08:41 |
yolanda | nibalizer ok. Main blocker i had on the automation is the missing environments, that was fixed on your ansible-puppet patch. Apart from that , all patches worked for me in east as well | 08:41 |
*** dingyichen has quit IRC | 08:42 | |
*** chmouel has left #openstack-infra | 08:42 | |
nibalizer | AJaeger: oops. ill fix that once i get a real computer | 08:42 |
nibalizer | on phone irc right now | 08:42 |
AJaeger | ;) | 08:43 |
*** matrohon has joined #openstack-infra | 08:43 | |
AJaeger | yolanda: there's also https://review.openstack.org/246739 that has a merge conflict and you as co-author. Do you want to rebase? | 08:43 |
yolanda | sure | 08:43 |
*** jlanoux has joined #openstack-infra | 08:43 | |
*** ihrachys has joined #openstack-infra | 08:44 | |
yolanda | nibalizer, also you need to consider that we are in process of moving these blades | 08:45 |
yolanda | they are scheduling the move right now | 08:45 |
yolanda | so there will be a downtime again for it, also network settings will change | 08:45 |
AJaeger | yolanda: compare 246739 with https://review.openstack.org/208751 - why are those two so different? | 08:45 |
ttx | esikachev: yes I will process it this morning. Wanted to give a last chance to TC members to chime in after the last mod | 08:45 |
*** ihrachys has quit IRC | 08:46 | |
yolanda | AJaeger, hpuswest has been managed mostly by crinkle, greghaynes... they started working on it, and at that time, glean and bifrost were failing, so the setup had to be done different. When we joined to work on east, we did some fixes on glean and bifrost, so the config will be different now | 08:46 |
yolanda | crinkle is working as well on some refactor of the west zone | 08:46 |
openstackgerrit | Masahito Muroi proposed openstack-infra/project-config: Adds a devstack test job using Congress new architecture https://review.openstack.org/275514 | 08:47 |
openstackgerrit | Masahito Muroi proposed openstack-infra/project-config: Adds a tox test job for Congress new architecture https://review.openstack.org/275515 | 08:47 |
AJaeger | yolanda: ah, ok... | 08:47 |
yolanda | east still not ready, i need to work on some patch to create bridges | 08:48 |
*** ihrachys has joined #openstack-infra | 08:49 | |
openstackgerrit | yolanda.robla proposed openstack-infra/system-config: Add baremetal node definition for hpuseast https://review.openstack.org/246739 | 08:52 |
*** salv-orlando has quit IRC | 08:52 | |
openstackgerrit | yolanda.robla proposed openstack-infra/system-config: Add baremetal node definition for hpuseast https://review.openstack.org/246739 | 08:54 |
*** esikachev has quit IRC | 08:54 | |
*** esikachev has joined #openstack-infra | 08:55 | |
*** zeih has joined #openstack-infra | 08:55 | |
*** ihrachys has quit IRC | 08:56 | |
*** zeih has quit IRC | 08:56 | |
*** zeih has joined #openstack-infra | 08:56 | |
*** esker has joined #openstack-infra | 08:56 | |
*** esikachev has quit IRC | 08:59 | |
*** esker has quit IRC | 09:01 | |
*** zhurong has quit IRC | 09:01 | |
*** henrynash has joined #openstack-infra | 09:09 | |
*** ihrachys has joined #openstack-infra | 09:10 | |
*** ihrachys has quit IRC | 09:10 | |
*** MCoLo has joined #openstack-infra | 09:11 | |
*** sshnaidm has quit IRC | 09:12 | |
*** achanda has quit IRC | 09:14 | |
*** sdake has joined #openstack-infra | 09:15 | |
openstackgerrit | yolanda.robla proposed openstack-infra/system-config: Add baremetal node definition for hpuseast https://review.openstack.org/246739 | 09:15 |
*** esker has joined #openstack-infra | 09:18 | |
*** achanda has joined #openstack-infra | 09:18 | |
*** chlong has quit IRC | 09:20 | |
*** thorst has joined #openstack-infra | 09:21 | |
*** markus_z has quit IRC | 09:22 | |
*** esker has quit IRC | 09:23 | |
*** jordanP has joined #openstack-infra | 09:24 | |
*** achanda has quit IRC | 09:24 | |
*** dguitarbite has quit IRC | 09:25 | |
*** dizquierdo has joined #openstack-infra | 09:25 | |
*** dguitarbite has joined #openstack-infra | 09:27 | |
*** thorst has quit IRC | 09:28 | |
*** keedya has quit IRC | 09:29 | |
openstackgerrit | yolanda.robla proposed openstack-infra/system-config: Add baremetal node definition for hpuseast https://review.openstack.org/246739 | 09:31 |
*** esp_ has joined #openstack-infra | 09:32 | |
*** tcammann_ has joined #openstack-infra | 09:33 | |
*** NobodyCa1 has joined #openstack-infra | 09:34 | |
*** dtantsur|afk is now known as dtantsur | 09:37 | |
*** dguitarbite has quit IRC | 09:37 | |
*** tcammann_ has quit IRC | 09:38 | |
*** esp_ has quit IRC | 09:39 | |
*** NobodyCa1 has quit IRC | 09:40 | |
openstackgerrit | ice4o@hotmail.com proposed openstack-infra/jenkins-job-builder: Add HockeyApp Plugin support. https://review.openstack.org/275304 | 09:44 |
*** sshnaidm has joined #openstack-infra | 09:44 | |
*** e0ne has joined #openstack-infra | 09:45 | |
*** tzn has joined #openstack-infra | 09:46 | |
*** dguitarbite has joined #openstack-infra | 09:47 | |
yolanda | nibalizer, i reviewed your patches | 09:48 |
*** kzaitsev_mb has quit IRC | 09:49 | |
*** AJaeger has quit IRC | 09:49 | |
yolanda | one thing, it will be nice if we can use the same cacert for east and west ... but i need the private key to sign the certs for east using that CA. Can you or crinkle send me? | 09:49 |
*** vgridnev has joined #openstack-infra | 09:51 | |
*** zeih has quit IRC | 09:53 | |
*** jaosorior has quit IRC | 09:53 | |
*** jaosorior has joined #openstack-infra | 09:54 | |
*** dizquierdo has quit IRC | 09:57 | |
*** gildub has joined #openstack-infra | 09:58 | |
*** esker has joined #openstack-infra | 09:59 | |
*** vivekd has joined #openstack-infra | 10:01 | |
*** jordanP has quit IRC | 10:01 | |
*** jordanP has joined #openstack-infra | 10:01 | |
*** rossella_s has quit IRC | 10:02 | |
*** rossella_s has joined #openstack-infra | 10:03 | |
*** zul has quit IRC | 10:03 | |
*** esker has quit IRC | 10:04 | |
*** nijaba has quit IRC | 10:05 | |
*** e0ne has quit IRC | 10:05 | |
*** zeih has joined #openstack-infra | 10:06 | |
*** zul has joined #openstack-infra | 10:06 | |
openstackgerrit | Merged openstack-infra/project-config: Fix some word spellings https://review.openstack.org/275536 | 10:08 |
*** jaosorior has quit IRC | 10:09 | |
*** zeih_ has joined #openstack-infra | 10:10 | |
*** rguillebert has joined #openstack-infra | 10:10 | |
*** AJaeger has joined #openstack-infra | 10:11 | |
*** zeih has quit IRC | 10:11 | |
*** nijaba has joined #openstack-infra | 10:12 | |
*** vgridnev_ has joined #openstack-infra | 10:14 | |
*** vgridnev has quit IRC | 10:14 | |
*** rhallisey has joined #openstack-infra | 10:14 | |
*** jaosorior has joined #openstack-infra | 10:16 | |
*** salv-orlando has joined #openstack-infra | 10:16 | |
openstackgerrit | Merged openstack-infra/project-config: Adding gerritbot and accessbot support for openstack-product https://review.openstack.org/275365 | 10:17 |
openstackgerrit | Merged openstack-infra/project-config: Switch ansible-role-diskimage-builder jobs voting https://review.openstack.org/275373 | 10:18 |
*** ociuhandu has joined #openstack-infra | 10:18 | |
nibalizer | yolanda: you are a rooter and can log into controller00.hpuswest.ic.openstack.org :) | 10:19 |
yolanda | nibalizer, i can get the data from hiera, but i need to create my own certs for hpuseast nodes, using that cacert | 10:20 |
yolanda | and i guess it's protected with key to sign? or not? | 10:20 |
yolanda | the password i mean | 10:20 |
*** salv-orlando has quit IRC | 10:21 | |
nibalizer | oh i don't know | 10:24 |
nibalizer | crinkle: is probably the only one who knows | 10:24 |
yolanda | ok i'll ping her, i'd prefer if we use same cacert in both regions for simplicity | 10:25 |
yolanda | also, in terms of dns, how are you going to manage? add the 100 entries to our openstack dns ? in east we have 100 nodes | 10:25 |
*** thorst has joined #openstack-infra | 10:26 | |
nibalizer | :shrug: | 10:26 |
nibalizer | i think we added the controller to main dns and the controller runs a dnsmasq that knows the other 100 ips | 10:27 |
nibalizer | that maybe isn't ideal but its a thing | 10:27 |
*** ildikov has quit IRC | 10:27 | |
*** ldnunes has joined #openstack-infra | 10:27 | |
*** yamamoto has quit IRC | 10:27 | |
*** hashar has joined #openstack-infra | 10:28 | |
yolanda | nibalizer, that's confusing me. Because nameservers in controller or compute are not pointing to controller dnsmasq, but to google | 10:28 |
*** fhubik is now known as fhubik_brb | 10:29 | |
*** fhubik_brb is now known as fhubik | 10:29 | |
*** thorst has quit IRC | 10:33 | |
*** mrmartin has quit IRC | 10:33 | |
yolanda | nibalizer also... we need to run puppet apply on the compute nodes so puppetmaster needs to know how to resolve these compute servers addresses right? | 10:37 |
*** khappone has quit IRC | 10:38 | |
*** khappone has joined #openstack-infra | 10:38 | |
*** rfolco has joined #openstack-infra | 10:40 | |
*** zeih_ has quit IRC | 10:40 | |
*** aysyd has joined #openstack-infra | 10:40 | |
*** esker has joined #openstack-infra | 10:41 | |
nibalizer | ya | 10:42 |
*** Daisy has joined #openstack-infra | 10:42 | |
nibalizer | so it will probably all end up in global dns | 10:42 |
*** d0ugal has quit IRC | 10:42 | |
*** d0ugal has joined #openstack-infra | 10:42 | |
openstackgerrit | Giulio Fidente proposed openstack-infra/tripleo-ci: Add support for --overcloud-update and use it for HA job https://review.openstack.org/260466 | 10:43 |
*** esker has quit IRC | 10:45 | |
*** fhubik is now known as fhubik_brb | 10:46 | |
*** fhubik_brb is now known as fhubik | 10:47 | |
*** fhubik is now known as fhubik_brb | 10:47 | |
*** aysyd has quit IRC | 10:47 | |
*** aysyd has joined #openstack-infra | 10:48 | |
*** abregman is now known as abregman|lunch | 10:48 | |
*** zeih has joined #openstack-infra | 10:49 | |
*** fhubik has joined #openstack-infra | 10:50 | |
yolanda | nibalizer so i think it will be better if we complete those requirements in terms of dns, controlling everything with puppet, etc... then enable the cloud on nodepool | 10:50 |
*** fhubik_brb has quit IRC | 10:50 | |
*** FallenPegasus has joined #openstack-infra | 10:51 | |
*** e0ne has joined #openstack-infra | 10:51 | |
*** vgridnev_ has quit IRC | 10:52 | |
*** vgridnev_ has joined #openstack-infra | 10:54 | |
*** jaosorior has quit IRC | 10:58 | |
*** electrofelix has joined #openstack-infra | 10:58 | |
AJaeger | we seem to have some problems with an apt-mirror, see http://logs.openstack.org/85/273785/1/gate/gate-tempest-dsvm-neutron-full/e9b21e1/logs/devstacklog.txt.gz#_2016-02-03_10_44_41_836 | 10:59 |
*** jaosorior has joined #openstack-infra | 10:59 | |
*** deepakcs has quit IRC | 10:59 | |
*** zul has quit IRC | 11:00 | |
AJaeger | This seems to happen with many jobs - didn't have time to look further into it. | 11:00 |
AJaeger | Fails on rax-iad, works fine on ovh-bhs1 | 11:00 |
*** zul has joined #openstack-infra | 11:01 | |
AJaeger | works on internap | 11:01 |
*** tiswanso has quit IRC | 11:01 | |
*** esker has joined #openstack-infra | 11:02 | |
AJaeger | but that's a small sample only. | 11:02 |
*** zeih has quit IRC | 11:03 | |
yolanda | errors are not showing the real failure of apt_update, looks like a timeout | 11:03 |
sdague | hmm, so all mirrors are broken now? | 11:03 |
AJaeger | sdague: rax-iad only | 11:04 |
AJaeger | But we need more data ;) | 11:04 |
AJaeger | Just looked at 5 more fails - all rax-iad | 11:04 |
AJaeger | Sorry, can't look into this further now | 11:05 |
*** esker has quit IRC | 11:07 | |
*** salv-orlando has joined #openstack-infra | 11:07 | |
*** annegentle has joined #openstack-infra | 11:08 | |
*** infra-re_ has joined #openstack-infra | 11:09 | |
AJaeger | sdague: could you review https://review.openstack.org/#/c/272411/ , please? | 11:10 |
*** dizquierdo has joined #openstack-infra | 11:10 | |
sdague | sure | 11:11 |
yolanda | seems we are having network problems | 11:11 |
yolanda | 2016-02-03 10:32:18.496 | Resolving apt.puppetlabs.com (apt.puppetlabs.com)... 2600:3c03::f03c:91ff:fedb:6b1d, 192.155.89.90 | 11:11 |
yolanda | 2016-02-03 10:34:25.759 | Connecting to apt.puppetlabs.com (apt.puppetlabs.com)|2600:3c03::f03c:91ff:fedb:6b1d|:80... failed: Connection timed out. | 11:11 |
yolanda | that also on rax-iad | 11:12 |
*** annegentle has quit IRC | 11:12 | |
sdague | ah, their ipv6 path is bonkers? | 11:13 |
sdague | is there a rax status page up? | 11:13 |
*** infra-red has quit IRC | 11:13 | |
*** deepakcs has joined #openstack-infra | 11:13 | |
yolanda | i'm not aware of problems but i'm not getting any emails from providers, is something that i'll need to work on to get subscribed to | 11:13 |
sdague | johnthetubaguy: any idea if rax-iad is basically toast ^^^ ? | 11:13 |
*** sfinucan has joined #openstack-infra | 11:13 | |
johnthetubaguy | sdague: taking a peek now | 11:14 |
*** arxcruz has joined #openstack-infra | 11:14 | |
openstackgerrit | Merged openstack-infra/project-config: Remove redundant tempest job for heatclient https://review.openstack.org/272411 | 11:19 |
*** gnuoy_ has joined #openstack-infra | 11:20 | |
johnthetubaguy | sdague: so I am just seeing roses for the infra tenant right now, but there could be a little time delay | 11:21 |
*** jaosorior has quit IRC | 11:21 | |
johnthetubaguy | sdague: oh, so you are seeing networking related issues? | 11:21 |
*** infra-red has joined #openstack-infra | 11:22 | |
*** esker has joined #openstack-infra | 11:23 | |
sdague | yep | 11:23 |
sdague | we can't hit any of the apt mirrors | 11:23 |
*** infra-re_ has quit IRC | 11:24 | |
openstackgerrit | Igor Belikov proposed openstack-infra/puppet-jenkins: Use 'ruby' instead of 'ruby1.9.1' for Debian https://review.openstack.org/275329 | 11:25 |
*** jistr|mtg has quit IRC | 11:27 | |
*** esker has quit IRC | 11:28 | |
*** gnuoy_ has quit IRC | 11:29 | |
*** infra-re_ has joined #openstack-infra | 11:30 | |
johnthetubaguy | sdague: I guess its just a subset of the hosts in iad? I am just trying to find a uuid or something in the logs | 11:30 |
*** abregman|lunch is now known as abregman | 11:30 | |
sdague | yolanda: can you get johnthetubaguy more info? | 11:30 |
openstackgerrit | Dmitry Tantsur proposed openstack-infra/project-config: Add a non-voting inspector job for ironic-python-agent https://review.openstack.org/275637 | 11:31 |
yolanda | let me check logs | 11:31 |
johnthetubaguy | there was talk of one of the iad cells having networking issues, but just hoping more info will help them pin things down | 11:31 |
*** thorst has joined #openstack-infra | 11:32 | |
*** yamamoto has joined #openstack-infra | 11:32 | |
yolanda | i have a pair of errors with timeouts but cannot get the server ids from there, these are temporary nodepool servers | 11:33 |
*** infra-red has quit IRC | 11:33 | |
*** yamamoto has quit IRC | 11:33 | |
sdague | yolanda: is node uuid from nodepool or openstack? | 11:35 |
sdague | Node UUID: 368d44c3-3dc4-4801-8c80-33f0beb28ef1 | 11:35 |
sdague | in the console logs | 11:35 |
sdague | https://jenkins06.openstack.org/job/gate-tempest-dsvm-full/27546/console | 11:35 |
yolanda | ah that's better | 11:36 |
yolanda | johnthetubaguy, that helps more? | 11:37 |
johnthetubaguy | ah, console logs, perfect, let me look that one up | 11:38 |
*** thorst has quit IRC | 11:38 | |
sdague | well, jenkins console log | 11:38 |
openstackgerrit | Igor Belikov proposed openstack-infra/system-config: Get debian kernel headers based on architecture https://review.openstack.org/275323 | 11:39 |
*** deepakcs has quit IRC | 11:39 | |
openstackgerrit | Igor Belikov proposed openstack-infra/puppet-jenkins: Use 'ruby' instead of 'ruby1.9.1' for Debian https://review.openstack.org/275329 | 11:40 |
AJaeger | johnthetubaguy: do you need more console logs? Thanks for looking into this! | 11:42 |
openstackgerrit | Igor Belikov proposed openstack-infra/project-config: Add debian-jessie image to nodepool https://review.openstack.org/264726 | 11:43 |
*** boris-42 has quit IRC | 11:43 | |
*** vivekd_ has joined #openstack-infra | 11:44 | |
openstackgerrit | Dmitry Tantsur proposed openstack-infra/project-config: Add a non-voting inspector job for ironic-python-agent https://review.openstack.org/275637 | 11:44 |
*** mrmartin has joined #openstack-infra | 11:45 | |
*** vivekd has quit IRC | 11:45 | |
*** vivekd_ is now known as vivekd | 11:45 | |
johnthetubaguy | AJaeger: yolanda: sdague: unsure whats needed, asking the support folks, its not in the cell they were wondering about, so I think they are digging | 11:46 |
*** kzaitsev_mb has joined #openstack-infra | 11:46 | |
*** _amrith_ is now known as amrith | 11:53 | |
johnthetubaguy | yolanda: so I am out of date here, do we reconfigure the VM not to use the rackspace mirror, so I guess its hitting the default ubuntu address? | 11:54 |
yolanda | johnthetubaguy i need to double check but yes, i think it uses default ubuntu address | 11:54 |
*** zeih has joined #openstack-infra | 11:54 | |
yolanda | let me hold a node to confirm | 11:54 |
*** deepakcs has joined #openstack-infra | 11:55 | |
*** deepakcs has quit IRC | 11:55 | |
*** pkoniszewski has quit IRC | 11:56 | |
*** pkoniszewski has joined #openstack-infra | 11:56 | |
yolanda | yep, i see us.archive.ubuntu.com there, on apt sources | 11:56 |
AJaeger | I think we do - see nodepool/scripts/configure_mirror.sh in project-config | 11:57 |
yolanda | yes, it connects to us.archive, and actually my held node is timing out | 11:57 |
yolanda | johnthetubaguy, i have a held vm with that problem, so if you need to debug something it can be useful | 11:58 |
*** jcoufal_ has quit IRC | 11:58 | |
johnthetubaguy | yolanda: that could be useful, but I just updated the sources.list and I think I see errors talking to us.archive.ubuntu.com now | 11:59 |
sdague | ipv6 only? | 11:59 |
yolanda | i can ping successfully with ipv4, but not with ipv6 | 12:00 |
yolanda | ping us.archive.ubuntu.com | 12:00 |
yolanda | PING us.archive.ubuntu.com (91.189.91.14) 56(84) bytes of data. | 12:00 |
yolanda | 64 bytes from orobas.canonical.com (91.189.91.14): icmp_seq=1 ttl=51 time=14.3 ms | 12:00 |
*** vivekd_ has joined #openstack-infra | 12:00 | |
yolanda | ping6 us.archive.ubuntu.com | 12:00 |
yolanda | PING us.archive.ubuntu.com(orobas.canonical.com) 56 data bytes | 12:00 |
sdague | right, so it's an ipv6 fail | 12:00 |
yolanda | looks like | 12:00 |
sdague | I think that's the important bit | 12:00 |
sdague | to help rax narrow it | 12:00 |
yolanda | apt-get -o Acquire::ForceIPv4=true update works | 12:02 |
yolanda | so yes, definitely ipv6 problems | 12:02 |
AJaeger | Is there an easy way for us to stop using rax-iad or should we wait? | 12:03 |
*** julim_ has quit IRC | 12:03 | |
yolanda | i can set the max-servers to zero | 12:03 |
yolanda | so we stop using that | 12:03 |
sdague | yolanda: yeh, it's probably worth doing that | 12:03 |
yolanda | ok going to send a change | 12:03 |
sdague | because we're just auto failing all changes at this point | 12:03 |
AJaeger | yes, better than all the fails and we can easily revert... | 12:04 |
*** vivekd has quit IRC | 12:04 | |
*** vivekd_ is now known as vivekd | 12:04 | |
*** esker has joined #openstack-infra | 12:04 | |
openstackgerrit | yolanda.robla proposed openstack-infra/project-config: Set max-servers to 0 to rax-iad https://review.openstack.org/275655 | 12:05 |
yolanda | last time i looked, puppet on nodepool was disabled, going to double check if that's still the case | 12:05 |
openstackgerrit | ice4o@hotmail.com proposed openstack-infra/jenkins-job-builder: Add HockeyApp Plugin support. https://review.openstack.org/275304 | 12:06 |
AJaeger | yolanda: thanks. Once that's in we can prepare a revert and mark it WIP... | 12:06 |
*** ildikov has joined #openstack-infra | 12:06 | |
sdague | though, that does drop our workers by 1/3 which sucks | 12:07 |
sdague | but sucks less than everything dying | 12:07 |
*** zeih_ has joined #openstack-infra | 12:07 | |
AJaeger | sdague: exactly | 12:07 |
yolanda | change has lots of chances of not passing, so going to apply on nodepool directly | 12:08 |
johnthetubaguy | yeah, failed to get a decent answer, so I have raised a ticket | 12:09 |
*** jcoufal has joined #openstack-infra | 12:09 | |
*** esker has quit IRC | 12:09 | |
sdague | we could also do this to all our nodes - http://forevercached.syphzero.net/2012/09/preferring-ipv4-over-ipv6.html to prefer v4 over v6 | 12:10 |
johnthetubaguy | I basically confirmed only in IAD, you can't do the IPv6 stuff to access http://us.archive.ubuntu.com doing apt-get update, so it's nice and reproducible, it works just fine in ORD | 12:10 |
AJaeger | sdague, yolanda : On the other hand I see failures only for devstack-trusty | 12:10 |
*** ihrachys has joined #openstack-infra | 12:10 | |
AJaeger | We could also keep bare-trusty on iad and only disable devstack-trusty. | 12:10 |
sdague | AJaeger: we could, I don't know all the details in what that fallout does | 12:10 |
yolanda | AJaeger, i bet i saw one failing on bare | 12:10 |
yolanda | let me double check | 12:11 |
*** zeih has quit IRC | 12:11 | |
sdague | AJaeger: do you have a logstash signature for this? | 12:11 |
AJaeger | sdague: nope ;( | 12:11 |
*** julim has joined #openstack-infra | 12:11 | |
AJaeger | sdague: do you want to create one from http://logs.openstack.org/85/273785/1/gate/gate-tempest-dsvm-neutron-full/e9b21e1/logs/devstacklog.txt.gz#_2016-02-03_10_44_41_836 | 12:11 |
*** links has quit IRC | 12:11 | |
yolanda | mm, it was devstack as well | 12:11 |
johnthetubaguy | so leaving the rackspace mirror does seem to work, but that opens up a whole other can of worms | 12:12 |
sdague | message:"Failed to update apt repos, we're dead now" | 12:13 |
yolanda | updating the mirrors to use rax is not a simple change for us | 12:13 |
*** ihrachys has quit IRC | 12:13 | |
yolanda | as we'd need to do depending on providers | 12:13 |
yolanda | it may be easier to send a change to enforce ipv4 in apt | 12:14 |
sdague | yolanda: if ipv6 is down apt is just one place it will fail | 12:15 |
yolanda | yes, we may see other issues | 12:15 |
*** zz_dimtruck has quit IRC | 12:15 | |
johnthetubaguy | so I am seeing wget fail, and curl work | 12:16 |
johnthetubaguy | I guess that could be the ip v4 vs v6 thing? | 12:16 |
*** fhubik is now known as fhubik_brb | 12:17 | |
yolanda | yes, curl -6 fails as well | 12:17 |
AJaeger | sdague: so, only devstack-trusty - but there might be different errors for other types | 12:17 |
yolanda | failures should be happening independently of the label | 12:17 |
yolanda | AJaeger, it may be due to the jobs executed per label | 12:18 |
johnthetubaguy | so contacting google via ipv6 works just fine | 12:18 |
yolanda | do you see apt updates on some of the bare ones? | 12:18 |
sdague | the signature on bare nodes will be different, this is a devstack function that builds a good summary | 12:18 |
AJaeger | sdague: It's just that my jobs that failed only failed in the dsvm tests but not in others. But that's a small sample... | 12:19 |
openstackgerrit | Sean Dague proposed openstack-infra/elastic-recheck: failure for apt mirror https://review.openstack.org/275662 | 12:19 |
yolanda | i set max-servers to 0 manually in the meantime | 12:20 |
AJaeger | thanks, yolanda | 12:20 |
sdague | right, that will at least keep a few things running | 12:20 |
*** weshay_xchat has joined #openstack-infra | 12:21 | |
yolanda | real change passed check now | 12:21 |
*** zul has quit IRC | 12:24 | |
openstackgerrit | Merged openstack-infra/project-config: Set max-servers to 0 to rax-iad https://review.openstack.org/275655 | 12:24 |
yolanda | nice | 12:24 |
*** zeih has joined #openstack-infra | 12:25 | |
*** gildub has quit IRC | 12:25 | |
AJaeger | let me do a revert as reminder... | 12:25 |
*** lucasagomes is now known as lucas-hungry | 12:25 | |
openstackgerrit | Andreas Jaeger proposed openstack-infra/project-config: Revert "Set max-servers to 0 to rax-iad" https://review.openstack.org/275666 | 12:26 |
yolanda | going to reenable puppet on nodepool as the real change landed | 12:27 |
openstackgerrit | Derek Higgins proposed openstack-infra/tripleo-ci: Source undercloud environment variable from a file https://review.openstack.org/275667 | 12:27 |
openstackgerrit | Derek Higgins proposed openstack-infra/tripleo-ci: Split the deploy script into its own file https://review.openstack.org/275668 | 12:27 |
yolanda | johnthetubaguy, i also can ping with v6 to different hosts, but not to canonical one | 12:27 |
*** zeih_ has quit IRC | 12:28 | |
johnthetubaguy | yolanda: yeah, seeing the same thing | 12:28 |
johnthetubaguy | I added notes in the support ticket, I hope they don't ignore mine because its an internal account | 12:28 |
* johnthetubaguy runs away for some lunch | 12:29 | |
*** trown|outttypeww is now known as trown | 12:29 | |
sdague | yolanda: how long until that kicks in? | 12:30 |
*** ociuhandu_ has joined #openstack-infra | 12:30 | |
AJaeger | sdague: it's active already, see our graphs | 12:30 |
yolanda | yes, that happened | 12:30 |
AJaeger | http://grafana.openstack.org/dashboard/db/zuul-status | 12:30 |
sdague | ok | 12:31 |
*** rhallisey has quit IRC | 12:31 | |
*** dims has joined #openstack-infra | 12:32 | |
AJaeger | dtantsur: is https://review.openstack.org/#/c/268176/ still relevant? In that case let's ask sdague or yolanda for a +2... | 12:33 |
*** mrmartin has quit IRC | 12:34 | |
*** ociuhandu has quit IRC | 12:34 | |
*** ociuhandu_ is now known as ociuhandu | 12:34 | |
dtantsur | AJaeger, yep, still relevant, I've forgot about it. thanks for reminder | 12:34 |
yolanda | will approve | 12:34 |
*** jpr has joined #openstack-infra | 12:34 | |
AJaeger | dtantsur: could you review https://review.openstack.org/272481 for me, please? It's ironic related change | 12:35 |
*** rhallisey has joined #openstack-infra | 12:35 | |
yolanda | AJaeger, sdague, are you normally sending status alerts, when it's not a failure, but something affecting capacity? such as "jobs will run slower" | 12:35 |
*** rhallisey has quit IRC | 12:35 | |
*** zeih has quit IRC | 12:36 | |
*** rhallisey has joined #openstack-infra | 12:36 | |
*** zeih has joined #openstack-infra | 12:36 | |
* AJaeger does not know | 12:36 | |
* dtantsur looks | 12:37 | |
AJaeger | yolanda: you have the power now, make use of it to double check it. | 12:37 |
*** amrith is now known as _amrith_ | 12:37 | |
AJaeger | and a #status notice would be fine in this situation... | 12:37 |
*** jpr has quit IRC | 12:38 | |
yolanda | #status notice Infra running with lower capacity now, due to a temporary problem affecting one of our nodepool providers. Please expect some delays in your jobs. Apologies for any inconvenience caused. | 12:39 |
openstackstatus | yolanda: sending notice | 12:39 |
*** mase_x200 has joined #openstack-infra | 12:39 | |
openstackgerrit | Merged openstack-infra/elastic-recheck: failure for apt mirror https://review.openstack.org/275662 | 12:40 |
-openstackstatus- NOTICE: Infra running with lower capacity now, due to a temporary problem affecting one of our nodepool providers. Please expect some delays in your jobs. Apologies for any inconvenience caused. | 12:40 | |
yolanda | AJaeger, sdague, i'm going to have lunch now, but will be back in a while | 12:40 |
*** baoli has joined #openstack-infra | 12:40 | |
*** zeih_ has joined #openstack-infra | 12:40 | |
AJaeger | yolanda: get yourself something yummy ;) Thanks! | 12:40 |
openstackstatus | yolanda: finished sending notice | 12:41 |
openstackgerrit | Merged openstack-infra/project-config: Switch ironic-lib and python-ironicclient gates to IPA https://review.openstack.org/268176 | 12:42 |
*** zeih has quit IRC | 12:42 | |
*** baoli_ has joined #openstack-infra | 12:43 | |
openstackgerrit | Andreas Jaeger proposed openstack-infra/project-config: Reduce timeout of ironic jobs https://review.openstack.org/272481 | 12:43 |
AJaeger | dtantsur: thanks for review, updated | 12:44 |
*** baoli has quit IRC | 12:45 | |
*** esker has joined #openstack-infra | 12:46 | |
*** thorst_ has joined #openstack-infra | 12:46 | |
*** henrynash has quit IRC | 12:50 | |
*** esker has quit IRC | 12:51 | |
*** zeih_ has quit IRC | 12:52 | |
*** daemontool has joined #openstack-infra | 12:52 | |
openstackgerrit | Giulio Fidente proposed openstack-infra/tripleo-ci: Add support for --overcloud-update and use it for HA job https://review.openstack.org/260466 | 12:52 |
*** zeih has joined #openstack-infra | 12:53 | |
*** julim has quit IRC | 12:54 | |
*** Jeffrey4l has joined #openstack-infra | 12:55 | |
*** zul has joined #openstack-infra | 12:55 | |
*** julim has joined #openstack-infra | 12:57 | |
*** jaosorior has joined #openstack-infra | 12:57 | |
*** dguitarbite has quit IRC | 12:57 | |
*** jpr has joined #openstack-infra | 12:58 | |
*** rlandy has joined #openstack-infra | 13:00 | |
openstackgerrit | Valeriy Ponomaryov proposed openstack-infra/project-config: Add new experimental job for manila-image-elements project https://review.openstack.org/275682 | 13:01 |
*** openstackgerrit has quit IRC | 13:02 | |
*** annegentle has joined #openstack-infra | 13:02 | |
*** openstackgerrit has joined #openstack-infra | 13:03 | |
openstackgerrit | Valeriy Ponomaryov proposed openstack-infra/project-config: Add new experimental job for manila-image-elements project https://review.openstack.org/275682 | 13:03 |
pabelanger | morning | 13:03 |
*** fhubik_brb is now known as fhubik | 13:05 | |
*** gildub has joined #openstack-infra | 13:06 | |
*** annegentle has quit IRC | 13:07 | |
*** ihrachys has joined #openstack-infra | 13:08 | |
openstackgerrit | Igor Belikov proposed openstack-infra/puppet-jenkins: Use 'ruby' instead of 'ruby1.9.1' for Debian https://review.openstack.org/275329 | 13:09 |
nibalizer | so our rax-iad lost ipv6? | 13:09 |
smcginnis | FYI, the nice warning that things are lower capacity didn't get sent to the cinder channel. Not sure why, but just thought I should point that out. | 13:11 |
nibalizer | smcginnis: thanks | 13:12 |
nibalizer | thats #openstack-cinder ya? | 13:12 |
smcginnis | nibalizer: Yep | 13:12 |
smcginnis | Scanning through the channels I'm in, some got it and some didn't. | 13:12 |
*** jistr|mtg has joined #openstack-infra | 13:12 | |
nibalizer | yolanda: i have ssh'd into your held rax-iad node fyi | 13:13 |
*** claudiub has joined #openstack-infra | 13:13 | |
*** moravec has quit IRC | 13:14 | |
nibalizer | I can't see the ipv6 fails? | 13:15 |
*** moravec has joined #openstack-infra | 13:16 | |
AJaeger | nibalizer: ping6 us.archive.ubuntu.com is the fail, not a general fail | 13:20 |
AJaeger | smcginnis: let me check something for you... | 13:20 |
krotscheck1 | AJaeger: Thanks for all those +2's yesterday :) | 13:20 |
pabelanger | AJaeger: left a comment on the review, let me know if it is easier to chat here. | 13:21 |
*** tiswanso has joined #openstack-infra | 13:21 | |
*** samuelBartel has joined #openstack-infra | 13:21 | |
AJaeger | smcginnis: #openstack-cinder looks setup correctly, hope somebody else can debug the issue why statusbot does not show up at your channel | 13:22 |
AJaeger | krotscheck1: you're welcome | 13:22 |
AJaeger | pabelanger: let me check your latest comment | 13:22 |
*** xyang1 has joined #openstack-infra | 13:23 | |
AJaeger | pabelanger: yes, this should work. Gave a +2. Thanks for explaining | 13:23 |
AJaeger | smcginnis: check http://eavesdrop.openstack.org/irclogs/%23openstack-cinder/latest.log.html#t2016-02-03T12:39:36 - there's the status message | 13:23 |
pabelanger | AJaeger: great! Thanks for the follow up | 13:24 |
nibalizer | AJaeger: nibz@bare-trusty-rax-iad-7675327:~$ ping6 us.archive.ubuntu.com | 13:25 |
nibalizer | PING us.archive.ubuntu.com(economy.canonical.com) 56 data bytes | 13:25 |
nibalizer | seems working? | 13:26 |
nibalizer | 64 bytes from economy.canonical.com: icmp_seq=1 ttl=50 time=13.3 ms | 13:26 |
nibalizer | 64 bytes from economy.canonical.com: icmp_seq=2 ttl=50 time=13.2 ms | 13:26 |
sdague | nibalizer: maybe it got resolved? | 13:26 |
AJaeger | Great! | 13:26 |
sdague | it was definitely failing for folks earlier | 13:26 |
nibalizer | https://status.rackspace.com/ doesn't show any issue | 13:26 |
*** jsavak has joined #openstack-infra | 13:26 | |
AJaeger | let's wait for yolanda to return from lunch and test - and then enable iad again... | 13:27 |
nibalizer | ok | 13:27 |
nibalizer | we could also bring it up to 10 nodes | 13:27 |
*** zeih has quit IRC | 13:27 | |
nibalizer | or something like that | 13:27 |
sdague | nibalizer: we know, see the conversation with johnthetubaguy earlier | 13:27 |
johnthetubaguy | sdague: it does seem to be fixed now | 13:27 |
sdague | johnthetubaguy: ok | 13:28 |
pabelanger | AJaeger: odyssey4me: So, easier to ask here. Is tox -elinters going to be a default now? | 13:28 |
sdague | johnthetubaguy: any idea why rax didn't notice it themselves? | 13:28 |
pabelanger | AJaeger: odyssey4me: for all projects? | 13:28 |
zigo | pabelanger: Hi there! Could you review this one again? https://review.openstack.org/#/c/264726/ | 13:28 |
bauzas | johnthetubaguy: sdague: cool, does that mean we can recheck ? | 13:28 |
johnthetubaguy | sdague: it seems to have been isolated to ubuntu, AFAIK, and all our images use our mirror | 13:28 |
AJaeger | pabelanger: this would need some more discussion - we have enabled it as option and let teams choose... | 13:28 |
odyssey4me | pabelanger it's an option available, but won't be forced - many projects are consolidating linting under the pep8 tox target, but 'linters' seems more appropriate for our needs | 13:28 |
johnthetubaguy | sdague: I don't think we have done any changes on our side | 13:28 |
johnthetubaguy | sdague: but honestly, not 100% sure | 13:29 |
*** gordc has joined #openstack-infra | 13:29 | |
johnthetubaguy | sdague: the reply to my ticket was "hmm, this sounds familiar, let me go check with someone" | 13:29 |
*** lucas-hungry is now known as lucasagomes | 13:29 | |
*** ihrachys has quit IRC | 13:30 | |
AJaeger | sdague, nibalizer, yolanda : If you want to +2A the revert: https://review.openstack.org/#/c/275666/1 | 13:30 |
sdague | +2 | 13:30 |
nibalizer | +2A | 13:31 |
openstackgerrit | Andreas Jaeger proposed openstack-infra/project-config: Reduce timeout of ironic jobs https://review.openstack.org/272481 | 13:31 |
johnthetubaguy | yeah, the VM I reproduced this on seems to have fully recovered, sorry about that, not sure why, I have asked support to let me know what changed | 13:31 |
pabelanger | zigo: 1 question, left in comments | 13:32 |
pabelanger | zigo: otherwise, code looks good. Just reviewing build log | 13:32 |
*** zeih has joined #openstack-infra | 13:32 | |
AJaeger | sdague: could you review 272481 and https://review.openstack.org/273010 as well, please? - two small cleanups... | 13:32 |
*** ihrachys has joined #openstack-infra | 13:32 | |
sdague | I don't understand why the timeout drops | 13:33 |
*** zeih has quit IRC | 13:34 | |
*** zeih has joined #openstack-infra | 13:34 | |
AJaeger | sdague: It was just a copy from earlier jobs in that file - and those use 120... | 13:34 |
zigo | pabelanger: Cheers ! | 13:34 |
nibalizer | infra-root taking the hiera lock | 13:35 |
* AJaeger wants to stop the cargocult of copying always 2hours there... | 13:35 | |
*** mrmartin has joined #openstack-infra | 13:35 | |
sdague | AJaeger: sure, but plenty of headroom is fine | 13:35 |
sdague | when people over optimize this stuff, they then add tests, and start random failing over the timeout | 13:35 |
sdague | and keep rechecking because they don't realize what's going on | 13:36 |
pabelanger | zigo: no issues uploading the images to rax or ovh? | 13:36 |
sdague | keeping a big buffer in there is fine | 13:36 |
pabelanger | zigo: eg: networking works | 13:36 |
*** zeih_ has joined #openstack-infra | 13:36 | |
sdague | the big buffer only hurts if tests are deadlocking regularly, and we need to whack things because of it | 13:36 |
nibalizer | infra-root removing lock | 13:36 |
zigo | pabelanger: httpredir.debian.org redirects to the closest mirror, why is it then a problem to not use ftp.us.debian.org? It will normally select best location ... | 13:36 |
AJaeger | sdague: I thought our default is 60 mins - and those jobs never run longer than 60 mins (rarely 50), so I put 70 in... | 13:37 |
sdague | honestly I'd almost refactor the per job timeout into a global thing for 3 hours, and be fine with it. | 13:37 |
zigo | pabelanger: If the selected mirror is wrong, we can always ask Raphael Guesler (the maintainer of the httpredir.debian.org service) to fix it. | 13:37 |
sdague | right, but the only reason you care about job timeout is deadlock | 13:37 |
sdague | the job is wedged so that it will never end | 13:37 |
sdague | that is *such* a rare occurrence, you just want a backstop so it can't happen | 13:37 |
pabelanger | zigo: mostly because it is random. There have been issues with mirrors in the past, pinning it to a specific mirror just helps troubleshooting. Once we have AFS mirrors, it won't be an issue. | 13:37 |
sdague | because it would get stuck forever | 13:38 |
AJaeger | sdague: Indeed - but isn't it also a question of not having too long running jobs? | 13:38 |
zigo | pabelanger: Though this way, it's going to select the US mirrors even in France with OVH, that's bad, no? | 13:38 |
*** edmondsw has joined #openstack-infra | 13:38 | |
zigo | pabelanger: Also, "ftp.us.debian.org" is a moving target, it's not a single repo, as much as I know. | 13:38 |
sdague | AJaeger: what is going to happen if the job starts tripping a timeout? | 13:38 |
zigo | pabelanger: Therefore, which mirror do you suggest? | 13:39 |
*** daemontool has quit IRC | 13:39 | |
sdague | someone is going to push a timeout bump, and we're going to accept it | 13:39 |
sdague | so reducing timeouts is all just random busywork | 13:39 |
zigo | pabelanger: ftp.fr.debian.org is quite fast, and hosted by a serious company (the 2nd largest ADSL provider in France), and it uses a cluster of servers in the backend. | 13:39 |
pabelanger | zigo: In theory, either way just a comment, not a -1. You can default to httpredir for now, but keep in mind of the pinning. I'll let other reviewers comment too | 13:39 |
*** zeih has quit IRC | 13:40 | |
zigo | pabelanger: I trust you, and I prefer to change the patch to do what you suggest. I'm just trying to find the best way here. | 13:40 |
zigo | pabelanger: So, would ftp.fr.debian.org work? | 13:40 |
AJaeger | sdague: Ok, let me double check with others later. I'll mark the change as WIP for now - and might change my review pattern... | 13:40 |
AJaeger | thanks, sdague | 13:40 |
zigo | (it's very close for OVH...) | 13:40 |
pabelanger | zigo: That's the thing, I don't know which mirror is the best. I'm just saying we might want to pick one for all clouds. Last I checked, we pinned everything to a US mirror. But, I might be wrong. | 13:41 |
*** mase_x200 has quit IRC | 13:42 | |
pabelanger | zigo: Either way, I don't see an issue trying out httpredir to start with | 13:42 |
zigo | pabelanger: As much as I know, for Ubuntu, it's easier to choose, as Canonical sponsors them and make sure they are in good shape. For Debian, it's all sponsored ... | 13:42 |
zigo | pabelanger: Ok. then +2 the patch? :) | 13:42 |
pabelanger | zigo: can't, not core | 13:43 |
zigo | oh ... | 13:43 |
pabelanger | just providing feedback | 13:43 |
zigo | pabelanger: Thanks a lot. | 13:44 |
zigo | pabelanger: I was wondering, could you point me to a patch which adds a check job, so that I can write my build script env script + build packages? | 13:45 |
zigo | Or to the relevant doc if any... | 13:46 |
*** annegentle has joined #openstack-infra | 13:46 | |
*** tiswanso has quit IRC | 13:46 | |
yolanda | hi, so rax working again? | 13:46 |
*** vivekd has quit IRC | 13:46 | |
pabelanger | zigo: look into jenkins/jobs/infra.yaml, they are pretty decent. gate-openstackci-beaker-{ostype}-dsvm provides a good mix of JJB syntax | 13:48 |
AJaeger | yolanda: we hope so;) | 13:48 |
zigo | pabelanger: Cheers. | 13:48 |
*** Jeffrey4l has quit IRC | 13:48 | |
yolanda | we didn't take any action from infra side to solve? | 13:48 |
smcginnis | AJaeger: Thanks. That's really odd. It doesn't show up in my client, but there it is in the eavesdrop. Huh. | 13:49 |
pabelanger | zigo: I did have some code up a while ago for puppet: https://review.openstack.org/#/c/185680/6/modules/openstack_project/manifests/slave_pkg.pp but abandoned it. | 13:49 |
*** Jeffrey4l has joined #openstack-infra | 13:49 | |
pabelanger | zigo: all depends where you want things to live I guess. Adding into JJB is a good start | 13:49 |
nibalizer | yolanda: i didn't do anything | 13:50 |
nibalizer | i logged into your held node and ipv6 seems to work | 13:50 |
nibalizer | and johnthetubaguy says rax says that its fine | 13:50 |
yolanda | well, that's nice | 13:50 |
*** trown is now known as trown|afk | 13:50 | |
pabelanger | nibalizer: the best kind of fixing things! | 13:50 |
*** fhubik is now known as fhubik_brb | 13:51 | |
*** infra-red has joined #openstack-infra | 13:52 | |
*** amitgandhinz has joined #openstack-infra | 13:53 | |
*** infra-re_ has quit IRC | 13:54 | |
*** claudiub has quit IRC | 13:55 | |
*** fhubik_brb is now known as fhubik | 13:56 | |
*** amotoki has joined #openstack-infra | 13:57 | |
AJaeger | yolanda: 275666 was approved and is in the queue - but not getting a node right now ;/ | 13:57 |
*** erlon has joined #openstack-infra | 13:57 | |
*** isaacb has joined #openstack-infra | 13:57 | |
*** ihrachys has quit IRC | 14:01 | |
sdague | AJaeger: yep, the joys of not enough nodes | 14:02 |
openstackgerrit | Spencer Krum proposed openstack-infra/system-config: Add omfracloud to nodepool https://review.openstack.org/275485 | 14:02 |
openstackgerrit | Spencer Krum proposed openstack-infra/system-config: Add OmfraCloud to puppetmaster_clouds https://review.openstack.org/275477 | 14:02 |
*** rossella_s has quit IRC | 14:02 | |
*** Kevin_Zheng has quit IRC | 14:02 | |
*** rossella_s has joined #openstack-infra | 14:03 | |
openstackgerrit | Spencer Krum proposed openstack-infra/system-config: Add omfracloud to nodepool https://review.openstack.org/275485 | 14:04 |
openstackgerrit | Spencer Krum proposed openstack-infra/project-config: Use omfracloud in nodepool https://review.openstack.org/275491 | 14:04 |
*** julim has quit IRC | 14:05 | |
*** annegentle has quit IRC | 14:06 | |
*** henrynash has joined #openstack-infra | 14:06 | |
*** alivigni has joined #openstack-infra | 14:06 | |
yolanda | nibalizer, i'm curious? why "omfracloud" ? :) | 14:06 |
pabelanger | yolanda: do you mind helping review: 275471 | 14:07 |
yolanda | pabelanger sure | 14:08 |
nibalizer | yolanda: it was a typo from jeblair I think, and it sortof stuck | 14:08 |
pavel_bondar | sdague: hi | 14:08 |
*** zeih_ has quit IRC | 14:08 | |
nibalizer | that patch is an appropriate place to say that we should call it infracloud | 14:08 |
*** skraynev has quit IRC | 14:08 | |
*** isaacb has quit IRC | 14:09 | |
*** zeih has joined #openstack-infra | 14:09 | |
*** _amrith_ is now known as amrith | 14:09 | |
yolanda | ah, much better with regions now | 14:09 |
yolanda | only 4 max-servers? | 14:09 |
openstackgerrit | Merged openstack-infra/project-config: Revert "Set max-servers to 0 to rax-iad" https://review.openstack.org/275666 | 14:10 |
yolanda | we have over 40 nodes in west? | 14:10 |
openstackgerrit | Marton Kiss proposed openstack-infra/groups: Fix broken map on frontpage https://review.openstack.org/275715 | 14:10 |
nibalizer | yolanda: just a number to get us started | 14:10 |
nibalizer | we probably don't want to go hog wild at the start | 14:10 |
*** daemontool has joined #openstack-infra | 14:11 | |
*** thiagop has joined #openstack-infra | 14:11 | |
*** yamamoto has joined #openstack-infra | 14:11 | |
openstackgerrit | Marton Kiss proposed openstack-infra/groups: Fix broken map on frontpage https://review.openstack.org/275715 | 14:13 |
*** kgiusti has joined #openstack-infra | 14:14 | |
pabelanger | good news everybody! python-grafyaml has been uploaded to fedora rawhide. That is all | 14:14 |
*** trown|afk is now known as trown | 14:16 | |
*** amotoki has quit IRC | 14:17 | |
*** dkranz has joined #openstack-infra | 14:18 | |
nibalizer | pabelanger: neat | 14:19 |
*** dprince has joined #openstack-infra | 14:20 | |
*** jistr|mtg is now known as jistr | 14:20 | |
*** amotoki has joined #openstack-infra | 14:21 | |
openstackgerrit | Merged openstack-infra/groups: Fix broken map on frontpage https://review.openstack.org/275715 | 14:22 |
*** tiswanso has joined #openstack-infra | 14:24 | |
*** peristeri has joined #openstack-infra | 14:25 | |
*** vgridnev_ has quit IRC | 14:27 | |
*** daemontool has quit IRC | 14:27 | |
*** vgridnev_ has joined #openstack-infra | 14:27 | |
*** sabeen3 has joined #openstack-infra | 14:28 | |
pabelanger | yolanda: fungi: Odd, another review that has +1 workflow and seems to be stuck: 275471 | 14:29 |
openstackgerrit | Merged openstack-infra/project-config: Remove jobs from gantt and python-ganttclient https://review.openstack.org/273010 | 14:30 |
fungi | pabelanger: why do you say that one's "stuck"? | 14:31 |
fungi | looks like it entered the gate 20 minutes ago when yolanda approved it | 14:31 |
pabelanger | fungi: wait, so why didn't zuul leave a comment? | 14:31 |
*** ihrachys has joined #openstack-infra | 14:31 | |
fungi | it did--that's how i know | 14:32 |
*** tiswanso has quit IRC | 14:32 | |
pabelanger | /facepalm | 14:32 |
pabelanger | toggle CI | 14:32 |
fungi | yep | 14:32 |
*** tiswanso has joined #openstack-infra | 14:32 | |
pabelanger | fungi: sorry for the noise | 14:32 |
fungi | np | 14:32 |
*** burgerk has quit IRC | 14:33 | |
*** skraynev has joined #openstack-infra | 14:35 | |
anteaya | clarkb: ah floating ip exhaustion, thank you | 14:36 |
*** Swami has joined #openstack-infra | 14:36 | |
anteaya | zaro: yes figuring out what is going on with gerrit is useful, thanks for the explanation | 14:36 |
anteaya | armax: I'm here now but you are probably not, have to run errands in about an hour that will likely take me most of my morning, I will look for you upon my return | 14:37 |
*** ihrachys has quit IRC | 14:37 | |
*** annegentle has joined #openstack-infra | 14:38 | |
*** annegentle has quit IRC | 14:38 | |
*** annegentle has joined #openstack-infra | 14:39 | |
*** regXboi has joined #openstack-infra | 14:41 | |
*** fhubik is now known as fhubik_brb | 14:43 | |
*** fhubik_brb is now known as fhubik | 14:43 | |
*** yamamoto has quit IRC | 14:43 | |
*** yaume_ has joined #openstack-infra | 14:44 | |
openstackgerrit | Merged openstack/requirements: Change OSprofiler constraints to >=1.0.0 https://review.openstack.org/275426 | 14:44 |
openstackgerrit | Merged openstack-infra/project-config: Always clone openstack/windmill for jobs https://review.openstack.org/275471 | 14:46 |
openstackgerrit | Mikhail S Medvedev proposed openstack-infra/ciwatch: Ability to select specific CI results https://review.openstack.org/274412 | 14:46 |
*** yamamoto has joined #openstack-infra | 14:47 | |
*** tongli has joined #openstack-infra | 14:47 | |
*** yaume has quit IRC | 14:47 | |
*** julim has joined #openstack-infra | 14:48 | |
*** fhubik is now known as fhubik_brb | 14:49 | |
*** fhubik_brb is now known as fhubik | 14:49 | |
*** anteaya has quit IRC | 14:49 | |
*** zeih has quit IRC | 14:51 | |
*** yamamoto_ has joined #openstack-infra | 14:52 | |
*** yamamoto has quit IRC | 14:52 | |
*** anteaya has joined #openstack-infra | 14:52 | |
*** annegentle has quit IRC | 14:53 | |
*** Daisy has quit IRC | 14:53 | |
*** Daisy has joined #openstack-infra | 14:54 | |
*** mriedem has joined #openstack-infra | 14:54 | |
*** jsavak has quit IRC | 14:55 | |
anteaya | would it make sense to have the mirrors, such as this one: http://mirror.iad.rax.openstack.org/wheel/, in cacti? | 14:55 |
*** Daisy has quit IRC | 14:55 | |
*** jsavak has joined #openstack-infra | 14:56 | |
fungi | i thought they were | 14:56 |
* anteaya looks again | 14:56 | |
fungi | looks like their predecessors are present | 14:57 |
*** fhubik is now known as fhubik_brb | 14:57 | |
krotscheck1 | fungi: http://cacti.openstack.org/cacti/graph_view.php?action=tree&tree_id=1&leaf_id=186 | 14:57 |
anteaya | so they are | 14:57 |
fungi | i guess the migration involved building replacements | 14:57 |
anteaya | sorry | 14:57 |
*** yamamoto_ has quit IRC | 14:57 | |
fungi | so anyway, yeah if there's not already a change in flight to take those out of cacti and add the new ones, then we'll get one in shortly | 14:59 |
krotscheck1 | Does that particular instance have a cinder volume attached? It seems like disk usage has been creeping up for the last few days, and I don't know of anything that would do that other than the mirror cache. | 14:59 |
*** pradk has joined #openstack-infra | 14:59 | |
krotscheck1 | I _think_ it does. | 14:59 |
krotscheck1 | (because it has the /cache mount.) | 14:59 |
krotscheck1 | But that one's not moving, while the on-disk is. | 15:00 |
anteaya | oh, I guess I don't know what the new ones are | 15:00 |
fungi | oh, wait the new ones are in the list | 15:00 |
fungi | along with the old ones | 15:00 |
krotscheck1 | mirror.* are the new ones. pypi.* are the old ones. | 15:00 |
fungi | krotscheck1 just linked to one | 15:00 |
* fungi needs to get his eyes checked | 15:00 | |
krotscheck1 | We're keeping the old ones around just in case the world explodes. | 15:00 |
anteaya | krotscheck1: ah | 15:00 |
anteaya | fungi: glad I'm not the only one with morning eyes | 15:00 |
krotscheck1 | Because if the world explodes, we want to have a backup mirror. | 15:00 |
krotscheck1 | For... urm... space mirrors? | 15:01 |
anteaya | is that what we want? | 15:01 |
*** yamamoto has joined #openstack-infra | 15:01 | |
krotscheck1 | mirrors in space? | 15:01 |
anteaya | they can reflect light | 15:01 |
anteaya | so that's a plus | 15:01 |
* krotscheck1 feels like that needs a soundtrack. | 15:01 | |
krotscheck1 | Mirrors. IiIIIIN SPAAAAACE! | 15:01 |
anteaya | oh so many muppet refereneces | 15:02 |
anteaya | references | 15:02 |
*** fhubik_brb is now known as fhubik | 15:02 | |
fungi | krotscheck1: yeah, the afs cache is mounted at /var/cache/openafs on a logical volume from a cinder-supplied block device | 15:03 |
krotscheck1 | fungi: Gotcha. So how come the / disk is slowly increasing? | 15:03 |
fungi | good question. /var/log? | 15:03 |
krotscheck1 | fungi: Is it just misreporting the /afs content? | 15:03 |
fungi | my guess would be apache access logs, since every single access to the mirror is logged | 15:04 |
fungi | 8.2gb of apache logs so far | 15:04 |
krotscheck1 | fungi: Do I need to add logrotate to that or will apache handle it itself? | 15:04 |
krotscheck1 | Ah | 15:04 |
krotscheck1 | That would be it. | 15:04 |
anteaya | krotscheck1: apache's default is 52 weeks of logs | 15:05 |
anteaya | I'm in favour of logrotate | 15:05 |
fungi | looks like it's rotating weekly | 15:05 |
anteaya | oh, awesome | 15:05 |
*** Daisy has joined #openstack-infra | 15:05 | |
fungi | previous log rotated on february 1 | 15:05 |
*** dprince has quit IRC | 15:05 | |
fungi | current log size is 6gb for the past couple days worth of requests | 15:05 |
* krotscheck1 wonders how many copies of war & peace that is. | 15:06 | |
*** zeih has joined #openstack-infra | 15:06 | |
fungi | as for why the afs cache is not being shown in cacti, it's possible snmpd was not restarted after that filesystem was mounted | 15:06 |
fungi | i'll restart snmpd on that host now and see if it pops up in the graphs soon | 15:06 |
krotscheck1 | fungi: I see it. | 15:07 |
krotscheck1 | fungi: It's only about 1GB though | 15:07 |
krotscheck1 | Which I find AMAZING | 15:07 |
krotscheck1 | Of all the terabytes of things that we mirror. Actual things we use? A fraction. | 15:07 |
fungi | oh, indeed, it was already there | 15:07 |
fungi | back to needing to get my eyes checked... | 15:07 |
*** fhubik is now known as fhubik_brb | 15:07 | |
anteaya | off running errands back in a few hours | 15:08 |
*** henrynash has quit IRC | 15:08 | |
*** amotoki has quit IRC | 15:08 | |
*** infra-red has quit IRC | 15:08 | |
AJaeger | yolanda, fungi: The nodepool puppet run to re-enable iad has not happened yet, do you want to kick it somehow or wait? | 15:09 |
yolanda | AJaeger, it merged some time ago right? | 15:10 |
*** Daisy has quit IRC | 15:10 | |
*** eharney has joined #openstack-infra | 15:10 | |
fungi | what happened in iad? | 15:10 |
AJaeger | yolanda: yes | 15:10 |
yolanda | devstack nodes were failing due to ipv6 problems | 15:10 |
AJaeger | fungi: all dsvm jobs failed | 15:11 |
yolanda | AJaeger, i'll update manually until puppet lands | 15:11 |
fungi | oh, rackspace announced they were having a network issue in iad? | 15:11 |
AJaeger | fungi: no, they just broke it - and suddenly it worked again... | 15:12 |
yolanda | we held a node and detected those ipv6 errors, and then after a while it started to work again | 15:12 |
yolanda | we were in contact with johnthetubaguy for that, he opened a ticket | 15:12 |
fungi | yeah, https://status.rackspace.com/ doesn't indicate anything other than some issues with monitoring | 15:12 |
*** annegentle has joined #openstack-infra | 15:13 | |
*** jed56 has joined #openstack-infra | 15:13 | |
jeblair | o/ | 15:15 |
johnthetubaguy | fungi: it didn't seem like support changed anything for me, but it certainly appeared to be IAD specific somehow, but also seemed a little ubuntu specific somehow | 15:16 |
*** zeih has quit IRC | 15:16 | |
*** boris-42 has joined #openstack-infra | 15:16 | |
johnthetubaguy | it seemed like I had access to most ipv6 things, just not ubuntu + ipv6 | 15:16 |
*** kurtmartin has joined #openstack-infra | 15:17 | |
fungi | are the symptoms mentioned in scrollback here? i'll read through in a bit | 15:17 |
yolanda | fungi, jeblair, it's strange that i haven't seen puppet run on nodepool in more than one hour | 15:17 |
yolanda | and it's not disabled | 15:17 |
fungi | yolanda: is it running elsewhere? did you look at the run_all log on the puppetmaster? | 15:18 |
*** dtantsur is now known as dtantsur|brb | 15:18 | |
* fungi needs to step away from the computer for a few minutes, brb | 15:19 | |
*** mtanino has joined #openstack-infra | 15:19 | |
sigmavirus24_awa | dims: you're welcome :) | 15:19 |
sdague | fungi: http://status.openstack.org//elastic-recheck/#1541364 is the bug that catches most of it | 15:22 |
sdague | basically apt-get update timed out | 15:22 |
*** jsavak has quit IRC | 15:22 | |
sdague | trying to get openvswitch installed in d-g | 15:22 |
yolanda | fungi, i see run_all running regularly on puppetmaster | 15:22 |
*** jsavak has joined #openstack-infra | 15:23 | |
yolanda | going to check run_all , i see some entries for nodepool but need to check the time | 15:23 |
*** ihrachys has joined #openstack-infra | 15:23 | |
yolanda | oh, it's difficult to grok last time puppet ran on nodepool with that log | 15:24 |
*** zeih has joined #openstack-infra | 15:24 | |
*** sigmavirus24_awa is now known as sigmavirus24 | 15:24 | |
*** annegentle has quit IRC | 15:24 | |
yolanda | last entry in reports for nodepool is 2 hours ago | 15:25 |
*** fhubik_brb is now known as fhubik | 15:26 | |
jeblair | caught up with scrollback | 15:29 |
jeblair | syslog on nodepool looks really weird with dib running | 15:30 |
*** rbrndt_ has joined #openstack-infra | 15:31 | |
jeblair | Feb 3 15:16:16 nodepool corvus: os-prober: debug: running /usr/libexec/os-probes/mounted/90solaris on mounted /dev/mapper/main-opt | 15:31 |
jeblair | i definitely did not run that | 15:31 |
AJaeger | jeblair: did you install new kernel? | 15:31 |
AJaeger | Or grub? | 15:31 |
jeblair | AJaeger: that's from dib | 15:31 |
openstackgerrit | Daniel Wallace proposed openstack-infra/shade: granting and revoking privs to users and groups https://review.openstack.org/268404 | 15:32 |
openstackgerrit | Doug Hellmann proposed openstack-infra/project-config: add a job to send automated announcements of releases https://review.openstack.org/272767 | 15:32 |
*** Swami_ has joined #openstack-infra | 15:32 | |
*** esker has joined #openstack-infra | 15:32 | |
jeblair | yolanda: it looks like ansible is running puppet on elasticsearch07.openstack.org. | 15:33 |
jeblair | yolanda: that's based on running ps | 15:33 |
yolanda | and is stuck there? | 15:34 |
jeblair | yeah, from 13:50 utc | 15:34 |
openstackgerrit | Valeriy Ponomaryov proposed openstack-infra/project-config: Add new experimental job for manila-image-elements project https://review.openstack.org/275682 | 15:34 |
yolanda | why does it take so long? | 15:34 |
jeblair | yolanda: i think es07 rebooted | 15:34 |
*** zeih has quit IRC | 15:34 | |
jeblair | i don't see any indication as to why | 15:36 |
yolanda | i applied the max-servers change manually to nodepool until we wait | 15:36 |
*** Swami has quit IRC | 15:36 | |
jeblair | http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=1834&rra_id=all | 15:36 |
*** zeih has joined #openstack-infra | 15:36 | |
jeblair | possibly something wrong with the nova host it was on, though we usually get emails about that | 15:37 |
fungi | sounds similar to the 2+ hour timeout we had on ansible trying to reach the hpcloud pypi mirror when it finally went offline | 15:39 |
*** doug-fish has joined #openstack-infra | 15:39 | |
fungi | it's possible we may want to figure out how to shorten that timeout | 15:40 |
yolanda | jeblair, wanted to ask you, how can i receive those alerts in email? i'm not receiving them | 15:40 |
*** mrmartin has quit IRC | 15:40 | |
fungi | yolanda: rackspace only e-mails it to the account "owner" (which for one of our accounts is mordred i think, pvo is the owner of one of our accounts as well, not sure about the third--would have to check) | 15:41 |
jeblair | it's possible i'm not an owner of this account, but maybe one of our owners did get an email | 15:41 |
*** Sukhdev has joined #openstack-infra | 15:41 | |
jeblair | yolanda: i'm going to try killing pid 20357 on puppetmaster | 15:41 |
fungi | i've been curious whether we can switch those to a role address somewhere, but i haven't looked into it | 15:41 |
yolanda | jeblair ok | 15:42 |
jeblair | ansible is moving again | 15:42 |
yolanda | fungi looks like a good idea, having some generic address we can all receive | 15:42 |
*** dimtruck has joined #openstack-infra | 15:42 | |
*** annegentle has joined #openstack-infra | 15:43 | |
*** zul has quit IRC | 15:46 | |
*** EricGonczer_ has joined #openstack-infra | 15:51 | |
*** salv-orl_ has joined #openstack-infra | 15:54 | |
*** clif_h has joined #openstack-infra | 15:55 | |
clif_h | is it still possible to view our submitted talks to the austin summit? | 15:55 |
openstackgerrit | Igor Belikov proposed openstack-infra/project-config: Add 'create reference' to fuel-plugin-detach-* ACL https://review.openstack.org/275780 | 15:56 |
*** salv-orlando has quit IRC | 15:57 | |
*** burgerk has joined #openstack-infra | 15:59 | |
*** annegentle has quit IRC | 15:59 | |
*** annegentle has joined #openstack-infra | 16:00 | |
*** zeih has quit IRC | 16:00 | |
*** jgriffith_away is now known as jgriffith | 16:00 | |
*** jcoufal has quit IRC | 16:00 | |
prometheanfire | https://review.openstack.org/#/c/273790/ doesn't seem to be going through workflow even with rechecks | 16:01 |
*** gildub has quit IRC | 16:02 | |
*** vgridnev_ has quit IRC | 16:02 | |
*** peristeri has quit IRC | 16:03 | |
*** lennyb has quit IRC | 16:03 | |
*** dtantsur|brb is now known as dtantsur | 16:04 | |
*** zeih has joined #openstack-infra | 16:04 | |
*** esker has quit IRC | 16:05 | |
*** woodster_ has joined #openstack-infra | 16:05 | |
*** sshnaidm has quit IRC | 16:05 | |
*** esker has joined #openstack-infra | 16:05 | |
*** hdd has joined #openstack-infra | 16:06 | |
openstackgerrit | Illia Khudoshyn proposed openstack-infra/project-config: Add gate job for Rally against Keystone v2 https://review.openstack.org/274668 | 16:07 |
jeblair | prometheanfire: it needs to be reapproved because it depends-on a patch that it doesn't share a gate queue with | 16:08 |
*** zeih has quit IRC | 16:09 | |
prometheanfire | don't think it depends on a patch anymore | 16:10 |
prometheanfire | it used to, but now doesn't with other changes having already gone in | 16:10 |
*** infra-red has joined #openstack-infra | 16:11 | |
jeblair | nibalizer: i think i'd prefer infracloud in puppet/nodepool | 16:12 |
jeblair | rather than omfracloud | 16:12 |
*** rajinir has joined #openstack-infra | 16:12 | |
crinkle | yolanda: nibalizer i have been using very small unencrypted keys to sign these certs, i really want a rooter to regenerate them officially, so i don't want to use my uswest root ca to sign certs for east | 16:13 |
clarkb | jeblair: https://review.openstack.org/#/c/275483/1 is the fix for the zuul leak | 16:13 |
jeblair | clarkb: yeah, i read and responded to the email, thanks | 16:13 |
clarkb | should I go ahead and approve the change or do you want to review it? | 16:14 |
jeblair | clarkb: i was planning on reviewing it | 16:14 |
clarkb | ok | 16:14 |
zigo | clarkb: jeblair: Hi there! Would you mind reviewing https://review.openstack.org/#/c/264726/ ? | 16:15 |
*** lennyb has joined #openstack-infra | 16:16 | |
fungi | crinkle: i suspect we'll be using unencrypted keys regardless, but they'll be larger and presumably we'll be managing them through hiera? | 16:17 |
jeblair | krotscheck1, fungi: what's the latest on wheel building slaves? | 16:17 |
fungi | jeblair: i have not started building the builders | 16:17 |
yolanda | fungi jeblair, how can we proceed with the certs for infra-cloud? if we want to start using them in nodepool, i think we first need a trusted ca, so we can just use the same cert for both regions | 16:19 |
cody-somerville | jeblair: clarkb: Would it make sense to add a test along with that fix? | 16:20 |
*** exploreshaifali has joined #openstack-infra | 16:20 | |
openstackgerrit | ice4o@hotmail.com proposed openstack-infra/jenkins-job-builder: Add HockeyApp Plugin support. https://review.openstack.org/275304 | 16:20 |
*** ihrachys has quit IRC | 16:20 | |
clarkb | zigo: ya I can try building a debian image locally with that | 16:21 |
clarkb | zigo: the curl thing is curious because it should use the build system's curl not the chroots iirc | 16:21 |
fungi | yolanda: when we discussed continuing to use self-signed/snakeoil certs for the api endpoint, i don't recall whether anyone identified how we were going to tackle that on the nodepool end. add the cert to the local trust on the server in /etc/ssl? will that work for nodepool's use case? | 16:21 |
yolanda | fungi, if we want to use it on nodepool, we need to add the ca cert to cloud configuration | 16:22 |
zigo | clarkb: It's still nice to have either curl or wget available, IMO. | 16:22 |
fungi | yolanda: as in add the path to that cert in our clouds.yaml? | 16:22 |
clarkb | zigo: sure I just don't think it is necessary there | 16:22 |
yolanda | fungi yes | 16:23 |
clarkb | zigo: also https://review.openstack.org/#/c/275323/ will break ubuntu I think | 16:23 |
fungi | yolanda: okay, that's pretty straightforward then | 16:23 |
yolanda | crinkle , why do you prefer an official rooter? | 16:23 |
*** shashank_hegde has joined #openstack-infra | 16:23 | |
clarkb | there is no linux-headers-amd64 on ubuntu | 16:23 |
crinkle | yolanda: i want to make sure it's done by someone who knows what they're doing and that it's managed the same way as every other https service | 16:24 |
*** matrohon has quit IRC | 16:24 | |
fungi | yolanda: we're going to need to put it in hiera and she doesn't have access to the hiera files, so someone in infra-root may as well regenerate it while adding | 16:24 |
yolanda | fungi i think spencer has been doing that already | 16:25 |
fungi | sounds good | 16:25 |
clarkb | still trying to track down how that all plays out, the branching there isn't well documented | 16:25 |
yolanda | but i wanted to use same ca for east/west, so we don't have two files. Also concerns that crinkle says sound legit | 16:25 |
fungi | yolanda: he's an infra-root, so if needed he can generate a fresh and stronger keypair | 16:25 |
*** dtantsur is now known as dtantsur|afk | 16:25 | |
jeblair | cody-somerville: do you have a suggestion on how to structure it? | 16:25 |
zigo | clarkb: In Ubuntu, linux-headers-amd64 is a virtual package (ie: only Provides: by some other packages) | 16:26 |
clarkb | zigo: I may be mistaken that ubuntu takes that branch at all too. Checking with facter | 16:26 |
*** sc68cal has joined #openstack-infra | 16:26 | |
*** hashar has quit IRC | 16:26 | |
zigo | clarkb: It's just in a "if ($::operatingsystem == 'Debian') {" thing, so it should be fine. | 16:26 |
clarkb | zigo: yup I am mistaken, there are two levels of debianness there, one that applies to ubuntu and one that doesn't | 16:27 |
zigo | clarkb: "operatingsystem" is really the name of the os, not the family of os. | 16:27 |
jeblair | krotscheck1: do you have a hostname preference for the wheel building slave? if not, how about wheel-mirror-trustyx64.slave.o.o ? | 16:27 |
yolanda | if we go with self-signed, i'd prefer to have a single CA for both regions, and regenerate certs based on that | 16:27 |
jeblair | yolanda: ++ | 16:27 |
fungi | yolanda: easy enough to do in hiera too | 16:27 |
crinkle | fungi: yolanda i didn't realize we were going to continue to use self-signed certs, that will mean we need another configuration step in puppet to add the ca chain to the trusted certs on the host, because puppet needs to access these endpoints | 16:27 |
fungi | crinkle: according to yolanda we can just puppet it somewhere on the machine and then add the path to it in our clouds.yaml | 16:28 |
jeblair | crinkle: i don't understand "puppet needs to access these endpoints" ? | 16:28 |
crinkle | fungi: jeblair puppet itself makes API calls and it doesn't use --insecure or --cacert | 16:28 |
jeblair | crinkle: got it | 16:28 |
fungi | ohh... | 16:29 |
yolanda | fungi, that's for nodepool side | 16:29 |
fungi | sorry, i was thinking you meant ansible needed it for inventorying before calling puppet | 16:29 |
jeblair | like nibalizer's change to do that for nodepool | 16:29 |
crinkle | so i had been building this with the assumption that the CA would be signed | 16:29 |
*** dtardivel has joined #openstack-infra | 16:29 | |
yolanda | but yes, crinkle is right, that we need to add the config in puppet for infracloud as well | 16:29 |
fungi | ansible is tackled easily by the clouds.yaml addition, but puppet itself less so | 16:29 |
*** dizquierdo has quit IRC | 16:30 | |
*** ifarkas has quit IRC | 16:30 | |
jeblair | well, it's not hard to drop it into the host's trusted certs, right? | 16:30 |
fungi | /etc/ssl/certs/*splat* | 16:30 |
cody-somerville | jeblair: Will need to look a bit more to see if we have the right data types used, but if # of cache objects is deterministic I was thinking of using gc module to count how many cache objects there are before and after reload or something like that. | 16:30 |
yolanda | it will need some changes in puppet for infra cloud, to place the cacert in trusted hosts | 16:30 |
yolanda | i wasn't successful with my tries in east, but crinkle got it working on west.. so it will be a matter of puppetizing that properly | 16:31 |
jeblair | cf https://review.openstack.org/275485 | 16:31 |
fungi | yolanda: presumably just adding a file resource with content supplied from hiera? | 16:31 |
fungi | or does puppet not actually trust what's in /etc/ssl/certs? | 16:31 |
jeblair | fungi, yolanda, crinkle: like that patch and its parent ^ | 16:31 |
yolanda | fungi, when i tested, i had complaints about openstack client, not trusting it, but i bet i did something wrong when generating the cacert | 16:32 |
jeblair | krotscheck1: also, did you add logrotate to the mirrors? | 16:32 |
yolanda | crinkle, you just added the ca to /etc/ssl/certs and worked for you? | 16:32 |
*** bpokorny has joined #openstack-infra | 16:32 | |
crinkle | i was never really successful with just dropping it in /etc/ssl/certs, it seemed like an update-ca-certificates needs to be run as well | 16:32 |
clarkb | fungi: yolanda the openstack clients use python requests which ignores the system certs list by default. This means you either have to set env vars to make it trust a specific path or pass certs directly iirc | 16:33 |
clarkb | crinkle: even that isn't sufficient with python-requests now that they vendor their own certs | 16:33 |
yolanda | clarkb, looks like i was hitting that | 16:33 |
jeblair | this is the worst thing i've heard | 16:33 |
jeblair | and i've heard a lot of bad things in this channel | 16:33 |
yolanda | clarkb, but i didn't find any way to pass the path with puppet manifests. crinkle, do you know about it? | 16:34 |
clarkb | jeblair: they have chosen to do this to reduce the number of bugs around differing system cert lists or something | 16:34 |
fungi | i'm sure i can come up with something worse, but lack time to devote to such a task ;) | 16:34 |
cody-somerville | jeblair: though just checking that the old stuff is deleted might be better in case there are other "leaks" of the cache object in the reload codepath. | 16:34 |
yolanda | i ended up tweaking the files manually to make it work, but i didn't spend so much time on that | 16:34 |
yolanda | also, seems that puppet-openstack modules don't allow passing a --insecure flag easily to just ignore the problems | 16:34 |
crinkle | clarkb: that doesn't sound right, how does anything auth against https without passing a cert path? | 16:34 |
krotscheck1 | jeblair: In a meeting, sec. | 16:34 |
*** ihrachys has joined #openstack-infra | 16:34 | |
clarkb | crinkle: because most things work with their vendored list | 16:35 |
krotscheck1 | jeblair: I think no on the logrotate | 16:35 |
*** diana_clarke has joined #openstack-infra | 16:35 | |
*** bpokorny has quit IRC | 16:35 | |
yolanda | crinkle but you have your controller working with https and self-signed cert , how did you fix that auth problem? | 16:35 |
fungi | so it's sounding like it might make more sense to reverse direction on the original snakeoil decision if we need to hack around multiple disparate systems each with their own idea of how to extend trust | 16:35 |
crinkle | clarkb: aha, well this is another reason to get a real cert then | 16:35 |
*** bpokorny has joined #openstack-infra | 16:36 | |
fungi | however i still think if it's not too hard to solve, it may be of benefit to the wider community who may want to try to stand up a similar test environment without bothering with an external ca | 16:36 |
crinkle | yolanda: i have no idea | 16:36 |
jeblair | well | 16:36 |
jeblair | let's see how hard it is to tell requests to just use system certs | 16:37 |
crinkle | yolanda: the uswest cloud seems to be working but recently i've been having a ton of trouble in my dev env with it | 16:37 |
clarkb | crinkle: yolanda well the other side of this is that python on trusty doesn't use ssl properly :) | 16:37 |
*** kzaitsev_mb has quit IRC | 16:37 | |
yolanda | my experience with self-signed is that some calls worked, but the ones based on openstack client failed | 16:37 |
clarkb | which is where the whole insecure platform warning thing comes from | 16:37 |
crinkle | yolanda: neutron client behaves the same i think | 16:37 |
openstackgerrit | Devananda van der Veen proposed openstack-infra/project-config: Allow ironic jobs to run with tempest plugin https://review.openstack.org/265311 | 16:38 |
yolanda | yep, failures in keystone and neutron | 16:38 |
Swanson | Trying to get a nodepool node to actually run cinder tests. I'm failing due to network issues. Getting an error running the nova bridge stuff. Address already in use. This seen? Easy fix? | 16:38 |
jeblair | ah here we go: http://docs.python-requests.org/en/v1.0.0/user/advanced/#ssl-cert-verification | 16:38 |
cody-somerville | This might be a stupid question but would it be possible to use https://letsencrypt.org for this? | 16:38 |
*** diana_clarke has quit IRC | 16:39 | |
*** yumapath has joined #openstack-infra | 16:39 | |
yolanda | ah, also rapidssl was offering some free certs? | 16:39 |
openstackgerrit | sebastian marcet proposed openstack-infra/openstackid-resources: Summit Application API https://review.openstack.org/221964 | 16:39 |
jeblair | REQUESTS_CA_BUNDLE is the var we set | 16:39 |
jeblair | cody-somerville: that will likely be much more work | 16:39 |
clarkb | Swanson: that's a known issue if you reuse devstack gate machines. There was a fix pushed which I thought I approved a while back | 16:40 |
*** mriedem is now known as mriedem_afk | 16:40 | |
crinkle | yesterday my puppet kept complaining about ssl, today it magically works /tableflip | 16:40 |
*** ashleighfarnham has joined #openstack-infra | 16:41 | |
Clint | groundhog day error | 16:41 |
jeblair | crinkle: do you think you can try setting REQUESTS_CA_BUNDLE when running puppet and see if it can make the api calls it needed | 16:41 |
jeblair | crinkle: oh | 16:41 |
jeblair | that does make experimentation harder | 16:41 |
crinkle | indeed | 16:41 |
crinkle | but maybe yolanda can try | 16:41 |
Swanson | clarkb, I see. Well, I'll fire up a new one. | 16:42 |
yolanda | crinkle, i don't have my nodes on a healthy state now, because i was playing with bridge, but i can try to fix and setup tomorrow morning | 16:42 |
* jeblair picks up ssl certs from the floor | 16:42 | |
krotscheck1 | jeblair: Yes, I think there was a specific name required for the wheel. sec if I can find it | 16:42 |
clarkb | Swanson: yes should be corrected with https://review.openstack.org/#/c/239363/6 | 16:42 |
*** julim has quit IRC | 16:42 | |
Swanson | clarkb, Thanks! | 16:43 |
*** Swami__ has joined #openstack-infra | 16:43 | |
*** julim has joined #openstack-infra | 16:44 | |
*** tonyb has quit IRC | 16:45 | |
fungi | cody-somerville: yolanda: the goal of using snakeoil certs in these cases is to eliminate unnecessary dependence on an external party. if the goal were merely to "save a few bucks" then we wouldn't be having this discussion | 16:45 |
*** armax has joined #openstack-infra | 16:46 | |
*** tonyb has joined #openstack-infra | 16:46 | |
clarkb | fungi: responded to comment on https://review.openstack.org/#/c/274821/1 with example runs that fail in the same way on ovh today using gre | 16:46 |
*** kevinbenton has quit IRC | 16:46 | |
*** mahatic_ has quit IRC | 16:46 | |
*** mugsie has quit IRC | 16:46 | |
*** mugsie has joined #openstack-infra | 16:46 | |
fungi | clarkb: thanks, i was going to look for some but hadn't found the time yet | 16:47 |
*** Swami_ has quit IRC | 16:47 | |
*** mahatic has joined #openstack-infra | 16:47 | |
clarkb | fungi: I am fairly confident that that is a fail in openstack and not with the overlay network | 16:47 |
yolanda | i can give a try tomorrow morning | 16:47 |
*** NikitaKonovalov has quit IRC | 16:47 | |
crinkle | yolanda: actually now that i think about it i haven't had problems with keystone, because keystone's ssl is being managed by apache | 16:47 |
fungi | clarkb: i suspected as much, which is why i had +2'd it anyway | 16:48 |
yolanda | crinkle i've had with keystone auth | 16:48 |
crinkle | yolanda: i wonder what's different :/ | 16:48 |
*** aarefiev has quit IRC | 16:48 | |
yolanda | basically openstack calls were failing with non-trusted ca | 16:48 |
yolanda | i generated the ca using the same links you sent me | 16:48 |
*** aarefiev has joined #openstack-infra | 16:49 | |
*** diana_clarke has joined #openstack-infra | 16:49 | |
*** annegentle has quit IRC | 16:49 | |
*** NikitaKonovalov has joined #openstack-infra | 16:49 | |
*** kevinbenton has joined #openstack-infra | 16:51 | |
*** jordanP has quit IRC | 16:52 | |
*** annegentle has joined #openstack-infra | 16:52 | |
*** annegentle has quit IRC | 16:52 | |
*** annegentle has joined #openstack-infra | 16:53 | |
*** sridhar_ram has joined #openstack-infra | 16:53 | |
*** jordanP has joined #openstack-infra | 16:53 | |
*** infra-re_ has joined #openstack-infra | 16:53 | |
*** kzaitsev_mb has joined #openstack-infra | 16:54 | |
openstackgerrit | Valeriy Ponomaryov proposed openstack-infra/project-config: Add LXD experimental Tempest job for Manila https://review.openstack.org/275823 | 16:55 |
*** hashar has joined #openstack-infra | 16:56 | |
*** fitoduarte has joined #openstack-infra | 16:56 | |
*** infra-red has quit IRC | 16:56 | |
*** henrynash has joined #openstack-infra | 16:57 | |
*** sshnaidm has joined #openstack-infra | 16:57 | |
*** jistr has quit IRC | 16:57 | |
krotscheck1 | jeblair: OK! Done with meeting. | 16:58 |
krotscheck1 | jeblair: Checking for server name | 16:58 |
*** e0ne has quit IRC | 16:59 | |
cznewt | hello everyone, who can please help with adding a few people to gerrit group? For the https://review.openstack.org/#/c/272469/ | 16:59 |
yolanda | fungi, jeblair, also for preparing for infra cloud. We need to add dns entries for that. In case of east, it's 100 servers. Is that ok to add to our dns servers? | 17:00 |
krotscheck1 | jeblair: wheel-mirror-ubuntu-trusty-amd64 | 17:00 |
krotscheck1 | jeblair: It needs to be "wheel-mirror-" + the suffix from https://review.openstack.org/#/c/164927/33/zuul/layout.yaml | 17:00 |
*** Apoorva has joined #openstack-infra | 17:01 | |
*** gyee has joined #openstack-infra | 17:01 | |
*** sbalukoff has quit IRC | 17:02 | |
*** fhubik is now known as fhubik_brb | 17:02 | |
*** yamahata has joined #openstack-infra | 17:02 | |
jeblair | yolanda: yeah, we may want to use the rackdns client for that; | 17:03 |
*** infra-red has joined #openstack-infra | 17:03 | |
crinkle | yolanda: these machines are going to move and most likely be renumbered soon, so probably don't want to do all that work yet? | 17:03 |
jeblair | krotscheck1: we can do wheel-mirror-ubuntu-trusty-amd64 if you want, but why does it need to be that? | 17:04 |
*** dims has quit IRC | 17:04 | |
yolanda | crinkle , raising that because of the changes for nodepool preparation, that nibalizer raised this morning | 17:04 |
yolanda | i know they are going to be moved soon, but cannot tell about the schedule | 17:04 |
krotscheck1 | jeblair: https://review.openstack.org/#/c/164927/33/jenkins/jobs/wheel-mirror.yaml Line 35 | 17:05 |
krotscheck1 | jeblair: The job definition is here https://review.openstack.org/#/c/164927/33/jenkins/jobs/projects.yaml | 17:05 |
jeblair | krotscheck1: got it | 17:05 |
*** dims has joined #openstack-infra | 17:06 | |
jeblair | krotscheck1: i will create wheel-mirror-ubuntu-trusty-amd64.slave.openstack.org | 17:06 |
jeblair | krotscheck1: oh | 17:06 |
krotscheck1 | Too long? | 17:06 |
krotscheck1 | Wait. | 17:06 |
jeblair | krotscheck1: actually that's a node label which is not necessarily its hostname | 17:06 |
*** infra-re_ has quit IRC | 17:06 | |
jeblair | but i think it's our usual pattern to make the hostname the label | 17:07 |
jeblair | might even be a jenkins default | 17:07 |
yolanda | crinkle if we want to add to nodepool now, we could go with something simple, such as bifrost, a controller and a pair of computes | 17:07 |
jeblair | it's been so long since i made a slave by hand :) | 17:07 |
krotscheck1 | Well, the node label is /.*wheel-mirror-.*\.openstack\.org/ | 17:07 |
jeblair | krotscheck1: that's puppet | 17:07 |
krotscheck1 | Yep | 17:07 |
jeblair | krotscheck1: so wheel-mirror-<anything> will run the puppet code | 17:07 |
fungi | jeblair: i think by default there is no node label, but you can interchangeably refer to nodes by name or by node label in the same places in configuration | 17:08 |
jeblair | krotscheck1: in the link you sent, that's jjb telling jenkins to run this job on this node | 17:08 |
krotscheck1 | jeblair: Yes. The job label is different from the puppet label? | 17:08 |
prometheanfire | can someone reworkflow this? https://review.openstack.org/#/c/273790/ | 17:08 |
*** jlanoux has quit IRC | 17:08 | |
jeblair | krotscheck1: right, so based on what fungi says, if we spin up the node, attach it to jenkins, give it no node label, the hostname will be used as a node label, so the job will run on that slave if we use the name you suggest | 17:09 |
krotscheck1 | Sounds good to me. | 17:09 |
*** fhubik_brb is now known as fhubik | 17:09 | |
jeblair | krotscheck1, fungi: i'm creating the slave now | 17:09 |
jeblair | krotscheck1: do you know how big a single-arch wheel mirror will be after it's built? | 17:09 |
fungi | well, we're normally naming slaves for their fqdn but we've been trimming that for node labels to the hostname portion of the fqdn. anyway, it's trivially solvable | 17:10 |
krotscheck1 | jeblair: maybe, let me check. | 17:10 |
* krotscheck1 might still have that vm around | 17:10 | |
*** ihrachys has quit IRC | 17:10 | |
fungi | and now that i say that, i'm less confident in my memory so double-checking | 17:10 |
*** nmagnezi has quit IRC | 17:10 | |
jeblair | krotscheck1: also, answers like "20G should be (more than | not) enough" are useful :) | 17:10 |
fungi | yeah, the slave named release.slave.openstack.org connects to host release.slave.openstack.org but has a node label manually added of "release" | 17:11 |
jeblair | ok i'll manually add the hostname then | 17:11 |
krotscheck1 | jeblair: I don't have that image anymore. I do believe that 20G should be plenty. | 17:11 |
*** fhubik has quit IRC | 17:11 | |
jeblair | cool | 17:11 |
krotscheck1 | jeblair: Note, though, that the job right now does _not_ clean up after itself. | 17:11 |
* krotscheck1 can add that | 17:12 | |
*** dims has quit IRC | 17:12 | |
jeblair | krotscheck1: do we want it to? | 17:12 |
*** samuelBartel has quit IRC | 17:13 | |
krotscheck1 | jeblair: Right now wheels are built into /tmp/wheelhouse | 17:13 |
krotscheck1 | And then moved into AFS folder structure from there. | 17:13 |
krotscheck1 | Since /tmp is usually only cleaned up on reboot, and this is going to be a long-lived slave.... | 17:13 |
fungi | what sort of mess do we anticipate that job accumulating? also you don't want to rely on the job "cleaning up after itself" since that may not happen for various reasons. if you need to clean it up, do so at the start of the next run | 17:13 |
*** Swami__ has quit IRC | 17:13 | |
jeblair | can it re-use previous work if we don't clean it up? | 17:14 |
krotscheck1 | jeblair: I believe it does. | 17:14 |
mwhahaha | hey openstack-infra folks, how can you determine capacity for a specific job associated with a project? I ask because we've got some changes stuck for fuel-library because gate-fuel-library-puppet-lint is QUEUED for all of the jobs. Are there no nodes currently running that can do that job? | 17:14 |
jeblair | if so, that seems like a reason to count on having it not clean up | 17:14 |
krotscheck1 | jeblair: The downside is that that directory will grow over time. | 17:14 |
fungi | also, is it safe to re-use previous work (for example, if library versions change behind the scenes due to distro package updates)? | 17:14 |
jeblair | fungi: right, which would be a good argument for cleaning up :) | 17:15 |
*** [HeOS] has joined #openstack-infra | 17:15 | |
*** annegentle has quit IRC | 17:15 | |
*** kzaitsev_mb has quit IRC | 17:15 | |
fungi | mwhahaha: it depends on what node type is needed for that particular job | 17:15 |
jeblair | krotscheck1: how much ram/cpu should the slave have? | 17:16 |
fungi | i wonder if anyone's working on node counts by type as an addition to http://grafana.openstack.org/dashboard/db/nodepool | 17:16 |
mgagne | fungi can this info be graphed? is ready node per type available in graphite? | 17:16 |
mgagne | fungi hehe | 17:17 |
jeblair | fungi: i don't know. mgagne: yes, it's in graphite. | 17:17 |
*** nt has quit IRC | 17:17 | |
*** yamamoto has quit IRC | 17:17 | |
krotscheck1 | jeblair: I was running it reasonably well on a 2VCPU virtualbox guest | 17:17 |
fungi | yeah, shouldn't be hard to add to grafana since we already track it in graphite | 17:17 |
*** HeOS has quit IRC | 17:17 | |
mgagne | I like graphs =) | 17:17 |
krotscheck1 | jeblair: I _did_ run into a memory constraint though | 17:17 |
clarkb | krotscheck1: jeblair our default slave size should be plenty as they all make wheels today | 17:17 |
krotscheck1 | jeblair: Lemme see what I set it to. | 17:18 |
openstackgerrit | Derek Higgins proposed openstack-infra/tripleo-ci: Source undercloud environment variable from a file https://review.openstack.org/275667 | 17:18 |
openstackgerrit | Derek Higgins proposed openstack-infra/tripleo-ci: Split the deploy script into its own file https://review.openstack.org/275668 | 17:18 |
*** tphummel has joined #openstack-infra | 17:18 | |
*** dims has joined #openstack-infra | 17:18 | |
krotscheck1 | jeblair: 512MB is not enough. 2G was fine. | 17:18 |
jeblair | clarkb: you're suggesting 8G? | 17:18 |
clarkb | jeblair: as a conservative upper bound | 17:18 |
fungi | krotscheck1: clarkb: jeblair: in particular, the constraints updates some other requirements jobs do it for the entire global-requirements.txt set, so holding one of those and looking after completion should give us some idea of what size a base wheel cache for the set would be | 17:19 |
*** nt has joined #openstack-infra | 17:19 | |
mwhahaha | i guess the next question would be, how does one determine node type :D in the jobs.yaml node is just set to '{node}' | 17:19 |
jeblair | i'm not too worried about disk space | 17:19 |
mwhahaha | it appears this is also impacting the puppet-* repos as well | 17:19 |
jeblair | should i be? :) | 17:19 |
mgagne | mwhahaha check projects.yaml the variable is in there | 17:20 |
fungi | ahh, you were asking how much ram. well, i agree with clarkb in that we do it for requirements changes now | 17:20 |
zaro | morning | 17:20 |
jeblair | krotscheck1: can you move the wheel building from /tmp to /opt ? | 17:20 |
krotscheck1 | jeblair: Yep | 17:21 |
krotscheck1 | jeblair: Does that require a puppet update to make sure the directory exists? | 17:21 |
jeblair | how do we feel about 4G/4cpu/40GB /opt ? | 17:21 |
*** kzaitsev_mb has joined #openstack-infra | 17:21 | |
mwhahaha | does it inherit that information from the project itself? if so i guess it would be bare-trusty | 17:21 |
krotscheck1 | Seems good | 17:21 |
*** harlowja_at_home has joined #openstack-infra | 17:22 | |
jeblair | krotscheck1: either puppet or in the job via sudo; /opt will exist, but you'll need to make /opt/wheelhouse (or whatever) and chown it to jenkins | 17:22 |
krotscheck1 | jeblair: Willdo | 17:23 |
jeblair | krotscheck1: if you decide to build from scratch each time, then the answer is certainly 'do it in the job instead of puppet' | 17:23 |
fungi | mwhahaha: also you can confirm from a recent run, for example http://logs.openstack.org/55/260655/11/check/gate-fuel-library-puppet-lint/4758f8b/console.html#_2016-02-02_14_19_16_214 | 17:23 |
mgagne | mwhahaha it does | 17:23 |
jeblair | krotscheck1: so maybe that's the way we should go regardless of whether we decide to clean up or not, so that we have more flexibility | 17:23 |
mgagne | mwhahaha variables are inherited from projects | 17:23 |
mwhahaha | so next questions, how can i check bare-trusty capacity :) | 17:24 |
mgagne | mwhahaha otherwise node wouldn't exist and JJB would crash | 17:24 |
fungi | mwhahaha: capacity is likely 0 right now. it's a fairly popular node type, and changes in the gate pipeline get priority over changes in the check pipeline | 17:24 |
jeblair | mwhahaha: http://tinyurl.com/hxuv2kh | 17:25 |
fungi | a better question might be how many are in use right now | 17:25 |
krotscheck1 | jeblair: Cando. | 17:25 |
*** jgriffith is now known as jgriffith_away | 17:25 | |
mgagne | when you know what metrics to look for, it looks so easy :O | 17:26 |
*** kzaitsev_mb has quit IRC | 17:26 | |
mwhahaha | ok, i guess we'll just continue to wait. and if it hits 4+ hours i'll come back, so far the oldest appears ~2.5 hours old | 17:27 |
jeblair | we document the stats for zuul, but i don't think we do for nodepool | 17:27 |
krotscheck1 | jeblair: I'd have to remove revoke-sudo from the template | 17:27 |
*** achanda has joined #openstack-infra | 17:27 | |
jeblair | krotscheck1: hrm, we probably don't actually want to use sudo for anything do we... | 17:28 |
mwhahaha | is it stuck because of the number of nodes pending deletion? | 17:28 |
krotscheck1 | jeblair: I should be able to just empty the directory, even though it's a bit more annoying. | 17:28 |
jeblair | krotscheck1: so maybe we need to make /opt/wheelthingy in puppet and chown it to jenkins, and then build in /opt/wheelthingy/actual_build_dir | 17:28 |
krotscheck1 | That works | 17:28 |
fungi | mwhahaha: nah, nodes in delete state don't satisfy nodepool's demand calculations, they just count against total quota in our providers across all node types | 17:29 |
fungi | mwhahaha: the bigger issue is the overall number of bare-trusty nodes in use is small because most of the quota is occupied by devstack-trusty nodes which have a much higher demand across all projects and we're running at full quota capacity in all our providers under the current workload | 17:30 |
openstackgerrit | Michael Krotscheck proposed openstack-infra/system-config: Add wheel working directory to wheel slave https://review.openstack.org/275854 | 17:31 |
*** hashar is now known as hasharDinnerTime | 17:31 | |
fungi | mwhahaha: right now there are almost 6x as many devstack-trusty nodes running jobs as there are bare-trusty nodes | 17:31 |
mwhahaha | yea i figured it was a quota issue | 17:31 |
*** mrmartin has joined #openstack-infra | 17:31 | |
mwhahaha | what's annoying is that the job it's waiting on probably takes a fraction of the time to allocate the bare-trusty nodes themselves | 17:31 |
*** alivigni has quit IRC | 17:32 | |
mwhahaha | er sentence fail, the job takes a fraction of the time when compared to the time it takes to allocate a node | 17:33 |
fungi | of the 462 nodes running jobs right now, 293 (63%) are devstack-trusty while 58 (13%) are bare-trusty | 17:33 |
*** julim has quit IRC | 17:33 | |
jeblair | mwhahaha: well, yes, we're running over capacity so it's going to take a long time to get a node. | 17:34 |
openstackgerrit | Michael Krotscheck proposed openstack-infra/project-config: Add wheel mirror to configure_mirror.sh https://review.openstack.org/267117 | 17:34 |
openstackgerrit | Michael Krotscheck proposed openstack-infra/project-config: Added wheel-release job https://review.openstack.org/273549 | 17:34 |
openstackgerrit | Michael Krotscheck proposed openstack-infra/project-config: Create jobs for a wheel mirror https://review.openstack.org/164927 | 17:34 |
mgagne | http://paste.openstack.org/show/485878/ for grafana json | 17:34 |
fungi | and yes, the turnover rate of jobs which run on bare-trusty is higher, which means the build and burn overhead for those node types is also higher, which eates into the overall availability of them proportionally as well | 17:34 |
krotscheck1 | jeblair: https://review.openstack.org/#/c/164927/33..34/jenkins/jobs/wheel-mirror.yaml | 17:35 |
krotscheck1 | Seem good? | 17:35 |
fungi | s/eates/eats/ | 17:35 |
jeblair | fungi: not a huge amount though; our current providers build nodes *fast* | 17:35 |
jeblair | mgagne: you want to translate that to yaml and put it in project-config/grafana? | 17:35 |
fungi | jeblair: true, i only see a few bare-trusty nodes building for 30 minutes. most are a handful of minutes | 17:35 |
mgagne | doing it atm | 17:35 |
*** sbalukoff has joined #openstack-infra | 17:35 | |
jeblair | mgagne: woot! | 17:35 |
jeblair | fungi: and likely spending a lot of time waiting on nodepool actually | 17:36 |
jeblair | krotscheck1: lgtm | 17:36 |
zaro | clarkb: hey, i was not able to come up with an optimum heap number from reading gerrit info. | 17:36 |
*** unicell has quit IRC | 17:36 | |
jeblair | fungi: https://review.openstack.org/275854 | 17:37 |
zaro | clarkb: was the intention to change VM flavors with more RAM+CPU? | 17:37 |
jeblair | zaro: yes, we're considering that | 17:37 |
zaro | clarkb: if so i think it would help to increase VCPU as well as bump up jvm heap. | 17:37 |
fungi | jeblair: are we mounting the ephemeral disk at /opt or is this just on the rootfs? | 17:38 |
fungi | i guess we can worry about that after watching filesystem graphs | 17:38 |
zaro | clarkb: since the 503 indicates a degradation in ability to respond to requests. | 17:38 |
jeblair | zaro: we're not actually cpu bound except when we're dealing with GC | 17:38 |
*** jgriffith_away is now known as jgriffith | 17:39 | |
jeblair | zaro: examine the graphs on http://cacti.openstack.org/cacti/graph_view.php?action=tree&tree_id=1&leaf_id=9 | 17:39 |
*** Swami has joined #openstack-infra | 17:39 | |
*** ashleighfarnham has quit IRC | 17:40 | |
jeblair | fungi: i believe launch-node mounts ephemeral on opt | 17:41 |
jasondotstar | hi all | 17:41 |
jasondotstar | https://review.openstack.org/#/c/272469/ has been merged | 17:41 |
openstackgerrit | Mathieu Gagné proposed openstack-infra/project-config: grafana: Add Nodepool nodes capacity per type https://review.openstack.org/275862 | 17:41 |
jasondotstar | we're looking to get cznewt added as PTL so that he can add the other core team members | 17:41 |
*** sfinucan has quit IRC | 17:41 | |
*** achanda has quit IRC | 17:42 | |
cznewt | hi, we need to add ppl to the openstack-salt-core team, pls anyone can help? | 17:42 |
jasondotstar | jeblair, dhellmann ^^ | 17:42 |
*** julim has joined #openstack-infra | 17:42 | |
jeblair | mgagne: gertty bug! | 17:42 |
zaro | jeblair, clarkb : yeah, i see that. but i was thinking that if we were to change flavors anyways might as well bump up VCPU. there's only 8 on review.o.o correct? | 17:42 |
*** harlowja_at_home has quit IRC | 17:43 | |
mgagne | jeblair there is? | 17:43 |
jeblair | zaro: we probably don't have a choice, more ram usually means more cpu | 17:43 |
mgagne | jeblair trying a new repo setup on my side | 17:43 |
dhellmann | jasondotstar : you need someone on the infra team with gerrit admin rights. I know that includes fungi, clarkb, and jeblair but there may be others. | 17:43 |
jeblair | mgagne: when i diff that i get UnicodeEncodeError: 'ascii' codec can't encode character u'\u0301' in position 92: ordinal not in range(128) | 17:43 |
zaro | jeblair, clarkb: that makes sense and sounds good. | 17:43 |
*** fedexo has joined #openstack-infra | 17:44 | |
jeblair | zaro: yeah, i'm just making the point that more cpu will be nice, it will help us handle more requests in parallel and so will support a higher load, but it will not actually help with the 503 errors, as those are driven by the garbage collector, and more cpus won't help there | 17:44 |
jasondotstar | fungi, clarkb, jeblair - can either of you guys assist us? | 17:45 |
tonyb | ttx: That'd be helpful (adding me to yaml2ical-core) Do you want to request that or should I and use this log as backup? | 17:45 |
jeblair | cznewt: what's your email address in gerrit? | 17:45 |
zaro | jeblair, clarkb : so unless we can do testing against the review.o.o to verify optimum heap then maybe we can just keep bumping up heap and continue to monitor? maybe a good start is 20G? | 17:46 |
*** jaosorior has quit IRC | 17:46 | |
jeblair | tonyb, ttx: see fungi about that | 17:46 |
cznewt | jeblair: ales.komarek@tcpcloud.eu | 17:46 |
zaro | jeblair, clarkb : switch to a flavor that provides adequate RAM for cushion? | 17:46 |
jeblair | cznewt, jasondotstar: done | 17:46 |
mgagne | jeblair could it be caused by my name having an accent? | 17:46 |
*** yamahata has quit IRC | 17:46 | |
fungi | tonyb: ttx: i'm actually looking now to confirm i'm okay with the request. just a moment | 17:46 |
cznewt | jeblair: thanks a lot | 17:47 |
tonyb | fungi: thanks. | 17:47 |
mgagne | that unicode character is ́ | 17:47 |
jasondotstar | jeblair++ | 17:47 |
jasondotstar | thx | 17:47 |
*** esikachev has joined #openstack-infra | 17:47 | |
jeblair | zaro: we can increase heap on current review.o.o if needed, but it will eat into disk cache. however, we are spending almost no time in iowait, so we can probably safely do that. we would need to keep an eye on iowait to make sure it doesn't significantly increase. cc: clarkb | 17:48 |
*** abregman has quit IRC | 17:48 | |
jeblair | mgagne: not sure yet, doing 4 things at once :( | 17:48 |
clarkb | jeblair: the other memory consumer is the git mirror | 17:48 |
clarkb | jeblair: it's possible we can just send that off to git.o.o via a redirect? | 17:49 |
jeblair | clarkb: i do not immediately know the implications of that. | 17:49 |
jeblair | clarkb: i'd probably start by checking to see if we're still using it in CI | 17:50 |
*** shashank_hegde has quit IRC | 17:50 | |
zaro | clarkb: do you mean the local replication? | 17:51 |
clarkb | zaro: yes, we serve that out via apache and it uses a chunk of the memory on that host | 17:51 |
fungi | ttx: looking at changes to that repo over the past year (both open and merged) i agree reviews look consistent and thorough. tonyb: welcome to yaml2ical-core, don't let the power go to your head ;) | 17:51 |
fungi | on a related note, there are 9 core reviewers for that repo, almost none of whom are reviewing from what i can tell. what were the criteria for populating that? | 17:52 |
tonyb | fungi: Thanks. | 17:52 |
tonyb | fungi: I wasn't around then | 17:53 |
tonyb | fungi: but I think it was an intern project or similar which was done $somewhere_else | 17:53 |
*** esikachev has quit IRC | 17:54 | |
tonyb | fungi: then when it was brought into our infrastructure all the current authors got core | 17:54 |
*** jordanP has quit IRC | 17:54 | |
tonyb | fungi: but I don't *know* that | 17:54 |
tonyb | fungi: I think it should be reduced to ttx and I | 17:55 |
zaro | clarkb: is there a way to tell how many requests review.o.o gets for that? | 17:55 |
jeblair | mgagne: it's the unicode combining acute accent in your name in the commit message... is it possible that your commit message is not utf8? | 17:55 |
mgagne | jeblair maybe it's repo that badly configured git | 17:55 |
clarkb | zaro: we can check the apache logs, is there something specific you are trying to determine? | 17:56 |
mgagne | will retry from my old setup | 17:56 |
*** Daisy has joined #openstack-infra | 17:56 | |
tonyb | fungi: I'd be happy to mail them all and say "Umm yeah you're not really doing anything with yaml2ical, we're going to remove your core status" If you like. | 17:56 |
fungi | tonyb: that sounds likely. i guess we can revisit when i get around to running actual blanket stats across all our repos. there are probably more in similar situations | 17:56 |
tonyb | fungi: okay | 17:57 |
fungi | it's on my neverending to-do list | 17:57 |
jeblair | mgagne: how new is it? i see the problem on some of your older changes... | 17:57 |
mgagne | jeblair new from today | 17:57 |
zaro | clarkb: whether you can tell where it's coming from and if requests are low do we need to worry about it? | 17:57 |
mgagne | IIRC | 17:57 |
*** sputnik13 has joined #openstack-infra | 17:57 | |
mgagne | which changes? | 17:58 |
mgagne | gerrit-ui? | 17:58 |
*** lucasagomes has quit IRC | 17:58 | |
jeblair | mgagne: https://review.openstack.org/269857 | 17:58 |
openstackgerrit | Mathieu Gagné proposed openstack-infra/project-config: grafana: Add Nodepool test nodes capacity per type https://review.openstack.org/275862 | 17:58 |
mgagne | jeblair this is with old setup | 17:58 |
jeblair | mgagne: i'll try that | 17:58 |
mgagne | jeblair this one is from old setup | 17:58 |
mgagne | meeting now, bbl | 17:59 |
jeblair | mgagne: same problem; seems unlikely to be your side, i'll keep digging | 17:59 |
*** jtomasek__ has quit IRC | 17:59 | |
mgagne | could be git-review I updated last week | 17:59 |
*** Sukhdev has quit IRC | 17:59 | |
jeblair | mgagne: i checked a change from december and still see it | 18:00 |
*** Daisy has quit IRC | 18:00 | |
*** mikelk has quit IRC | 18:00 | |
tonyb | fungi: So speaking of power I'm going to +W my open patches to yaml2ical | 18:01 |
jpr | asselin_: I'm getting an error about a null user id when jenkins tries to test the ssh connection with a slave on a stock openstackci config | 18:01 |
tonyb | fungi: Also would running reviewstats over openstack-infra/* be a helpful start? | 18:02 |
*** arxcruz has quit IRC | 18:02 | |
*** trown is now known as trown|lunch | 18:02 | |
jeblair | mgagne: it's something related to the recent py3 changes | 18:02 |
*** rossella_s has quit IRC | 18:02 | |
jeblair | the string we're manipulating is already a unicode string | 18:03 |
*** rossella_s has joined #openstack-infra | 18:03 | |
jpr | asselin_: the jenkins username is configured on the nodepool side and the creds in hiera work against the jenkins account on the slave, but i don't see anything related to these credentials defined in the jenkins manage credentials ui. having difficulty finding where to look. | 18:03 |
fungi | tonyb: if they're languishing then go ahead. sounds like ttx already +2'd them | 18:03 |
tonyb | fungi: Yeah they all have his +2 | 18:04 |
fungi | tonyb: as for reviewstats, it's less useful for infra because of our council arrangement. the granularity assumptions are vastly different | 18:04 |
*** lucasagomes has joined #openstack-infra | 18:04 | |
fungi | i've been wanting to look into ways to extend it for our use case, but more likely this time around i'll be playing with lots of my own stats generation first just to figure out what insights different aggregations of review data are providing at all | 18:05 |
*** bnemec has quit IRC | 18:05 | |
tonyb | fungi: Ahh ok, I'll leave it to you then I just thought it might be a helpful first step that I could do for you. | 18:06 |
fungi | i do look at reviewstats, it's just very muddy with our hundred or so repos in widely varying areas of expertise | 18:07 |
fungi | i mostly want to see which non-cores seem to have significant interest in individual repos, especially ones with lagging/under-served core review activity | 18:07 |
*** rhallisey has quit IRC | 18:07 | |
fungi | and also see if i can spot trends of reviewing across multiple repos and multiple disciplines | 18:08 |
*** mrmartin has quit IRC | 18:08 | |
fungi | all those provide possible starting points for figuring out whose reviews i should be looking through for what subgroups | 18:08 |
openstackgerrit | Athlan-Guyot sofer proposed openstack-infra/project-config: puppet: initiate testing for puppet-openstack-cookicutter https://review.openstack.org/272156 | 18:09 |
openstackgerrit | Volodymyr Samotiy proposed openstack-infra/project-config: Adding broadview-lib project and jobs. https://review.openstack.org/275873 | 18:09 |
tonyb | fungi: Yeah, I wish I had something more helpful to say, but that sounds great and hard. I think the interested non-cores aspect is probably doable with either reviewstats as a base. | 18:11 |
tonyb | fungi: but as you clearly know it'll be non-trivial. | 18:11 |
*** _nadya_ has quit IRC | 18:12 | |
fungi | right. i want to add support for it in an existing tool if i can, but step 1 is figuring out what sorts of collections/aggregations of these stats are actually useful | 18:13 |
*** shashank_hegde has joined #openstack-infra | 18:14 | |
fungi | tooling to collect the data is not the hard part, deciding what to analyze is | 18:14 |
*** ybathia has joined #openstack-infra | 18:18 | |
tonyb | fungi: yeah. I guess an additional challenge is quality but I s'pose that can be investigated once you've found someone to investigate. | 18:18 |
tonyb | fungi: I'm trying hard not to say "I'll help you" as I know I don't really have enough time to do it right | 18:18 |
*** Apoorva is now known as apoorvad | 18:18 | |
*** ybathia has left #openstack-infra | 18:19 | |
*** jsavak has quit IRC | 18:20 | |
* anteaya prevents tonyb from jumping under another bus | 18:20 | |
tonyb | *sigh* it's true | 18:20 |
clarkb | any other takers for https://review.openstack.org/#/c/274821/1 ? it should address a major source of failures on bluebox | 18:21 |
jeblair | fungi: wow, the py3k changes from jaypipes obliterated the unicode safety we had :( | 18:22 |
*** e0ne has joined #openstack-infra | 18:24 | |
*** _nadya_ has joined #openstack-infra | 18:25 | |
*** Jeffrey4l has quit IRC | 18:25 | |
AJaeger | fungi, you should be easily able to create some groups for reviewstats, for example a puppet group, a project-config one, and a system-config one. | 18:26 |
*** jsavak has joined #openstack-infra | 18:26 | |
*** pvaneck has joined #openstack-infra | 18:27 | |
AJaeger | fungi, if you want to play with that, I can create a review request so that you have a basis to play with... | 18:27 |
*** sbalukoff has quit IRC | 18:27 | |
*** sabeen3 has quit IRC | 18:29 | |
*** achanda has joined #openstack-infra | 18:29 | |
openstackgerrit | Volodymyr Samotiy proposed openstack-infra/project-config: Adding broadview-lib project and jobs. https://review.openstack.org/275873 | 18:30 |
*** vgridnev has joined #openstack-infra | 18:31 | |
fungi | AJaeger: yeah, again, that's an easy but small problem. what i want to do is data mining to determine whether there are emergent patterns and groups rather than starting the analysis from an existing rigid group structure | 18:32 |
AJaeger | fungi, infra.json in reviewstats is not usable, has the wrong core teams... | 18:33 |
*** electrofelix has quit IRC | 18:33 | |
AJaeger | fungi: seems you need to do it from scratch on your own ;) | 18:33 |
*** unicell has joined #openstack-infra | 18:33 | |
fungi | indeed, a conclusion at which i had already arrived | 18:34 |
fungi | jeblair: which py3k changes? nodepool? | 18:34 |
jeblair | fungi: gertty | 18:34 |
AJaeger | fungi, one idea: Go to http://stackalytics.com/?module=infrastructure-group and dig into individuals | 18:34 |
fungi | oh | 18:34 |
jeblair | fungi: i think i've found the changes that prompted those, and i think i've just about corrected it | 18:35 |
fungi | i think i probably hadn't looked at his python 3 support patches | 18:35 |
jeblair | fungi: i have been reading eavesdrop from 1.5 years ago | 18:35 |
clarkb | zaro: I am tailing the access logs for apache on /p/ | 18:35 |
fungi | that's some serious detective work | 18:35 |
fungi | clarkb: does it seem like we have any of our own processes or jobs hitting that? | 18:35 |
fungi | or is it all from other stuff not under our control? | 18:36 |
clarkb | fungi: ya zuul is occasionally hitting it | 18:36 |
jeblair | clarkb: the mergers? | 18:36 |
*** unicell1 has joined #openstack-infra | 18:36 | |
clarkb | jeblair: no zuul.o.o according to dns | 18:37 |
*** fitoduarte has quit IRC | 18:37 | |
fungi | interesting | 18:37 |
clarkb | 2001:4800:7815:101:3bc3:d7f6:ff04:e07f - - [03/Feb/2016:18:35:52 +0000] "GET /p/openstack/tempest/info/refs?service=git-upload-pack HTTP/1.1" 200 2049431 "-" "Python-urllib/2.7" | 18:37 |
fungi | polling to check merged state of a change? | 18:37 |
AJaeger | jasondotstar: update for all repos .gitreview like done with https://review.openstack.org/274621 | 18:37 |
jeblair | clarkb: just hitting info/refs? | 18:37 |
jeblair | fungi: yeah, i think so if that ^ is true | 18:37 |
*** unicell1 has quit IRC | 18:37 | |
clarkb | jeblair: so far yes | 18:37 |
*** unicell has quit IRC | 18:37 | |
*** unicell1 has joined #openstack-infra | 18:37 | |
jeblair | fungi, clarkb: which is an oops, since that should actually hit real gerrit | 18:37 |
fungi | right, there was a race... | 18:37 |
jeblair | oh | 18:38 |
jeblair | i think i remember | 18:38 |
fungi | where sometimes gerrit's api would report merged state before it actually merged the change on the backend | 18:38 |
jeblair | i think we actually did want to test the mirror so that we would know that not only it had merged, but propagated | 18:38 |
jeblair | that thinking still predates the mirror farm though. | 18:38 |
fungi | i wonder if we should be checking the mirror farm, yeah | 18:38 |
*** unicell1 has quit IRC | 18:39 | |
*** unicell has joined #openstack-infra | 18:39 | |
*** eil397 has joined #openstack-infra | 18:39 | |
fungi | it's possible some of that race window was gerrit not getting around to queuing mirroring the update | 18:39 |
*** exploreshaifali has quit IRC | 18:39 | |
fungi | or the queue being backed up | 18:39 |
jeblair | i kind of think we should change it to point to real gerrit | 18:40 |
openstackgerrit | Merged openstack-infra/yaml2ical: Add support for skipping/excluding meetings https://review.openstack.org/232312 | 18:40 |
*** jsavak has quit IRC | 18:40 | |
jeblair | the mergers pull from real gerrit now, and the critical parts of the changes under test pull from the mergers | 18:40 |
openstackgerrit | Merged openstack-infra/yaml2ical: Add one off events for skipped meetings https://review.openstack.org/235100 | 18:40 |
openstackgerrit | Merged openstack-infra/yaml2ical: Remove argparse from requirements https://review.openstack.org/270408 | 18:40 |
jeblair | so we're not really subject to mirror races like we were | 18:40 |
*** jsavak has joined #openstack-infra | 18:40 | |
fungi | perfect | 18:42 |
krotscheck1 | ~400 zuul queue items is sadpanda | 18:43 |
*** yamahata has joined #openstack-infra | 18:43 | |
*** fedexo has quit IRC | 18:44 | |
dmsimard | krotscheck1: last I know jenkins was happy https://twitter.com/osjenkins :P | 18:44 |
fungi | krotscheck1: rackspace iad was offline for a while earlier, which has made the backlog worse than it would have been | 18:44 |
openstackgerrit | James E. Blair proposed openstack/gertty: Fix unicode regressions https://review.openstack.org/275879 | 18:45 |
openstackgerrit | James E. Blair proposed openstack/gertty: Fix typo in expire-age setting https://review.openstack.org/275880 | 18:45 |
AJaeger | krotscheck1: yeah ;( rax-iad made all devstack-trusty fail earlier, then we took it down, now it's up and everybody rechecked I guess... | 18:45 |
*** dizquierdo has joined #openstack-infra | 18:45 | |
krotscheck1 | dmsimard: http://status.openstack.org/zuul/ | 18:45 |
*** slogan621 has joined #openstack-infra | 18:45 | |
*** doug-fish has quit IRC | 18:45 | |
dmsimard | krotscheck1: yeah, I know :/ | 18:45 |
krotscheck1 | Fun times, fun times. | 18:45 |
*** jsavak has quit IRC | 18:46 | |
krotscheck1 | Do we have historical data for that queue btw? | 18:46 |
* krotscheck1 wants to get a screenshot of "here's what happened when HPCloud went away" | 18:46 | |
fungi | krotscheck1: as in queue size over time? sure | 18:46 |
*** jsavak has joined #openstack-infra | 18:46 | |
clarkb | krotscheck1: it is in graphite | 18:46 |
clarkb | that is what renders the sparklines | 18:46 |
krotscheck1 | neat! | 18:46 |
fungi | those sparklines at the headers... yeah | 18:46 |
fungi | we have history into the wayback | 18:47 |
anteaya | clarkb: the code looks fine to my eyes on 274821 and will have to take your word that this may improve things for bluebox | 18:47 |
anteaya | dougwig: care to join me in a review of 274821? | 18:47 |
cody-somerville | I wonder what's the carbon footprint of OpenStack CI. | 18:47 |
jeblair | fungi, mgagne: https://review.openstack.org/275879 | 18:47 |
* anteaya offers to share her cold remedies with dougwig | 18:47 | |
fungi | cody-somerville: massive, i'm sure | 18:47 |
*** sdake has quit IRC | 18:47 | |
dougwig | anteaya: looking | 18:47 |
AJaeger | jeblair, fungi, clarkb, anteaya : sdague and myself had earlier today a discussion about default timeouts - do you have time now for some discussion and to educate me, please? | 18:47 |
anteaya | dougwig: thanks | 18:47 |
jeblair | cody-somerville: http://cacti.openstack.org/cacti/graph_view.php?action=tree&tree_id=1&leaf_id=25 | 18:48 |
anteaya | AJaeger: I can listen | 18:48 |
anteaya | AJaeger: I'm not sure how much I can offer by way of compelling argument | 18:48 |
jeblair | cody-somerville: or did you mean carbon *dioxide* ? | 18:48 |
*** ashleighfarnham has joined #openstack-infra | 18:48 | |
cody-somerville | jeblair: the latter <grins> | 18:48 |
AJaeger | anteaya: asking good questions is also helpful ;) | 18:49 |
fungi | AJaeger: yeah, saw the discussion. summary was "some timeout is needed, but cropping timeouts close means updating more often/failing jobs periodically as their scopes or runtimes creep" | 18:49 |
jeblair | cody-somerville: large but smaller than the alternative i'd wager | 18:49 |
anteaya | AJaeger: I will endeavour to do so | 18:49 |
AJaeger | jeblair, fungi, clarkb, anteaya, sdague: This was triggered by https://review.openstack.org/#/c/272481/ . | 18:49 |
AJaeger | What should I review and accept as default timeout for jobs? I remember that default should be 1 hour. | 18:49 |
AJaeger | And I see some cargo cult in devstack-gate where everybody uses 120 mins as timeout. | 18:49 |
clarkb | AJaeger: unfortunately I think we have regressed well past an hour at this point :/ | 18:50 |
clarkb | but an hour has been our goal | 18:50 |
fungi | AJaeger: the counterargument of course is that projects ignore glaring inefficiencies in their jobs until they start hitting a timeout, so many jobs will rapidly erode toward whatever timeout we set | 18:51 |
AJaeger | clarkb: so, should we accept any timeout? Any length for running? | 18:51 |
anteaya | AJaeger: yeah I think if an hour is our goal having the default timeout that folks can cargo cult be 70 minutes makes sense to me | 18:51 |
*** deva_ has joined #openstack-infra | 18:51 | |
notmorgan | clarkb, sdague: would you rather i roll up a change for devstack to make milliseconds happen for keystone logs or is second resolution good enough? I am dropping microseconds in either case. | 18:51 |
*** tcammann has joined #openstack-infra | 18:51 | |
anteaya | well having no limits on things doesn't tend to work well for us | 18:51 |
*** esp_ has joined #openstack-infra | 18:51 | |
jeblair | there's significant variation in run times, especially with more clouds, so the timeout should accommodate that. | 18:52 |
anteaya | agreeing to a limit and then educating folks about the purpose of it is useful I think | 18:52 |
*** thiagop has quit IRC | 18:52 | |
AJaeger | sdague mentioned earlier that we should use a timeout of 3 hours - to protect against endless loops and that if people hit the timeout, they will just recheck which is bad | 18:52 |
*** NobodyCa1 has joined #openstack-infra | 18:52 | |
jeblair | 3 hours seems very long | 18:52 |
clarkb | notmorgan: seconds is good enough for me if you think you can debug the interactions between keystone and other services using only second resolution | 18:52 |
fungi | and lower timeouts provide more obvious pushback on projects whose jobs are becoming less efficient, while also penalizing projects whose jobs are simply getting more robust and increasing coverage | 18:52 |
anteaya | jeblair: that is a fair point, do our run times have any suggestions about what a good time out might be? | 18:52 |
fungi | hard to strike a balance | 18:52 |
*** trown|lunch is now known as trown | 18:52 | |
clarkb | notmorgan: mostly just wanted to reduce the number of unique keys in elasticsearch | 18:52 |
clarkb | notmorgan: as we were hitting trouble with the total volume there | 18:52 |
notmorgan | clarkb: absolutely, hence the drop of microseconds in either case | 18:53 |
notmorgan | :) | 18:53 |
anteaya | fungi: yes | 18:53 |
AJaeger | for ironic I checked for some samples and found majority of runtime between 30 and 50 mins, dtantsur mentioned not seeing more than 60 mins - so I took 70 there... | 18:53 |
*** jgriffith is now known as jgriffith_away | 18:53 | |
jeblair | anteaya: http://grafana.openstack.org/dashboard/db/nodepool | 18:53 |
notmorgan | clarkb: i don't have a strong feeling that we can/can't debug. i'm content to stick with oslo_log in devstack if it'll help others in your experience | 18:53 |
notmorgan | clarkb: it's a... 2 line change instead of a 1 line change to get milliseconds iirc | 18:54 |
cody-somerville | 90 minutes is probably fair default. (1 hour "target" for all jobs with 30 minutes as grace). | 18:54 |
anteaya | jeblair: :) | 18:54 |
anteaya | cody-somerville: that isn't what happens though | 18:54 |
cody-somerville | If we have jobs that intentionally require longer than that we should probably want to have a conversation about how to best support that and ensure that use of valuable resources is returning appropriate value. | 18:55 |
*** infra-re_ has joined #openstack-infra | 18:55 | |
* AJaeger has started asking during reviews that use the cargo cult of 120mins for runtime of similar jobs and reduce if possible. | 18:55 | |
*** kushal has quit IRC | 18:55 | |
*** thiagop has joined #openstack-infra | 18:55 | |
jeblair | krotscheck1, fungi: wheel-mirror-ubuntu-trusty-amd64.slave.openstack.org exists | 18:55 |
AJaeger | jeblair: great! | 18:56 |
jeblair | /dev/xvde2 on /opt type ext4 (rw,errors=remount-ro,barrier=0) | 18:56 |
krotscheck1 | jeblair: WOOHOO | 18:56 |
jeblair | AFS on /afs type afs (rw) | 18:56 |
fungi | interesting but probably insane thought. i wonder if we could implement "soft" timeouts where a job fails if it runs past a certain threshold but continues running up to the hard timeout where we then wrap up the job and collect artifacts | 18:56 |
*** yumapath has quit IRC | 18:56 | |
cody-somerville | It would almost be nice if we could have additional grace time when running in the gate to avoid the nasty cost of a gate reset. | 18:56 |
anteaya | AJaeger: seems we agree that having a default is valuable, as well as agreeing that deciding what the default should be is hard | 18:56 |
jeblair | fungi: we do that already | 18:56 |
AJaeger | jeblair: how? | 18:56 |
anteaya | AJaeger: seems we also agree that having a default of 3 hours is excessive | 18:56 |
fungi | jeblair: sort of. we still terminate the job workload at the current "soft" timeout | 18:56 |
fungi | i'm talking about an even softer timeout | 18:57 |
*** mrmartin has joined #openstack-infra | 18:57 | |
krotscheck1 | jeblair: Now all we have to do is wait for https://review.openstack.org/#/c/275854/ to merge. | 18:57 |
fungi | well, we terminate the inner workload at the soft timeout presently | 18:57 |
jeblair | fungi: oh. for what purpose? | 18:57 |
*** jgriffith_away is now known as jgriffith | 18:57 | |
AJaeger | jeblair: idea would be not to terminate but mark still as failure for the "fungi-softtimeout" | 18:57 |
*** infra-red has quit IRC | 18:57 | |
*** tcammann has quit IRC | 18:57 | |
fungi | making it easier to analyze what parts of the job are slow by allowing the tests to hopefully run to completion even though the job will fail for running longer than we want | 18:58 |
jeblair | fungi: has anyone expressed an interest in doing that? | 18:58 |
*** esp_ has quit IRC | 18:58 | |
cody-somerville | I think that is already supported by just looking at the data for outliers or upper quarter of runtimes. | 18:58 |
fungi | no, just brainstorming how else to provide pushback on long-running jobs without a hard-stop to what's being tested | 18:58 |
jeblair | fungi: also, we do copy as much as has run already to the timeout, are we certain that we need more data than that to determine why it's slow? | 18:59 |
cody-somerville | StackViz is all about being able to visualize the runtime of dsvm jobs. | 18:59 |
fungi | yep, was a crazy idea as i predicted | 18:59 |
*** NobodyCa1 has quit IRC | 18:59 | |
cody-somerville | I believe they're still working to implement that so it runs for all jobs? | 18:59 |
anteaya | tonyb: thank you for 232312 | 18:59 |
cody-somerville | https://etherpad.openstack.org/p/BKgWlKIjgQ | 19:00 |
*** deva_ has quit IRC | 19:00 | |
tonyb | anteaya: yw | 19:00 |
cody-somerville | ala this would be the output: https://static.timothyb89.org/stackviz2/#/ | 19:00 |
cody-somerville | anyhow, that's a tangent. | 19:00 |
fungi | basically at a loss for how to balance job performance regression against job coverage increases | 19:00 |
tonyb | anteaya: I'll rebase my other patches and then work with ttx, dhellman to get a release of yaml2ical out the door | 19:01 |
*** notnownikki has quit IRC | 19:01 | |
tonyb | anteaya: once that's done you'll be able to use the skip feature | 19:01 |
fungi | so that it's clear to projects when they should be looking at increasing their timeouts or splitting their job scopes, vs when they should be taking a hard look at the performance of their tests | 19:01 |
tonyb | anteaya: so it should be ready for Austin :) | 19:01 |
anteaya | tonyb: wooot, and I've agreed to review more patches especially ones that use the skip feature | 19:01 |
anteaya | tonyb: yay!! | 19:01 |
tonyb | anteaya: I | 19:01 |
anteaya | thank you thank you thank you | 19:01 |
tonyb | ll poke you if needed ;P | 19:02 |
anteaya | do that please | 19:02 |
cody-somerville | fungi: jeblair: Would you say that the system works best with super-consolidated long running jobs vs. more granular but still lengthy jobs? (assuming we already know that many small jobs is not efficient due to the overhead of vm creation/deletion) | 19:02 |
tonyb | anteaya: I don't really know how well used it'll be. I think it's most useful for 'edge consumers' people that aren't on IRC / don't read os-dev | 19:03 |
tonyb | anteaya: but time will tell. | 19:03 |
clarkb | zaro: jeblair fungi I am reasonably confident that the majority of the things hitting the /p/ url are instances of zuul. Our own and for third parties | 19:03 |
clarkb | I don't see much other traffic against that /p/ prefix | 19:03 |
jeblair | cody-somerville: i don't think it's worth worrying about too much | 19:03 |
fungi | the common story is projects coming to us wanting a "temporary" timeout bump so that they can work out the performance issues they've grown (which they weren't concerned with until it started regularly reaching the timeout), then coming to us again not long after asking for another "temporary" increase because their performance had regressed further and they hadn't worked on solving it | 19:03 |
anteaya | tonyb: well I'll use it for the third-party meetings, as I do feel that group fits your description | 19:03 |
tonyb | anteaya: I have additional features that I'd very much like to add | 19:03 |
anteaya | tonyb: I look forward to patches | 19:04 |
*** infra-red has joined #openstack-infra | 19:04 | |
anteaya | or etherpads | 19:04 |
tonyb | anteaya: totally :) that's why we wrote the code :) | 19:04 |
anteaya | :) | 19:04 |
*** julim has quit IRC | 19:04 | |
anteaya | fungi: yes | 19:04 |
clarkb | zaro: jeblair fungi with that in mind a better alternative to redirecting to git.o.o may be to simply have gerrit serve those refs directly and stop using a local replica? | 19:04 |
clarkb | then we can allocate more memory to gerrit on the whole | 19:05 |
AJaeger | open review for timeout increase: https://review.openstack.org/270134 | 19:05 |
fungi | cody-somerville: technically yes, by a small margin (a reduction in the amount of quota occupied by building and deleting nodes) | 19:05 |
fungi | cody-somerville: but really a premature optimization | 19:05 |
fungi | clarkb: yes, i think so, and also adjust zuul to go to the right url rather than hitting the transparent rewrite | 19:06 |
jeblair | i much prefer the approach of people just building the jobs they need and not worrying about that for anything less than, say, a 5 minute job. | 19:06 |
*** infra-re_ has quit IRC | 19:06 | |
cody-somerville | jeblair: fungi: thinking more from a queue theory perspective vs. optimizing away that particular overhead. | 19:06 |
jeblair | cody-somerville: ah, well in theory there's no difference :) | 19:06 |
*** julim has joined #openstack-infra | 19:06 | |
*** yaume_ has quit IRC | 19:07 | |
clarkb | I can whip up that change and we can argue about it in gerrit I guess :) | 19:07 |
fungi | clarkb: with the goal of eventually dropping the rewrite when we see its use trail off | 19:07 |
*** bhunter71 has quit IRC | 19:08 | |
jeblair | cody-somerville: in practice, the answer is very difficult (we've discussed the merits of greater nodepool efficiency with long jobs, but neglected the greater zuul efficiency (early aborting of changes) with short jobs) | 19:08 |
*** mrmartin has quit IRC | 19:08 | |
clarkb | fungi: yup | 19:08 |
*** infra-re_ has joined #openstack-infra | 19:08 | |
jeblair | cody-somerville: in fact, we were so concerned with the latter at some point that we wanted to parse the subunit stream to abort changes *even earlier* | 19:08 |
dougwig | fungi: can't folks debug those kinds of jobs by commenting out some tests and making the job smaller, until it's sorted? i'm not sure that's an infra problem. | 19:09 |
jeblair | i still think that would be neat, but it doesn't seem pressing | 19:09 |
clarkb | I just have to find where we set that :) | 19:09 |
anteaya | AJaeger: thank you for bringing up this topic for discussion | 19:10 |
anteaya | AJaeger: we don't seem to have a clear happy outcome yet but at least we are discussing it | 19:10 |
AJaeger | anteaya: yeah... | 19:11 |
fungi | dougwig: yep, and also there's the "we shouldn't have to police this" aspect. let projects be accountable for making their tests efficient, and assume good faith | 19:11 |
*** dizquierdo has quit IRC | 19:11 | |
*** infra-red has quit IRC | 19:12 | |
jeblair | heh, i forgot about tmpreaper; we actually do clean up slave tmp dirs frequently :) | 19:12 |
fungi | ooh! the zuul job backlog _seems_ to be steadily dropping over the past hour | 19:12 |
cody-somerville | fungi: providing a "budget" isn't necessarily a bad idea though, no? | 19:12 |
fungi | cody-somerville: right, though also figuring out who gets what budget becomes a political quagmire | 19:13 |
*** sbalukoff has joined #openstack-infra | 19:13 | |
fungi | in some ways it's easier to let the community collectively provide pushback against the rare selfish actors rather than setting rigid guidelines which make things inconvenient for everyone (including on those who have to enforce them) | 19:14 |
jeblair | krotscheck1, fungi: https://jenkins.openstack.org/computer/wheel-mirror-ubuntu-trusty-amd64.slave.openstack.org/ is online | 19:14 |
*** Sukhdev has joined #openstack-infra | 19:14 | |
anteaya | fungi: except when the community doesn't provide pushback | 19:14 |
AJaeger | and that's also a question, fungi: Should we enforce or ask them to explain their timeout value? | 19:15 |
fungi | anteaya: also correct | 19:15 |
jeblair | krotscheck1: /etc/wheel.keytab is root-owned; you'll need to make it owned by jenkins i think | 19:15 |
fungi | AJaeger: i think expecting an explanation is entirely warranted | 19:15 |
krotscheck1 | jeblair: Hrm, good point. | 19:15 |
jeblair | krotscheck1, fungi: unless we want to make a sudo rule for that? | 19:15 |
cody-somerville | luckily the gate is relatively stable. long actual runtimes + flaky jobs in the integrated gate is killer for throughput. | 19:15 |
anteaya | if it is other than the default | 19:15 |
fungi | AJaeger: for any proposed change, not just timeout increases | 19:15 |
jeblair | cody-somerville: i don't think that's luck :) | 19:16 |
AJaeger | So, default value that we review for is 60 mins (the wrapper is then 65 mins)? | 19:16 |
anteaya | AJaeger: I commented on https://review.openstack.org/#/c/270134/2 | 19:16 |
fungi | jeblair: krotscheck1: i prefer relying on the filesystem permissions if we can | 19:16 |
anteaya | AJaeger: I can support that | 19:16 |
jeblair | fungi: owned by jenkins? | 19:16 |
openstackgerrit | Clark Boylan proposed openstack-infra/zuul: Don't use /p/ path for info/refs https://review.openstack.org/275893 | 19:16 |
*** deva_ has joined #openstack-infra | 19:16 | |
clarkb | something like that | 19:16 |
cody-somerville | jeblair: Good point. Lots of hard work has been put into that. :) | 19:16 |
*** annegentle has joined #openstack-infra | 19:16 | |
krotscheck1 | on it | 19:16 |
AJaeger | and if there is a larger timeout, we ask for an explanation - and trust a good explanation | 19:16 |
jeblair | fungi: i was mostly exploring the idea that maybe we wanted to have a sudo rule so that the key isn't actually available to jenkins | 19:17 |
AJaeger | ? | 19:17 |
*** tcammann_ has joined #openstack-infra | 19:17 | |
jeblair | fungi: but i think actually closing that gap would be very difficult | 19:17 |
*** achanda has quit IRC | 19:17 | |
*** esp_ has joined #openstack-infra | 19:17 | |
*** annegentle has quit IRC | 19:17 | |
dougwig | AJaeger: don't most dsvm jobs use 125/120 ? Or did that change? | 19:17 |
AJaeger | So, what should be done with the ironic change? https://review.openstack.org/272481 ? Do you want to +2 it despite sdague's -1? | 19:17 |
fungi | jeblair: i suppose it's a question of relying on there not being a bug in sudo or the tools being allowed by sudo which could leak the file, vs just accepting the user has access to that file and planning accordingly | 19:17 |
AJaeger | dougwig: many do - and new jobs just copy that value without thinking.... | 19:18 |
*** annegentle has joined #openstack-infra | 19:18 | |
*** jgriffith is now known as jgriffith_away | 19:18 | |
openstackgerrit | Michael Krotscheck proposed openstack-infra/system-config: Changed wonership of wheel.keytab to jenkins https://review.openstack.org/275895 | 19:18 |
jeblair | fungi: yeah, i lean toward the latter for now (we already decided to do that, i just hadn't really thought of sudo before now) | 19:18 |
anteaya | AJaeger: well I'm really not a fan of +2'ing something someone else has -1'd | 19:18 |
AJaeger | dougwig: the usual cargocult - use what's available without checking what is sensible... | 19:18 |
*** NobodyCa1 has joined #openstack-infra | 19:18 | |
dougwig | AJaeger: that's what I do, didn't even realize it was incorrect. though i've never seen a >60 job that didn't need to be shot in the head, come to think of it. | 19:18 |
jeblair | anteaya: disagreement is permitted :) | 19:18 |
krotscheck1 | jeblair, fungi: ^^ something something 400 things in zuul something | 19:19 |
* dougwig adjusts his internal constraints. | 19:19 | |
anteaya | jeblair: true | 19:19 |
AJaeger | anteaya: we don't need to +A that one... | 19:19 |
krotscheck1 | Doh type | 19:19 |
krotscheck1 | typo | 19:19 |
openstackgerrit | Michael Krotscheck proposed openstack-infra/system-config: Changed ownership of wheel.keytab to jenkins https://review.openstack.org/275895 | 19:19 |
jeblair | krotscheck1: even your typos have typos | 19:19 |
*** sabeen1 has joined #openstack-infra | 19:19 | |
AJaeger | And I guess we have to continue this discussion another time based on the reviews coming in. | 19:19 |
anteaya | AJaeger jeblair how would we feel if we give that change 24 hours to give sdague a chance to read backscroll if he would like to | 19:19 |
krotscheck1 | jeblair: Yo dawg, I heard you like typo's.... | 19:19 |
AJaeger | anteaya: perfectly fine with me - I can keep my WIP for a day or two... | 19:20 |
jeblair | fungi: https://review.openstack.org/275895 | 19:20 |
anteaya | AJaeger: thanks for understanding, it at least gives him an opportunity to read the discussion | 19:20 |
AJaeger | anteaya: sure. I'd loved to have him here as well... | 19:21 |
openstackgerrit | Daniel Wallace proposed openstack-infra/shade: granting and revoking privs to users and groups https://review.openstack.org/268404 | 19:21 |
anteaya | AJaeger: understood, me too, async working times and all | 19:21 |
*** sabeen2 has joined #openstack-infra | 19:21 | |
fungi | AJaeger: comes down to reviewer judgement and how thorough of an explanation you're looking for. and if there are other core reviewers who are willing to disregard your request and +2/+3 over top of you, then those are expectation discussions to be reraised as a group | 19:21 |
*** _nadya_ has quit IRC | 19:21 | |
*** esp_ has quit IRC | 19:21 | |
AJaeger | fungi: so far, I just asked questions and didn't vote on these... | 19:22 |
*** hasharDinnerTime is now known as hashar | 19:22 | |
AJaeger | Once I was happy with answer - or the change - I gave my +2 ;) | 19:22 |
*** tcammann_ has quit IRC | 19:22 | |
dougwig | AJaeger: gotta juice that disagreement stat! | 19:22 |
anteaya | ha ha ha | 19:22 |
anteaya | I'm sure that is at the top of AJaeger's mind when he reviews | 19:23 |
*** deva_ has quit IRC | 19:23 | |
fungi | as for picking a sane "default" timeout for a given template, it's a tuning question (how often do we want to be reviewing requests to increase timeouts on specific jobs vs drawbacks of just increasing the starting/default timeout expectation) | 19:23 |
*** NobodyCa1 has quit IRC | 19:23 | |
*** sabeen1 has quit IRC | 19:24 | |
AJaeger | what I've learned is that a full neutron setup needs more than 1 hour... But if neutron is not in the equation or only a small subset is run - like in the ironic case, we should be fine with an hour... | 19:24 |
jeblair | krotscheck1: i also aprvd the wheel build job since the slave exists; i'm waiting on the release job until after we manually vos release a few times | 19:24 |
krotscheck1 | jeblair: ...ooookaaaaaaay. | 19:25 |
AJaeger | fungi, I'm not looking for a thorough explanation, I'm looking that they thought about the value and have some comparisons done - instead of blindly copying a value... | 19:25 |
anteaya | AJaeger: +1 | 19:25 |
krotscheck1 | jeblair: That's going to fail a few times if wheel.keytab's not set up yet. | 19:25 |
fungi | i assume all the bare uuid hosts now appearing on http://puppetboard.openstack.org/ are infra-cloud servers? | 19:25 |
slogan621 | I assume only people on infra team can review this: https://review.openstack.org/#/c/275873/ ? That's ok if so, just wondering. | 19:26 |
slogan621 | BTW, hi :-) | 19:26 |
AJaeger | slogan621: everybody can review | 19:26 |
jeblair | krotscheck1: it's been approved too, and i'm hoping it lands before jjb gets around to updating, but yeah, they are racing. | 19:26 |
krotscheck1 | go patch go! | 19:26 |
crinkle | fungi: no i don't think so | 19:26 |
jeblair | elephant race | 19:26 |
fungi | i'll see if i can track some down then | 19:26 |
dougwig | AJaeger: oh indeed, some neutron are cresting an hour. that needs some investigation, i think. | 19:27 |
AJaeger | thanks all for the comments on the job run times, I'll feel better continuing to review... | 19:27 |
slogan621 | I'd suppose I'd prefer to let the experts on infra do the review, though I certainly want to support it, or vote if that matters. | 19:27 |
zaro | clarkb: comment on that zuul change. | 19:27 |
*** bpokorny_ has joined #openstack-infra | 19:29 | |
*** jgriffith_away is now known as jgriffith | 19:29 | |
fungi | also we seem to have a lot (~50) of nodepool nodes stuck in delete in rackspace dfw and ord for between 1 and 6 hours now | 19:29 |
anteaya | AJaeger: thanks for bringing it up for discussion :) | 19:30 |
*** bpokorny_ has quit IRC | 19:30 | |
*** bpokorny_ has joined #openstack-infra | 19:30 | |
sdague | back from food | 19:30 |
anteaya | slogan621: you might find this blog post of mine enjoyable: http://anteaya.info/blog/2013/03/21/reviewing-an-openstack-patch/ | 19:30 |
anteaya | sdague: welcome back | 19:31 |
sdague | my point is the 3 hour timeout is the backstop for runaways | 19:31 |
sdague | no one wants jobs that run that long | 19:31 |
sdague | every time anything crests an hour regularly it, rightly, causes angst | 19:31 |
jeblair | 1.5 or 2x should be good for backstop, though, yeah? | 19:32 |
*** bpokorny has quit IRC | 19:32 | |
sdague | and the gate fails that happen when a 60 minute job becomes 61 minutes, and times out uploading files is just terrible experience | 19:32 |
fungi | wow, here's a nova list result for an example of one of the "stuck" nodes... weird: | aa512172-d34f-4114-af1b-2300f47b1e0a | bare-trusty-rax-ord-7677301 | BUILD | deleting | NOSTATE | public=2001:4801:7829:101:be76:4eff:fe11:cd99, 192.237.175.126; private=10.210.225.116 | | 19:32 |
*** bpokorny_ has quit IRC | 19:32 | |
sdague | jeblair: sure, if you are going to keep updating as things shift around | 19:33 |
jeblair | sdague: we shouldn't timeout uploading files; if we are, we can adjust the cushion in devstack-gate for that | 19:33 |
sdague | my point was mostly that a large common timeout that ensures things return, eventually, is fine. No one runs up to that load intentionally. | 19:34 |
sdague | it just seems like a lot of extra work to judge every job | 19:34 |
jeblair | sdague: yeah, i agree with you on that, i just lean toward 2h instead of 3h for that purpose. :) | 19:34 |
*** bpokorny has joined #openstack-infra | 19:34 | |
AJaeger | we have two timeouts for the devstack jobs, the wrapper one and the devstack variable - the wrapper is an extra 6 mins. We can easily enlarge that wrapper one. | 19:34 |
jeblair | AJaeger: oh, that's in p-c, right, sorry i misplaced it :) | 19:35 |
*** e0ne has quit IRC | 19:35 | |
*** achanda has joined #openstack-infra | 19:35 | |
jeblair | we should definitely do that if we are having upload timeouts; sdague do you know if that's a current problem? | 19:35 |
sdague | sure, 2h is probably fine. There are some 3h ones in there for some jobs that end up on slower nodes and already start at > 1 hour | 19:35 |
sdague | jeblair: I don't know if we are right now, I've seen it in the past | 19:35 |
openstackgerrit | Merged openstack-infra/zuul: Fix memory leak reloading triggers https://review.openstack.org/275483 | 19:35 |
sdague | the dual timeout management bit is just a little goofy | 19:35 |
jeblair | sdague: ok; well if you do see it, it's easily fixable | 19:36 |
sdague | didn't zaro build a jenkins module to make us not have to do that | 19:36 |
AJaeger | jeblair: p-c? | 19:36 |
slogan621 | anteaya: thanks | 19:36 |
jeblair | AJaeger: project-config | 19:36 |
AJaeger | jeblair: ah - yeah, jenkins/jobs/devstack-gate.yaml... | 19:36 |
* AJaeger needs to step out for 10 mins | 19:36 | |
jeblair | i'm going to go work on goofy things | 19:36 |
*** ihrachys has joined #openstack-infra | 19:37 | |
zaro | sdague: i remember fixing stuff in timeout plugin but don't remember what it was atm | 19:38 |
zaro | clarkb: have we ever seen this issue? https://gerrit-review.googlesource.com/#/c/70627 | 19:38 |
*** daemontool has joined #openstack-infra | 19:38 | |
clarkb | sdague: zaro the plugin work happened but no one updated the jobs | 19:38 |
clarkb | zaro: that may be related to apache 500ing | 19:39 |
zaro | clarkb: i remember updating the macro upon the fix to the plugin | 19:39 |
ikhudoshyn | Hello everybody! Could you please take a look at https://review.openstack.org/#/c/274668/ | 19:39 |
ikhudoshyn | clarkb: jeblair AJaeger sdague ^^ | 19:40 |
zaro | clarkb: the discussion [1] has a specific stack trace. | 19:40 |
*** annegentle has quit IRC | 19:40 | |
ikhudoshyn | we're about to move Rally testing to keystone api v3 and to add a dedicated job for v2 | 19:40 |
anteaya | slogan621: welcome, do join us and look at a few patches | 19:41 |
anteaya | slogan621: starting out by saying "I looked at this." is fine | 19:41 |
krotscheck1 | jeblair: Daycare mode. Back when possible. | 19:42 |
*** krotscheck1 is now known as krotscheck_dcm | 19:42 | |
clarkb | zaro: I can check for that in a bit | 19:42 |
*** achanda has quit IRC | 19:43 | |
*** annegentle has joined #openstack-infra | 19:43 | |
*** vgridnev has quit IRC | 19:43 | |
*** boris-42 has quit IRC | 19:43 | |
*** _nadya_ has joined #openstack-infra | 19:44 | |
*** Sukhdev has quit IRC | 19:44 | |
zaro | sdague, clarkb : RE: jenkins timeout macro, https://review.openstack.org/#/c/95933/ | 19:44 |
*** sdake has joined #openstack-infra | 19:47 | |
*** ociuhandu has quit IRC | 19:47 | |
clarkb | zaro: ah ok so the missing piece is in devstack-gate to do the soft timeout | 19:48 |
clarkb | then remove all the vars for setting in the jobs | 19:48 |
*** infra-red has joined #openstack-infra | 19:49 | |
slogan621 | andreas: would you suggest setting up pypi now is important, or should we just remove references to pypi and revisit at a later time? I don't believe pypi is a requirement of all projects, correct? | 19:50 |
anteaya | slogan621: pypi is not a requirement | 19:51 |
AJaeger | slogan621: you can revisit at a later time. | 19:51 |
anteaya | slogan621: if you are uncertain remove the job for now | 19:51 |
slogan621 | okay, we'll do that, and thanks for the catch | 19:51 |
*** daemontool has quit IRC | 19:51 | |
AJaeger | personally, i would do it now so that nobody else takes the name ;) but that's your call. | 19:51 |
anteaya | and as AJaeger says, add it when you want to support it | 19:51 |
anteaya | slogan621: you can register the name on pypi but still hold off on the pypi job if you like | 19:52 |
anteaya | and as AJaeger says, getting the name is useful | 19:52 |
*** infra-re_ has quit IRC | 19:52 | |
AJaeger | zaro, we do use build-timeout ... | 19:52 |
jeblair | zaro, clarkb: i will propose a d-g patch | 19:52 |
*** daemontool has joined #openstack-infra | 19:53 | |
*** rajinir has quit IRC | 19:55 | |
AJaeger | jeblair: so, missing piece is to take in d-g BUILD_TIMEOUT variable? And then change project-config to not have two timeouts? | 19:55 |
clarkb | zaro: [2016-02-02 23:13:16,765] WARN org.eclipse.jetty.io.SelectorManager : Could not process key for channel java.nio.channels.SocketChannel shows up pretty frequently in yesterday's error log | 19:55 |
clarkb | zaro: does not show up in today's but we restarted gerrit which may explain it | 19:55 |
*** diana_clarke has quit IRC | 19:57 | |
*** jsavak has quit IRC | 19:58 | |
*** jsavak has joined #openstack-infra | 19:58 | |
*** achanda has joined #openstack-infra | 19:59 | |
*** baoli_ has quit IRC | 19:59 | |
zaro | clarkb: wouldn't that issue cause incoming connections to bloat? did we see that? | 20:00 |
*** sdake_ has joined #openstack-infra | 20:00 | |
*** annegentle has quit IRC | 20:01 | |
AJaeger | ikhudoshyn: what identity version do you currently test against (without the change)? | 20:01 |
clarkb | no I think this is a symptom of incoming connections bloating | 20:01 |
*** slogan621 has quit IRC | 20:02 | |
*** annegentle has joined #openstack-infra | 20:02 | |
*** vgridnev has joined #openstack-infra | 20:02 | |
ikhudoshyn | AJaeger: once sdague reverted that patch to Devstack, we currently test against v2 | 20:02 |
*** diana_clarke has joined #openstack-infra | 20:02 | |
AJaeger | ikhudoshyn: so, your change does two things: 1) Switch default for all jobs to v3; 2) Add new v2 job. Correct? | 20:03 |
ikhudoshyn | AJaeger: with my patch we're about to run everything about v3 explicitly | 20:03 |
ikhudoshyn | and have one job that runs against v2, explicitly as well | 20:03 |
*** annegentle has quit IRC | 20:03 | |
*** sdake has quit IRC | 20:03 | |
*** annegentle has joined #openstack-infra | 20:04 | |
zaro | clarkb: alright, want to try to increase maxQueued to 200 before increasing memory? | 20:04 |
fungi | as for the bare uuid hosts showing up in puppetdb, it looks like we've somehow started showing no hostnames for unreported hosts | 20:04 |
zaro | i can create the changes for that | 20:04 |
AJaeger | ikhudoshyn: I'll comment on the patch - and will also request a better commit message. | 20:04 |
clarkb | zaro: we can, though I think it is all tied together, bigger queues require more memory | 20:04 |
clarkb | zaro: but it's worth a shot | 20:04 |
*** annegentle has quit IRC | 20:05 | |
zaro | clarkb: alright gotta afk for a little bit but changes should be up before end of day. | 20:05 |
fungi | did we merge something in the past day or so which could alter how puppetdb tracks reporters? | 20:05 |
*** annegentle has joined #openstack-infra | 20:05 | |
ikhudoshyn | AJaeger: yep, thanks for pointing that. I'll update the message right now | 20:06 |
AJaeger | ikhudoshyn: wait for my comments ;) | 20:06 |
AJaeger | ikhudoshyn: added them... | 20:06 |
clarkb | zaro: it is also entirely possible that queues are small cost and queuing up avoids 500s that make apache unhappy | 20:07 |
clarkb | zaro: so ya I would go ahead and give it a shot | 20:07 |
openstackgerrit | Costin Galan proposed openstack/requirements: Remove "args[0]" before trying to rename the file https://review.openstack.org/275905 | 20:07 |
*** annegentle has quit IRC | 20:08 | |
jeblair | zaro, clarkb: i don't see evidence that BUILD_TIMEOUT is set on this page: https://jenkins02.openstack.org/job/gate-nova-docs/4663/injectedEnvVars/ | 20:08 |
*** annegentle has joined #openstack-infra | 20:08 | |
clarkb | jeblair: I agree, I see it in the config but not the injected vars list | 20:09 |
*** ihrachys has quit IRC | 20:10 | |
openstackgerrit | James E. Blair proposed openstack-infra/devstack-gate: Use BUILD_TIMEOUT https://review.openstack.org/275908 | 20:10 |
jeblair | clarkb, zaro: ^ i included a debugging echo, so we can check that it's actually there | 20:11 |
jeblair | sdague: ^ | 20:11 |
*** ybathia has joined #openstack-infra | 20:12 | |
fungi | so... it appears we have duplicate keys in the puppetdb postgres backend... this is starting to trigger a bit of déjà vu for me | 20:12 |
jeblair | fungi: you and the dbms apparently | 20:13 |
fungi | started showing up in the log at 2016-02-03 18:41:59 | 20:13 |
fungi | jeblair: ha! yes | 20:13 |
*** imcsk8 has joined #openstack-infra | 20:14 | |
sdague | jeblair: cool | 20:14 |
*** nmagnezi has joined #openstack-infra | 20:14 | |
sdague | seems like we might want to bump up the default to 120 in d-g at the same time, as 60 is not a great default for it if not provided | 20:15 |
sdague | and that will be super helpful to not have 2 values in the template | 20:15 |
AJaeger | jeblair: https://jenkins02.openstack.org/job/gate-tempest-dsvm-centos7/467/console | 20:15 |
AJaeger | "BUILD_TIMEOUT is 7500000" | 20:16 |
*** ldnunes has quit IRC | 20:16 | |
jeblair | sdague: yeah probably so; was just trying to keep change minimal | 20:16 |
jeblair | AJaeger: HA! | 20:16 |
*** jsavak has quit IRC | 20:16 | |
fungi | beginning at 18:40:47 jenkins02 reports (for example) are now indicated in the logs reporting as 44dbb9fb-dce7-4694-9365-91e25f97eb1f instead | 20:16 |
fungi | and also as jenkins02 | 20:16 |
clarkb | it is over 90000 | 20:16 |
*** annegentle has quit IRC | 20:16 | |
sdague | AJaeger: nice | 20:16 |
AJaeger | jeblair: but the job has timeout: 125 | 20:17 |
jeblair | AJaeger: assuming that's milliseconds (java standard), that converts to 125 minutes | 20:17 |
jeblair | so that seems to match up | 20:17 |
imcsk8 | hello, where can i download a centos image like the one used on the gate jobs? | 20:17 |
sdague | ah, yeh, that makes a ton of sense | 20:17 |
openstackgerrit | Illia Khudoshyn proposed openstack-infra/project-config: Run all Rally jobs against Keystone API v3. Add job for v2. https://review.openstack.org/274668 | 20:17 |
jeblair | so we just need to divide by 60k | 20:17 |
fungi | so somehow we've got hosts now reporting into puppetdb as both their hostname and their nova uuid, and i think that's the source of the duplicate keys it's complaining about | 20:17 |
*** jsavak has joined #openstack-infra | 20:17 | |
AJaeger | jeblair: yes, should be milliseconds - your math is right. It works! | 20:18 |
jeblair | fungi: are they doing it during the same runs? | 20:18 |
anteaya | dougwig: you might want to take a peek at https://review.openstack.org/#/c/275908/ | 20:18 |
fungi | yes | 20:18 |
AJaeger | jeblair: output the value in minutes, please ;) | 20:18 |
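
The conversion discussed above can be sketched in shell. This is a minimal sketch, assuming (as jeblair does) that BUILD_TIMEOUT is expressed in milliseconds; `ms_to_minutes` is a hypothetical helper named here for illustration and is not taken from change 275908.

```shell
#!/bin/sh
# Hypothetical helper: convert a millisecond timeout to whole minutes.
# 60000 ms = 1 minute ("divide by 60k").
ms_to_minutes() {
    echo $(( $1 / 60000 ))
}

# Default to the value quoted from the console log if the variable is not set.
BUILD_TIMEOUT=${BUILD_TIMEOUT:-7500000}
echo "BUILD_TIMEOUT is $(ms_to_minutes "$BUILD_TIMEOUT") minutes"
```

With the quoted value, 7500000 / 60000 = 125, which matches the job's `timeout: 125` setting.
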
fungi | need to see what changes we've made today that would have kicked in an hour or so ago | 20:18 |
jeblair | fungi: hrm; i might expect some hosts to switch between name and uuid as we spun up a replacement but then deleted an old one; but both at once sounds buggy. | 20:19 |
fungi | also the example i'm looking at is jenkins02. it's not been replaced in ages | 20:20 |
fungi | oh, actually its in the log with a "replace facts" under the uuid but "store report" under the hostname | 20:21 |
fungi | so maybe this is not a duplicate report | 20:21 |
fungi | though the duplicate key tracebacks look, from the index column name, like they're probably report unique identifiers | 20:22 |
openstackgerrit | James E. Blair proposed openstack-infra/devstack-gate: Use BUILD_TIMEOUT https://review.openstack.org/275908 | 20:22 |
*** fitoduarte has joined #openstack-infra | 20:23 | |
AJaeger | anteaya: could you review https://review.openstack.org/#/c/274832/ after a rebase again, please? The 3 stacked changes are fine to merge now | 20:23 |
jeblair | AJaeger, clarkb, zaro, sdague: this looks right, so i'm going to drop the debug lines now: https://jenkins03.openstack.org/job/gate-tempest-dsvm-centos7/329/console | 20:25 |
AJaeger | jeblair: really two hours now as default? | 20:25 |
AJaeger | jeblair: yes, it's fine. | 20:25 |
jeblair | AJaeger: i think that makes sense for devstack-gate itself, yeah. this won't affect us running in jenkins so we don't need to bikeshed too much over it :) | 20:26 |
*** Sukhdev has joined #openstack-infra | 20:28 | |
AJaeger | jeblair: so, what would we set in jenkins? I agree, jenkins value is important... | 20:28 |
imcsk8 | pabelanger: hello, do you have the centos gate image for download? or a way for me to generate it? | 20:28 |
jeblair | AJaeger: well, that's the earlier conversation :) | 20:28 |
AJaeger | ;) | 20:28 |
openstackgerrit | James E. Blair proposed openstack-infra/devstack-gate: Use BUILD_TIMEOUT https://review.openstack.org/275908 | 20:30 |
jeblair | AJaeger, clarkb, zaro, sdague: ^ should be ready | 20:30 |
*** sabeen2 has quit IRC | 20:31 | |
*** sabeen1 has joined #openstack-infra | 20:31 | |
sdague | jeblair: +2 | 20:31 |
clarkb | jeblair: trade you https://review.openstack.org/#/c/274821/1 ? which is another d-g reliability improvement | 20:31 |
clarkb | sdague: ^ you may be interested in that one as well | 20:31 |
*** jsavak has quit IRC | 20:31 | |
jeblair | clarkb: do we want to follow up with bluebox? | 20:32 |
fungi | nibalizer: do you have much experience delving into puppetdb database issues? | 20:32 |
clarkb | jeblair: I think we do, but at the same time we don't want to wait on that | 20:32 |
jeblair | clarkb: are we ready to say the letters j l k? :) | 20:32 |
jeblair | clarkb: i wasn't clear on whether we still thought the conntrack module could still be an issue | 20:32 |
sdague | clarkb: oh, is that why internap is an issue? | 20:33 |
clarkb | sdague: and bluebox | 20:33 |
jeblair | sdague: internap should be fixed, right clarkb ? | 20:33 |
clarkb | jeblair: I think it may still be the issue | 20:33 |
clarkb | jeblair: maybe, the health dashboard is buggy so I wasn't able to say definitively | 20:33 |
fungi | i'm not finding the credentials to log into the puppetdb postgres instance to perform queries interactively | 20:33 |
clarkb | jeblair: it looked like we had a 1:1 pass fail ratio but some passes were actually fails | 20:33 |
*** jsavak has joined #openstack-infra | 20:33 | |
jeblair | clarkb: logstash? | 20:34 |
clarkb | jeblair: I haven't asked it yet but can do so really quick | 20:34 |
sdague | clarkb: do you have multinode jobs passing on one of them with this? | 20:34 |
sdague | the full job was ovh in that run | 20:34 |
clarkb | sdague: the grenade one passed on bluebox on one of the rechecks, the non grenade one is independently broken | 20:35 |
clarkb | volume based live migration is failing frequently regardless of gre/vxlan | 20:35 |
openstackgerrit | Merged openstack-infra/project-config: Re-Enable django_openstack_auth/designate-dashboard translations https://review.openstack.org/274832 | 20:35 |
sdague | clarkb: sure, just that one test was failing though | 20:35 |
clarkb | right, we no longer fail on ssh issues | 20:36 |
clarkb | we fail on other things that were masked by ssh issues | 20:36 |
jeblair | clarkb: so i should see a gate-grenade-dsvm-multinode on bb? | 20:36 |
jeblair | (i haven't found it yet) | 20:36 |
clarkb | jeblair: it may have been internap let me look at the logs | 20:37 |
tristanC | greeting folks, is it normal if recheck comment doesn't retrigger checks ? | 20:37 |
anteaya | tristanC: what patch? | 20:37 |
jeblair | clarkb: i'm just seeing rax and ovh for that job on that change | 20:38 |
fungi | tristanC: the usual reason it wouldn't is if it's already running and hasn't reported yet | 20:38 |
clarkb | hrm I thought I got it to run on either internap or bb for each of the three multinode jobs | 20:38 |
*** ybathia has quit IRC | 20:39 | |
clarkb | ya I don't see it either so I must've misremembered | 20:39 |
clarkb | it definitely ran the nova net job on bluebox and failed | 20:39 |
tristanC | anteaya: it's https://review.openstack.org/#/c/275736/ , but fungi nailed it and it's still queued | 20:39 |
clarkb | but that seemed "normal" | 20:39 |
jeblair | clarkb: oh there's a neutron: http://logs.openstack.org/21/274821/1/check/gate-tempest-dsvm-neutron-multinode-full/49ae46c/console.html | 20:39 |
jeblair | two: http://logs.openstack.org/21/274821/1/check/gate-tempest-dsvm-neutron-multinode-full/f5fa3ad/console.html | 20:40 |
jeblair | one each success/failure | 20:40 |
anteaya | tristanC: ah ha | 20:40 |
fungi | tristanC: if it's a security fix and needs to get merged urgently, we can bypass the check pipeline and enqueue it directly into the gate since there's a pretty lengthy backlog in check | 20:41 |
openstackgerrit | Illia Khudoshyn proposed openstack-infra/project-config: Run all Rally jobs against Keystone API v3. Add job for v2. https://review.openstack.org/274668 | 20:41 |
jeblair | sdague: why the recheck? | 20:41 |
sdague | ? | 20:41 |
clarkb | logstash says internap multinode grenade is not working | 20:41 |
jeblair | roaet-: on https://review.openstack.org/274821 | 20:41 |
jeblair | grr | 20:41 |
tristanC | fungi: then yes please, I'd like to recheck the master fix since the failure seems unrelated | 20:41 |
jeblair | sdague: on https://review.openstack.org/274821 | 20:41 |
jeblair | sdague: just want to know what you're looking for | 20:42 |
sdague | jeblair: to try to get multinode to hit different nodes | 20:42 |
*** tiswanso has quit IRC | 20:42 | |
jeblair | sdague: do you want to see at least one internap result? | 20:42 |
sdague | yeh | 20:42 |
jeblair | ok, so i think we have ovh, rax, and bb covered; just waiting on internap then | 20:42 |
tristanC | fungi: not master, I meant stable/liberty which is change 275736 << this one could use a recheck probably because of bug 1500615 | 20:43 |
openstack | bug 1500615 in OpenStack Compute (nova) "Large Ops scenario is taking too long" [High,Confirmed] https://launchpad.net/bugs/1500615 | 20:43 |
openstackgerrit | Merged openstack-infra/shade: Fix normalize_role_assignments() return value https://review.openstack.org/271068 | 20:44 |
clarkb | sdague: jeblair https://jenkins03.openstack.org/job/gate-grenade-dsvm-multinode/2250/ will show us | 20:44 |
fungi | tristanC: yeah, i did 275736 | 20:44 |
sdague | clarkb: great | 20:44 |
fungi | tristanC: it's in the gate pipeline now and should get workers allocated as soon as the window progresses there by a couple changes | 20:44 |
jeblair | clarkb, sdague: i +2d but did not +A and left a comment about current status | 20:44 |
* jeblair lunches | 20:44 | |
AJaeger | anteaya: thanks for the review! 275140 and 275225 are the next on the stack if you have further review time, please | 20:44 |
anteaya | AJaeger: okay | 20:44 |
clarkb | the big difference between vxlan and gre is vxlan uses udp so if udp is allowed vxlan should work | 20:45 |
clarkb | gre on the other hand is its own ip protocol and subject to specialness | 20:45 |
tristanC | fungi: thanks! I'm checking the other patch now, they'll probably need another patchset | 20:45 |
*** gordc has quit IRC | 20:46 | |
nibalizer | fungi: not much | 20:46 |
nibalizer | and i am way past eod, but whats up? | 20:46 |
fungi | nibalizer: oh, right, you're in eu | 20:47 |
*** sridhar_ram has quit IRC | 20:47 | |
Shrews | jeblair: i have a shade question for you about something i *think* you added (inner exceptions) when you get time | 20:47 |
openstackgerrit | David Moreau Simard proposed openstack-infra/project-config: Run packstack jobs inside a root shell https://review.openstack.org/275915 | 20:47 |
*** dtardivel has quit IRC | 20:48 | |
fungi | nibalizer: puppetdb started complaining a couple hours ago about duplicate report keys in pgsql, and now a lot of our hosts are showing up in puppetboard as unreported and with their nova uuids instead of their hostnames | 20:48 |
*** annegentle has joined #openstack-infra | 20:48 | |
fungi | nibalizer: anyway, don't sweat it. enjoy your evening! | 20:49 |
* AJaeger thanks anteaya | 20:49 | |
anteaya | AJaeger: welcome, thanks for your patience with me | 20:49 |
anteaya | AJaeger: you are doing the lion's share of the work on project-config and I'm sorry I don't have the energy to help you out more | 20:49 |
anteaya | AJaeger: thanks for understanding | 20:49 |
AJaeger | challenge is if I have patches on my own - thanks for reviewing those, anteaya ! | 20:51 |
AJaeger | pabelanger: could you review a grafana change, please? https://review.openstack.org/#/c/275862/ | 20:52 |
openstackgerrit | Daniel Wallace proposed openstack-infra/shade: granting and revoking privs to users and groups https://review.openstack.org/268404 | 20:52 |
nibalizer | fungi: ya i think we should patch the upload_to_puppetdb function in ansible-puppet to take a debug flag that will cause it to keep the json blob around | 20:52 |
anteaya | AJaeger: :) welcome | 20:52 |
nibalizer | that way we can inspect those for correctness | 20:53 |
*** apoorvad has quit IRC | 20:53 | |
AJaeger | fungi, clarkb, jeblair: Does every job need a timeout? Do we have an implicit timeout? I see jobs like gate-manila-buildimage that have no timeout and wondered whether that's fine... | 20:54 |
openstackgerrit | Merged openstack-infra/project-config: Re-enable repos for translation https://review.openstack.org/275140 | 20:54 |
fungi | nibalizer: i'm also noticing in our puppet_run_all.log that the hostname for, e.g., jenkins02.openstack.org stopped being mentioned around the timeframe this started and its nova uuid began showing up instead | 20:54 |
fungi | nibalizer: mostly thinking we've made a change of some sort somewhere that should be fixed/reverted, but so far not found any good potential candidates | 20:55 |
*** infra-re_ has joined #openstack-infra | 20:55 | |
*** dims has quit IRC | 20:56 | |
nibalizer | does 02 show up ansible all --list-hosts ? | 20:56 |
clarkb | AJaeger: there is a default timeout in defaults.yaml | 20:56 |
openstackgerrit | Merged openstack-infra/project-config: Cleanup translation scripts https://review.openstack.org/275225 | 20:56 |
fungi | nibalizer: nope, but its uuid does :/ | 20:56 |
nibalizer | huh | 20:56 |
nibalizer | in the past we saw ansible fall back to uuids when we had hostname duplicates | 20:57 |
fungi | the pgsql duplicate key constraint exceptions may just be a symptom of this | 20:57 |
nibalizer | id agree with that | 20:57 |
fungi | i checked that first actually, there is one and only one jenkins02.openstack.org in nova | 20:58 |
AJaeger | clarkb: thanks, found it - 30 mins | 20:58 |
AJaeger | this is fine | 20:58 |
nibalizer | a conundrum | 20:58 |
*** infra-red has quit IRC | 20:58 | |
nibalizer | mordred: Shrews any thoughts? | 20:58 |
fungi | nibalizer: at least in rax-dfw... i don't know how to easily scan across all providers/regions | 20:59 |
fungi | but it seems unlikely that we would have stood up duplicates of _most_ of our infrastructure in one of our other accounts known to ansible | 20:59 |
Shrews | nibalizer: i have no idea what crazy things ansible does outside the OS modules :) | 20:59 |
AJaeger | dhellmann: could you review again the solum reno change, please? https://review.openstack.org/#/c/243295/ | 21:01 |
dhellmann | AJaeger : sure, after this meeting | 21:01 |
AJaeger | dhellmann: no hurry - thanks | 21:01 |
fungi | nibalizer: Shrews: i wonder what would happen if i moved the current inventory cache out of the way and allowed it to get regenerated | 21:01 |
AJaeger | fungi: is the rename of .log to .log.txt now fine to merge? https://review.openstack.org/203099 | 21:02 |
openstackgerrit | sebastian marcet proposed openstack-infra/openstackid: Fix on username generation https://review.openstack.org/275924 | 21:02 |
fungi | nibalizer: Shrews: looks like it was last modified shortly before these symptoms began | 21:03 |
fungi | AJaeger: i couldn't really tell if lifeless's comments were weak objections or merely observations. at any rate i don't have a strong feeling on it either way, it was proposed as one potential simple way out of the current challenge of viewing tox logs without having to download them to your local filesystem first | 21:05 |
*** dims has joined #openstack-infra | 21:05 | |
*** kurtmartin has quit IRC | 21:06 | |
*** amitgandhinz has quit IRC | 21:06 | |
nibalizer | fungi: shade itself has an inventory command you could run | 21:06 |
fungi | we could instead go with a more focused solution which just affected the tox logs and not every file with a .log extension, but i don't have time right now to write that if so | 21:06 |
fungi | AJaeger: ^ | 21:06 |
prometheanfire | can someone reworkflow this? https://review.openstack.org/#/c/273790/ | 21:06 |
fungi | nibalizer: what's shade's cli entrypoint? | 21:07 |
clarkb | fungi: >>> import shade | 21:07 |
*** _nadya_ has quit IRC | 21:07 | |
fungi | oh! there's a "shade-inventory" entrypoint | 21:07 |
fungi | tab completion ftw | 21:07 |
*** baoli has joined #openstack-infra | 21:07 | |
*** EricGonc_ has joined #openstack-infra | 21:08 | |
fungi | it's no wonder we cache this. not exactly quick | 21:08 |
*** miqui has joined #openstack-infra | 21:08 | |
AJaeger | fungi, so problem is http://logs.openstack.org/29/273929/4/check/gate-networking-midonet-python27/bfe8f02/tox/py27-0.log ? That works just fine in firefox and chromium, both open in the browser | 21:09 |
fungi | also very verbose. i should have dumped it to a file | 21:09 |
clarkb | sdague: jeblair https://jenkins02.openstack.org/job/gate-tempest-dsvm-multinode-full/2893/ is running on internap too | 21:09 |
*** amrith is now known as _amrith_ | 21:09 | |
*** EricGonczer_ has quit IRC | 21:10 | |
AJaeger | fungi, the change only takes care of tox files, nothing else. It's only in run-tox.sh script | 21:10 |
fungi | AJaeger: odd, for me in firefox it says i've chosen to open a file with type x-application/log and wants to know if i wish to download it or which application should handle it (and thinks libreoffice writer is an excellent default) | 21:10 |
fungi | AJaeger: lifeless was pointing out that when you run testr-based tests under tox you will also end up with subunit streams with .log extensions | 21:11 |
AJaeger | Ah... | 21:11 |
fungi | and that you may wish for a browser default of downloading those rather than trying to display them raw | 21:11 |
clarkb | they are binary so can't be treated as text | 21:12 |
fungi | right | 21:12 |
fungi | also the tox logs sort of are as well since they contain ansi escapes | 21:12 |
*** EricGonczer_ has joined #openstack-infra | 21:12 | |
fungi | or potentially can at least | 21:12 |
anteaya | jeblair: 275483 needs a zuul restart to be in production, yes? | 21:12 |
*** infra-re_ has quit IRC | 21:12 | |
*** apoorvad has joined #openstack-infra | 21:13 | |
*** clayton has quit IRC | 21:14 | |
*** jesusaurus has joined #openstack-infra | 21:14 | |
*** boris-42 has joined #openstack-infra | 21:14 | |
AJaeger | fungi: I'll update to only get .tox/ log files - and then let others judge whether to take it | 21:15 |
fungi | AJaeger: awesome, thanks | 21:15 |
fungi | i'm cool with that as a slightly more focused solution | 21:15 |
*** tcammann_ has joined #openstack-infra | 21:16 | |
*** EricGonc_ has quit IRC | 21:16 | |
*** kgiusti has left #openstack-infra | 21:16 | |
*** esp_ has joined #openstack-infra | 21:16 | |
fungi | nibalizer: Shrews: ansible-inventory on the puppetmaster also only reports one jenkins02, but it's possible something went weird when it last updated the cache i guess | 21:16 |
*** Swami has quit IRC | 21:17 | |
AJaeger | fungi, done. | 21:17 |
AJaeger | And time to leave! | 21:17 |
openstackgerrit | Andreas Jaeger proposed openstack-infra/project-config: Rename .log to .log.txt in run-tox https://review.openstack.org/203099 | 21:17 |
Sam-I-Am | AJaeger: waiiiit! :) | 21:17 |
* AJaeger waves good night | 21:17 | |
AJaeger | Sam-I-Am: leave it as comment, I'll answer tomorrow | 21:17 |
anteaya | AJaeger: good night | 21:18 |
Sam-I-Am | ok, its in -doc | 21:18 |
Sam-I-Am | tox problem i'm trying to solve | 21:18 |
*** auggy has left #openstack-infra | 21:18 | |
anteaya | Sam-I-Am: ah a -doc thing | 21:18 |
Sam-I-Am | although maybe someone else here can help | 21:18 |
Sam-I-Am | trying to figure out the problem here - http://logs.openstack.org/92/275392/2/check/gate-tempest-dsvm-networking-ovn/20e990f/logs/devstacklog.txt.gz | 21:18 |
anteaya | Sam-I-Am: ask away | 21:18 |
Sam-I-Am | tox -egenconfig works fine when i run it locally | 21:18 |
Sam-I-Am | yet its breaking in the gate | 21:18 |
fungi | nibalizer: Shrews: so anyway, i have a feeling if i let it regenerate the cache it'll go back to normal. but wondering how to track down what might have caused this | 21:18 |
*** vgridnev has quit IRC | 21:18 | |
anteaya | Sam-I-Am: what is the link to the tox file used for that test? | 21:19 |
fungi | anyway, i'll save the old cache and see what happens on the next pass | 21:19 |
Sam-I-Am | anteaya: https://github.com/openstack/networking-ovn/blob/master/tox.ini | 21:19 |
anteaya | thanks | 21:20 |
*** gordc has joined #openstack-infra | 21:20 | |
*** tcammann_ has quit IRC | 21:20 | |
*** esp_ has quit IRC | 21:21 | |
Shrews | fungi: i'm fairly certain shade inventory code doesn't muck around with server names. did this happen with a single provider? or all? | 21:21 |
fungi | Shrews: hard to tell, but looks like it may have only happened for hosts in rax-dfw. unfortunately that's where most of our hosts are so it's hard to rule out it being random | 21:22 |
*** jesusaurus has quit IRC | 21:22 | |
*** amitgandhinz has joined #openstack-infra | 21:22 | |
*** auggy has joined #openstack-infra | 21:22 | |
*** Swami has joined #openstack-infra | 21:23 | |
fungi | Shrews: oh, actually it did happen for a couple in other clouds too | 21:23 |
Shrews | fungi: copypasta me a few examples? | 21:24 |
*** daemontool_ has joined #openstack-infra | 21:25 | |
gtmanfred | is there a git review command to go get a clone a pr down to look at it? | 21:25 |
fungi | also after moving the ansible-inventory.cache out of the way, the new one has the correct hostnames again | 21:25 |
fungi | Shrews: putting that together now | 21:25 |
Shrews | gtmanfred: git review -d <review#> | 21:25 |
anteaya | Sam-I-Am: the only other project I see that uses oslo-config-generator --namespace is watcher | 21:25 |
fungi | gtmanfred: git review -d or -x depending on whether you want to fetch or cherry-pick | 21:25 |
gtmanfred | cool, thanks | 21:25 |
gtmanfred | perfect | 21:25 |
anteaya | Sam-I-Am: your tox.ini doesn't look that different to my eyes: http://git.openstack.org/cgit/openstack/watcher/tree/tox.ini#n40 | 21:26 |
jeblair | gtmanfred: http://docs.openstack.org/infra/git-review/usage.html | 21:26 |
Sam-I-Am | anteaya: yeah... this is strange. | 21:26 |
fungi | also git review --help if the manpage is installed in your manpath | 21:26 |
fungi | or git-review --help if it's not | 21:27 |
sdague | just noticed this is +A - https://review.openstack.org/#/c/164927/34 | 21:27 |
sdague | that's going to make me do a happy dance once the whole stack is in | 21:27 |
*** julim has quit IRC | 21:27 | |
gtmanfred | jeblair: cool, that is exactly what I was looking for, but couldn't find it :P | 21:27 |
*** daemontool has quit IRC | 21:27 | |
jeblair | sdague: yeah, i want to vos release a couple of times manually to check out the timing on that, and also probably some manual tests, before landing the next 2 | 21:28 |
*** amitgandhinz has quit IRC | 21:28 | |
Sam-I-Am | anteaya: know any other tox experts? | 21:28 |
sdague | gotcha | 21:28 |
Sam-I-Am | AJaeger left for the night :/ | 21:28 |
anteaya | Sam-I-Am: networking-ovn has its own tox_install.sh script: http://git.openstack.org/cgit/openstack/networking-ovn/tree/tools/tox_install.sh while watcher doesn't | 21:28 |
*** zul has joined #openstack-infra | 21:29 | |
anteaya | Sam-I-Am: mordred and lifeless, but mordred is traveling somewhere (no idea) and lifeless is at lca | 21:29 |
Sam-I-Am | anteaya: hummmm | 21:29 |
dhellmann | anteaya, Sam-I-Am : most projects use a separate configuration file with the namespace options in it, but it should work the same way. I see an error coming from tox_install.sh in that log, and the line "export ZUUL_BRANCH=${ZUUL_BRANCH-$BRANCH}" looks wrong. Probably should be ":-" not just "-". | 21:29 |
Sam-I-Am | dhellmann: oddly no one has touched this file in 2 months | 21:30 |
anteaya | ah thanks dhellmann | 21:30 |
dhellmann | anteaya, Sam-I-Am : although the error is being reported on line 21 | 21:30 |
anteaya | Sam-I-Am: did it work before? | 21:30 |
Sam-I-Am | yeah | 21:30 |
anteaya | a lot of the neutron drivers appear to have their own tox install file: http://codesearch.openstack.org/?q=tox_install.sh&i=nope&files=&repos= | 21:31 |
fungi | Shrews: so the bad ansible inventory cache is http://paste.openstack.org/show/485897 and then the good one it just generated after i moved that out of the way is http://paste.openstack.org/show/485898 | 21:31 |
Sam-I-Am | dhellmann: line 21 is just a cwd | 21:31 |
fungi | Shrews: the differences are pretty striking | 21:31 |
dhellmann | Sam-I-Am : yeah. it's interesting that there's no output from that script. | 21:32 |
fungi | Shrews: i'm leaning toward "rackspace's nova api returned crazy" as an answer, but don't know where else this might go wrong | 21:32 |
*** kzaitsev_mb has joined #openstack-infra | 21:32 | |
*** jsavak has quit IRC | 21:32 | |
Sam-I-Am | dhellmann: odd that locally running tox -egenconfig works, yet running it within devstack appears to break | 21:33 |
Sam-I-Am | dhellmann: even running devstack locally | 21:33 |
*** abitha has joined #openstack-infra | 21:33 | |
fungi | Shrews: and looking at that, i do think it was just rax-dfw after all | 21:33 |
anteaya | Sam-I-Am: and this script uses zuul-cloner which we have discovered only works in the check and gate pipelines currently, though that may be tangential | 21:33 |
anteaya | Sam-I-Am: what pipeline did this run in? | 21:33 |
*** jsavak has joined #openstack-infra | 21:33 | |
fungi | Shrews: i think the others i thought were affected were really just missing from the puppetdb | 21:33 |
Sam-I-Am | anteaya: i dont know | 21:33 |
dhellmann | Sam-I-Am : what version of tox do you have? | 21:33 |
Sam-I-Am | dhellmann: locally 2.3.1 | 21:34 |
*** sabeen1 has quit IRC | 21:34 | |
fungi | Shrews: and the ones in dfw i thought weren't affected were likely reporting from a previous cache version | 21:34 |
*** sabeen1 has joined #openstack-infra | 21:34 | |
clarkb | dhellmann: Sam-I-Am could be missing header packages | 21:35 |
dhellmann | clarkb : the error is "Bad substitution" | 21:35 |
Shrews | fungi: looking | 21:35 |
dhellmann | Sam-I-Am : try changing the shebang line in your script to use bash instead of sh | 21:35 |
Sam-I-Am | dhellmann: last time i saw an obscure error like that, it was a tox version problem | 21:35 |
*** achanda has quit IRC | 21:35 | |
*** ybathia has joined #openstack-infra | 21:36 | |
Sam-I-Am | dhellmann: in the tox_install script? | 21:36 |
fungi | Shrews: my guess is that the dfw endpoint returned empty hostnames for everything when nova list was called | 21:36 |
dhellmann | Sam-I-Am : yes | 21:36 |
dhellmann | Sam-I-Am : also move the set -ex above all of the other statements, so we can see if any of the first few is being run | 21:36 |
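
For reference, the two expansion forms mentioned above differ only in how they treat a variable that is set but empty. The following is a minimal sketch with hypothetical values; it does not reproduce tox_install.sh. Both forms are valid in POSIX sh, so a "Bad substitution" error may instead come from a bash-only expansion being run under a stricter `sh`, which is one reason to try the bash shebang.

```shell
#!/bin/sh
# Hypothetical value for illustration only.
BRANCH=master

# Case 1: ZUUL_BRANCH is unset. Both forms fall back to $BRANCH.
unset ZUUL_BRANCH
unset_dash="${ZUUL_BRANCH-$BRANCH}"
unset_colon="${ZUUL_BRANCH:-$BRANCH}"

# Case 2: ZUUL_BRANCH is set but empty.
# "-" keeps the empty value; ":-" falls back to $BRANCH.
ZUUL_BRANCH=""
empty_dash="${ZUUL_BRANCH-$BRANCH}"
empty_colon="${ZUUL_BRANCH:-$BRANCH}"

echo "unset: '-' gives '$unset_dash', ':-' gives '$unset_colon'"
echo "empty: '-' gives '$empty_dash', ':-' gives '$empty_colon'"
```

Running the script with `sh -x` (or adding `set -ex` at the top, as suggested above) shows each statement as it executes, which helps confirm which line actually fails.
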
Sam-I-Am | dhellmann: done | 21:37 |
Sam-I-Am | dhellmann: submitting a patch now | 21:37 |
Sam-I-Am | unless theres something else to fix | 21:37 |
dhellmann | those should help debug | 21:38 |
*** nmagnezi has quit IRC | 21:38 | |
Shrews | fungi: is it possible there could have been a change in shade versions between runs? | 21:39 |
*** aysyd has quit IRC | 21:39 | |
*** sabeen3 has joined #openstack-infra | 21:39 | |
*** daemontool_ has quit IRC | 21:39 | |
clarkb | sdague https://jenkins03.openstack.org/job/gate-grenade-dsvm-multinode/2250/console passed on internap if you want to approve | 21:40 |
*** daemontool_ has joined #openstack-infra | 21:40 | |
*** weshay_xchat has quit IRC | 21:40 | |
*** zul has quit IRC | 21:40 | |
sdague | clarkb: done | 21:40 |
Shrews | fungi: there is a 'use_hostnames' option that controls that behavior. config changes? | 21:40 |
openstackgerrit | Merged openstack-infra/system-config: Add wheel working directory to wheel slave https://review.openstack.org/275854 | 21:40 |
*** thingee has quit IRC | 21:41 | |
fungi | Shrews: shade update seems unlikely. we're using "shade==1.4.0 # git sha 457ea84" from 4 weeks ago | 21:41 |
Sam-I-Am | dhellmann: patch is gating soon, but not sure a simple change to tox.ini will trigger the dsvm jobs | 21:41 |
Sam-I-Am | dhellmann: so i might need to touch a random file somewhere | 21:41 |
*** sabeen1 has quit IRC | 21:41 | |
dhellmann | Sam-I-Am : oh, I thought you might insert the patch under https://review.openstack.org/#/c/275392/ | 21:42 |
Shrews | fungi: k. config files? | 21:42 |
*** clayton has joined #openstack-infra | 21:43 | |
Sam-I-Am | dhellmann: good point. so... just to make sure i'm thinking straight... just checkout that patch on top of the tox patch? | 21:43 |
Shrews | fungi: notably the clouds.yaml | 21:43 |
*** achanda has joined #openstack-infra | 21:43 | |
fungi | Shrews: only place we seem to set it is here http://git.openstack.org/cgit/openstack-infra/system-config/tree/modules/openstack_project/templates/puppetmaster/ansible-clouds.yaml.erb#n2 (digging in git blame now) | 21:43 |
fungi | Shrews: been there since https://review.openstack.org/264858 merged a month ago | 21:44 |
dhellmann | Sam-I-Am : do "git review -d 275392", then cherry-pick your patch onto the branch created, then rebase -i and swap the order, then git review to resubmit | 21:44 |
*** dkranz has quit IRC | 21:45 | |
dhellmann | Sam-I-Am : of course if that first patch is yours, you probably have a copy locally and don't need to do the git review -d part | 21:46 |
Sam-I-Am | dhellmann: thats what i was wondering, since it is mine | 21:46 |
dhellmann | Sam-I-Am : ah, ok, then just get your tox changes into the same branch and rebase to swap the order | 21:46 |
*** jistr has joined #openstack-infra | 21:46 | |
Shrews | fungi: hrm, dunno then. unless somebody ran inventory manually with a different clouds.yaml and modified the cache accidentally. or rax just went cuckoo for a bit | 21:47 |
*** jistr has quit IRC | 21:47 | |
fungi | Shrews: that last one seems the most likely | 21:47 |
fungi | i'll try to be on the lookout for future occurrences and maybe a pattern will emerge | 21:48 |
*** thingee has joined #openstack-infra | 21:48 | |
Shrews | yeah. i'm not even certain that first scenario is even possible | 21:48 |
*** dkranz has joined #openstack-infra | 21:49 | |
*** FallenPegasus has quit IRC | 21:51 | |
*** FallenPegasus has joined #openstack-infra | 21:52 | |
Sam-I-Am | dhellmann: i'm having trouble with this | 21:53 |
anteaya | Sam-I-Am: what is happening now? | 21:53 |
openstackgerrit | Merged openstack-infra/project-config: Create jobs for a wheel mirror https://review.openstack.org/164927 | 21:53 |
Sam-I-Am | anteaya: git and i dont get along | 21:53 |
Sam-I-Am | especially for things i dont use very often | 21:53 |
anteaya | just now or most days | 21:53 |
dhellmann | Sam-I-Am : submit your two patches and I'll get them into the right order for you | 21:53 |
jeblair | dhellmann, Sam-I-Am: have you seen 'git restack' ? | 21:53 |
dhellmann | that is, submit the new one separately | 21:53 |
*** salv-orlando has joined #openstack-infra | 21:54 | |
Sam-I-Am | dhellmann: the new one is 275939 | 21:54 |
dhellmann | jeblair : no, have a reference? | 21:54 |
jeblair | we don't need to further derail this now, but it may make this sort of situation easier | 21:54 |
fungi | someone mentioned it on a popular mailing list recently | 21:54 |
jeblair | dhellmann: lemme git one | 21:54 |
dhellmann | jeblair : I see what you did there | 21:54 |
*** FallenPegasus has quit IRC | 21:54 | |
fungi | nice typo (or was it?!?) | 21:55 |
jeblair | dhellmann: http://lists.openstack.org/pipermail/openstack-dev/2016-February/085605.html | 21:55 |
*** jsavak has quit IRC | 21:55 | |
jeblair | i said 'large' but it also works with 2 patches. | 21:55 |
dhellmann | Sam-I-Am : resubmitted | 21:55 |
lifeless | Sam-I-Am: whats up ? | 21:56 |
fungi | heck, it even works with one patch, though it's pretty uninteresting | 21:56 |
jeblair | anteaya: you may remember that from december ^; finally released it as a tool | 21:56 |
Sam-I-Am | dhellmann: thanks a bunch | 21:56 |
dhellmann | jeblair : that looks very nice, I'll put that on my list to set up | 21:56 |
*** salv-orl_ has quit IRC | 21:56 | |
Sam-I-Am | lifeless: trying to resolve some funky tox problem in the gate... and only when -egenconfig is run within devstack apparently | 21:56 |
*** dkranz has quit IRC | 21:57 | |
jeblair | dhellmann: thanks; it's one of those "you totally don't need this because it's so simple to just type a bunch of long git commands" things that once you use is just so much nicer :) | 21:57 |
*** dkranz has joined #openstack-infra | 21:58 | |
dhellmann | jeblair : yes, I see myself using this a lot | 21:59 |
anteaya | jeblair: yay | 21:59 |
anteaya | jeblair: thank you | 21:59 |
*** dkranz has quit IRC | 22:00 | |
ikhudoshyn | AJaeger: I've missed your +2 and updated the patch so that ALL devstack jobs ec | 22:00 |
*** dkranz has joined #openstack-infra | 22:00 | |
ikhudoshyn | ...except dedicated one run against keystone api v3 | 22:01 |
ikhudoshyn | AJaeger: could you pls re-approve https://review.openstack.org/#/c/274668/ | 22:01 |
*** dkranz has quit IRC | 22:02 | |
*** rossella_s has quit IRC | 22:02 | |
*** rossella_s has joined #openstack-infra | 22:03 | |
*** dkranz has joined #openstack-infra | 22:04 | |
anteaya | ikhudoshyn: he is off for the night | 22:05 |
*** thorst_ has quit IRC | 22:05 | |
ikhudoshyn | anteaya: thanks | 22:05 |
*** PsionTheory has joined #openstack-infra | 22:05 | |
*** thorst_ has joined #openstack-infra | 22:06 | |
*** sdake_ has quit IRC | 22:06 | |
anteaya | welcome | 22:06 |
ikhudoshyn | could you pls advise me somebody who I could address | 22:06 |
anteaya | start with patience | 22:06 |
anteaya | if someone has the time they will look | 22:06 |
krotscheck_dcm | jeblair: Looks like the job merged before the system-config changes did. | 22:07 |
ikhudoshyn | anteaya: fair enough, thanks | 22:07 |
openstackgerrit | Doug Hellmann proposed openstack-infra/project-config: add a job to send automated announcements of releases https://review.openstack.org/272767 | 22:07 |
anteaya | thank you | 22:07 |
*** thorst__ has joined #openstack-infra | 22:08 | |
dhellmann | fungi, jeblair: rebasing this patch ^^ is starting to turn into a bit of a mess. Can you move it up your review queue for me, please? | 22:08 |
*** trown is now known as trown|outttypeww | 22:08 | |
dhellmann | I had to make a lot of changes across one of the most actively changed files, so I'm hitting lots of merge conflicts. :-/ | 22:08 |
krotscheck_dcm | fun times! | 22:09 |
*** thorst_ has quit IRC | 22:09 | |
openstackgerrit | Richard Jones proposed openstack/requirements: Revert "Exclude xvfbwrapper 0.2.8 from global requirements" https://review.openstack.org/275946 | 22:10 |
*** lane_kong has joined #openstack-infra | 22:11 | |
*** lane_kong has left #openstack-infra | 22:11 | |
*** _amrith_ is now known as amrith | 22:12 | |
*** HeOS_ has joined #openstack-infra | 22:12 | |
*** [HeOS] is now known as HeOS | 22:12 | |
tristanC | interesting, 275736 is in both gate and check pipeline, it's a schrödinger change :) | 22:12 |
*** thorst__ has quit IRC | 22:12 | |
anteaya | don't look in the box | 22:14 |
fungi | tristanC: yeah, the one in check will report into gerrit but will not impact gating | 22:14 |
*** dkranz is now known as dkranz-brb | 22:14 | |
anteaya | still working on getting the better of this cold, calling it a night, thanks everyone, see you tomorrow | 22:15 |
pigmej | tristanC: lol | 22:15 |
jeblair | i just spent 10 minutes trying to figure out why my workstation stopped responding | 22:15 |
*** dkranz-brb is now known as dkranz | 22:15 | |
jeblair | it's because tox doesn't pass through variables anymore | 22:15 |
*** henrynash has quit IRC | 22:16 | |
jeblair | so my instruction to run zuul unit tests on a tmpfs was not honored | 22:16 |
*** gordc has quit IRC | 22:16 | |
openstackgerrit | Ian Wienand proposed openstack-infra/puppet-openstackci: nodepool : add flag to install diskimage-builder from git https://review.openstack.org/275535 | 22:16 |
*** achanda has quit IRC | 22:16 | |
fungi | dhellmann: how was it determined which entries in the projects.yaml should get that job-group? just any using publish-to-pypi or openstack-server-release-jobs in the layout? | 22:16 |
*** krotscheck_dcm has quit IRC | 22:17 | |
dhellmann | fungi : yes | 22:17 |
*** krotscheck1 has joined #openstack-infra | 22:17 | |
fungi | i take it there wasn't another job-group which it could be added to for similar effect? | 22:17 |
*** dims_ has joined #openstack-infra | 22:17 | |
fungi | seems like python-jobs would have probably worked since that's the one which also instantiates the tarball job template | 22:18 |
jeblair | dhellmann: yeah, you're going to hate me for this but i was just coming to the same conclusion as fungi | 22:19 |
fungi | it's a bit of a balancing act between instantiating the template for some projects which won't be running the job, vs adding another job-group which lots of projects will need to remember to add | 22:19 |
*** kzaitsev_mb has quit IRC | 22:19 | |
dhellmann | the announcement doesn't only apply to python things | 22:19 |
dhellmann | xstatic javascript packages don't have python-jobs, right? | 22:19 |
openstackgerrit | Merged openstack-infra/nodepool: Remove the unused 'reset' setting from the doc https://review.openstack.org/232988 | 22:21 |
*** dims has quit IRC | 22:21 | |
jeblair | dhellmann: i think they do (they are python packages) | 22:21 |
fungi | i would be surprised if the projects using publish-to-pypi and openstack-server-release-jobs didn't add python-jobs | 22:21 |
dhellmann | ok, well, my point was that I thought we would want this on additional things | 22:21 |
dhellmann | we can deal with that when it comes up, I'll redo the patch | 22:21 |
fungi | you don't currently, at least, seem to be running it on non-python things | 22:21 |
jeblair | dhellmann: i think if we do, we could add it to the 'javascript-jobs' job group or something | 22:22 |
dhellmann | yeah, I was trying to plan ahead by not baking python assumptions in | 22:22 |
dhellmann | fungi : where do we stand on the releases.o.o work? I gather there were some fires with other things today? | 22:22 |
fungi | mostly concerned people are cargo-culting old project addition changes (which is bad but we don't seem to be able to break them of), and they're going to miss adding this and we're going to somewhat often end up troubleshooting why their announcements aren't happening | 22:22 |
fungi | dhellmann: yes, sorry, i've not really started in on building that worker yet | 22:23 |
dhellmann | fungi : ack, I'll redo the announcement patch | 22:23 |
jeblair | dhellmann: good news! the next revision will be shorter! | 22:24 |
*** kzaitsev_mb has joined #openstack-infra | 22:24 | |
fungi | and no longer prone to merge conflicts! | 22:24 |
openstackgerrit | Colleen Murphy proposed openstack-infra/puppet-infracloud: Expose br_name in controller and compute classes https://review.openstack.org/275951 | 22:25 |
*** dkranz has quit IRC | 22:25 | |
openstackgerrit | Colleen Murphy proposed openstack-infra/system-config: Add Infra Cloud controller node https://review.openstack.org/209698 | 22:26 |
pabelanger | imcsk8: you can use tools/build-image.sh in project-config to build the image | 22:26 |
*** jsavak has joined #openstack-infra | 22:26 | |
openstackgerrit | Merged openstack-infra/project-config: Add centos7 and fedora-23 to bindep-fallback https://review.openstack.org/275376 | 22:27 |
*** dims_ has quit IRC | 22:27 | |
imcsk8 | pabelanger: thanks i was already on it :) | 22:27 |
openstackgerrit | Colleen Murphy proposed openstack-infra/system-config: Add Infra Cloud compute node definition https://review.openstack.org/213980 | 22:27 |
*** gildub has joined #openstack-infra | 22:28 | |
pabelanger | AJaeger: ianw left comments, using templates in grafyaml might be better | 22:29 |
openstackgerrit | Matthew Treinish proposed openstack-infra/project-config: Remove ML reporting for periodic-qa jobs https://review.openstack.org/275954 | 22:30 |
*** tzn has quit IRC | 22:30 | |
ianw | pabelanger: seems like we're about 75% of the way there with the existing log config scripts | 22:30 |
fungi | pabelanger: thanks for doing that centos and fedora addition for the bindep-fallback jobs | 22:30 |
fungi | that'll get us a good picture of how it's faring across additional node types before we start deciding to make at least the ubuntu one run in check/gate | 22:31 |
*** thorst has joined #openstack-infra | 22:31 | |
*** jsavak has quit IRC | 22:31 | |
*** jsavak has joined #openstack-infra | 22:31 | |
pabelanger | fungi: sure, now I need to test them and see if they work | 22:32 |
pabelanger | ianw: indeed | 22:32 |
*** sdake has joined #openstack-infra | 22:32 | |
fungi | pabelanger: well, the ubuntu version didn't when i wrote that job, because someone had added a centos-only package to the list without marking it as only being available there | 22:33 |
fungi | so i wouldn't be surprised to turn up similar discrepancies | 22:33 |
openstackgerrit | Colleen Murphy proposed openstack-infra/puppet-infracloud: Fix neutron ssl relationship https://review.openstack.org/275956 | 22:34 |
pabelanger | fungi: sure, I'll have some spare time tomorrow to dig into it and see the current state | 22:34 |
pabelanger | fungi: then review trusty and move it (non)voting | 22:34 |
fungi | pabelanger: good news is the job makes those easy to spot | 22:34 |
*** EricGonczer_ has quit IRC | 22:35 | |
openstackgerrit | Matthew Treinish proposed openstack-infra/project-config: Add pypi publish job to tempest https://review.openstack.org/275958 | 22:35 |
pabelanger | fungi: Indeed | 22:35 |
*** EricGonczer_ has joined #openstack-infra | 22:35 | |
*** thorst has quit IRC | 22:35 | |
openstackgerrit | Doug Hellmann proposed openstack-infra/project-config: add a job to send automated announcements of releases https://review.openstack.org/272767 | 22:37 |
dhellmann | fungi , jeblair : reworked ^^ | 22:37 |
ianw | clarkb: doh, where did we get to with the nodepool logging output | 22:39 |
jeblair | dhellmann: are we going to release release-tools some day? i'm wondering about changing the zuul-cloner to a pip install at some point... | 22:39 |
clarkb | ianw: haven't had a chance to debug, now that multinode should work again I can dig in | 22:40 |
ianw | clarkb: i'll take a look, i just got distracted yesterday and forgot :( | 22:40 |
mgagne | what was it again the limitations of masterless projects in gerrit? | 22:40 |
clarkb | mgagne: HEAD is not replicated is the first one | 22:41 |
ianw | clarkb: it might be worth a cron job to rm -rf the log dir just before the next build, there's a lot of cruft in there now | 22:41 |
clarkb | so requires manual intervention to update | 22:41 |
ianw | actually, should probably puppet something that clears out files not touched in certain date | 22:41 |
clarkb | ianw: or delete old logs with find | 22:41 |
mgagne | clarkb tried to clone a masterless project imported by jeepyb. it didn't go well | 22:41 |
ianw | clarkb: yeah, ok, leave that with me too :) | 22:41 |
*** thorst has joined #openstack-infra | 22:42 | |
dhellmann | jeblair : maybe, but I have no immediate plans to release them | 22:43 |
fungi | mgagne: yeah, we were supposed to not be approving project imports if they lacked a master branch | 22:43 |
fungi | though i don't think we have a job checking for that specifically | 22:44 |
*** sdake has quit IRC | 22:44 | |
mgagne | fungi you have no authority on my private gerrit install :D | 22:44 |
*** ok_delta has joined #openstack-infra | 22:44 | |
jeblair | fungi is lord of all gerrits | 22:44 |
*** ok___delta has joined #openstack-infra | 22:44 | |
mgagne | dammit | 22:44 |
mgagne | https://gerrit-review.googlesource.com/Documentation/cmd-set-head.html looks to be 2.12 only | 22:44 |
fungi | mgagne: oh! not _our_ gerrit | 22:45 |
mgagne | fungi well, I learn a lot from your (bad) experience guys =) | 22:45 |
jeblair | fungi, dhellmann: that will add 1120 new jobs | 22:45 |
jeblair | we may need to babysit that one | 22:46 |
fungi | yeah, we should probably not approve until we're ready to handhold jjb updates | 22:46 |
dhellmann | jeblair : wow | 22:46 |
fungi | interesting that it's adding more jobs than we have projects | 22:46 |
jeblair | indeed, we have 937 | 22:47 |
jeblair | i'm looking into the discrepancy | 22:48 |
*** krotscheck1 has quit IRC | 22:49 | |
jeblair | the job gear-announce-release shows up in the diff 3 times, as an example | 22:50 |
jeblair | jjb claims to have genarted 420 more jobs | 22:53 |
jeblair | generated | 22:53 |
jeblair | sheesh | 22:53 |
openstackgerrit | Daniel Wallace proposed openstack-infra/shade: granting and revoking privs to users and groups https://review.openstack.org/268404 | 22:53 |
clarkb | mgagne: my suggestion would be to have a master branch that isn't used | 22:54 |
clarkb | its cheap with git and makes tooling happy | 22:55 |
mgagne | we try to avoid so people don't think they can use it | 22:55 |
mgagne | it's not about cheap =) | 22:55 |
openstackgerrit | Daniel Wallace proposed openstack-infra/shade: granting and revoking privs to users and groups https://review.openstack.org/268404 | 22:55 |
*** jsavak has quit IRC | 22:55 | |
crinkle | jeblair: yolanda turns out the reason why the puppet in my dev env "worked" was that it was using production credentials and I'd forgotten to set /etc/hosts to point to localhost >.< | 22:56 |
mgagne | "git symbolic-ref HEAD refs/heads/foobar" works just fine. just have to make sure it's updated to our replication | 22:56 |
crinkle | jeblair: yolanda and the difference between the cloud in uswest and the cloud on my laptop appears to be kilo versus liberty | 22:56 |
*** annegentle has quit IRC | 22:56 | |
jeblair | fungi: ok i don't know what about diff is making it show some files 3 times. but i need to pop this off my stack. i assume 420 is the correct number. think that needs babysitting? | 22:57 |
fungi | mgagne: alternative is a master branch with only a readme saying "no development happens on this branch" | 22:57 |
fungi | jeblair: that seems marginal. i'm happy to do it and keep an eye on it | 22:58 |
fungi | just to be on the safe side | 22:58 |
*** sabeen3 has quit IRC | 22:58 | |
fungi | feel free to approve and i'll babysit | 22:58 |
openstackgerrit | Merged openstack-infra/system-config: Changed ownership of wheel.keytab to jenkins https://review.openstack.org/275895 | 22:58 |
jeblair | fungi: done | 22:58 |
openstackgerrit | Ian Wienand proposed openstack-infra/puppet-nodepool: Add periodic cleanup of log files https://review.openstack.org/275968 | 22:59 |
fungi | thanks. i'll get screen sessions going with updates once it merges | 22:59 |
*** burgerk has quit IRC | 22:59 | |
*** jpr has quit IRC | 22:59 | |
*** ok_delta has quit IRC | 22:59 | |
*** ok___delta has quit IRC | 22:59 | |
*** ccrouch has joined #openstack-infra | 23:00 | |
dhellmann | fungi , jeblair: thanks! | 23:00 |
*** piet has quit IRC | 23:01 | |
openstackgerrit | Daniel Wallace proposed openstack-infra/shade: granting and revoking privs to users and groups https://review.openstack.org/268404 | 23:02 |
*** regXboi has quit IRC | 23:03 | |
openstackgerrit | Ian Wienand proposed openstack-infra/puppet-openstackci: nodepool: Enable periodic cleanup of dib logs https://review.openstack.org/275970 | 23:04 |
ccrouch | lifeless: you mind elaborating on this a little bit: https://github.com/openstack/requirements#tox | 23:06 |
ccrouch | is that work underway? how can people help out? | 23:06 |
*** erlon has quit IRC | 23:06 | |
*** thorst has quit IRC | 23:07 | |
*** thorst has joined #openstack-infra | 23:07 | |
*** xyang1 has quit IRC | 23:08 | |
fungi | ccrouch: there are some jobs using it already. neutron is using it for most tox-based jobs that run in check/gate/experimental pipelines | 23:09 |
*** achanda has joined #openstack-infra | 23:09 | |
*** eharney has quit IRC | 23:09 | |
*** Sukhdev has quit IRC | 23:10 | |
*** dingyichen has joined #openstack-infra | 23:11 | |
*** thorst has quit IRC | 23:12 | |
*** achanda has quit IRC | 23:12 | |
openstackgerrit | James E. Blair proposed openstack-infra/zuul: Pass ZUUL_TEST_ROOT through tox https://review.openstack.org/275973 | 23:12 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul: Add job mutex support https://review.openstack.org/275974 | 23:12 |
openstackgerrit | James E. Blair proposed openstack-infra/project-config: Mutex the trusty wheel mirror/release jobs https://review.openstack.org/275975 | 23:12 |
jeblair | okay, i broke my zuulv2 fast for this ^. i don't think it's critical, but it does make the afs stuff a little more atomic and predictable. | 23:13 |
*** EricGonczer_ has quit IRC | 23:14 | |
pabelanger | jeblair: interesting | 23:18 |
*** Sukhdev has joined #openstack-infra | 23:18 | |
ccrouch | ooh fungi thanks for that | 23:18 |
ccrouch | so you're not aware of any big blockers? | 23:19 |
openstackgerrit | James E. Blair proposed openstack-infra/zuul: Pass ZUUL_TEST_ROOT through tox https://review.openstack.org/275976 | 23:20 |
fungi | ccrouch: the implementation depends on cloning an additional repository for all jobs so that the correct (current, proposed, et cetera) version of the constraints file can be available for the corresponding branch, and we have one good tool for this right now (zuul-cloner) but it doesn't yet support all the pipelines in which we use tox-based jobs | 23:22 |
*** rfolco has quit IRC | 23:23 | |
*** e0ne has joined #openstack-infra | 23:24 | |
fungi | ccrouch: oh, also devstack is currently using constraints for everything it pip installs for devstack-based jobs | 23:26 |
*** sabeen has joined #openstack-infra | 23:27 | |
clarkb | except for setuptools and pip :) | 23:27 |
fungi | but that was somewhat easier since it was already checking out the appropriate state of the corresponding repository containing the constraints file | 23:27 |
*** sridhar_ram has joined #openstack-infra | 23:30 | |
*** sigmavirus24 is now known as sigmavirus24_awa | 23:30 | |
openstackgerrit | James E. Blair proposed openstack-infra/zuul: Add job mutex support https://review.openstack.org/275978 | 23:32 |
jeblair | there's the zuulv3 version :) | 23:32 |
*** baoli has quit IRC | 23:33 | |
*** e0ne has quit IRC | 23:33 | |
openstackgerrit | Tony Breeds proposed openstack-infra/yaml2ical: Clarify KeyError() in Schedule.__init__() https://review.openstack.org/232321 | 23:34 |
openstackgerrit | Tony Breeds proposed openstack-infra/yaml2ical: Clarify ValueError() in Schedule.__init__() https://review.openstack.org/232322 | 23:34 |
jeblair | tonyb: are you familiar with voluptuous? | 23:35 |
tonyb | jeblair: I am not | 23:36 |
*** baoli has joined #openstack-infra | 23:36 | |
tonyb | jeblair: googling | 23:36 |
fungi | we use it for validating zuul's yaml configuration, for example | 23:36 |
jeblair | tonyb: it's a framework for data validation we use in many of our projects to validate yaml | 23:36 |
jeblair | tonyb: it also supports transformations (grafyaml uses it as well for this purpose heavily) | 23:37 |
*** hashar has quit IRC | 23:38 | |
jeblair | tonyb: so not only can you use it to validate frequency names, or dates, you can also use it to normalize capitalization, etc. | 23:38 |
*** angdraug has joined #openstack-infra | 23:40 | |
tonyb | jeblair: Interesting. | 23:40 |
tonyb | jeblair, fungi: I'll have a play and see if it is a win. | 23:40 |
*** baoli has quit IRC | 23:41 | |
*** dimtruck is now known as zz_dimtruck | 23:42 | |
clarkb | jeblair: greghaynes fwiw the gearman client disconnects seem consistent | 23:44 |
clarkb | we may also want to look into why that is happening beyond handling it in the client | 23:44 |
jlvillal | sdague: I almost think you are having too much fun with PS4 :) | 23:44 |
jeblair | clarkb: absolutely -- this is not going to work well if it disconnects every day. we'll never get images built. | 23:45 |
jeblair | clarkb: what's consistent about them? | 23:45 |
clarkb | jeblair: it happens every day | 23:45 |
greghaynes | clarkb: I suspect it happens when we spam the upload and build jobs | 23:45 |
*** FallenPegasus has joined #openstack-infra | 23:45 | |
clarkb | greghaynes: it seems to happen later | 23:45 |
jeblair | greghaynes: why do you suspect it happens then? | 23:45 |
greghaynes | clarkb: which is when we would fill TCP buffers | 23:45 |
clarkb | 2016-02-03 15:22:29,511 ERROR gear.Server: Exception in poll loop: is about an hour after the spamming | 23:46 |
greghaynes | jeblair: the time seemed to line up | 23:46 |
clarkb | greghaynes: spamming happens at 1414 | 23:46 |
greghaynes | Hrm | 23:46 |
clarkb | greghaynes: it actually seems to happen once we start uploading an image | 23:46 |
greghaynes | The time is consistentish, right? | 23:46 |
clarkb | no yesterday it happened several times through the day | 23:47 |
greghaynes | :( | 23:47 |
*** jamielennox|away is now known as jamielennox | 23:47 | |
clarkb | around 1056-1058 and 2327 | 23:47 |
greghaynes | So, maybe we should tcpdump to a pcap file and filter on gearman port | 23:47 |
greghaynes | The post mortem look at the pcap | 23:47 |
greghaynes | Then | 23:48 |
fungi | if it's unpredictable, i have a feeling that would be a huge capture | 23:48 |
clarkb | we can try that, it will be a fairly large capture with all the job listings | 23:48 |
clarkb | fungi: yes | 23:48 |
fungi | as in large enough you might not have available space to store it (no idea really) | 23:49 |
jeblair | clarkb: what's the log signature? | 23:49 |
fungi | oh, though we can rotate pcap files | 23:49 |
*** edmondsw has quit IRC | 23:49 | |
greghaynes | Yes, it supports rotating | 23:49 |
fungi | tcpdump supports truncating and moving captures | 23:49 |
fungi | right that | 23:49 |
jeblair | the function listing is about 1M | 23:49 |
fungi | and has a retention setting i believe | 23:49 |
clarkb | http://paste.openstack.org/show/485903/ is what gearman server says | 23:49 |
greghaynes | We can also set snaplen to something small | 23:49 |
fungi | so as long as we catch the error before we rotate out the capture which covered it, we might have something | 23:50 |
greghaynes | Since really we only care about headers for the most part | 23:50 |
jeblair | clarkb: that's not in the time frame you mentioned | 23:50 |
fungi | greghaynes: sure, though default snaplen is already pretty small | 23:50 |
greghaynes | Ah, right | 23:50 |
openstackgerrit | Ian Wienand proposed openstack-infra/system-config: Cleanup of nodepool builder logging https://review.openstack.org/275982 | 23:50 |
clarkb | jeblair: yes it is | 23:50 |
fungi | so maybe i'm only thinking of if we set a large -s | 23:50 |
clarkb | 1522 | 23:50 |
jeblair | 23:47 < clarkb> around 1056-1058 and 2327 | 23:51 |
clarkb | jeblair: yesterday | 23:51 |
*** pradk has quit IRC | 23:51 | |
clarkb | today is what I pasted (and had earlier pasted a timestamp for) | 23:51 |
jeblair | clarkb: okay, i don't understand what you're trying to say about time, but i'll move on. how do you know that's nodepool? | 23:51 |
clarkb | jeblair: I don't yet, but when I looked at this a week and a half ago I was able to track that error to corresponding nodepool logs complaining about being disconnected from gearman | 23:52 |
clarkb | I am still working back on the nodepool side | 23:53 |
jeblair | (i mean, it probably is, because it's doing a status request, and if it were zuul, we would have noticed) | 23:53 |
jeblair | (though it could be a human) | 23:53 |
fungi | i think the gist of the time point was that he's seen it happening at various times with no particular pattern, and so "consistently" actually meant "frequently, and at least daily" | 23:54 |
*** bmjen has quit IRC | 23:54 | |
*** yuanying has quit IRC | 23:54 | |
clarkb | http://paste.openstack.org/show/485904/ is what the nodepool side says which may have silently reconnected without us noticing there? | 23:54 |
clarkb | we definitely only noticed it the last time in nodepool because submitting a job and trying to get the listings of queues were failing due to being disconnected | 23:55 |
greghaynes | Or maybe the builder doesn't disconnect at all? | 23:55 |
*** Hunner has quit IRC | 23:55 | |
*** crinkle has quit IRC | 23:55 | |
*** bmjen has joined #openstack-infra | 23:56 | |
jeblair | clarkb: the status request will come from nodepoold | 23:56 |
*** yuanying has joined #openstack-infra | 23:56 | |
jeblair | so maybe check the other log file? | 23:56 |
jeblair | greghaynes: ^ right | 23:56 |
clarkb | oh right | 23:56 |
* clarkb checks other log | 23:56 | |
*** Hunner has joined #openstack-infra | 23:56 | |
*** Hunner has quit IRC | 23:56 | |
*** Hunner has joined #openstack-infra | 23:56 | |
greghaynes | Yep | 23:56 |
*** crinkle has joined #openstack-infra | 23:56 | |
* greghaynes is in Dr office so will randomly afk | 23:57 | |
prometheanfire | greghaynes: mind +w this to get it to gate again? https://review.openstack.org/#/c/273790/ | 23:57 |
greghaynes | prometheanfire: heh, can't right now | 23:57 |
prometheanfire | gate broken? | 23:57 |
Sam-I-Am | greghaynes: go doctor harder :) | 23:58 |
prometheanfire | Sam-I-Am: hi | 23:58 |
fungi | prometheanfire: see just before your question where he says he's irc'ing from a doctor's office | 23:58 |
Sam-I-Am | prometheanfire: moo. | 23:58 |
Sam-I-Am | fungi: we all need to learn how to disconnect more :/ | 23:58 |
openstackgerrit | John L. Villalovos proposed openstack-infra/glean: Stop confusing vim's tiny brain https://review.openstack.org/275984 | 23:58 |
Sam-I-Am | maybe he's seeking treatment for irc addiction | 23:59 |
greghaynes | Sam-I-Am: haha trying to Dr as hard as I can | 23:59 |
fungi | bwahahaha | 23:59 |
openstackgerrit | Merged openstack-infra/openstackid: Fix on username generation https://review.openstack.org/275924 | 23:59 |
prometheanfire | fungi: was away for a while | 23:59 |
fungi | prometheanfire: it was literally the line before your question ;) | 23:59 |
prometheanfire | heh | 23:59 |
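
A note on the log cleanup that ianw and clarkb discuss around 22:41 (clearing out files not touched within a certain period, or deleting old logs with `find`): the idea can be sketched in a few lines of Python. This is only an illustration and not the actual puppet-nodepool change. The function name `remove_stale_logs`, its arguments, and the 7-day default are all hypothetical.

```python
import os
import time


def remove_stale_logs(log_dir, max_age_days=7, now=None):
    """Delete regular files under log_dir not modified in max_age_days.

    Returns the sorted list of removed paths. The name, arguments, and
    default age are illustrative assumptions, not nodepool settings.
    """
    now = time.time() if now is None else now
    cutoff = now - max_age_days * 86400  # oldest mtime we keep
    removed = []
    for root, _dirs, files in os.walk(log_dir):
        for name in files:
            path = os.path.join(root, name)
            # Leave symlinks and anything that is not a regular file alone.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            if os.path.getmtime(path) < cutoff:
                os.remove(path)
                removed.append(path)
    return sorted(removed)
```

Run periodically (for example from cron), this does roughly what a `find <dir> -type f -mtime +7 -delete` invocation would do, while leaving recent build logs in place for debugging.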