clarkb | \o/ and selecting different patchsets to diff against | 00:00 |
---|---|---|
morganfainberg | which reminds me... i gotta track down some of the gertty issues w/ OS X cause i want this hotness now. | 00:00 |
clarkb | jeblair: this is great | 00:00 |
mordred | morganfainberg: markmcclain was running it happily on OSX earlier | 00:00 |
morganfainberg | mordred, ooh, maybe fixes are in then! | 00:00 |
mordred | yah. it seemed to work | 00:01 |
morganfainberg | mordred, i will try it tomorrow! | 00:01 |
mordred | there are times when it's doing $something when it $freezes-ish | 00:01 |
* mordred waves hands | 00:01 | |
*** wenlock has joined #openstack-infra | 00:01 | |
mordred | but it's good enough that markmcclain is a convert | 00:01 |
clarkb | ya it has rough edges especially when you are brave like me and test jeblair's patches :) | 00:01 |
*** olaph has quit IRC | 00:02 | |
clarkb | but I think it has been useful for me to do that. I have found a few issues | 00:02 |
clarkb | and fixed a few too | 00:02 |
*** msabramo has quit IRC | 00:02 | |
* morganfainberg is feeling productive today | 00:02 | |
*** yjiang5 is now known as yjiang5_away | 00:03 | |
clarkb | morganfainberg: I want to feel that way too | 00:03 |
clarkb | I suppose I did help kick nodepool into shape | 00:03 |
clarkb | mordred I don't want to jump the gun and say 1.1 is much happier now but it looks much happier now | 00:04 |
morganfainberg | clarkb, that's always good! a happy/healthy nodepool is good | 00:04 |
clarkb | we are throwing a ton of load at it best I can tell | 00:04 |
mordred | clarkb: yeah. I concur | 00:05 |
morganfainberg | clarkb, don't jinx it! quick do something superstitious like with salt...or wood...or something | 00:05 |
clarkb | morganfainberg: I knocked on wood happy? | 00:05 |
clarkb | :P | 00:05 |
morganfainberg | clarkb, that works! | 00:05 |
openstackgerrit | Ramy Asselin proposed a change to openstack-infra/config: Allow choice of GIT protocol used. https://review.openstack.org/96539 | 00:06 |
clarkb | mordred: any chance I can get you to update nodepool numbers? | 00:06 |
clarkb | mordred: remove 1.0 and add the hpcloud-b* globbing | 00:06 |
clarkb | also the hp times graph seems off | 00:07 |
mordred | clarkb: I just emailed a new copy of the larger report | 00:07 |
mordred | clarkb: check your hp email - I sent it to the group tracking this there | 00:07 |
clarkb | mordred: the one with ready, building, delete, used, success error ? | 00:07 |
clarkb | mordred: I like the graphs that I just hit refresh on a webpage for | 00:07 |
mordred | clarkb: oh - that one - I should really add that one to the single-page | 00:07 |
clarkb | ++ and remove old graphs and fix the ones we need going forward | 00:08 |
mordred | clarkb: (that's what I do with that html page - I just have it in a file:/// url :) ) | 00:08 |
jogo | sdague: thoughts on removing check-grenade-dsvm-neutron from most places? | 00:08 |
clarkb | mordred: ah | 00:08 |
jogo | same for neutron-heat-slow ? | 00:08 |
jogo | both are non-voting | 00:08 |
clarkb | mordred: if I go about asking for quota bumps to prep for region-a would you be on board with that or is it still too early? | 00:08 |
clarkb | mordred: mostly thinking it would be good to have quota's set and routers + networks | 00:08 |
clarkb | then when they are ready for us we will be ready too | 00:09 |
jogo | mordred clarkb ^ | 00:09 |
clarkb | jogo: why are we wanting to remove them? | 00:09 |
*** wenlock has quit IRC | 00:09 | |
jogo | clarkb: because grenade-dsvm-neutron never worked | 00:09 |
jogo | and we are hitting quota limits | 00:10 |
jogo | its part of the integrated gate, but non-voting | 00:10 |
morganfainberg | clarkb, i'm going to move the apache-services gate over to experimental tomorrow if i don't have a commit from some people on when they're solving the issues. | 00:10 |
mordred | clarkb: go ahead and ask for it - I agree, but I think we should not turn it on yet | 00:10 |
jogo | which seems like an odd set up | 00:10 |
jogo | clarkb: ^ | 00:10 |
morganfainberg | clarkb, it's taken too long and it's needlessly consuming resources. | 00:10 |
clarkb | morganfainberg: ok | 00:10 |
clarkb | mordred: agreed, mostly wanting to be ready when they are and reduce the lag time | 00:11 |
*** masayukig has quit IRC | 00:11 | |
clarkb | but will still wait on their go ahead before spinning up nodes there | 00:11 |
mordred | clarkb: ++ | 00:11 |
clarkb | mordred: https://review.openstack.org/#/c/90234/ for unbound | 00:12 |
openstackgerrit | Ramy Asselin proposed a change to openstack-infra/config: Make use of unbound optional https://review.openstack.org/96551 | 00:13 |
mordred | clarkb: ok. I just sent out an updated html page with the other graph on it too | 00:13 |
*** matsuhashi has quit IRC | 00:13 | |
clarkb | mordred: asselin_: do those changes conflict with each other? | 00:13 |
clarkb | trying to grok if maybe there should be one change that just enables or disables it? | 00:13 |
clarkb | mordred: in your case we want unbound to be installed just not resolving? | 00:14 |
mordred | clarkb: I think I just wanted the ability to disable - so asselin_'s change may be the way to go | 00:14 |
clarkb | ok | 00:14 |
clarkb | mordred: if you disable in the imagebuild that will disable it in the resulting image though | 00:15 |
clarkb | mordred: or will dib set it up independent of puppet? | 00:15 |
mordred | clarkb: oh - thanks. yes. sorry. braindead | 00:15 |
mordred | clarkb: dib will do the resolv.conf | 00:15 |
mordred | the thing is, dib doesn't start services when it installs them | 00:16 |
clarkb | right so we can't resolv against unbound there | 00:16 |
mordred | right | 00:16 |
mordred | so we want it installed, we just don't want puppet to put the file in | 00:16 |
clarkb | ok | 00:16 |
*** nati_uen_ has quit IRC | 00:16 | |
*** zhiyan_ is now known as zhiyan | 00:16 | |
*** bhuvan has quit IRC | 00:17 | |
*** nati_ueno has joined #openstack-infra | 00:17 | |
clarkb | mordred: the unbound manifest ensures the unbound service is running though | 00:17 |
jeblair | clarkb: yeah the new 'check for missing commits' task takes 7 seconds for me, but it's a one time hit on startup; we can try to optimize that though | 00:17 |
clarkb | does dib do something special to make that not happen or should your change wrap the service too? | 00:17 |
jeblair | clarkb: (it's just atm, missing commits make gertty useless so it seemed like it was worth the hit) | 00:17 |
clarkb | jeblair: yup fine by me | 00:17 |
*** masayukig has joined #openstack-infra | 00:17 | |
clarkb | jeblair: its doing the initial syncage on my desktop now | 00:18 |
clarkb | which is always slow because ETOOMANYCHANGES | 00:18 |
clarkb | but the links and diff stuff look good | 00:18 |
*** SumitNaiksatam has quit IRC | 00:18 | |
jeblair | clarkb: the updated-time change should make it very low cost to just leave running all the time | 00:18 |
clarkb | jeblair: ok | 00:18 |
mordred | clarkb: yes. it does | 00:19 |
sdague | jogo: they should both be close to coming online | 00:19 |
clarkb | mordred: so its fine to have that then? +2 from me | 00:19 |
mordred | yah. | 00:19 |
mordred | clarkb: dib intercepts the debian start-service machinery and prevents it from running | 00:19 |
mordred | that way you get the service set to run on boot | 00:19 |
mordred | but not run at install time | 00:19 |
clarkb | gotcha | 00:19 |
asselin_ | clarkb, mordred sorry not following completely...but if it's too 'risky' I can just copy the script into a new one with what's needed. I'm starting to question the value of reusing in this case.... | 00:20 |
jogo | sdague: hasn't that been the case for a few weeks now? | 00:20 |
*** marcoemorais has quit IRC | 00:21 | |
sdague | jogo: given that everyone is still recovering from summit, I think giving people some breathing room there is fine | 00:21 |
clarkb | asselin_: see https://review.openstack.org/#/c/90234/ basically mordred has a change to install unbound but not install a resolv.conf for it | 00:21 |
*** nati_ueno has quit IRC | 00:21 | |
clarkb | asselin_: I don't think this is sufficient for your use case. his use case is using disk image builder to build nodepool images | 00:21 |
openstackgerrit | Joe Gordon proposed a change to openstack-infra/config: Remove grenade-dsvm-neutron from integrated-gate https://review.openstack.org/97377 | 00:22 |
jogo | sdague: ^ is what I was thinking about -- I don't think that means we are not giving folks a chance | 00:22 |
jogo | just using less resources | 00:22 |
sdague | jogo: well deleting your 2 nova silent jobs would be good to do before that :) | 00:23 |
jogo | sdague: even better we can just move that job to experimental | 00:23 |
jogo | sdague: we are collecting good data on those jobs | 00:23 |
jogo | and they are working too http://jogo.github.io/gate/large-ops-testing.html | 00:23 |
jogo | what data do we get from grenade-dsvm-neutron running? | 00:24 |
sdague | I'm so not getting in a fight over job definitions tonight :P | 00:24 |
jogo | sdague: do you have a doc you can just point me to? | 00:25 |
*** _nadya_ has quit IRC | 00:25 | |
*** ArxCruz has quit IRC | 00:25 | |
*** lcostantino has joined #openstack-infra | 00:26 | |
openstackgerrit | Joe Gordon proposed a change to openstack-infra/config: move grenade-dsvm-neutron in integrated-gate to experimental https://review.openstack.org/97377 | 00:27 |
jogo | sdague: anyway just -2 my patch with the reason why when you have a chance ;) | 00:27 |
clarkb | jogo: if they are working should we delete those jobs and update the existing jobs? | 00:28 |
jogo | clarkb: I assume you are talking about large-ops-testing. and yes, but I was thinking wait a few more days | 00:28 |
*** arnaud has joined #openstack-infra | 00:29 | |
*** gokrokve has quit IRC | 00:30 | |
morganfainberg | woo gertty "just worked"™* | 00:30 |
morganfainberg | * = as long as i cherry-picked a couple patches in (running master) | 00:30 |
mordred | sdague: 96723 - I'm confused - should that patch be dead now? | 00:30 |
mordred | morganfainberg: ++ | 00:30 |
sdague | mordred: let me look quick | 00:31 |
clarkb | morganfainberg: ya so I think a few of us run tip of jeblair's proposed changes :) | 00:31 |
sdague | mordred: yeh, it should be | 00:31 |
clarkb | morganfainberg: sometimes that means you miss out on a db migration and it doesnt work anymore | 00:31 |
morganfainberg | clarkb, hehe | 00:31 |
jeblair | it's how i encourage reviewers! :) | 00:31 |
clarkb | jeblair: I hope my reviews and alpha testing have been helpful :) | 00:32 |
jeblair | clarkb: extremely | 00:32 |
*** nati_ueno has joined #openstack-infra | 00:32 | |
*** chuckC has quit IRC | 00:34 | |
*** mburned is now known as mburned_out | 00:43 | |
openstackgerrit | A change was merged to openstack-infra/config: Fix empty directory removal for log expiration https://review.openstack.org/97307 | 00:44 |
*** yaguang has joined #openstack-infra | 00:44 | |
clarkb | jeblair: it would actually be interesting to get people at google to use gertty | 00:45 |
clarkb | jeblair: for the lols | 00:45 |
*** otherwiseguy has quit IRC | 00:45 | |
asselin_ | clarkb, mordred approximately when would the nodepool dib be expected to merge? | 00:46 |
*** nati_ueno has quit IRC | 00:46 | |
clarkb | asselin_: I can't commit to anything because mordred is the expert here but if it didn't start happening sometime next week I will have a sad | 00:47 |
clarkb | asselin_: I really want this to happen asap | 00:47 |
*** nati_ueno has joined #openstack-infra | 00:47 | |
mordred | yeah. we all do | 00:47 |
mordred | I think the dib unbound patch can land now though | 00:47 |
clarkb | mordred: ++ | 00:47 |
clarkb | fungi: jeblair ^ can you guys maybe review that change and give it an approval if it looks good? | 00:47 |
mordred | asselin_: so if you rebase yours on top of it, then we can get those in | 00:47 |
asselin_ | mordred, ok | 00:48 |
clarkb | mordred: btw with zuul swift needing less babysitting I think nodepool dib is my next thing | 00:48 |
clarkb | jhesketh: speaking of where are we on that? | 00:48 |
*** otter768 has joined #openstack-infra | 00:48 | |
mordred | clarkb: awesome. I recommend spending a little time running the dib stuff locally and mounting the images and making sure you understand the tool workflow as well as the content | 00:49 |
mordred | I mean, checking the content is fine - but it's a new tool, so grokking that is important | 00:49 |
clarkb | mordred: ok, I should do that early tomorrow | 00:49 |
clarkb | I mean I grok how dib works | 00:49 |
*** chuckC has joined #openstack-infra | 00:50 | |
jhesketh | clarkb: with zuul swift? | 00:50 |
clarkb | mount fs in chroot. do nasty thing to it with run parts then snapshot | 00:50 |
clarkb | jhesketh: yeah the zuul swift build logs stuff | 00:50 |
jhesketh | I've got 93727 and 93728 which should help with the baby sitting required | 00:50 |
clarkb | jhesketh: I think we need to add it to jobs now and you have changes for that? I will review | 00:51 |
jhesketh | clarkb: it's been on the back-burner, but I believe that's the next step | 00:51 |
clarkb | jhesketh: oh zuul has the stuff in now | 00:51 |
*** nati_ueno has quit IRC | 00:51 | |
clarkb | jhesketh: so we can go ahead with jobs | 00:51 |
clarkb | I will review the zuul changes though | 00:51 |
jhesketh | we need this to land https://review.openstack.org/#/c/76796/ to grab the results | 00:51 |
clarkb | oh right that | 00:52 |
clarkb | jhesketh: I will approve now | 00:52 |
jhesketh | cool | 00:52 |
*** _nadya_ has joined #openstack-infra | 00:52 | |
clarkb | dinner should happen soonish for me but I will keep an eye on it | 00:53 |
clarkb | ping me if you notice oddities | 00:53 |
*** SumitNaiksatam has joined #openstack-infra | 00:54 | |
jhesketh | clarkb: thanks | 00:54 |
mordred | clarkb: yah. grokking the various run parts stages though is important - especially since those changes generate files that drive other dib stages | 00:57 |
mordred | in any case, let me know if I can be helpful as you look at it | 00:57 |
clarkb | jhesketh: 93728 should be backward compat right? | 00:59 |
clarkb | jhesketh: it will do the necessary swift steps if not specified in the config | 00:59 |
clarkb | making sure I read this properly | 00:59 |
*** lcostantino has quit IRC | 00:59 | |
jhesketh | clarkb: yep, should be | 01:00 |
clarkb | lgtm | 01:00 |
*** jaypipes has joined #openstack-infra | 01:01 | |
clarkb | gah os-loganalyze change went to bottom of check queue | 01:01 |
clarkb | oh well | 01:01 |
asselin_ | clarkb, mordred rebasing is getting complicated due to the double-dependency. Let's just wait until it merges and I'll rebase off of master... | 01:02 |
clarkb | asselin_: wfm | 01:03 |
openstackgerrit | Joshua Hesketh proposed a change to openstack-infra/config: Add in experimental job to test zuul swift pushing https://review.openstack.org/97379 | 01:03 |
asselin_ | clarkb, ? wait for master? | 01:03 |
*** oomichi has quit IRC | 01:04 | |
clarkb | jhesketh: you have +2 powers now :) can I get you to review 90234? | 01:04 |
clarkb | asselin_: works for me | 01:04 |
jhesketh | clarkb: ^ we can start by testing pep8 in experimental pushing to swift | 01:04 |
asselin_ | clarkb, lol ok :) | 01:04 |
jhesketh | clarkb: gladly :-) | 01:04 |
*** gokrokve has joined #openstack-infra | 01:04 | |
clarkb | jhesketh: what do you think about continuing to do normal console logs so that we can debug if things go funny? | 01:05 |
clarkb | jhesketh: maybe we cross that bridge when we reach it | 01:05 |
*** starmer has joined #openstack-infra | 01:05 | |
mordred | asselin_: kk | 01:05 |
harlowja | how's the whole gate situation going? :-/ | 01:05 |
clarkb | harlowja: its okish | 01:06 |
clarkb | harlowja: I didn't get to taskflow release fixing today | 01:06 |
clarkb | harlowja: but tomorrow is looking up | 01:06 |
harlowja | clarkb np, i know u guys are busy | 01:06 |
jhesketh | clarkb: os-loganalyze serves up one or the other so testing that component would become more difficult | 01:06 |
harlowja | just by looking at how many zuul things are being processed i can sorta tell :-P | 01:06 |
clarkb | jhesketh: ah right | 01:06 |
harlowja | when its > 300 thats usually not good, lol | 01:06 |
jhesketh | clarkb: but I agree, we should | 01:06 |
clarkb | jhesketh: +2 for the current change we can deal with it when we get there I think | 01:06 |
jhesketh | but this is just a first test job for now | 01:06 |
clarkb | jhesketh: if we are lucky it will just work and not be a problem :) | 01:06 |
jhesketh | clarkb: I think the returned log URLs are going to be wrong | 01:06 |
clarkb | harlowja: if you look at the zuul job queue graph you will see the slope is negative now | 01:07 |
jhesketh | but we'll find out how deterministic things are soon | 01:07 |
clarkb | jhesketh: oh right, meh learning experience :) | 01:07 |
*** nati_ueno has joined #openstack-infra | 01:07 | |
harlowja | negative slope, woot | 01:08 |
clarkb | harlowja: as long as that trend continues we should be ok | 01:08 |
harlowja | :) | 01:08 |
fungi | clarkb: harlowja: i did at least get the patches which address the tarball build failure reviewed and merged before the daily image updates this morning, but many of the image builds failed because of issues in hpcloud 1.1, and then much of the rest of the day was spent on that new tangent | 01:09 |
clarkb | fungi: oh right we need image rebuilds too | 01:09 |
clarkb | I had forgotten about that | 01:09 |
*** jaypipes has quit IRC | 01:09 | |
*** nosnos has joined #openstack-infra | 01:09 | |
*** starmer has quit IRC | 01:09 | |
fungi | yeah, which is part of why i had hurried to approve them after seeing the scrollback and bug report | 01:10 |
clarkb | in any case I expect to be in a much better position tomorrow to tackle not nodepool | 01:10 |
fungi | i would have retriggered said jobs myself, circumstances permitting... but they were not | 01:10 |
clarkb | and can devote some time to the wheels + tarballs and hopefully swift logs | 01:10 |
jhesketh | clarkb: 90234 LGTM, but I didn't +A it if you want to merge it | 01:10 |
clarkb | jhesketh: ok I will probably wait until tomorrow considering the backlog | 01:11 |
harlowja | does whoever gets review 100k get a prize? | 01:11 |
clarkb | jhesketh: even the os-loganalyze change has me a little worried | 01:11 |
fungi | harlowja: a reservation in a padded room | 01:11 |
clarkb | harlowja: so I have to say I have been tempted to go claim it now so that we don't have a bunch of people fighting for it :) | 01:11 |
harlowja | haha | 01:11 |
openstackgerrit | Joshua Hesketh proposed a change to openstack-infra/config: Add in experimental job to test zuul swift pushing https://review.openstack.org/97379 | 01:11 |
clarkb | harlowja: but I enjoy watching people get excited about such things so won't | 01:11 |
harlowja | 100k$?? | 01:12 |
harlowja | that'd be sweet | 01:12 |
mordred | clarkb: we could write a quick zuul thing to trigger a job to submit a patch as soon as 99999 triggers | 01:13 |
clarkb | fungi: we don't have py3k nodes | 01:13 |
mordred | although that'll also be a cool patch to own | 01:13 |
clarkb | fungi: I am assuming because of the way nodepool allocations work | 01:13 |
clarkb | (now I understand why Alex_Gaynor was asking about those nodes) | 01:13 |
clarkb | I think we just let the system shake itself into correctness | 01:14 |
clarkb | which it should do as long as that graph trend continues | 01:14 |
fungi | clarkb: building nodes count as filling the allocation request, which is fine in theory but if many/most fail to build on the first try those rarer node types end up starving out sooner than higher-volume types | 01:14 |
clarkb | ya | 01:14 |
Alex_Gaynor | clarkb: without a py3k node, the gate pipeline can't advance, will the system still sort itself out? | 01:17 |
*** otherwiseguy has joined #openstack-infra | 01:18 | |
clarkb | Alex_Gaynor: yes eventually the ratio between py3k nodes and non py3k nodes should get to a point where we build more of them | 01:19 |
clarkb | Alex_Gaynor: and that ratio will go in our direction as we run the other jobs | 01:19 |
Alex_Gaynor | clarkb: cool, patience isn't my best attribute :-) | 01:20 |
*** nati_ueno has quit IRC | 01:23 | |
*** nati_ueno has joined #openstack-infra | 01:24 | |
*** nati_ueno has quit IRC | 01:24 | |
*** nati_ueno has joined #openstack-infra | 01:25 | |
*** nati_uen_ has joined #openstack-infra | 01:26 | |
*** nati_uen_ has quit IRC | 01:26 | |
*** nati_uen_ has joined #openstack-infra | 01:26 | |
*** nati_uen_ has quit IRC | 01:26 | |
*** xchu has joined #openstack-infra | 01:27 | |
*** nati_ueno has quit IRC | 01:29 | |
*** trinaths has joined #openstack-infra | 01:37 | |
*** trinaths has quit IRC | 01:39 | |
*** trinaths has joined #openstack-infra | 01:39 | |
*** morganfainberg is now known as morganfainberg_Z | 01:41 | |
jhesketh | jeblair, clarkb: Do you guys have thoughts on renaming all of the jobs to remove 'check' or 'gate' from their names? For example, if I create a new job that is currently only in the check queue but will eventually be in the gate what should I name it.. If it is check, gate or other should be defined by the pipeline, not the job name | 01:43 |
* jhesketh is largely thinking outloud | 01:44 | |
harlowja | zigo yt | 01:44 |
clarkb | jhesketh: I think we should use 'gate' for everything that gates. | 01:44 |
clarkb | jhesketh: it helps distinguish what the job is for | 01:44 |
clarkb | but something that will only be check or experimental should have the other pipeline in the name | 01:44 |
openstackgerrit | K Jonathan Harker proposed a change to openstack-infra/config: Add the new hostname to /etc/hosts in prepare_node https://review.openstack.org/96907 | 01:45 |
jhesketh | clarkb: isn't it a little bit redundant though? | 01:47 |
clarkb | jhesketh: sort of. we do have lots of jobs we don't gate on | 01:47 |
clarkb | like post jobs for coverage and tarballs and releases | 01:47 |
clarkb | jhesketh: I don't think there is a technical reason to keep 'gate' in the name | 01:48 |
clarkb | for me its human readability thing | 01:48 |
*** Ryan_Lane has quit IRC | 01:48 | |
jhesketh | right, but it makes naming new jobs difficult | 01:48 |
jhesketh | or is the intention to name any job that you eventually hope to be in gate with the 'gate' prefix? | 01:48 |
jhesketh | but then you get the confusion where it isn't in the gate yet but it is named like it is | 01:48 |
clarkb | jhesketh: ya I ask people to start with gate if that is the intention | 01:49 |
clarkb | I don't think it is well enforced but that is my intent | 01:49 |
jhesketh | I don't particularly disagree, but I wonder if it is confusing to newcomers | 01:50 |
*** wenlock has joined #openstack-infra | 01:50 | |
*** reed has quit IRC | 01:50 | |
*** _nadya_ has quit IRC | 01:51 | |
*** Sukhdev has joined #openstack-infra | 01:51 | |
clarkb | it may be confusing hrm | 01:51 |
jesusaurus | as a newcomer, yes it is a bit confusing, but not that bad | 01:51 |
jesusaurus | its much less confusing than trying to figure out how all the parsing of the jjb {vars} works | 01:52 |
jhesketh | true | 01:52 |
jhesketh | I wonder if we need to better explain what is meant by gate and check.. certainly I didn't know when I first started and there was clearly some confusion during the design summit | 01:53 |
jhesketh | (for example, turbo-hipster called its jobs 'gate_xyz' because I didn't get the usecase) | 01:53 |
clarkb | jhesketh: ah | 01:53 |
clarkb | so yes, I agree we can probably be better with our terminology and education to reduce confusion | 01:54 |
jhesketh | something for the new docs repo | 01:54 |
*** trinaths has quit IRC | 01:57 | |
clarkb | mordred: 1.1 error rate is picking up again | 01:58 |
clarkb | its not terrible though | 01:59 |
clarkb | but not 0 | 01:59 |
*** dims has quit IRC | 01:59 | |
Sukhdev | Is there a way to force Jenkins run on https://review.openstack.org/#/c/97125/ | 01:59 |
Sukhdev | I tried recheck no bug - does not seem to doing anything | 02:00 |
*** starmer has joined #openstack-infra | 02:00 | |
clarkb | Sukhdev: patience | 02:00 |
clarkb | though I don't see it queued | 02:00 |
clarkb | oh zuul is chewing through results right now | 02:00 |
clarkb | so ya patience I think | 02:01 |
Sukhdev | clarkb: how do I tell if it got kicked off? | 02:01 |
*** starmer has quit IRC | 02:01 | |
clarkb | Sukhdev: http://status.openstack.org/zuul/ | 02:01 |
*** yamahata has joined #openstack-infra | 02:02 | |
clarkb | jhesketh: oh hey you are in the news | 02:03 |
jhesketh | which news is that? | 02:03 |
jhesketh | lifehacker? | 02:03 |
clarkb | people pay attention to linux conf au financials | 02:04 |
*** praneshp has joined #openstack-infra | 02:04 | |
clarkb | ya | 02:04 |
jhesketh | yep, turns out google alerts are kinda cool | 02:04 |
jhesketh | clarkb: so you should come to pyconau and lca2015 ;-) | 02:05 |
clarkb | jhesketh: pyconau is at a bad time but I am going to try really hard to do lca2015 | 02:05 |
jhesketh | :-) | 02:05 |
Sukhdev | clarkb: thank you so much... | 02:05 |
clarkb | its june already though and I have no idea what I can talk about | 02:05 |
jhesketh | it's a lot closer than Perth for you ;-) | 02:05 |
jhesketh | yeah LCA CFP's will open soon (headsup) | 02:06 |
clarkb | I feel like the last year for me has been a lot more about scaling existing things and keeping things running than doing anything fancy | 02:06 |
clarkb | maybe something will come to me | 02:06 |
clarkb | possibly for mini conf if that happens again as that is more openstacky | 02:06 |
clarkb | and/or the CI miniconf (again if that happens) | 02:07 |
jhesketh | yep, hopefully those happen | 02:07 |
jhesketh | miniconfs are proposed like talks to the organising committee so they tend to change every year | 02:07 |
clarkb | ya | 02:08 |
clarkb | I actually like how that goes. keeps it fresh with people interested | 02:08 |
clarkb | maybe we can do a retrospective on 100k changes | 02:08 |
*** gokrokve has quit IRC | 02:11 | |
*** oomichi has joined #openstack-infra | 02:11 | |
*** asalkeld has joined #openstack-infra | 02:13 | |
mattoliverau | clarkb: yeah, maybe a talk on how we've had to scale and what was learned along the way. That's useful for anyone doing CI. | 02:14 |
clarkb | mattoliverau: ya especially as people seem to be adopting more our model of CI | 02:14 |
*** asalkeld has quit IRC | 02:14 | |
mordred | clarkb: I've been thinking we can really do some more discussion on running a large system with a small team | 02:15 |
mordred | clarkb: I don't want to say the word devops because it's a lame word | 02:15 |
clarkb | mordred: ya thats what I am thinking | 02:15 |
mordred | but you know - there's a ton of non-CI related stuff we can talk about for folks too | 02:15 |
clarkb | how we got here given what we have | 02:15 |
*** asalkeld has joined #openstack-infra | 02:15 | |
mattoliverau | clarkb: exactly, you can talk about the bottlenecks we've faced and how they were resolved. Sounds really interesting to me. | 02:15 |
clarkb | mordred: maybe you and I can co do the talk | 02:15 |
clarkb | then you will miss the conference and I will just do it >_> | 02:15 |
*** asalkeld has quit IRC | 02:16 | |
mordred | clarkb: haha | 02:16 |
mattoliverau | lol | 02:16 |
mordred | I did a lovely talk about scalable hybrid cloud applications today using openstack infra as an example of such a thing | 02:16 |
clarkb | mordred: neat | 02:16 |
*** unicell has quit IRC | 02:17 | |
*** _nadya_ has joined #openstack-infra | 02:17 | |
*** Sukhdev has quit IRC | 02:17 | |
*** dprince has quit IRC | 02:19 | |
jhesketh | mattoliverau: you should do a talk on deploying openstack's CI at pyconau miniconf | 02:19 |
jhesketh | propose now! | 02:19 |
jeblair | i can't make pyconau :( but hope to attend lca2015 | 02:20 |
mattoliverau | jhesketh: yeah, I could :) But I should see if someone from the US wants to do it first so they can come to Aus ;) | 02:20 |
jhesketh | propose and we'll weigh it up in the review process | 02:21 |
jhesketh | ;-) | 02:21 |
clarkb | mordred: does the 1.1 graph look funny to you? | 02:21 |
clarkb | mordred: we appear to have a ton of nodes in the delete state | 02:21 |
*** Ryan_Lane has joined #openstack-infra | 02:24 | |
mordred | clarkb: yeah. I'd really like to see things actually delete | 02:24 |
mordred | the number deleting has been steadily rising for the last hour | 02:25 |
*** reaperhulk has joined #openstack-infra | 02:26 | |
clarkb | ya | 02:26 |
*** reaperhulk has left #openstack-infra | 02:26 | |
*** shakamunyi has quit IRC | 02:31 | |
jeblair | mordred, clarkb: hp1.1 looks like it's basically just building and delete at this point | 02:32 |
mordred | jeblair: swell | 02:32 |
clarkb | jeblair: ya | 02:33 |
*** wenlock has quit IRC | 02:33 | |
clarkb | it was so happy before :/ | 02:33 |
*** homeless_ has quit IRC | 02:34 | |
Alex_Gaynor | I'm concerned that CRs might be coming in more quickly than they're getting finished. | 02:34 |
clarkb | Alex_Gaynor: they are right now because 1.1 fell over again | 02:35 |
Alex_Gaynor | clarkb: would it be helpful to ping someone about increasing the rackspace quotas so it can shoulder more of the load? | 02:35 |
clarkb | Alex_Gaynor: maybe? jeblair mordred what do you think? | 02:35 |
clarkb | Alex_Gaynor: though IAD has been having trouble too | 02:37 |
clarkb | just not as baslky | 02:37 |
clarkb | *badly | 02:37 |
dstufft | dhellmann: pip 1.5.7? idk probably whenever I make it | 02:39 |
dstufft | I can see about doing that this week | 02:39 |
jeblair | Alex_Gaynor: the "Zuul Job Queue" graph is the important one for that -- you can see that once we started using all hp azs, the count went down, but since we've stopped getting hp nodes, it's leveled off | 02:41 |
jeblair | so yeah, our current rate is not good | 02:42 |
Alex_Gaynor | the deleting percentage of test nodes is also super high; what's that usually at? | 02:43 |
jeblair | Alex_Gaynor: much less, that appears to be one of hp1.1s current failure characteristics | 02:43 |
mordred | tteggel: when you awaken ^^ | 02:44 |
clarkb | mordred: jeblair: maybe check the floating ip list? | 02:44 |
clarkb | I am on machine without credentials to do that currently | 02:44 |
jeblair | i'm not here currently | 02:44 |
* mordred doing it | 02:44 | |
clarkb | same here >_> | 02:44 |
mordred | it's pretty solid | 02:45 |
clarkb | just thinking it would be good to rule out that fail mode | 02:45 |
mordred | WOAH. nova list is taking FOR-EVER to return | 02:45 |
clarkb | its time for some mario kart | 02:46 |
clarkb | greghaynes: ^ | 02:46 |
*** unicell has joined #openstack-infra | 02:46 | |
*** arnaud has quit IRC | 02:47 | |
mordred | mordred@camelot:~$ nova list | 02:49 |
mordred | ERROR: <attribute 'message' of 'exceptions.BaseException' objects> (HTTP 503) | 02:49 |
*** zhiyan is now known as zhiyan_ | 02:50 | |
*** melwitt has quit IRC | 02:52 | |
*** wenlock has joined #openstack-infra | 02:52 | |
greghaynes | clarkb: ++ | 02:54 |
*** mattoliverau has quit IRC | 02:56 | |
*** mattoliverau has joined #openstack-infra | 02:56 | |
*** homeless has joined #openstack-infra | 02:58 | |
jhesketh | jeblair: ping | 03:00 |
openstackgerrit | A change was merged to openstack-infra/storyboard: Project Groups API https://review.openstack.org/90736 | 03:02 |
openstackgerrit | A change was merged to openstack-infra/storyboard: Remove unnecessary files https://review.openstack.org/95741 | 03:03 |
*** zhiyan_ is now known as zhiyan | 03:03 | |
clarkb | mordred: I think that means we broke it | 03:04 |
mordred | ossum | 03:04 |
jesusaurus | mordred: im seeing similar 503 errors. i think its related to https://community.hpcloud.com/status/incident/2615 | 03:04 |
mordred | jesusaurus: nod | 03:05 |
*** grantbow has joined #openstack-infra | 03:09 | |
*** zz_gondoi is now known as gondoi | 03:12 | |
* mordred is chatting with folks from the HP NOC right now | 03:12 | |
*** Longgeek has joined #openstack-infra | 03:13 | |
*** Longgeek has quit IRC | 03:17 | |
*** nosnos has quit IRC | 03:20 | |
*** gondoi is now known as zz_gondoi | 03:21 | |
*** otter768 has quit IRC | 03:23 | |
*** zz_gondoi is now known as gondoi | 03:33 | |
*** Ryan_Lane has quit IRC | 03:37 | |
*** gondoi is now known as zz_gondoi | 03:44 | |
*** praneshp has quit IRC | 03:47 | |
*** fifieldt is now known as fifieldt-afk | 03:53 | |
*** packet has quit IRC | 03:54 | |
*** radez is now known as radez_g0n3 | 03:54 | |
*** alexandra_ has joined #openstack-infra | 03:59 | |
*** zz_gondoi is now known as gondoi | 03:59 | |
*** nosnos has joined #openstack-infra | 04:00 | |
*** asettle has quit IRC | 04:01 | |
*** amcrn has quit IRC | 04:01 | |
*** gondoi is now known as zz_gondoi | 04:03 | |
*** otherwiseguy has quit IRC | 04:08 | |
*** pcrews has quit IRC | 04:14 | |
*** Longgeek has joined #openstack-infra | 04:14 | |
*** Longgeek has quit IRC | 04:18 | |
openstackgerrit | Khai Do proposed a change to openstack-infra/config: puppetize installation of gerrit third party plugins https://review.openstack.org/91193 | 04:23 |
clarkb | oh hey nodes are deleting now | 04:25 |
jeblair | clarkb: nova list returns | 04:26 |
jeblair | clarkb, mordred: i was just spot checking a node, and i found it was stuck trying to delete its floating ip; i suspect that the nova list problem may have extended to listing floating ips | 04:27 |
jeblair | so nodepool was never able to confirm they were deleted | 04:27 |
Alex_Gaynor | zuul job queue continues to shrink | 04:27 |
clarkb | ya I think all api calls were dying | 04:27 |
clarkb | with the 500s | 04:27 |
clarkb | jeblair: https://community.hpcloud.com/status/incident/2615 | 04:27 |
jeblair | Alex_Gaynor: the 1st derivative should start increasing soon as hpcloud1.1 has just started delivering nodes again | 04:28 |
*** praneshp has joined #openstack-infra | 04:29 | |
openstackgerrit | Khai Do proposed a change to openstack-infra/config: give zaro access to jenkins-dev server https://review.openstack.org/97389 | 04:29 |
jeblair | (well, increasing in magnitude) | 04:30 |
mordred | jeblair: yeah - the deleting state graph is falling off | 04:30 |
StevenK | jeblair, mordred: Out of interest, has zuul coped with the check queue being ZOMG >400 | 04:30 |
clarkb | zuul was fine | 04:31 |
*** praneshp_ has joined #openstack-infra | 04:32 | |
jeblair | StevenK: the server's at about 13% user cpu usage | 04:32 |
StevenK | Nice | 04:32 |
mordred | jeblair: I now show only 6 nodes in actually deleting task | 04:32 |
jeblair | mordred: i see 309 | 04:33 |
openstackgerrit | Joshua Hesketh proposed a change to openstack-infra/zuul: Add gerrit reviews to the change patchset approvals https://review.openstack.org/97390 | 04:33 |
openstackgerrit | Joshua Hesketh proposed a change to openstack-infra/zuul: Allow a pipeline to specify alternative gerrit acc https://review.openstack.org/97391 | 04:33 |
*** lcheng_ has joined #openstack-infra | 04:33 | |
*** praneshp has quit IRC | 04:34 | |
*** praneshp_ is now known as praneshp | 04:34 | |
*** Ryan_Lane has joined #openstack-infra | 04:34 | |
*** trinaths has joined #openstack-infra | 04:35 | |
clarkb | jhesketh: in 97930 you add port and hostname to fakegerrit and at least port doesn't seem to be used | 04:37 |
clarkb | I feel like I must be missing something because it is almost 10pm | 04:38 |
jhesketh | clarkb: right, I think I split up my commits not quite evenly.. let me rework them a little | 04:38 |
clarkb | ah ok, no rush I probably won't do any more review tonight | 04:39 |
clarkb | but good to know I am not completely crazy | 04:39 |
jhesketh | well I'm working on this now anyway for the ironic guys | 04:39 |
openstackgerrit | A change was merged to openstack-infra/devstack-gate: Allow to configure git base URL https://review.openstack.org/95901 | 04:39 |
*** harlowja is now known as harlowja_away | 04:48 | |
openstackgerrit | Joshua Hesketh proposed a change to openstack-infra/zuul: Add gerrit reviews into patchset approvals https://review.openstack.org/97390 | 04:50 |
openstackgerrit | Joshua Hesketh proposed a change to openstack-infra/zuul: Allow a pipeline to specify alternative gerrit acc https://review.openstack.org/97391 | 04:50 |
jhesketh | clarkb: there, that should be a better division | 04:50 |
*** SumitNaiksatam has left #openstack-infra | 04:51 | |
openstackgerrit | A change was merged to openstack-infra/gitdm: Update company affiliation for morganfainberg https://review.openstack.org/95980 | 04:55 |
*** pcrews has joined #openstack-infra | 04:56 | |
*** zhiyan is now known as zhiyan_ | 05:01 | |
*** dolphm has quit IRC | 05:03 | |
*** _nadya_ has quit IRC | 05:04 | |
*** dolphm has joined #openstack-infra | 05:04 | |
*** _nadya_ has joined #openstack-infra | 05:04 | |
*** marcoemorais has joined #openstack-infra | 05:04 | |
openstackgerrit | A change was merged to openstack-infra/config: Stop using HP Cloud 1.0 https://review.openstack.org/95576 | 05:06 |
*** marcoemorais1 has joined #openstack-infra | 05:06 | |
*** marcoemorais has quit IRC | 05:08 | |
bodepd | mordred: for the second time, I have gotten feedback and a -1 from someone, then had someone else give the opposite feedback and a -1 | 05:10 |
*** homeless has quit IRC | 05:10 | |
bodepd | mordred: after making the proposed changes. I've never seen such conflicting reviews before, it's hard to understand what to do | 05:11 |
clarkb | bodepd: ask for clarification? reviews are intended to be a conversation. It isn't surprising that people disagree at times | 05:11 |
clarkb | in fairness I haven't had time to follow the changes you have pushed | 05:11 |
tchaypo | When jogo pointed me at http://git.openstack.org/cgit/openstack/nova-specs/tree/setup.cfg#n20 and said that doing "something like that" would gate on the sphinx errors/warnings | 05:14 |
*** Longgeek has joined #openstack-infra | 05:14 | |
tchaypo | I think he was saying that if I updated the setup.cfg in the tripleo-incubator to have warnerrors=True in a [pbr] block, that will be applied as the tests are run | 05:15 |
tchaypo | does that make sense? | 05:15 |
clarkb | tchaypo: yes. that is an option to the pbr build_sphinx wrapper that allows you to treat sphinx warnings as errors | 05:16 |
clarkb | and updating that config is all that is necessary assuming you use pbr | 05:16 |
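A minimal sketch of the setup.cfg setting discussed above, assuming the `[pbr]` section and `warnerrors` key exactly as named in the conversation. The sample file contents, the project name, and the helper function are hypothetical and only illustrate how to check that the option is enabled.

```python
import configparser

# Hypothetical setup.cfg contents; only the [pbr] block comes from the discussion.
SAMPLE_SETUP_CFG = """
[metadata]
name = example-project

[pbr]
warnerrors = True
"""


def sphinx_warnings_are_errors(cfg_text):
    """Return True if the given setup.cfg text enables warnerrors under [pbr]."""
    parser = configparser.ConfigParser()
    parser.read_string(cfg_text)
    # fallback=False covers both a missing [pbr] section and a missing key.
    return parser.getboolean("pbr", "warnerrors", fallback=False)


print(sphinx_warnings_are_errors(SAMPLE_SETUP_CFG))
```

A check like this could run before proposing the change, to confirm the option is actually present in the file being edited.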
tchaypo | very good. | 05:16 |
tchaypo | hrm. | 05:18 |
tchaypo | clarkb: what is it about you north-west usanians working Aus hours? | 05:19 |
tchaypo | also, is greg an honest man yet? | 05:19 |
*** Longgeek has quit IRC | 05:19 | |
*** JoshNang has quit IRC | 05:19 | |
clarkb | tchaypo: work hours are 10am to 5pmish, then dinner then pretend to work on things with tv on in background | 05:19 |
*** _nadya_ has quit IRC | 05:19 | |
clarkb | tchaypo: I don't know why we do this | 05:19 |
clarkb | and no not yet | 05:19 |
clarkb | tchaypo: I have always been this way though. when I was at university I had a hard time getting to classes before 2pm | 05:20 |
clarkb | and when I worked retail I loved the closing shift. 3pm to midnight was great | 05:20 |
clarkb | also it is really hot here | 05:21 |
clarkb | so sleep won't be easy | 05:21 |
StevenK | clarkb: Define hot? | 05:21 |
* StevenK smiles | 05:21 | |
clarkb | it was uh ~22C today I think | 05:22 |
clarkb | 17C right now | 05:22 |
tchaypo | you and greg met at uni, right? | 05:22 |
clarkb | tchaypo: ya | 05:22 |
StevenK | clarkb: 17C is hot? | 05:22 |
StevenK | It's 18C here right, and it's *cold* | 05:22 |
clarkb | StevenK: I spent the afternoon hacking from seattle center. near the international fountain | 05:22 |
clarkb | StevenK: it is very warm | 05:22 |
tchaypo | 17c is tolerable sleeping temperature, 20c is comfortable | 05:23 |
StevenK | 17c is cold | 05:23 |
clarkb | tchaypo: I will agree by the time september happens | 05:23 |
clarkb | but for right now this is shorts and tshirt hack outside weather | 05:23 |
StevenK | Crazy talk | 05:23 |
tchaypo | it *should* be more like 14 here at the moment | 05:23 |
StevenK | clarkb: If 22C is hot, how did you not die in Perth? | 05:23 |
clarkb | StevenK: I did die. But it helped a lot that the hotel had AC | 05:24 |
StevenK | Hahaha | 05:24 |
tchaypo | it's the first week of winter and we're only just getting to early autumn temps | 05:24 |
clarkb | I don't have AC in my apartment | 05:24 |
StevenK | clarkb: Hot for an Australian is >33C | 05:24 |
clarkb | I know. Seattle is special | 05:24 |
StevenK | Haha | 05:24 |
clarkb | it never gets too cold or too hot | 05:24 |
clarkb | and when it does we die | 05:24 |
*** amcrn has joined #openstack-infra | 05:26 | |
*** ildikov_afk has quit IRC | 05:26 | |
clarkb | my parents are in sydney right now and I bet they agree with you that 17C is cold | 05:27 |
*** melwitt has joined #openstack-infra | 05:27 | |
StevenK | I walked to the shops for lunch in a t-shirt, but now I'm back home, I've put a jumper on | 05:28 |
clarkb | victoria is looking nice for the weekend though. sunnyish and 19C | 05:29 |
openstackgerrit | Christian Berendt proposed a change to openstack-infra/os-loganalyze: cleaning up index.rst file https://review.openstack.org/96405 | 05:30 |
*** talluri has joined #openstack-infra | 05:35 | |
*** praneshp has quit IRC | 05:38 | |
*** rdopiera has joined #openstack-infra | 05:39 | |
* StevenK peers at Zuul's queue lengths | 05:39 | |
*** praneshp has joined #openstack-infra | 05:39 | |
*** jcoufal has joined #openstack-infra | 05:41 | |
clarkb | StevenK: the result queue was up near 300 previously. I think zuul is ok | 05:42 |
clarkb | it is mostly keeping up | 05:42 |
StevenK | The result queue seems to go down and then up and then down, but it's headed in the right direction. | 05:43 |
clarkb | yup, mostly worried about the event queue but with so much work already queued it's not a huge deal | 05:44 |
clarkb | zuul will have plenty of things to do while it gets to the event queue | 05:44 |
*** yolanda has quit IRC | 05:49 | |
*** yolanda has joined #openstack-infra | 05:53 | |
*** Ryan_Lane has quit IRC | 05:57 | |
StevenK | clarkb: Watching the check queue drop below 400, just to climb above it is amusing too | 05:58 |
clarkb | StevenK: ya | 05:58 |
clarkb | you guys never stop :) but the important numbers are in the graphs at the bottom of the page | 05:58 |
clarkb | the number of jobs waiting is falling steadily | 05:58 |
StevenK | The check queue is LDAP! | 06:00 |
StevenK | (389) | 06:00 |
StevenK | Blink. It just got smashed by ~300 events | 06:00 |
clarkb | StevenK: the periodic queue | 06:01 |
*** talluri has quit IRC | 06:02 | |
*** talluri_ has joined #openstack-infra | 06:04 | |
*** talluri_ has quit IRC | 06:04 | |
*** basha has joined #openstack-infra | 06:05 | |
*** srenatus has quit IRC | 06:05 | |
*** lcheng_ has quit IRC | 06:06 | |
*** srenatus has joined #openstack-infra | 06:06 | |
*** yolanda has quit IRC | 06:08 | |
*** ccit has joined #openstack-infra | 06:11 | |
*** mrmartin has joined #openstack-infra | 06:14 | |
jhesketh | mordred, clarkb, jeblair: What should we call the user who will do non-binding CI voting in gerrit | 06:14 |
jhesketh | I was thinking just 'non-binding-ci' | 06:14 |
clarkb | jhesketh: wfm. though maybe we should start trying to enforce a naming scheme as has also been discussed | 06:14 |
jhesketh | SergeyLukjanov, fungi: ^ | 06:14 |
jhesketh | clarkb: yes. I believe the main thing is to make it something useful (other than 'turbo-hipster' - sorry :-() | 06:15 |
jhesketh | which 'non-binding-ci' does in my mind | 06:15 |
clarkb | yup | 06:15 |
*** Longgeek has joined #openstack-infra | 06:15 | |
lifeless | btw | 06:17 |
*** ilyashakhat_ has quit IRC | 06:17 | |
lifeless | I am confused about the new user thing | 06:17 |
lifeless | why didn't check-tripleo need a new user ? | 06:17 |
clarkb | lifeless: this is a refinement | 06:17 |
clarkb | the check-tripleo thing was the first iteration and it mostly works. The issue ironic has is they want to vote on nova changes | 06:18 |
clarkb | but zuul already votes on nova changes | 06:18 |
clarkb | to avoid confusion the votes will be reported separately so that on the gerrit side they show up neatly | 06:18 |
*** kevinbenton has quit IRC | 06:19 | |
openstackgerrit | lifeless proposed a change to openstack-infra/elastic-recheck: Add rule for bug 1325815 https://review.openstack.org/97409 | 06:19 |
uvirtbot | Launchpad bug 1325815 in tripleo "tar:... file changed as we read it" [High,Triaged] https://launchpad.net/bugs/1325815 | 06:19 |
lifeless | clarkb: oh, that would be nice for us too | 06:19 |
lifeless | clarkb: can we migrate the tripleo-check stuff to that at the same time ? | 06:20 |
clarkb | lifeless: yup I expect that this will be used by you guys too | 06:20 |
*** Longgeek has quit IRC | 06:20 | |
clarkb | probably | 06:20 |
lifeless | jogo: https://review.openstack.org/97409 | 06:20 |
openstackgerrit | Joshua Hesketh proposed a change to openstack-infra/config: Add in new check-non-binding pipeline https://review.openstack.org/97411 | 06:20 |
jhesketh | lifeless: there's the new pipeline ^ | 06:20 |
jhesketh | all we'd need to do is move your tripleO stuff into there | 06:20 |
jhesketh | lifeless: or give your pipeline a different voting user | 06:21 |
lifeless | jhesketh: could just give our pipeline the new user you're creating | 06:21 |
lifeless | jhesketh: it's in a separate pipeline to avoid it making folk worry about the main queue | 06:22 |
lifeless | so we wouldn't want to mix that up at this point | 06:22 |
jhesketh | lifeless: well this isn't in the main queue, but I'm also okay with that | 06:22 |
clarkb | we can probably consolidate the queues since they are very similar in purpose | 06:22 |
openstackgerrit | Joshua Hesketh proposed a change to openstack-infra/config: Add in new check-non-binding pipeline https://review.openstack.org/97411 | 06:23 |
lifeless | clarkb: mmmm, *I* don't care where things live, but I know sdague and other folk that watch the queues religiously were very concerned | 06:23 |
jhesketh | I think less queues are better and its part of heading towards being integrated | 06:23 |
jhesketh | the zuul UI has filters | 06:24 |
lifeless | clarkb: and the ironic jobs don't have the property of running on non-regular-public-clouds that tripleo's jobs do, which is where the bloat has come from. | 06:24 |
lifeless | again, *I* don't care, but I didn't argue for the new queue in the first place. | 06:24 |
StevenK | jhesketh: Which aren't saved, and then you're still doing heavy lifting in JS | 06:24 |
*** lcheng_ has joined #openstack-infra | 06:24 | |
jhesketh | StevenK: The filters are able to be saved | 06:25 |
lifeless | However, I don't want folk getting upset or antsy if this is changed - the folk that wanted distinct views are the folk to check with. | 06:25 |
StevenK | I'm sure the status page is currently horrible just due to the size of the JSON the browser has to parse | 06:25 |
jhesketh | a little save link appears when you enter one in | 06:25 |
*** mrmartin has quit IRC | 06:25 | |
*** proffalken has quit IRC | 06:25 | |
*** _nadya_ has joined #openstack-infra | 06:25 | |
jhesketh | StevenK: Sure, but if you are parsing the json yourself you can also still filter | 06:25 |
StevenK | jhesketh: Then let me sniff that 'expand by default' isn't also a first class citizen | 06:25 |
jhesketh | that saves also | 06:26 |
openstackgerrit | Christian Berendt proposed a change to openstack-infra/zuul: Test for membership should be 'not in' https://review.openstack.org/97414 | 06:26 |
StevenK | Huh. When did that change. | 06:27 |
jhesketh | StevenK: not sure, but it sets a cookie... | 06:27 |
StevenK | I used to recall when you'd close a status page and then re-open it, expand by default wasn't checked | 06:27 |
SpamapS | hm | 06:28 |
SpamapS | why can't I recheck / reverify https://review.openstack.org/#/c/96001/ ? | 06:28 |
*** jamespage_ has quit IRC | 06:28 | |
*** yaguang has quit IRC | 06:29 | |
clarkb | SpamapS: you need patience | 06:29 |
*** yaguang has joined #openstack-infra | 06:29 | |
*** proffalken has joined #openstack-infra | 06:29 | |
SpamapS | clarkb: what? I.. can't.. s***. | 06:30 |
*** lcheng_ has quit IRC | 06:30 | |
clarkb | SpamapS: you could also make hpcloud more reliable | 06:30 |
SpamapS | clarkb: https://www.youtube.com/watch?v=enzJUemPYeU | 06:30 |
*** Longgeek has joined #openstack-infra | 06:30 | |
wenlock | lol, thats been me all night | 06:31 |
StevenK | SpamapS: Queue lengths: 50 events, 27 results. I suspect your two reverifys are in that lot | 06:31 |
clarkb | I have ~170 nodes deleting, ~130 nodes building and a whole lot of queuing | 06:31 |
SpamapS | I did one recheck, and one reverify.. didn't know which one :-P | 06:31 |
SpamapS | clarkb: that sounds like a beastie boys song | 06:31 |
*** Longgeek has quit IRC | 06:31 | |
wenlock | clarkb , is that all on US East? | 06:32 |
clarkb | I want it to be a song I never sing again | 06:32 |
clarkb | wenlock: yes | 06:32 |
StevenK | SpamapS: That's OfficeSpace, isn't it? | 06:32 |
SpamapS | I got ONE hundred SEVENTY nodes de-let-in, one thirty on deck and a whole lot o queuein. | 06:32 |
wenlock | gah... | 06:32 |
clarkb | wenlock: there are also about 40 nodes being used currently | 06:32 |
SpamapS | StevenK: indeed it is. | 06:32 |
StevenK | It's been years since I've seen it | 06:32 |
wenlock | i switched back to US West about 15 min ago | 06:33 |
clarkb | wenlock: we have been told not to use west | 06:33 |
wenlock | going slow but at least no more errors | 06:33 |
clarkb | and have to go through the quota bump process before it is useable anyways | 06:33 |
wenlock | yeah West is diablo... slower | 06:33 |
clarkb | wenlock: no diablo is off | 06:33 |
clarkb | and it was way more reliable | 06:33 |
*** Ryan_Lane has joined #openstack-infra | 06:33 | |
wenlock | doh | 06:33 |
*** andreykurilin has joined #openstack-infra | 06:34 | |
*** jlk_clone has joined #openstack-infra | 06:34 | |
clarkb | we are slowly learning how to make this east thing happy with us | 06:35 |
wenlock | hehe | 06:35 |
clarkb | make sure your networks are big or have many of them, always select an az, and so on | 06:35 |
wenlock | hmmm always select an az , we're not doing that | 06:35 |
wenlock | networks big? | 06:35 |
wenlock | whats that mean? | 06:36 |
clarkb | wenlock: by default you get a /24 | 06:36 |
clarkb | with 254 useable addresses. 2 of which will be used by router | 06:36 |
clarkb | so if you have more than 252 nodes you need a bigger network or more networks | 06:36 |
wenlock | right, all private, 0.0.0 are off | 06:36 |
wenlock | ooooh | 06:36 |
wenlock | right | 06:36 |
wenlock | we're not there yet :D | 06:37 |
wenlock | <150 | 06:37 |
wenlock | i want to have private nets for those though... but not sure if it will work yet | 06:37 |
*** lcheng_ has joined #openstack-infra | 06:37 | |
wenlock | either that or multi tenant | 06:37 |
*** jlk has quit IRC | 06:37 | |
clarkb | wenlock: right now we create a router:network pair for every 100 nodes | 06:38 |
clarkb | there was some thought that more nodes per router would make it cranky. Our horribly unscientific testing indicated that ~130 nodes saw more errors than 100 | 06:38 |
*** nati_ueno has joined #openstack-infra | 06:38 | |
clarkb | so we went with ^ | 06:38 |
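The sizing arithmetic above can be sketched as follows. This is an illustration under stated assumptions, not how nodepool itself computes anything: the `10.0.0.0/24` address is a made-up example, the two router-reserved addresses and the 100-nodes-per-pair figure are taken from the conversation, and both function names are hypothetical. It also assumes a prefix of /30 or shorter, so that the network and broadcast addresses exist.

```python
import ipaddress
import math


def usable_node_addresses(cidr, reserved_for_router=2):
    """Addresses left for nodes after network, broadcast and router addresses."""
    network = ipaddress.ip_network(cidr)
    # Subtract the network and broadcast addresses to get the host count.
    host_count = network.num_addresses - 2
    return host_count - reserved_for_router


def router_network_pairs(node_count, nodes_per_pair=100):
    """Router:network pairs needed at the given number of nodes per pair."""
    return math.ceil(node_count / nodes_per_pair)


print(usable_node_addresses("10.0.0.0/24"))  # 252
print(router_network_pairs(300))             # 3
```

With a default /24 this gives 252 node addresses, which matches the limit quoted above; capping each pair at 100 nodes stays well under that.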
*** _nadya_ has quit IRC | 06:39 | |
wenlock | clarkb, seems like a reasonable guess | 06:39 |
clarkb | wenlock: and for the AZ thing apparently you get AZ2 if you don't specify an AZ | 06:39 |
wenlock | clarkb, we did cpu/mem/network testing , we found the new stuff is significantly slower, it's a feature apparently | 06:40 |
clarkb | yes "feature" | 06:40 |
clarkb | wenlock: our test nodes are the 30GB 8vcpu flavor | 06:40 |
wenlock | clarkb, yes, feature that we all get more evenly slow vms | 06:40 |
wenlock | more reliably slow | 06:40 |
clarkb | we then boot with kernel arg set to only use 8GB of ram | 06:40 |
clarkb | to be consistent with the other nodes and not let code through that needs 30GB of memory | 06:40 |
wenlock | ahh large nodes gets higher priority | 06:40 |
clarkb | yes, and with 8vcpu the parallel tests are reasonable | 06:41 |
wenlock | do you virtualize inside that ... ie; like with docker? | 06:42 |
clarkb | no | 06:42 |
clarkb | the tempest tests spin up qemu VMs but we don't do much in them | 06:42 |
wenlock | all tests play with each other nicely? | 06:42 |
*** Ryan_Lane has quit IRC | 06:42 | |
clarkb | wenlock: yes, its actually been a great way to make people write better tests :) | 06:42 |
clarkb | there were so many bugs in the old test suites | 06:43 |
wenlock | clarkb, thats good | 06:43 |
*** Ryan_Lane has joined #openstack-infra | 06:43 | |
*** e0ne has joined #openstack-infra | 06:44 | |
*** Ryan_Lane has quit IRC | 06:44 | |
*** Ryan_Lane has joined #openstack-infra | 06:44 | |
*** ominakov has joined #openstack-infra | 06:48 | |
*** tkelsey has joined #openstack-infra | 06:48 | |
openstackgerrit | Joshua Hesketh proposed a change to openstack-infra/elastic-recheck: Ignore non-voting jobs in gerrit https://review.openstack.org/97369 | 06:49 |
StevenK | jhesketh: Er, we want that for TripleO | 06:49 |
jhesketh | StevenK: elastic recheck for failed non-voting jobs? | 06:50 |
StevenK | jhesketh: There are a few patches in flight to lay the groundwork, but yes. | 06:51 |
wenlock | clarkb, gnight... good chatting. | 06:51 |
jhesketh | StevenK: fair enough, would you mind commenting on the patch so we can move the conversation there? | 06:51 |
StevenK | lifeless: ^ | 06:51 |
*** wenlock has quit IRC | 06:52 | |
lifeless | jhesketh: StevenK: done | 06:56 |
*** melwitt has quit IRC | 06:59 | |
*** pblaho has joined #openstack-infra | 07:00 | |
*** Longgeek has joined #openstack-infra | 07:01 | |
*** ildikov has joined #openstack-infra | 07:03 | |
*** hashar has joined #openstack-infra | 07:04 | |
*** renlt has joined #openstack-infra | 07:05 | |
*** Longgeek has quit IRC | 07:07 | |
*** Longgeek has joined #openstack-infra | 07:08 | |
*** gyee has quit IRC | 07:10 | |
*** e0ne has quit IRC | 07:11 | |
*** skolekonov has joined #openstack-infra | 07:12 | |
*** afazekas has joined #openstack-infra | 07:14 | |
*** e0ne has joined #openstack-infra | 07:14 | |
*** dkranz has quit IRC | 07:21 | |
*** lcheng_ has quit IRC | 07:21 | |
*** andreaf has joined #openstack-infra | 07:22 | |
*** andreykurilin has quit IRC | 07:24 | |
*** salv-orlando has joined #openstack-infra | 07:26 | |
*** andreaf has quit IRC | 07:26 | |
*** JoshNang has joined #openstack-infra | 07:26 | |
*** Guest22618 has joined #openstack-infra | 07:29 | |
mordred | bodepd: sorry about that - which reviews? I'll be happy to track down consensus for you | 07:31 |
*** srenatus has quit IRC | 07:32 | |
*** mrmartin has joined #openstack-infra | 07:32 | |
*** srenatus has joined #openstack-infra | 07:32 | |
*** dizquierdo has joined #openstack-infra | 07:32 | |
*** Ryan_Lane has quit IRC | 07:35 | |
*** mrda is now known as mrda-away | 07:40 | |
openstackgerrit | David Caro proposed a change to openstack-infra/jenkins-job-builder: Fixed referenced before assignment in zuul module https://review.openstack.org/92186 | 07:41 |
*** jgallard has joined #openstack-infra | 07:49 | |
*** praneshp has quit IRC | 07:50 | |
*** ildikov has quit IRC | 07:51 | |
*** jpich has joined #openstack-infra | 07:53 | |
*** freyes has joined #openstack-infra | 07:55 | |
*** jlibosva has joined #openstack-infra | 07:56 | |
*** _nadya_ has joined #openstack-infra | 07:56 | |
*** viktors|afk is now known as viktors | 07:57 | |
*** achuprin has quit IRC | 07:59 | |
*** starmer has joined #openstack-infra | 07:59 | |
*** e0ne has quit IRC | 08:00 | |
*** starmer has quit IRC | 08:00 | |
*** salv-orlando has quit IRC | 08:02 | |
*** e0ne has joined #openstack-infra | 08:03 | |
*** jistr has joined #openstack-infra | 08:03 | |
openstackgerrit | yolanda.robla proposed a change to openstack-infra/storyboard-webclient: Added angular-momentjs to dependencies https://review.openstack.org/96713 | 08:07 |
*** jlibosva has quit IRC | 08:08 | |
*** jlibosva has joined #openstack-infra | 08:10 | |
*** achuprin has joined #openstack-infra | 08:12 | |
*** srenatus has quit IRC | 08:13 | |
*** srenatus has joined #openstack-infra | 08:13 | |
*** ihrachyshka has joined #openstack-infra | 08:14 | |
*** jlibosva has quit IRC | 08:15 | |
*** jlibosva has joined #openstack-infra | 08:15 | |
*** ihrachyshka has quit IRC | 08:16 | |
*** ihrachyshka has joined #openstack-infra | 08:16 | |
*** rlandy has joined #openstack-infra | 08:17 | |
openstackgerrit | Yuriy Taraday proposed a change to openstack-infra/git-review: Add an option to mark uploaded patchsets as WIP https://review.openstack.org/96528 | 08:19 |
*** apevec has joined #openstack-infra | 08:19 | |
apevec | gate queue seems stuck (top of queue >17h) anyone know what's going on? | 08:20 |
*** derekh_ has joined #openstack-infra | 08:20 | |
*** nati_ueno has quit IRC | 08:21 | |
StevenK | apevec: setuptools broke the entire world yesterday, the queues are still recovering | 08:22 |
StevenK | Hmmm, I'm not sure Zuul is still processing events. | 08:23 |
apevec | yeah, queue length just went over 500 | 08:24 |
StevenK | apevec: You can see the results queue is dropping (slowly), so hopefully we're getting somewhere. | 08:24 |
YorikSar | It doesn't remove patchsets from the queue. | 08:24 |
StevenK | That's an event | 08:24 |
StevenK | Which aren't getting processed because the results are | 08:25 |
*** fbo_away is now known as fbo | 08:25 | |
apevec | right, top of the queue is https://review.openstack.org/96919 which has all jobs finished | 08:25 |
apevec | should be reported as failure | 08:26 |
apevec | so it will sit there until Zuul processes the result queue? | 08:26 |
YorikSar | StevenK: You're saying that it's too busy starting jobs and processing their results so that there's no timeslot left for reporting results and dropping processed patchsets from the queue? | 08:26 |
*** basha has quit IRC | 08:26 | |
*** ildikov has joined #openstack-infra | 08:27 | |
apevec | should be separate queues for events vs results | 08:27 |
StevenK | apevec: It is | 08:27 |
StevenK | Results and events are processed separately | 08:27 |
apevec | so just slow | 08:27 |
StevenK | I'm waiting to see what happens when results hit zero | 08:28 |
*** andreaf has joined #openstack-infra | 08:28 | |
*** cammann has quit IRC | 08:29 | |
*** oomichi has quit IRC | 08:29 | |
mordred | StevenK: we'll all turn inside out | 08:31 |
*** plars has quit IRC | 08:32 | |
*** ociuhandu has joined #openstack-infra | 08:35 | |
*** oomichi has joined #openstack-infra | 08:35 | |
StevenK | Event queue dropping | 08:36 |
mordred | uhm | 08:36 |
mordred | SergeyLukjanov: do the nodepool/zuul graphs all of a sudden look bad to you? | 08:36 |
mordred | oh. I get it | 08:37 |
* mordred shuts up | 08:37 | |
*** plars has joined #openstack-infra | 08:38 | |
StevenK | Heh, and the check queue is now over 500 | 08:38 |
StevenK | Hopefully after the event queue is empty, it will remove completed jobs and reset the gate | 08:39 |
*** IvanBerezovskiy has joined #openstack-infra | 08:39 | |
mordred | yeah. there's a metric shit-ton of things to process innit? | 08:39 |
StevenK | Yeah, check is dropping | 08:40 |
apevec | hmm, which job is generating tarballs.o.o ? At least http://tarballs.openstack.org/nova/nova-stable-icehouse.tar.gz is out of date: timestamp is 30-May-2014 but last patch was merged Jun 2 https://review.openstack.org/#/q/status:merged+project:openstack/nova+branch:stable/icehouse,n,z | 08:41 |
*** flaper87|afk is now known as flaper87 | 08:41 | |
StevenK | apevec: It's a post job | 08:41 |
StevenK | And that queue is still backed up | 08:41 |
*** zehicle_at_dell has quit IRC | 08:42 | |
*** zehicle_at_dell has joined #openstack-infra | 08:43 | |
apevec | StevenK, looking at http://tarballs.openstack.org/nova/?C=M;O=D I see fresh master and havana tarballs only icehouse is left behind? | 08:43 |
apevec | maybe event got lost? | 08:44 |
*** basha has joined #openstack-infra | 08:44 | |
*** e0ne has quit IRC | 08:44 | |
StevenK | Oh, icehouse. I doubt the event got lost | 08:44 |
apevec | http://tarballs.openstack.org/glance/glance-stable-icehouse.tar.gz seems up to date | 08:45 |
*** e0ne has joined #openstack-infra | 08:45 | |
*** doude has joined #openstack-infra | 08:45 | |
YorikSar | wow, >70 changes dropped from check queue | 08:46 |
StevenK | It's still dropping them, up to ~9h changes | 08:47 |
StevenK | The 12h change at the top of the queue just had a job finish, since it's blocked on an event | 08:47 |
*** jp_at_hp has joined #openstack-infra | 08:48 | |
openstackgerrit | Emilien Macchi proposed a change to openstack-infra/config: ceilometer: enable gate-grenade-dsvm-forward https://review.openstack.org/97430 | 08:48 |
*** amcrn has quit IRC | 08:49 | |
openstackgerrit | Emilien Macchi proposed a change to openstack-infra/config: ceilometer: enable gate-grenade-dsvm-forward https://review.openstack.org/97430 | 08:50 |
apevec | nova, neutron and ceilometer stable-icehouse tarballs are out of date (timestamp < last merge) - is there a way to trigger them manually? | 08:52 |
apevec | SergeyLukjanov, ^ | 08:54 |
apevec | mordred, ^ | 08:54 |
*** pblaho has quit IRC | 08:55 | |
StevenK | apevec: The post jobs just all flushed. Check in about 5 minutes? | 08:55 |
apevec | ok | 08:55 |
*** salv-orlando has joined #openstack-infra | 08:58 | |
*** pblaho has joined #openstack-infra | 08:59 | |
SergeyLukjanov | hey folks, just open irc | 08:59 |
SergeyLukjanov | today's morning was kitchen-table-selection-morning | 08:59 |
SergeyLukjanov | mordred, yup, graphs were bad | 09:01 |
mordred_phone | SergeyLukjanov: I think they're getting better... | 09:02 |
SergeyLukjanov | mordred_phone, yeah, looks like results are processing much better | 09:02 |
SergeyLukjanov | mordred_phone, btw, if I need to retrigger some job, it's better to use zuul to retrigger? | 09:03 |
SergeyLukjanov | mordred, and for post jobs like tarballs publishing too? | 09:03 |
mordred_phone | SergeyLukjanov: yes, that's right. there's a tool to inject a new event to gearman | 09:04 |
SergeyLukjanov | mordred_phone, yep, zuul enqueue | 09:05 |
SergeyLukjanov | mordred_phone, thx, I'm collecting infra root guides :) | 09:08 |
*** srenatus has quit IRC | 09:09 | |
*** _nadya_ has quit IRC | 09:10 | |
*** srenatus has joined #openstack-infra | 09:10 | |
openstackgerrit | A change was merged to openstack-infra/gerritlib: Apply cookiecutter defaults https://review.openstack.org/94844 | 09:10 |
openstackgerrit | A change was merged to openstack-infra/os-loganalyze: Add support to check swift for log files https://review.openstack.org/76796 | 09:11 |
openstackgerrit | A change was merged to openstack-infra/config: move grenade-dsvm-neutron in integrated-gate to experimental https://review.openstack.org/97377 | 09:12 |
*** pelix has joined #openstack-infra | 09:14 | |
*** _nadya_ has joined #openstack-infra | 09:15 | |
*** ociuhandu has quit IRC | 09:19 | |
openstackgerrit | A change was merged to openstack-infra/nodepool: Use except x as y instead of except x, y https://review.openstack.org/96606 | 09:20 |
*** marcoemorais1 has quit IRC | 09:22 | |
*** salv-orlando has quit IRC | 09:22 | |
*** srenatus has quit IRC | 09:23 | |
*** srenatus has joined #openstack-infra | 09:24 | |
openstackgerrit | A change was merged to openstack-infra/reviewstats: cleaning up index.rst file https://review.openstack.org/96406 | 09:24 |
openstackgerrit | A change was merged to openstack-infra/reviewstats: Update list to reflect recent changes to core team https://review.openstack.org/96133 | 09:24 |
*** bogdando has joined #openstack-infra | 09:26 | |
*** ociuhandu has joined #openstack-infra | 09:26 | |
openstackgerrit | A change was merged to openstack-infra/config: Add in experimental job to test zuul swift pushing https://review.openstack.org/97379 | 09:33 |
openstackgerrit | A change was merged to openstack-infra/reviewstats: Properly support WIP in Gerrit 2.8 https://review.openstack.org/94640 | 09:35 |
openstackgerrit | Isaku Yamahata proposed a change to openstack-infra/config: Add tacker project on StackForge https://review.openstack.org/97435 | 09:41 |
*** ociuhandu has quit IRC | 09:42 | |
openstackgerrit | Isaku Yamahata proposed a change to openstack-infra/config: Add tacker project on StackForge https://review.openstack.org/97435 | 09:45 |
*** Guest22618 has quit IRC | 09:49 | |
*** flaper87 is now known as flaper87|afk | 09:50 | |
*** achuprin has quit IRC | 09:51 | |
*** ihrachyshka has quit IRC | 09:55 | |
*** primemin1sterp has quit IRC | 09:55 | |
*** primeministerp has quit IRC | 09:55 | |
openstackgerrit | A change was merged to openstack-infra/config: Gate barbicanclient on PyPy and Python 3.3 https://review.openstack.org/97288 | 09:56 |
openstackgerrit | A change was merged to openstack-infra/config: Run tox by stack user in rally-cli job https://review.openstack.org/97249 | 09:56 |
openstackgerrit | A change was merged to openstack-infra/config: Add docs-on-rtfd to rally https://review.openstack.org/95523 | 09:56 |
*** amotoki has joined #openstack-infra | 09:57 | |
*** IvanBerezovskiy has quit IRC | 10:00 | |
*** Alexei_987 has left #openstack-infra | 10:03 | |
*** gabriel-bezerra has quit IRC | 10:07 | |
*** achuprin has joined #openstack-infra | 10:07 | |
*** samuelmz has quit IRC | 10:07 | |
*** gabriel-bezerra has joined #openstack-infra | 10:09 | |
*** samuelmz has joined #openstack-infra | 10:09 | |
*** e0ne_ has joined #openstack-infra | 10:10 | |
openstackgerrit | Sergey Skripnick proposed a change to openstack-infra/config: Remove rally-python33 job https://review.openstack.org/97448 | 10:11 |
apevec | StevenK, SergeyLukjanov - still no joy with http://tarballs.openstack.org/nova/nova-stable-icehouse.tar.gz - it's out of date compared to what's been merged to stable/icehouse | 10:11 |
*** jgallard has quit IRC | 10:12 | |
*** e0ne has quit IRC | 10:14 | |
*** Longgeek_ has joined #openstack-infra | 10:27 | |
*** ihrachyshka has joined #openstack-infra | 10:29 | |
*** Longgeek has quit IRC | 10:30 | |
*** IvanBerezovskiy has joined #openstack-infra | 10:36 | |
srenatus | hi there. I'm unsure what's missing here: https://review.openstack.org/#/c/97196/ lots of +1s already... ;) | 10:38 |
*** _nadya_ has quit IRC | 10:38 | |
*** mburned_out is now known as mburned | 10:38 | |
*** xchu has quit IRC | 10:38 | |
*** IvanBerezovskiy has quit IRC | 10:39 | |
*** _nadya_ has joined #openstack-infra | 10:40 | |
*** jooools has joined #openstack-infra | 10:41 | |
StevenK | srenatus: It needs another core reviewer to look it over and approve it. | 10:48 |
*** yamahata has quit IRC | 10:52 | |
srenatus | StevenK: ah ok. | 10:52 |
srenatus | now that would be great. ;) | 10:52 |
openstackgerrit | Ruslan Kamaldinov proposed a change to openstack-infra/config: Make murano-dsvm jobs voting https://review.openstack.org/97462 | 10:52 |
*** e0ne_ has quit IRC | 10:54 | |
*** zz_gondoi is now known as gondoi | 10:57 | |
*** salv-orlando has joined #openstack-infra | 10:58 | |
*** gondoi is now known as zz_gondoi | 10:58 | |
openstackgerrit | Sergey Skripnick proposed a change to openstack-infra/config: Remove rally-python33 job https://review.openstack.org/97448 | 10:59 |
openstackgerrit | Ruslan Kamaldinov proposed a change to openstack-infra/config: Make murano-dsvm jobs voting https://review.openstack.org/97462 | 11:01 |
*** flaper87|afk is now known as flaper87 | 11:04 | |
*** Guest22618 has joined #openstack-infra | 11:04 | |
*** rcarrillocruz has quit IRC | 11:13 | |
*** rcarrillocruz has joined #openstack-infra | 11:14 | |
*** dims_ has joined #openstack-infra | 11:14 | |
*** lcostantino has joined #openstack-infra | 11:14 | |
openstackgerrit | A change was merged to openstack-infra/config: Add rally-dsvm-netron job to rally job group https://review.openstack.org/97234 | 11:18 |
*** ociuhandu has joined #openstack-infra | 11:19 | |
*** ociuhandu has quit IRC | 11:22 | |
*** renlt has quit IRC | 11:23 | |
*** IvanBerezovskiy has joined #openstack-infra | 11:24 | |
*** jp_at_hp1 has joined #openstack-infra | 11:27 | |
*** jp_at_hp has quit IRC | 11:28 | |
sdague | jhesketh: you still awake? I had questions on https://review.openstack.org/#/c/97411/ | 11:29 |
sdague | does everything in that pipeline vote as the same user? | 11:29 |
openstackgerrit | Sergey Skripnick proposed a change to openstack-infra/config: Fix scenario name in rally-neutron job https://review.openstack.org/97469 | 11:31 |
*** hashar has quit IRC | 11:33 | |
*** matsuhashi has joined #openstack-infra | 11:34 | |
*** e0ne has joined #openstack-infra | 11:38 | |
*** miqui has quit IRC | 11:38 | |
*** yamahata has joined #openstack-infra | 11:39 | |
*** dkliban_afk is now known as dkliban | 11:40 | |
*** IvanBerezovskiy has quit IRC | 11:40 | |
*** e0ne_ has joined #openstack-infra | 11:41 | |
*** e0ne has quit IRC | 11:42 | |
*** hashar has joined #openstack-infra | 11:45 | |
*** flaper87 is now known as flaper87|afk | 11:46 | |
*** trinaths has quit IRC | 11:46 | |
*** weshay has joined #openstack-infra | 11:49 | |
*** mwagner_lap has quit IRC | 11:53 | |
*** packet has joined #openstack-infra | 11:55 | |
*** rlandy has quit IRC | 12:00 | |
*** rlandy has joined #openstack-infra | 12:00 | |
*** ildikov_ has joined #openstack-infra | 12:03 | |
fungi | apevec: the tarballs jobs were broken for a couple days after the changes went in to start also building wheels. changes merged about 24 hours ago to fix it, but rely on having fresh nodepool images built with those scripts on them (and many failed to rebuild yesterday due to all the trouble in hpcloud 1.1) | 12:05 |
*** amotoki has quit IRC | 12:05 | |
*** ildikov has quit IRC | 12:05 | |
fungi | apevec: i'll check what the current nodepool image freshness is once i get some coffee, but right now i expect it's a coin toss whether a given tarball job works (depending on where it ran) | 12:06 |
*** doude has quit IRC | 12:08 | |
apevec | fungi, thanks! Yeah, some projects got fresh tarballs, some are out of date | 12:09 |
*** ArxCruz has joined #openstack-infra | 12:09 | |
fungi | yep, tarball jobs run in rackspace should have those fixes, but any run in hpcloud are still broken (we should hopefully get another try on image updates in a couple hours but if those also fail i'll start looping to manually retry them) | 12:10 |
*** IvanBerezovskiy has joined #openstack-infra | 12:11 | |
*** packet has quit IRC | 12:11 | |
openstackgerrit | A change was merged to openstack-infra/elastic-recheck: Ignore non-voting jobs in gerrit https://review.openstack.org/97369 | 12:13 |
*** pdmars has joined #openstack-infra | 12:13 | |
openstackgerrit | A change was merged to openstack-infra/config: Add flag for unbound's resolv.conf https://review.openstack.org/90234 | 12:13 |
*** achuprin has quit IRC | 12:14 | |
jhesketh | sdague: yes | 12:14 |
sdague | jhesketh: yes to awake, or yes, 1 user per pipeline | 12:15 |
jhesketh | Yes to the 1 user per pipeline | 12:15 |
jhesketh | and an implied yes to the first | 12:15 |
sdague | ok, so the nova use case for this is pretty specific | 12:15 |
jhesketh | but between computer and couch :p | 12:15 |
*** pdmars has quit IRC | 12:16 | |
sdague | the team wants Ironic to vote as an Ironic user, so it looks like VMWare and Xen | 12:16 |
sdague | which I think this is fine, but I expect means a ton of new pipelines | 12:16 |
jhesketh | sdague: yep, you're right | 12:16 |
*** pdmars has joined #openstack-infra | 12:16 | |
jhesketh | but there's also the use cases pointed out by jeblair | 12:17 |
*** unicell has quit IRC | 12:17 | |
*** unicell1 has joined #openstack-infra | 12:17 | |
jhesketh | in http://lists.openstack.org/pipermail/openstack-dev/2014-May/036174.html | 12:17 |
sdague | this is grouping up all the rest of the non voting jobs? | 12:17 |
jhesketh | well maybe... Ironic have a very specific case where it might be okay for them to have their own pipeline (and tripleo probably fall into that too) | 12:18 |
sdague | just so I figure out if we are on the same page, I'd see - https://review.openstack.org/#/c/97411/ as class b | 12:18 |
jhesketh | but others could probably go into a non-binding general pipeline | 12:18 |
sdague | and I think class 'c' jobs would probably each get their own pipeline | 12:18 |
*** adalbas has joined #openstack-infra | 12:19 | |
jhesketh | so without making a call on what is what I think this code covers that kind of separation | 12:19 |
jcoufal | ttx: Hi, I would like to ask you, if you could change the meeting times for UX meeting - we decided to alternate | 12:19 |
jhesketh | and I do agree with your definition between b+c | 12:19 |
sdague | jhesketh: yep, sure | 12:19 |
sdague | jhesketh: would also love zuul to automatically shuffle non-voting things into this pipeline | 12:19 |
sdague | so we didn't have to reassign | 12:19 |
sdague | and jobs wouldn't wait to report for a slow non-voting job | 12:20 |
sdague | but I think that's follow on | 12:20 |
jhesketh | hmm, I'd rather write a script that fixed the layout to move them rather than zuul support a shuffle | 12:20 |
*** mattt has joined #openstack-infra | 12:20 | |
jhesketh | because a) zuul is becoming increasingly complicated, and b) that's reinforcing the fact that non-voting jobs are ignored | 12:21 |
sdague | well, as someone that ends up moving jobs all the time, I think that would be error prone :) | 12:21 |
*** rfolco has joined #openstack-infra | 12:21 | |
sdague | non-voting jobs are always going to be ignored | 12:21 |
*** zhiyan_ is now known as zhiyan | 12:21 | |
sdague | they will probably be less ignored if they come in on a second job with a vote | 12:21 |
jhesketh | agreed, so moving them to this pipeline lets us make them voting | 12:22 |
sdague | yep | 12:22 |
sdague | that's what I meant | 12:22 |
jhesketh | sorry, not the most awake ;-) | 12:22 |
sdague | so if voting: false | 12:22 |
sdague | actually put them in a second queue | 12:22 |
sdague | but voted there | 12:22 |
*** yaguang has quit IRC | 12:22 | |
sdague | because if we have to shuffle that back and forth manually, we're going to screw it up a bunch | 12:23 |
jhesketh | I think the first big shuffle may be difficult, but subsequent ones to move them from non-binding to check are the same number of lines | 12:23 |
jhesketh | and we can do the first big shuffle in parts | 12:23 |
sdague | jhesketh: we move things back and forth a lot :) | 12:25 |
*** achuprin has joined #openstack-infra | 12:26 | |
*** jgallard has joined #openstack-infra | 12:27 | |
jhesketh | sdague: well I'm also hoping having this non-binding pipeline will make it easier to see how stable a job is and hopefully reduce that | 12:27 |
* StevenK glares at post. Come on, run jobs! | 12:27 | |
sdague | jhesketh: maybe, part of the reason for the shuffle is that as jobs mature they sometimes regress | 12:28 |
*** zhiyan is now known as zhiyan_ | 12:31 | |
*** dims_ has quit IRC | 12:31 | |
*** dims_ has joined #openstack-infra | 12:32 | |
sdague | question: why are docs jobs running 5 tox jobs on every run - https://review.openstack.org/#/c/97387/ | 12:32 |
*** oomichi has quit IRC | 12:32 | |
gilliard | fungi: looks like HP cloud 1.1 is running OK for you now, is that right? | 12:33 |
*** weshay has quit IRC | 12:36 | |
*** weshay has joined #openstack-infra | 12:37 | |
*** aysyd has joined #openstack-infra | 12:38 | |
*** bauzas has quit IRC | 12:38 | |
openstackgerrit | Kiall Mac Innes proposed a change to openstack-infra/config: Add Designate DevStack/Requirements/Docs Jobs https://review.openstack.org/97348 | 12:41 |
openstackgerrit | A change was merged to openstack-infra/os-loganalyze: cleaning up index.rst file https://review.openstack.org/96405 | 12:43 |
*** salv-orlando has quit IRC | 12:43 | |
openstackgerrit | A change was merged to openstack/requirements: Add pyscss and django_pyscss for Horizon https://review.openstack.org/94376 | 12:45 |
*** krtaylor has quit IRC | 12:45 | |
openstackgerrit | Antoine Musso proposed a change to openstack-infra/zuul: cloner to easily clone dependent repositories https://review.openstack.org/70373 | 12:45 |
*** doude has joined #openstack-infra | 12:46 | |
openstackgerrit | A change was merged to openstack-infra/config: Upload tripleo-heat-templates releases to PyPi https://review.openstack.org/97265 | 12:46 |
*** julim has joined #openstack-infra | 12:46 | |
*** Longgeek_ has quit IRC | 12:48 | |
openstackgerrit | A change was merged to openstack-infra/config: Add project for zuul https://review.openstack.org/96542 | 12:49 |
openstackgerrit | Antoine Musso proposed a change to openstack-infra/zuul: Merger: support for finding branches in remote https://review.openstack.org/92161 | 12:49 |
openstackgerrit | Antoine Musso proposed a change to openstack-infra/zuul: Merger: support for pruning remotes https://review.openstack.org/92141 | 12:49 |
*** salv-orlando has joined #openstack-infra | 12:50 | |
*** smarcet has joined #openstack-infra | 12:51 | |
*** jcoufal has quit IRC | 12:52 | |
*** jcoufal has joined #openstack-infra | 12:53 | |
*** salv-orlando has quit IRC | 12:53 | |
*** changbl has quit IRC | 12:54 | |
*** salv-orlando has joined #openstack-infra | 12:55 | |
openstackgerrit | A change was merged to openstack-infra/config: add gerrit-dash-creator to stackforge https://review.openstack.org/97346 | 12:55 |
*** bauzas has joined #openstack-infra | 12:55 | |
*** dprince has joined #openstack-infra | 12:56 | |
*** radez_g0n3 is now known as radez | 12:57 | |
openstackgerrit | Antoine Musso proposed a change to openstack-infra/zuul: Add accessor for Repo._initialized https://review.openstack.org/97487 | 12:57 |
*** jistr has quit IRC | 12:57 | |
openstackgerrit | Antoine Musso proposed a change to openstack-infra/zuul: cloner to easily clone dependent repositories https://review.openstack.org/70373 | 12:59 |
*** skolekonov has quit IRC | 13:00 | |
*** jistr has joined #openstack-infra | 13:01 | |
mordred_phone | gilliard: maybe? it seems it's still slow on deletes ... and slow on spinups too ... but it's not erroring | 13:02 |
*** heyongli has joined #openstack-infra | 13:02 | |
*** pblaho has quit IRC | 13:02 | |
*** mriedem has joined #openstack-infra | 13:03 | |
*** jaypipes has joined #openstack-infra | 13:03 | |
*** basha has quit IRC | 13:04 | |
openstackgerrit | Antoine Musso proposed a change to openstack-infra/zuul: Remove trailing spaces in debug log https://review.openstack.org/97488 | 13:04 |
*** miqui has joined #openstack-infra | 13:04 | |
gilliard | mordred_phone: ye - I was basing my "seems OK" on the low number of errors. Have you spoken to anyone about slow boots/deletes? | 13:05 |
*** ildikov_ has quit IRC | 13:05 | |
*** openstackgerrit has quit IRC | 13:06 | |
*** ildikov has joined #openstack-infra | 13:06 | |
tteggel | gilliard: mordred_phone: our DB is still not entirely healthy | 13:06 |
*** openstackgerrit has joined #openstack-infra | 13:07 | |
*** jhesketh has quit IRC | 13:09 | |
*** mrmartin has quit IRC | 13:09 | |
*** wenlock has joined #openstack-infra | 13:11 | |
*** _nadya_ has quit IRC | 13:12 | |
*** matsuhashi has quit IRC | 13:12 | |
*** bauzas has quit IRC | 13:12 | |
*** nosnos has quit IRC | 13:12 | |
*** hdd_ has joined #openstack-infra | 13:14 | |
*** msabramo has joined #openstack-infra | 13:14 | |
*** liyuezho has joined #openstack-infra | 13:24 | |
*** liyuezho has quit IRC | 13:24 | |
fungi | gilliard: tteggel: at the moment we seem to have 53 nodes in use there, 233 building and 214 deleting. this suggests that most of the nodes built error out or are otherwise unusable/unreachable and are immediately going back into the delete queue | 13:24 |
*** alexpilotti has joined #openstack-infra | 13:24 | |
*** bauzas has joined #openstack-infra | 13:25 | |
fungi | i'll track a sample failure down to see if i can get a more specific anecdote | 13:25 |
tteggel | fungi: ack, investigating. | 13:26 |
fungi | worth noting, those numbers seem fairly evenly weighted across all three azs in region b | 13:26 |
fungi | so it doesn't look like one particular az is struggling | 13:27 |
*** mfer has joined #openstack-infra | 13:28 | |
*** nati_ueno has joined #openstack-infra | 13:29 | |
*** heyongli has quit IRC | 13:29 | |
*** zz_gondoi is now known as gondoi | 13:30 | |
*** zehicle_at_dell has quit IRC | 13:30 | |
*** bknudson has joined #openstack-infra | 13:32 | |
*** jgrimm has joined #openstack-infra | 13:33 | |
openstackgerrit | Cedric Brandily proposed a change to openstack-infra/git-review: Disable ssh/scp password authentication during tests https://review.openstack.org/97025 | 13:35 |
fungi | tteggel: gilliard: i stand corrected. all of the dozen or so i've looked through logs for actually built successfully, got used to run jobs and then were recycled to the delete queue. it's just that booting instances from snapshots and deleting them are taking quite a bit longer than our average job duration, so we end up with only a fraction of our quota available there | 13:35 |
fungi | tteggel: gilliard: so at this point i think the main thing hurting us there is boot and delete speed | 13:35 |
*** signed8bit has joined #openstack-infra | 13:37 | |
*** trinaths has joined #openstack-infra | 13:38 | |
fungi | tteggel: gilliard: it looks like booting from a snapshot takes 30-40 minutes, and deletes are taking about the same | 13:38 |
*** habib has joined #openstack-infra | 13:40 | |
*** trinaths has quit IRC | 13:40 | |
*** habib has quit IRC | 13:41 | |
*** jistr has quit IRC | 13:41 | |
*** habib has joined #openstack-infra | 13:41 | |
*** jistr has joined #openstack-infra | 13:42 | |
*** pblaho has joined #openstack-infra | 13:42 | |
Kiall | Q: Do the docs produced by the *-specs repos get published anywhere? | 13:43 |
*** otherwiseguy has joined #openstack-infra | 13:43 | |
anteaya | Kiall: they should, let me check the jobs to see where they are put | 13:44 |
*** oomichi has joined #openstack-infra | 13:44 | |
Kiall | thanks :) | 13:44 |
anteaya | Kiall: which spec repo? | 13:44 |
Kiall | Any of the really, just wondering where they end up | 13:45 |
Kiall | them* | 13:45 |
fungi | Kiall: the plan is to set up a specs.openstack.org website to publish them on, but i don't believe it exists yet. in the meantime, with specs being in restructuredtext, they should be fairly readable straight off of git.openstack.org or out of your local clone | 13:45 |
Kiall | fungi: Thanks :) | 13:45 |
*** oomichi has quit IRC | 13:47 | |
bookwar | Kiall: there is an html version attached to each review by the jenkins *-docs job (if it is enabled for the project), but there isn't any site for the current merged result | 13:47 |
tteggel | fungi: wow - that's crazy. our boots from snapshot are taking ~2mins | 13:48 |
fungi | bookwar: Kiall: right, our docs-drafts jobs give you a preview of what it would look like rendered for publication | 13:48 |
tteggel | fungi: how big is your snapshot? | 13:49 |
Kiall | Cool - Thanks | 13:49 |
fungi | tteggel: checking | 13:49 |
fungi | tteggel: /dev/vda1 seems to be 30G | 13:51 |
Kiall | fungi: I think he's looking for the size as reported by `glance image-list` | 13:51 |
tteggel | fungi: Kiall yes please | 13:51 |
fungi | Kiall: oh, i can certainly dig that up as well... just a sec | 13:51 |
Kiall | (Which would determine the copy time from Glance -> Compute Node) | 13:51 |
*** doug-fish has joined #openstack-infra | 13:53 | |
*** IvanBerezovskiy has quit IRC | 13:53 | |
*** esker has joined #openstack-infra | 13:57 | |
*** ihrachyshka_ has joined #openstack-infra | 13:59 | |
fungi | tteggel: glance image-show reports size 7770734592 for a bare-precise image, 5492441088 for devstack-precise, 4692312064 for bare-centos6 | 14:00 |
tteggel | fungi: thanks | 14:00 |
*** trinaths has joined #openstack-infra | 14:01 | |
fungi | so i guess that's 7.2, 5.1 and 4.4 gib respectively | 14:01 |
tteggel | fungi: nothing out of the ordinary there | 14:01 |
fungi | though i wonder whether it's spending time waiting on the scheduler to find somewhere to stuff another node (keep in mind that we use the 30g flavor to get enough vcpu count, but boot it with the kernel limited to 8gib ram) | 14:02 |
*** ihrachyshka has quit IRC | 14:02 | |
*** jcoufal has quit IRC | 14:03 | |
tteggel | fungi: we're not showing any significant capacity issues that would cause this | 14:03 |
openstackgerrit | linggao proposed a change to openstack/requirements: Updated version requirement for pyghmi https://review.openstack.org/97508 | 14:03 |
fungi | tteggel: okay, that's good | 14:03 |
fungi | obviously i'm blind to any of that | 14:03 |
tteggel | sure | 14:03 |
*** homeless has joined #openstack-infra | 14:03 | |
fungi | i'll test booting from one of these snapshots directly with novaclient to see what happens, just as another data point | 14:04 |
tteggel | yes please | 14:04 |
*** wenlock has quit IRC | 14:04 | |
tteggel | we have a couple of jobs that regularly test cached and snapshotted boots and we're seeing <3mins consistently | 14:05 |
*** maxbit has joined #openstack-infra | 14:07 | |
*** _nadya_ has joined #openstack-infra | 14:08 | |
*** adalbas has quit IRC | 14:08 | |
fungi | excellent. another possibility is that it's a delay related to attaching to user-defined networks? | 14:09 |
*** prad has joined #openstack-infra | 14:09 | |
fungi | since we are using those to break up the broadcast domains and get around the /24 network size limitation | 14:09 |
*** adalbas has joined #openstack-infra | 14:10 | |
tteggel | that's possible. is there any way to break down time-to-active vs time-to-sshable? | 14:11 |
*** gondoi is now known as zz_gondoi | 14:11 | |
fungi | i think i can get a rough estimate based on our debug logs | 14:11 |
fungi | checking | 14:11 |
tteggel | rough estimate would be great | 14:11 |
*** zz_gondoi is now known as gondoi | 14:13 | |
*** eharney has joined #openstack-infra | 14:15 | |
*** Guest54700 has quit IRC | 14:15 | |
*** rdopiera has quit IRC | 14:16 | |
*** basha has joined #openstack-infra | 14:16 | |
*** flaper87|afk is now known as flaper87 | 14:17 | |
*** mwagner_lap has joined #openstack-infra | 14:21 | |
fungi | tteggel: as a single example first, this is what we see for a single node's lifecycle from a nodepool debug log perspective... http://paste.openstack.org/show/82601/ | 14:23 |
tteggel | ty | 14:23 |
*** IvanBerezovskiy has joined #openstack-infra | 14:23 | |
openstackgerrit | Michael Krotscheck proposed a change to openstack-infra/storyboard: Added sort parameters to API https://review.openstack.org/95959 | 14:24 |
fungi | tteggel: 13:01:10,061 is where the nova boot call is made, 13:08:28,062 is i believe when nova reports the node in a ready state, 13:40:19,541 is when nodepool is actually able to reach it (i'm double-checking nodepool source now to confirm that) | 14:24 |
fungi | tteggel: and then on the flip side, 13:43:02,213 is when the nova delete call is made and 14:19:38,842 is when it's finally gone | 14:25 |
tteggel | fungi: let me pass this info on to someone who knows more about our neutron that I | 14:26 |
tteggel | *than | 14:26 |
fungi | tteggel: that would be awesome | 14:27 |
*** maxbit has quit IRC | 14:27 | |
*** wenlock has joined #openstack-infra | 14:27 | |
*** wenlock has joined #openstack-infra | 14:28 | |
tteggel | fungi: times are UTC? | 14:28 |
fungi | tteggel: yes | 14:29 |
*** andreaf has quit IRC | 14:29 | |
*** nati_ueno has quit IRC | 14:29 | |
fungi | tteggel: i picked one which just finished successfully deleting | 14:29 |
tteggel | got it. thanks. | 14:29 |
fungi | so that i could get an accurate delete time delta too | 14:30 |
*** jlk_clone is now known as jlk | 14:30 | |
*** jlk has joined #openstack-infra | 14:30 | |
tteggel | fungi: i'm travelling for the rest of the day; will get back to you tomorrow. | 14:30 |
fungi | tteggel: okay, great | 14:30 |
fungi | tteggel: aha, my log filter needed to be case insensitive... http://paste.openstack.org/show/82603/ | 14:33 |
fungi | tteggel: 13:08:28,062 was actually when the nova boot call returned, 13:40:14,875 was when it was finally reported in the server list and it was basically immediately reachable. checking now to see what exactly it waits for between 13:08:28,062 and 13:40:14,875 | 14:34 |
tteggel | fungi: ack | 14:35 |
fungi | tteggel: it's waiting for nova to report a status of ACTIVE or ERROR apparently | 14:36 |
*** gokrokve has joined #openstack-infra | 14:36 | |
jeblair | sdague: (and jhesketh is gone) er, i'm pretty sure at the summit we decided that having one extra user for zuul to put the "policy non-voting" jobs in was the plan. | 14:36 |
*** hdd_ has quit IRC | 14:37 | |
jeblair | sdague: i'm _very strongly_ opposed to making throwaway users for zuul for integrating projects | 14:37 |
tteggel | this is doing almost exactly the same as our tooling. confused. digging deeper | 14:37 |
jeblair | sdague: i think the UI for that is terrible (with lots of extra comments being left on each patch), and creating users in gerrit is far too heavyweight | 14:38 |
jeblair | sdague: that's why, to make doubly sure, i laid out the plan in the mailing list. | 14:40 |
jeblair | sdague: if that's not acceptable to nova, then I think we need to go back to the drawing board. I would suggest that instead, we simply adopt a no non-voting jobs rule for nova and then ask nova core reviewers to please just pay attention. it's really not that hard. | 14:41 |
openstackgerrit | Sergey Lukjanov proposed a change to openstack-infra/config: Grant superuser to Nikita Konovalov in storyboard https://review.openstack.org/97521 | 14:41 |
sc68cal | hey we have some guy in openstack-meeting named lancet | 14:41 |
sc68cal | spamming basically porn text | 14:41 |
sc68cal | any way for me to kick? | 14:41 |
krotscheck | SergeyLukjanov: I tried that too, was denied. | 14:42 |
*** basha has quit IRC | 14:42 | |
jeblair | sc68cal: done | 14:42 |
sc68cal | jeblair: thanks....... | 14:42 |
*** maxbit has joined #openstack-infra | 14:43 | |
krotscheck | SergeyLukjanov: See https://review.openstack.org/#/c/78723/ | 14:44 |
krotscheck | or was that something different? | 14:44 |
krotscheck | Wait a sec, I may be talking out of my ass here. | 14:44 |
jeblair | krotscheck: what's the context? | 14:44 |
jeblair | SergeyLukjanov: ? | 14:45 |
krotscheck | jeblair: https://review.openstack.org/97521 | 14:45 |
krotscheck | I AM talking out of my ass. | 14:45 |
* SergeyLukjanov reading scrollback | 14:45 | |
krotscheck | Ignore me. | 14:45 |
jeblair | SergeyLukjanov: what's the context for adding NikitaKonovalov to the superuser list? | 14:45 |
SergeyLukjanov | jeblair, he's now sb-core, could be useful (we have other sb-cores in superusers) | 14:46 |
SergeyLukjanov | jeblair, probably not ;) | 14:46 |
jeblair | could be; i don't think there's much for a superuser to do at the moment though | 14:47 |
krotscheck | jeblair SergeyLukjanov: Well, superusers can create projects, which are subsequently deleted by load-projects. | 14:47 |
krotscheck | So being a superuser doesn’t actually let you do anything meaningful. | 14:47 |
SergeyLukjanov | krotscheck, https://review.openstack.org/#/c/78723/ is for root access to the VM, https://review.openstack.org/97521 is just some additional buttons on UI | 14:48 |
krotscheck | SergeyLukjanov: Yeah, I noticed that when I mentioned that I was talking out of my ass. | 14:48 |
SergeyLukjanov | :) | 14:48 |
*** UtahDave has joined #openstack-infra | 14:49 | |
krotscheck | Sorry, I should have been more clear about what sounds that was referring to. | 14:49 |
*** dims_ has quit IRC | 14:49 | |
SergeyLukjanov | jeblair, krotscheck, I created this CR while I was reading through the list of storyboard references in os-infra/config | 14:49 |
SergeyLukjanov | and saw that we have infra-core + sb-core in this file ;) | 14:49 |
*** lcheng_ has joined #openstack-infra | 14:50 | |
openstackgerrit | A change was merged to openstack-infra/config: Fixed config of manila's jobs https://review.openstack.org/97022 | 14:50 |
*** mika has quit IRC | 14:53 | |
jeblair | fungi: do you think there's a nodepool inefficiency here? | 14:54 |
*** ramashri has joined #openstack-infra | 14:55 | |
jeblair | fungi: i wonder if we could increase the rate limit for hpcloud 1.1 | 14:55 |
fungi | jeblair: i'm not sure yet. maybe threads being blocked? | 14:55 |
*** ihrachyshka_ has quit IRC | 14:56 | |
fungi | jeblair: oh, actually, so tteggel and gilliard upped our rate quota when i pointed out that deleting disassociated ports in a loop was getting rejected after a few due to throttling | 14:56 |
fungi | that was at the end of last week | 14:56 |
fungi | so we could probably tell nodepool to use a shorter delay between calls now | 14:56 |
jeblair | fungi: any idea what the rate is? | 14:57 |
jeblair | 'nova rate-limits' is empty | 14:57 |
fungi | no idea, just said they put us in a higher "rate class" i think (no specifics). might be getting enforced by a proxy in front of the api endpoint rather than a nova feature or something? | 14:58 |
jeblair | yay | 14:59 |
sc68cal | Is there a way to sanitize the log to expunge the lines from lancet ? | 14:59 |
fungi | i hear that's a popular solution anyway | 14:59 |
sc68cal | for neutron_ipv6 irc meeting | 14:59 |
*** reed has joined #openstack-infra | 15:00 | |
fungi | jeblair: is that ^ something we have a policy on? otherwise, i'm happy to just back up and filter the meeting logs | 15:00 |
krotscheck | Hey everyone: I was just asked to curb my language in channel. I’m sorry for using curse words, it’s not appropriate in a professional setting. | 15:00 |
jeblair | fungi: it hasn't come up, but i reckon we could. maybe you should leave it like "lancet: [deleted]" or something? | 15:02 |
*** jcoufal has joined #openstack-infra | 15:02 | |
anteaya | lancet: [spam] might be more informative | 15:02 |
*** jcoufal has quit IRC | 15:03 | |
*** lcheng_ has quit IRC | 15:03 | |
*** jcoufal has joined #openstack-infra | 15:03 | |
fungi | sure, can do shortly | 15:04 |
jeblair | anteaya: ++ | 15:04 |
jeblair | trying to balance it against the idea that the log is a 'permanent record' | 15:05 |
openstackgerrit | German Eichberger proposed a change to openstack-infra/gear: adds code to not block when using eventlet https://review.openstack.org/97533 | 15:06 |
*** _nadya_ has quit IRC | 15:07 | |
anteaya | right | 15:07 |
sc68cal | jeblair: agree - although I don't know what benefit we get from having porn in our permanent record :-\ | 15:08 |
anteaya | I don't like altering permanent records either | 15:08 |
jeblair | sc68cal: i wasn't suggesting that we do; however, if we _alter_ the record, i think we need to make it clear | 15:08 |
anteaya | but saving links to other sites sets a bad precedent for would-be spammers | 15:08 |
*** zzelle has joined #openstack-infra | 15:09 | |
sc68cal | strangely there were no URLs | 15:09 |
anteaya | ah sorry, I never checked, that was my take away from what you had said | 15:10 |
anteaya | either way, doesn't matter | 15:10 |
anteaya | expunge the content | 15:10 |
sdague | jeblair: ok, so that isn't what I thought we agreed to at breakfast | 15:10 |
sdague | so probably worth circling back around. | 15:11 |
jeblair | sdague: yeah, i thought that's why we came up with the 3 different classes of jobs, which is what i laid out in my email | 15:12 |
sdague | ok, so I think the only question is whether class 'c' allows multiples in it. Because it's definitely desired that the Ironic jobs are 1 vote, that's just Ironic jobs. | 15:14 |
sdague | so it's on par with Xen jobs, for instance | 15:14 |
*** dkliban is now known as dkliban_afk | 15:17 | |
*** afazekas has quit IRC | 15:20 | |
mgagne | jeblair: about https://review.openstack.org/#/c/90455/10 | 15:20 |
mgagne | jeblair: what do you think about the new requirement on pbr? I'm thinking about distribution people packaging python-jenkins which now have to package pbr too | 15:20 |
jeblair | sdague: yeah, so what i got was that it's too hard to filter out "non-voting because they are pointless" and "non-voting because of policy reasons" from the results list, and so we should group them in such a way that you can see that "important non-voting jobs" gave a -1 and then you look at which failed and determine if it's a blocker. | 15:20 |
jeblair | sdague: if the actual problem is that "no one can be bothered to look at the results period and the nova developers are incapable of interpreting a result other than +1/-1" then honestly there's a bigger problem and we should address that | 15:21 |
sdague | jeblair: I guess. It seems like we have a useful UX with 3rd party CI systems reporting for sets of configurations. This is a nova driver, and having it look the same way as Xen driver voting is valuable. | 15:23 |
jeblair | sdague: i think it's a terrible ux -- adding all those comments makes reading the results impractical. and honestly, this is not at the level of a third party test -- this is the actual first party. | 15:23 |
sdague | but it's not urgent right now, and it seems like the zuul infrastructure could support either option. So something that can be sorted later. | 15:24 |
jeblair | sdague: alternative solutions we could consider are a) adopt a no-non-voting jobs policy for nova, except for important ones like ironic; b) make ironic voting | 15:24 |
sdague | jeblair: I think ux is in the eye of the beholder here. Being able to filter on those users is actually handy | 15:24 |
*** rdopiera has joined #openstack-infra | 15:25 | |
*** jistr has quit IRC | 15:25 | |
*** rgerganov has joined #openstack-infra | 15:26 | |
*** signed8bit has quit IRC | 15:26 | |
jeblair | sdague: yeah, i just think that ironic isn't actually like those other users; i think the circumstances around an integrating project are different, and what works for one isn't appropriate for the other | 15:26 |
*** signed8bit has joined #openstack-infra | 15:27 | |
fungi | sc68cal: i've redacted the spam from both text and html copies of your meeting logs. i didn't find any other logs containing lines from that nick | 15:27 |
*** _nadya_ has joined #openstack-infra | 15:27 | |
*** fifieldt has joined #openstack-infra | 15:27 | |
*** skolekonov has joined #openstack-infra | 15:27 | |
*** _nadya_ has quit IRC | 15:28 | |
sc68cal | fungi: thanks | 15:28 |
sdague | jeblair: could be. But I think different people use the interface differently here, and we should allow for that. Having a driver config voting under a dedicated userid is actually very handy and lots of people like it. | 15:29 |
sdague | especially as ironic isn't just an integrated project, but it's a nova driver | 15:30 |
openstackgerrit | Ben Nemec proposed a change to openstack-infra/config: Add dib-utils project https://review.openstack.org/90281 | 15:30 |
*** esker has quit IRC | 15:30 | |
jeblair | sdague: sure, but as an integrated project, it should be tested by the openstack infrastructure and be part of the integrated gate | 15:31 |
*** esker has joined #openstack-infra | 15:31 | |
*** radez is now known as radez_g0n3 | 15:31 | |
*** gondoi is now known as zz_gondoi | 15:32 | |
jeblair | sdague: our ci system is complicated enough as it is, i'm very perturbed by the idea that we would be okay with exploding the complexity; every new pipeline and new gerrit user is another thing that users have to understand and interact with | 15:33 |
openstackgerrit | Radomir Dopieralski proposed a change to openstack/requirements: Add xstatic and xstatic-jquery for Horizon https://review.openstack.org/94337 | 15:34 |
*** gyee has joined #openstack-infra | 15:34 | |
jeblair | sdague: i'm already not excited about having to explain how "non-binding openstack ci" is different than "openstack ci", and the idea of saying "oh, yeah, ironic ci is just openstack ci but it's pretending to be a different system because of reasons but it's really the same one and might join it someday" is something i dread. | 15:34 |
*** gokrokve has quit IRC | 15:37 | |
*** ominakov has quit IRC | 15:39 | |
*** zz_gondoi is now known as gondoi | 15:42 | |
*** morganfainberg_Z is now known as morganfainberg | 15:43 | |
jeblair | fungi: should we try setting the rate to 0.5 or 0.1 and seeing what happens? | 15:44 |
fungi | jeblair: yeah, was just whipping up a patch. 0.1 it is | 15:45 |
jeblair | fungi: cool | 15:45 |
*** gondoi is now known as zz_gondoi | 15:45 | |
*** andreaf has joined #openstack-infra | 15:46 | |
*** signed8bit has quit IRC | 15:46 | |
*** lcheng_ has joined #openstack-infra | 15:46 | |
*** signed8bit has joined #openstack-infra | 15:47 | |
*** nati_ueno has joined #openstack-infra | 15:47 | |
*** zz_gondoi is now known as gondoi | 15:48 | |
*** IvanBerezovskiy has left #openstack-infra | 15:49 | |
openstackgerrit | Jeremy Stanley proposed a change to openstack-infra/config: Reduce nodepool's wait between calls in hpcloud https://review.openstack.org/97545 | 15:50 |
fungi | jeblair: that ^ ? | 15:50 |
fungi | oh, also, i manually deleted the old hpcloud 1.0 images out of nodepool's mysql database this morning. since we'd already merged the change to remove the corresponding providers, nodepool was unable to delete them itself (and was complaining in the log about it) | 15:51 |
*** rlandy has quit IRC | 15:51 | |
*** vhoward has left #openstack-infra | 15:52 | |
*** jlibosva has quit IRC | 15:52 | |
*** nati_uen_ has joined #openstack-infra | 15:52 | |
rdopiera | sdague: I just wanted to mention that xstatic just released stable version 1.0.0 :) | 15:52 |
*** nati_ue__ has joined #openstack-infra | 15:53 | |
*** fifieldt has quit IRC | 15:53 | |
sdague | rdopiera: great ! | 15:53 |
*** nati_ue__ has quit IRC | 15:53 | |
*** esker has quit IRC | 15:54 | |
jeblair | fungi: aprvd but how about we apply that manually? | 15:54 |
*** nati_ue__ has joined #openstack-infra | 15:54 | |
*** esker has joined #openstack-infra | 15:54 | |
*** ihrachyshka has joined #openstack-infra | 15:55 | |
*** nati_ueno has quit IRC | 15:55 | |
sdague | fungi: can we force merge this - https://review.openstack.org/#/c/97251/ - not having the devstack log is definitely a challenge | 15:57 |
*** nati_uen_ has quit IRC | 15:57 | |
jeblair | sdague: we usually only force-merge things that help fix the reason that they can't merge normally | 15:58 |
*** ihrachyshka has quit IRC | 15:59 | |
*** ihrachyshka has joined #openstack-infra | 15:59 | |
*** derekh_ has quit IRC | 16:00 | |
*** homeless_ has joined #openstack-infra | 16:00 | |
*** amotoki has joined #openstack-infra | 16:00 | |
*** e0ne_ has quit IRC | 16:00 | |
zaro | morning | 16:01 |
fungi | jeblair: will do | 16:01 |
*** homeless has quit IRC | 16:01 | |
sdague | jeblair: sure. The issue is that with the current gate backlog, we're doing a lot of round trips on that. And it means we are missing a kind of critical log for debugging some of the race fails. | 16:01 |
fungi | sdague: it could be enqueued and promoted instead | 16:01 |
sdague | fungi: sure | 16:02 |
*** esker has quit IRC | 16:02 | |
sdague | that would be fine as well | 16:02 |
*** esker has joined #openstack-infra | 16:02 | |
*** julim has quit IRC | 16:02 | |
*** flaper87 is now known as flaper87|afk | 16:03 | |
*** e0ne has joined #openstack-infra | 16:03 | |
fungi | will do | 16:03 |
openstackgerrit | Khai Do proposed a change to openstack-infra/config: puppetize installation of gerrit third party plugins https://review.openstack.org/91193 | 16:03 |
*** esker has quit IRC | 16:04 | |
*** ociuhandu has joined #openstack-infra | 16:04 | |
*** afazekas has joined #openstack-infra | 16:04 | |
apevec | fungi, nova-stable-icehouse.tar.gz is still May 30 - could you please kick that job manually? | 16:05 |
*** e0ne has quit IRC | 16:05 | |
apevec | also neutron and ceilometer | 16:06 |
*** pelix has quit IRC | 16:08 | |
fungi | apevec: i can trigger them, but they're built in a pipeline which doesn't get elevated priority and has a lengthy (nearly 5 hour) backlog due to worker starvation | 16:09 |
*** pelix has joined #openstack-infra | 16:09 | |
fungi | apevec: chances are there are already tarball jobs for at least some of those in the ~20 post pipeline changes waiting for available workers | 16:10 |
apevec | uh, it's still that bad | 16:10 |
apevec | are workers getting union organized to work less? :) | 16:11 |
apevec | is there any way to look into what jobs are in that queue? | 16:12 |
jeblair | apevec: http://status.openstack.org/zuul/ | 16:12 |
*** virmitio has joined #openstack-infra | 16:12 | |
*** sandywalsh has quit IRC | 16:12 | |
apevec | ah, that page is so full, I kept focusing only on check and gate queues and missed everything else | 16:13 |
*** pelix has quit IRC | 16:13 | |
anteaya | fungi: this log could use some expungement: http://eavesdrop.openstack.org/irclogs/%23openstack-meeting/%23openstack-meeting.2014-06-03.log | 16:13 |
*** pelix has joined #openstack-infra | 16:14 | |
*** sandywalsh has joined #openstack-infra | 16:15 | |
*** yfried has quit IRC | 16:15 | |
*** dims_ has joined #openstack-infra | 16:16 | |
*** yamahata has quit IRC | 16:16 | |
apevec | fungi, jeblair - only this nova master change is in post queue (and failing) https://jenkins05.openstack.org/job/nova-branch-tarball/157/ | 16:17 |
*** dangers_away is now known as dangers | 16:19 | |
jeblair | apevec: we should look into why the stable change either didn't run or failed | 16:19 |
apevec | jeblair, yes please, I need that working for 2014.1.1 release on Thursday | 16:20 |
*** dims_ has quit IRC | 16:21 | |
jeblair | apevec: what's the git sha of the nova stable branch? | 16:21 |
*** marcoemorais has joined #openstack-infra | 16:22 | |
*** rgerganov has quit IRC | 16:23 | |
apevec | jeblair, last merge is http://git.openstack.org/cgit/openstack/nova/commit/?h=stable/icehouse&id=7431cb92729663ae9460df3bc654384fd6b56788 | 16:24 |
*** markmcclain has joined #openstack-infra | 16:24 | |
apevec | so 7431cb92729663ae9460df3bc654384fd6b56788 | 16:24 |
*** jgallard has quit IRC | 16:24 | |
jeblair | apevec: http://logs.openstack.org/74/7431cb92729663ae9460df3bc654384fd6b56788/ | 16:24 |
jeblair | apevec: http://logs.openstack.org/74/7431cb92729663ae9460df3bc654384fd6b56788/post/nova-branch-tarball/62acbb9/console.html | 16:25 |
apevec | oh it failed | 16:25 |
jeblair | apevec: i wonder if the recent wheel change broke the job | 16:25 |
apevec | yeah, no *.tar.gz | 16:25 |
*** freyes has quit IRC | 16:25 | |
jeblair | mordred: ^ | 16:25 |
anteaya | he might be not answering to mordred_phone | 16:26 |
anteaya | as well | 16:26 |
fungi | jeblair: it did, we merged fixes yesterday morning, but need new image updates for bare-precise to complete in hpcloud region b with the fixed scripts on them | 16:26 |
jeblair | anteaya: i'm not going to ping him twice -- his connectivity issues are his own. i suspect he watches 'mordred' on both anyway. | 16:27 |
*** ihrachyshka has quit IRC | 16:27 | |
fungi | jeblair: i'll get loops going to regenerate any bare-precise images which failed to build there | 16:27 |
jeblair | fungi: are we having build problems in hpcloud 1.1? | 16:27 |
anteaya | k | 16:27 |
fungi | jeblair: possibly. they failed yesterday but there was so much else going on with nova failures at the time i chalked it up to that | 16:28 |
fungi | looking now to see if any succeeded today | 16:28 |
jeblair | makes sense | 16:28 |
fungi | jeblair: looks like they all succeeded today, less than an hour ago though, so chances are the job ran in hpcloud earlier than that | 16:30 |
jeblair | apevec, fungi: so re-triggering might succeed | 16:30 |
fungi | i was also waiting to retrigger the taskflow release jobs (which is how clarkb discovered the error over the weekend) | 16:30 |
*** jooools has quit IRC | 16:32 | |
fungi | apevec: i'll get to retriggering those shortly | 16:37 |
apevec | thanks! | 16:37 |
*** freyes has joined #openstack-infra | 16:38 | |
fungi | jeblair: the test nodes graph certainly seems to have shifted after i adjusted rates on hpcloud... maybe for the worse--reviewing logs now | 16:38 |
jeblair | fungi: 2014-06-03 16:40:14,895 DEBUG nodepool.ProviderManager: Manager hpcloud-b4 running task <nodepool.provider_manager.ListExtensionsTask object at 0x7fe64f8a1390> | 16:40 |
jeblair | running that over and over doesn't look healthy | 16:40 |
fungi | aw, crap. since i used patch as root to apply my change to the config, ownership got set to root and nodepoold stopped being able to read the config | 16:41 |
fungi | fixed now | 16:41 |
*** basha has joined #openstack-infra | 16:41 | |
jeblair | looks better | 16:41 |
jeblair | it's now creating _a lot_ of servers | 16:42 |
fungi | i guess this will test rate limits on nova boot calls in hpcloud ;) | 16:42 |
*** julim has joined #openstack-infra | 16:45 | |
*** harlowja_away is now known as harlowja | 16:45 | |
*** thedodd has joined #openstack-infra | 16:45 | |
*** nati_ue__ has quit IRC | 16:46 | |
*** lttrl has joined #openstack-infra | 16:46 | |
harlowja | clarkb i'm gonna do a small couple changes for taskflow and tag 0.3.1, so i think this means u don't need to rerun the job (and when i tag it this will workout this time) | 16:47 |
jeblair | fungi: ^ | 16:48 |
fungi | harlowja: thanks--i'll drop it form the list of retriggers | 16:48 |
*** hashar has quit IRC | 16:48 | |
fungi | s/form/from/ | 16:48 |
harlowja | sounds good | 16:48 |
fungi | harlowja: and yes, at this point all the slaves which would run your tarball job have clarkb's fixes (as of about an hour ago finally) so should work fine | 16:49 |
harlowja | woot | 16:49 |
*** gokrokve has joined #openstack-infra | 16:50 | |
*** skolekonov has quit IRC | 16:50 | |
*** freyes has quit IRC | 16:51 | |
*** pblaho has quit IRC | 16:53 | |
jeblair | fungi: i don't see any rate limit errors so far | 16:54 |
*** apevec has quit IRC | 16:55 | |
*** freyes has joined #openstack-infra | 16:55 | |
fungi | nor i | 16:56 |
fungi | just a bunch of missing image errors from tripleo rh1, which i've just begun to ask them about | 16:56 |
jeblair | yup | 16:57 |
*** habib has quit IRC | 16:58 | |
*** srenatus has quit IRC | 16:58 | |
*** radez_g0n3 is now known as radez | 16:59 | |
*** srenatus has joined #openstack-infra | 16:59 | |
*** marcoemorais has quit IRC | 16:59 | |
*** arnaud has joined #openstack-infra | 16:59 | |
*** marcoemorais has joined #openstack-infra | 16:59 | |
*** markwash has joined #openstack-infra | 16:59 | |
*** amcrn has joined #openstack-infra | 17:00 | |
*** marcoemorais1 has joined #openstack-infra | 17:00 | |
*** rdopiera has quit IRC | 17:00 | |
*** esker has joined #openstack-infra | 17:00 | |
jpich | Would someone be able to force an "#endmeeting" command into #openstack-meeting-3? We lost the chair earlier | 17:01 |
anteaya | jpich: anyone should be able to run #endmeeting after a certain period of time, I do believe | 17:01 |
anteaya | will meetbot not hear you if you try? | 17:01 |
*** basha has quit IRC | 17:01 | |
clarkb | yup 60 minutes after startmeeting | 17:02 |
jpich | anteaya: I think there's another meeting supposed to start straightaway though | 17:02 |
*** fbo is now known as fbo_away | 17:02 | |
jpich | anteaya: meetbot ignores me which is probably fair :) | 17:02 |
*** trinaths has left #openstack-infra | 17:02 | |
anteaya | I don't know if anyone has special meetbot powers though other than the chair or the timeout | 17:02 |
anteaya | jpich: :( | 17:02 |
*** sarob has joined #openstack-infra | 17:02 | |
jpich | Oh, 60 minutes after the *start*! Great, thank you :) | 17:03 |
anteaya | np | 17:03 |
anteaya | happy #endmeeting | 17:03 |
*** jpich has quit IRC | 17:04 | |
*** marcoemorais has quit IRC | 17:04 | |
clarkb | fungi jeblair is the nodepool rate for region b correct? we seem slow there | 17:05 |
*** homeless has joined #openstack-infra | 17:05 | |
*** lcheng_ has quit IRC | 17:06 | |
fungi | clarkb: i just set it to 0.1 (after disabling puppet agent) as an experiment. basically https://review.openstack.org/97545 | 17:06 |
clarkb | oh I see there was a patch applied? | 17:06 |
jeblair | clarkb: there's a possibility that there isn't much to do | 17:07 |
fungi | waiting to see whether it helps | 17:07 |
clarkb | jeblair: ya | 17:07 |
*** primeministerp has joined #openstack-infra | 17:07 | |
jeblair | clarkb: especially since each provider should only run listservers task every 5 seconds | 17:08 |
clarkb | we should probably sort out iad too | 17:08 |
*** vhoward has joined #openstack-infra | 17:08 | |
*** homeless_ has quit IRC | 17:09 | |
*** mika has joined #openstack-infra | 17:10 | |
*** mika has joined #openstack-infra | 17:10 | |
*** palar has joined #openstack-infra | 17:10 | |
*** ociuhandu has quit IRC | 17:12 | |
jeblair | i spot checked a building server; it's ACTIVE but nodepool hasn't noticed that yet | 17:13 |
SpamapS | hm, I am finding that coverage doesn't work on projects named things like 'os-collect-config' .. the module is 'os_collect_config' ... coverage doesn't seem to care about 'module = os_collect_config' in setup.cfg ... but if I change the name= ... then the package name changes on pypi... | 17:14 |
*** Ryan_Lane has joined #openstack-infra | 17:14 | |
fungi | jeblair: found a possible nodepool bug which accounts for the tripleo errors... http://paste.openstack.org/show/82646/ (note that image id is marked as "ready" in nodepool image-list) | 17:14 |
SpamapS | Anybody know a fix for that? are we the only people silly enough to use dashes in a name w/ pbr managed projects? | 17:14 |
jeblair | fungi: it may have changed from ERROR to ACTIVE | 17:15 |
fungi | oh... | 17:15 |
jeblair | fungi: at least, that's what i would initially conclude from that | 17:15 |
SpamapS | ERROR -> ACTIVE ... that's.. amazing. | 17:15 |
SpamapS | self healing! | 17:15 |
*** arnaud has quit IRC | 17:15 | |
fungi | well, it currently doesn't exist in glance and the logs on their end show it was deleted around that time | 17:16 |
jeblair | fungi: oh, i may have misunderstood then | 17:16 |
jeblair | fungi: so you're saying nodepool marked an error image as ready | 17:16 |
fungi | possibly. i'm saying the last comment about the image in its debug log is that it was in an error state | 17:17 |
*** dims_ has joined #openstack-infra | 17:17 | |
jeblair | fungi: yes, that sounds buggy and i think i see it | 17:17 |
fungi | and they show it was deleted at 2014-06-03T14:51:15 (a couple minutes prior to that) | 17:17 |
*** markwash has quit IRC | 17:17 | |
fungi | due to a swift failure on their backend | 17:18 |
* notmyname perks up | 17:18 | |
clarkb | SpamapS: there is a different option to set the coverage module iirc | 17:19 |
clarkb | SpamapS: if you look in pbr you should be able to find it. | 17:19 |
*** markwash has joined #openstack-infra | 17:19 | |
fungi | notmyname: the tripleo cd admins are digging into a glance upload failure which trace logged as "TRACE glance.api.v1.upload_utils Got error from Swift: put_object('glance', '9820141c-bdd0-4b46-b0c1-e70b9de51dbd', ...) failure and no ability to reset contents for reupload." | 17:20 |
fungi | presumably on something close to tip of master glance and swift | 17:20 |
openstackgerrit | James E. Blair proposed a change to openstack-infra/nodepool: Check the returned image status https://review.openstack.org/97564 | 17:20 |
SpamapS | coverage-package-name | 17:20 |
clarkb | jeblair: is the active but nodepool not noticing yet thing something we should treat as a nodepool bug? | 17:21 |
jeblair | fungi: https://review.openstack.org/97564 | 17:21 |
*** changbl has joined #openstack-infra | 17:21 | |
*** markmcclain has quit IRC | 17:21 | |
jeblair | clarkb: not sure | 17:21 |
fungi | oho | 17:21 |
jeblair | clarkb: yeah, i think so, and i think i may know one of the causes | 17:22 |
jeblair | one min | 17:22 |
clarkb | great I will wait for what you do rather than digging through nodepool logs/code | 17:22 |
SpamapS | clarkb: ugh, but that breaks tests. :-P | 17:23 |
*** dprince has quit IRC | 17:24 | |
clarkb | SpamapS: coverage is known to do that | 17:24 |
clarkb | especially if you are sensitive to timing | 17:24 |
*** arnaud has joined #openstack-infra | 17:25 | |
*** sweston has joined #openstack-infra | 17:26 | |
SpamapS | It's failing on versioning in pbr | 17:27 |
SpamapS | http://paste.ubuntu.com/7581717/ | 17:28 |
mordred_phone | SpamapS: you're failing on versioning in pbr | 17:28 |
*** ihrachyshka has joined #openstack-infra | 17:28 | |
mgagne | Is https://github.com/ruby-openstack an official OpenStack project? | 17:28 |
mordred_phone | no | 17:28 |
anteaya | mgagne: what would make you think it might be? | 17:29 |
mgagne | anteaya: the logo? | 17:29 |
anteaya | mgagne: the use of -openstack | 17:29 |
SpamapS | I don't think the words "Ruby" and "Official" are ever allowed together. | 17:29 |
*** dizquierdo has quit IRC | 17:29 | |
anteaya | mgagne: okay | 17:29 |
*** masayukig has quit IRC | 17:29 | |
mgagne | SpamapS: what about this page? http://developer.openstack.org/ | 17:29 |
anteaya | who does logo takedowns on github? | 17:29 |
SpamapS | oh weird | 17:30 |
fungi | mgagne: perhaps dtroyer may know whether that's being folded into the new client/sdk program? | 17:30 |
mgagne | SpamapS: hmmm it used to say "Official SDK" | 17:30 |
SpamapS | the problem was I didn't have "usedevelop" set | 17:30 |
openstackgerrit | James E. Blair proposed a change to openstack-infra/nodepool: Prevent listserver tasks from piling up https://review.openstack.org/97570 | 17:32 |
SpamapS | hm no | 17:32 |
SpamapS | setting coverage-package-name doesn't work | 17:32 |
jeblair | clarkb, fungi: ^ i suspect that's a contributing cause to nodepool being slow to notice servers coming online | 17:32 |
JayF | SpamapS: https://github.com/openstack/ironic-python-agent has - and works, if it's useful to have a positive example | 17:32 |
JayF | SpamapS: but we've never put that on pypi (and probably never will, because it doesn't make sense to pip install a rest api for imaging your computer :D) | 17:33 |
SpamapS | Not sure I agree with that last point. :) | 17:33 |
SpamapS | It doesn't make sense _to you_. :) | 17:33 |
reed | I don't understand: I thought that since openstackstatus hangs on #openstack-community the channel was logged... but I can't see the logs on eavesdrop. What do I need to do to have the channel logged? | 17:34 |
anteaya | meetbot | 17:35 |
anteaya | meetbot logs the channel | 17:35 |
fungi | reed: http://git.openstack.org/cgit/openstack-infra/config/tree/modules/openstack_project/manifests/eavesdrop.pp | 17:35 |
anteaya | openstackstatus tells you when the infra system is having issues | 17:35 |
SpamapS | JayF: tox -ecover in that repo produces an error | 17:35 |
SpamapS | error: invalid command 'ironic_python_agent' | 17:35 |
fungi | reed: adding it to the meetbot::site channels array will get it logging (and the openstack bot will be present in channel at that point) | 17:35 |
JayF | SpamapS: ah, was your problem with coverage? I don't know if we even have that setup for ipa | 17:36 |
* reed notices that every question can be answered in a puppet module | 17:36 | |
*** masayukig has joined #openstack-infra | 17:36 | |
*** ociuhandu has joined #openstack-infra | 17:36 | |
*** nati_ueno has joined #openstack-infra | 17:36 | |
SpamapS | JayF: yes that is the problem I have. | 17:36 |
SpamapS | JayF: and i-p-a has a different problem. :) | 17:36 |
mordred_phone | SpamapS: I'll look next time I'm on my laptop | 17:37 |
SpamapS | and if I fix that problem, it has the same problem I have | 17:37 |
fungi | reed: except for the question "which puppet module should i be looking at?" | 17:37 |
SpamapS | Coverage.py warning: Module ironic-python-agent was never imported. | 17:37 |
SpamapS | Coverage.py warning: No data was collected. | 17:37 |
reed | fungi, ah, recursions :) | 17:37 |
mordred_phone | SpamapS: do you have .coveragerc or whatever the file is? | 17:37 |
mordred_phone | SpamapS: nova has an example one | 17:37 |
SpamapS | mordred_phone: no | 17:38 |
SpamapS | looking | 17:38 |
SpamapS | hm, seems promising | 17:38 |
fungi | reed: which instead is (somewhat opaquely) answered in documentation at http://git.openstack.org/cgit/openstack-infra/config/tree/modules/openstack_project/manifests/eavesdrop.pp | 17:38 |
openstackgerrit | James E. Blair proposed a change to openstack-infra/nodepool: Log task manager queue length https://review.openstack.org/97574 | 17:38 |
openstackgerrit | James E. Blair proposed a change to openstack-infra/nodepool: Log task durations https://review.openstack.org/97575 | 17:38 |
*** sarob has quit IRC | 17:38 | |
fungi | reed: er, i meant http://ci.openstack.org/irc.html#at-a-glance | 17:38 |
clarkb | jeblair: I approved the image status check but you may want to submit it | 17:38 |
jeblair | clarkb, fungi: would you mind reviewing the branch with tip https://review.openstack.org/97575 i think we might want to go ahead and manually install that on np and restart | 17:39 |
SpamapS | mordred_phone: that still does not help unfortunately | 17:39 |
clarkb | sure | 17:39 |
clarkb | doing that now | 17:39 |
*** SumitNaiksatam has joined #openstack-infra | 17:40 | |
mordred_phone | SpamapS: OK. I'll look in a bit | 17:40 |
*** mmaglana has joined #openstack-infra | 17:41 | |
clarkb | jeblair: lgtm | 17:42 |
fungi | jeblair: yep, the new changes on that series also lgtm | 17:43 |
jeblair | okay, i've manually installed it; i'll just do a simple nodepool restart now | 17:43 |
fungi | jeblair: do we want to go ahead and enqueue those into the gate rather than wait for check pipeline resources? | 17:43 |
fungi | oh, or that | 17:43 |
clarkb | k | 17:43 |
*** markwash_ has joined #openstack-infra | 17:44 | |
jeblair | that's going to dump a lot of building nodes into delete, but with any luck, the new builds will appear faster | 17:44 |
jeblair | 2014-06-03 17:45:41,439 DEBUG nodepool.ProviderManager: Manager hpcloud-b5 ran task <nodepool.provider_manager.ListServersTask object at 0x7f13b8a2cfd0> in 20.648375988s | 17:45 |
jeblair | clarkb, fungi: ^ i think that helps validate my hypothesis | 17:45 |
jeblair | (that's long enough for every waiting server to decide it needed to enqueue a listservers task 4 times over) | 17:46 |
*** markwash has quit IRC | 17:46 | |
*** markwash_ is now known as markwash | 17:46 | |
fungi | oh wow | 17:48 |
fungi | so yeah, it does seem likely to be request dogpiling | 17:48 |
jeblair | so if there were 300 servers building, that's 6000 seconds of waiting for that api call | 17:48 |
clarkb | nice | 17:49 |
*** otherwiseguy has quit IRC | 17:49 | |
clarkb | but the lock should fix that now right? | 17:49 |
fungi | and the contributing factor in hpcloud 1.1 is that nova list takes at least 4x as long as it did in 1.0? | 17:49 |
jeblair | yup | 17:49 |
SpamapS | thanks neutron! | 17:50 |
jeblair | fungi: i'm assuming it's taking significantly longer, though i don't have times for that. rax takes 0.3-4 seconds | 17:50 |
*** dkliban_afk is now known as dkliban | 17:50 | |
*** Guest22618 has quit IRC | 17:51 | |
*** cody-somerville has quit IRC | 17:51 | |
fungi | or, well rather, if it took 25% as long as it does, we wouldn't have list tasks accumulating | 17:51 |
jeblair | wow, creating a server can take about that long too, though it seems it's more typically around 6-7 seconds | 17:53 |
SpamapS | mordred_phone: more data... it seems that the pbr additions to the 'testr' command aren't showing up.. so I can't pass --coverage-package-name .. which is what I need to do | 17:55 |
clarkb | SpamapS: what project is this for? | 17:56 |
mordred_phone | oh. well, you should be using setup.py test ... not testr | 17:56 |
SpamapS | clarkb: os-collect-config | 17:56 |
SpamapS | test? | 17:56 |
mordred_phone | test | 17:56 |
SpamapS | new stuff | 17:56 |
*** zzelle_ has joined #openstack-infra | 17:56 | |
clarkb | mordred_phone: a lot of stuff uses testr | 17:56 |
SpamapS | like.. everything | 17:57 |
SpamapS | ;) | 17:57 |
SpamapS | ok that does seem to be the problem, but given that the pbr thing is called _testr_ .. that is... really godamn confusing | 17:57 |
*** zzelle has quit IRC | 17:57 | |
mordred_phone | SpamapS: sorry | 17:57 |
jeblair | fungi, clarkb: i'm amused by our tweaking the 10 calls/sec rate limit in nodepool when the actual api calls themselves take 10 seconds | 17:57 |
clarkb | jeblair: :) | 17:58 |
fungi | ha | 17:58 |
clarkb | mordred_phone: ^ that might be good feedback to give back | 17:58 |
mordred_phone | jeblair: :) | 17:58 |
*** gyee_ has joined #openstack-infra | 17:58 | |
mordred_phone | clarkb: tteggel and gilliard are here | 17:58 |
fungi | well, here but not here. tteggel said he was travelling but would be back tomorrow i think | 17:59 |
clarkb | tteggel: gilliard ^ nova list takes ~20 seconds | 17:59 |
SpamapS | mordred_phone: fore refrence, https://review.openstack.org/97579 that fixed it | 18:01 |
boris-42 | clarkb what amount of vms? | 18:01 |
SpamapS | lovely typing prowezz | 18:01 |
clarkb | boris-42: ~300 | 18:01 |
boris-42 | clarkb it's okay=) | 18:01 |
SpamapS | not bad.. 93% unit test coverage | 18:01 |
boris-42 | clarkb it works quite slow | 18:01 |
mordred_phone | clarkb: I see tteggel tomorrow | 18:02 |
openstackgerrit | Joe Gordon proposed a change to openstack-dev/hacking: Update localization checks to understand separate catalogs https://review.openstack.org/97580 | 18:02 |
clarkb | boris-42: we are seeing rackspace be much quicker so not sure I agree | 18:02 |
*** nati_ueno has quit IRC | 18:02 | |
*** nati_uen_ has joined #openstack-infra | 18:02 | |
boris-42 | clarkb hm rackspace probably fixed issue with that stuff | 18:02 |
boris-42 | clarkb out of box openstack works quite slow | 18:03 |
boris-42 | clarkb we have even benchmark for that case | 18:03 |
boris-42 | clarkb in rally* | 18:03 |
jeblair | clarkb: first data points are in on the new build times for 1.1; they look much better | 18:05 |
boris-42 | clarkb and it usually works for 20 seconds in case of ~800 vms | 18:05 |
*** julim has quit IRC | 18:05 | |
clarkb | jeblair: great | 18:05 |
clarkb | jeblair: the zuul event and result queues are fairly large, but I think that may just be a side effect of all the changes in it | 18:06 |
*** flaper87|afk is now known as flaper87 | 18:06 | |
clarkb | it did the same thing last night and seemed to be able to keep up for the most part | 18:06 |
*** cody-somerville has joined #openstack-infra | 18:07 | |
*** julim has joined #openstack-infra | 18:07 | |
jeblair | they seem to be in the 150-200 second range | 18:07 |
mordred_phone | clarkb: remember yesterday when you thought today would not be about nodepool? | 18:07 |
clarkb | mordred_phone: yes don't ever listen to me | 18:08 |
clarkb | mordred_phone: asselin_ so it just occurred to me that argparse isn't in python 2.6 | 18:08 |
clarkb | which means 96539 may not work on centos | 18:09 |
mordred_phone | oh | 18:09 |
clarkb | is that correct? | 18:09 |
clarkb | for a script that simple it is probably fine to just read sys.argv[1] | 18:09 |
mordred_phone | uhm. you have to install the argparse module | 18:09 |
mordred_phone | ++ | 18:09 |
clarkb | or is it sys.argv[0]? I can never remember if python does the wrong thing and removes the script name | 18:10 |
asselin_ | I can take a look | 18:10 |
dstufft | it doesn't remove the script name | 18:10 |
mordred_phone | dstufft: woot | 18:10 |
asselin_ | seems it can be installed: https://pypi.python.org/pypi/argparse | 18:11 |
asselin_ | argparse should work on Python >= 2.3 | 18:11 |
clarkb | asselin_: right but you have to install it | 18:11 |
clarkb | and in this image build env we don't really want to install anything | 18:12 |
asselin_ | ok I can rewrite it without argparse | 18:12 |
*** UtahDave has quit IRC | 18:13 | |
*** basha has joined #openstack-infra | 18:13 | |
mordred_phone | asselin_: thanks | 18:13 |
mordred_phone | clarkb: nice catch | 18:13 |
*** tkelsey has quit IRC | 18:15 | |
jeblair | clarkb, mordred_phone: the hpcloud-b error rate is increasing | 18:16 |
*** praneshp has joined #openstack-infra | 18:17 | |
mordred_phone | jeblair: I was literally typing that same thing | 18:17 |
jeblair | mordred_phone: oh, i think we're hitting hpcloud quotas | 18:17 |
jeblair | phschwartz: we're seeing a lot of rax-iad instances going into ERROR state | 18:18 |
mordred_phone | oh! floating ips again? | 18:18 |
*** denis_makogon_ has joined #openstack-infra | 18:19 | |
jeblair | mordred_phone: no, instance quotas | 18:19 |
mordred_phone | oh. fun. | 18:19 |
*** ildikov has quit IRC | 18:20 | |
clarkb | jeblair: quota should be for ~610 nodes | 18:21 |
*** Sukhdev has joined #openstack-infra | 18:22 | |
clarkb | jeblair: are we hitting some other parameter that makes us fail before we get there? iirc nodepool is only configured for 500 nodes right now | 18:22 |
jeblair | clarkb: wow, i think i know what's going on | 18:22 |
jeblair | gimme a min to check | 18:22 |
mordred | clouds are great | 18:22 |
*** Ajaeger has joined #openstack-infra | 18:23 | |
*** gilliard_m has joined #openstack-infra | 18:25 | |
*** denis_makogon_ is now known as denis_makogon | 18:25 | |
mordred | jeblair, anteaya (reading deeper scrollback - yes, I do watch mordred in both places) - and yes, I think the recently merged fixes to wheels should fix that | 18:25 |
clarkb | zuul queue lengths are impressive | 18:26 |
clarkb | I am not sure it is keeping up anymore this is much worse than last night | 18:26 |
anteaya | should we send out an email to the ml? | 18:26 |
Kiall | The job queue seems to be trending downwards? | 18:27 |
clarkb | Kiall: it is, which is good, but zuul doesn't seem to be able to process the results that quickly | 18:27 |
anteaya | perhaps asking people to ratchet down patch submissions for a few hours? | 18:27 |
*** olaph has joined #openstack-infra | 18:27 | |
clarkb | anteaya: no, would be better to understand the problem a bit more | 18:27 |
anteaya | okay | 18:27 |
mordred | sdague: also, I share jeblair's concern about the new "ironic ci is ci but not ci" | 18:27 |
SpamapS | eh? | 18:28 |
*** _nadya_ has joined #openstack-infra | 18:28 | |
sdague | mordred: ok, well I'm just reflecting what was asked for during the ironic driver merge process by the nova team in that summit session | 18:28 |
Kiall | clarkb: ahh - 1095 results, that's no good :) | 18:28 |
sdague | it makes sense from that perspective | 18:28 |
mordred | sdague: I know. I think I'm concerned that it's overly baroque | 18:28 |
*** basha has quit IRC | 18:29 | |
bodepd | mordred: couple of things: 1. to CLA or not to CLA (already resolved) | 18:29 |
anteaya | yeah, I have never seen it that high before | 18:29 |
bodepd | mordred: the other was whether the project status=active statement is needed (I've heard both recommendations) | 18:29 |
clarkb | Kiall: in theory tests will level off and zuul will catch up on that | 18:29 |
clarkb | Kiall: but my theory has been proven wrong so far :) | 18:29 |
bodepd | also, what is the tag workflow? I assume that we will need tags for the split out modules? | 18:29 |
mordred | sdague: and we've never needed to create a fake third-party rig for an incubated project that should be able to avail itself of the project resources before | 18:29 |
anteaya | bodepd: I have it as an agenda item on today's -infra meeting | 18:30 |
anteaya | bodepd: if we get to it, and if my power doesn't go out first | 18:30 |
sdague | mordred: sure. We've also never had an out of tree compute driver trying to get in tree. | 18:30 |
mordred | sdague: that we think we need it now means I think that we're inventing new non-official states to describe the ironic project because $meh | 18:30 |
sdague | it's mostly about that driver landing | 18:30 |
sdague | I think it's a temporary thing for that | 18:30 |
*** SumitNaiksatam has quit IRC | 18:31 | |
mordred | we used non-voting and experimental jobs to land neutron | 18:31 |
SpamapS | I still think that bar is just an inch too high. | 18:31 |
mordred | I do too | 18:31 |
SpamapS | lower it to "we'll put it in tree but it is out if it destabilizes CI" .. not.. "be out of tree and prove yourself and then we'll accept you even though we never did that for any other drivers" | 18:32 |
mordred | the fact that we're having to invent new infra to test a merge of an incubated project is the thing | 18:32 |
bodepd | anteaya: let me know if you guys need anything from me. | 18:32 |
*** _nadya_ has quit IRC | 18:32 | |
anteaya | bodepd: you are welcome to attend the meeting if you wish | 18:32 |
jeblair | yeah, if we care about it that much, why don't we just make it voting? | 18:32 |
anteaya | starts in 28 minutes in -meeting, last item on the agenda | 18:33 |
sdague | mordred: ok, sure. So I'm mostly trying to explain what I know. It seems that there are some very different povs in different camps. I'm clearly not the decision maker in either on this particular decision. So we should probably try to figure out who all those folks are and get them in one virtual room to hammer this out | 18:33 |
anteaya | mostly I just want to hear thoughts from people because if they disagree with how I am reviewing I would prefer if they tell me, not -1 patches on the same item I gave feedback on earlier | 18:33 |
mordred | basically, extra complexity in systems usually winds up being a sniff test for me that something else is wrong | 18:33 |
mordred | sdague: ++ | 18:33 |
sdague | because apparently lots of people agreed at summit to different things :) | 18:34 |
SpamapS | if only we had a holodeck | 18:34 |
anteaya | I'd like a holodeck | 18:34 |
anteaya | I want to be on a boat in my holodeck | 18:34 |
harlowja | ^ joins in on that, ha | 18:34 |
anteaya | you can have your own boat | 18:35 |
harlowja | :) | 18:35 |
anteaya | we can have a regatta | 18:35 |
* mordred supports the regatta idea | 18:35 | |
anteaya | winner decides how ironic gets graduated? | 18:35 |
jeblair | clarkb: nodepool has leaked instances; i'm not sure why or how | 18:36 |
clarkb | huh | 18:36 |
clarkb | that would explain it | 18:36 |
jeblair | clarkb: https://etherpad.openstack.org/p/vfCgWGS616 | 18:36 |
clarkb | I also think we have reached peak zuul | 18:37 |
jeblair | that's still an active instance from nova pov | 18:37 |
mordred | clarkb: peak zuul sounds terrifying | 18:37 |
clarkb | mordred: yes quite scary | 18:37 |
mordred | ok. moving back to phone. must board plane | 18:37 |
anteaya | it is pretty terrifying to watch | 18:37 |
jeblair | i think we might want to reduce the number of status updates jenkins sends. | 18:38 |
anteaya | mordred: safe flight, wherever you are going this time | 18:38 |
clarkb | jeblair: in the gearman plugin? | 18:38 |
jeblair | clarkb: it's a thought. i don't know if it's the cause. | 18:38 |
*** nati_uen_ has quit IRC | 18:39 | |
*** nati_ueno has joined #openstack-infra | 18:39 | |
jeblair | but also it's possible that the very inefficient pipeline algorithm doesn't deal with 500 changes in queue | 18:39 |
*** klindgren_ has joined #openstack-infra | 18:39 | |
jeblair | anyway, nodepool first | 18:39 |
clarkb | jeblair: are you going to delete the aliens? | 18:40 |
jeblair | clarkb: i wondered if the server list cache change had something to do with it | 18:40 |
clarkb | oh hrm | 18:40 |
fungi | if we end up entertaining such large queues, we may benefit from additional job prioritizing solutions so that changes waiting on a job retry got knocked out sooner (reducing the queue size more aggressively) | 18:41 |
clarkb | fungi: if you look at the status right now the issue is no changes are being knocked out | 18:41 |
jeblair | clarkb: but i can't quite figure out how, if the data are old, it should have old cache data which shows the server... | 18:41 |
jeblair | oh! | 18:41 |
harlowja | should there be a message sent out to people to tell them to stop submitting code reviews (just an idea), stops it from getting worse? | 18:41 |
clarkb | fungi: we could ~halve zuul's load right now if it got to the point where it can process those things | 18:41 |
fungi | right, huge queues | 18:41 |
jeblair | clarkb, fungi: i think i have a hypothesis | 18:41 |
clarkb | harlowja: I feel like no one listens | 18:42 |
clarkb | and really if the problem is fundamental we need a proper solution | 18:42 |
anteaya | clarkb: you would be surprised | 18:42 |
jeblair | meh, we could just restart zuul and problem solved. :) | 18:42 |
anteaya | folks are aware in channels and standing by | 18:42 |
harlowja | k | 18:42 |
clarkb | jeblair: :P | 18:42 |
fungi | right, we know how to clear these queues at the expense of test results, after all, if that was what we really wanted to do | 18:43 |
clarkb | theory after looking at graphs. if you see last night/this morning's emptying of the queue then immediate rocket back to having lots of work to do | 18:43 |
clarkb | I bet the result queue backed up while I was sleeping and when all jobs were done running it finally got to chew through that and queue up the next round of work | 18:44 |
clarkb | so this may be cyclic | 18:44 |
fungi | i think that sudden spike may be the daily periodic jobs | 18:44 |
sdague | maybe, the queue was deep really early though | 18:44 |
clarkb | no the periodic jobs were a different spike | 18:44 |
clarkb | periodic jobs are the spike prior to the drive to zero | 18:45 |
clarkb | (I was awake for them) | 18:45 |
sdague | my push of code at 6am EST took 4 hrs to get nodes | 18:45 |
jeblair | clarkb, fungi: i wrote up my hypothesis in https://etherpad.openstack.org/p/vfCgWGS616 | 18:45 |
fungi | ahh, yep, the spike of interest is ~0900 utc | 18:45 |
*** krotscheck has quit IRC | 18:45 | |
*** ominakov has joined #openstack-infra | 18:45 | |
jeblair | clarkb, fungi: basically, i think it's the caching change + the restart | 18:45 |
*** ominakov has quit IRC | 18:45 | |
*** krotscheck has joined #openstack-infra | 18:46 | |
jeblair | clarkb, fungi: so the right way to clean that up? nova delete the aliens and then clean up unattached floating ips? | 18:46 |
clarkb | jeblair: that sounds right to me | 18:47 |
jeblair | (since going the other way sounds hard) | 18:47 |
clarkb | jeblair: actually you should be able to delete floating ip first | 18:47 |
jeblair | clarkb: oh, how? | 18:47 |
clarkb | probably a little hard | 18:47 |
jeblair | i mean, how do we identify them? | 18:47 |
clarkb | if you nova show on an instance you get the ip right? | 18:47 |
fungi | and delete lingering neutron ports too probably | 18:47 |
clarkb | then you neutron delete that ip | 18:47 |
fungi | though hopefully nova takes care of that part | 18:47 |
jeblair | fungi: i don't think nodepool creates ports? | 18:48 |
clarkb | fungi: pretty sure nova deals with the port | 18:48 |
jeblair | clarkb: yes | 18:48 |
jeblair | clarkb: you get the ip when you nova show | 18:48 |
fungi | yeah, good point. it's just floating ips we end up handling separately | 18:48 |
jeblair | clarkb: for each alien, nova show |grep ip > file; for each ip in file delete ip; for each alien delete server? | 18:48 |
clarkb | jeblair: that sounds right | 18:48 |
bodepd | I can't make that time. I assume the meeting is all text in irc? | 18:48 |
anteaya | bodepd: yes | 18:49 |
clarkb | and should leave ips being attached alone | 18:49 |
fungi | bodepd: yep, and archived to the web | 18:49 |
jeblair | clarkb, fungi: anyone want to volunteer to start that while i try to fix the nodepool bug? | 18:49 |
clarkb | I can take a stab at it | 18:49 |
fungi | jeblair: i can get churning on the alien deletes and the associated floating-ip deletes | 18:49 |
fungi | or clarkb | 18:49 |
clarkb | fungi: go for it | 18:49 |
jeblair | clarkb: you win! | 18:49 |
clarkb | ha | 18:50 |
clarkb | :P | 18:50 |
fungi | we can split it by provider i assume | 18:50 |
clarkb | fungi: it's just the one provider | 18:50 |
clarkb | assuming the cache thing is actually the cause | 18:50 |
fungi | i'll know here in a sec | 18:50 |
jeblair | well, b1-b5, but still probably the hard thing is making the procedure | 18:50 |
fungi | nova delete won't care which az it is though if i don't specify one, right? | 18:51 |
jeblair | oh, fun gotcha: | 18:51 |
clarkb | it shouldn't | 18:51 |
jeblair | clarkb, fungi: alien-list will show all the nodes from the other "fake" providers as aliens | 18:51 |
fungi | will neutron care which network is associated with a floating ip? | 18:51 |
jeblair | clarkb, fungi: so you'll need to do "nodepool alien-list hpcloud-b1|grep hpcloud-b1" | 18:51 |
clarkb | fungi: it shouldn't | 18:52 |
clarkb | jeblair: oh fun | 18:52 |
mordred_phone | nice | 18:52 |
fungi | ahh, yep. too awesome | 18:52 |
jeblair | clarkb, fungi: actually: nodepool alien-list hpcloud-b1|grep hpcloud-b1- | 18:52 |
jeblair | (note that final '-') | 18:52 |
fungi | yeah, to match only on the instance name | 18:52 |
jeblair | yup | 18:52 |
clarkb | my alien list is running now will have to rerun with ^ | 18:53 |
clarkb | gah it's stuck on tripleo /me uses the ^C | 18:53 |
*** nati_ueno has quit IRC | 18:53 | |
fungi | yeah, provider-specific alien-list goes waaay faster | 18:54 |
*** otherwiseguy has joined #openstack-infra | 18:54 | |
*** nati_uen_ has joined #openstack-infra | 18:55 | |
*** srenatus has quit IRC | 18:56 | |
*** srenatus has joined #openstack-infra | 18:56 | |
SergeyLukjanov | oh, near the meeting time | 18:57 |
fungi | okay, 184 aliens between the 5 refion b networks | 18:57 |
fungi | er, region | 18:57 |
clarkb | nova show doesn't seem to show me the ip | 18:57 |
clarkb | maybe I got a node with no floating ip /me tries more | 18:57 |
jeblair | oh i'm sorry, i saw an ip, i didn't notice it wasn't the floating one | 18:58 |
mordred_phone | why not just alien delete followed by delete of now orphaned ips? | 18:59 |
*** otherwiseguy has quit IRC | 18:59 | |
mordred_phone | pre deleting the ips sounds like it's hard | 18:59 |
clarkb | jeblair: looks like you get them listed like this 10.x.x.x, 15.y.y.y | 18:59 |
jeblair | mordred_phone: i think we were hoping that if it wasn't too complicated, doing it in the right order would prevent some spurious ip quota errors during the cleanup | 18:59 |
clarkb | so it will show up if it has one | 18:59 |
mordred_phone | jeblair: nod | 18:59 |
jeblair | mordred_phone: but yeah, if it's too complicated, should probably just do that | 19:00 |
*** dprince has joined #openstack-infra | 19:00 | |
jeblair | i think it's meeting time | 19:00 |
*** pelix has quit IRC | 19:00 | |
*** bauzas has quit IRC | 19:01 | |
*** otherwiseguy has joined #openstack-infra | 19:01 | |
*** ildikov has joined #openstack-infra | 19:01 | |
*** jp_at_hp1 has quit IRC | 19:01 | |
*** jistr has joined #openstack-infra | 19:02 | |
*** derekh_ has joined #openstack-infra | 19:02 | |
*** praneshp has quit IRC | 19:02 | |
*** miarmak has quit IRC | 19:02 | |
pleia2 | woo, back just in time :) | 19:02 |
*** alexpilotti_ has joined #openstack-infra | 19:03 | |
*** alexpilotti has quit IRC | 19:03 | |
*** alexpilotti_ is now known as alexpilotti | 19:03 | |
clarkb | fungi: mordred_phone I think we can do a floating ip list and from there check if the uuid for the server is in the list of aliens | 19:03 |
clarkb | that is simpler | 19:03 |
*** e0ne has joined #openstack-infra | 19:03 | |
fungi | clarkb: well, floating-ip-list tells you the fixed ip or "-" so are we concerned about races there? | 19:03 |
clarkb | oh right | 19:04 |
clarkb | gah | 19:04 |
*** praneshp has joined #openstack-infra | 19:04 | |
*** freyes has quit IRC | 19:05 | |
*** otherwiseguy has quit IRC | 19:05 | |
*** AndChat|158321 has joined #openstack-infra | 19:07 | |
*** gilliard_m has quit IRC | 19:07 | |
fungi | otherwise i'm happy to just get a couple floating-ip lists at the end and delete any which stay on the list after a minute | 19:07 |
fungi | clarkb: so any opposition to just deleting the alien instances, then deleting the non-associated fixed-ips? | 19:08 |
clarkb | fungi: nope. I think the other way is fairly complicated and will require proper code and not a big bash for loop | 19:09 |
*** ihrachyshka has quit IRC | 19:10 | |
*** ihrachyshka has joined #openstack-infra | 19:10 | |
SpamapS | soren: can we get uvirtbot in #openstack-meeting-alt ? Thanks. | 19:11 |
*** alexpilotti has quit IRC | 19:11 | |
*** markwash_ has joined #openstack-infra | 19:12 | |
fungi | clarkb: starting to iterate in that case | 19:12 |
clarkb | ok | 19:12 |
Kiall | FYI Slow/Failing instance create on HP Cloud is being treated as incident now.. Resolution should come soon .. https://community.hpcloud.com/status/incident/2618 | 19:13 |
openstackgerrit | Joe Gordon proposed a change to openstack-infra/elastic-recheck: Support suppressing bugs in bot https://review.openstack.org/97354 | 19:14 |
openstackgerrit | Joe Gordon proposed a change to openstack-infra/elastic-recheck: Add suppressed fingerprint for pep8 failures in gate https://review.openstack.org/95350 | 19:14 |
*** markwash has quit IRC | 19:14 | |
*** gyee_ has quit IRC | 19:14 | |
dhellmann | clarkb: did things settle down enough for you to be able to re-test those alpha releases of taskflow? | 19:15 |
clarkb | dhellmann: no | 19:15 |
dhellmann | clarkb: ok, no problem | 19:15 |
clarkb | today has been crazy too :) don't look at the zuul status page | 19:15 |
dhellmann | clarkb: heh, I've been reviewing specs, so I haven't even looked at code today | 19:16 |
openstackgerrit | Joe Gordon proposed a change to openstack-dev/hacking: Update localization checks to understand separate catalogs https://review.openstack.org/97580 | 19:16 |
ttx | jcoufal: re; meeting times, just make the change on the meetings wiki page if you haven't already | 19:16 |
*** bauzas has joined #openstack-infra | 19:17 | |
*** markwash_ has quit IRC | 19:18 | |
fungi | Kiall: well, at this point it seems like it might actually have been a compound issue where the underlying cause is actually nova api calls taking 10-20 seconds to return (and so we were running a backlog on calls which waited in line for one another to complete) | 19:21 |
*** masayukig has quit IRC | 19:22 | |
clarkb | fungi: how go deletes? anything else I can do to help there? | 19:23 |
fungi | instance deletes seem to have completed. working on fip deletes now | 19:23 |
clarkb | looking at zuul gearman graphs we are pretty close to getting to the bottom of the pile. Probably worth letting it get there to see what happens to the queues | 19:24 |
*** adalbas has quit IRC | 19:25 | |
Kiall | fungi: I've relayed that on... | 19:25 |
*** e0ne has quit IRC | 19:25 | |
Alex_Gaynor | clarkb: I assume it's fine that the zuul queues are pretty backed up (500 events, 2250 results) | 19:25 |
clarkb | Alex_Gaynor: well not great. basically enough zuul jobs are running that the results queue is backed up preventing event queue from being processed | 19:26 |
*** otherwiseguy has joined #openstack-infra | 19:26 | |
clarkb | Alex_Gaynor: once we hit the bottom of the pile it should be able to chew through those | 19:26 |
fungi | Alex_Gaynor: "fine" is not the word i'd use, but "known" and "being worked on" | 19:26 |
Alex_Gaynor | I guess "self resolving" is really what I meant | 19:26 |
*** torgomatic has joined #openstack-infra | 19:28 | |
*** resker has joined #openstack-infra | 19:29 | |
*** masayukig has joined #openstack-infra | 19:29 | |
arunkant | What needs to be done to merge this (https://review.openstack.org/#/c/95842/) as couple of keystone core reviewers are done with review. | 19:30 |
*** esker has quit IRC | 19:30 | |
*** otherwiseguy has quit IRC | 19:30 | |
*** Alexei_987 has joined #openstack-infra | 19:31 | |
Ajaeger | arunkant: A keystone core needs to approve it as well. But right now our job queue is enormous, so it will take a few hours to get this merged. | 19:31 |
*** otherwiseguy has joined #openstack-infra | 19:32 | |
clarkb | Ajaeger: it's a requirements change. I think a requirements core needs to approve it. Not sure where keystone core came in. Were they just checking that the requirement is legit for their needs? | 19:32 |
*** esker has joined #openstack-infra | 19:32 | |
Ajaeger | clarkb: oops, yeah - you're right - the requirements cores need to do it... | 19:32 |
*** resker has quit IRC | 19:33 | |
Ajaeger | arunkant: one of the requirements cores needs to approve it - see https://review.openstack.org/#/admin/groups/131,members for a list. | 19:33 |
jcoufal | ttx: updated https://wiki.openstack.org/wiki/Meetings/UX | 19:34 |
Ajaeger | clarkb: yeah, seems keystone came in since it's a dependency of a keystone change | 19:34 |
ttx | jcoufal: did you update the main page as well ? https://wiki.openstack.org/wiki/Meetings | 19:34 |
*** doude has quit IRC | 19:36 | |
*** otherwiseguy has quit IRC | 19:36 | |
fungi | oh too awesome. so after starting the loop to delete floating ips i got a couple of... | 19:37 |
fungi | ERROR: Invalid OpenStack Nova credentials. | 19:37 |
jcoufal | ttx: my mistake, working on it | 19:37 |
fungi | followed by several of... | 19:38 |
fungi | ERROR: Caught MongoException that may indicate temporary connectivity issue (HTTP 500) | 19:38 |
fungi | (mongo, no! never kill a customer!) | 19:38 |
jcoufal | ttx: fixed | 19:38 |
*** adalbas has joined #openstack-infra | 19:38 | |
ttx | jcoufal: ok, will pick it up, probably tomorrow though | 19:39 |
*** e0ne has joined #openstack-infra | 19:39 | |
*** Alexei_987 has quit IRC | 19:39 | |
*** Alexei_987 has joined #openstack-infra | 19:40 | |
*** bcrochet has quit IRC | 19:41 | |
jcoufal | ttx: np, thanks | 19:41 |
*** bcrochet has joined #openstack-infra | 19:41 | |
*** mkoderer has quit IRC | 19:42 | |
*** hashar has joined #openstack-infra | 19:43 | |
fungi | these nova floating-ip-delete calls are taking quite a while | 19:43 |
*** otherwiseguy has joined #openstack-infra | 19:44 | |
openstackgerrit | Joshua Harlow proposed a change to openstack/requirements: Add doc8 a documentation style checking package https://review.openstack.org/94061 | 19:44 |
fungi | i should break my loop and time a few to see how long | 19:44 |
arunkant | Ajaeger: I do see sean dague (requirement core reviewers) has reviewed it as well. This is a change needed for a keystone that's why keystone core reviewed it. So looks like its all set. Is that correct understanding? | 19:44 |
*** bcrochet has quit IRC | 19:45 | |
Ajaeger | arunkant: Some of the cores needs to give a final "Approval" as well. This has not been done yet. | 19:45 |
fungi | arunkant: a requirements core reviewer has to review it and be confident enough in it to approve it | 19:45 |
openstackgerrit | Nikhil Manchanda proposed a change to openstack-infra/config: Fix experimental trove-image-build gate job https://review.openstack.org/97608 | 19:47 |
arunkant | I was wondering what is holding it up as I saw a couple of approvals..okay..will wait for my turn for requirement core reviewers..thanks Ajaeger: and fungi: for your reply. | 19:48 |
*** otherwiseguy has quit IRC | 19:48 | |
*** bcrochet has joined #openstack-infra | 19:48 | |
SlickNik | ^^ jeblair / SergeyLukjanov: I had to make a couple of fixes to the trove image build job, so I also fixed the /opt/stack dir path issue which was causing some confusion. Thanks! | 19:49 |
*** pblaho has joined #openstack-infra | 19:49 | |
*** SlickNik has left #openstack-infra | 19:49 | |
*** SlickNik has joined #openstack-infra | 19:49 | |
SergeyLukjanov | SlickN1k, added to backlog | 19:49 |
SlickNik | Roger that. Whenever you get to it. :) | 19:50 |
SlickNik | Thanks! | 19:50 |
*** sballe has joined #openstack-infra | 19:53 | |
fungi | nova floating-ip-delete calls are taking around 3.5-5.5 seconds to return | 19:55 |
anteaya | yay, my talk got accepted for linuxcon, thanks fungi for helping me with that | 19:56 |
ArxCruz | anteaya: hey :) | 19:56 |
anteaya | ArxCruz: hey | 19:56 |
ArxCruz | anteaya: kurt is on vacation, he's asked me to join the infra 3rd meetings | 19:56 |
reed | alright, long email to -infra sent | 19:56 |
anteaya | ArxCruz: great thanks | 19:56 |
reed | blog post published, too | 19:56 |
ArxCruz | which time / day these meetings are being scheduled ? | 19:56 |
fungi | anteaya: yw! and wtg! | 19:56 |
anteaya | fungi: :D | 19:56 |
anteaya | ArxCruz: mondays at 1800 utc | 19:57 |
ArxCruz | cool, next monday I will be there then | 19:57 |
*** e0ne has quit IRC | 19:57 | |
*** otherwiseguy has joined #openstack-infra | 19:58 | |
anteaya | ArxCruz: great thanks | 19:59 |
*** AndChat|158321 has quit IRC | 19:59 | |
*** gilliard_m has joined #openstack-infra | 20:00 | |
*** flaper87 is now known as flaper87|afk | 20:00 | |
fungi | floating ip deletes finally finished | 20:01 |
clarkb | zuul results queue is falling slowly now | 20:01 |
clarkb | it hit the bottom of the gearman pile | 20:01 |
fungi | yeah, the waiting jobs finally bottomed out | 20:02 |
*** markwash has joined #openstack-infra | 20:02 | |
zaro | fungi: so i'll try to pick up where i left off on bug 1083101 | 20:02 |
uvirtbot | Launchpad bug 1083101 in openstack-ci "Set up private gerrit for security reviews" [High,In progress] https://launchpad.net/bugs/1083101 | 20:02 |
*** bcrochet has quit IRC | 20:02 | |
fungi | zaro: sounds good. i think we're in a better place to be able to get that up and running now that we have the gerrit upgrade behind us | 20:02 |
*** markwash has quit IRC | 20:02 | |
*** otherwiseguy has quit IRC | 20:02 | |
fungi | you did a lot to get it better puppeted than our previous gerrit version (where we definitely still had a lot of holes in the puppetry) | 20:03 |
*** markwash has joined #openstack-infra | 20:03 | |
clarkb | and the gerrit security stuff should help with initial bootstrapping | 20:03 |
*** bcrochet has joined #openstack-infra | 20:05 | |
fungi | "Queue lengths: 606 events, 3060 results." | 20:05 |
*** dkliban is now known as dkliban_brb | 20:05 | |
fungi | that's gonna be a while | 20:05 |
clarkb | I am waiting for 58214 to pop off | 20:05 |
clarkb | once that happens I will be happy | 20:05 |
*** marcoemorais has joined #openstack-infra | 20:05 | |
openstackgerrit | Ramy Asselin proposed a change to openstack-infra/config: Allow choice of GIT protocol used. https://review.openstack.org/96539 | 20:06 |
*** marcoemorais has quit IRC | 20:06 | |
*** marcoemorais2 has joined #openstack-infra | 20:06 | |
*** otherwiseguy has joined #openstack-infra | 20:06 | |
*** sarob has joined #openstack-infra | 20:07 | |
clarkb | asselin_: +2 thanks | 20:07 |
asselin_ | clarkb, thanks! | 20:07 |
asselin_ | that'll save me a rebase | 20:07 |
asselin_ | for the other patch | 20:07 |
*** marcoemorais1 has quit IRC | 20:08 | |
*** derekh_ has quit IRC | 20:08 | |
*** lcostantino has quit IRC | 20:09 | |
*** esker has quit IRC | 20:09 | |
*** palar_ has joined #openstack-infra | 20:10 | |
*** esker has joined #openstack-infra | 20:10 | |
*** weshay has quit IRC | 20:10 | |
*** lcostantino has joined #openstack-infra | 20:10 | |
*** _nadya_ has joined #openstack-infra | 20:11 | |
*** otherwiseguy has quit IRC | 20:11 | |
clarkb | fungi: it's under 3000! | 20:11 |
clarkb | I should go find lunch | 20:11 |
*** palar has quit IRC | 20:13 | |
*** bcrochet has quit IRC | 20:14 | |
*** palar_ has quit IRC | 20:14 | |
fungi | clarkb: at least it's not over 9000 | 20:14 |
*** esker has quit IRC | 20:14 | |
*** jcoufal has quit IRC | 20:15 | |
clarkb | that's impossible! | 20:15 |
* fungi reprimands himself for bad anime reference | 20:15 | |
clarkb | the latest movie is getting an english dub | 20:15 |
*** alexpilotti has joined #openstack-infra | 20:15 | |
fungi | i need to pick up the original series box set. been meaning to get it for a while | 20:16 |
clarkb | I tried watching it again semi recently and remembered why it annoyed me before. Each episode has about 2 minutes of content and 20 minutes of recap | 20:16 |
clarkb | someone should remaster them with all of the recap removed | 20:16 |
*** pblaho has quit IRC | 20:17 | |
fungi | yeah, cut to feature-film-length installments | 20:17 |
*** prakashkashyap has joined #openstack-infra | 20:18 | |
* pleia2 reads this conversation assuming they're talking about sailor moon and is delighted | 20:18 | |
pleia2 | (because it's funnier that way) | 20:18 |
*** sandywalsh has quit IRC | 20:18 | |
clarkb | pleia2: I didn't watch much sailor moon | 20:18 |
fungi | pleia2: i already have that box set, so no | 20:19 |
*** bcrochet has joined #openstack-infra | 20:19 | |
clarkb | it was on at a weird time in the afternoon iirc. Like 15 minutes before I got home it started | 20:19 |
pleia2 | I still want to be sailor pluto when I grow up | 20:19 |
*** prakashkashyap has left #openstack-infra | 20:19 | |
* fungi has just decided not to grow up | 20:19 | |
*** mriedem has quit IRC | 20:19 | |
fungi | much easier | 20:19 |
pleia2 | :) | 20:19 |
mtreinish | clarkb: they already did remaster it without the recap | 20:19 |
*** jistr has quit IRC | 20:20 | |
*** mriedem has joined #openstack-infra | 20:21 | |
*** sandywalsh has joined #openstack-infra | 20:21 | |
*** lcheng_ has joined #openstack-infra | 20:21 | |
*** bcrochet has quit IRC | 20:21 | |
*** Sukhdev has quit IRC | 20:22 | |
*** melwitt has joined #openstack-infra | 20:22 | |
*** bcrochet has joined #openstack-infra | 20:23 | |
fungi | the 2009 dvd release, looks like | 20:23 |
*** weshay has joined #openstack-infra | 20:25 | |
openstackgerrit | Jarret Raim proposed a change to openstack-infra/config: Adding barbican-specs to infra configs https://review.openstack.org/97616 | 20:26 |
mtreinish | fungi: yeah it started in 2009, it was done in hd too. The second season which finishes off the last arc just started a couple months ago. | 20:26 |
*** _nadya_ has quit IRC | 20:27 | |
fungi | cool | 20:27 |
*** yjiang5_away is now known as yjiang5 | 20:27 | |
*** dkliban_brb is now known as dkliban | 20:27 | |
*** _nadya_ has joined #openstack-infra | 20:27 | |
*** nati_uen_ has quit IRC | 20:28 | |
*** otherwiseguy has joined #openstack-infra | 20:32 | |
*** mriedem has quit IRC | 20:33 | |
*** mriedem has joined #openstack-infra | 20:34 | |
*** miqui has quit IRC | 20:34 | |
*** _nadya_ has quit IRC | 20:35 | |
*** sballe has quit IRC | 20:35 | |
*** arnaud has quit IRC | 20:35 | |
*** miqui has joined #openstack-infra | 20:36 | |
*** otherwiseguy has quit IRC | 20:36 | |
*** julim has quit IRC | 20:37 | |
*** miqui has quit IRC | 20:38 | |
*** dizquierdo has joined #openstack-infra | 20:39 | |
*** dangers is now known as dangers_away | 20:40 | |
*** arnaud has joined #openstack-infra | 20:41 | |
*** ramashri has quit IRC | 20:44 | |
Alex_Gaynor | Hmm. So I wonder, is it possible/likely that the web API for Zuul is generating enough traffic that it's slowing down its operations? | 20:45 |
*** ramashri has joined #openstack-infra | 20:46 | |
*** nati_ueno has joined #openstack-infra | 20:47 | |
*** marcoemorais has joined #openstack-infra | 20:48 | |
*** marcoemorais2 has quit IRC | 20:49 | |
*** marcoemorais1 has joined #openstack-infra | 20:49 | |
*** marcoemorais has quit IRC | 20:49 | |
*** otherwiseguy has joined #openstack-infra | 20:49 | |
*** Ajaeger has quit IRC | 20:52 | |
*** aysyd has quit IRC | 20:52 | |
fungi | this is an odd failure to see in the gate... https://jenkins05.openstack.org/job/gate-nova-pep8/3309/console | 20:53 |
fungi | this one as well... https://jenkins07.openstack.org/job/gate-horizon-python27/762/console | 20:53 |
*** otherwiseguy has quit IRC | 20:53 | |
pleia2 | multihomed ftw (comcast shut off my building for fun this afternoon) | 20:53 |
fungi | i suspect for both of those the check pipeline freshness checks were insufficient | 20:53 |
*** maxbit has quit IRC | 20:55 | |
*** sweston has quit IRC | 20:55 | |
*** ArxCruz has quit IRC | 20:57 | |
*** marcoemorais1 has quit IRC | 20:58 | |
*** marcoemorais has joined #openstack-infra | 20:58 | |
*** jooools has joined #openstack-infra | 20:59 | |
*** nati_uen_ has joined #openstack-infra | 21:00 | |
*** marcoemorais1 has joined #openstack-infra | 21:00 | |
*** aconrad has joined #openstack-infra | 21:00 | |
*** nati_ueno has quit IRC | 21:00 | |
*** reed has quit IRC | 21:00 | |
*** dhellman_ has joined #openstack-infra | 21:01 | |
*** marcoemorais has quit IRC | 21:02 | |
openstackgerrit | Lance Bragstad proposed a change to openstack/requirements: Add bash8 to global-requirements https://review.openstack.org/97625 | 21:02 |
*** marcoemorais has joined #openstack-infra | 21:02 | |
sdague | lbragstad: do we really need that? | 21:02 |
*** marcoemorais has quit IRC | 21:02 | |
sdague | are there any projects that currently sync requirements that would use bash8? | 21:03 |
harlowja | whats bash ;) | 21:03 |
*** marcoemorais1 has quit IRC | 21:03 | |
*** marcoemorais has joined #openstack-infra | 21:03 | |
mtreinish | sdague: well what about devstack if it uses tox for the bash8 job? | 21:03 |
mtreinish | it'll need a setup.py and you'd want bash8 in test-requires then | 21:03 |
lbragstad | sdague: I have a commit up in keystone for checking some scripts we have, and it got brought up that it wasn't added to global-reqs | 21:03 |
sdague | mtreinish: I don't think we'll do gr sync on it | 21:03 |
sdague | lbragstad: ok | 21:04 |
fungi | why is nodepool list back to not displaying the new az column? | 21:04 |
lbragstad | sdague: I can abandon if that's not the place for it | 21:04 |
sdague | lbragstad: no, the keystone thing makes it valid | 21:04 |
sdague | I didn't realize you guys were doing that | 21:04 |
jeblair | fungi: arg, i must have left that patch out of my manual install, sorry | 21:05 |
*** pdmars has quit IRC | 21:05 | |
*** marcoemorais has quit IRC | 21:05 | |
fungi | jeblair: no worries, just making sure it wasn't something more dire | 21:05 |
lbragstad | sdague: we have a couple bash scripts that are used for setting up resource (more example scripts). | 21:05 |
*** _nadya_ has joined #openstack-infra | 21:05 | |
*** marcoemorais has joined #openstack-infra | 21:05 | |
lbragstad | resources* | 21:05 |
sdague | lbragstad: it should be in the lower section though | 21:05 |
jeblair | fungi: it may be more dire, we may not be using azs | 21:05 |
mtreinish | sdague: why not? I thought you'd want to keep the versions in sync with everything else. Things like setuptools, pbr, etc... | 21:05 |
fungi | oh | 21:06 |
lbragstad | sdague: ahh ok, down here? https://github.com/openstack/requirements/blob/master/global-requirements.txt#L130 | 21:06 |
sdague | lbragstad: yeh | 21:06 |
jeblair | fungi: i'll finish my restart fix, then rebase, and hopefully it should be safe to restart with it | 21:06 |
lbragstad | sdague: ok | 21:06 |
*** marcoemorais has quit IRC | 21:06 | |
sdague | mtreinish: so, realistically, it only needs tox | 21:06 |
fungi | jeblair: sounds great | 21:06 |
sdague | I expect it will not include setuptools and pbr | 21:06 |
sdague | because it's not ever going to be a python package | 21:07 |
*** marcoemorais has joined #openstack-infra | 21:07 | |
openstackgerrit | Lance Bragstad proposed a change to openstack/requirements: Add bash8 to global-requirements https://review.openstack.org/97625 | 21:07 |
*** crc32 has joined #openstack-infra | 21:07 | |
mtreinish | but I thought you need setup.py to use tox. Are you just going to fake it? | 21:07 |
fungi | i don't think tox requires setup.py does it? you can list arbitrary (non-python) commands in your tox.ini | 21:08 |
*** gilliard_m has quit IRC | 21:08 | |
fungi | unless there's some non-obvious hidden dependency for some reason | 21:08 |
mtreinish | fungi: I thought clarkb told me there was, but I could just be making things up | 21:09 |
*** saper_ has joined #openstack-infra | 21:09 | |
*** saper has quit IRC | 21:10 | |
*** mika has quit IRC | 21:11 | |
*** dhellmann has quit IRC | 21:11 | |
*** mika has joined #openstack-infra | 21:12 | |
*** otherwiseguy has joined #openstack-infra | 21:12 | |
*** dims__ has joined #openstack-infra | 21:13 | |
*** dhellmann has joined #openstack-infra | 21:13 | |
*** hashar has quit IRC | 21:13 | |
*** dizquierdo has quit IRC | 21:13 | |
anteaya | pleia2: so are you hooked back up again, or did you have to wander the streets for wifi? | 21:14 |
pleia2 | anteaya: I have two internet connections (multihomed), so even when one is down I have a connection, it automatically fails over | 21:15 |
anteaya | ah | 21:15 |
anteaya | you're so smart | 21:15 |
*** dims_ has quit IRC | 21:15 | |
pleia2 | haha, just practical with how reliable comcast can be sometimes | 21:16 |
pleia2 | (un)reliable | 21:16 |
*** otherwiseguy has quit IRC | 21:16 | |
anteaya | yeah, sorry you have to use comcast | 21:16 |
anteaya | reed so it is a shame your email topics didn't get some air time at today's infra meeting | 21:17 |
clarkb | fungi: mtreinish it is necessary in some cases because it installs an sdist or does pip install -e | 21:17 |
anteaya | reed I had no idea what you were thinking otherwise I would have suggested it | 21:17 |
clarkb | I think there is a way to override that now though. so you can noop | 21:17 |
anteaya | reed and jeblair is away next week so that kind of will affect timing on this conversation as well | 21:17 |
fungi | clarkb: makes sense | 21:17 |
*** otherwiseguy has joined #openstack-infra | 21:19 | |
*** hogepodge has quit IRC | 21:20 | |
*** dkliban is now known as dkliban_afk | 21:21 | |
*** bcrochet is now known as bcrochet|g0ne | 21:22 | |
*** adalbas has quit IRC | 21:22 | |
sdague | clarkb: right, sdist and pip install -e not really something we want for devstack :) | 21:23 |
*** otherwiseguy has quit IRC | 21:23 | |
*** ihrachyshka has quit IRC | 21:24 | |
*** dhellman_ has quit IRC | 21:26 | |
openstackgerrit | James E. Blair proposed a change to openstack-infra/nodepool: Check the returned image status https://review.openstack.org/97564 | 21:29 |
openstackgerrit | James E. Blair proposed a change to openstack-infra/nodepool: Log task manager queue length https://review.openstack.org/97574 | 21:29 |
openstackgerrit | James E. Blair proposed a change to openstack-infra/nodepool: Log task durations https://review.openstack.org/97575 | 21:29 |
openstackgerrit | James E. Blair proposed a change to openstack-infra/nodepool: Prevent listserver tasks from piling up https://review.openstack.org/97570 | 21:29 |
jeblair | clarkb, fungi: want to re-review that? it's correctly based on master now, and i incorporated the fix in https://review.openstack.org/97570 (since it hadn't merged yet) | 21:30 |
fungi | at least zuul is still a while from caring if we restart nodepool and blow away every single node too | 21:30 |
jeblair | meanwhile, the graphs look very very strange; has anyone looked into that yet, or should i do that while you review? | 21:31 |
jeblair | fungi, clarkb: ^ | 21:31 |
denis_makogon | hello, guys, could someone explain why Zuul jobs are taking so long ? | 21:31 |
fungi | jeblair: which graphs? | 21:31 |
*** smarcet has quit IRC | 21:31 | |
fungi | jeblair: we're just not running anything because zuul's been taking so long processing events and results | 21:31 |
*** marcoemorais has quit IRC | 21:32 | |
fungi | the entire node pool has settled to an idle state | 21:32 |
jeblair | fungi: yeah, those :) | 21:32 |
*** marcoemorais has joined #openstack-infra | 21:32 | |
jeblair | fungi: so, basically, we worked through the entire backlog? | 21:32 |
jeblair | and now we're working through the results of that? | 21:32 |
fungi | apparently | 21:32 |
fungi | and have been for... hours | 21:33 |
*** marcoemorais has quit IRC | 21:33 | |
*** marcoemorais has joined #openstack-infra | 21:33 | |
fungi | denis_makogon: a combination of external factors (setuptools failure, changes on our cloud providers) have caused a fairly sizeable backlog | 21:34 |
fungi | s/failure/broken releases/ | 21:34 |
*** marcoemorais has quit IRC | 21:34 | |
*** weshay has quit IRC | 21:34 | |
*** marcoemorais has joined #openstack-infra | 21:35 | |
*** mfer has quit IRC | 21:35 | |
denis_makogon | fungi, thanks Jeremy, hope everything will be ok soon :) i was really shocked when i saw my patch took like 8h 33m to verify | 21:35 |
fungi | jeblair: those lgtm | 21:35 |
clarkb | jeblair: fungi yes that is why the graphs are funny | 21:37 |
fungi | since 97570 seems to have needed some heavier-handed merge conflict resolution, i'll leave that one for clarkb to re-review | 21:37 |
clarkb | I am looking now | 21:37 |
jeblair | fungi: it did? | 21:37 |
*** marcoemorais1 has joined #openstack-infra | 21:38 | |
fungi | oh, actually i think i spent too much time looking at the patch differential on that file and not enough looking at the revision | 21:39 |
*** marcoemorais has quit IRC | 21:39 | |
fungi | yeah, looks identical to me. i guess gerrit was merely spooked in context lines | 21:39 |
jeblair | fungi: ah, yeah, remember ps1 was based on something older than master; so you're seeing my rebase in the diff | 21:39 |
fungi | right, that file ended up with a lot of availability zone related churn | 21:40 |
jeblair | fungi: the main diff is the cleanupservers function -- that's the fix for the restart issue | 21:40 |
*** marun is now known as marun_afk | 21:40 | |
*** marcoemorais has joined #openstack-infra | 21:42 | |
*** dims__ has quit IRC | 21:42 | |
*** arnaud has quit IRC | 21:42 | |
*** marcoemorais has quit IRC | 21:42 | |
clarkb | reviewed | 21:42 |
clarkb | ya you are basically checking cache validity | 21:42 |
*** jooools has quit IRC | 21:43 | |
fungi | also remember that puppet is still disabled on nodepool.o.o with https://review.openstack.org/97545 manually applied too | 21:44 |
*** marcoemorais has joined #openstack-infra | 21:44 | |
fungi | though as jeblair points out shaving a fraction of a second off 10+ is relatively pointless | 21:44 |
*** marcoemorais has quit IRC | 21:44 | |
*** marcoemorais has joined #openstack-infra | 21:45 | |
fungi | anyway, i'm being pulled away to dinner, but will return in a bit | 21:45 |
clarkb | jeblair: re zuul event processing | 21:45 |
clarkb | is this an indication that event processing is worse than O(n)? | 21:45 |
*** marcoemorais1 has quit IRC | 21:45 | |
clarkb | it seems to slow quite a bit at that queue size | 21:45 |
clarkb | whereas processing for ~300 events last night wasn't too bad | 21:46 |
bodepd | sorry I missed the meeting where my concern was addressed. I'm reviewing logs now | 21:46 |
bodepd | anteaya: thanks for adding it to the agenda. It's great to be heard :) | 21:46 |
jeblair | clarkb: i think it's probably mostly pipeline size | 21:47 |
*** virmitio has quit IRC | 21:47 | |
clarkb | jeblair: the check pipeline was roughly the same size last night | 21:47 |
anteaya | bodepd: well you were just doing as asked, which I appreciate | 21:48 |
clarkb | (but you know the zuul internals betyone so I trust your assessment :) ) | 21:48 |
anteaya | bodepd: sorry we weren't clearer when reviewing your patch series | 21:48 |
bodepd | anteaya: no worries, I also feel that I might have fallen a bit short as a contributor. I only started reading the awful documentation on gerrit-acls after the second -1 | 21:49 |
*** otherwiseguy has joined #openstack-infra | 21:49 | |
anteaya | bodepd: do you have any suggestions about how I can improve the awful docs on gerrit-acls? | 21:50 |
bodepd | anteaya: I was trying to just grab an approved template and copy it as opposed to trying to understand my requirements and craft an acl | 21:50 |
anteaya | bodepd: /me nods | 21:50 |
Alex_Gaynor | jeblair: Is there any chance the web thread is stealing CPU time from the "doing stuff" thread, since the body size is so huge? | 21:50 |
bodepd | anteaya: sorry if the maintainers are in the room :) | 21:50 |
anteaya | bodepd: *shrug* | 21:50 |
bodepd | anteaya: the main issue was that it didn't have enough examples | 21:50 |
anteaya | bodepd: great, which doc were you following? | 21:51 |
clarkb | Alex_Gaynor: I don't think so. they run in different threads without locks iirc | 21:51 |
anteaya | bodepd: I'll offer more/better examples | 21:51 |
bodepd | http://gerrit.googlecode.com/svn/documentation/2.1.2/access-control.html | 21:51 |
jeblair | clarkb: well, the only thing that relates to result queue size is a Queue.Queue.get(), which is not very efficient, but i think the pipeline processing itself is taking a couple of seconds | 21:51 |
clarkb | Alex_Gaynor: and python doesn't hold the GIL when writing to sockets iirc | 21:51 |
*** _nadya_ has quit IRC | 21:51 | |
anteaya | bodepd: oh, yeah I have no way of altering that doc | 21:51 |
bodepd | anteaya: I also didn't see any reference to [project] | 21:51 |
jeblair | clarkb: whereas the Queue.get() is taking 0.001 seconds (both sides of that are logged) | 21:51 |
Alex_Gaynor | clarkb: that's correct, I was actually thinking about the "building the JSON data structure" time | 21:51 |
anteaya | did you find it via a search? | 21:51 |
clarkb | Alex_Gaynor: hrm that could be | 21:52 |
bodepd | anteaya: is there a better internal doc I should have referenced? | 21:52 |
anteaya | bodepd: we have docs for infra but I don't know if people find them | 21:52 |
* Alex_Gaynor looked into running zuul under pypy, but one of the deps is a blocker | 21:52 | |
anteaya | bodepd: see that is the thing if you can't find it, any improvements I make don't solve the issue | 21:52 |
dstufft | Alex_Gaynor wants to convert everyone to the pypy relgion | 21:52 |
*** masayukig has quit IRC | 21:52 | |
dstufft | religon | 21:52 |
clarkb | Alex_Gaynor: which dependency? | 21:52 |
dstufft | w/e | 21:52 |
dstufft | spelling is hard | 21:52 |
anteaya | folks sometimes come into channel and ask, but mostly they cargo cult and then we edit | 21:53 |
Alex_Gaynor | clarkb: smmap, some transitive requirement of git | 21:53 |
jeblair | Alex_Gaynor, clarkb: yeah, creating the json and processing the queue are both heavy python CPU tasks, so we're at a bit of a disadvantage there | 21:53 |
anteaya | if folks can't find the docs we have, that makes it hard | 21:53 |
Alex_Gaynor | jeblair: what do you think about putting it behind a caching proxy like haproxy? | 21:53 |
jeblair | Alex_Gaynor: that's a bummer, i think zuul would benefit from pypy | 21:53 |
*** mriedem has quit IRC | 21:53 | |
bodepd | anteaya: I found stackforge docs, still looking... | 21:53 |
anteaya | bodepd: http://ci.openstack.org/stackforge.html#create-a-new-stackforge-project-with-puppet | 21:53 |
*** miqui has joined #openstack-infra | 21:54 | |
jeblair | Alex_Gaynor: i think it's a great idea; i even left comments on hashar's proxy header changes suggesting that we make sure we support it | 21:54 |
*** markwash has quit IRC | 21:54 | |
anteaya | bodepd: what did you search when you found the google documentation? | 21:54 |
anteaya | bodepd: do you remember? | 21:54 |
bodepd | gerrit acl | 21:54 |
anteaya | kk | 21:54 |
bodepd | if I do a search for gerrit acl openstack infra, I get the right doc | 21:54 |
*** nati_uen_ has quit IRC | 21:54 | |
clarkb | jeblair: we could rewrite zuul in go | 21:55 |
anteaya | https://duckduckgo.com/?q=openstack+gerrit+acl&t=canonical | 21:55 |
anteaya | do you? | 21:55 |
mordred_phone | clarkb: or c++ | 21:55 |
jeblair | clarkb: heh, we could probably also do that with subprocessing module and an internal cache | 21:55 |
jeblair | i mean multiprocessing | 21:55 |
bodepd | honestly, it's just a case of not being entirely sure to look (which is easy to solve with experience), but I usually just come in and poke infra once every few months | 21:55 |
anteaya | bodepd: fair enough | 21:55 |
anteaya | I need to sleep on this | 21:55 |
anteaya | and then offer some form of patch | 21:56 |
anteaya | that will at the very least make me feel like I did something constructive | 21:56 |
jeblair | Alex_Gaynor, clarkb: however, i think we may only have a handful of status watchers at the moment | 21:56 |
jeblair | we're doing something like 1 request per second there | 21:56 |
Alex_Gaynor | that's not terribly many | 21:57 |
openstackgerrit | Aaron Greengrass proposed a change to openstack-infra/config: Extend graphite storage schema configuration https://review.openstack.org/97635 | 21:57 |
*** masayukig has joined #openstack-infra | 21:57 | |
*** mrda-away is now known as mrda | 21:58 | |
clarkb | jeblair: how much of the pipeline processing is in calls to gerrit? | 21:58 |
clarkb | jeblair: is that any of it? | 21:58 |
jeblair | clarkb: i think that's all out of band | 21:59 |
*** signed8bit has quit IRC | 22:00 | |
clarkb | jeblair: there are a ton of describe jobs according to the debug log | 22:01 |
jeblair | clarkb: i think a lot of time may be being spent in submitting job description jobs for jenkins to gearman | 22:01 |
clarkb | ya | 22:01 |
jeblair | i think that is within the pipeline processing, so that could be what's taking so long | 22:01 |
jeblair | they are being dispatched quickly though, so i don't think it's the case that the gearman queue is slow | 22:02 |
clarkb | we can probably stop doing that? seems like that info is less important now | 22:02 |
*** marcoemorais has quit IRC | 22:02 | |
*** mwagner_lap has quit IRC | 22:02 | |
*** marcoemorais has joined #openstack-infra | 22:02 | |
jeblair | we need to make the gearman server log level run-time adjustable | 22:02 |
*** nati_ueno has joined #openstack-infra | 22:02 | |
*** marcoemorais has quit IRC | 22:03 | |
*** sarob has quit IRC | 22:04 | |
*** marcoemorais has joined #openstack-infra | 22:04 | |
anteaya | the rain has stopped and I am going to take a walk | 22:04 |
*** sarob has joined #openstack-infra | 22:06 | |
*** denis_makogon has quit IRC | 22:06 | |
openstackgerrit | Joe Gordon proposed a change to openstack-dev/hacking: Update localization checks to understand separate catalogs https://review.openstack.org/97580 | 22:08 |
*** sarob has quit IRC | 22:10 | |
clarkb | jeblair: reading the logs we do appear to spend a good chunk of the time in the updateBuildDescription loops | 22:11 |
clarkb | 2014-06-03 22:10:02,549 to 2014-06-03 22:10:09,789 for one particular result event | 22:11 |
*** dims has joined #openstack-infra | 22:12 | |
*** alexpilotti has quit IRC | 22:12 | |
jeblair | clarkb: yup; i'm trying to determine if it's submitting the job or formatting the text that's slow | 22:13 |
clarkb | jeblair: ok | 22:13 |
clarkb | let me know if you want me to dig at anything | 22:13 |
clarkb | I do notice that the job submissions aren't always slow | 22:13 |
clarkb | which may indicate that something else is stealing cpu | 22:13 |
*** thedodd has quit IRC | 22:14 | |
*** doug-fish has left #openstack-infra | 22:15 | |
JayF | zuul.openstack.org doesn't appear to be loading, and a new patchset isn't getting a jenkins job scheduled, it appears. Not sure if this is related to the ongoing conversations here or a new issue :/ | 22:16 |
JayF | https://review.openstack.org/#/c/97631/1 is the patch that hasn't had a jenkins comment on it about jobs running (which I'm used to seeing near-instantaneously) | 22:16 |
mattoliverau | Morning | 22:17 |
*** _nadya_ has joined #openstack-infra | 22:17 | |
*** unicell1 has quit IRC | 22:18 | |
clarkb | JayF: yes it is related | 22:18 |
*** gondoi is now known as zz_gondoi | 22:18 | |
clarkb | jeblair: I wonder if we should have a profiling switch built into zuul that we can turn on and when turned off have it spit to a file | 22:18 |
mordred_phone | Alex_Gaynor, clarkb re pypy ... the merge workers need git, but the central zuul server doesn't, right? so could perhaps run zuuld in pypy | 22:19 |
openstackgerrit | Morgan Fainberg proposed a change to openstack-infra/config: Make apache-services tempest check expirimental https://review.openstack.org/97638 | 22:19 |
clarkb | mordred_phone: good point that is a possibility | 22:19 |
jeblair | clarkb: yeah, i think the gearman submission is taking ~ 0.01 seconds, so it's the other part of the describe loop that's occasionally taking ~1 second | 22:19 |
clarkb | jeblair: yappi claims to be thread happy | 22:19 |
openstackgerrit | Morgan Fainberg proposed a change to openstack-infra/config: Make apache-services tempest check experimental https://review.openstack.org/97638 | 22:19 |
mordred_phone | ++ to toggleable profiling | 22:19 |
jeblair | (at least, according to tcpdump) | 22:20 |
clarkb | I may give zuul profiling a go | 22:20 |
*** SumitNaiksatam has joined #openstack-infra | 22:20 | |
jeblair | (there's a chance that the gear client is taking that long) | 22:20 |
jeblair | (but i doubt it) | 22:20 |
*** bknudson has quit IRC | 22:21 | |
*** ccit has quit IRC | 22:23 | |
*** nati_ueno has quit IRC | 22:24 | |
*** aconrad has quit IRC | 22:24 | |
*** bauzas has quit IRC | 22:25 | |
*** sarob has joined #openstack-infra | 22:29 | |
*** nati_ueno has joined #openstack-infra | 22:32 | |
clarkb | I think I have a quick patch that may work. going to do some testing to see if it does what I think it will do | 22:34 |
mattoliverau | jeblair, clarkb: sounds like you're having fun with zuul and slowness again. If there is anything I can do to help, even reading through tcpdump out etc, then let me know, my brain is all yours :) | 22:36 |
clarkb | mattoliverau: thanks. I think jeblair is trying to narrow it down with the available tools now and I am working on maybe adding more tools to the mix for future debugging | 22:36 |
jeblair | clarkb, mattoliverau: i think supporting a profiler is good, though i also think that being able to change the log levels would be very helpful too -- i'm fairly certain i could narrow down exactly what was happening if i had the gear debug logs | 22:37 |
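
A minimal sketch of the run-time adjustable log level jeblair asks for, assuming the level is flipped by a Unix signal. The logger name `gear-demo`, the helper names and the choice of SIGUSR1 are illustrative assumptions, not taken from zuul or gear.

```python
import logging
import signal


def make_level_toggler(logger_name, quiet=logging.INFO, verbose=logging.DEBUG):
    """Return a signal-handler-compatible callable that flips a logger's level."""
    logger = logging.getLogger(logger_name)
    logger.setLevel(quiet)

    def toggle(signum=None, frame=None):
        # Switch to verbose unless we are already verbose, then switch back.
        new_level = quiet if logger.level == verbose else verbose
        logger.setLevel(new_level)
        return new_level

    return toggle


def install(toggle, signum=None):
    """Hypothetical wiring: flip the level on SIGUSR1 (Unix only, main thread)."""
    signal.signal(signal.SIGUSR1 if signum is None else signum, toggle)
```

With something like this installed, `kill -USR1 <pid>` would turn debug logging on and a second signal would turn it off again, without restarting the daemon.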
jeblair | clarkb: so yesterday, a run of set_description calls took about 0.5 seconds for all of them | 22:39 |
*** aconrad has joined #openstack-infra | 22:39 | |
jeblair | (eg, all of the calls related to a single change) | 22:39 |
jeblair | whereas now each is taking about 1 second, for 6-10 seconds for the whole set | 22:40 |
*** jgrimm has quit IRC | 22:44 | |
*** alexandra_ is now known as asettle | 22:45 | |
*** andreykurilin has joined #openstack-infra | 22:46 | |
*** marcoemorais has quit IRC | 22:46 | |
jeblair | clarkb: maybe we could test Alex_Gaynor's hypothesis by stopping apache? | 22:46 |
*** marcoemorais has joined #openstack-infra | 22:46 | |
jeblair | clarkb: since our zuul doesn't run a merger, status.json is the only thing it's serving | 22:47 |
*** nati_uen_ has joined #openstack-infra | 22:47 | |
jeblair | Alex_Gaynor: i think you win :) | 22:48 |
mattoliverau | It's a shame tcpwrappers don't work in python, otherwise you could use them to increase the logging verbosity without restarting the application, kind of like a profiling switch. | 22:48 |
jeblair | Alex_Gaynor, clarkb: the set_description jobs are back to < 1s for the whole set | 22:48 |
mordred_phone | oh wow | 22:48 |
*** moted has joined #openstack-infra | 22:49 | |
clarkb | jeblair: nice | 22:49 |
clarkb | so maybe we leave it like that for a little bit? | 22:49 |
*** moted has quit IRC | 22:49 | |
mattoliverau | Was that turning apache off or restarting it? I wonder what the apache logs show. | 22:50 |
clarkb | I think my silly yappi patch is working | 22:50 |
jeblair | clarkb: i did some ad-hoc statistical profiling (i did 2 thread dumps) and noticed that both times we were waiting for a gearman packet in one thread, and we had another thread gzipping the output | 22:50 |
bodepd | for the alpha check in zuul layout, what is the precedence between -,_,[a-z] ? | 22:50 |
jeblair | clarkb: tbh, i don't quite understand why we're asking python to gzip things instead of letting apache do it... | 22:50 |
*** nati_ueno has quit IRC | 22:50 | |
mordred_phone | does zuul recalculate the json on each request? or does it memoize it somewhere? | 22:50 |
bodepd | it looks like it is: -,[a-z],_ | 22:50 |
mordred_phone | jeblair: ++ | 22:50 |
clarkb | jeblair: didn't that come in a patch? | 22:50 |
clarkb | and I agree | 22:50 |
jeblair | mordred_phone: recalc; i think we should cache it internally, not gzip it, and let a fronting webserver cache it more or gzip it if it wants | 22:51 |
mordred_phone | +100 | 22:51 |
jeblair | btw, about a minute ago we hit the bottom of the results queue and are now working through the event queue | 22:51 |
openstackgerrit | Dan Bode proposed a change to openstack-infra/config: Add puppet-openstack project puppet-openstacklib https://review.openstack.org/97357 | 22:51 |
jeblair | we're reporting a change about every second now | 22:52 |
Alex_Gaynor | oh boy, we also got a bunch of new builds | 22:54 |
jeblair | "trigger_event_queue": {"length": 73}, "result_event_queue": {"length": 3} | 22:55 |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/zuul: Add toggleable yappi profiling to zuul https://review.openstack.org/97641 | 22:55 |
clarkb | jeblair: ^ that is my ugly hack | 22:55 |
harlowja | goooo zuuul, u can do it! | 22:55 |
clarkb | running a local zuul server that doesn't do much indicates that it works though | 22:55 |
jeblair | status.json has shed 1.1MB | 22:55 |
jeblair | "trigger_event_queue": {"length": 123}, "result_event_queue": {"length": 7} | 22:56 |
clarkb | jeblair: so did you just restart apache? | 22:56 |
jeblair | (the increase is probably zuul's own comments) | 22:56 |
jeblair | clarkb: no, i _stopped_ apache | 22:57 |
clarkb | jeblair: I still get a status though | 22:57 |
jeblair | clarkb: you're probably just looking at old data | 22:57 |
jeblair | oh, no, i'm wrong | 22:57 |
jeblair | did puppet restart apache? | 22:58 |
clarkb | jeblair: http://paste.openstack.org/show/82689/ is what the profiling data generated by 97641 looks like | 22:58 |
clarkb | jeblair: it may have | 22:58 |
*** jp_at_hp has joined #openstack-infra | 22:58 | |
clarkb | https://code.google.com/p/yappi/wiki/usageyappi_v082 explains the output | 22:58 |
*** praneshp has quit IRC | 22:59 | |
jeblair | clarkb: cool | 22:59 |
jeblair | status.json is down to 4.1M (from 6.5M) | 22:59 |
clarkb | note I picked yappi because it deals with threads for me | 22:59 |
morganfainberg | jeblair, would you apply CC headers if passed through a webserver doing the cache/zip? | 22:59 |
*** msabramo has left #openstack-infra | 23:00 | |
jeblair | morganfainberg: cc headers? | 23:00 |
morganfainberg | jeblair, cache-control | 23:00 |
clarkb | jeblair: we can have it use a wall clock if you prefer instead of cpu clock | 23:00 |
*** eharney has quit IRC | 23:01 | |
*** bcrochet|g0ne is now known as bcrochet | 23:01 | |
morganfainberg | jeblair, or you mean as like mod_wsgi [or unicorns!] serving up the data and letting the webserver be smart as it smarts can be | 23:01 |
morganfainberg | or mod_whatever_would_run_zuul | 23:01 |
*** esker has joined #openstack-infra | 23:02 | |
Alex_Gaynor | jeblair: FWIW since apache went off zuul stopped working for me because (as far as I can tell) Access-Control-Allow-Origin stopped being included | 23:02 |
*** achuprin has quit IRC | 23:02 | |
openstackgerrit | A change was merged to openstack-infra/devstack-gate: Copy devstacklog.txt https://review.openstack.org/97251 | 23:02 |
Alex_Gaynor | since status is making requests to zuul. | 23:03 |
*** jhesketh_ has joined #openstack-infra | 23:03 | |
*** achuprin has joined #openstack-infra | 23:04 | |
*** jhesketh_ is now known as jhesketh | 23:04 | |
jhesketh | Morning | 23:04 |
jeblair | morganfainberg: i lean toward letting the proxy be smart, especially it should be doing the compression, if needed, but i also think it can adopt cache policies; that said, i think zuul could probably do some really basic caching internally (like maybe 1s) | 23:04 |
morganfainberg | jeblair, ++ | 23:04 |
*** andreykurilin has quit IRC | 23:04 | |
jeblair | Alex_Gaynor: the status.json is proxied through apache, so it's not being returned at all right now | 23:05 |
*** andreykurilin has joined #openstack-infra | 23:05 | |
Alex_Gaynor | jeblair: AH, that's what you meant by testing my hypothesis :-) | 23:05 |
jeblair | yup | 23:05 |
*** changbl has quit IRC | 23:05 | |
jeblair | status.json is now 1.5M | 23:06 |
jeblair | "trigger_event_queue": {"length": 154}, "result_event_queue": {"length": 45} | 23:06 |
bodepd | yay. builds are way more responsive today | 23:06 |
mordred | jeblair: woot | 23:07 |
jeblair | "trigger_event_queue": {"length": 18}, "result_event_queue": {"length": 144} | 23:07 |
* mordred in my hotel room, so now that you've got everything fixed, I can be of use | 23:07 | |
jeblair | well, i think we've identified a substantial change that we should make soon, before the next time we have a big status page | 23:07 |
jeblair | i won't be able to hack on that before i leave :( | 23:08 |
jeblair | the status page is now increasing in size again because of all the queued changes | 23:08 |
clarkb | jeblair: if you can summarize what you want in one go I am happy to take a stab at it | 23:08 |
clarkb | jeblair: actually let me see if I can regurgitate | 23:08 |
clarkb | jeblair: stop gzipping status in zuul. cache status in zuul (for ~1s). Configure apache to cache and gzip status.json in front of zuul | 23:09 |
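
The "cache status in zuul (for ~1s)" step could look roughly like the sketch below. It is not zuul's code: `StatusCache` and the `build_status` callable are hypothetical names, and the body is deliberately left uncompressed so a fronting web server can gzip it.

```python
import json
import threading
import time


class StatusCache:
    """Rebuild the serialized status at most once per `ttl` seconds."""

    def __init__(self, build_status, ttl=1.0, clock=time.monotonic):
        self._build_status = build_status  # hypothetical callable returning a dict
        self._ttl = ttl
        self._clock = clock
        self._lock = threading.Lock()
        self._body = None
        self._expires = 0.0

    def get(self):
        # The lock keeps concurrent web requests from each rebuilding the JSON.
        with self._lock:
            now = self._clock()
            if self._body is None or now >= self._expires:
                self._body = json.dumps(self._build_status()).encode("utf-8")
                self._expires = now + self._ttl
            return self._body
```

At the roughly one request per second mentioned earlier, a 1s TTL only helps with bursts; the larger win would come from the proxy cache and from not gzipping in Python.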
*** andreykurilin has quit IRC | 23:09 | |
*** esker has quit IRC | 23:10 | |
clarkb | jeblair: mordred re profiling do you think it should clear stats on stop after it prints what it collected? or should we be cumulative? | 23:10 |
jeblair | clarkb: i think clear? | 23:11 |
clarkb | also maybe this should be a zuul rpc command instead? the signal thing was a quick hack since we already had that data | 23:11 |
mordred | clear | 23:11 |
clarkb | er s/data/functionality for thread stacks/ | 23:11 |
jeblair | clarkb: oh... | 23:11 |
jeblair | though i think the current queue design doesn't favor that, we may want to either use signals or refactor it | 23:12 |
clarkb | oh right it may be lost | 23:12 |
clarkb | ya let's stick with the signal for now | 23:12 |
mordred | I think signals are fine | 23:12 |
jeblair | (if we had needed to do an rpc command, we would have had to wait for all the events to flush out first, which obviously was the problem) | 23:12 |
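
A rough sketch of the signal-toggled profiling agreed on above, with stats cleared on stop. It swaps in the standard library's `cProfile` for yappi, so unlike the actual patch it only profiles the thread that handles the signal. `ProfileToggle`, the `sink` callable and the choice of SIGUSR2 are assumptions for illustration.

```python
import cProfile
import io
import pstats
import signal
import sys


class ProfileToggle:
    """First call starts profiling; the next call stops, reports and clears."""

    def __init__(self, sink=None):
        # sink receives the text report; by default it is written to stderr.
        self._sink = sink if sink is not None else sys.stderr.write
        self._profiler = None

    def toggle(self, signum=None, frame=None):
        if self._profiler is None:
            self._profiler = cProfile.Profile()
            self._profiler.enable()
            return
        self._profiler.disable()
        out = io.StringIO()
        stats = pstats.Stats(self._profiler, stream=out)
        stats.sort_stats("cumulative").print_stats(20)
        self._profiler = None  # drop the collected stats so runs are not cumulative
        self._sink(out.getvalue())


def install(toggler):
    """Hypothetical wiring: toggle on SIGUSR2 (Unix only, main thread)."""
    signal.signal(signal.SIGUSR2, toggler.toggle)
```

Using a signal rather than an RPC command means the toggle does not have to wait behind a backed-up event queue.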
mordred | clarkb: I can also potentially help work on that - I do have an 8 hour plane flight tomorrow | 23:12 |
jeblair | clarkb: i agree with your summary and opened a bug: https://bugs.launchpad.net/zuul/+bug/1326170 | 23:14 |
uvirtbot | Launchpad bug 1326170 in zuul "Status page eats all the CPU when large" [Undecided,New] | 23:14 |
clarkb | mordred: I think the profiler bits are basically done | 23:14 |
clarkb | testing the clearing of stats now | 23:14 |
mordred | clarkb: yah. I more meant I could help with the status thing | 23:14 |
clarkb | oh sure | 23:15 |
jeblair | puppet has restarted apache again, so check out the status page while you can :) | 23:15 |
*** zzelle_ has quit IRC | 23:15 | |
jeblair | we're at check:256 gate:103 | 23:15 |
clarkb | mordred: there are two distinct pieces. do you want to hack on the zuul wsgi piece or the apache proxy? | 23:15 |
mordred | clarkb: I could do either one - whatever's helpful | 23:16 |
*** pfalleno1 has joined #openstack-infra | 23:16 | |
jeblair | clarkb, mordred: that's an ordered list in the bug, btw | 23:16 |
* mordred goes to look | 23:16 | |
*** jeremyb has joined #openstack-infra | 23:16 | |
*** primemin1sterp has joined #openstack-infra | 23:16 | |
*** JayF_ has joined #openstack-infra | 23:16 | |
*** JayF has quit IRC | 23:17 | |
*** zhiyan_ is now known as zhiyan | 23:17 | |
*** JayF_ is now known as JayF | 23:18 | |
clarkb | https://review.openstack.org/#/c/97641/2 has stats clearing now | 23:18 |
clarkb | so I think that is ready for review | 23:18 |
jeblair | Alex_Gaynor: thanks for figuring it out. i owe you a pickle. :) | 23:18 |
*** zhiyan is now known as zhiyan_ | 23:18 | |
Alex_Gaynor | jeblair: :D | 23:18 |
*** swes has joined #openstack-infra | 23:18 | |
*** mmaglana has quit IRC | 23:19 | |
jeblair | clarkb: should we make that optional? | 23:19 |
jeblair | clarkb: eg, a try_import | 23:19 |
*** EmilienM_ has joined #openstack-infra | 23:20 | |
*** jbryce_ has joined #openstack-infra | 23:20 | |
*** spiffxp- has joined #openstack-infra | 23:20 | |
*** DinaBelova2 has joined #openstack-infra | 23:20 | |
*** alaski_ has joined #openstack-infra | 23:20 | |
*** radez` has joined #openstack-infra | 23:20 | |
*** salv-orlando has quit IRC | 23:20 | |
*** alaski has quit IRC | 23:20 | |
*** spiffxp has quit IRC | 23:20 | |
*** jbryce has quit IRC | 23:20 | |
*** radez has quit IRC | 23:20 | |
*** jbryce_ is now known as jbryce | 23:20 | |
*** primeministerp has quit IRC | 23:20 | |
*** pfallenop has quit IRC | 23:20 | |
*** andreaf has quit IRC | 23:20 | |
*** jhesketh has quit IRC | 23:20 | |
openstackgerrit | A change was merged to openstack-dev/cookiecutter: Added a doc build target to tox.ini https://review.openstack.org/92019 | 23:20 |
openstackgerrit | Dan Bode proposed a change to openstack-infra/config: Add puppet-openstack project puppet-openstacklib https://review.openstack.org/97357 | 23:20 |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/zuul: Add toggleable yappi profiling to zuul https://review.openstack.org/97641 | 23:20 |
*** jeremyb_ has quit IRC | 23:20 | |
*** ruhe has quit IRC | 23:20 | |
*** DinaBelova has quit IRC | 23:20 | |
*** EmilienM has quit IRC | 23:20 | |
*** DinaBelova2 is now known as DinaBelova | 23:20 | |
*** EmilienM_ is now known as EmilienM | 23:20 | |
*** rossella_s has quit IRC | 23:20 | |
clarkb | jeblair: up to you. the lib is mit licensed and installable from pypi so I don't think it is a huge problem to have as a requirement | 23:20 |
clarkb | jeblair: as compared to say psycopg2 | 23:20 |
*** ruhe2 has joined #openstack-infra | 23:20 | |
*** rossella has joined #openstack-infra | 23:20 | |
*** andreaf has joined #openstack-infra | 23:20 | |
*** rossella is now known as rossella_s | 23:20 | |
*** ruhe2 is now known as ruhe | 23:20 | |
jeblair | clarkb: it is slightly embarrassing to say that a profiler is a production requirement of the program. | 23:21 |
*** jhesketh_ has joined #openstack-infra | 23:21 | |
clarkb | I guess :) | 23:21 |
clarkb | I can make it optional | 23:21 |
*** jhesketh_ is now known as jhesketh | 23:21 | |
openstackgerrit | Devananda van der Veen proposed a change to stackforge/gertty: add alembic to requirements.txt https://review.openstack.org/97646 | 23:24 |
openstackgerrit | German Eichberger proposed a change to openstack-infra/gear: adds code to not block when using eventlet (fixed pep8) https://review.openstack.org/97533 | 23:24 |
clarkb | jeblair: it looks like statsd is option but listed in the requirements.txt. Should I leave yappi in requirements.txt? | 23:25 |
clarkb | *statsd is optional | 23:25 |
openstackgerrit | Joshua Hesketh proposed a change to openstack-infra/config: Add in new check-non-binding pipeline https://review.openstack.org/97411 | 23:26 |
jeblair | clarkb: i don't think so | 23:27 |
jeblair | clarkb: statsd is recommended for a prod deployment, the profiler not so much i think | 23:28 |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/zuul: Add toggleable yappi profiling to zuul https://review.openstack.org/97641 | 23:28 |
clarkb | jeblair: how is that | 23:28 |
clarkb | mordred: jeblair I am working on the zuul apache proxying now | 23:30 |
clarkb | I think the wsgi app change is actually pretty simple so am putting that off | 23:30 |
openstackgerrit | Dan Bode proposed a change to openstack-infra/config: Add puppet-openstack project puppet-openstack_extras https://review.openstack.org/97647 | 23:30 |
jesusaurus | Reference: | 23:32 |
*** timrc is now known as timrc-afk | 23:32 | |
* clarkb learns him a mod deflate | 23:32 | |
jesusaurus | so thats what was in my paste buffer... | 23:33 |
mordred | clarkb: I've got a possible thing - I didn't realize you were looking at it | 23:34 |
mordred | one sec... | 23:34 |
clarkb | mordred: oh even better | 23:34 |
clarkb | mordred: mostly I am learning a thing. AddOutputFilterByType looks like what we want | 23:34 |
sdague | clarkb: we use it in os-loganalyze if I can in any way be helpful | 23:35 |
jhesketh | Let me know if I can help btw guys | 23:35 |
clarkb | jhesketh: maybe you want to tackle the zuul change to stop doing gzip there? I think we can just rip out the Vary Accept-Encoding and gzip | 23:36 |
clarkb | pretty sure that is all that is required as apache should deal with the Vary Accept-Encoding and gzip for us | 23:36 |
*** wenlock has quit IRC | 23:36 | |
clarkb | or I can do it since I have read the code ... | 23:36 |
clarkb | then you can sanity check me | 23:36 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Use Apache to compress status.json https://review.openstack.org/97648 | 23:37 |
mordred | clarkb: ^^ how does that look? | 23:37 |
jhesketh | clarkb: either or, if it's a simple change you could probably do it quicker since you're at the code | 23:37 |
jhesketh | otherwise I can poke | 23:37 |
clarkb | jhesketh: ok let me do it | 23:37 |
jeblair | okay, i have to go make sure that i have everything i need to spend a week in the wilderness, which merits some dedicated attention. | 23:38 |
mordred | ++ | 23:38 |
clarkb | jeblair: have fun and yes make sure you are ready :) | 23:38 |
jeblair | thanks, and see you later! | 23:38 |
*** aconrad has quit IRC | 23:38 | |
jhesketh | Seeya jeblair :-) | 23:38 |
*** bknudson has joined #openstack-infra | 23:39 | |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/zuul: Stopping gzipping zuul status json in zuul https://review.openstack.org/97649 | 23:40 |
clarkb | jhesketh: ^ | 23:40 |
StevenK | Hmmm. Where are post job logs kept? | 23:40 |
clarkb | jeblair: it looks like apache is still running though | 23:40 |
clarkb | StevenK: logs.openstack.org/firsttwocharofsha1/sha1 | 23:40 |
*** marcoemorais has quit IRC | 23:40 | |
clarkb | let me make the commit message better | 23:40 |
zaro | jhesketh: i tried building turbo hipster. didn't work for me. i assume that it's no good on mac? | 23:40 |
*** marcoemorais has joined #openstack-infra | 23:40 | |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/zuul: Stop gzipping zuul status json in zuul https://review.openstack.org/97649 | 23:41 |
jhesketh | clarkb: Looks good to me | 23:41 |
jhesketh | zaro: when did you try? There was a bug with setuptools and gear that wasn't building a few days ago | 23:41 |
*** marcoemorais has quit IRC | 23:41 | |
mordred | jhesketh: what about mine? | 23:41 |
zaro | jeblair: today. | 23:41 |
clarkb | mordred: your change lgtm, just a bit weird to clump the deflate with the Aliases | 23:41 |
*** marcoemorais has joined #openstack-infra | 23:41 | |
clarkb | mordred: is that something we care enough about to change? | 23:42 |
openstackgerrit | A change was merged to openstack-infra/zuul: Use the full release string when showing version https://review.openstack.org/87948 | 23:42 |
*** timrc-afk is now known as timrc | 23:42 | |
clarkb | and of course I failed to put bug number in commit message | 23:42 |
clarkb | I can do this honest | 23:42 |
*** andreaf has quit IRC | 23:42 | |
StevenK | clarkb: Are you sure? http://logs.openstack.org/15/1504ae4 is a 404 | 23:42 |
clarkb | StevenK: full sha1 | 23:42 |
StevenK | Oh | 23:42 |
mordred | clarkb: any reason to not +A yours? | 23:42 |
clarkb | mordred: want to add it to your commit message too and move the deflate line | 23:43 |
clarkb | mordred: only for bookkeeping let me update with bug | 23:43 |
mordred | clarkb: kk. lemme do that too | 23:43 |
jhesketh | mordred: Also looks good to me | 23:43 |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/zuul: Stop gzipping zuul status json in zuul https://review.openstack.org/97649 | 23:43 |
jhesketh | zaro: What error(s) did you get? I'm happy to spend some time helping you get it working :-) | 23:44 |
jhesketh | (ie fixing turbo-hipster if needed) | 23:44 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Use Apache to compress status.json https://review.openstack.org/97648 | 23:45 |
clarkb | mordred: will your change deflate data already deflated by zuul? because that will determine the order we apply these changes | 23:45 |
clarkb | if mod_deflate is smart and leaves it alone we can do yours first then restart zuul more casually | 23:46 |
clarkb | (though maybe we should just ram it in right now regardless) | 23:46 |
sdague | clarkb: I believe it's smart | 23:46 |
zaro | jhesketh: i'll probably need to rerun on my mac to see the problem again. have you ever run it on mac before? | 23:47 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Use Apache to compress status.json https://review.openstack.org/97648 | 23:47 |
* mordred fixed sdague's deflate suggestion | 23:47 | |
openstackgerrit | A change was merged to openstack-infra/config: Update elastic-recheck link on index.html https://review.openstack.org/97134 | 23:47 |
openstackgerrit | A change was merged to openstack-infra/config: Reduce nodepool's wait between calls in hpcloud https://review.openstack.org/97545 | 23:47 |
mordred | clarkb: I say ram it | 23:47 |
mordred | clarkb: apache is off anyway | 23:47 |
jhesketh | zaro: nope | 23:47 |
clarkb | mordred: no apache is on | 23:48 |
*** oomichi has joined #openstack-infra | 23:48 | |
clarkb | puppet keeps enabling it | 23:48 |
clarkb | mordred: once the zuul change merges we can capture zuul pipeline state, stop zuul, upgrade zuul, start zuul, replay the pipeline state | 23:50 |
clarkb | but ya I think having your change go in first is fine | 23:50 |
zaro | jhesketh: "AttributeError: 'module' object has no attribute 'poll'" it's the select.poll, i believe i was having the same issue with python gear. | 23:50 |
clarkb | +2 from me | 23:50 |
*** otherwiseguy has quit IRC | 23:51 | |
*** timrc is now known as timrc-afk | 23:51 | |
zaro | jhesketh: http://paste.openstack.org/show/82698/ | 23:51 |
*** hs634 has joined #openstack-infra | 23:52 | |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Cache status.json for 5 seconds in Apache https://review.openstack.org/97650 | 23:53 |
mordred | clarkb, sdague, jhesketh && | 23:53 |
*** dprince has quit IRC | 23:53 | |
mordred | gah | 23:53 |
mordred | ^^ | 23:53 |
clarkb | mordred: I am working on the 1s cache in zuul now too | 23:53 |
jhesketh | zaro: hmm, interesting.. I'd say that maybe poll isn't supported on macs https://docs.python.org/2/library/select.html#select.poll | 23:54 |
jhesketh | therefore turbo-hipster probably won't work on macs :-( | 23:54 |
*** saper_ is now known as saper | 23:55 | |
Alex_Gaynor | select.poll isn't supported on many (all?) versions of OS X due to some bug | 23:56 |
sdague | mordred: cool, though apache mem_cache config is something I've not used before | 23:56 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Cache status.json for 5 seconds in Apache https://review.openstack.org/97650 | 23:57 |
mordred | sdague: we could also do disk cache - but in this case since it's only a 2.5M thing, I mean, it shouldn't need the disk | 23:58 |
*** gokrokve has quit IRC | 23:58 | |
clarkb | ++ | 23:58 |
mordred | sdague, jhesketh: next question - does it cache the gzipped content? or does it gzip the cached content? | 23:58 |
*** timrc-afk is now known as timrc | 23:59 |
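
The optional-profiler idea from the discussion above (jeblair's "try_import" suggestion, so that yappi is not a hard production requirement) might look roughly like the sketch below. This is only an illustration under stated assumptions, not the code in change 97641: `try_import` and `ProfilerToggle` are hypothetical names, and the only library calls assumed are `yappi.start()`, `yappi.stop()` and `yappi.clear_stats()`. How the toggle gets wired to a signal handler is left out.

```python
import importlib


def try_import(name):
    """Return the named module, or None if it is not installed."""
    try:
        return importlib.import_module(name)
    except ImportError:
        return None


# Optional dependency: the program still runs when yappi is absent.
yappi = try_import("yappi")


class ProfilerToggle:
    """Hypothetical helper that flips profiling on and off per call."""

    def __init__(self, profiler=yappi):
        self.profiler = profiler
        self.running = False

    def toggle(self):
        if self.profiler is None:
            # Profiler not installed; do nothing rather than fail.
            return "unavailable"
        if not self.running:
            self.profiler.start()
            self.running = True
            return "started"
        self.profiler.stop()
        # Clear collected stats so the next run starts from zero.
        # A real handler would presumably dump the stats before this.
        self.profiler.clear_stats()
        self.running = False
        return "stopped"
```

Calling `ProfilerToggle().toggle()` from whatever handler is already in place would then start profiling on the first call and stop and clear it on the second, and would be a no-op on hosts without yappi installed.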