ianw | I've declared bankruptcy on scrollback today, so not sure I'm much help :) | 00:08 |
clarkb | ianw: uh tl;dr is funny networking in rackspace to the git mirror. Lots of connections being killed under haproxy both on the front and backends. So I filed a ticket after much debugging | 00:09 |
clarkb | ianw: networking-vpp clogged up the tubes again and has been asked to wait until dhellmann is done with release things before continuing | 00:09 |
ianw | clarkb: ahh, so that was the issue. jhesketh and I were looking yesterday afternoon, that weird situation where nothing seems to be wrong but something is :) | 00:10 |
clarkb | ianw: osic-cloud8 has a weird floating ip setup for our mirror so to start we are just going to use the mirror in cloud1 (by creating dns records for cloud8 that point at cloud1) | 00:10 |
clarkb | ianw: ya its a fun one whatever it is | 00:10 |
clarkb | nodepool is now running the shade fixes for dual stack networking | 00:10 |
jhesketh | clarkb: good to know, thanks for digging :-) | 00:11 |
clarkb | and we have enabled infracloud and internap mtl01 | 00:11 |
clarkb | I think at least some of these things got status logged | 00:11 |
clarkb | I am going to restart the nodepool builder now so that it can learn how to talk to osic-cloud8 and upload images there | 00:12 |
ianw | clarkb: cool, thanks for the update :) | 00:14 |
bkero | That didn't get fixed in the release? *sigh* | 00:28 |
*** Julien-zte has joined #openstack-infra | 00:29 | |
ianw | bkero: not sure if it's the same issue? this one went in https://git.kernel.org/cgit/utils/util-linux/util-linux.git/commit/disk-utils/sfdisk.c?id=14f644f386a1708483ed446e983c0976e3976a9d | 00:30 |
ianw | i'm really not sure how we're the only people who noticed | 00:30 |
*** baoli has quit IRC | 00:30 | |
bkero | There's probably a kludge in libguestfs that works around this | 00:31 |
*** jamesdenton has joined #openstack-infra | 00:31 | |
clarkb | ianw: it seems like we have a fairly consistent set of failure modes now. init + cloud-init/glean fail to network, growroot fails to growroot, random new firewall software ignores the rules I tell it, new package manager is cranky | 00:31 |
clarkb | rinse and repeat :) | 00:31 |
bkero | Looks to be an adjacent issue. Mine was that the ioctl for sfdisk was failing because loopback device | 00:31 |
openstackgerrit | Merged openstack-infra/system-config: Log gear at debug level on nodepoold https://review.openstack.org/362455 | 00:32 |
ianw | clarkb: yeah. on my todo list is to check that growroot worked in our setup scripts. the problem was that RAX worked, as there was just enough space. and when i was running experimental with just one node, i didn't get out to other providers | 00:32 |
ianw | but once things started getting heterogeneous ... | 00:33 |
*** signed8bit_Zzz is now known as signed8bit | 00:33 | |
ianw | but, pabelanger we should still see why out-of-disk led ansible to hang around for its full 3 hour timeout | 00:34 |
*** Hal1 has joined #openstack-infra | 00:34 | |
clarkb | ianw: if I am going to guess, it's doing a write that blocks despite not enough disk and just never returns? | 00:35 |
*** kzaitsev_mb has joined #openstack-infra | 00:35 | |
clarkb | hrm though that should ENOSPC | 00:35 |
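[editor's note] The expectation in the exchange above is that a write to a full filesystem fails with ENOSPC rather than blocking. A minimal Python sketch of how that error surfaces to a caller follows; the names `is_enospc` and `write_all` are hypothetical and are not taken from ansible or zuul.

```python
import errno
import os


def is_enospc(exc):
    """Return True if an OSError means the filesystem is out of space."""
    return exc.errno == errno.ENOSPC


def write_all(path, data):
    """Write data to path and force it to disk.

    Returns None on success, or a short reason string if the write failed
    because no space was left. Any other OSError is re-raised unchanged.
    """
    try:
        with open(path, "wb") as handle:
            handle.write(data)
            handle.flush()             # push Python's buffer to the kernel
            os.fsync(handle.fileno())  # ask the kernel to commit to disk
    except OSError as exc:
        if is_enospc(exc):
            return "no space left on device"
        raise
    return None
```

A caller that checks the return value gets a prompt, explicit failure instead of waiting for a job-level timeout, which is the behaviour clarkb expects in the line above.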
*** baoli has joined #openstack-infra | 00:35 | |
*** thorst has joined #openstack-infra | 00:35 | |
*** zhurong has joined #openstack-infra | 00:36 | |
*** pvaneck has quit IRC | 00:36 | |
*** tphummel has joined #openstack-infra | 00:37 | |
ianw | clarkb: it was something slightly weirder in that console.html showed everything finishing up, but there was an ansible copy process that got stuck. i'll have to dig out the logs, i should make a note for posterity | 00:37 |
*** kzaitsev_mb has quit IRC | 00:40 | |
ianw | ahh, looking at the logs, it's zuul_runner that's behaving odd in this case. that makes more sense, being our custom bit | 00:42 |
*** thcipriani is now known as thcipriani|afk | 00:43 | |
*** signed8bit has quit IRC | 00:44 | |
pabelanger | ianw: ya, I haven't had a chance to dig into fedora24 yet | 00:44 |
pabelanger | clarkb: cool, osic-cloud8 images already uploaded. We can work on launching servers tomorrow | 00:46 |
ianw | pabelanger: so i remember later -> http://paste.openstack.org/show/565462/ . maybe something in the bg causing ssh exit to hang? | 00:46 |
pabelanger | ianw: is it possible git clone is failing? | 00:47 |
pabelanger | ianw: because we had the same issue today in tripleo-test-cloud-rh1 | 00:47 |
pabelanger | ianw: not failing, hanging | 00:47 |
ianw | pabelanger: possibly ... http://logs.openstack.org/12/363212/1/check/gate-tempest-dsvm-platform-fedora24-nv/fc07025/logs/ the workspace-setup-new is 0 bytes. i'm guessing that means whatever output never got flushed to it, rather than it never ran | 00:48 |
ianw | so what's actually going on ... shrug? | 00:48 |
*** mtanino has quit IRC | 00:49 | |
*** ociuhandu has joined #openstack-infra | 00:49 | |
pabelanger | ianw: also, I don't think it is an SSH issue, because ansible async will poll the server every 10 seconds, and usually ansible will bark if the SSH connection fails | 00:49 |
pabelanger | ianw: ya, this looks like what I saw in tripleo-test-cloud-rh1 today, if you get into that node, I suspect you'll see a hung git clone process | 00:50 |
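[editor's note] The polling behaviour described above (check every 10 seconds, give up after the overall timeout) can be sketched generically as below. This is an illustration only, not Ansible's implementation; `poll_until_done` and its parameters are hypothetical, and the 3-hour default simply mirrors the timeout mentioned earlier in the log.

```python
import time


def poll_until_done(is_done, interval=10.0, timeout=3 * 60 * 60,
                    clock=time.monotonic, sleep=time.sleep):
    """Call is_done() every `interval` seconds.

    Returns True as soon as is_done() returns True, or False once
    `timeout` seconds have elapsed without success. `clock` and `sleep`
    are injectable so the loop can be exercised without real waiting.
    """
    deadline = clock() + timeout
    while True:
        if is_done():
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

If the remote command never finishes (for example a hung git clone) while the connection itself stays healthy, a loop of this shape keeps polling quietly until the deadline, which would match a job sitting idle for its full timeout.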
pabelanger | from devstack-gate | 00:50 |
*** gyee has quit IRC | 00:51 | |
*** vinaypotluri has quit IRC | 00:52 | |
*** Goneri has joined #openstack-infra | 00:54 | |
*** Sukhdev has quit IRC | 00:55 | |
*** sarob has quit IRC | 00:55 | |
ianw | mtreinish: if around, have some questions on https://review.openstack.org/#/c/234447/ | 01:01 |
*** spzala has quit IRC | 01:02 | |
ianw | particularly what devstack.subunit it's trying to pick up | 01:02 |
*** kzaitsev_mb has joined #openstack-infra | 01:06 | |
*** ociuhandu has quit IRC | 01:08 | |
*** priteau has joined #openstack-infra | 01:11 | |
*** Sukhdev has joined #openstack-infra | 01:12 | |
*** Sukhdev has quit IRC | 01:13 | |
*** priteau has quit IRC | 01:16 | |
openstackgerrit | Paul Belanger proposed openstack-infra/project-config: Revert "Revert "Disable rax-iad due to launch failure rate"" https://review.openstack.org/364012 | 01:16 |
pabelanger | ianw: ^ if you don't mind a +A, I forgot to disable that earlier today. We're still having issues in rax-iad | 01:17 |
pabelanger | I was able to reproduce the issue manually, so we can keep the region offline until we know the fix | 01:18 |
*** rossella_s has quit IRC | 01:18 | |
*** chlong has joined #openstack-infra | 01:18 | |
*** shashank_hegde has quit IRC | 01:18 | |
ianw | pabelanger: LGTM | 01:18 |
*** rossella_s has joined #openstack-infra | 01:19 | |
*** tqtran has quit IRC | 01:19 | |
*** aeng has quit IRC | 01:20 | |
pabelanger | ianw: thanks | 01:20 |
ianw | who is responsible for stackviz? | 01:20 |
pabelanger | ianw: I think timothyb89 | 01:26 |
*** yanyanhu has joined #openstack-infra | 01:27 | |
ianw | pabelanger / timothyb89: cool ... well i'm not sure what it thinks it's doing during grenade runs, but i'm pretty sure it's not doing it | 01:27 |
*** baoli_ has joined #openstack-infra | 01:28 | |
*** baoli has quit IRC | 01:28 | |
*** baoli has joined #openstack-infra | 01:30 | |
*** esp has quit IRC | 01:32 | |
*** baoli_ has quit IRC | 01:33 | |
*** salv-orlando has joined #openstack-infra | 01:34 | |
*** Benj_ has joined #openstack-infra | 01:34 | |
openstackgerrit | Merged openstack-infra/project-config: Revert "Revert "Disable rax-iad due to launch failure rate"" https://review.openstack.org/364012 | 01:36 |
*** changzhi has joined #openstack-infra | 01:37 | |
*** spzala has joined #openstack-infra | 01:38 | |
*** Benj_ has quit IRC | 01:41 | |
*** hockeynut has quit IRC | 01:43 | |
*** spzala has quit IRC | 01:43 | |
*** salv-orlando has quit IRC | 01:44 | |
*** spzala has joined #openstack-infra | 01:45 | |
*** sarob has joined #openstack-infra | 01:49 | |
*** esp has joined #openstack-infra | 01:58 | |
*** zshuo has joined #openstack-infra | 02:02 | |
*** Apoorva has quit IRC | 02:03 | |
*** changzhi has joined #openstack-infra | 02:04 | |
*** spzala has quit IRC | 02:05 | |
*** apetrich has quit IRC | 02:05 | |
*** esp has quit IRC | 02:05 | |
*** sdake has quit IRC | 02:06 | |
*** apetrich has joined #openstack-infra | 02:07 | |
*** esberglu has joined #openstack-infra | 02:09 | |
*** gildub has quit IRC | 02:10 | |
*** cody-somerville has joined #openstack-infra | 02:13 | |
*** yamamoto_ has joined #openstack-infra | 02:14 | |
*** hichihara has joined #openstack-infra | 02:14 | |
timothyb89 | ianw: what's the issue? | 02:17 |
ianw | timothyb89: have a look at 2016-08-31 07:32:32.699 in http://logs.openstack.org/26/363326/3/check/gate-grenade-dsvm-neutron-ubuntu-trusty/05856df/logs/devstack-gate-cleanup-host.txt | 02:17 |
ianw | timothyb89: i'm working on a refactor of bits of this anyway, it's incredibly hard to understand IMO | 02:18 |
*** tqtran has joined #openstack-infra | 02:18 | |
*** thorst has quit IRC | 02:18 | |
*** thorst has joined #openstack-infra | 02:19 | |
timothyb89 | ianw: ah, hmm. I thought I had the paths for grenade set correctly but I guess not :) | 02:20 |
*** nstolyarenko has joined #openstack-infra | 02:21 | |
timothyb89 | ianw: I'll make sure to take a look at the devstack-gate bits when I'm back in the office tomorrow | 02:21 |
*** tqtran has quit IRC | 02:22 | |
*** fguillot has quit IRC | 02:23 | |
*** Goneri has quit IRC | 02:35 | |
*** vhosakot has joined #openstack-infra | 02:37 | |
*** mriedem has quit IRC | 02:39 | |
*** jamielennox|away is now known as jamielennox | 02:41 | |
*** salv-orlando has joined #openstack-infra | 02:43 | |
*** aeng has joined #openstack-infra | 02:44 | |
*** armax has quit IRC | 02:45 | |
*** zz_dimtruck is now known as dimtruck | 02:47 | |
*** vhosakot has quit IRC | 02:48 | |
*** Sukhdev has joined #openstack-infra | 02:48 | |
*** gouthamr has quit IRC | 02:49 | |
*** amotoki has joined #openstack-infra | 02:51 | |
*** salv-orlando has joined #openstack-infra | 02:52 | |
*** salv-orlando has quit IRC | 02:57 | |
*** dimtruck is now known as zz_dimtruck | 03:02 | |
*** amotoki has quit IRC | 03:08 | |
*** armax has quit IRC | 03:08 | |
*** cody-somerville has joined #openstack-infra | 03:11 | |
*** docaedo has joined #openstack-infra | 03:11 | |
*** vinaypotluri has joined #openstack-infra | 03:14 | |
*** cody-somerville has quit IRC | 03:16 | |
*** yamahata has quit IRC | 03:17 | |
*** amotoki has joined #openstack-infra | 03:21 | |
*** tphummel has quit IRC | 03:23 | |
*** zz_dimtruck is now known as dimtruck | 03:23 | |
*** thorst has joined #openstack-infra | 03:26 | |
*** amotoki has quit IRC | 03:30 | |
amrith | did something just hiccup in zuul? | 03:32 |
amrith | I had a recheck running on a review https://review.openstack.org/#/c/363654/ | 03:32 |
amrith | and it seems to have vanished without a trace | 03:32 |
*** thorst has quit IRC | 03:33 | |
*** shashank_hegde has joined #openstack-infra | 03:33 | |
*** chem|off has quit IRC | 03:34 | |
clarkb | it's there... the recheck was after you asked? | 03:34 |
clarkb | I am confused | 03:34 |
clarkb | the reverify 3.5 hours ago reported. then you just rechecked and it's queued | 03:35 |
*** amotoki has joined #openstack-infra | 03:35 | |
clarkb | from what I can see it's all working as expected | 03:35 |
openstackgerrit | Ian Wienand proposed openstack-infra/devstack-gate: Fix devstack subunit output https://review.openstack.org/364045 | 03:35 |
openstackgerrit | Ian Wienand proposed openstack-infra/devstack-gate: [WIP] Refactor devstack log copying https://review.openstack.org/364046 | 03:35 |
*** vikrant has joined #openstack-infra | 03:36 | |
ianw | timothyb89: ^ i'm thinking something like this. rolling stackviz into the devstack processing part of 364046 would probably remove the confusion | 03:36 |
timothyb89 | ianw: oh, cool, that would be much better | 03:38 |
amrith | clarkb I see it now | 03:39 |
*** sarob has joined #openstack-infra | 03:39 | |
amrith | after I refreshed my screen a couple of times | 03:39 |
amrith | I just requested a recheck | 03:39 |
amrith | something weird | 03:39 |
amrith | for sure | 03:39 |
amrith | petr requested the reverify at 8:06 | 03:39 |
amrith | it failed at 9:50 | 03:39 |
clarkb | and that ran and reported fine | 03:39 |
amrith | but at 11:30 I saw nothing | 03:40 |
amrith | just the previous gate | 03:40 |
clarkb | then you did a recheck and it worked fine | 03:40 |
clarkb | you didn't comment until :33 | 03:40 |
amrith | and on zuul nothing | 03:40 |
amrith | correct | 03:40 |
amrith | I was refreshing a couple of times | 03:40 |
clarkb | yes and it was in zuul at that point | 03:40 |
amrith | on my browser | 03:40 |
clarkb | so worked fine | 03:40 |
clarkb | you were saying it wasn't queued before you commented | 03:40 |
amrith | from about 10:30 to 11:30, I saw nothing in review.openstack.org | 03:40 |
clarkb | I think maybe you just didn't get the comment to post in gerrit as quickly as you thought | 03:41 |
amrith | which is when I posted the question here on IRC :) | 03:41 |
amrith | well it is running again | 03:41 |
clarkb | what would there have been to see? | 03:41 |
amrith | that it failed at 9:50pm | 03:41 |
amrith | all I was seeing was the previous success | 03:42 |
amrith | so, check passed and it went to gate. and failed | 03:42 |
amrith | petr reverified | 03:42 |
amrith | then it failed in check | 03:42 |
amrith | that failure was at 9:50pm | 03:42 |
clarkb | yes and from what I see that's all recorded properly | 03:42 |
clarkb | then you rechecked and it restarted jobs like asked | 03:43 |
amrith | yes, except from about 10:30 to 11:30 I've been refreshing my screen, and checking zuul | 03:43 |
amrith | and saw nothing :) | 03:43 |
amrith | My guess is that after I posted the recheck | 03:43 |
amrith | something happened | 03:43 |
clarkb | I wasn't able to check that, but when I did it worked. any chance you have a proxy being overzealous with caching? | 03:43 |
amrith | not that I know of, I'm sitting at home | 03:44 |
amrith | comcast doesn't typically cache this stuff, I don't think | 03:44 |
openstackgerrit | Changcheng Intel proposed openstack-infra/jenkins-job-builder: [Don't Merge]update base_email_ext to adapt Email-ext plugin https://review.openstack.org/355139 | 03:44 |
amrith | I don't use a proxy at home for sure | 03:44 |
*** Srinu has joined #openstack-infra | 03:46 | |
amrith | hmmm | 03:46 |
clarkb | maybe you had toggle ci toggled? | 03:46 |
clarkb | that's actually probably the most likely cause | 03:46 |
amrith | toggle CI shouldn't impact the middle of the screen | 03:46 |
amrith | which shows the results | 03:47 |
amrith | it only shows the stuff in the history | 03:47 |
clarkb | that I don't know. it's a mess of hacky js to parse the gerrit | 03:47 |
clarkb | I tend to rely on the actual comments | 03:47 |
amrith | below the CR +2's, Verified and Workflow is the jenkins check and gate output | 03:47 |
amrith | and that didn't refresh | 03:47 |
amrith | I didn't even look at the history | 03:47 |
clarkb | also if you were looking at an old patchset that also affects the table | 03:47 |
*** akshai_ has quit IRC | 03:48 | |
clarkb | it won't update like you expect | 03:48 |
clarkb | iirc | 03:48 |
amrith | this section: http://picpaste.com/Capture-HJj23Ccv.JPG | 03:48 |
amrith | circled in the image | 03:49 |
amrith | oh, an f5 won't update that section? | 03:49 |
clarkb | if you are on an old patchset it gets weird I think | 03:49 |
amrith | that I did not know. maybe I should just navigate away from the review and back | 03:49 |
amrith | that could be | 03:49 |
amrith | I could, maybe have been on previous patch set | 03:50 |
clarkb | if you refresh on the latest patchset it should be fine, with a hard refresh at least | 03:50 |
amrith | don't recall | 03:50 |
amrith | good to know | 03:50 |
amrith | in future just navigate away and come back | 03:50 |
clarkb | I don't know enough about the js details to know if a soft refresh is enough | 03:50 |
amrith | it was 10:30, nothing much has worked today | 03:50 |
amrith | someone said something about mercury going retrograde and causing all the computrons to spin in the wrong direction | 03:50 |
Srinu | hi | 03:51 |
amrith | our stable branches (both) just died inexplicably in the past couple of days; just realized it | 03:51 |
Srinu | has anyone faced this issue? http://paste.openstack.org/show/565606/ | 03:51 |
Srinu | please help me. | 03:51 |
amrith | but, on the plus side, the helical inclined plane worked and pulled the cork out of the bottle just fine | 03:51 |
amrith | hi Srinu | 03:51 |
Srinu | amrith: hi. did you see my question? | 03:52 |
clarkb | Srinu: that log points you at the other log files for specifics; you will need to look at them to determine what is happening | 03:52 |
clarkb | do you have a link to the job logs? | 03:52 |
amrith | yes Srinu .. as clarkb says, the error messages are pointing you to the right log file | 03:52 |
amrith | what's the review #? | 03:52 |
*** salv-orlando has joined #openstack-infra | 03:53 | |
amrith | did he go away? | 03:53 |
Srinu | clarkb, amrith: while running the block storage test cases, the run is killed and then it starts copying logs | 03:54 |
amrith | was this in the gate/CI | 03:54 |
amrith | or on your local machine? | 03:54 |
amrith | looks like CI/gate to me | 03:54 |
Srinu | amrith: ci | 03:55 |
amrith | ok, what's the review # | 03:55 |
amrith | review.openstack.org/#/c/..... | 03:55 |
amrith | or as clarkb said, the link to the file where you got the stuff that you put in paste | 03:56 |
*** salv-orlando has quit IRC | 03:56 | |
amrith | clarkb, what tz are you in? | 03:56 |
clarkb | pst | 03:56 |
amrith | ah | 03:56 |
amrith | wondered if you were a night owl too; it is getting close to being tomorrow now. | 03:57 |
Srinu | amrith,clarkb: please check this. http://paste.openstack.org/show/565607/ | 03:57 |
* jlvillal thinks clarkb should be off work by now :) | 03:57 | |
amrith | Srinu, that won't help | 03:57 |
amrith | what's the review # | 03:57 |
amrith | or a link to the logs | 03:57 |
amrith | just a URL would do ... | 03:57 |
amrith | or is this a private CI? | 03:57 |
amrith | which we can't get to? | 03:58 |
clarkb | amrith screaming toddlers keeping me awake | 03:58 |
openstackgerrit | Ian Wienand proposed openstack-infra/devstack-gate: Update bashate to 0.5.0 https://review.openstack.org/236815 | 03:58 |
Srinu | amrith: it is not a patch. this error is coming from a (private) jenkins ci. | 03:58 |
EmilienM | hi, I'm waiting for this review to release tripleo newton-3: https://review.openstack.org/#/c/363897/ | 03:58 |
tristanC | ttx: can you please confirm the proposed schedule for upcoming election: http://docs-draft.openstack.org/53/335253/3/check/gate-election-docs-ubuntu-xenial/b8bd2a9//doc/build/html/ (rendered from https://review.openstack.org/#/c/335253/2/events.yaml) ? | 03:58 |
amrith | clarkb, it is quiet here, almost midnight | 03:58 |
EmilienM | if you're project-config core, please look at this patch when you can | 03:58 |
amrith | sorry Srinu can't tell from that error; something took too long, someone got angry and killed it. there are lots of books and movies with the same story. | 03:59 |
clarkb | Srinu: I think that means you timed out | 03:59 |
amrith | some test had a 600s timeout | 03:59 |
clarkb | Srinu: try increasing the timeout or make it run faster | 03:59 |
amrith | and your test didn't run in time ... | 03:59 |
amrith | game over | 03:59 |
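[editor's note] The failure discussed above is a test command being killed when it exceeds a fixed timeout (600s in this case). A small Python sketch of that pattern, using only the standard library, is shown below; `run_with_timeout` is a hypothetical helper and is not the mechanism Srinu's CI actually uses.

```python
import subprocess
import sys


def run_with_timeout(argv, timeout_seconds):
    """Run a command, killing it if it exceeds timeout_seconds.

    Returns (timed_out, returncode); returncode is None when the command
    was killed for running too long.
    """
    try:
        completed = subprocess.run(argv, timeout=timeout_seconds)
    except subprocess.TimeoutExpired:
        # subprocess.run has already killed and reaped the child here.
        return True, None
    return False, completed.returncode


if __name__ == "__main__":
    # Example: a trivial command under a 600 second limit.
    print(run_with_timeout([sys.executable, "-c", "pass"], 600))
```

As clarkb suggests, the two remedies are raising the limit passed in or making the command itself finish sooner.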
Srinu | amrith, clarkb: thank you | 03:59 |
*** rlandy has quit IRC | 03:59 | |
*** amotoki has quit IRC | 04:01 | |
*** Srinu has quit IRC | 04:02 | |
*** cody-somerville has joined #openstack-infra | 04:05 | |
*** spzala has joined #openstack-infra | 04:05 | |
*** jamielennox is now known as jamielennox|away | 04:05 | |
openstackgerrit | Merged openstack-infra/project-config: tripleo-ui: add missing jobs for release management https://review.openstack.org/363897 | 04:07 |
amrith | hmm, clarkb is there a simple way to look at the history of a particular job in the CI? project=openstack/trove-integration, job=gate-trove-functional-dsvm-mysql-mitaka. I looked in openstack-health (status.openstack.org/openstack-health) but it seems to only have data through 8/19. | 04:08 |
*** Sukhdev has quit IRC | 04:09 | |
*** mcarden has quit IRC | 04:09 | |
clarkb | amrith: the three places are graphite.openstack.org the health dashboard and elasticsearch | 04:09 |
*** changzhi has quit IRC | 04:09 | |
*** alexey_weyl has joined #openstack-infra | 04:09 | |
amrith | ok, thx. let me look at the other two | 04:09 |
*** alexey_weyl has left #openstack-infra | 04:09 | |
*** vikrant has quit IRC | 04:09 | |
*** alexey_weyl has joined #openstack-infra | 04:09 | |
alexey_weyl | Hi Guys, | 04:09 |
alexey_weyl | Please approve this change | 04:10 |
alexey_weyl | https://review.openstack.org/#/c/363905/ | 04:10 |
*** vikrant has joined #openstack-infra | 04:10 | |
*** spzala has quit IRC | 04:10 | |
*** yamahata has joined #openstack-infra | 04:12 | |
amrith | wow! this graphite is cool stuff | 04:13 |
*** mcarden has joined #openstack-infra | 04:13 | |
clarkb | amrith: there is also grafana.openstack.org that is a different frontend to the same data | 04:14 |
clarkb | we have a grafyaml config somewhere that you can write out configs for dashboards in | 04:14 |
amrith | wow, awesome | 04:15 |
amrith | didn't know I could do this ... | 04:15 |
amrith | does one have to sign in with lp credentials or some other? | 04:16 |
openstackgerrit | Merged openstack-infra/project-config: move tripleo scenario jobs to check pipeline, non-voting https://review.openstack.org/363629 | 04:16 |
*** amotoki has joined #openstack-infra | 04:16 | |
clarkb | no, it's all public data | 04:16 |
clarkb | read only | 04:16 |
openstackgerrit | Ian Wienand proposed openstack-infra/devstack-gate: Fix devstack subunit output https://review.openstack.org/364045 | 04:16 |
openstackgerrit | Ian Wienand proposed openstack-infra/devstack-gate: [WIP] Refactor devstack log copying https://review.openstack.org/364046 | 04:16 |
amrith | ok | 04:17 |
alexey_weyl | hello, | 04:18 |
alexey_weyl | Can you please check this change: | 04:18 |
alexey_weyl | https://review.openstack.org/#/c/363905/ | 04:18 |
amrith | g'night clarkb .. I just pushed a change to revert to the last point where stable passed, will see what tomorrow (crap: today) brings. | 04:19 |
*** alexey_weyl has quit IRC | 04:20 | |
*** cody-somerville has quit IRC | 04:20 | |
*** dimtruck is now known as zz_dimtruck | 04:22 | |
*** yamamoto_ has quit IRC | 04:23 | |
*** shashank_hegde has quit IRC | 04:26 | |
openstackgerrit | Merged openstack-infra/project-config: Vitrage tempests https://review.openstack.org/363905 | 04:27 |
openstackgerrit | Ian Wienand proposed openstack-infra/project-config: Run bashate test over devstack-gate too https://review.openstack.org/236819 | 04:28 |
*** thorst has joined #openstack-infra | 04:30 | |
*** links has joined #openstack-infra | 04:32 | |
*** asselin__ has joined #openstack-infra | 04:34 | |
*** krtaylor has quit IRC | 04:34 | |
*** rajinir has quit IRC | 04:35 | |
*** jraim has quit IRC | 04:36 | |
*** maximov has quit IRC | 04:36 | |
*** jraim has joined #openstack-infra | 04:36 | |
*** maximov has joined #openstack-infra | 04:37 | |
openstackgerrit | Merged openstack/diskimage-builder: Explain difference between two envvars https://review.openstack.org/345935 | 04:38 |
*** yamamoto_ has joined #openstack-infra | 04:39 | |
*** zz_ja has quit IRC | 04:39 | |
*** shashank_hegde has joined #openstack-infra | 04:39 | |
*** zz_dimtruck is now known as dimtruck | 04:42 | |
*** yamahata has quit IRC | 04:42 | |
*** dtroyer has joined #openstack-infra | 04:44 | |
*** zz_ja has joined #openstack-infra | 04:45 | |
*** esp has quit IRC | 04:45 | |
*** pt_15 has quit IRC | 04:46 | |
*** jbryce has quit IRC | 04:46 | |
*** cargonza has quit IRC | 05:04 | |
*** kun_huang has quit IRC | 05:04 | |
*** jamielennox|away is now known as jamielennox | 05:04 | |
*** ediardo has joined #openstack-infra | 05:04 | |
openstackgerrit | Ian Wienand proposed openstack-infra/devstack-gate: [WIP] Refactor devstack log copying https://review.openstack.org/364046 | 05:17 |
*** senk has quit IRC | 05:19 | |
*** nwkarsten has quit IRC | 05:21 | |
*** nwkarsten has joined #openstack-infra | 05:22 | |
*** roxanaghe has quit IRC | 05:26 | |
openstackgerrit | Andreas Jaeger proposed openstack-infra/project-config: Update api-jobs https://review.openstack.org/364076 | 05:27 |
*** shashank_hegde has quit IRC | 05:30 | |
*** yamamoto_ has joined #openstack-infra | 05:34 | |
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder: Add IMAGE_ELEMENT_YAML https://review.openstack.org/335265 | 05:36 |
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder: Making element overriding explicit https://review.openstack.org/334785 | 05:36 |
openstackgerrit | Ian Wienand proposed openstack/diskimage-builder: Convert pkg-map and svc-map copies to explicit variables https://review.openstack.org/335308 | 05:36 |
*** roxanaghe has quit IRC | 05:38 | |
*** asselin_ has joined #openstack-infra | 05:38 | |
*** asselin__ has quit IRC | 05:40 | |
AJaeger | ianw, jhesketh, could you review https://review.openstack.org/364076 as well, please? I need some debugging logs for a change... | 05:42 |
*** ihrachys has joined #openstack-infra | 05:42 | |
jhesketh | AJaeger: looking | 05:42 |
AJaeger | thanks | 05:42 |
*** thorst has quit IRC | 05:42 | |
*** ilyashakhat has joined #openstack-infra | 05:42 | |
*** zz_dimtruck is now known as dimtruck | 05:42 | |
*** cody-somerville has quit IRC | 05:45 | |
*** Sukhdev has joined #openstack-infra | 06:01 | |
*** igormarnat has joined #openstack-infra | 06:16 | |
*** kzaitsev_mb has joined #openstack-infra | 06:25 | |
rcarrillocruz | weeee | 06:31 |
rcarrillocruz | infracloud ran jobs during the night: | 06:32 |
rcarrillocruz | http://logs.openstack.org/63/363863/2/check/gate-ironic-pep8-ubuntu-xenial/e3700fb/console.html | 06:32 |
rcarrillocruz | http://logs.openstack.org/periodic-stable/periodic-octavia-docs-mitaka/51235bc/console.html | 06:32 |
rcarrillocruz | MORNING! | 06:32 |
jhesketh | odyssey4me: looking | 06:33 |
*** ihrachys has quit IRC | 06:33 | |
*** kzaitsev_ws has quit IRC | 06:41 | |
openstackgerrit | Merged openstack-infra/project-config: Add OSA keystone uwsgi functional tests https://review.openstack.org/363640 | 06:54 |
AJaeger | rcarrillocruz: woot! Great to see the progress on infracloud! | 07:13 |
rcarrillocruz | \o/ | 07:13 |
mptacekx | Hi, I am contributing to intel-nfv-ci. Currently we are facing an issue with unstable connectivity to the OVH file server (ftp.cluster011.ovh.net); there are drop-outs of e.g. 15 mins nearly every second hour, producing unstable results. Is it a known issue? Thanks | 07:14 |
AJaeger | rcarrillocruz: will you be around to babysit the cloud? Then let's ask others to +2A 364101 | 07:14 |
*** sshnaidm|afk is now known as sshnaidm | 07:14 | |
*** javeriak has quit IRC | 07:14 | |
*** javeriak has joined #openstack-infra | 07:15 | |
AJaeger | rcarrillocruz: btw. http://grafana.openstack.org/dashboard/db/nodepool-infra-cloud shows vanilla and west. Is west really correct as name? | 07:17 |
openstackgerrit | Peter Zhurba proposed openstack-infra/project-config: Add repo for openstack/puppet-glare. https://review.openstack.org/362950 | 07:17 |
*** jamielennox is now known as jamielennox|away | 07:17 | |
*** yaume has joined #openstack-infra | 07:18 | |
*** shashank_hegde has quit IRC | 07:19 | |
*** bexelbie has quit IRC | 07:20 | |
*** jpich has joined #openstack-infra | 07:25 | |
*** nwkarsten has quit IRC | 07:29 | |
rcarrillocruz | east = chocolate | 07:29 |
*** abregman_ has joined #openstack-infra | 07:32 | |
*** sdake has joined #openstack-infra | 07:44 | |
*** _nadya_ has quit IRC | 08:08 | |
*** matrohon has joined #openstack-infra | 08:12 | |
openstackgerrit | Ian Wienand proposed openstack-infra/devstack-gate: [WIP] Refactor devstack log copying https://review.openstack.org/364046 | 08:16 |
skraynev | sorry for the interruption again. could you please add me to the groups https://review.openstack.org/#/admin/groups/1547,members https://review.openstack.org/#/admin/groups/1546,members https://review.openstack.org/#/admin/groups/1548,members according to patch https://review.openstack.org/#/c/357745/ | 08:16 |
*** dizquierdo has quit IRC | 08:48 | |
*** dtardivel has joined #openstack-infra | 08:57 | |
*** mhickey has joined #openstack-infra | 09:03 | |
openstackgerrit | Alexander Evseev proposed openstack-infra/puppet-zuul: Replace upstream module by Mirantis' one https://review.openstack.org/364194 | 09:42 |
rcarrillocruz | thx ianw | 10:15 |
openstackgerrit | Giulio Fidente proposed openstack-infra/tripleo-ci: [NO MERGE] Test Ceph RadosGW as replacement for Swift https://review.openstack.org/364227 | 10:48 |
openstack | bug 1619232 in Ironic "Heartbeat()'s race condition: InvalidState: Can not transition from state 'deploying' on event 'resume' (no defined transition)" [High,New] https://launchpad.net/bugs/1619232 - Assigned to Lucas Alvares Gomes (lucasagomes) | 10:50 |
openstackgerrit | Csaba Henk proposed openstack-infra/project-config: remove manila's glusterfs xenial jobs https://review.openstack.org/359167 | 11:09 |
rcarrillocruz | i am | 11:25 |
openstackgerrit | Sagi Shnaidman proposed openstack-infra/tripleo-ci: POC: WIP: oooq undercloud install https://review.openstack.org/358919 | 11:26 |
sdague | rcarrillocruz: something is really weird with multinode | 11:28 |
*** Hal2 has joined #openstack-infra | 11:37 | |
sdague | or at least the last thing that logs | 11:42 |
rcarrillocruz | ++ | 11:45 |
Zara | uh oh, another limit | 11:51 |
rcarrillocruz | http://paste.openstack.org/show/565679/ | 11:53 |
*** pradk has joined #openstack-infra | 12:09 | |
sdague | http://paste.openstack.org/show/565685/ | 12:11 |
amrith | ok, one second | 12:17 |
sdague | ok, please do that now :) | 12:28 |
pabelanger | rcarrillocruz: mordred: looks like one of our compute nodes in infracloud-vanilla is having some ConnectTimeout issues: http://paste.openstack.org/show/565687/ | 12:34 |
*** nwkarsten has joined #openstack-infra | 12:50 | |
*** Goneri has joined #openstack-infra | 12:54 | |
amrith | I've seen this before | 12:59 |
amrith | in a bootstrapping manner | 13:04 |
sdague | http://logs.openstack.org/66/364266/2/gate/gate-grenade-dsvm-neutron-multinode/63498bd/console.html | 13:06 |
sdague | and the last thing grenade is doing is downloading a pip | 13:08 |
rcarrillocruz | is not going to take much cpu | 13:18 |
sdague | mordred: ok, cool | 13:23 |
pkoniszewski | AJaeger: so what's the best way to work on such gate? propose new job to experimental queue? | 13:38 |
*** sshnaidm|mtg is now known as sshnaidm | 13:44 | |
*** mriedem has joined #openstack-infra | 13:46 | |
*** amotoki has quit IRC | 13:59 | |
sdague | mordred: patch finally merged | 14:33 |
mordred | sdague: woot | 14:36 |
mordred | pabelanger: looking | 14:36 |
clarkb | mordred: pabelanger the last time we had ssh issues like that, control persist was fine, we just broke ipv6. Maybe we are affecting the test instance with some job side effect? | 14:41 |
mordred | clarkb: nod | 14:41 |
mordred | pabelanger: I'm seeing nodes be created there - let's consider that one an outlier for the moment | 14:42 |
mordred | also - we _definitely_ need to add floating ip batching | 14:42 |
mordred | we're slamming the hell out of the neutron api there | 14:42 |
pabelanger | clarkb: Oh, didn't know that. Have a review handy that shows the fix? | 14:45 |
jpich | Is there a story behind the Dell StorageCenter CI leaving a -1 on every patch in the Gerrit sandbox? Getting a 3rd party build failure notification before you even get the "Welcome, new contributor!" message is a bit rough :-) - https://review.openstack.org/#/q/project:openstack-dev/sandbox+status:open | 14:45 |
clarkb | pabelanger: this was the accept ra thing in devstack | 14:45 |
pabelanger | clarkb: okay | 14:45 |
pabelanger | clarkb: I don't know the full back story on the failures, just that sdague wanted to increase the SSH timeout value | 14:46 |
zigo | pabelanger: The packaging repo needs to hold a copy of the code from upstream for multiple reasons. Do I need to explain? | 14:47 |
clarkb | pabelanger: right I was just suggesting that the last time we had problems with that, fiddling with timeout values would have had no effect. And that we shouldn't ignore the possibility of another broken networking scenario in the jobs | 14:47 |
pabelanger | clarkb: agreed, good to know | 14:48 |
pabelanger | zigo: you might want to sync up with clarkb and fungi. I didn't really give input on that topic. | 14:48 |
zigo | ok | 14:49 |
zigo | clarkb: fungi: Do you have time to discuss this? | 14:49 |
pabelanger | however, I am interested in the discussions | 14:49 |
clarkb | zigo I am mostly here now. The outcome seemed to be we could just ensure $repo is present to the right version when needed | 14:50 |
clarkb | rather than force pushing into the deb repo | 14:50 |
zigo | clarkb: I don't agree with this outcome, that's the point! :) | 14:51 |
clarkb | ok | 14:51 |
zigo | clarkb: First off, the repo for upstream gets EOLed very early. | 14:51 |
zigo | I don't want this to happen in the packaging trees. | 14:51 |
zigo | We need to cover the life of Debian Stable for Newton for example. | 14:51 |
clarkb | we don't delete repos | 14:52 |
zigo | But branches are EOLed. | 14:52 |
clarkb | the tags don't | 14:52 |
clarkb | nor should any sha1 go away | 14:52 |
zigo | Anyway, the git-buildpackage workflow *IS* to import upstream tag/branch/tarballs into the packaging branch. | 14:52 |
mordred | yes, I concur with that | 14:53 |
zigo | There are ways to make it not do that, but then it's going to be overcomplicated for no reason. | 14:53 |
zigo | For example, managing quilt patches will be horrible. | 14:53 |
sdague | clarkb: this is not breaking ipv6 | 14:53 |
*** abregman|mtg has joined #openstack-infra | 14:54 | |
sdague | http://logs.openstack.org/66/364266/2/gate/gate-grenade-dsvm-neutron-multinode/63498bd/console.html | 14:54 |
fungi | zigo: back now. i understand your _manual_ workflow involves merging a copy of nova commits into the pkg-nova repo, and your automated jobs can do precisely the same thing. that doesn't mean that the pkg-nova repo in _gerrit_ needs to carry a complete copy of nova commits | 14:54 |
sdague | clarkb: the last bit in the subnode log is downloading pip packages | 14:54 |
mordred | fungi: I disagree a bit | 14:54 |
mordred | fungi: because the gate isn't the only place where the packaging repo might be used to create packages | 14:54 |
mordred | so it should not assume the encompassing environment that is our gate | 14:55 |
sdague | in the middle of that download, zuul ansible shoots it in the head | 14:55 |
clarkb | if this is a requirement you should be a branch on the nova repo | 14:55 |
zigo | Right. Before an upload to Debian, I'll just git clone and rebuild ... | 14:55 |
fungi | mordred: sure. and the manual process (add a remote for the nova repo, pull from it into an upstream branch, merge) is still a valid workflow in that case, right? | 14:55 |
mordred | fungi: that's a severe degradation in functionality over normal git-buildpackage workflow | 14:55 |
clarkb | sdague: I didn't say it broke ipv6 just consider the nodes can break themselves | 14:55 |
sdague | my theory is that 10s may not be enough time if the node is flat out pulling packages | 14:55 |
sdague | clarkb: sure | 14:56 |
mordred | fungi: like, gbp exists and is used heavily - I don't think we should break it | 14:56 |
mordred | HOWEVER | 14:56 |
mordred | zigo: I'd like to make a suggestion that might be a compromise | 14:56 |
sdague | however, where this is, I don't think that's what happened | 14:56 |
fungi | i'm just concerned that now not only do we have hundreds of separate repos for all packaged software, we have additional copies of all commits for all packaged software | 14:56 |
sdague | clarkb: if you have another thought from those logs, please go and look | 14:56 |
mordred | fungi: right. I don't think we need hundreds of commits in the upstream branch | 14:56 |
mordred | "zigo | Anyway, the git-buildpackage workflow *IS* to import upstream tag/branch/tarballs into the packaging branch." | 14:57 |
mordred | tag/branch is one of the options | 14:57 |
mordred | the other option is upstream tarballs, which get imported in a single commit and then tagged | 14:57 |
mordred | that is in line with how gerrit works | 14:57 |
mordred | and should not require tons of infra-side automation ... unfortunately, it would be a non-trivial rework of the existing packaging repos I fear | 14:58 |
zigo | mordred: Even if you use tarballs, it imports the code in the packaging branch. | 14:58 |
clarkb | there is something a little funny about hating vendored code | 14:58 |
mordred | zigo: it does | 14:58 |
clarkb | then requiring it | 14:58 |
mordred | zigo: but it's not 100 commits - it's one commit | 14:58 |
mordred | zigo: which means you could submit that commit to gerrit using git review like normal | 14:58 |
mordred | then land it | 14:58 |
fungi | also we've come full circle from "you should avoid putting the debian package metadata in your upstream source tree" to "you should put a copy of the upstream source tree in the repo where you have your debian packaging metadata" | 14:58 |
mordred | fungi: it's not the same thing at all | 14:58 |
zigo | mordred: The point is *also* to be able to use *any* commit upstream (and not just a branch or a tag) and package that. | 14:59 |
mordred | putting debian package metadata in upstream is problematic because of sequencing issues | 14:59 |
zigo | Could you explain what the problem is? Do we lack resources? | 14:59 |
mordred | if you want to release version 1.2 of an upstream software, what state should the packaging be in in upstream tag 1.2 | 14:59 |
zigo | Is it just too big? | 14:59 |
fungi | i understand the reasons, i'm asking that we separate _process_ (steps to build a package) from revision control (data we store for building a package) | 14:59 |
mordred | it necessarily cannot be the packaging that knows how to handle 1.2, because 1.2 doesn't exist until 1.2 exists | 14:59 |
zigo | fungi: If we get an ACL to merge tags, it is the same process, it goes under the Gerrit code review. | 15:00 |
zigo | I tried, the only issue is the missing ACL. | 15:00 |
zigo | (at least, it looks like it) | 15:00 |
fungi | all the data we need to build a debian package of nova exists in a combination of the nova repo and the pkg-deb repo. the steps to build the package can include combining those. the combined result doesn't have to get pushed back into the revision control history though, does it? | 15:00 |
zigo | fungi: If we don't, then it becomes a way harder to do *many* things. | 15:01 |
zigo | Like adding a quilt patch, or building before uploading to Debian. | 15:01 |
fungi | i don't understand. what prevents you from adding a git remote for the nova repo, pulling from it into an upstream branch and then doing a git merge? | 15:02 |
mordred | I feel like we're trying to fight the current established best practice of debian packaging (use gbp with upstream sources included in the packaging repo) in the context of openstack infra, which seems like a very strange thing for us to be in the business of doing | 15:02 |
fungi | it doesn't seem like we're fighting it at all | 15:02 |
clarkb | mordred: do you at least agree force pushing nova into a different repo is a bad idea? | 15:03 |
fungi | i'm saying it's a step which can be performed locally, the same as it can be performed in our ci. why does the result (a duplicative, generated result) need to be pushed back into revision control history? | 15:03 |
mordred | clarkb: I believe we should be importing tarballs using gbp pristine-tarball | 15:03 |
mordred | fungi: but it's a manual step which is not how gbp works | 15:03 |
fungi | you don't manually invoke gbp? | 15:03 |
mordred | if it was a normal operational mode of gbp, then sure | 15:03 |
mordred | yes - but you run git-buildpackage and it does all the things | 15:04 |
fungi | it seems like any manual process involves _at_least_ one manual step | 15:04 |
mordred | yes. the one step of running git-buildpackage | 15:04 |
pabelanger | zigo: mordred: What is the objection to pulling in the upstream git repo or tarball at build time? When I was rolling packages with git-buildpackage, I would use uscan to fetch them before git-buildpackage. Obviously skipping using the upstream-branch step | 15:04 |
fungi | before running gbp, you probably _also_ have a manual step of cloning the repo you're going to run it in, yeah? | 15:05 |
mordred | if for openstack packages it's clone the repo, now do this other stuff to get another repo added to this repo, then run git-buildpackage, then we've subverted the power of the tool | 15:05 |
mordred | the steps are "clone the packaging repo, cd to the packaging repo, run git-buildpackage" ... what I'm saying is that if we inject more manual steps into that, then we're doing weird things | 15:05 |
zigo | fungi: pabelanger: 1/ It makes it very hard to manage quilt patch 2/ it's not the usual git-buildpackage workflow, so it will confuse contributors 3/ It will be a lot of pain points for no reasons maintaining scripts which we otherwise wouldn't need to write | 15:05 |
zigo | fungi: pabelanger: What's the reason for *NOT* doing it? | 15:06 |
pabelanger | mordred: zigo: I would use overlay, IIRC | 15:06 |
pabelanger | zigo: eg: https://github.com/pabelanger/jenkins-job-builder-deb/blob/master/debian/gbp.conf | 15:06 |
mordred | pabelanger: right. with overlay you have to maintain local patches in quilt | 15:06 |
clarkb | zigo: the reason for not pushing all of nova into nova deb is nova code has a home, it does not need two homes. it's going to double the size of our test image caches and so forth | 15:06 |
pabelanger | yes | 15:06 |
fungi | i guess my point is that i've seen plenty of different gbp workflows. some maintainers import tarballs, some pull from upstream git repos, some use quilt, some use single-patch, some push copies of upstream code back into public revision history and some don't... | 15:06 |
mordred | that is very different than using git | 15:06 |
mordred | fungi: right. | 15:07 |
clarkb | you are also asking to bypass every control on gerrit | 15:07 |
pabelanger | mordred: zigo: yes, that is what I would do, quilt | 15:07 |
zigo | pabelanger: Last time we discussed using this in the Python module team, absolutely everyone but a single person liked using overlay. | 15:07 |
mordred | right | 15:07 |
mordred | so ... | 15:07 |
mordred | I do not think we should completely rework the 175 packaging repos that have a multi-year packaging history already | 15:07 |
pabelanger | zigo: hehe, ya, I prefer overlay too | 15:07 |
mordred | to use quilt instead of git | 15:07 |
fungi | i'm requesting flexibility in choosing a workflow that meshes well with the situation where upstream and packaging repositories are already in the same system and avoid keeping duplicate copies | 15:07 |
fungi | mordred: well, they use quilt _and_ git afaik | 15:08 |
pabelanger | mordred: Ya, now I understand the issue. | 15:08 |
*** kien-ha has joined #openstack-infra | 15:08 | |
fungi | quilt packages carried in the debian/patches dir, committed into the packaging branch | 15:08 |
mordred | fungi: right. but there is no need to introduce quilt into an existing ecosystem that does not already use quilt | 15:08 |
pabelanger | FWIW: I didn't mind quilt, but I am not the one doing the packaging | 15:08 |
mordred | that is one way these _could_ have been done | 15:08 |
mordred | but it's not the way they were | 15:08 |
fungi | well, i wasn't specifically suggesting to use quilt. i was giving it as an example (and i thought zigo already carried debian/patches in his packages but i'll pull a source package and double-check that) | 15:09 |
zigo | I do use quilt, I'm not sure what fungi suggests here... | 15:09 |
mordred | zigo: wait - you're already using quilt? | 15:10 |
mordred | wtf | 15:10 |
mordred | why are we having this argument then? | 15:10 |
clarkb | anyways my biggest concern is we don't bypass gerrit by allowing force pushes and we don't doubt the size of our test images by literally copy/pasting all our repos | 15:10 |
zigo | mordred: I don't understand ! :) | 15:10 |
clarkb | s/doubt/double/ | 15:10 |
*** zz_dimtruck is now known as dimtruck | 15:11 | |
mordred | zigo: importing upstream sources into the packaging repo is a thing you do as an alternative to using quilt to manage local changes | 15:11 |
zigo | mordred: No. | 15:11 |
mordred | zigo: if you're using quilt, then just pointing at release tarballs should be fine | 15:11 |
zigo | I use quilt patches to make sure I keep Debian specific patches separated. | 15:11 |
zigo | I don't see any relationship between using quilt and the current gbp workflow. | 15:12 |
pabelanger | Right, that is how I understood it too. | 15:12 |
mordred | right. in a git workflow you can have each of those debian specific patches in the debian packaging branch and gbp will create the patch overlay and packaging build - you don't need the quilt, you have git | 15:12 |
pabelanger | clarkb: Right, https://github.com/openstack/deb-python-os-client-config/tree/debian/newton | 15:12 |
mordred | however, that's getting into the weeds | 15:12 |
zigo | Nop! :) | 15:12 |
zigo | That's not what it is about. | 15:13 |
pabelanger | is basically python-os-client-config repo, with debian folder added | 15:13 |
zigo | At least, that's not how I use quilt. | 15:13 |
pabelanger | clarkb: so, a lot of duplicated code | 15:13 |
zigo | mordred: What you're talking about is more like if we were using git-dpm or gbp pq. | 15:13 |
zigo | pabelanger is right. | 15:13 |
pabelanger | I always thought we were just talking about the debian folder for the packaging repos | 15:13 |
fungi | the question being floated is, for example, whether we should import a complete copy of nova into the pkg-nova repo | 15:14 |
zigo | We're talking about having upstream code/tag within the debian packaging branch, yes. | 15:14 |
pabelanger | like: https://github.com/openstack-infra/zuul-packaging/tree/debian/sid | 15:14 |
zigo | Which is what 99% of package maintainers do. | 15:14 |
mordred | yah. if we're keeping patches in quilt, I see no point in having the upstream sources in the repo. I _do_ argue in favor of upstream sources in the repo so that quilt can be avoided | 15:14 |
clarkb | pabelanger: no this is literally copying all history and potentially making more than one copy, one for each branch (not sure how it's organized) | 15:15 |
fungi | and my position, which i've yet to see clear evidence indicating it's not viable, is that pulling nova source into the pkg-nova repo could happen at job runtime rather than being a manual process the results of which are pushed back into the repo in gerrit | 15:15 |
zigo | Then how do you actually manage to run dpkg-source --commit ? | 15:15 |
pabelanger | clarkb: looking at https://github.com/openstack/deb-python-os-client-config and https://github.com/openstack/os-client-config it is slightly confusing which one is upstream, since both master branches are the same | 15:16 |
zigo | fungi: I'm not saying it's not possible to do it, I'm saying it's *WAY* more complicated to do what you're describing. | 15:17 |
zigo | fungi: First, you got to design the code to do the pull / push from here and there. | 15:17 |
fungi | zigo: we actually do a lot of that in other jobs anyway | 15:17 |
zigo | fungi: Then, when designing a quilt patch, you need to do many things so that you can finally type dpkg-source --commit | 15:17 |
pabelanger | I'm not sure it is way more complicated, but it does delay the packaging process | 15:17 |
zigo | fungi: Then, we'd have to do more work to upload to debian as well, instead of just git clone, build, upload. | 15:18 |
zigo | I've given my argumentation about why I prefer the current workflow, I still don't get why you prefer not using it. | 15:18 |
fungi | i'm suggesting that when developing locally you can pull from the nova repo while writing your quilt patch, but only push the quilt patch back into the pkg-nova repo rather than merging a copy of all of nova into the pkg-nova repo | 15:18 |
zigo | fungi: Yes, that's more manual operations ! | 15:19 |
zigo | :) | 15:19 |
zigo | To me, it just looks like you're saying "whaaaaat? We'll have 2 copies?". Is there any other point of argumentation besides that? | 15:19 |
fungi | and is avoiding that one manual operation worth keeping an entire extra copy of all upstream commits to all packaged projects in gerrit? | 15:19 |
zigo | Also, what would be the workflow if I want to package commit 28af82dec ? | 15:20 |
mlavalle | hi is there anybody here who can help me with a logstash question? | 15:20 |
zigo | The version should be something like 1:2.3.4+2016.09.01.git.28af82dec-5+13~bpo8+1 | 15:21 |
zigo | Do you suggest that we add some logic to extract the git sha1, and pull the relevant things in the packaging branch, in the build script? | 15:21 |
zigo | This looks very hackish to me. | 15:22 |
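For reference, the kind of extraction logic zigo is asking about could look roughly like the sketch below. It splits a Debian version string into epoch, upstream version, and Debian revision (epoch before the first colon, revision after the last hyphen), then pulls out an abbreviated commit hash. The `.git.<hash>` marker is an assumption taken only from the example version string above, and `parse_deb_version` is a hypothetical helper, not part of gbp or any existing build script.

```python
import re
from typing import NamedTuple, Optional


class DebVersion(NamedTuple):
    epoch: int
    upstream: str
    revision: str
    git_sha: Optional[str]


def parse_deb_version(version: str) -> DebVersion:
    """Split a Debian version string and look for an embedded git hash."""
    # Epoch: everything before the first colon (0 when there is no colon).
    epoch_text, sep, rest = version.partition(":")
    if not sep:
        epoch_text, rest = "0", version
    # Debian revision: everything after the last hyphen (empty when absent).
    upstream, sep, revision = rest.rpartition("-")
    if not sep:
        upstream, revision = rest, ""
    # Assumed convention (from the example only): ".git.<abbreviated hash>".
    match = re.search(r"\.git\.([0-9a-f]{7,40})", upstream)
    git_sha = match.group(1) if match else None
    return DebVersion(int(epoch_text), upstream, revision, git_sha)


parsed = parse_deb_version("1:2.3.4+2016.09.01.git.28af82dec-5+13~bpo8+1")
print(parsed.git_sha)
```

A build script could then hand `parsed.git_sha` to whatever fetches the upstream commit; how that fetch would happen is exactly the open question in this discussion.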
clarkb | zigo: I listed my two reasons why I don't like it | 15:22 |
pabelanger | zigo: right, we'd have to do that | 15:22 |
zigo | clarkb: I was on the phone at the same time, I'm sorry, I probably missed them. | 15:22 |
zigo | (now I'm all focussed...) | 15:22 |
pabelanger | zigo: and we do today, some jobs will use zuul-cloner to fetch specific commits. This could be used in your build process with the Depends-On field | 15:23 |
clarkb | zigo: I don't want multiple copies of every repo floating around as it puts pressure on already large test images which we can't reliably update. I also don't want a git process that bypasses every control in Gerrit | 15:24 |
pabelanger | rcarrillocruz: over an hour and no launch failures in infracloud | 15:25 |
rcarrillocruz | woot! | 15:25 |
zigo | clarkb: There's no such thing as "bypasses every control in gerrit", if I understand correctly. | 15:25 |
pabelanger | rcarrillocruz: our time to ready is still a little high, but we could still be distributing images | 15:26 |
clarkb | zigo: what you want is to be able to force push arbitrary commits into your repos | 15:26 |
zigo | clarkb: I believe a git merge -X <tag-name> would just end up in a normal CR. | 15:26 |
clarkb | zigo: that is bypassing the controls we have put in place | 15:26 |
clarkb | oh you are wanting to push a merge commit? ok that's better. The previous discussion said you all needed to push the tags straight in | 15:26 |
zigo | clarkb: That isn't what it seemed when I tried to "git review" a merge commit. | 15:26 |
clarkb | but it still creates the other problem of doubling (or worse) the size of all the repos in gerrit | 15:27 |
clarkb | mordred: zigo if we do that I will suggest we not cache any of the packaging repos on our test images | 15:27 |
zigo | clarkb: Basically, I want to be able to do: git merge -X theirs <TAG_NAME> .... some packaging change .... git commit -a --amend && git review | 15:27 |
clarkb | because it just doesn't scale that way | 15:27 |
zigo | clarkb: That, you guys decide, that wouldn't be me. | 15:28 |
zigo | If you think it's better to remove the cache, that's an -infra team decision. | 15:28 |
clarkb | zigo: not that it's better, just that I am only interested in carrying one copy of the data in the space-constrained environment | 15:28 |
zigo | I've been doing git clone on each build for a *very* long time on my jenkins, so that's ok to me. | 15:28 |
zigo | clarkb: We're talking about around 1GB of data here. | 15:29 |
zigo | Do we really need to save THAT much? | 15:29 |
clarkb | zigo: its not just 1GB | 15:30 |
clarkb | its 1GB in nova then 1GB in pkg nova | 15:30 |
clarkb | then in six months is 1.5GB and 1.5GB | 15:30 |
zigo | Nop, that's the total for all of the Git in all of my packaging. | 15:30 |
clarkb | (illustrative only not real numbers) | 15:30 |
clarkb | zigo: yes trying to illustrate the growth pattern here though | 15:30 |
zigo | Nova is huge, other packages are mostly very small. | 15:30 |
clarkb | and as more and more projects happen and you package more and more repos... | 15:30 |
clarkb | the git cache on our images is huge right now and accounts for most of the disk use of those | 15:31 |
pabelanger | I think the issue is, we are proposing another method of doing it, which is different than the workflow you are accustomed to. And the potential amount of work needed to switch workflows | 15:31 |
zigo | For example, a non-bare repo of deb-python-debtcollector is not even 1MB with all the git history. | 15:31 |
zigo | pabelanger: That's one big issue too. And I'm supposed to have all of OpenStack Newton ready this week too ... | 15:31 |
pabelanger | We could use openstack-infra/zuul-packaging as a POC, because I always envisioned using git-buildpackage with overlay to produce its package builds. | 15:32 |
*** annegentle has quit IRC | 15:32 | |
zigo | Could we agree to discuss this *later*, ie when I'm not in the rush of a release? | 15:32 |
pabelanger | zigo: yes, this would impact that deadline. As we need to work out the process | 15:32 |
clarkb | we turned off the du'ing due to the xenial build issue so I can't easily just look at logs for a disk size number | 15:32 |
clarkb | but over 7GB I think | 15:33 |
zigo | I still hope to release Newton with infra, but if we decide to switch to another workflow, I give up for this release. | 15:33 |
zigo | I don't want to risk having a bad OpenStack release for Debian Stretch (frozen at the end of the year). | 15:34 |
zigo | (and Mitaka is not an option as Horizon/Mitaka isn't Django 1.10 compatible) | 15:34 |
rcarrillocruz | pabelanger, mordred : https://review.openstack.org/#/c/364397/ | 15:36 |
rcarrillocruz | i still need to put a test for it, but you get the idea | 15:37 |
rcarrillocruz | clarkb: we were also talking about ^ these days | 15:37 |
zigo | pabelanger: I'm saying that I won't have the necessary time to change everything (workflow, build scripts, etc) given the time constraints of the Newton release. | 15:37 |
clarkb | pabelanger: zigo openstack/ is 6.7 GB | 15:37 |
clarkb | so if packaging deb is 1GB you represent more than 1/7th of our entire disk use | 15:37 |
zigo | clarkb: This includes stuff I don't need. | 15:37 |
clarkb | zigo: yes I am comparing you to the whole | 15:38 |
zigo | ok | 15:38 |
clarkb | and that will only grow as the rest of openstack/ grows because you are a copy of a significant chunk of it | 15:38 |
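The arithmetic behind clarkb's concern can be checked with a short sketch. The 6.7 GB and 1 GB figures are the ones quoted above; the 1.0 and 1.5 GB growth figures are the ones clarkb explicitly labelled illustrative, and both function names are hypothetical.

```python
def share_of_cache(packaging_gb: float, total_gb: float) -> float:
    """Fraction of the cached openstack/ tree taken up by packaging repos."""
    return packaging_gb / total_gb


def footprint_with_packaging_copy(upstream_gb: float) -> float:
    """Disk use when a packaging repo carries a full copy of upstream history."""
    # One copy in the upstream repo plus one copy in the packaging repo.
    return 2 * upstream_gb


share = share_of_cache(1.0, 6.7)
print(f"{share:.1%} of the cache, versus 1/7 = {1 / 7:.1%}")

# Illustrative growth only, not real numbers: the duplicate grows with upstream.
for upstream_gb in (1.0, 1.5):
    print(upstream_gb, "GB upstream ->", footprint_with_packaging_copy(upstream_gb), "GB total")
```

The point of the second function is that the duplicated copy is not a fixed cost: whatever upstream grows by, the combined footprint grows by twice that.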
pabelanger | zigo: right, you want to use same build script for both repos. I was confused you wanted to drive package builds from openstack into debian somehow. | 15:38 |
* zigo tries now to have actual real numbers | 15:38 | |
clarkb | I am totally happy to cache those repos and host them. I just don't think they should include entire copies of the other repos we host | 15:39 |
pabelanger | clarkb: Ya, that was my thought on using overlay option for git-buildpackage | 15:39 |
pabelanger | otherwise, why not just include debian folder into nova? | 15:40 |
pabelanger | rcarrillocruz: seems right, can we add test for devstack? | 15:41 |
pabelanger | rcarrillocruz: Oh, I see you are working on that | 15:41 |
zigo | clarkb: You do realize that this also includes 3rd party python modules? | 15:41 |
rcarrillocruz | yep, follow up patch | 15:41 |
zigo | (so that the reasoning doesn't apply for them...) | 15:41 |
clarkb | we also created some of the repos multiple times in gerrit | 15:41 |
clarkb | so my ask is just that we be a bit more careful and not spam gerrit with a bunch of unnecessary copies of things. AIUI this is possible | 15:42 |
zigo | Currently, in alioth.debian.org all the deb-* are 1.5 GB bare git repos. | 15:42 |
zigo | clarkb: Would it be possible to install the cache only in the Jessie image? | 15:43 |
clarkb | zigo: ya there is potential tuning we can do around that. However we have in the past tried to treat them all the same so that you don't have to consider such differences when moving among distros. Focus should be on the distro differences | 15:44 |
clarkb | I think we should stick to that goal as much as possible. | 15:44 |
zigo | Makes sense. | 15:45 |
pleia2 | good morning | 15:45 |
pabelanger | clarkb: rcarrillocruz: fungi: brings osic-cloud8 online^ | 15:49 |
pabelanger | so far both internap-mtl01 and infracloud-vanilla look good | 15:49 |
rcarrillocruz | +2 | 15:50 |
rcarrillocruz | has the multinode issue been solved | 15:50 |
rcarrillocruz | sdake: | 15:50 |
rcarrillocruz | ^ | 15:50 |
rcarrillocruz | erm | 15:50 |
rcarrillocruz | sdague: ^ | 15:50 |
rcarrillocruz | sorry mr dake :-) | 15:50 |
pabelanger | yes, sdague should have included you | 15:50 |
sdague | rcarrillocruz: yes | 15:50 |
rcarrillocruz | so, we good to bump now | 15:50 |
rcarrillocruz | ? | 15:50 |
rcarrillocruz | i have a change to bump infracloud to 50 | 15:51 |
rcarrillocruz | and i increased quota this morning on the zuul project | 15:51 |
sdague | so, probably, I'd just ramp slowly and make sure to watch zuul like a hawk to look for new fails | 15:51 |
sdague | what would be super cool is if nodes could be brought into a test run holding pen where some changes are just duped onto them | 15:52 |
sdague | so we could get burn in without blowing up real work | 15:52 |
pabelanger | rcarrillocruz: So, I want to see what is needed to start getting mcollective going on infracloud. Or some other method for statsd | 15:52 |
pabelanger | sdague: Ya, I've often thought of that too | 15:53 |
rcarrillocruz | sdague: so now it's on 10 nodes, my change is about bumping to 50. I'm cool bumping to less than that | 15:53 |
pabelanger | would be a good tool to help debug clouds without affecting production | 15:53 |
rcarrillocruz | indeed | 15:53 |
sdague | because this only test live thing is just rough if people don't stay on top of job fails for the hours after | 15:54 |
sdague | I seem to be meat nagios for that a bunch | 15:54 |
rcarrillocruz | pabelanger: i'm confused, statsd is metrics but my understanding is mcollective is puppet orchestration ? | 15:54 |
pabelanger | rcarrillocruz: haha, collectd | 15:55 |
pabelanger | my mistake | 15:55 |
jeblair_ | please don't run collectd on just *some* of our systems | 15:55 |
jeblair_ | if you want to do that, replace cacti completely, and everywhere, and make sure that you get data that is correct like cacti | 15:55 |
pabelanger | I thought that was the tooling of choice from the talks at our last midcycle? | 15:55 |
jeblair_ | we don't need two systems graphing memory in parallel | 15:56 |
rcarrillocruz | pabelanger , jeblair_ : that seems like a good topic for either mid cycle or summit | 15:56 |
rcarrillocruz | i agree we should stick to one | 15:56 |
pabelanger | Ya, if we can use the same across everything, that makes things easier | 15:57 |
jeblair_ | rcarrillocruz: i don't think anyone is opposed to it. i'm just saying it's a lot of hard work, and you have to understand what every graph is measuring to be sure to get the data right. our cacti install had *a lot* of work put into it to make sure it's correct. | 15:58 |
jeblair_ | rcarrillocruz: i mean, maybe we should talk about it so i can say this to everyone | 15:58 |
rcarrillocruz | indeed | 15:58 |
*** gyee has joined #openstack-infra | 15:58 | |
ihrachys | depends-on fails for me lately. somehow zuul forgets to pick up a patch for merge after its dependency lands in its repo | 15:58 |
jeblair_ | because this is the second time in the past few days i've given this little speech :) | 15:58 |
rcarrillocruz | heh | 15:58 |
ihrachys | has anyone noticed? | 15:58 |
sdague | ihrachys: which patches? | 15:58 |
pabelanger | I'm sure its been me twice asking about statsd too | 15:58 |
jeblair_ | pabelanger: in the mean time, we do run cacti everywhere, including infra cloud | 15:58 |
ihrachys | sdague: just nudged https://review.openstack.org/#/c/325208/ | 15:58 |
rcarrillocruz | fwiw, i added all compute hosts to cacti | 15:59 |
ihrachys | sdague: there were others, but I can't immediately come up with links because I lost context. | 15:59 |
jeblair_ | rcarrillocruz: cool, i'll update the trees later | 15:59 |
*** javeriak_ has joined #openstack-infra | 15:59 | |
sdague | ihrachys: python-neutronclient isn't in a shared pipeline with neutron | 16:00 |
ihrachys | sdague: another one was https://review.openstack.org/#/c/362772/ | 16:00 |
sdague | so I think that's expected | 16:00 |
sdague | they have to share a pipeline for them to auto process together | 16:00 |
sdague | ihrachys: yep, that would be the same thing | 16:00 |
sdague | same reason it happens for project-config changes | 16:00 |
dhellmann | infra folks, it would be good to have some quick attention on https://review.openstack.org/364417 to cut down the false failures we're seeing in release jobs today | 16:00 |
ihrachys | sdague: oh I see. not that it's the best user experience because honestly I don't know by heart pipeline setups :) | 16:00 |
pabelanger | jeblair_: Yes. Maybe what I should be asking is how to get data out of cacti to be rendered with grafana. | 16:01 |
jeblair_ | ihrachys: if the two projects are not tested together from source, then the depends-on has a slightly different meaning. it means "this patch can't be enqueued until the other patch lands". | 16:02 |
ihrachys | jeblair_: yeah, but even when the first lands, the second does not get anywhere near merge queue | 16:02 |
jeblair_ | ihrachys: (in this case, the depends-on can't be used to incorporate a pending change from another repo) | 16:02 |
ihrachys | jeblair_: so I need to W+0/W+1 it again to get the job done | 16:02 |
ihrachys | jeblair_: you mean my client patch has not really fetched the server side? | 16:03 |
jeblair_ | ihrachys: yes, that's true. you could always leave the depends-on header off in that case. | 16:03 |
jeblair_ | ihrachys: correct | 16:03 |
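The two meanings of Depends-On that sdague and jeblair_ describe can be summarised in a small decision function. This is only a paraphrase of the behaviour as explained in this conversation, not Zuul code; `gate_outcome` and the `Outcome` names are hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    TESTED_TOGETHER = "enqueued behind the dependency and tested with it"
    ENQUEUED_NORMALLY = "dependency already merged, change is enqueued as usual"
    WAITS_FOR_MERGE = "cannot be enqueued until the dependency lands"
    NEEDS_REAPPROVAL = "dependency landed, but the change must be approved again"


def gate_outcome(shared_queue: bool, dependency_merged: bool) -> Outcome:
    """What happens to an approved change that carries a Depends-On header."""
    if shared_queue:
        # Projects tested together from source: the pending change is pulled in.
        if dependency_merged:
            return Outcome.ENQUEUED_NORMALLY
        return Outcome.TESTED_TOGETHER
    # Projects that do not share a queue: Depends-On only blocks enqueueing.
    if not dependency_merged:
        return Outcome.WAITS_FOR_MERGE
    # As described above, nothing re-enqueues the change automatically.
    return Outcome.NEEDS_REAPPROVAL


# The python-neutronclient / neutron case discussed above: no shared queue.
print(gate_outcome(shared_queue=False, dependency_merged=True).value)
```

The last branch is the one that surprised ihrachys: once the dependency merges, the dependent change still needs the W+0/W+1 nudge.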
sdague | jeblair_: would it be hard to look for those and enqueue them on patch merge? | 16:04 |
ihrachys | jeblair_: gotcha. it sucks and is confusing, but I guess I just had too many hopes in the magic :) | 16:04 |
jeblair_ | ihrachys: well, if it *did* then the pipeline config is wrong | 16:04 |
sdague | I've definitely seen it trip people up | 16:04 |
jeblair_ | ihrachys: if it actually *did* pull in changes, then maybe it should be in the same shared queue | 16:04 |
ihrachys | jeblair_: I guess they were passing fine independently, that's why I haven't spotted anything on check queue | 16:04 |
jeblair_ | sdague: i think that would be possible | 16:05 |
sdague | jeblair_: I think that's the thing that people are expecting, then get confused when it doesn't happen. | 16:05 |
ihrachys | jeblair_: ok, for example neutronclient gates on neutron code via its functional job. wouldn't it mean that they should have had the same pipeline? | 16:05 |
*** kien-ha has quit IRC | 16:06 | |
sdague | ihrachys: only if they share a job name | 16:06 |
*** jlk` is now known as jlk | 16:06 | |
jeblair_ | i guess in this case, neutron doesn't gate on neutronclient | 16:06 |
sdague | jeblair_: correct | 16:06 |
sdague | which is intentional | 16:06 |
zaro | fungi, clarkb: for tomorrow ^ | 16:08 |
jesusaur | infra-root: when you get a chance, I'd like your opinions on https://review.openstack.org/363969 and the output at http://logs.openstack.org/69/363969/1/check/gate-project-config-layout/2386513/console.html#_2016-08-31_23_11_54_465294 | 16:08 |
*** mwhahaha has quit IRC | 16:10 | |
*** shashank_hegde has joined #openstack-infra | 16:11 | |
*** ijw has joined #openstack-infra | 16:11 | |
mgagne | pabelanger, clarkb: ready for internap-mtl01 | 16:11 |
sdague | any other project-config core folks want to do a quick review here - https://review.openstack.org/#/c/363937 ? | 16:11 |
*** yamamoto has joined #openstack-infra | 16:11 | |
sdake | rcarrillocruz no need to call me mr :) | 16:12 |
rcarrillocruz | ;-) | 16:12 |
*** jordanP has quit IRC | 16:12 | |
fungi | sdague: in theory we could do something like that with a nodepool-dev/zuul-dev environment. set them up with a limited project list and a check pipeline and only add quota for new providers to nodepool-dev initially | 16:14 |
fungi | (the new provider incubation corral) | 16:15 |
sdague | fungi: that would be kind of nice | 16:15 |
fungi | it would be a nontrivial thing to create and maintain, but it's probably worth entertaining | 16:15 |
sdague | because the onboarding of new providers is great for capacity, but tends to make fail spikes | 16:15 |
*** yamamoto has quit IRC | 16:15 | |
sdague | so is rough during crunch time | 16:16 |
clarkb | I mean | 16:17 |
rcarrillocruz | fungi: so, i assume we'll get chocolate on newton , when we discuss at the mid-cycle no? | 16:17 |
rcarrillocruz | when is newton due? | 16:17 |
clarkb | we have fail spikes in long lived clouds all the time too | 16:17 |
clarkb | I think it's more "cloud" than "new cloud" | 16:17 |
clarkb | (like the osic ipv6 thing wasn't a new cloud issue, just cloud issue) | 16:18 |
pabelanger | there is a lot of overhead for nodepool-dev too, image uploads for example | 16:18 |
fungi | rcarrillocruz: worth bookmarking https://releases.openstack.org/newton/schedule.html | 16:18 |
clarkb | that said I think its a reasonable thing to do I just don't want the expectation to be we will never have a cloud issue again | 16:18 |
rcarrillocruz | thanks | 16:18 |
*** kien-ha has joined #openstack-infra | 16:19 | |
sdague | clarkb: sure | 16:19 |
mordred | clarkb: to be fair though ... the ipv6 was a cloud config change that we knew about and was _like_ adding a new cloud | 16:19 |
mordred | or could have been seen that way - although I will admit I did not think of it that way at the time | 16:19 |
*** hashar is now known as hasharAway | 16:19 | |
sdague | but we had to drop an internap region, and infra cloud in the last 2 weeks | 16:19 |
clarkb | mordred: I didn't (and I still don't really; getting a new IP address is not like getting a new cloud imo) | 16:19 |
fungi | rcarrillocruz: so the infra sprint is the week before final release candidates for newton | 16:20 |
sdague | all of which could be better served with a holding pen | 16:20 |
mordred | clarkb: totally | 16:20 |
sdague | so those could live debug through issues | 16:20 |
sdague | and once got to a good pass rate, get added to the good pool | 16:20 |
clarkb | sdague: yup I agree. I have always tried to run tests myself on the instances before bringing them into the fold, not sure if we managed to do that this time. And having a tool to do that would avoid a human needing to think about it. So definitely a reasonable thing to do | 16:20 |
fungi | sdague: clarkb: mordred: though i think having a proving ground shadow ci for new providers in some ways simplifies testing them out when we have to determine appropriate flavors and whatnot | 16:21 |
fungi | er, what clarkb just said basically | 16:21 |
sdague | fungi: yep | 16:21 |
*** lucasagomes is now known as lucas-dinner | 16:21 | |
*** senk has quit IRC | 16:21 | |
sdague | I am well aware we will always have fails :) | 16:22 |
fungi | we hadn't seriously considered it in the past because adding a new provider happened once every year or two | 16:22 |
fungi | now it seems to be something closer to a monthly occurrence | 16:22 |
rcarrillocruz | heh, yeah | 16:22 |
fungi | what an awesome problem to have, btw | 16:23 |
pabelanger | rcarrillocruz: fungi: Maybe Newton RC1? | 16:23 |
*** ramishra has joined #openstack-infra | 16:23 | |
fungi | pabelanger: i'm not opposed to running infra-cloud on bleeding-edge prerelease code as long as it doesn't get in the way of being able to use it. having early feedback to the community on new releases is great, but i think our #1 goal should be making sure we're able to run _something_ and use it consistently | 16:24 |
rcarrillocruz | wfm, although i think that bringing up all servers, iron out issues, get back to DC with tickets etc, it could very well get past one-two weeks, just in time for the newton release | 16:24 |
fungi | if sticking with mitaka helps increase our chances of keeping it up and running i'm much more in favor of that | 16:25 |
pabelanger | rcarrillocruz: What is stopping us from launching infra-chocolate now? | 16:25 |
fungi | and then consider upgrading to newton once newton is released | 16:25 |
*** sarob has joined #openstack-infra | 16:25 | |
fungi | pabelanger: mostly that there's still work to finish on vanilla, aiui | 16:26 |
rcarrillocruz | i need to double check the inventory of things | 16:26 |
rcarrillocruz | what took a long time was to find out what those servers were | 16:26 |
rcarrillocruz | cos in our inventory we had ilO ips | 16:26 |
pabelanger | rcarrillocruz: fungi: okay, assign me a task! I'm eager to help | 16:26 |
pabelanger | :) | 16:26 |
rcarrillocruz | no serial numbers | 16:26 |
rcarrillocruz | no nothing | 16:26 |
* mtreinish wants ice cream now | 16:26 | |
rcarrillocruz | but the DC folks they refer them by serial number | 16:27 |
rcarrillocruz | so i had to go one by one | 16:27 |
rcarrillocruz | guessing by macs | 16:27 |
rcarrillocruz | what was what | 16:27 |
rcarrillocruz | registering the racks the machine were put | 16:27 |
rcarrillocruz | etc | 16:27 |
rcarrillocruz | essentially, cross-checking two different sources of truth | 16:27 |
rcarrillocruz | but sure, we can start doing provisioning on chocolate | 16:27 |
rcarrillocruz | as a matter of fact, i plan to use those machines to do the live demo i promised on the meeting a couple weeks ago | 16:28 |
rcarrillocruz | how to enroll with bifrost | 16:28 |
rcarrillocruz | deploy | 16:28 |
rcarrillocruz | delete | 16:28 |
rcarrillocruz | redeploy | 16:28 |
rcarrillocruz | etc | 16:28 |
*** martinkopec has quit IRC | 16:28 | |
fungi | yeah, and doing a demo with software you already have deployed previously increases the chances that it's a viable demonstration rather than getting bogged down in whatever isn't quite right for newton yet | 16:29 |
pabelanger | okay, is there a list of infracloud-vanilla that needs finishing up? | 16:29 |
rcarrillocruz | pabelanger: none | 16:29 |
rcarrillocruz | all that can be deployed, are deployed | 16:29 |
rcarrillocruz | we have 3 machines with issues | 16:29 |
pabelanger | rcarrillocruz: everything is online? | 16:29 |
rcarrillocruz | that i have tickets for | 16:29 |
rcarrillocruz | everything that is ok, is online | 16:29 |
pabelanger | nice | 16:29 |
rcarrillocruz | check openstack-dev mailing list, i did a summary of the machines | 16:29 |
fungi | and i guess the new ansible wheel is churning correctly for them, they're in cacti now, et cetera? | 16:29 |
rcarrillocruz | sec | 16:29 |
rcarrillocruz | i'll link | 16:30 |
rcarrillocruz | http://lists.openstack.org/pipermail/openstack-dev/2016-September/102707.html | 16:30 |
jeblair_ | fungi: is there a wiki-dev01.openstack.org server without a forward dns record? | 16:30 |
*** sarob has quit IRC | 16:30 | |
rcarrillocruz | fungi: they are all in cacti | 16:30 |
fungi | jeblair_: it's the result of a launch-node.py --keep i'm trying to work through vcsrepo errors for | 16:31 |
fungi | jeblair: i'll be deleting it shortly and relaunching | 16:31 |
jeblair | fungi: ok. pabelanger is not the only one to get NDRs -- i just got one for him for that server :) | 16:31 |
fungi | jeblair: cute | 16:31 |
rcarrillocruz | i was hoping to get ALL fixed today, the lab said 'all is good', but out of 4 with issues just one i brought it back to life today | 16:31 |
rcarrillocruz | so we're down to 3 now with issues | 16:32 |
pabelanger | ya, compute016.vanilla.ic.openstack.org looks to be down right now | 16:32 |
jeblair | pabelanger, rcarrillocruz: i'm happy to help click through the spamhaus records for all the infra-cloud ips, or if we want to set up a smarthost and just do one or two, that should work too. | 16:32 |
rcarrillocruz | pabelanger: yeah, that one has a HD broken | 16:32 |
fungi | rcarrillocruz: pabelanger: do we have any good numbers on relative job runtimes and nondeterministic failures (if any) in our nodepool project in infra-cloud yet? | 16:32 |
rcarrillocruz | fungi: i think it may be too soon, just one day of real workload | 16:33 |
rcarrillocruz | i really want to see dsvm runs when there are more noisy neighbours in the computes | 16:33 |
*** Thelo_ has quit IRC | 16:33 | |
jeblair | fungi: the 2 graphs at the bottom say 'no datapoints' http://grafana.openstack.org/dashboard/db/nodepool-infra-cloud | 16:33 |
rcarrillocruz | cos in my initial test, a nova tempest full run took the same as osic, but you know, the VM has the entire compute for itself | 16:33 |
fungi | looking in grafana myself now, yes | 16:33 |
fungi | looks like it's all building and deleting? | 16:34 |
fungi | oh, i guess we have some in use | 16:34 |
pabelanger | yes | 16:34 |
rcarrillocruz | jeblair: in regards to the spamhaus thing, yeah, i think having a smarthost would be good | 16:34 |
*** sambetts is now known as sambetts|afk | 16:35 | |
pabelanger | fungi: jeblair: the other failure was related to DNS, we were just using google DNS, so we added unbound to infracloud this morning | 16:36 |
jeblair | pabelanger: i'm suspicious of 364622 | 16:37 |
jeblair | pabelanger: does it happen elsewhere, or just infra? | 16:37 |
jeblair | pabelanger: and what's *really* going on? i mean, it opened a connection, but then it was closed? | 16:38 |
pabelanger | jeblair: Ya, looks like randomly in rax, ovh, internap and bluebox too | 16:38 |
openstackgerrit | Emilien Macchi proposed openstack-infra/project-config: Revert "tripleo-ui: add missing jobs for release management" https://review.openstack.org/364452 | 16:38 |
fungi | what's the failure rate from that? frequent enough we could attempt to recreate it with openssh? | 16:38 |
fungi | and if it's a problem for paramiko, is it also going to happen to ansible? | 16:39 |
pabelanger | http://paste.openstack.org/show/565742/ EOFError from today | 16:39 |
fungi | does ansible (or maybe zuul-launcher) already have a similar workaround? | 16:39 |
mordred | fungi: ansible uses openssh, not paramiko | 16:40 |
*** Apoorva has joined #openstack-infra | 16:40 | |
*** yamahata has quit IRC | 16:40 | |
fungi | mordred: sure, which is why i asked whether there's any hope of us finding the underlying cause by testing with openssh | 16:40 |
mordred | fungi: good point | 16:40 |
jeblair | right, but i think what's being gotten at here is that some of the choices nodepool makes are about preventing bad hosts from making it to zuul | 16:40 |
*** cardeois has quit IRC | 16:40 | |
*** daemontool has quit IRC | 16:40 | |
jeblair | so is this really a situation where we *want* to help more things get through | 16:41 |
fungi | agreed, right now _assuming_ this eoferror indicates a problem node, then retry-spamming it into service could be detrimental to whatever job runs on it down the line | 16:41 |
jeblair | this is why i asked what's really going on | 16:42 |
jeblair | cause the other errors in there all have explanations for why we should ignore them (user not created yet, ssh not started yet, etc) | 16:42 |
pabelanger | right, I cannot answer that. I was hoping retry would better expose the actual issue | 16:42 |
fungi | yep, figuring out what is causing the eoferror paramiko is raising might help us figure out whether it's safe to press nodes exhibiting this behavior into service | 16:42 |
fungi | rather than papering over the failure with retries | 16:43 |
pabelanger | http://paste.openstack.org/show/565743/ is the traceback of the failure | 16:43 |
rcarrillocruz | k folks, gotta run to catch my son | 16:43 |
rcarrillocruz | later | 16:43 |
fungi | is there a way to get paramiko to provide more detail on the failure mode? | 16:43 |
fungi | aha, thanks | 16:43 |
rcarrillocruz | https://review.openstack.org/#/c/364397/ mordred , pabelanger , it passed tests | 16:43 |
rcarrillocruz | later | 16:43 |
pabelanger | fungi: I believe we could enable debug logs for that | 16:44 |
*** akshai has quit IRC | 16:45 | |
fungi | logging.getLogger('paramiko.transport').setLevel(logging.DEBUG) | 16:45 |
fungi | apparently | 16:45 |
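A minimal sketch of what turning on that debug logging could look like, assuming paramiko emits its transport messages through the standard `logging` module under the `paramiko.transport` logger name quoted above. The helper name `enable_paramiko_debug` and the log format are illustrative only and are not taken from nodepool.

```python
import logging
import sys


def enable_paramiko_debug(stream=sys.stderr):
    """Send DEBUG records from the 'paramiko.transport' logger to a stream.

    Hypothetical helper: only the logger name and the setLevel call come
    from the discussion; the handler and format are assumptions.
    """
    logger = logging.getLogger("paramiko.transport")
    logger.setLevel(logging.DEBUG)
    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(name)s %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

Calling `enable_paramiko_debug()` once at startup would be enough; whether the extra detail actually explains the EOFError is a separate question.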
*** zul has joined #openstack-infra | 16:45 | |
*** yamamoto has joined #openstack-infra | 16:46 | |
jeblair | fungi: yeah, though looking at the code, i'm not sure we'd learn much from that | 16:47 |
*** jamesdenton has quit IRC | 16:47 | |
fungi | https://github.com/paramiko/paramiko/issues/520 maybe? | 16:47 |
*** ilyashakhat_mobi has joined #openstack-infra | 16:48 | |
*** asettle has quit IRC | 16:49 | |
pabelanger | jeblair: was there talk of moving away from paramiko in nodepool? | 16:49 |
*** timello has quit IRC | 16:49 | |
jeblair | pabelanger: i don't recall? | 16:49 |
pabelanger | okay | 16:49 |
*** asettle has joined #openstack-infra | 16:49 | |
*** ilyashakhat_mobi has quit IRC | 16:49 | |
*** sarob has joined #openstack-infra | 16:50 | |
*** sputnik13_ has joined #openstack-infra | 16:50 | |
*** mhickey has quit IRC | 16:50 | |
Shrews | pabelanger: mordred: rcarrillocruz: fyi, ansible testing WG meeting happening in 9 min. i keep forgetting | 16:51 |
pabelanger | Shrews: Thanks | 16:51 |
*** amotoki has quit IRC | 16:51 | |
jeblair | pabelanger: i think if you want to chase this down, that's fine -- i would recommend you create a new log message for it though so you can track it | 16:52 |
fungi | worth noting, we're on paramiko 1.17.2 for nodepool.o.o | 16:53 |
pabelanger | jeblair: okay, I'll update the patch shortly | 16:53 |
jeblair | pabelanger: and log the exception, so we know where it's coming from. that way if we see multiple eoferrors from the same host, we'll know if they're all from the same spot | 16:53 |
pabelanger | jeblair: will do | 16:53 |
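The suggestion above (a distinct log message plus the logged exception) could be sketched roughly as follows. This is not nodepool's code: `connect_with_retry`, the `connect` callable, and the logger name are all hypothetical, and retrying at all is still subject to the concern raised earlier about pressing bad nodes into service.

```python
import logging
import time

# Hypothetical logger name, chosen only for this sketch.
log = logging.getLogger("nodepool.ssh_retry_sketch")


def connect_with_retry(connect, host, attempts=3, delay=1.0, sleep=time.sleep):
    """Call connect(host), retrying on EOFError with a dedicated log message.

    log.exception() records the traceback, so repeated EOFErrors from the
    same host can be compared to see whether they come from the same spot.
    """
    for attempt in range(1, attempts + 1):
        try:
            return connect(host)
        except EOFError:
            log.exception(
                "EOFError connecting to %s (attempt %d/%d)",
                host, attempt, attempts,
            )
            if attempt == attempts:
                raise
            sleep(delay)
```

Because the message is unique to this code path, it can be searched for on its own in the logs without being mixed in with the already-explained retry reasons.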
*** asettle has quit IRC | 16:54 | |
fungi | ugh. their repo has tags for 1.17.2 and v1.17.2 | 16:54 |
*** sarob has quit IRC | 16:54 | |
fungi | context: https://github.com/paramiko/paramiko/blob/1.17.2/paramiko/transport.py#L492 | 16:55 |
fungi | (that file looks to be the same under the v1.17.2 tag as well) | 16:56 |
*** derekh has quit IRC | 16:56 | |
*** drifterza has quit IRC | 16:56 | |
*** markvoelker has quit IRC | 16:58 | |
*** cardeois has joined #openstack-infra | 16:59 | |
*** esikache1 has quit IRC | 16:59 | |
pabelanger | does ssh-server restart after keys are generated? | 16:59 |
*** shashank_hegde has quit IRC | 16:59 | |
fungi | pabelanger: so what's interesting about the failures... none seem to be in osic | 16:59 |
fungi | even though osic is now by far the bulk of our volume | 17:00 |
clarkb | pabelanger: no it doesn't start at all until keys are generated | 17:00 |
*** tesseract- has quit IRC | 17:01 | |
fungi | no rax-iad either, but i could chalk that up to lack of a statistically significant sample | 17:01 |
fungi | however, if this were consistent across providers, i would expect to see lots in osic too | 17:01 |
*** yamahata has joined #openstack-infra | 17:01 | |
pabelanger | rax-iad is currently disabled however | 17:01 |
fungi | hah, that explains that one then | 17:01 |
pabelanger | Ya, but no osic in any logs on disk | 17:01 |
pabelanger | which should include ipv4 | 17:02 |
pabelanger | oh, maybe not | 17:02 |
fungi | right, i think we're only connecting from nodepool to osic nodes via ipv6 now | 17:03 |
pabelanger | 2016-08-22 is last log, I think we had ipv6 by then | 17:03 |
fungi | so the question is whether this is a v4-only issue, or an issue that manifests differently (and raises a different error condition) under v6 | 17:03 |
fungi | or an issue that doesn't affect osic for some other reasons unrelated to ipv4 vs ipv6 | 17:04 |
*** abregman has joined #openstack-infra | 17:04 | |
*** abregman|mtg has quit IRC | 17:06 | |
*** mwhahaha has joined #openstack-infra | 17:07 | |
*** tqtran has joined #openstack-infra | 17:07 | |
*** kdas_ is now known as kushal | 17:08 | |
*** kushal has quit IRC | 17:08 | |
*** kushal has joined #openstack-infra | 17:08 | |
*** nstolyarenko has joined #openstack-infra | 17:08 | |
*** tqtran has quit IRC | 17:12 | |
*** ilyashakhat_mobi has joined #openstack-infra | 17:12 | |
*** tonytan4ever has quit IRC | 17:16 | |
*** HeOS has quit IRC | 17:17 | |
pabelanger | mgagne: I think we are trying to determine how the cloud is performing right now. Average test runs, if anything is failing, etc | 17:18 |
mgagne | alright, fine with me | 17:18 |
*** rossella_s has quit IRC | 17:18 | |
mgagne | is there any dashboard/link I can read? | 17:18 |
clarkb | pabelanger: basically with socket activation systemd is going to listen on port 22 and accept connections for ssh before sshd is ready. Then when sshd is ready it will hand over control of the socket | 17:19 |
pabelanger | clarkb: let me extract the node type | 17:19 |
*** rossella_s has joined #openstack-infra | 17:19 | |
clarkb | so wondering if maybe tcp handshake happens then we have a long pause long enough to make ssh unhappy | 17:19 |
pabelanger | mgagne: mostly looking in logstash.o.o and nodepool logs atm | 17:20 |
*** ilyashakhat_mobi has quit IRC | 17:21 | |
*** nstolyarenko has quit IRC | 17:22 | |
clarkb | hrm glance wants to remove -2 perms from their core group? | 17:22 |
fungi | where did we get to with making nodepool logs public? i think we said we'd be comfortable with it once we finished the migration to shade? | 17:23 |
fungi | and that's done now, afaik | 17:23 |
pabelanger | clarkb: seems ubuntu-xenial and debian-jessie: http://paste.openstack.org/show/565751/ | 17:24 |
pabelanger | so you are on to something | 17:24 |
clarkb | I think we are still maybe waiting for swift bits to use ksa? I don't recall if that was required for the password sanitizing | 17:24 |
clarkb | though now all the swift stuff happens in the nodepool builder we could just not serve those logs | 17:24 |
clarkb | fungi: ^ | 17:24 |
fungi | ahh, right | 17:24 |
fungi | especially easy if we move the builder daemon to a separate server | 17:25 |
pabelanger | clarkb: ya, so that goes back to my question about ssh server being restarted. It sounds like what you are describing with systemd could cause issues with the socket | 17:27 |
pabelanger | if we connect early enough | 17:27 |
*** ddieterly is now known as ddieterly[away] | 17:27 | |
pabelanger | So, once https://review.openstack.org/#/c/364322/ is cleaned up, we should see a failure or 2, and then eventually ssh connection | 17:28 |
nikhil | heya.. there're a few glance patches in gate that seem stuck verifying? https://review.openstack.org/363838 https://review.openstack.org/363870 https://review.openstack.org/354332 | 17:28 |
nikhil | based on the status on zuul | 17:29 |
clarkb | In theory it should be fine because systemd just holds the fd but maybe there is a timeout or something more aggressive in paramiko | 17:29 |
openstackgerrit | James E. Blair proposed openstack-infra/puppet-nodepool: Enable mod_proxy when proxying status commands https://review.openstack.org/364478 | 17:29 |
nikhil | all of them are really really important for us to tag newton-3 today | 17:29 |
pabelanger | another option, could be to delay our ssh connections per cloud, with some sort of configuration option. | 17:29 |
jeblair | clarkb, pabelanger, fungi: ^ i just did that manually on the nodepool server | 17:29 |
nikhil | any help/pointer would be super useful! | 17:29 |
clarkb | nikhil: have you pulled up the console logs? | 17:29 |
jeblair | clarkb, pabelanger, fungi, mgagne: http://nodepool.openstack.org/image-list works now | 17:29 |
jeblair | as does http://nodepool.openstack.org/dib-image-list | 17:29 |
clarkb | nikhil: says its still running tempest | 17:30 |
*** ramishra has quit IRC | 17:30 | |
pabelanger | jeblair: excellent | 17:30 |
AJaeger | fnikil 363838 is still running... | 17:30 |
*** dteselkin has quit IRC | 17:30 | |
jeblair | sdague: you may find http://nodepool.openstack.org/image-list useful | 17:30 |
AJaeger | nikhil: sorry for typo ^ | 17:30 |
nikhil | AJaeger: clarkb : that's been running for a long time | 17:30 |
nikhil | and the other two seem done | 17:30 |
AJaeger | nikhil: gate-tempest-dsvm-neutron-full-ubuntu-xenial is running - did you see that? | 17:30 |
*** ramishra has joined #openstack-infra | 17:30 | |
fungi | 2016-09-01 17:30:47.614538 | {1} tempest.scenario.test_network_advanced_server_ops.TestNetworkAdvancedServerOps.test_server_connectivity_stop_start [165.255744s] ... ok | 17:30 |
nikhil | AJaeger: yeah, waiting for it for last 1-1.5 hrs | 17:31 |
*** kzaitsev_ws has quit IRC | 17:31 | |
*** kzaitsev_ws has joined #openstack-infra | 17:31 | |
*** tsufiev has quit IRC | 17:31 | |
*** katyafervent_awa has quit IRC | 17:31 | |
*** penguinolog has quit IRC | 17:32 | |
nikhil | what's the best path forward? (recheck won't work) so, wait for a bit more and bug folks then :) | 17:32 |
*** javeriak_ has quit IRC | 17:32 | |
*** e0ne has joined #openstack-infra | 17:32 | |
*** akshai has quit IRC | 17:33 | |
clarkb | nikhil: you'll have to debug why the job is slow | 17:33 |
clarkb | it is still doing stuff though seems like | 17:33 |
nikhil | clarkb: how do I get on this telnet link telnet:// ? | 17:33 |
fungi | looks like devstack setup took from 15:17:41 to 15:56:38 | 17:33 |
clarkb | nikhil: use telnet or nc to that ip address and port | 17:33 |
*** katyafervent_awa has joined #openstack-infra | 17:33 | |
nikhil | (that's what I get when I hover on that gate link) | 17:33 |
*** tphummel has joined #openstack-infra | 17:34 | |
clarkb | nikhil: if you are more adventurous there are ways to have your browser do things automatically but I haven't bothered | 17:34 |
*** igormarnat has joined #openstack-infra | 17:34 | |
*** cardeois_ has joined #openstack-infra | 17:34 | |
fungi | looks like it's just slow. for example the gap from when test_server_connectivity_rebuild reported and test_server_connectivity_resize reported was on the order of 3 minutes | 17:35 |
AJaeger | nikhil: telnet 19885 (or use nc with same arguments) | 17:35 |
nikhil | clarkb: gotcha, will use telnet | 17:35 |
*** ijw_ has joined #openstack-infra | 17:35 | |
nikhil | AJaeger: ty , just hopped on :) | 17:35 |
*** kaisers_ has joined #openstack-infra | 17:35 | |
*** javeriak has joined #openstack-infra | 17:36 | |
fungi | but it's entirely possible this job will continue until it reaches the job timeout and gets killed | 17:36 |
*** dteselkin has joined #openstack-infra | 17:36 | |
AJaeger | sdague: for https://review.openstack.org/#/c/363937 we should ask the Neutron folks, shouldn't we? armax, dougwig , please review | 17:36 |
* armax looks | 17:36 | |
nikhil | fungi: yeah, it's a change in config & tests are not expected to fail | 17:37 |
clarkb | I think those tests are the ones that tend to run at the end | 17:37 |
clarkb | but I haven't actually looked at the sorting recently | 17:37 |
AJaeger | Zara: did you merge anything yet on python-storyboardclient? Any dummy commit to get content published? | 17:37 |
openstackgerrit | Zara proposed openstack-infra/python-storyboardclient: Add due_dates https://review.openstack.org/345995 | 17:37 |
*** cardeois has quit IRC | 17:38 | |
openstackgerrit | Ben Nemec proposed openstack-infra/tripleo-ci: Add ipv6 nic-configs https://review.openstack.org/364479 | 17:38 |
AJaeger | do we want to merge the internap-mtl01 to 150 increase now? https://review.openstack.org/363984 ? | 17:38 |
Zara | Ajaeger: it's not merged yet but there's a patch over here: https://review.openstack.org/#/c/362878/ for docs | 17:38 |
fungi | the same job that ran on the two glance changes behind the slow one only took around an hour to complete, so there's probably something terribly wrong with the node this job is running on | 17:38 |
jeblair | clarkb: thx | 17:39 |
fungi | system load on that node is around 4 | 17:39 |
*** javeriak_ has joined #openstack-infra | 17:40 | |
*** dkehn has quit IRC | 17:40 | |
*** dkehn_ has quit IRC | 17:40 | |
nikhil | fungi: saw it succeed in 339 secs! | 17:40 |
*** kaisers_ has quit IRC | 17:40 | |
fungi | not an extreme amount of memory pressure | 17:41 |
pabelanger | AJaeger: still confirming if the cloud is ready for more nodes. Should know more in the next little bit | 17:41 |
*** javeriak has quit IRC | 17:41 | |
AJaeger | pabelanger: Mathieu Gagné commented "We're ready" on the review | 17:42 |
fungi | clarkb: around 5200 on all 8 processors listed in cpuinfo | 17:42 |
clarkb | pabelanger: is the current concern the ssh eof error? | 17:42 |
mat128 | mgagne: ^ | 17:42 |
*** dkehn has joined #openstack-infra | 17:42 | |
*** rbrndt has quit IRC | 17:42 | |
clarkb | fungi: thats more than my local machine! | 17:42 |
mgagne | AJaeger: pabelanger said " I think we are trying to determine how the cloud is performing right now. Average test runs, if anything is failing, etc" | 17:42 |
pabelanger | clarkb: for internap-mtl01? Just wanted to confirm job times are in line with what we expect, I haven't actually checked that yet. | 17:43 |
fungi | clarkb: i think the eoferror seems to be consistent across providers (sans osic) so probably not a concern for ramping up | 17:43 |
clarkb | pabelanger: gotcha | 17:43 |
clarkb | pabelanger: oh I remember things re the EOF error | 17:43 |
clarkb | pabelanger: launch node ran into that too when running the restart command | 17:43 |
clarkb | pabelanger: but in that case it's killing the services really fast. So maybe there is a restart of the service happening that I didn't expect that is closing very early connections | 17:44 |
AJaeger | project-config cores, could you review https://review.openstack.org/#/c/364417 for the release team, please? | 17:44 |
clarkb | pabelanger: you can actually reproduce that pretty easily by sshing into a host with systemd and running `reboot` | 17:44 |
*** nstolyarenko has joined #openstack-infra | 17:44 | |
pabelanger | clarkb: sdague: AJaeger: Looking at grafana for internap-mtl01, tempest tests are in line with nyj01. So, if everybody is on board, we can up the capacity I think | 17:44 |
mordred | nikhil: o hai | 17:44 |
nikhil | fungi: clarkb AJaeger : I noticed they merged after we'd a chat here. you guys have magic vision to make things work just by looking at'em!! | 17:45 |
nikhil | mordred: \o | 17:45 |
*** jtomasek is now known as jtomasek|afk | 17:45 | |
AJaeger | nikhil: we're all part of the magic team ;) | 17:46 |
mordred | nikhil: rcarrillocruz and I were talking about image import yesterday ... and I just wanted to confirm | 17:46 |
clarkb | pabelanger: so yes, I am suspecting some systemd behavior we may be tickling there | 17:46 |
mordred | nikhil: import_from_url is not a thing in v2? or it's a thing but only with tasks? | 17:46 |
fungi | clarkb: was there a systemd vs non-systemd split on the ssh failures then? | 17:46 |
mrhillsman | mordred fungi pabelanger clarkb for osic cloud8 the best solution is the dns suggestion from yesterday | 17:47 |
fungi | clarkb: in that case, yes, sounds highly likely | 17:47 |
pabelanger | clarkb: I think we are restarting SSH for some reason | 17:47 |
pabelanger | getting logs now | 17:47 |
mrhillsman | attaching public address to the VM directly looks like it is going to require quite a bit of work | 17:48 |
clarkb | pabelanger: though in theory restarting sshd doesn't kill existing connections... at least it didn't with upstart maybe this is new and exciting bugs | 17:48 |
pabelanger | clarkb: http://paste.openstack.org/show/565759/ that is from a random osic ubuntu-xenial server | 17:48 |
mrhillsman | i have to discuss with network folks since cloud8 is setup differently than cloud1 | 17:48 |
fungi | mrhillsman: it's an okay short-term solution, but long term it means that if we lose cloud1 for some reason then cloud8 will also effectively be dead to us | 17:48 |
mrhillsman | understood | 17:48 |
mrhillsman | makes total sense | 17:49 |
nikhil | mordred: import from url won't be a generic one. things will be rather predefined http://specs.openstack.org/openstack/glance-specs/specs/mitaka/approved/image-import/image-import-refactor.html#api-changes | 17:49 |
pabelanger | clarkb: I think it may be glean that kicks off the stop / start under systemd | 17:50 |
clarkb | pabelanger: aha it does a stop start after the reload | 17:50 |
*** cardeois_ is now known as cardeois | 17:50 | |
clarkb | pabelanger: ya I think that would explain it then as I have seen the same behavior with launch node on `reboot` | 17:50 |
nikhil | mordred: as per that value discovery call, and then the info about the what to provide is here http://specs.openstack.org/openstack/glance-specs/specs/mitaka/approved/image-import/image-import-refactor.html#format-discovery | 17:50 |
nikhil | mordred: which says, give me the container name if swift local or give me a stream of data if glance-direct | 17:50 |
pabelanger | So, maybe we should see how to make glean run before networking is started | 17:50 |
clarkb | pabelanger: we may be able to just edit glean to say before sshd | 17:51 |
mordred | nikhil: awesome. thanks. super helpful | 17:51 |
nikhil | mordred: so, once import refactor merges, that will be the case :) | 17:51 |
pabelanger | clarkb: right | 17:51 |
*** ijw_ has quit IRC | 17:51 | |
pabelanger | clarkb: let me get a new server up and play with it | 17:51 |
clarkb | pabelanger: kk | 17:51 |
armax | looking | 17:52 |
*** dkehn_ has joined #openstack-infra | 17:54 | |
*** esikache1 has joined #openstack-infra | 17:55 | |
jeblair | sdague, pabelanger: re https://review.openstack.org/364309 i share clarkb and mordred's suspicion that there might be something more subtle at play. ansible uses a persistent ssh connection, so the time/effort required to open a new channel should be greatly diminished. 10 seconds seems more than ample. it's also the case that we saw this sort of thing with jenkins too, which had a different approach to connectivity. having said ... | 17:55 |
*** shashank_hegde has joined #openstack-infra | 17:55 | |
jeblair | (also, it's worth noting that zuul automatically restarts jobs that hit that problem) | 17:55 |
openstackgerrit | Merged openstack-infra/devstack-gate: remove old tests https://review.openstack.org/360761 | 17:56 |
pabelanger | I still want to make ansible async ignore ssh failures and keep trying until the timeout is reached. Or some ignore ssh failure limit | 17:57 |
dhellmann | hey, folks, I think I have some logic wrong in the script that tries to propose upper constraint changes when we release libraries. | 17:57 |
jeblair | pabelanger: yeah, i think that would be a nice improvement | 17:57 |
dhellmann | in this log, it should be trying to check out the stable/mitaka branch to propose the commit there, but it doesn't find origin/stable/mitaka: http://logs.openstack.org/7a/7ad1cd3a0da7a95fb7c14cf9eeb9ae683c247efa/release-post/tag-releases/82a4a38/console.html#_2016-09-01_15_55_56_931716 | 17:57 |
*** harlowja_ has joined #openstack-infra | 17:58 | |
*** niska has joined #openstack-infra | 17:58 | |
openstackgerrit | Merged openstack-infra/storyboard: Don't allow users to subscribe to private worklists they can't see https://review.openstack.org/363776 | 17:59 |
*** harlowja has quit IRC | 18:00 | |
mordred | pabelanger: problem with 60s poll | 18:00 |
clarkb | dhellmann: I think that may be a subtle git behavior where it can't identify a unique thing called that because there may be a file or other item with the same name? | 18:00 |
*** ihrachys has quit IRC | 18:00 | |
mordred | pabelanger: is that then there is a 60s lag between a job finishing and ansible knowing that | 18:00 |
dhellmann | clarkb : maybe? I'm pretty sure there is no file called origin/stable/mitaka though? | 18:01 |
mordred | pabelanger, jeblair: in the ansible zuul work I want to do next week - which involves forward porting 2.5 to 3 - I also want to investigate our own action plugin | 18:01 |
clarkb | dhellmann: right above where you linked you can see where it says stable/mitaka -> origin/stable/mitaka and its up to date | 18:01 |
mordred | and that action plugin should be able to be MUCH smarter about how async happens | 18:01 |
mordred | because we know our intent | 18:01 |
* clarkb tries to reproduce locally | 18:01 | |
dhellmann | clarkb : I've run into similar issues where the local branch name is not what I expect because of something about how we clone repos that I don't understand. | 18:01 |
mordred | whereas the async module from ansible has to be more generic, which means it can't respond | 18:01 |
dhellmann | clarkb : maybe I should change the script to just look for stable/mitaka and not origin/stable/mitaka? it used to be that the shorter name wouldn't exist locally, though | 18:02 |
Shrews | mordred: that's exciting | 18:02 |
jeblair | pabelanger: well, some zuul_runner things are very fast; i'd hate to have to wait 60 seconds for an 'echo' statement | 18:02 |
clarkb | dhellmann: it works locally :/ | 18:02 |
dhellmann | clarkb : yeah | 18:02 |
pabelanger | mordred: exciting | 18:02 |
pabelanger | jeblair: ya, down side | 18:02 |
sdague | jeblair: zuul did not restart this job | 18:03 |
pabelanger | if only we could have linear increasing polling | 18:03 |
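The trade-off discussed above (a fixed 60s poll delays fast tasks, a short poll is chatty for long ones) could be eased with a poll interval that grows linearly up to a cap. The following is only a minimal sketch of that idea; `linear_intervals` and `poll_until_done` are hypothetical helper names and are not part of Ansible or Zuul.

```python
import itertools
from typing import Callable, Iterator, Optional


def linear_intervals(start: float = 1.0, step: float = 1.0,
                     cap: float = 60.0) -> Iterator[float]:
    """Yield poll delays that grow linearly: start, start+step, ... capped at cap."""
    for n in itertools.count():
        yield min(start + n * step, cap)


def poll_until_done(check: Callable[[], bool], timeout: float,
                    sleep: Callable[[float], None],
                    intervals: Optional[Iterator[float]] = None) -> bool:
    """Call check() until it returns True or the next delay would exceed timeout."""
    waited = 0.0
    for delay in intervals or linear_intervals():
        if check():
            return True
        if waited + delay > timeout:
            return False
        sleep(delay)
        waited += delay
    return False
```

With the defaults, an `echo`-sized task is noticed after about a second, while a long job settles at one check per minute. Passing `time.sleep` as `sleep` gives real waiting; a stub makes the function easy to test.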
jeblair | sdague: i did not see an error report for it | 18:03 |
dhellmann | fungi : ok. I'm looking for the full remote name and not finding it as origin/stable/mitaka. would it have a different name for some reason? | 18:03 |
mordred | oh. crap. I was going to fix a bug for jeblair today | 18:04 |
*** _nadya_ has quit IRC | 18:04 | |
sdague | jeblair: we force repromoted it because it was the critical patch to fix multinode | 18:04 |
mordred | jeblair: do you remember what the bug was that I was going to track down today? | 18:04 |
fungi | dhellmann: you also won't have an origin/stable/mitaka until after a remote update | 18:04 |
fungi | pretty sure | 18:04 |
fungi | testing now | 18:04 |
sdague | it had failed that job, reset the gate, taken everything off of it | 18:04 |
sdague | but we needed that patch | 18:04 |
dhellmann | fungi : this is from the release.sh script doing its own call to git clone to check out the requirements repository | 18:04 |
dhellmann | fungi : and then it does "git fetch -v --tags" and the output from that includes a bunch of branch names, including stable/mitaka and origin/stable/mitaka | 18:05 |
jeblair | sdague: http://logs.openstack.org/66/364266/2/gate/gate-grenade-dsvm-neutron-multinode/63498bd/console.html this one? | 18:05 |
dhellmann | unfortunately that's all in a temporary directory so it's no longer there to examine | 18:05 |
fungi | dhellmann: yeah, just confirmed, if i `git clone ...` the keystone repo, and then cd into it, `git branch -v` only lists "master" | 18:05 |
sdague | jeblair: yes | 18:05 |
*** claudiub has quit IRC | 18:05 | |
clarkb | fungi: ya but the job does git fetch -v --tags first which seems to populate the things | 18:06 |
openstackgerrit | Gabriele Cerami proposed openstack-infra/tripleo-ci: Add IPv6 network configuration for ipv6 job types https://review.openstack.org/363674 | 18:06 |
clarkb | fungi: at least when I do the same for requirements that git show works locally | 18:06 |
dhellmann | fungi , clarkb : I can add a "git remote update" if you think that would help, but I thought the fetch was more or less doing that? | 18:06 |
fungi | clarkb: well, except git branch -v still only lists my local master after i do that | 18:06 |
clarkb | fungi: see git branch -a | 18:07 |
fungi | i wanted git branch -a | 18:07 |
Zara | AJaeger: ahaha, thanks. I'll fix it. :) | 18:07 |
fungi | yep | 18:07 |
fungi | so even immediately after a git clone, git branch -a actually has all the remote branches for the remote i cloned from | 18:07 |
jeblair | pabelanger, mordred, sdague: hrm. ansible did exit with exit code 1, not 3. | 18:08 |
*** salv-orl_ has joined #openstack-infra | 18:08 | |
dhellmann | clarkb , fungi : http://paste.openstack.org/show/565765/ | 18:08 |
AJaeger | Zara: and sorry for wrong guidance on ``code`` | 18:08 |
pabelanger | jeblair: yes, because failed=1 | 18:08 |
*** sshnaidm is now known as sshnaidm|afk | 18:08 | |
clarkb | dhellmann: I think I know what the issue is | 18:08 |
jeblair | pabelanger: why did that end up as a failure? | 18:09 |
clarkb | dhellmann: its a git repo in a git repo | 18:09 |
*** Na3iL has quit IRC | 18:09 | |
clarkb | dhellmann: can you confirm that is how the job is setting the tree up? It seems that way from the log | 18:09 |
jeblair | pabelanger: any chance it's because of the block/rescue thing? | 18:09 |
dhellmann | clarkb : oh! could be | 18:09 |
dhellmann | it does do that, yes | 18:09 |
fungi | ahh, right, git has some funky behaviors around git inside git | 18:10 |
dhellmann | that could well be | 18:10 |
dhellmann | it should be using our fancy tmp dir stuff, let me see why that isn't | 18:10 |
dhellmann | clarkb : it does seem to be using a tmpdir: http://logs.openstack.org/7a/7ad1cd3a0da7a95fb7c14cf9eeb9ae683c247efa/release-post/tag-releases/82a4a38/console.html#_2016-09-01_15_55_54_084809 | 18:10 |
dhellmann | oh, except that temporary directory is inside the workspace | 18:11 |
pabelanger | jeblair: I don't fully understand why. When I last tried to debug this, I considered our usage of the failed task the issue, but it might also be possible that something the async task is doing is the cause | 18:11 |
dhellmann | which is a git repo | 18:11 |
*** maishsk has joined #openstack-infra | 18:11 | |
*** salv-orlando has quit IRC | 18:11 | |
dhellmann | ok, let me see if I can fix that | 18:11 |
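Whether or not nesting turns out to be the real cause, a release script can guard against accidentally creating its scratch clone inside another git work tree. This is a minimal sketch under that assumption; `enclosing_git_repo` and `make_scratch_dir` are hypothetical names, not functions from the release scripts.

```python
import os
import tempfile
from pathlib import Path
from typing import Optional


def enclosing_git_repo(path: str) -> Optional[Path]:
    """Return the nearest directory at or above path that has a .git entry, else None."""
    current = Path(path).resolve()
    for candidate in (current, *current.parents):
        # .git may be a directory (normal clone) or a file (worktree/submodule).
        if (candidate / ".git").exists():
            return candidate
    return None


def make_scratch_dir(prefix: str = "release-") -> str:
    """Create a scratch dir under $TMPDIR (or the system default) and refuse nested repos."""
    scratch = tempfile.mkdtemp(prefix=prefix, dir=os.environ.get("TMPDIR"))
    parent_repo = enclosing_git_repo(scratch)
    if parent_repo is not None:
        raise RuntimeError(f"{scratch} is inside git repo {parent_repo}")
    return scratch
```

Failing loudly here is a design choice: a clone nested in the job workspace would otherwise only show up later as confusing branch-resolution errors.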
pabelanger | jeblair: but when we get unreachable=1, we still called the fail task, which leads me to think using fail is not the problem | 18:11 |
* clarkb does a quick test | 18:12 | |
AJaeger | python experts, could you lend me a hand, please? See https://review.openstack.org/#/c/362436 - and my comment from 5:30 this morning. Why is pbr freeze showing openstackdocstheme==1.5.0 instead of something like openstackdocstheme==1.5.1.dev2 ? | 18:12 |
jeblair | pabelanger: i'm looking for the cases where we actually get exit=3 | 18:12 |
*** waht has joined #openstack-infra | 18:12 | |
fungi | dhellmann: separate (but related) note... we did eventually get /usr/zuul-env/bin/zuul-cloner onto the signing node so that script can likely be simplified again when you're ready to hack on that | 18:13 |
jeblair | pabelanger: and it looks like they are ones where we get an ssh error in zuul_runner, but then we *also* get an ssh error in the zuul_log in the rescue block.... | 18:13 |
dhellmann | fungi : ok, cool, I'll put that on the ocata list | 18:13 |
clarkb | hrm at least my local git knows how to handle that (I am running really new git though) | 18:13 |
AJaeger | fungi, dhellmann : But we don't have a local cache, correct? | 18:13 |
fungi | AJaeger: correct | 18:13 |
jeblair | pabelanger: that makes me more suspicious that the block/rescue thing is converting connection errors from inside the block into failures, *unless* there is also a connection failure in the rescue. | 18:14 |
jeblair | mordred, Shrews: ^ | 18:14 |
openstackgerrit | Merged openstack-infra/project-config: Add Ironic UI gerritbot to #openstack-ironic https://review.openstack.org/364347 | 18:15 |
fungi | AJaeger: it would help if we had tox logs collected from that job | 18:15 |
pabelanger | https://github.com/ansible/ansible/blob/bd68c324cebce599ff07d6fd90c36a224581e065/lib/ansible/plugins/connection/ssh.py#L603 | 18:15 |
pabelanger | seems to imply a problem with ssh client, 255 | 18:15 |
AJaeger | fungi, we have - see the recheck result | 18:16 |
*** ijw has joined #openstack-infra | 18:16 | |
fungi | AJaeger: i don't immediately see any indication of you using edit-constraints in that job | 18:16 |
clarkb | AJaeger: I remember reviewing that change I Think :) | 18:16 |
fungi | AJaeger: so you're probably running into http://git.openstack.org/cgit/openstack/requirements/tree/upper-constraints.txt#n196 forcing you to the constrained version? | 18:16 |
jeblair | pabelanger, Shrews: i'd like to back-burner the ssh failures for a moment and focus on the exit code. | 18:16 |
*** senk has joined #openstack-infra | 18:16 | |
AJaeger | fungi, http://logs.openstack.org/36/362436/4/check/gate-openstackdocstheme-api-ref/22ac35d/ | 18:16 |
pabelanger | ack | 18:16 |
*** awayne has quit IRC | 18:17 | |
AJaeger | fungi, but I'm not using constraints in that tox.ini | 18:17 |
*** tonytan4ever has quit IRC | 18:17 | |
AJaeger | fungi, I agree, if that repo would use constraints, then we would need edit-constraints... | 18:18 |
openstackgerrit | Doug Hellmann proposed openstack-infra/project-config: put release temporary directories under $TMPDIR https://review.openstack.org/364489 | 18:18 |
fungi | AJaeger: yeah, it's not constraints... i see it | 18:18 |
fungi | AJaeger: http://logs.openstack.org/36/362436/4/check/gate-openstackdocstheme-api-ref/22ac35d/tox/api-ref-1.log.txt | 18:18 |
mrhillsman | fungi mordred clarkb pabelanger - spoke with network folks and it is possible but will take some time | 18:18 |
AJaeger | fungi, clarkb , I run tox -e api-ref locally and got "openstackdocstheme==1.5.1.dev2 # git sha 670fbd8" in the freeze... | 18:18 |
dhellmann | clarkb : I need to test ^^ locally but I don't have a lib that's going to trigger a stable branch requirements update | 18:18 |
fungi | AJaeger: Collecting openstackdocstheme>=1.4.0 (from os-api-ref>=1.0.0->-r /home/jenkins/workspace/gate-openstackdocstheme-api-ref/test-requirements.txt (line 12)) | 18:18 |
mrhillsman | roll with short-term and i will respond when long-term has been implemented | 18:19 |
clarkb | dhellmann: you should be able to just invent one and remove it from your repo when done | 18:19 |
fungi | dhellmann: i think you uploaded right when openstackgerrit was restarting for a config update | 18:19 |
*** senk has quit IRC | 18:19 | |
clarkb | dhellmann: maybe use a temp repo so you don't risk mixing it up with the real world | 18:19 |
fungi | dhellmann: oh, you mean 364489 | 18:19 |
fungi | i didn't scroll back far enough, sorry | 18:20 |
*** pvaneck has joined #openstack-infra | 18:20 | |
*** maishsk has quit IRC | 18:20 | |
AJaeger | fungi, so os-api-ref forces the downgrade? ;( | 18:20 |
mordred | mrhillsman: awesome! thanks! | 18:20 |
AJaeger | fungi, how can I avoid that? | 18:20 |
dhellmann | clarkb , fungi : I tested by re-releasing muranoclient and it did not work. I'll keep tweaking locally | 18:21 |
jeblair | mordred: it was shade caching images and flavors | 18:21 |
fungi | AJaeger: when test-requirements gets installed by tox, os-api-ref>=1.0.0 gets installed depending on openstackdocstheme>=1.4.0 which triggers a download from pypi | 18:21 |
fungi | AJaeger: looks like pip install -U may be at fault? | 18:21 |
fungi | i honestly can't remember what our position is now on whether install_command should be pip install with -U or without | 18:22 |
* AJaeger removes -U and will test that | 18:23 | |
openstackgerrit | Emilien Macchi proposed openstack-infra/tripleo-ci: pingtest: run 'openstack stack failures list' when failure https://review.openstack.org/363918 | 18:23 |
AJaeger | thanks, fungi | 18:23 |
dhellmann | clarkb, fungi : I found it. I was being bone-headed. | 18:23 |
*** piet has joined #openstack-infra | 18:24 | |
AJaeger | I pushed https://review.openstack.org/364492 and will check it tomorrow. | 18:25 |
clarkb | the reason we had/have -U is that pre-constraints you needed it to get the requirements updated if things otherwise fit into the reqs ranges | 18:25 |
clarkb | with constraints you will install the constraints version regardless every single time | 18:25 |
clarkb | The small gap without constraints and without -U is setuptools I think. But updating it during the main install run doesn't actually use the new version of setuptools so thats mostly a noop | 18:26 |
AJaeger | ah. Perhaps I should use constraints instead ;) | 18:26 |
fungi | well, also we had it to deal with the bad old days where lots of projects were using system site-packages in their tox virtualenvs and we had nodes with a bunch of crufty old distro-packaged python libs installed on them | 18:26 |
AJaeger | thanks, clarkb | 18:26 |
*** abregman|mtg is now known as abregman | 18:28 | |
* AJaeger congratulates Zara on completing the RST master class ;) | 18:28 | |
mordred | jeblair: thank you! | 18:28 |
dhellmann | fungi : since the issue is in my cloning function, I'm going to switch it to use zuul-cloner | 18:29 |
AJaeger | fungi, it did not help, see http://logs.openstack.org/92/364492/1/check/gate-openstackdocstheme-api-ref/72a739d/console.html | 18:29 |
AJaeger | http://logs.openstack.org/92/364492/1/check/gate-openstackdocstheme-api-ref/72a739d/tox/api-ref-2.log.txt - is there a version screwup? | 18:30 |
*** mtanino has joined #openstack-infra | 18:31 | |
*** harlowja_ has quit IRC | 18:31 | |
*** harlowja has joined #openstack-infra | 18:32 | |
*** mtanino__ has quit IRC | 18:33 | |
AJaeger | yep ;/ | 18:33 |
openstackgerrit | Doug Hellmann proposed openstack-infra/project-config: fix branch handling in clone_repo https://review.openstack.org/364497 | 18:34 |
*** nstolyarenko has quit IRC | 18:34 | |
*** ddieterly[away] is now known as ddieterly | 18:34 | |
dhellmann | fungi, clarkb : ok, that should do it ^^ | 18:34 |
*** _nadya_ has quit IRC | 18:34 | |
*** ijw has quit IRC | 18:35 | |
Zara | AJaeger: =D thanks! | 18:35 |
*** Thelo_ has quit IRC | 18:35 | |
* AJaeger tries now constraints... | 18:35 | |
AJaeger | fungi,done already - https://review.openstack.org/364499 | 18:36 |
*** akshai has joined #openstack-infra | 18:37 | |
*** nstolyarenko has joined #openstack-infra | 18:37 | |
fungi | AJaeger: i wonder if you just need tox.skipdist=True like at http://git.openstack.org/cgit/openstack-dev/cookiecutter/tree/%7b%7bcookiecutter.repo_name%7d%7d/tox.ini#n4 | 18:38 |
fungi | iirc there are some odd interactions between skipdist and usedevelop | 18:38 |
fungi | mordred or dhellmann probably remember more clearly | 18:39 |
*** shardy is now known as shardy_afk | 18:40 | |
dhellmann | I think we usually set both of those, but I think just because building the dist is a waste of time if you're not going to install from it | 18:40 |
mordred | I do not - but I do know that the general intent is to set both | 18:41 |
mordred | yah | 18:41 |
AJaeger | ok, I'll try - thanks. | 18:41 |
clarkb | and then people started using symlinks | 18:42 |
fungi | something definitely seems to be causing tox to force it to the (cached?) wheel rather than the git checkout | 18:42 |
*** salv-orl_ has quit IRC | 18:42 | |
*** nstolyarenko has quit IRC | 18:43 | |
fungi | mordred: the very end of the tox log here is especially confusing... http://logs.openstack.org/92/364492/1/check/gate-openstackdocstheme-api-ref/72a739d/tox/api-ref-2.log.txt | 18:43 |
mordred | fungi: WOW | 18:44 |
jeblair | mordred, pabelanger, Shrews: i have confirmed with synthetic testing that it's neither the block/rescue, nor zuul_runner that's causing the ssh errors to be failures (exit code 1) rather than unreachable errors (exit 3). it seems to be a behavior of the async module -- if it can't connect at the start, it's 'unreachable'. if it can't connect for one of its poll checks, it's a failure. | 18:44 |
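The consequence of that finding can be sketched as a small classifier. The exit codes below are the ones reported in this conversation (1 for failure, 3 for unreachable), not values checked against Ansible documentation, and the retry policy and `classify_exit` name are hypothetical.

```python
from enum import Enum


class Outcome(Enum):
    SUCCESS = "success"
    FAILURE = "failure"
    RETRY = "retry"


# Exit codes as reported in the discussion; treat them as assumptions.
EXIT_OK = 0
EXIT_FAILED = 1
EXIT_UNREACHABLE = 3


def classify_exit(code: int) -> Outcome:
    """Map a playbook exit code to what a launcher might do next."""
    if code == EXIT_OK:
        return Outcome.SUCCESS
    if code == EXIT_UNREACHABLE:
        # The host could not be reached at the start: rerunning is safe.
        return Outcome.RETRY
    # Anything else is reported as a job failure. A connection lost during
    # an async poll also lands here (exit 1), so it cannot be told apart
    # from a genuine test failure by exit code alone.
    return Outcome.FAILURE
```

This is why a job that lost its connection mid-poll would not be restarted automatically: by the time the exit code is inspected, the distinction is already gone.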
*** kzaitsev_ws has joined #openstack-infra | 18:45 | |
fungi | mordred: if you back up to api-ref-1.log.txt you'll see it previously pulled in a wheel of 1.5.0 because there's a circular (test) dependency of openstackdocstheme on itself (via os-api-ref) | 18:45 |
fungi | so there will be a 1.5.0 wheel in the cache at that point | 18:46 |
mordred | fungi: but why would that matter :( | 18:46 |
*** Thelo_ has joined #openstack-infra | 18:46 | |
fungi | i'm just stretching for odd corner cases that might be exposing a bug we don't normally see | 18:46 |
mordred | fungi: oh - totally - sorry, it was a rhetorical head-against-desk question | 18:47 |
fungi | yeah, no idea whether that has anything to do with the problem, but obviously pip shouldn't be resolving setup_requires for 1.5.1.dev2 and then end by claiming to have installed 1.5.0 | 18:48 |
*** annegent_ has joined #openstack-infra | 18:49 | |
AJaeger | still wrong ;( http://logs.openstack.org/92/364492/2/check/gate-openstackdocstheme-pep8-ubuntu-xenial/66f399e/ | 18:49 |
fungi | er, install_requires i guess | 18:49 |
*** ijw has joined #openstack-infra | 18:49 | |
*** hasharAway is now known as hashar | 18:49 | |
fungi | AJaeger: was that after switching to constraints, or adding skipdist=true? | 18:50 |
mordred | OH | 18:50 |
mordred | constraints | 18:50 |
AJaeger | skipdist or skip*s*dist? | 18:50 |
* AJaeger has two changes ;) | 18:50 | |
fungi | mordred: no constraints at play in the log i linked for you | 18:50 |
*** david-lyle has joined #openstack-infra | 18:50 | |
mordred | oh | 18:50 |
mordred | darn | 18:50 |
openstackgerrit | Monty Taylor proposed openstack-infra/shade: Batch calls to list_floating_ips https://review.openstack.org/364508 | 18:51 |
jeblair | pabelanger, sdague: i believe we might be able to see a little more of what ansible is doing with the ssh connections and perhaps ascertain the likelihood of a connection timeout increase being effective if we run with verbose logs for a bit. i will enable that on one of the launchers and see if we can catch an error before we run out of disk. | 18:51 |
AJaeger | fungi, skipsdist is it, isn't it? | 18:51 |
AJaeger | that's what cookiecutter uses | 18:51 |
fungi | mordred: you can see at the top of the log it only ran `pip install -e .` | 18:51 |
mordred | fungi: yah. that's excessively weird | 18:51 |
AJaeger | I'm talking about https://review.openstack.org/364492 | 18:51 |
mordred | fungi: I kind of want to invoke our friendly pip human | 18:52 |
*** annegentle has quit IRC | 18:52 | |
AJaeger | no worries, fungi | 18:52 |
jeblair | pabelanger: actually, i'm going to do it on all the launchers to try to catch this faster | 18:53 |
*** waht has quit IRC | 18:54 | |
*** ddieterly has quit IRC | 18:55 | |
*** abregman has quit IRC | 18:56 | |
*** mriedem has quit IRC | 18:56 | |
pabelanger | jeblair: okay | 18:56 |
*** akshai has quit IRC | 18:57 | |
AJaeger | using constraints: http://logs.openstack.org/99/364499/3/check/gate-openstackdocstheme-pep8-ubuntu-xenial/367d855/ | 18:57 |
AJaeger | "openstackdocstheme==1.5.1.dev3 # git sha f3782e1" | 18:58 |
AJaeger | that looks finally ok... | 18:58 |
AJaeger | will do | 18:58 |
*** kien-ha has quit IRC | 18:59 | |
fungi | git describe says i've got a commit newer than 1.5.0 but pip list is saying "openstackdocstheme (1.5.0, /home/fungi/work/openstack/openstack/openstackdocstheme/.tox/venv/lib/python2.7/site-packages)" | 18:59 |
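The mismatch described here can be checked mechanically: if `git describe` shows commits past the last tag, a source install should report a development version rather than the plain release. The sketch below assumes pbr-style `.devN` version strings as seen earlier in this log (`1.5.1.dev2`); the function names are hypothetical.

```python
import re

# Matches the "<tag>-<commits ahead>-g<short sha>" form of `git describe`.
DESCRIBE_RE = re.compile(r"^(?P<tag>.+)-(?P<ahead>\d+)-g(?P<sha>[0-9a-f]+)$")


def commits_past_tag(describe_output: str) -> int:
    """Return how many commits the checkout is past its nearest tag (0 if exactly on it)."""
    match = DESCRIBE_RE.match(describe_output.strip())
    return int(match.group("ahead")) if match else 0


def looks_like_stale_release(installed_version: str, describe_output: str) -> bool:
    """True when the checkout is ahead of its tag but the installed version has no .dev part."""
    return commits_past_tag(describe_output) > 0 and ".dev" not in installed_version
```

For example, `looks_like_stale_release("1.5.0", "1.5.0-2-g670fbd8")` flags the situation where a released wheel was installed in place of the checkout.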
AJaeger | fungi, http://paste.openstack.org/show/565858/ | 18:59 |
fungi | AJaeger: was that with constraints or no? | 18:59 |
fungi | weird. that's not at all what i'm getting | 19:00 |
AJaeger | clean tree | 19:00 |
*** yaume_ has quit IRC | 19:00 | |
AJaeger | stranger and stranger ;( | 19:00 |
* AJaeger double checks the tree | 19:00 | |
AJaeger | fungi, you have an old tree - we released 1.5.1 | 19:01 |
*** rbrndt has joined #openstack-infra | 19:01 | |
AJaeger | wrong, other project. 1.5.0 is last tag. | 19:02 |
fungi | AJaeger: strangely, remote update isn't picking it up for me | 19:02 |
AJaeger | Sorry, mixed up releases | 19:02 |
AJaeger | you have pip 8.1.2, I use 7.1.2 | 19:02 |
*** kushal has quit IRC | 19:03 | |
fungi | AJaeger: yeah, i'm using tox 2.3.1 and virtualenv 15.0.3 | 19:04 |
pabelanger | clarkb: after removing After=network.target from glean, openssh-server just reloads, not stop / start | 19:04 |
pabelanger | clarkb: I am going to build an image and see if things still work | 19:04 |
clarkb | pabelanger: I think it can still happen after if you don't have an explicit before | 19:04 |
clarkb | pabelanger: I think it might be better to have an explicit Before sshd | 19:04 |
*** ddieterly has joined #openstack-infra | 19:05 | |
clarkb | pabelanger: but good to know we can manipulate glean's unit file to have it not stop/start | 19:05 |
pabelanger | clarkb: sure, we can do that too | 19:05 |
pabelanger | clarkb: also removing After=network.target fixes the dependency cycle that systemd complains about | 19:05 |
fungi | AJaeger: right, i'm starting to wonder if this is a regression in virtualenv/pip/tox somewhere | 19:05 |
clarkb | pabelanger: huh, wasn't that what we had to add to the urandom fixer unit file to make it work? | 19:06 |
pabelanger | clarkb: FWIW: we do have Before=network-pre.target, which should protect before ssh starts | 19:06 |
fungi | AJaeger: yeah, i think there's a bug... i'm going to try to bisect a few tools | 19:06 |
pabelanger | clarkb: yes, and now that I think more about it, it is also wrong. So we need to patch both | 19:06 |
AJaeger | fungi, thanks a lot! | 19:07 |
fungi | AJaeger: downgrading to virtualenv 14.0.2 gets me the (presumably proper) behavior you're seeing locally | 19:08 |
clarkb | pabelanger: aha gotcha | 19:08 |
fungi | AJaeger: and 15.0.0 has the broken behavior | 19:08 |
AJaeger | So, wouldn't this cause quite some havoc in the gate? We could test wrong things here ;( | 19:09 |
openstackgerrit | Paul Belanger proposed openstack-infra/glean: Remove After=network.target dependency cycle https://review.openstack.org/364516 | 19:09 |
pabelanger | clarkb: ^ | 19:09 |
pabelanger | like I said, going to do a quick build and test the image | 19:10 |
clarkb | pabelanger: cool I will try to test a local build of that too | 19:10 |
clarkb | pabelanger: do you know if there is an easy way to make simple-init in dib pull that version of glean? | 19:10 |
pabelanger | clarkb: not sure, haven't tried | 19:10 |
AJaeger | fungi, will you file a bug? Should we block that virtualenv version? | 19:10 |
mordred | clarkb: uhm ... I think so | 19:11 |
AJaeger | Sorry, I have to leave in a few minutes and call it a day ;( | 19:11 |
clarkb | hrm except my local virsh thing appears to have been broken by an update \o/ | 19:11 |
mordred | greghaynes: ^^ how do we build with dib using simple-init from local source dir? | 19:11 |
*** ilyashakhat_mobi has joined #openstack-infra | 19:12 | |
greghaynes | export DIB_REPOLOCATION_glean=/path/to/glean | 19:12 |
mordred | clarkb: see - greghaynes continues to be magical pony | 19:12 |
openstackgerrit | Paul Belanger proposed openstack-infra/project-config: Remove After=network.target from initialize-urandom service https://review.openstack.org/364517 | 19:12 |
pabelanger | clarkb: and urandom fix^ | 19:13 |
mordred | pabelanger: nice catch, btw | 19:13 |
clarkb | pabelanger: cool I am going to attempt to get my local virt setup working again so I can test an image with both of those things in it | 19:13 |
greghaynes | mordred: clarkb one caveat is I think that will grab master of that repo, so you might also want to set DIB_REPOREF_glean=some_ref | 19:13 |
*** harlowja_ has joined #openstack-infra | 19:13 | |
pabelanger | mordred: Ya, finally getting to optimize our nodepool launches. | 19:14 |
*** sarob has joined #openstack-infra | 19:14 | |
*** sarob has quit IRC | 19:15 | |
* AJaeger waves good bye | 19:16 | |
*** tonytan4ever has quit IRC | 19:16 | |
*** harlowja has quit IRC | 19:17 | |
*** eggshell has joined #openstack-infra | 19:18 | |
fungi | have a good evening AJaeger | 19:18 |
openstackgerrit | greghaynes proposed openstack-infra/irc-meetings: Add diskimage-builder meeting agenda https://review.openstack.org/364519 | 19:19 |
*** sarob has joined #openstack-infra | 19:21 | |
*** waht has joined #openstack-infra | 19:21 | |
rcarrillocruz | Shrews: sigh, my wife had split shift today , could not attend ansible testing meeting | 19:22 |
rcarrillocruz | i'll check chat logs | 19:22 |
*** _sarob has joined #openstack-infra | 19:22 | |
sdague | jeblair: cool | 19:22 |
sdague | jeblair: it would also be good if the ansible logs got into elastic search | 19:23 |
sdague | to help look for patterns | 19:23 |
*** salv-orlando has joined #openstack-infra | 19:23 | |
nikhil | rcarrillocruz: aye | 19:25 |
*** spzala has quit IRC | 19:25 | |
*** spzala has joined #openstack-infra | 19:26 | |
fungi | urgh, something in one of the virtualenv 14.0.x/pip 6.0.x versions horked up my wheel cache | 19:29 |
fungi | this makes bisection decidedly more complicated | 19:29 |
fungi | er, pip 8.0.x i mean | 19:30 |
*** ijw has quit IRC | 19:31 | |
*** ijw has joined #openstack-infra | 19:32 | |
*** annegent_ has quit IRC | 19:32 | |
fungi | the release history for 14.0.6 includes "Upgrade setuptools to 20.0" and "Upgrade wheel to 0.29.0" | 19:34 |
*** _nadya_ has joined #openstack-infra | 19:35 | |
fungi | so we can consider this as probably either a regression between setuptools 19.6.2 and 20.0 or wheel 0.26.0 and 0.29.0 | 19:36 |
*** tphummel has quit IRC | 19:37 | |
*** sdague has quit IRC | 19:37 | |
*** _nadya_ has quit IRC | 19:40 | |
*** ddieterly is now known as ddieterly[away] | 19:41 | |
*** vhosakot has quit IRC | 19:41 | |
fungi | scarily, the official documentation for setuptools only has up through 25.1.3 in their included changelog | 19:43 |
*** nstolyarenko has joined #openstack-infra | 19:43 | |
fungi | oh, good, the CHANGES.rst in their git repo is up to date at least | 19:44 |
*** tongli has quit IRC | 19:44 | |
mat128 | I have the answer to that question | 19:45 |
mat128 | fungi: ^ | 19:45 |
mat128 | and if you have a newer version in your wheel cache, it's going to be used | 19:45 |
mat128 | leading to confusion and virtualenv not acting as it's supposed | 19:45 |
mat128 | I had filed a bug report, trying to find it | 19:46 |
*** asettle has quit IRC | 19:46 | |
zigo | clarkb: fungi: pabelanger: Can we wrap up the discussion we just had a few hours ago? | 19:47 |
zigo | If I understand correctly, the major concern is that each image will eat up to 1.5 GB of cache data, which will globally slow down infra. Is this the only problem, or is there anything else? | 19:47 |
zigo | Also, since things are the way they are right now, and that Newton release is approaching (technically for me, it's already released as b3, and I'm already late), can we delay switching to overlay mode for after Newton? | 19:47 |
zigo | Last, can we decide that we will use the overlay mode *only* for when upstream is OpenStack? | 19:47 |
mat128 | fungi: https://github.com/pypa/virtualenv/blob/bdef7328d47f18ecf9d1df23e33ec5a039f41048/virtualenv.py#L934 | 19:47 |
*** asettle has joined #openstack-infra | 19:47 | |
mat128 | if that line was changed to pip==VERSION | 19:47 |
mat128 | it'd work correctly | 19:47 |
fungi | mat128: so unfortunately this means the problem is somewhere between pip 8.0.2 and 8.1.2 or setuptools 19.6.2 and 26.1.1 or wheel 0.26.0 and 0.29.0 | 19:48 |
mat128 | in the meantime, flushing your wheel cache seems like the only way out | 19:48 |
*** asettle has quit IRC | 19:48 | |
*** mriedem has joined #openstack-infra | 19:49 | |
pabelanger | clarkb: okay, I think I have a minimal DIB working with glean and our urandom element, just confirming now | 19:49 |
fungi | zigo: i'm a little worried that we didn't anticipate the nature of the repos you were importing, and that they already include a fair amount of git history from the corresponding upstream project repos. cleaning that up is likely to be complicated | 19:50 |
clarkb | pabelanger: ok I just got my virsh issue sorted out | 19:50 |
fungi | e.g. http://git.openstack.org/cgit/openstack/deb-nova/tree/?h=debian%2Fnewton looks like it's not just the debian directory but rather an entire nova source code tree | 19:50 |
zigo | fungi: We already imported everything. | 19:50 |
clarkb | pabelanger: but gonna try to get the build started then go eat lunch | 19:51 |
zigo | fungi: Also, I'm really not sure how I will do for keeping the packaging history and changing the hosted format ... :/ | 19:51 |
zigo | fungi: I really need answers for my above concerns ASAP. Time is running... :/ | 19:52 |
fungi | zigo: yeah, i'm inclined at this point to just go along with what you want because the bulk of the damage is already done and i don't have a good answer for how to go back and fix it | 19:52 |
zigo | fungi: So, would you agree that we delay the transition for after the Newton release? | 19:53 |
fungi | which would probably involve git filter-branch to trim out everything except the debian subtree and force-pushing the result back over the existing repos | 19:53 |
zigo | (if we decide to do so...) | 19:53 |
fungi | zigo: yeah, in the near term merging a few more upstream tags isn't going to make the existing situation considerably worse | 19:53 |
fungi | zigo: we have it documented at http://docs.openstack.org/infra/manual/drivers.html#merge-commits | 19:55 |
*** abregman has joined #openstack-infra | 19:57 | |
*** asettle has joined #openstack-infra | 19:59 | |
*** asettle has quit IRC | 19:59 | |
rcarrillocruz | pabelanger: oh, we have cirros or some other image in our mirrors? | 20:00 |
rcarrillocruz | or which file you refer to | 20:00 |
clarkb | pabelanger: ok my build is started | 20:01 |
*** _nadya_ has joined #openstack-infra | 20:01 | |
clarkb | will have to see how that goes | 20:01 |
pabelanger | rcarrillocruz: ~/cache/files is what I was referring to | 20:01 |
pabelanger | clarkb: rebuilding, I messed up the glean variable | 20:01 |
clarkb | gonna grab lunch while its going | 20:01 |
openstackgerrit | Sagi Shnaidman proposed openstack-infra/tripleo-ci: Use centos CDN repository with periodic jobs https://review.openstack.org/364534 | 20:02 |
pabelanger | clarkb: Ya, I did use it, I had a typo in my path | 20:02 |
*** piet has quit IRC | 20:02 | |
clarkb | ah | 20:02 |
openstackgerrit | Guillaume Espanel proposed openstack-infra/project-config: Create puppet-cloudkitty repository https://review.openstack.org/364535 | 20:03 |
fungi | confirmed the pip cache is definitely playing a part in this behavior. testing with various versions i got it into a state where older virtualenv releases that had been working for me started exhibiting the problem behavior until i blew away ~/.cache/pip | 20:04 |
*** ijw has joined #openstack-infra | 20:04 | |
mat128 | fungi: I tried very hard to find the bug report I remember submitting, but can't find anything | 20:04 |
*** ddieterly[away] is now known as ddieterly | 20:04 | |
mat128 | fungi: must have had bad dreams about virtualenv and pip.. can't find any trace of my experiment either :( | 20:05 |
*** annegentle has joined #openstack-infra | 20:05 | |
fungi | mat128: i'm less concerned with that to be honest. the current and more troubling issue is that starting with virtualenv 14.0.6 we seem to be installing cached wheels of downloaded releases when we ask it to pip install the current checked out source tree | 20:05 |
mat128 | fungi: are we using explicit versions? | 20:06 |
fungi | i think it's something after wheel 0.26.0 but i'm still narrowing it down | 20:06 |
*** derekh has joined #openstack-infra | 20:06 | |
mat128 | fungi: can you reproduce it easily? | 20:06 |
mat128 | fungi: pip install package==1.0.0 | 20:06 |
mat128 | or constraints | 20:06 |
fungi | mat128: absolutely, i'm whittling down versions of things in the toolchain to narrow down whether it's virtualenv, pip, wheel or setuptools at fault | 20:06 |
zigo | (minus the english mistakes in the commit header... :( ) | 20:06 |
mat128 | fungi: so let me try to understand the issue: you have software==1.1 in your wheel cache, you issue pip install software==1.0 and 1.1 gets installed? | 20:07 |
fungi | mat128: the short version is demonstrated in http://logs.openstack.org/92/364492/1/check/gate-openstackdocstheme-api-ref/72a739d/tox/api-ref-1.log.txt and http://logs.openstack.org/92/364492/1/check/gate-openstackdocstheme-api-ref/72a739d/tox/api-ref-2.log.txt | 20:08 |
*** _nadya_ has quit IRC | 20:08 | |
mat128 | ouch | 20:08 |
fungi | mat128: it does a pip install -r test-requirements.txt where one of the packages listed there is a circular dependency back on the project being tested and so pulls down the packaged release, then does a pip install -e . and ends up i think installing the cached wheel instead of the current source tree | 20:09 |
*** ijw has quit IRC | 20:09 | |
mat128 | ? | 20:09 |
fungi | not sure yet, i'm less concerned with workarounds and more with bisecting to a specific version in the toolchain that introduces this regression | 20:10 |
fungi | it's just a little fiddly since some of the parts are vendored so there's some chicken-and-egg problems pinning them independently | 20:11 |
*** tonytan4ever has joined #openstack-infra | 20:11 | |
*** coolsvap has quit IRC | 20:12 | |
*** flepied has quit IRC | 20:12 | |
fungi | comparing virtualenv 14.0.5 and 14.0.6 behavior (or 14.0.5 and latest release for that matter) | 20:12 |
*** gyee has joined #openstack-infra | 20:13 | |
mat128 | openstackdocstheme (1.5.0, /Users/mmitchell/projects/openstackdocstheme/.tox/venv/lib/python2.7/site-packages) | 20:13 |
rcarrillocruz | pabelanger: oh nice, did not know we had cirros at cache/files, thanks! | 20:13 |
*** kzaitsev_mb has quit IRC | 20:14 | |
openstackgerrit | James E. Blair proposed openstack-infra/infra-specs: Zuulv3: drop variable interpolation and add nodesets https://review.openstack.org/361463 | 20:14 |
mat128 | fungi: pip 8.1.2, virtualenv 15.0.2 | 20:15 |
fungi | i think i've mostly ruled it out as a regression in wheel. seems more like setuptools but i'm not sure where in its history this crops up yet | 20:15 |
mat128 | oh, virtualenv 14 | 20:15 |
*** maishsk has quit IRC | 20:15 | |
*** fguillot has quit IRC | 20:16 | |
fungi | jeblair: oh, cool! | 20:16 |
*** tonytan_brb has joined #openstack-infra | 20:16 | |
ianw | has the idea of zuul-cloner being a separate thing to zuul been covered before? | 20:16 |
*** maishsk has joined #openstack-infra | 20:16 | |
fungi | ianw: we've talked about breaking out the cli tools, but the arguments for doing so have been fairly shallow | 20:16 |
*** tonytan4ever has quit IRC | 20:17 | |
*** esikache1 has quit IRC | 20:18 | |
*** spzala has quit IRC | 20:19 | |
*** markusry has quit IRC | 20:20 | |
*** flepied has joined #openstack-infra | 20:21 | |
*** kzaitsev_mb has joined #openstack-infra | 20:21 | |
fungi | i wonder if SETUPTOOLS_SYS_PATH_TECHNIQUE=rewrite changes this | 20:21 |
mat128 | fungi: confirmed here too | 20:22 |
mat128 | http://paste.openstack.org/show/565887/ | 20:22 |
mat128 | fungi: yes, that fixes it | 20:23 |
fungi | ugh | 20:23 |
fungi | "This project hopes that that few if any environments find it necessary to retain the old behavior, and intends to drop support for it altogether in a future release. Please report any relevant concerns in the ticket for this change." | 20:23 |
fungi | i guess we have one :/ | 20:24 |
mat128 | installing a project from source doesn't seem like a rare case | 20:24 |
fungi | indeed | 20:24 |
fungi | this is probably a pip -e case | 20:24 |
mat128 | fungi: look at the paths | 20:25 |
openstackgerrit | Monty Taylor proposed openstack-infra/shade: Batch calls to list_floating_ips https://review.openstack.org/364508 | 20:25 |
mat128 | http://paste.openstack.org/show/565887/ | 20:25 |
mat128 | fungi: one is from site-packages (as you said, pulled via dep resolution) | 20:25 |
fungi | yep | 20:25 |
*** dprince has quit IRC | 20:26 | |
clarkb | pabelanger: my local.instance never dhcped | 20:26 |
fungi | okay, so usedevelop=false does indeed cause the problem to go away | 20:26 |
fungi | which means setuptools 25.0.0 basically broke editable installs | 20:27 |
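[Editor's sketch, not part of the channel log.] The symptom described above comes down to which copy of the package appears first on the import search path. The toy model below is only an illustration under stated assumptions: the directory names are hypothetical, `first_match` is a made-up helper rather than anything from setuptools or pip, and the two orderings are a simplification of the "rewrite" and "raw" techniques discussed in setuptools issue 674.

```python
def first_match(search_path, providers):
    """Return the first directory on search_path that provides the package."""
    for directory in search_path:
        if directory in providers:
            return directory
    return None


# Hypothetical locations, chosen only for illustration.
SITE_PACKAGES = "/venv/lib/python2.7/site-packages"
SOURCE_TREE = "/home/user/openstackdocstheme"

# Both directories provide the package: the released copy pulled in through
# test-requirements.txt, and the editable install of the checkout.
providers = {SITE_PACKAGES, SOURCE_TREE}

# Simplified "rewrite" ordering: the editable path sits ahead of site-packages.
rewrite_path = [SOURCE_TREE, SITE_PACKAGES]
# Simplified "raw" ordering: the editable path is appended after site-packages.
raw_path = [SITE_PACKAGES, SOURCE_TREE]

print(first_match(rewrite_path, providers))
print(first_match(raw_path, providers))
```

Under this model the checkout wins only with the "rewrite" ordering, which matches the observation that setting SETUPTOOLS_SYS_PATH_TECHNIQUE=rewrite made the problem go away.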
*** pvaneck has quit IRC | 20:27 | |
clarkb | pabelanger: so rebuilding to set a password to login on console | 20:27 |
*** nmagnezi_ has joined #openstack-infra | 20:27 | |
clarkb | pabelanger: I think the ordering for glean might be more subtle | 20:27 |
fungi | mat128: i'll update the bug mentioned in the changelog and see where that gets us | 20:27 |
mat128 | fungi: yeah, probably our best bet | 20:28 |
*** ddieterly is now known as ddieterly[away] | 20:28 | |
*** eggshell has quit IRC | 20:28 | |
pabelanger | clarkb: yup | 20:28 |
*** nmagnezi has quit IRC | 20:28 | |
fungi | https://github.com/pypa/setuptools/issues/674 for the record | 20:28 |
fungi | mat128: i'll come up with a reproducer that doesn't rely on tox, now that i know what the problem is | 20:29 |
*** e0ne has quit IRC | 20:30 | |
*** gordc has quit IRC | 20:30 | |
*** itisha has quit IRC | 20:30 | |
*** ijw has quit IRC | 20:31 | |
lifeless | fungi: yay quagmire | 20:32 |
lifeless | fungi: lets all switch to rust :) | 20:32 |
mordred | lifeless: if only | 20:32 |
prometheanfire | someone mind looking at this for the release process? https://review.openstack.org/#/c/364497 | 20:32 |
openstackgerrit | David Shrewsbury proposed openstack-infra/nodepool: Replace watch thread with periodic thread https://review.openstack.org/363217 | 20:34 |
*** gouthamr has quit IRC | 20:34 | |
fungi | lifeless: indeed. who needs editable installs anyway? ;) | 20:37 |
*** annegentle has quit IRC | 20:39 | |
*** ijw has joined #openstack-infra | 20:41 | |
mrhillsman | any thoughts on the time it will take to get some workload on cloud8? | 20:41 |
ianw | lifeless: if around ... http://docs.openstack.org/developer/pbr/#environment-markers <- is it possible to query how you were installed with a various environments? or is the idea you would just do an import and catch exceptions? | 20:43 |
openstackgerrit | Thomas Goirand proposed openstack-infra/project-config: Add merge commit ACL for packaging-deb https://review.openstack.org/364537 | 20:44 |
*** kgiusti has left #openstack-infra | 20:44 | |
*** ijw has quit IRC | 20:46 | |
*** annegentle has joined #openstack-infra | 20:47 | |
*** piet has joined #openstack-infra | 20:49 | |
openstackgerrit | Monty Taylor proposed openstack-infra/shade: Fix up image and flavor by name in create_server https://review.openstack.org/364548 | 20:49 |
*** maishsk has quit IRC | 20:50 | |
mordred | armax, dtroyer: neutronclient released. new os-client-config released - the gate does not seem to have completely broken ... so I think we can consider that good! | 20:51 |
ianw | timothyb89: around? I wouldn't mind rolling stackviz cleanup into my devstack-gate cleanup-refactor, just to get it all done. want to pick your brain on what you think is supposed to be happening under regular & grenade | 20:51 |
armax | mordred: someone in the neutron channel said something got belly up | 20:52 |
timothyb89 | ianw: sure, happy to help | 20:52 |
*** ijw has joined #openstack-infra | 20:52 | |
mordred | armax: oh no | 20:52 |
*** pvaneck has joined #openstack-infra | 20:52 | |
ianw | timothyb89: so see http://logs.openstack.org/46/364046/4/check/gate-grenade-dsvm-neutron-ubuntu-trusty/f527fe1/logs/ ... that has old & new and each has their tempest run | 20:53 |
ianw | timothyb89: do you expect that both would have stackviz output too? | 20:53 |
mordred | jeblair: so ... interesting edge case for your brainhole wrt zuul | 20:53 |
mordred | prometheanfire: ^^ (about to mention the neutron thing to jeblair) | 20:54 |
prometheanfire | mordred: we are handling it in -release | 20:54 |
timothyb89 | ianw: ideally if a *.subunit exists, stackviz should too | 20:54 |
mordred | prometheanfire: well, there is a thing here which is, I think, potentially worth considering on the zuul side for future development ... | 20:54 |
mordred | jeblair: release automation released a new version of python-neutronclient (yay!) and then submitted a patch to global requirements bumping the constraints for the release | 20:55 |
mordred | jeblair: the tests for that change ran before the release artifact had made it to the mirrors | 20:55 |
clarkb | pabelanger: ok on my local instance I don't see glean or ssh ever start | 20:56 |
mordred | jeblair: I'm mentioning it because it sounds similar to me to one of the things we've heard from distro folks - sometimes they'd like to run a job in response to an artifact being uploaded, not to a git commit | 20:56 |
clarkb | pabelanger: now to determine if it is pebcak during build | 20:56 |
mordred | jeblair: and this seems like a specific case of that for if/when we get around to musing about such a thing | 20:56 |
ianw | timothyb89: would i just "cat tempest.subunit" | stackviz ... ? | 20:58 |
timothyb89 | ianw: there are a few extra steps, but that would work | 20:59 |
timothyb89 | ianw: though specifically on http://logs.openstack.org/26/363326/3/check/gate-grenade-dsvm-neutron-ubuntu-trusty/05856df/logs/devstack-gate-cleanup-host.txt it looks like a permissions issue of some sort? | 21:00 |
*** Gibi is now known as gibi | 21:00 | |
*** nmagnezi_ has quit IRC | 21:01 | |
pabelanger | clarkb: can you see glean in systemctl? | 21:01 |
clarkb | pabelanger: no | 21:01 |
*** raildo has quit IRC | 21:01 | |
clarkb | pabelanger: but it is installed | 21:01 |
pabelanger | Hmm | 21:01 |
clarkb | the command glean is installed I mean | 21:01 |
pabelanger | just bringing my image online now | 21:01 |
pabelanger | looks like glean failed to start for me too | 21:02 |
pabelanger | same with initialize-urandom | 21:02 |
ianw | timothyb89: there is that ... i think it's confusion between when grenade uses "new" v the top-level | 21:03 |
*** jkilpatr has quit IRC | 21:03 | |
pabelanger | clarkb: you likely see Ordering cycle found, skipping Network | 21:03 |
*** ddieterly[away] is now known as ddieterly | 21:03 | |
jeblair | mordred: maybe the constraints update shouldn't come from the release tag job; maybe it should happen in response to the artifact upload job? | 21:03 |
ianw | timothyb89: can you have one "stackviz" dir and put two lots of data in? | 21:04 |
jeblair | mordred: (in other words, look at it as not being the responsibility of the release pipeline to update constraints; look at it as the responsibility of the project to update constraints when it's released (regardless of *how* the release is triggered).) | 21:04 |
*** berendt has joined #openstack-infra | 21:04 | |
clarkb | pabelanger: no its more like glean just isn't enabled at all for some reason | 21:04 |
clarkb | pabelanger: systemctl status glean says Loaded: not-found (Reason: No such file or directory) | 21:06 |
clarkb | pabelanger: but the unit file is in /usr/lib/systemd/system | 21:06 |
pabelanger | clarkb: it will be glean@eth0.service I think | 21:06 |
timothyb89 | ianw: something like, `stackviz-export -f old/testrepository.subunit -f new/testrepository.subunit $log_path/stackviz/data` would work I think | 21:07 |
mordred | jeblair: I believe that is what the release team is going to do | 21:07 |
*** trown is now known as trown|outtypewww | 21:07 | |
clarkb | pabelanger: there is glean@.service | 21:08 |
ianw | timothyb89: great, i'll try that. will it work if i do the exports separately, or do they have to be done from the one stackviz-export call? making two calls would be easier and fit into the existing fn | 21:08 |
*** yfried has quit IRC | 21:08 | |
pabelanger | clarkb: still figuring that out | 21:08 |
mordred | jeblair: I was more just bringing it up as a local example of a thing people who are not in openstack have expressed wanting to be able to do | 21:09 |
timothyb89 | ianw: right now it would need 1 call, but we can always just make 2 copies like we do now if that's easier | 21:10 |
*** rvasilets___ has joined #openstack-infra | 21:10 | |
clarkb | pabelanger: hrm I still have an After=network.target in that file so I may not have gotten glean/simple-init to install properly | 21:10 |
timothyb89 | ianw: actually, 2 separate may still be the best option so dstat is included ... the export CLI only accepts 1 dstat input right now | 21:11 |
ianw | timothyb89: ok, cool ... let me get something together and we can see how it looks | 21:12 |
timothyb89 | ianw: sounds good! | 21:12 |
*** kaisers_ has joined #openstack-infra | 21:14 | |
pabelanger | clarkb: ya, same. glean was installed by pip for me | 21:15 |
*** ilyashakhat_mobi has quit IRC | 21:15 | |
*** ldnunes has quit IRC | 21:16 | |
clarkb | pabelanger: we need to set DIB_INSTALLTYPE_simple_init=repo | 21:17 |
clarkb | pabelanger: so I rebuilding with that. But I don't know why its otherwise broken for me | 21:18 |
clarkb | pabelanger: did glean the service run for you? | 21:18 |
*** kaisers_ has quit IRC | 21:18 | |
fungi | mat128: lifeless: https://github.com/pypa/setuptools/issues/674 | 21:18 |
pabelanger | clarkb: didn't come up for me, I don't know why | 21:18 |
clarkb | pabelanger: same problem likely then :) | 21:18 |
fungi | dstufft: ^ that issue is probably of interest to you as well, as pip maintainer | 21:18 |
clarkb | pabelanger: I used devuser to create a user with password then logged in on console | 21:18 |
*** rossella_s has joined #openstack-infra | 21:19 | |
*** aeng has joined #openstack-infra | 21:20 | |
*** tphummel has joined #openstack-infra | 21:21 | |
fungi | oh, and now i find https://github.com/pypa/setuptools/issues/729 and https://github.com/pypa/setuptools/issues/447 basically already cover this | 21:21 |
openstackgerrit | Matthew Thode proposed openstack-infra/project-config: Pause before submitting the requirements review https://review.openstack.org/364559 | 21:23 |
clarkb | pabelanger: manually running `sudo systemctl start glean@ens3` worked | 21:24 |
* prometheanfire likes glean | 21:24 | |
pabelanger | clarkb: okay, lets hope this DIB works | 21:25 |
clarkb | now to figure out why it didn't fire on boot | 21:25 |
openstackgerrit | Ian Wienand proposed openstack-infra/devstack-gate: Refactor stackviz run https://review.openstack.org/364560 | 21:25 |
pabelanger | clarkb: did you rebuild? | 21:25 |
clarkb | pabelanger: also there isn't a unit file for each individual interface; instead the glean@ unit file is a template that takes the interface dev name as an argument | 21:25 |
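[Editor's sketch, not part of the channel log.] systemd template units are named `name@.service`, and an instance such as `glean@ens3` passes the text after the `@` to the template as its instance name. The helper below, `split_instance`, is a hypothetical function written for this note (it is not part of glean or systemd) and only handles `.service` units; it shows how an instance name maps back to the template file and its argument.

```python
def split_instance(unit):
    """Map an instantiated unit name to (template file name, instance name).

    Only .service units are handled; systemctl treats a bare name such as
    'glean@ens3' as 'glean@ens3.service'.
    """
    if unit.endswith(".service"):
        unit = unit[:-len(".service")]
    prefix, at, instance = unit.partition("@")
    if not at or not instance:
        raise ValueError("not an instantiated template unit: %r" % unit)
    return prefix + "@.service", instance


print(split_instance("glean@ens3"))
print(split_instance("glean@eth0.service"))
```

Both names resolve to the single `glean@.service` template seen in /usr/lib/systemd/system, which is why `systemctl status glean` on its own reported not-found.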
clarkb | pabelanger: still rebuilding. But also debugging on my old broken host | 21:25 |
*** Goneri has joined #openstack-infra | 21:26 | |
*** thorst has quit IRC | 21:26 | |
fungi | mat128: AJaeger: oh, not yet merged, i misread | 21:26 |
clarkb | pabelanger: oh there was an ordering dependency why did that not show up in dmesg? | 21:26 |
*** thorst has joined #openstack-infra | 21:27 | |
timothyb89 | ianw: that looks a lot better! small issue, though, the '--end' flag copied over is actually an old typo, that should be '--env' | 21:27 |
*** matt-borland has quit IRC | 21:27 | |
clarkb | pabelanger: so it's possible the fix might actually fix this | 21:27 |
pabelanger | clarkb: Ya, that is the original ordering issue | 21:27 |
pabelanger | yup | 21:27 |
pabelanger | for some reason, the magic make it work today | 21:28 |
timothyb89 | ianw: apparently that branch was never followed since it should have been spitting out errors for months... whoops | 21:28 |
*** jerryz has quit IRC | 21:28 | |
*** pt_15 has quit IRC | 21:29 | |
timothyb89 | ianw: that, and probably some poor testing on my part... I guess that explains why dstat hasn't been showing up, though | 21:30 |
*** abregman has quit IRC | 21:31 | |
*** thorst has quit IRC | 21:31 | |
*** claudiub has joined #openstack-infra | 21:31 | |
*** abregman has joined #openstack-infra | 21:31 | |
*** ddieterly has quit IRC | 21:31 | |
timothyb89 | ianw: for what it's worth, I'm hoping to remove the stackviz-export step entirely in the near future, hopefully just the single `sudo cp -r ...` will be required | 21:32 |
*** gouthamr has joined #openstack-infra | 21:32 | |
clarkb | pabelanger: ok glean worked but initialize-urandom failed due to no such file or directory. I don't know which file or directory yet | 21:33 |
*** abregman has quit IRC | 21:33 | |
pabelanger | clarkb: oh, maybe we are missing haveged | 21:36 |
clarkb | pabelanger: oh! | 21:36 |
pabelanger | I think that is a dependency | 21:36 |
clarkb | its not part of the initialize-urandom element | 21:36 |
* clarkb builds another image | 21:36 | |
pabelanger | ya, we can add it as a pkg-map | 21:37 |
*** _sarob has quit IRC | 21:37 | |
clarkb | I just added it to infra package needs really quick to test but ya would be better in initialize-urandom | 21:38 |
pabelanger | that works | 21:38 |
*** shardy_afk has quit IRC | 21:38 | |
pabelanger | rebuild started.... again | 21:38 |
clarkb | :) isn't it fun how fixing things like this ends up being | 21:39 |
fungi | mordred: dhellmann: so... revisiting this apparently intentional behavior change in setuptools path ordering will result in future as-of-yet-unidentified modifications in pip's behavior around editable installs, can you remind me why some projects use them for tox in the first place? | 21:39 |
pabelanger | I'm hoping at the mid-cycle we can talk about DIB elements for a bit | 21:39 |
clarkb | pabelanger: part of the problem here is the normal builds take forever and are massive due to all the caching so I try to avoid that and just build with ubuntu-minimal simple-init growroot devuser infra-package-needs (for ssh) and initialize-urandom | 21:40 |
clarkb | but then we find where we have undeclared deps between elements | 21:40 |
clarkb | but also undocumented flags in simple-init that need setting aren't fun either | 21:41 |
clarkb | I should push a patch to dib now to fix that | 21:41 |
*** vhosakot has quit IRC | 21:41 | |
mordred | fungi: because it shortens the iteration cycle | 21:42 |
pabelanger | okay, will be back shortly, need to walk down to store for something | 21:42 |
*** hashar has quit IRC | 21:42 | |
*** priteau has quit IRC | 21:42 | |
mordred | fungi: for things like nova, needing install to run after each edit before running tox can be a significant delay | 21:42 |
mordred | fungi: the run_tests.sh that nova used before we went tox did editable installs into a virtualenv rather quickly, and the devs were annoyed by how long tox took | 21:43 |
fungi | mordred: got it. and specifically they want to do it with tox | 21:44 |
*** ijw has quit IRC | 21:44 | |
fungi | so we can't just take usedevelop out. and the SETUPTOOLS_SYS_PATH_TECHNIQUE envvar is targeted for future removal from setuptools | 21:44 |
*** adriant has quit IRC | 21:45 | |
mordred | fungi: so - should I go read the bug in question? | 21:45 |
mordred | fungi: like, why have they decided to break -e ? | 21:45 |
fungi | mordred: https://github.com/pypa/setuptools/issues/674 explains the reason for the behavior change | 21:46 |
openstackgerrit | Clark Boylan proposed openstack/diskimage-builder: Document source glean installs in simple-init https://review.openstack.org/364568 | 21:46 |
mordred | fungi: cool | 21:46 |
clarkb | greghaynes: ianw ^ | 21:46 |
*** gouthamr has quit IRC | 21:47 | |
cinerama | clarkb: thanks! | 21:47 |
fungi | mordred: so anyway, i think we have to consider the possibility that usedevelop=true in tox.ini is simply unsafe (and certainly currently broken with latest pip/setuptools) | 21:47 |
mordred | fungi: is the sequence of "pip install -e . ; pip install -r test-requirements.txt" | 21:48 |
fungi | mordred: other way around | 21:48 |
*** Thelo_ has quit IRC | 21:48 | |
fungi | mordred: basically tox runs your install_command first, then installs the local tree | 21:48 |
fungi | so it's in some cases getting a packaged version of whatever we're testing pulled in, and then does the editable install of the local source tree after | 21:49 |
clarkb | pbr has this behavior | 21:49 |
clarkb | we had a workaround in it involving putting . in the deps list or something | 21:50 |
clarkb | (since pbr bootstraps itself) | 21:50 |
*** yamahata has quit IRC | 21:50 | |
mordred | yah | 21:51 |
mordred | -r{toxinidir}/test-requirements.txt | 21:51 |
ianw | clarkb: lg, it might be helpful to give an example of using a upstream review and getting the ref from there. i've manually talked people through that a couple of times (should have taken the time to document it :) | 21:52 |
*** ddieterly has joined #openstack-infra | 21:53 | |
*** cardeois has quit IRC | 21:56 | |
clarkb | pabelanger: http://paste.openstack.org/show/565895/ looks ok to me it still reloads (and seems to start with a "I shut down ok" message) but I think thats all fine | 21:56 |
clarkb | ianw: oh thats a good idea ya I can make it more verbose | 21:56 |
*** Goneri has quit IRC | 21:57 | |
*** berendt has quit IRC | 21:58 | |
clarkb | pabelanger: also initialize-urandom and glean both ran before the ssh things started | 21:58 |
clarkb | (I also really don't like that its ssh not sshd such muscle memory) | 21:59 |
*** adriant has joined #openstack-infra | 21:59 | |
clarkb | pabelanger: you good with me approving the glean change, then we need a release before we can approve the urandom one | 22:00 |
clarkb | ianw: ^ is that better? | 22:03 |
*** ijw has joined #openstack-infra | 22:04 | |
*** fguillot has joined #openstack-infra | 22:05 | |
*** javeriak has joined #openstack-infra | 22:06 | |
*** rlandy is now known as rlandy|bbl | 22:07 | |
ianw | clarkb: cool. you can even do it directly from the review, but people can probably figure that out | 22:08 |
ianw | i mean upstream git. the hardest part is finding the pull-down in the top-right corner of gerrit ui | 22:08 |
*** javeriak_ has quit IRC | 22:09 | |
*** Julien-zte has joined #openstack-infra | 22:09 | |
clarkb | ianw: I hope my example also shows how to test local edits by explaining it that way | 22:09 |
rcarrillocruz | pabelanger: mind reviewing https://review.openstack.org/#/c/364397/ pls? it pulls from file now | 22:10 |
*** Swami has quit IRC | 22:11 | |
*** rlandy|bbl is now known as rlandy | 22:11 | |
*** rlandy is now known as rlandy|bbl | 22:12 | |
openstackgerrit | Merged openstack-infra/shade: Batch calls to list_floating_ips https://review.openstack.org/364508 | 22:12 |
*** vhosakot has joined #openstack-infra | 22:13 | |
*** tphummel has quit IRC | 22:14 | |
*** xyang1 has quit IRC | 22:14 | |
*** krotscheck has quit IRC | 22:15 | |
*** krotscheck has joined #openstack-infra | 22:15 | |
clarkb | mordred: what are your thoughts on making a glean release nowish? | 22:15 |
clarkb | mordred: we will need to coordinate that and the fix for initialize-urandom because if we don't then boot doesn't work | 22:16 |
clarkb | mordred: glean can't bring up interfaces properly if we run the current glean against the initialize-urandom fix | 22:16 |
clarkb | its good fun | 22:16 |
*** ddieterly is now known as ddieterly[away] | 22:17 | |
*** javeriak_ has joined #openstack-infra | 22:17 | |
*** javeriak has quit IRC | 22:17 | |
*** Thelo_ has joined #openstack-infra | 22:17 | |
*** tphummel has joined #openstack-infra | 22:19 | |
*** spzala has joined #openstack-infra | 22:20 | |
*** piet has quit IRC | 22:20 | |
*** ilyashakhat_mobi has quit IRC | 22:21 | |
mordred | clarkb: oh lovely | 22:21 |
mordred | clarkb: well, I am in support of releasing glean anytime you think is good | 22:21 |
clarkb | mordred: I want pabelanger to confirm his test build functioned then I think we can get both things in and glean released | 22:22 |
mordred | ++ | 22:22 |
clarkb | dtroyer: stevemar did that osc neutron https thing get merged and released yet? I can't find the bug anymore and I fail at googling | 22:22 |
ianw | i don't want to mess things up, but i have a bunch of glean stuff just sitting -> https://review.openstack.org/#/q/status:open+project:openstack-infra/glean+branch:master+topic:mock | 22:22 |
pabelanger | clarkb: back | 22:23 |
ianw | mostly test cleanups | 22:23 |
mordred | :( | 22:23 |
pabelanger | let me test quickly | 22:23 |
pabelanger | build is done | 22:23 |
jhesketh | Morning | 22:23 |
openstackgerrit | Ian Wienand proposed openstack-infra/glean: Add selinux context manager for writing files https://review.openstack.org/304357 | 22:24 |
mordred | clarkb: you have opinions on https://review.openstack.org/#/c/318464 ? | 22:24 |
clarkb | mordred: no I suffer mock where it exists :) | 22:24 |
ianw | mordred: ^ hmm, rebase *seemed* happy ... | 22:25 |
clarkb | I guess that would be the only thing I would want is to make sure we don't regress like mox | 22:25 |
*** adriant has quit IRC | 22:25 | |
mordred | yah. mock is the python3 happy one | 22:25 |
*** yolanda has quit IRC | 22:26 | |
*** ramishra has quit IRC | 22:27 | |
pabelanger | clarkb: blarg, my build actually failed. | 22:28 |
pabelanger | while I figure out why my dib failed | 22:28 |
*** ramishra has joined #openstack-infra | 22:29 | |
clarkb | pabelanger: ok, so should I approve the glean change then and ask mordred for a release then we can approve the project-config change? | 22:30 |
pabelanger | clarkb: Ya, lets do that | 22:30 |
*** Thelo_ has joined #openstack-infra | 22:30 | |
clarkb | ok glean change approved | 22:31 |
*** Thelo_ has quit IRC | 22:32 | |
rcarrillocruz | thx | 22:32 |
dtroyer | I haven't tested it myself yet though | 22:32 |
clarkb | mordred: ^ you want to do the honors of a release? | 22:35 |
clarkb | or maybe get some of ianw's in first? | 22:36 |
clarkb | ianw: any of them make sense to try and get into a release if it happens nowish? | 22:36 |
*** ddieterly[away] is now known as ddieterly | 22:38 | |
ianw | i don't think it's super urgent. the selinux one was to help with, well, selinux. the others were test-cleanups that fell out of trying to test it | 22:39 |
ianw | i'm just catching up ... are you sure it's the After? as described in https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/ that's mostly a *shutdown* thing | 22:39 |
*** Thelo_ has joined #openstack-infra | 22:40 | |
ianw | "network.target has very little meaning during start-up ... It's primary purpose is for ordering things properly at shutdown:" | 22:40 |
*** Thelo_ has quit IRC | 22:41 | |
*** ddieterly is now known as ddieterly[away] | 22:42 | |
*** sarob has joined #openstack-infra | 22:43 | |
*** tonytan_brb has quit IRC | 22:43 | |
clarkb | ianw: I know that those two changes to remove the After result in no more ordering cycle | 22:45 |
pabelanger | right | 22:45 |
mordred | clarkb: let's go ahead and release ... there are a LOT of changes unreleased - I hesitate to land more given the debug cycle | 22:45 |
*** adriant has joined #openstack-infra | 22:45 | |
clarkb | ianw: there were two interrelated problems. The ordering cycle which just made things messy and ssh was being started then stopped then started | 22:46 |
clarkb | ianw: so if connections came in during the first start they would get killed when the service was stopped | 22:46 |
mordred | clarkb: 1.6.0 ? | 22:47 |
clarkb | mordred: uh the change I know of should be backward compat and it just changes a bug so even a point would be fine | 22:47 |
clarkb | but not sure about all the other changes | 22:47 |
*** signed8bit is now known as signed8bit_Zzz | 22:48 | |
jeblair | pabelanger, mordred, clarkb, and i guess someone can tell sdague if they see him: here's what i've noticed about those ansible ssh connection failures: 1) they seem *vaguely* time correlated. they seem to come in batches, and the batches happen on all the zuul launchers. 2) i suspect they may be hitting a small number of jobs; related: they seem to disproportionately happen on multinode jobs. 3) the ssh failure happens 25 seconds ... | 22:48 |
jeblair | ... after the last poll. our polling interval is 5 seconds, and there is a 10 second ssh connection timeout. i can't account for the other 10 seconds. | 22:48 |
jeblair | there is no additional information from the async module about the ssh error, even with -vvv. (this is perhaps not surprising at this point) | 22:49 |
mordred | clarkb: that's the changelog entries since the last release | 22:49 |
jeblair | i'm turning off verbose mode on the launchers | 22:50 |
mordred | clarkb: I think bonding additions make it 1.6.0 | 22:50 |
clarkb | mordred: ah ya that would be a new feature | 22:50 |
mordred | ok. I'm pushing the tag | 22:51 |
mordred | and done | 22:51 |
clarkb | mordred: https://review.openstack.org/#/c/364517/1 is the corresponding one now that that is done | 22:52 |
*** Swami has joined #openstack-infra | 22:52 | |
clarkb | and yes lets hope new images work :) | 22:52 |
mordred | jeblair: good point | 22:52 |
clarkb | good thing we won't get those until after dhellmann is done | 22:52 |
mordred | clarkb: that should be safe to land now, yeah? | 22:52 |
mordred | since glean will exist next time an image is built | 22:53 |
mordred | k. done | 22:53 |
*** gouthamr_ has quit IRC | 22:53 | |
fungi | woah! https://gerrit-review.googlesource.com/85233 | 22:54 |
pleia2 | nice | 22:55 |
mordred | neat! | 22:55 |
*** ijw has quit IRC | 22:56 | |
*** javeriak_ has quit IRC | 22:57 | |
clarkb | mordred: pabelanger the good thing is if that breaks xenial/centos/jessie it won't break any jobs | 22:57 |
clarkb | since it will just prevent us from sshing in to unbooted hosts | 22:57 |
*** sarob has joined #openstack-infra | 22:58 | |
pabelanger | ya | 22:59 |
mordred | clarkb: if you have a sec ... https://review.openstack.org/#/c/362813 is part of me working through the suboptimal caching in nodepool right now | 22:59 |
*** annegentle has quit IRC | 23:01 | |
*** kaisers_ has joined #openstack-infra | 23:02 | |
*** sarob has quit IRC | 23:03 | |
*** ijw has joined #openstack-infra | 23:04 | |
*** dimtruck is now known as zz_dimtruck | 23:05 | |
*** zz_dimtruck is now known as dimtruck | 23:05 | |
pabelanger | clarkb: do we want to kick off an image-build tonight for ubuntu-xenial? | 23:06 |
*** rbrndt has quit IRC | 23:08 | |
clarkb | mordred: tonyb ^ comments on that welcome. I think it should be relatively safe to do as soon as dhellmann is done making milestone 3 things happen | 23:08 |
clarkb | pabelanger: we could. Then upload to osic real quick and see if it works | 23:09 |
clarkb | pabelanger: probably a decent idea. Just make sure that glean hits pypi first | 23:09 |
*** ijw has quit IRC | 23:09 | |
*** salv-orlando has quit IRC | 23:10 | |
dhellmann | clarkb : the plan is to finish tagging by 1400 UTC tomorrow when the release team meeting starts, then branch after that for the libs that don't have branches, then some time later in the day tomorrow there's gerrit downtime IIRC. I can let you know here when we're done with the branches | 23:10 |
*** markvoelker has joined #openstack-infra | 23:10 | |
clarkb | dhellmann: sounds good thanks | 23:11 |
clarkb | mordred: question about that caching change. We have two different cache settings in the one clouds.yaml | 23:11 |
clarkb | mordred: yes | 23:11 |
mordred | clarkb: yah - one is the per-resource cache setting - which is really the batch/poll setting | 23:11 |
clarkb | looks like one is for the inventory cache? | 23:11 |
mordred | clarkb: the other is "how long should an entire copy of the inventory be kept around" | 23:12 |
clarkb | and it won't conflate the two? | 23:12 |
mordred | nope. they're completely different things | 23:12 |
mordred | if there is a valid inventory cache, ansible won't execute any shade calls at all | 23:12 |
clarkb | ah gotcha | 23:12 |
clarkb | but while building an inventory it will use the other cache settings | 23:12 |
mordred | yah | 23:13 |
mordred | but we're not there yet | 23:13 |
jeblair | clarkb, pabelanger, mordred: this is the number of counts of ansible ssh failures per job over the past month: http://paste.openstack.org/show/565906/ | 23:13 |
jeblair | we'd probably need to normalize that by number of jobs run to really pick up a pattern | 23:14 |
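[Editor's sketch, not part of the channel log.] Normalizing as suggested above means dividing each job's failure count by the number of times that job ran, so that frequently run jobs do not dominate the ranking. In the example below only the 3939 figure comes from the discussion; the run totals, the second job name, and the `failure_rates` helper are hypothetical and exist purely to show the calculation (Python 3 division is assumed).

```python
def failure_rates(failures, runs):
    """Return (job, failures / runs) pairs, highest failure rate first."""
    rates = {}
    for job, failed in failures.items():
        total = runs.get(job)
        if total:  # skip jobs with unknown or zero run counts
            rates[job] = failed / total
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


# 3939 is the count quoted in channel; every other number is made up.
failures = {
    "gate-tempest-dsvm-neutron-full-ubuntu-xenial": 3939,
    "hypothetical-multinode-job": 40,
}
runs = {
    "gate-tempest-dsvm-neutron-full-ubuntu-xenial": 100000,
    "hypothetical-multinode-job": 200,
}

ranked = failure_rates(failures, runs)
for job, rate in ranked:
    print("%-50s %.4f" % (job, rate))
```

With these invented totals the rarely run job ranks first despite its much smaller raw count, which is the kind of pattern raw counts alone would hide.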
mordred | jeblair: wow: 3939 gate-tempest-dsvm-neutron-full-ubuntu-xenial | 23:14 |
*** gouthamr_ has joined #openstack-infra | 23:14 | |
clarkb | mordred: thats the ipv6 issue | 23:14 |
*** gouthamr has quit IRC | 23:14 | |
jeblair | it is? | 23:14 |
clarkb | jeblair: ya it only affected jobs using neutron and it was every job using neutron basically | 23:14 |
jeblair | and that showed up as the problem i'm looking at? | 23:15 |
clarkb | jeblair: you should be able to tell reasonably well if it drops off in the last week or so | 23:15 |
*** shashank_hegde has quit IRC | 23:15 | |
*** xarses has quit IRC | 23:15 | |
jeblair | clarkb: without setting unreachable? | 23:15 |
ianw | timothyb89: so i guess running it for "old" and "new" doesn't quite work to show separate outputs -> http://logs.openstack.org/60/364560/3/check/gate-grenade-dsvm-neutron-ubuntu-trusty/adeae81/logs/stackviz/#/stdin | 23:15 |
clarkb | jeblair: I am not sure what that means | 23:16 |
pabelanger | I think ipv6 was different, because those jobs were requeued in zuul | 23:16 |
clarkb | jeblair: pabelanger I think if you compare the weekly numbers for the last week and the 3 weeks before those jobs should fall off as being that bad | 23:17 |
*** rvasilets___ has quit IRC | 23:17 | |
jeblair | clarkb: i'm not convinced that my methodology is not already excluding those | 23:19 |
openstackgerrit | Doug Hellmann proposed openstack-infra/project-config: fix networking-hyperv release acls https://review.openstack.org/364591 | 23:19 |
jeblair | clarkb: but i don't want to spend any more days on this than i already am | 23:19 |
jeblair | clarkb: so i will just re-run it for the last week | 23:19 |
clarkb | ok I think we would probably have noticed with freeze if neutron was failing that hard. We certainly noticed it when ipv6 was broken | 23:19 |
jeblair | clarkb: most of these are being re-queued by zuul | 23:20 |
clarkb | yes that was the behavior we saw with ipv6. The job would run then timeout then ansible would rerun it. | 23:20 |
clarkb | the impact was in slowing down the throughput of the queues by a lot | 23:20 |
jeblair | clarkb: these aren't timeouts | 23:20 |
pabelanger | clarkb: mordred: BTW, I noticed this gem while chasing the ssh-server stop / start issue: http://paste.openstack.org/show/565907/ | 23:21 |
pabelanger | not sure how to fix that atm | 23:21 |
clarkb | jeblair: yes | 23:21 |
clarkb | jeblair: with ipv6 the existing connections had to timeout then we would get the connection error | 23:21 |
clarkb | we saw both things together because the hosts just became unroutable | 23:21 |
clarkb | so tcp thinks its still connected for X amount of time before it gives up and then connection fails | 23:22 |
jeblair | clarkb: okay, that's similar enough that we'd have to get into the weeds to figure out if i'm picking it up. so i'll still just do the last week. | 23:23 |
jeblair | so just starting with aug 26 | 23:25 |
mordred | pabelanger: wow | 23:25 |
mordred | pabelanger: uhm ... what is that? | 23:25 |
mordred | yah | 23:26 |
jeblair | clarkb, pabelanger, mordred: http://paste.openstack.org/show/565908/ | 23:26 |
clarkb | and glean is gonna try to configure all those interfaces | 23:26 |
mordred | yah | 23:26 |
mordred | and fail | 23:26 |
mordred | jeblair: oh that's much better | 23:26 |
clarkb | ya thats more like I would expect with multinode being more common for whatever reason | 23:27 |
*** annegentle has joined #openstack-infra | 23:27 | |
*** ilyashakhat_mobi has joined #openstack-infra | 23:29 | |
pabelanger | mordred: Ya, a neutron job. spotted it in passing | 23:29 |
jeblair | you can see the grouping i was talking about | 23:29 |
mordred | pabelanger: oh - wow - is glean getting triggered by udev/systemd every time neutron creates an interface??? | 23:30 |
clarkb | mordred: yes | 23:30 |
pabelanger | mordred: i think so | 23:30 |
clarkb | I think its fine if noisy | 23:31 |
mordred | https://review.openstack.org/#/c/364510/ clarkb, jeblair: last piece of inefficient nodepool ... it has passed http://logs.openstack.org/10/364510/3/check/gate-dsvm-nodepool-src-shade/67d715e/ already, which contains the shade fix that enables it (it failed a previous run of that, so the success shows the shade fix fixed it) | 23:31 |
jeblair | 13 jobs failed that way within a second of each other at 21:07 today | 23:31 |
mordred | jeblair: it's interesting to me that it clusters like that ... | 23:31 |
mordred | jeblair: but that also makes me think "cloud network hiccup" | 23:31 |
jeblair | mordred: yes, though it is across all of our launchers, and across clouds | 23:31 |
jeblair | so it's at least a datacenter-scale hiccup | 23:32 |
clarkb | one commonality seems to be the job/project? | 23:32 |
clarkb | like 12:54 today a bunch of multinodes fail | 23:32 |
jeblair | clarkb: yeah, let me regenerate this with project + change info as well | 23:32 |
pabelanger | clarkb: ubuntu-xenial DIB started | 23:33 |
*** chlong has quit IRC | 23:33 | |
*** ilyashakhat_mobi has quit IRC | 23:34 | |
ianw | how do we feel about devstack-gate making symlinks in the log directory | 23:34 |
*** gongysh has joined #openstack-infra | 23:35 | |
clarkb | ianw: is this related to the subunit thing? | 23:37 |
*** vhosakot has quit IRC | 23:37 | |
clarkb | ianw: its probably fine though we should be careful not to copy them with the logs | 23:37 |
clarkb | er rather we should follow the links when copying | 23:37 |
ianw | clarkb: yeah, splitting stackviz between "old" and "new" runs in grenade output, but not duplicating the ~3mb of js that's driving it | 23:38 |
clarkb | ianw: I just had to symlink /usr/libexec/qemu-bridge-helper to /usr/lib/qemu-bridge-helper because virsh would not honor the path I set in /etc/libvirt/qemu.conf | 23:39 |
clarkb | so if symlinks can solve this problem too that seems fine with me | 23:39 |
ianw | heh, apparently libncurses split itself in two between fedora 23 & 24, so the other day debugging the sfdisk stuff i symlinked half of the new ncurses back to a libncurses.so which had just enough symbols to make it work. symlinks can do anything :) | 23:41 |
mordred | hahahaha | 23:43 |
clarkb | pabelanger: yup | 23:43 |
pabelanger | okay | 23:43 |
pabelanger | well, giving up now | 23:43 |
pabelanger | since nodepool is building it | 23:43 |
*** xarses has joined #openstack-infra | 23:44 | |
mordred | clarkb, jeblair: woot. the change finished testing. if you look at http://logs.openstack.org/10/364510/3/check/gate-dsvm-nodepool/3db0c11/logs/screen-nodepool.txt.gz you can see the issue that the shade patch we landed fixed. then in the logs for the passing job all is happy | 23:46 |
*** zhurong has joined #openstack-infra | 23:48 | |
jeblair | i will be so happy when we have afs on the launchers. i'm doing all kinds of copying files around between them right now and it's silly | 23:49 |
*** thcipriani is now known as thcipriani|afk | 23:49 | |
mordred | jeblair: ++ | 23:50 |
jeblair | i guess i should poke at https://review.openstack.org/305477 | 23:50 |
*** markvoelker has quit IRC | 23:51 | |
jeblair | clarkb, mordred, pabelanger: http://paste.openstack.org/show/565911/ | 23:53 |
jeblair | it looks like the groupings are not simply "all the jobs for a change" | 23:54 |
jeblair | that seems pretty spread out too | 23:54 |
jeblair | are they all bluebox+osic? | 23:55 |
*** waht has quit IRC | 23:56 | |
jeblair | no, but mostly. | 23:56 |
jeblair | mostly osic makes sense. mostly bluebox does not. | 23:56 |
jeblair | i wonder if we're looking at a real-time map of internet routing issues :) | 23:57 |
mordred | jeblair: we tend to expose many issues | 23:58 |
mordred | jeblair: why not internet routing issues? | 23:58 |
*** zhurong has quit IRC | 23:59 | |
jeblair | the project list is pretty broad too. | 23:59 |
