*** zhiwei has joined #openstack-infra | 00:00 | |
mordred | clarkb: we could rewrite it in go | 00:01 |
---|---|---|
*** vishy has quit IRC | 00:02 | |
clarkb | mordred: yes! I actually just considered doing it in C because you can link against crm114 directly | 00:02 |
clarkb | but after some local testing I think I may be overthinking this and closing stdin is sufficient to make crm114 go away | 00:03 |
*** mfink has joined #openstack-infra | 00:03 | |
mordred | kk | 00:03 |
*** annegent_ has joined #openstack-infra | 00:03 | |
*** r-daneel has quit IRC | 00:03 | |
*** zz_gondoi has quit IRC | 00:03 | |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/config: Handle log processing subprocess cleanup better https://review.openstack.org/118924 | 00:03 |
*** r-daneel has joined #openstack-infra | 00:04 | |
clarkb | jeblair: ^ is that better? the only major difference there other than the exception catching is that I don't bother to read stdout and stderr, I just close them | 00:04 |
*** nimrodsun_ has quit IRC | 00:04 | |
*** dkliban is now known as dkliban_afk | 00:06 | |
*** unicell has joined #openstack-infra | 00:06 | |
*** vishy has joined #openstack-infra | 00:06 | |
*** ZZelle_ has quit IRC | 00:06 | |
*** nimrodsun_ has joined #openstack-infra | 00:06 | |
*** paulrad has joined #openstack-infra | 00:06 | |
*** emagana has joined #openstack-infra | 00:07 | |
*** yamamoto_ has joined #openstack-infra | 00:07 | |
*** praneshp_ has joined #openstack-infra | 00:07 | |
*** flaper87 is now known as flaper87|afk | 00:08 | |
*** gondoi has joined #openstack-infra | 00:08 | |
*** signed8b_ has joined #openstack-infra | 00:08 | |
*** annegent_ has quit IRC | 00:08 | |
*** baoli_ has joined #openstack-infra | 00:08 | |
*** dmsimard_away is now known as dmsimard | 00:09 | |
*** imcsk8_ has joined #openstack-infra | 00:09 | |
clarkb | we merged slightly more code over the last 24 hours than the last few 24 hour periods | 00:09 |
*** rpodolyaka1 has joined #openstack-infra | 00:10 | |
*** dmsimard is now known as dmsimard_away | 00:10 | |
jeblair | clarkb: i literally don't know. :) i'm not trying to say i think any of these approaches are wrong, i'm just saying that they are not an incremental improvement; they completely change how the process is terminated. i'd need a day of reading and testing to understand the impacts. | 00:10 |
*** paulrad has quit IRC | 00:11 | |
*** nikhil__1 has joined #openstack-infra | 00:11 | |
*** therve has joined #openstack-infra | 00:11 | |
*** ekarlso- has joined #openstack-infra | 00:11 | |
*** slagle_ has joined #openstack-infra | 00:11 | |
*** devananda has joined #openstack-infra | 00:11 | |
openstackgerrit | Ian Wienand proposed a change to openstack-infra/config: Add bare-f20 nodes https://review.openstack.org/117397 | 00:12 |
openstackgerrit | Ian Wienand proposed a change to openstack-infra/config: Get postgresql puppet from upstream master https://review.openstack.org/117396 | 00:12 |
*** dmitryme has quit IRC | 00:12 | |
*** jcooley has quit IRC | 00:12 | |
*** cypriotme has quit IRC | 00:12 | |
*** baoli has quit IRC | 00:12 | |
*** emagana_ has quit IRC | 00:12 | |
*** signed8bit has quit IRC | 00:12 | |
*** imcsk8 has quit IRC | 00:12 | |
*** yamamoto has quit IRC | 00:12 | |
*** rpodolyaka has quit IRC | 00:12 | |
*** jesusaurus has quit IRC | 00:12 | |
*** ekarlso has quit IRC | 00:12 | |
*** Steap has quit IRC | 00:12 | |
*** therve` has quit IRC | 00:12 | |
*** nikhil___ has quit IRC | 00:12 | |
*** asadoughi has quit IRC | 00:12 | |
*** dhellmann_ has quit IRC | 00:12 | |
*** rfolco has quit IRC | 00:12 | |
*** kmartin has quit IRC | 00:12 | |
*** esmute has quit IRC | 00:12 | |
*** devanand1 has quit IRC | 00:12 | |
*** praneshp has quit IRC | 00:12 | |
*** bogdando has quit IRC | 00:12 | |
*** slagle has quit IRC | 00:12 | |
*** bogdando has joined #openstack-infra | 00:12 | |
*** esmute has joined #openstack-infra | 00:12 | |
*** praneshp_ is now known as praneshp | 00:12 | |
*** annegent_ has joined #openstack-infra | 00:12 | |
clarkb | jeblair: yup understood. I myself am just trying to wrap my head around it. I do note that the latest patchset is at the very least probably a reasonable step given the testing I did locally. I wrote a small script that Popen'd crm, sleeps for 60 seconds (or long enough to confirm crm114 is running), writes to crm114 stdin, closes stdin, closes stdout, closes stderr, then waits and that wait returns | 00:12 |
*** jcooley has joined #openstack-infra | 00:12 | |
clarkb | so crm114 is dying gracefully without an explicit kill | 00:12 |
*** dhellmann has joined #openstack-infra | 00:12 | |
jeblair | clarkb: ah, i see you tested that, sorry i was reading backwards :) | 00:12 |
*** jesusaurus has joined #openstack-infra | 00:12 | |
*** annegent_ has quit IRC | 00:12 | |
*** asadoughi has joined #openstack-infra | 00:13 | |
jeblair | clarkb: did crm114 have pending (unread) data on stdout when you closed stdin? | 00:13 |
*** dmitryme has joined #openstack-infra | 00:13 | |
clarkb | jeblair: oh good question, probably not | 00:13 |
clarkb | the message I wrote to it was relatively small | 00:13 |
clarkb | oh wait I am mixing up fds | 00:14 |
clarkb | I did not check but it should have. I can update my script to check stdout | 00:14 |
fungi | logstash job queue has just gone through the roof | 00:14 |
fungi | presumably related | 00:14 |
clarkb | fungi: ya once I diagnosed the issue I restarted the two bad worker processes | 00:14 |
clarkb | they were acting like dev null previously so kept the queue low | 00:14 |
clarkb | now not so much :) | 00:15 |
fungi | oh, too fun | 00:15 |
fungi | bitbucket processing | 00:15 |
*** emagana has quit IRC | 00:16 | |
*** Steap has joined #openstack-infra | 00:16 | |
*** emagana has joined #openstack-infra | 00:16 | |
ianw | clarkb / fungi / anyone : be great if i could get the f20-bare changes moving (https://review.openstack.org/117396 <- puppet fix, https://review.openstack.org/117397 <- f20-bare node config) | 00:17 |
clarkb | ianw: crinkle says they will be releasing that module real soon now, can we consume it from the release rather than master? | 00:17 |
clarkb | crinkle: do you know when that is happening? | 00:17 |
ianw | clarkb: ok, for some values of "soon" :) | 00:18 |
*** packet has quit IRC | 00:18 | |
crinkle | clarkb: supposed to be this week, so hopefully tomorrow | 00:18 |
*** melwitt has quit IRC | 00:19 | |
*** melwitt has joined #openstack-infra | 00:19 | |
clarkb | jeblair: further testing shows there may not be anything on stdout | 00:19 |
clarkb | jeblair: I may need to pick this up in the morning when my brain can focus on it | 00:20 |
ianw | crinkle / clarkb : ok, i'll hold off | 00:20 |
*** emagana has quit IRC | 00:21 | |
*** Ryan_Lane1 has joined #openstack-infra | 00:21 | |
*** dims has joined #openstack-infra | 00:21 | |
*** rfolco has joined #openstack-infra | 00:21 | |
*** SumitNaiksatam has joined #openstack-infra | 00:21 | |
clarkb | jeblair: newlines are important! according to select there is data waiting there | 00:22 |
clarkb | jeblair: I can paste my test script | 00:22 |
*** Ryan_Lane has quit IRC | 00:22 | |
clarkb | jeblair: http://paste.openstack.org/show/105498/ | 00:22 |
openstackgerrit | James E. Blair proposed a change to openstack-infra/config: Reduce min-ready https://review.openstack.org/118930 | 00:23 |
jeblair | clarkb, ianw: ^ | 00:23 |
*** otherwiseguy has quit IRC | 00:23 | |
*** sballe has quit IRC | 00:23 | |
jeblair | in case we decide we don't want 50 nodes sitting idle while we're under contention. | 00:23 |
*** gokrokve has joined #openstack-infra | 00:24 | |
*** dims_ has joined #openstack-infra | 00:24 | |
jeblair | ianw: any work you'd like to do to improve the situation would be welcome :) | 00:24 |
*** dims has quit IRC | 00:27 | |
*** zhiwei has quit IRC | 00:29 | |
ianw | jeblair: hmm, in your opinion is min-ready supposed to be more of a hint than a rule? documentation isn't clear on the semantics of it "The min-ready key is optional and defaults to 2." | 00:30 |
jeblair | ianw: it's so that if there is no load, we still have some nodes ready to run jobs immediately. | 00:32 |
jeblair | ianw: when there isn't enough capacity to satisfy demand, we should not use min-ready to reserve nodes that could otherwise be used | 00:32 |
jeblair | but that's what's happening now | 00:32 |
*** SumitNaiksatam has quit IRC | 00:33 | |
*** yamahata has joined #openstack-infra | 00:33 | |
jeblair | i'll try to get a change into geard soon that would let us do a fifo allocator. though that will tie nodepool to geard until we get it (or something like it) upstream in c-gearman | 00:33 |
*** signed8b_ has quit IRC | 00:37 | |
*** lcheng_ has joined #openstack-infra | 00:37 | |
*** yamahata has quit IRC | 00:39 | |
*** yamahata has joined #openstack-infra | 00:39 | |
ianw | jeblair: what if we moved https://github.com/openstack-infra/nodepool/blob/master/nodepool/nodepool.py#L1472 (capacity calcs for providers) above https://github.com/openstack-infra/nodepool/blob/master/nodepool/nodepool.py#L1456 (demand calculation) and calculated a flag if things were at capacity | 00:41 |
ianw | then in the demand calculation, ignore min-ready if the flag is set | 00:41 |
*** bdpayne has quit IRC | 00:42 | |
*** r-daneel has quit IRC | 00:43 | |
*** laca has quit IRC | 00:43 | |
*** gokrokve_ has joined #openstack-infra | 00:43 | |
*** shayneburgess has quit IRC | 00:43 | |
*** gokrokve has quit IRC | 00:43 | |
*** marcoemorais has quit IRC | 00:43 | |
*** esmute has quit IRC | 00:50 | |
*** esmute has joined #openstack-infra | 00:50 | |
*** gokrokve_ has quit IRC | 00:50 | |
*** SumitNaiksatam has joined #openstack-infra | 00:51 | |
*** bogdando has quit IRC | 00:52 | |
*** Ryan_Lane1 has quit IRC | 00:54 | |
*** homeless has quit IRC | 00:54 | |
*** gargola has quit IRC | 00:55 | |
*** asettle has joined #openstack-infra | 00:56 | |
*** zhiwei has joined #openstack-infra | 00:57 | |
*** tgohad has joined #openstack-infra | 01:00 | |
*** gokrokve has joined #openstack-infra | 01:00 | |
*** mmaglana has quit IRC | 01:02 | |
*** tgohad__ has joined #openstack-infra | 01:03 | |
*** tgohad has quit IRC | 01:03 | |
*** tsg has quit IRC | 01:03 | |
*** bogdando has joined #openstack-infra | 01:04 | |
*** koolhead17 has quit IRC | 01:05 | |
*** otter768 has joined #openstack-infra | 01:11 | |
*** mriedem has joined #openstack-infra | 01:12 | |
*** tgohad__ has quit IRC | 01:13 | |
*** paulrad has joined #openstack-infra | 01:14 | |
*** Ryan_Lane has joined #openstack-infra | 01:16 | |
*** Ryan_Lane has quit IRC | 01:17 | |
*** jyuso has joined #openstack-infra | 01:18 | |
*** paulrad has quit IRC | 01:19 | |
*** liusheng has joined #openstack-infra | 01:21 | |
*** ashaeron has quit IRC | 01:21 | |
openstackgerrit | Ian Wienand proposed a change to openstack-infra/nodepool: Ignore min-ready when at capacity https://review.openstack.org/118939 | 01:25 |
ianw | jeblair: ^ gotta think about how to test it ... | 01:26 |
openstackgerrit | Davanum Srinivas (dims) proposed a change to openstack-infra/config: Add docs jobs for some oslo projects https://review.openstack.org/118940 | 01:31 |
*** bnemec has quit IRC | 01:32 | |
*** tsg has joined #openstack-infra | 01:32 | |
*** bnemec has joined #openstack-infra | 01:32 | |
*** Hefeweizen has joined #openstack-infra | 01:34 | |
*** signed8bit has joined #openstack-infra | 01:36 | |
*** paulrad has joined #openstack-infra | 01:37 | |
*** tsg has quit IRC | 01:41 | |
*** tsg has joined #openstack-infra | 01:41 | |
*** signed8bit has quit IRC | 01:41 | |
*** signed8bit has joined #openstack-infra | 01:41 | |
*** paulrad has quit IRC | 01:41 | |
*** baohua has joined #openstack-infra | 01:42 | |
*** anvilmutant has joined #openstack-infra | 01:44 | |
*** anvilmutant has quit IRC | 01:45 | |
*** yaguang has joined #openstack-infra | 01:48 | |
jogo | we currently have a missing log file https://review.openstack.org/#/c/113658/ | 01:49 |
jogo | we create but don't collect javelin.log | 01:50 |
*** yamamoto_ has quit IRC | 01:51 | |
*** HenryG has quit IRC | 01:51 | |
*** nosnos has joined #openstack-infra | 01:54 | |
*** yaguang has quit IRC | 01:54 | |
*** dmsimard_away is now known as dmsimard | 01:57 | |
*** yjiang5 has quit IRC | 02:01 | |
*** mriedem has quit IRC | 02:05 | |
openstackgerrit | K Jonathan Harker proposed a change to openstack-infra/config: Begin cleaning up bashate failures https://review.openstack.org/118944 | 02:06 |
*** dmsimard is now known as dmsimard_away | 02:09 | |
*** yaguang has joined #openstack-infra | 02:14 | |
*** zz_dimtruck is now known as dimtruck | 02:16 | |
*** asettle has quit IRC | 02:17 | |
*** yamamoto_ has joined #openstack-infra | 02:32 | |
*** lcheng_ has quit IRC | 02:35 | |
*** dougwig has quit IRC | 02:36 | |
*** hdd has quit IRC | 02:37 | |
*** dougwig_ has joined #openstack-infra | 02:39 | |
*** bhuvan has quit IRC | 02:39 | |
*** HenryG has joined #openstack-infra | 02:40 | |
*** ianw has quit IRC | 02:46 | |
*** ianw has joined #openstack-infra | 02:46 | |
*** praneshp has quit IRC | 02:50 | |
openstackgerrit | melanie witt proposed a change to openstack-infra/devstack-gate: collect paramiko logs from tempest runs https://review.openstack.org/118947 | 02:50 |
*** tsg has quit IRC | 02:58 | |
*** KanagarajM has joined #openstack-infra | 02:58 | |
*** david-lyle has joined #openstack-infra | 02:59 | |
*** yfried has joined #openstack-infra | 02:59 | |
*** pcrews has quit IRC | 03:02 | |
*** zz_naotok is now known as naotok | 03:04 | |
*** yfried has quit IRC | 03:07 | |
*** mmaglana has joined #openstack-infra | 03:10 | |
*** baoli_ has quit IRC | 03:15 | |
*** baoli has joined #openstack-infra | 03:20 | |
jogo | random question: how long does 'nova list' take for your accounts in rax and HP? | 03:20 |
*** stevemar has joined #openstack-infra | 03:27 | |
*** TravT has left #openstack-infra | 03:32 | |
*** signed8bit has quit IRC | 03:35 | |
*** amcrn_ has quit IRC | 03:36 | |
*** shardy_z has quit IRC | 03:40 | |
*** rushiagr_away is now known as rushiagr | 03:42 | |
*** asettle has joined #openstack-infra | 03:43 | |
*** david-lyle has quit IRC | 03:44 | |
*** tomoe_ has joined #openstack-infra | 03:45 | |
*** unicell1 has joined #openstack-infra | 03:51 | |
*** Mithrandir has quit IRC | 03:53 | |
*** Mithrandir has joined #openstack-infra | 03:53 | |
*** unicell has quit IRC | 03:53 | |
*** daya_k has joined #openstack-infra | 03:59 | |
*** otter768 has quit IRC | 04:01 | |
openstackgerrit | James E. Blair proposed a change to stackforge/gertty: Change _ to - in config YAML https://review.openstack.org/118954 | 04:04 |
openstackgerrit | James E. Blair proposed a change to stackforge/gertty: Change help key https://review.openstack.org/118955 | 04:04 |
openstackgerrit | James E. Blair proposed a change to stackforge/gertty: Clear error flag when changing screen https://review.openstack.org/118956 | 04:04 |
openstackgerrit | James E. Blair proposed a change to stackforge/gertty: Update README and install sample configs https://review.openstack.org/118957 | 04:04 |
*** rushiagr is now known as rushiagr_away | 04:07 | |
*** signed8bit has joined #openstack-infra | 04:08 | |
*** signed8bit has quit IRC | 04:09 | |
*** signed8bit has joined #openstack-infra | 04:10 | |
*** vhoward has joined #openstack-infra | 04:12 | |
*** _nadya_ has joined #openstack-infra | 04:12 | |
*** craigbr has quit IRC | 04:15 | |
*** _nadya_ has quit IRC | 04:17 | |
openstackgerrit | Andreas Jaeger proposed a change to openstack-infra/config: Remove docutils pin https://review.openstack.org/117172 | 04:17 |
jogo | related to https://bugs.launchpad.net/python-novaclient/+bug/1202179 | 04:17 |
uvirtbot | Launchpad bug 1202179 in python-novaclient "findall in novaclient/base.py is inefficient" [Undecided,In progress] | 04:17 |
*** Ryan_Lane has joined #openstack-infra | 04:18 | |
*** dims_ has quit IRC | 04:21 | |
*** hdd has joined #openstack-infra | 04:21 | |
*** dims has joined #openstack-infra | 04:21 | |
openstackgerrit | Jamie Lennox proposed a change to openstack-dev/cookiecutter: Use oslotest rather than copying helpers https://review.openstack.org/118961 | 04:23 |
*** dims has quit IRC | 04:26 | |
*** morganfainberg_Z is now known as morganfainberg | 04:29 | |
*** melwitt has quit IRC | 04:39 | |
*** bhuvan has joined #openstack-infra | 04:47 | |
*** _nadya_ has joined #openstack-infra | 04:47 | |
*** rushiagr_away is now known as rushiagr | 04:51 | |
*** asettle has quit IRC | 04:51 | |
openstackgerrit | Jamie Lennox proposed a change to openstack-dev/cookiecutter: Automatically initialize git when finished. https://review.openstack.org/118967 | 04:57 |
openstackgerrit | Jamie Lennox proposed a change to openstack-dev/cookiecutter: Automatically initialize git when finished. https://review.openstack.org/118967 | 04:59 |
openstackgerrit | Joshua Hesketh proposed a change to openstack-infra/zuul: Allow a pipeline to specify alternative gerrit acc https://review.openstack.org/97391 | 04:59 |
*** hdd has quit IRC | 05:03 | |
*** nelsnelson has quit IRC | 05:04 | |
*** ppai has joined #openstack-infra | 05:05 | |
*** KanagarajM has quit IRC | 05:06 | |
*** _nadya_ has quit IRC | 05:10 | |
*** _nadya_ has joined #openstack-infra | 05:11 | |
*** TravT has joined #openstack-infra | 05:13 | |
TravT | so zuul has been terrorizing me recently. | 05:14 |
TravT | watching build 3937 right now change 111483,55 | 05:15 |
TravT | just hit an error in gate-tempest-dsvm-neutron-heat-slow | 05:15 |
TravT | it failed with: /opt/stack/new/devstack/extras.d/70-sahara.sh: line 7: /opt/stack/new/devstack/lib/sahara-dashboard: No such file or directory | 05:15 |
TravT | which has nothing to do with this patch (glance patch) | 05:15 |
TravT | is there any way to tell if this patch will rerun tests or do we have to resubmit again? | 05:16 |
clarkb | TravT: it actually failed on http://logs.openstack.org/83/111483/55/gate/gate-tempest-dsvm-neutron-heat-slow/5991c59/logs/devstacklog.txt.gz#_2014-09-04_05_00_35_790 which isn't necessarily any better. since all changes ahead of it failed as well 111483 should be evicted when the tests are done | 05:17 |
*** _nadya_ has quit IRC | 05:17 | |
TravT | clarkb: thanks... | 05:18 |
TravT | guess it's back to the end of the line... | 05:20 |
clarkb | it looks like there may have been network trouble between hpcloud and the pypi mirror | 05:20 |
clarkb | there are a few jobs that failed in a similar way all in hpcloud | 05:20 |
*** loki184 has joined #openstack-infra | 05:21 | |
TravT | clarkb: is there any shortcut to get it back into gate? | 05:22 |
clarkb | if it is a gate fixing bug we can promote it | 05:22 |
*** cipcosma has joined #openstack-infra | 05:22 | |
TravT | or do i have to get the code review + a? | 05:22 |
*** baoli has quit IRC | 05:22 | |
clarkb | you don't need to get another approval, you can recheck it | 05:22 |
TravT | ok, so if I just do 'recheck no bug' it'll jump back into workflow? | 05:23 |
TravT | assuming nothing else fails | 05:23 |
clarkb | yes | 05:23 |
TravT | ok, i'll watch it and hope that is it. we've got a devstack patch waiting to be submitted for this patch to clear. | 05:24 |
openstackgerrit | Aaron Rosen proposed a change to openstack-dev/cookiecutter: Add additional gitignores .swo/.swn https://review.openstack.org/118923 | 05:27 |
*** unicell1 has quit IRC | 05:34 | |
openstackgerrit | Aaron Rosen proposed a change to openstack-dev/cookiecutter: Add additional gitignores .sw? https://review.openstack.org/118923 | 05:37 |
openstackgerrit | Joshua Hesketh proposed a change to openstack-infra/zuul: Add support for negative requirements https://review.openstack.org/102726 | 05:38 |
*** lcheng_ has joined #openstack-infra | 05:39 | |
*** dimtruck is now known as zz_dimtruck | 05:40 | |
*** KanagarajM has joined #openstack-infra | 05:40 | |
*** ppai has quit IRC | 05:42 | |
*** harlowja_at_home has joined #openstack-infra | 05:43 | |
*** Longgeek has joined #openstack-infra | 05:44 | |
*** signed8bit has quit IRC | 05:44 | |
*** _nadya_ has joined #openstack-infra | 05:48 | |
*** harlowja is now known as harlowja_away | 05:50 | |
*** Ryan_Lane has quit IRC | 05:51 | |
*** afazekas has joined #openstack-infra | 05:51 | |
*** yaguang has quit IRC | 05:54 | |
*** ppai has joined #openstack-infra | 05:57 | |
*** BOKALDO has joined #openstack-infra | 06:03 | |
TravT | clarkb: that patch only failed due to the pypi network error. | 06:05 |
TravT | clarkb: i've rechecked no bug | 06:05 |
TravT | can you promote it back up? | 06:05 |
TravT | 111483,55 | 06:06 |
*** cipcosma has quit IRC | 06:06 | |
*** ppai has quit IRC | 06:08 | |
pleia2 | hooray, no i18n meeting for me tonight, had chat with Daisy now instead \o/ | 06:08 |
pleia2 | ah timezones | 06:09 |
* pleia2 rest & | 06:09 | |
dtroyer | TravT, clarkb: I just caught the grenade error re sahara-dashboard. If that is causing real failures, the fix is in https://review.openstack.org/#/c/118090/ if needed. | 06:11 |
*** emagana has joined #openstack-infra | 06:12 | |
*** Longgeek has quit IRC | 06:12 | |
*** emagana has quit IRC | 06:12 | |
*** flaper87|afk is now known as flaper87 | 06:12 | |
*** yfried has joined #openstack-infra | 06:12 | |
TravT | dtroyer: clarkb: i don't know, but all the other tests succeeded. The pypi error might be a problem too. | 06:14 |
TravT | dtroyer: clarkb: either way, that sahara dashboard fix should get promoted up. | 06:14 |
*** Longgeek has joined #openstack-infra | 06:18 | |
*** yfried_ has joined #openstack-infra | 06:19 | |
*** yfried has quit IRC | 06:20 | |
*** yamamoto_ has quit IRC | 06:21 | |
*** ppai has joined #openstack-infra | 06:21 | |
*** Longgeek_ has joined #openstack-infra | 06:21 | |
*** Longgeek_ has quit IRC | 06:22 | |
*** harlowja_at_home has quit IRC | 06:23 | |
*** Longgeek_ has joined #openstack-infra | 06:23 | |
openstackgerrit | Longgeek proposed a change to openstack-infra/puppet-yum: Add .gitreview Rakefile files and update puppet coding style https://review.openstack.org/102876 | 06:24 |
*** Longgeek has quit IRC | 06:24 | |
*** Longgeek has joined #openstack-infra | 06:24 | |
*** amotoki has joined #openstack-infra | 06:25 | |
*** Longgeek_ has quit IRC | 06:26 | |
*** stevemar has quit IRC | 06:31 | |
*** vhoward has left #openstack-infra | 06:35 | |
*** yaguang has joined #openstack-infra | 06:41 | |
*** KanagarajM has quit IRC | 06:48 | |
*** marun_ has joined #openstack-infra | 06:50 | |
*** mflobo has joined #openstack-infra | 06:50 | |
*** daya_k has quit IRC | 06:53 | |
*** skolekonov has joined #openstack-infra | 06:54 | |
*** arxcruz has joined #openstack-infra | 06:54 | |
*** yamamoto_ has joined #openstack-infra | 06:55 | |
*** _nadya_ has quit IRC | 06:57 | |
*** yolanda has joined #openstack-infra | 06:58 | |
*** sunrenjie6 has joined #openstack-infra | 07:01 | |
*** jbryce has quit IRC | 07:03 | |
*** jbryce_ has joined #openstack-infra | 07:03 | |
*** jbryce_ is now known as jbryce | 07:03 | |
*** flaviof_zzz has quit IRC | 07:03 | |
*** sarob has quit IRC | 07:03 | |
*** lbragstad has quit IRC | 07:03 | |
*** roaet_ has quit IRC | 07:03 | |
*** stevebaker has quit IRC | 07:04 | |
*** adam_g has quit IRC | 07:04 | |
*** wendar has quit IRC | 07:04 | |
*** wendar_ has joined #openstack-infra | 07:04 | |
*** roaet_ has joined #openstack-infra | 07:05 | |
*** lbragstad has joined #openstack-infra | 07:05 | |
*** sunrenjie6 has quit IRC | 07:05 | |
*** jpich has joined #openstack-infra | 07:05 | |
*** stevebaker has joined #openstack-infra | 07:06 | |
*** adam_g has joined #openstack-infra | 07:06 | |
*** adam_g has quit IRC | 07:06 | |
*** adam_g has joined #openstack-infra | 07:06 | |
*** sunrenjie6 has joined #openstack-infra | 07:07 | |
*** luqas has joined #openstack-infra | 07:07 | |
*** lcheng_ has quit IRC | 07:08 | |
*** jistr has joined #openstack-infra | 07:09 | |
*** andreykurilin has joined #openstack-infra | 07:10 | |
*** yamamoto_ has quit IRC | 07:10 | |
*** sunrenjie6 has quit IRC | 07:11 | |
*** jamespage_ has joined #openstack-infra | 07:11 | |
TravT | well, we hit another package install timeout on 111483 | 07:14 |
TravT | https://jenkins03.openstack.org/job/gate-glance-python27/764/console | 07:14 |
openstackgerrit | Yuan Zhou proposed a change to openstack/requirements: Bump pyeclib >= 0.9.5 for Swift Erasure Code project https://review.openstack.org/118986 | 07:14 |
*** sabeen has quit IRC | 07:14 | |
*** sabeen has joined #openstack-infra | 07:14 | |
*** jamespage_ has quit IRC | 07:14 | |
*** sunrenjie6 has joined #openstack-infra | 07:14 | |
*** _nadya_ has joined #openstack-infra | 07:16 | |
*** sunrenjie6 has quit IRC | 07:19 | |
*** KanagarajM has joined #openstack-infra | 07:20 | |
*** jgallard has joined #openstack-infra | 07:20 | |
*** Longgeek has quit IRC | 07:20 | |
*** Longgeek has joined #openstack-infra | 07:21 | |
*** pkoniszewski has joined #openstack-infra | 07:21 | |
*** jcoufal has joined #openstack-infra | 07:21 | |
*** KanagarajM has quit IRC | 07:23 | |
*** Longgeek_ has joined #openstack-infra | 07:26 | |
*** Longgeek_ has quit IRC | 07:27 | |
*** sarob has joined #openstack-infra | 07:27 | |
*** jlibosva has joined #openstack-infra | 07:27 | |
*** Longgeek_ has joined #openstack-infra | 07:27 | |
*** flaviof_zzz has joined #openstack-infra | 07:28 | |
*** Longgeek has quit IRC | 07:29 | |
*** doude has joined #openstack-infra | 07:29 | |
*** shardy has joined #openstack-infra | 07:30 | |
*** ildikov has joined #openstack-infra | 07:31 | |
*** bo_sh has joined #openstack-infra | 07:32 | |
*** dizquierdo has joined #openstack-infra | 07:32 | |
*** sunrenjie6 has joined #openstack-infra | 07:32 | |
*** pkoniszewski has quit IRC | 07:33 | |
*** Longgeek has joined #openstack-infra | 07:33 | |
*** pkoniszewski has joined #openstack-infra | 07:33 | |
bo_sh | hey guys, i want to use the log-pusher (client and worker). do i have to run them both? or one of them runs the other when Jenkins sends a ZMQ message? | 07:34 |
*** sunrenjie6 has quit IRC | 07:36 | |
*** Longgeek_ has quit IRC | 07:36 | |
*** Longgeek has quit IRC | 07:37 | |
*** ihrachyshka has joined #openstack-infra | 07:38 | |
*** Longgeek has joined #openstack-infra | 07:38 | |
openstackgerrit | yolanda.robla proposed a change to openstack-infra/config: Add tools to hack on infra with Docker https://review.openstack.org/105917 | 07:39 |
*** mspreitz has quit IRC | 07:40 | |
*** ashaeron has joined #openstack-infra | 07:42 | |
openstackgerrit | Joakim Löfgren proposed a change to openstack-infra/jenkins-job-builder: Add PMD publisher https://review.openstack.org/118312 | 07:42 |
*** ihrachyshka has quit IRC | 07:42 | |
*** e0ne has joined #openstack-infra | 07:46 | |
*** Guest42286 has joined #openstack-infra | 07:48 | |
*** mpaolino has joined #openstack-infra | 07:48 | |
*** mmaglana has quit IRC | 07:50 | |
*** Guest60909 is now known as Adri2000 | 07:50 | |
*** Adri2000 has joined #openstack-infra | 07:50 | |
*** HeOS has quit IRC | 07:51 | |
*** jp_at_hp has joined #openstack-infra | 07:53 | |
*** mrmartin has joined #openstack-infra | 07:54 | |
*** yamamoto_ has joined #openstack-infra | 07:54 | |
*** jgallard has quit IRC | 07:55 | |
*** jgallard has joined #openstack-infra | 07:55 | |
*** garyh has quit IRC | 07:58 | |
*** yamamoto_ has quit IRC | 07:58 | |
*** yamamoto_ has joined #openstack-infra | 07:59 | |
*** yamamoto_ has quit IRC | 08:01 | |
*** pkoniszewski has quit IRC | 08:05 | |
*** pkoniszewski has joined #openstack-infra | 08:06 | |
*** pblaho has joined #openstack-infra | 08:06 | |
openstackgerrit | Joshua Hesketh proposed a change to openstack-infra/zuul: Refactor sources out of triggers https://review.openstack.org/118993 | 08:07 |
*** bhuvan has quit IRC | 08:11 | |
*** sabeen has quit IRC | 08:11 | |
ttx | SergeyLukjanov: around ? There seems to be some attrition of test nodes on certain jobs | 08:12 |
ttx | Like jobs are queued on the check pipe for 13 hours | 08:12 |
openstackgerrit | Joshua Hesketh proposed a change to openstack-infra/config: Use the latest jquery on zuul https://review.openstack.org/98029 | 08:14 |
*** derekh has joined #openstack-infra | 08:15 | |
lifeless | derekh: hey | 08:16 |
*** yfried_ has quit IRC | 08:16 | |
derekh | lifeless: hi ya | 08:16 |
lifeless | derekh: how goes it? also tchaypo has a second cloud up | 08:17 |
tchaypo | For loose values of up | 08:18 |
*** ZZelle has quit IRC | 08:18 | |
*** dtantsur|afk is now known as dtantsur | 08:19 | |
tchaypo | There is certainly a cloud. Whether it's useful or not, I don't know | 08:19 |
*** ZZelle has joined #openstack-infra | 08:19 | |
derekh | lifeless: got back to hp1 yesterday, increased a few timeouts and reran the tests, gonna look into results today, I've pretty much freed up my time for this again | 08:19 |
*** mmaglana has joined #openstack-infra | 08:20 | |
*** andreykurilin has quit IRC | 08:21 | |
tchaypo | derekh: Do you have access to hp2? | 08:22 |
*** mmaglana_ has joined #openstack-infra | 08:22 | |
derekh | tchaypo: I'm not sure, let me find out | 08:24 |
tchaypo | I'm at the local Python ug tonight so I can't work with you tonight | 08:24 |
*** mmaglana has quit IRC | 08:25 | |
tchaypo | If you can poke at it and email me I can follow up tomorrow, or I can hang around tomorrow night and talk to you | 08:26 |
lifeless | tchaypo: tomorrow let's try the different router config | 08:27 |
tchaypo | Kk | 08:27 |
*** mmaglana_ has quit IRC | 08:27 | |
lifeless | tchaypo: there should be enough in the bug to figure it out | 08:27 |
lifeless | tchaypo: but if not ping me | 08:27 |
tchaypo | Which bug? | 08:28 |
derekh | tchaypo: yup, I got access to hp2 bastion | 08:28 |
*** garyh has joined #openstack-infra | 08:29 | |
*** daya_k has joined #openstack-infra | 08:30 | |
*** pblaho is now known as pblaho|afk | 08:32 | |
lifeless | tchaypo: the one on the l3 router not setting vlan tags | 08:35 |
*** pblaho|afk is now known as pblaho | 08:35 | |
*** MaxV has joined #openstack-infra | 08:37 | |
lyxus | Hi Folks, I saw an email going around about it, but I can't find it... What is the format needed to be in the right side of the reviewers | 08:37 |
tchaypo | I'm fairly sure some testenvs would be handy | 08:39 |
tchaypo | to get some i think i need to rebuild the ci-overcloud a bit smaller | 08:40 |
tchaypo | I think we probably want about 30 nodes for ci-overcloud and 60 for testenvs? | 08:40 |
ttx | SergeyLukjanov, fungi, jeblair, clarkb: there seems to be unusual attrition of test nodes that cause changes to linger on the check pipeline, with queued jobs | 08:41 |
lifeless | tchaypo: -> #tripleo | 08:41 |
*** bhuvan has joined #openstack-infra | 08:42 | |
ttx | see ever-growing "deleting" vs. "in-use" ratio | 08:42 |
*** mrmartin has quit IRC | 08:43 | |
*** yamamoto_ has joined #openstack-infra | 08:46 | |
*** bhuvan has quit IRC | 08:47 | |
*** arxcruz has quit IRC | 08:47 | |
*** amuller has joined #openstack-infra | 08:49 | |
*** yamamoto_ has quit IRC | 08:50 | |
pkoniszewski | ttx: Jenkins doesn't react to recheck comments, probably everything is getting stuck in 'deleting' state | 08:52 |
*** yamamoto_ has joined #openstack-infra | 08:54 | |
*** HeOS has joined #openstack-infra | 08:57 | |
*** yamamoto_ has quit IRC | 08:59 | |
openstackgerrit | Sylvain Bauza proposed a change to openstack-infra/config: Enable IRC channel logging for #openstack-fr https://review.openstack.org/118651 | 09:03 |
*** luqas has quit IRC | 09:03 | |
*** m1dev has quit IRC | 09:04 | |
*** gokrokve_ has joined #openstack-infra | 09:04 | |
*** Ng has quit IRC | 09:04 | |
*** m1dev has joined #openstack-infra | 09:04 | |
*** Ng_ has joined #openstack-infra | 09:05 | |
openstackgerrit | Joakim Löfgren proposed a change to openstack-infra/jenkins-job-builder: Add PMD publisher https://review.openstack.org/118312 | 09:06 |
*** acruz has joined #openstack-infra | 09:06 | |
*** che-arne has joined #openstack-infra | 09:07 | |
*** BOKALDO has quit IRC | 09:07 | |
*** gokrokve has quit IRC | 09:07 | |
*** gokrokve_ has quit IRC | 09:08 | |
*** Longgeek_ has joined #openstack-infra | 09:13 | |
*** MaxV has quit IRC | 09:14 | |
*** MaxV has joined #openstack-infra | 09:14 | |
*** Longgeek has quit IRC | 09:15 | |
*** Longgeek_ has quit IRC | 09:17 | |
*** bhuvan has joined #openstack-infra | 09:17 | |
*** Longgeek has joined #openstack-infra | 09:18 | |
openstackgerrit | Dmitry Teselkin proposed a change to openstack-infra/config: Make puppetdb_server name configurable https://review.openstack.org/119011 | 09:19 |
*** luqas has joined #openstack-infra | 09:22 | |
*** bhuvan has quit IRC | 09:22 | |
*** andreaf has joined #openstack-infra | 09:22 | |
*** mmaglana has joined #openstack-infra | 09:23 | |
*** cipcosma has joined #openstack-infra | 09:23 | |
*** ociuhandu has quit IRC | 09:26 | |
*** che-arne has quit IRC | 09:27 | |
*** mmaglana has quit IRC | 09:27 | |
*** oanufriev has joined #openstack-infra | 09:28 | |
*** habib has joined #openstack-infra | 09:29 | |
*** yamamoto_ has joined #openstack-infra | 09:30 | |
*** che-arne has joined #openstack-infra | 09:31 | |
*** unicell has joined #openstack-infra | 09:31 | |
ttx | Hmm, looks like we are going up again | 09:32 |
openstackgerrit | yolanda.robla proposed a change to openstack-infra/config: Typo: replace puppet.rsh by ref to puppet-master https://review.openstack.org/118113 | 09:34 |
*** yamamoto_ has quit IRC | 09:34 | |
SergeyLukjanov | ttx, I'm around now | 09:35 |
*** habib has quit IRC | 09:35 | |
*** habib has joined #openstack-infra | 09:35 | |
ttx | SergeyLukjanov: not sure there is anything to fix. the check pipeline just seems to be a bit clogged, with some changes waiting for test resources for 13 hours | 09:36 |
ttx | but then the changes I watched just got some nodes | 09:36 |
SergeyLukjanov | ttx, I'll check nodepool | 09:36 |
ttx | Trick is, for retries, the 13 hour adds up to the 24h in the gate | 09:37 |
ttx | SergeyLukjanov: practical exercise: Sahara change 112159 (which I'm waiting on to tag) is blocked in check queue right now, not even in the gate pipe | 09:38 |
SergeyLukjanov | ttx, yeah, that's sad | 09:38 |
SergeyLukjanov | ttx, yeah, I see, I'm looking for it too | 09:38 |
ttx | but then I was more confused as to why the "top" of the check pipe wasn't getting test resources either | 09:38 |
ttx | like 118332 | 09:39 |
SergeyLukjanov | ttx, there is a lot of "deleting" nodes | 09:39 |
ttx | yes, was wondering if it was cause or symptom | 09:39 |
ttx | and if anything was wrong, or just loaded | 09:40 |
SergeyLukjanov | ttx, 406/814 nodes are in delete state | 09:40 |
ttx | maybe we have a problem building them up? | 09:41 |
ttx | we've seen deletes go up for the last 8 hours | 09:41 |
SergeyLukjanov | ttx, yeah, probably it's a cause | 09:42 |
ttx | although the last hour it's stabilized | 09:42 |
*** Longgeek_ has joined #openstack-infra | 09:43 | |
*** dizquierdo has quit IRC | 09:43 | |
SergeyLukjanov | ttx, there're a lot of nodes in delete state for 5-10 hours | 09:44 |
SergeyLukjanov | ~242 nodes with delete state for more than 1 hour | 09:44 |
SergeyLukjanov | (talking about devstack nodes) | 09:45 |
*** gokrokve has joined #openstack-infra | 09:46 | |
*** rlandy has joined #openstack-infra | 09:46 | |
ttx | SergeyLukjanov: so they take more than one hour to delete ? | 09:46 |
*** Longgeek has quit IRC | 09:46 | |
SergeyLukjanov | ttx, yup | 09:47 |
ttx | hmm, that's probably not a good sign | 09:47 |
SergeyLukjanov | oh, I see node with 173.60 hours in delete state | 09:47 |
*** dtantsur is now known as dtantsur|brb | 09:48 | |
SergeyLukjanov | anteaya, thx for tip | 09:49 |
*** gokrokve has quit IRC | 09:52 | |
*** zhiwei has quit IRC | 09:52 | |
*** baohua has quit IRC | 09:53 | |
*** yamamoto_ has joined #openstack-infra | 09:54 | |
*** mrmartin has joined #openstack-infra | 09:54 | |
SergeyLukjanov | ttx, all nodes in delete state for more than an hour are rax nodes! | 09:55 |
ttx | ah-ha. | 09:55 |
SergeyLukjanov | ttx, so, it's definitely the cause | 09:55 |
SergeyLukjanov | ttx, I've tried to manually remove one of 'em and caught a timeout | 09:55 |
ttx | so we have all rax nodes getting gradually stuck in delete | 09:56 |
ttx | I just like when it happens at the worst moment | 09:56 |
ttx | who needs chaos monkey when reality is worse | 09:57 |
SergeyLukjanov | yup, it always happens in j3 time :) | 09:57 |
SergeyLukjanov | ttx, do we have a lot of changes we are waiting on for j3? | 09:57 |
ttx | About 30 I would say | 09:58 |
SergeyLukjanov | ttx, heh, no way to promote | 09:58 |
*** yamamoto_ has quit IRC | 09:59 | |
*** daya_k has quit IRC | 10:03 | |
*** dmsimard_away is now known as dmsimard | 10:03 | |
*** Adri2000 has quit IRC | 10:05 | |
*** Adri2000 has joined #openstack-infra | 10:05 | |
*** mmitchell has quit IRC | 10:06 | |
*** mmitchell has joined #openstack-infra | 10:06 | |
openstackgerrit | Aidan McGinley proposed a change to openstack-infra/jenkins-job-builder: Adds support for the Config File Provider Plugin https://review.openstack.org/119021 | 10:06 |
*** Adri2000 is now known as Guest71503 | 10:06 | |
*** dmsimard is now known as dmsimard_away | 10:06 | |
*** Guest42286 has quit IRC | 10:06 | |
* ttx prays the gate gods on that last check, got a pile of 4 success if it passes | 10:11 | |
*** jgallard has quit IRC | 10:12 | |
*** MaxV has quit IRC | 10:13 | |
*** nosnos has quit IRC | 10:14 | |
*** nosnos has joined #openstack-infra | 10:15 | |
*** yamahata has quit IRC | 10:16 | |
*** nosnos_ has joined #openstack-infra | 10:18 | |
*** nosnos has quit IRC | 10:19 | |
openstackgerrit | A change was merged to openstack-infra/config: Remove docutils pin https://review.openstack.org/117172 | 10:21 |
*** pblaho is now known as pblaho|afk | 10:22 | |
*** mmaglana has joined #openstack-infra | 10:24 | |
*** rushiagr is now known as rushiagr_away | 10:28 | |
*** mmaglana has quit IRC | 10:29 | |
*** cdent has joined #openstack-infra | 10:30 | |
*** k4n0_ has quit IRC | 10:33 | |
*** k4n0 has joined #openstack-infra | 10:35 | |
*** _nadya_ has quit IRC | 10:35 | |
*** dmsimard_away is now known as dmsimard | 10:36 | |
lyxus | kevinbenton, Hey Kevin, do you use Jenkins to post back your result ? | 10:42 |
kevinbenton | lyxus: no, i have a separate hacky bash script that aggregates the results from both jobs and checks them for known setup failures and then votes | 10:43 |
lyxus | kevinbenton, I am doing the same thing but I can't figure out the correct format to be able to be on the right side of the reviewers | 10:44 |
lyxus | kevinbenton, I just show up at the bottom | 10:44 |
kevinbenton | lyxus: link to a review with your results? | 10:45 |
*** naotok is now known as zz_naotok | 10:45 | |
lyxus | kevinbenton, Example https://review.openstack.org/#/c/105389/ On the right side of the reviewer you can see a list of CI | 10:46 |
*** gokrokve has joined #openstack-infra | 10:46 | |
*** erlon has joined #openstack-infra | 10:46 | |
kevinbenton | lyxus: oh, i think you have the format right. just missing CI in the account name | 10:47 |
lyxus | kevinbenton, The name is "Nuage CI" | 10:47 |
kevinbenton | lyxus: oh sorry, looking at wrong one | 10:48 |
kevinbenton | lyxus: remove the html tags | 10:48 |
kevinbenton | lyxus: those are added by gerrit | 10:49 |
lyxus | kevinbenton, is there a format that you have to follow ? I don't see the CI on the right side | 10:50 |
*** penguinRaider has joined #openstack-infra | 10:50 | |
*** gokrokve has quit IRC | 10:51 | |
*** Guest71503 is now known as Adri2000_ | 10:51 | |
*** Adri2000_ has joined #openstack-infra | 10:51 | |
kevinbenton | lyxus: http://paste.openstack.org/show/105700/ | 10:53 |
lyxus | kevinbenton, let me try that | 10:53 |
kevinbenton | lyxus: i have to go now, but that should get you what you need | 10:53 |
lyxus | kevinbenton, thank you so much ! | 10:53 |
*** dizquierdo has joined #openstack-infra | 10:53 | |
*** yamamoto_ has joined #openstack-infra | 10:54 | |
*** jp_at_hp has quit IRC | 10:55 | |
*** yamamoto_ has quit IRC | 10:59 | |
*** Ng_ is now known as Ng | 11:01 | |
*** e0ne has quit IRC | 11:01 | |
*** jyuso has quit IRC | 11:02 | |
*** ociuhandu has joined #openstack-infra | 11:03 | |
*** dims has joined #openstack-infra | 11:04 | |
*** yaguang has quit IRC | 11:05 | |
openstackgerrit | A change was merged to openstack-infra/config: Replace all check/gate: noop with a template https://review.openstack.org/115500 | 11:08 |
SergeyLukjanov | ttx, there are only 5 nodes in delete state on rax now, yay! | 11:11 |
SergeyLukjanov | ttx, and for all nodes it's only 73/845 in delete state | 11:12 |
SergeyLukjanov | ttx, so, it should start working much better | 11:12 |
*** mpaolino has quit IRC | 11:12 | |
*** dtantsur|brb is now known as dtantsur | 11:14 | |
*** jp_at_hp has joined #openstack-infra | 11:15 | |
*** rushiagr_away is now known as rushiagr | 11:18 | |
*** jab has joined #openstack-infra | 11:18 | |
*** jab has joined #openstack-infra | 11:18 | |
*** dims has quit IRC | 11:19 | |
*** MaxV has joined #openstack-infra | 11:19 | |
*** dims has joined #openstack-infra | 11:19 | |
pkoniszewski | SergeyLukjanov: thanks a lot! | 11:19 |
*** pblaho|afk is now known as pblaho | 11:20 | |
*** acruz has quit IRC | 11:20 | |
*** mmaglana has joined #openstack-infra | 11:21 | |
*** dims_ has joined #openstack-infra | 11:21 | |
*** zhiwei has joined #openstack-infra | 11:21 | |
*** jab is now known as bradjones | 11:21 | |
*** ppai has quit IRC | 11:22 | |
*** dims has quit IRC | 11:24 | |
*** acruz has joined #openstack-infra | 11:25 | |
*** dims_ has quit IRC | 11:26 | |
*** habib has quit IRC | 11:26 | |
*** mmaglana has quit IRC | 11:26 | |
*** habib has joined #openstack-infra | 11:26 | |
*** habib has quit IRC | 11:27 | |
*** dims has joined #openstack-infra | 11:27 | |
*** habib has joined #openstack-infra | 11:27 | |
*** habib has quit IRC | 11:29 | |
ianw | sdague / dtroyer : trying to hunt down what changed to make bashate start looking at this file, when it wasn't before https://review.openstack.org/#/c/119037/ | 11:30 |
*** dims_ has joined #openstack-infra | 11:30 | |
*** pcm_ has joined #openstack-infra | 11:30 | |
*** eglynn__ is now known as eglynn-officeafk | 11:31 | |
*** dims has quit IRC | 11:31 | |
sdague | yeh, I'm honestly not entirely sure where our discovery is at the moment | 11:33 |
*** jamespage_ has joined #openstack-infra | 11:34 | |
ttx | SergeyLukjanov: it does work a lot better now, thanks | 11:36 |
* ttx tags in confidence | 11:36 | |
ianw | sdague: the odd thing is, nothing changed? | 11:36 |
openstackgerrit | Davanum Srinivas (dims) proposed a change to openstack-infra/config: Add missing oslo projects to gerritbot config https://review.openstack.org/117309 | 11:37 |
*** Guest42286 has joined #openstack-infra | 11:37 | |
lyxus | kevinbenton, Must be missing something, still not working ! | 11:38 |
*** ppai has joined #openstack-infra | 11:38 | |
kevinbenton | lyxus: you have newline chars after tests succeeded, right? | 11:39 |
*** mwagner_lap has quit IRC | 11:40 | |
kevinbenton | lyxus: it looks like they aren’t there | 11:40 |
lyxus | kevinbenton, do you force them by writing \n ? | 11:40 |
kevinbenton | lyxus: the line with the result needs to start with - | 11:40 |
kevinbenton | lyxus: no, just make sure it’s double quotes in a bash script | 11:40 |
*** baoli has joined #openstack-infra | 11:41 | |
lyxus | kevinbenton, http://paste.openstack.org/show/105732/ | 11:42 |
*** dims_ has quit IRC | 11:42 | |
kevinbenton | lyxus: echo "$MESSAGE" | 11:42 |
*** dims has joined #openstack-infra | 11:43 | |
lyxus | kevinbenton, let me see | 11:43 |
*** luqas has quit IRC | 11:43 | |
sdague | ianw: there was a bug at some point in file looping that meant we sometimes skipped files | 11:44 |
kevinbenton | lyxus: here is our vote line | 11:44 |
kevinbenton | http://paste.openstack.org/show/105733/ | 11:44 |
kevinbenton | lyxus: note the quotes around $MESSAGE | 11:44 |
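The quoting advice above can be illustrated with a minimal sketch. The message text, job name, and log URL below are hypothetical placeholders, not the format from the linked pastes; the only points taken from the discussion are that the result line starts with `-` and that the newlines must survive.

```shell
#!/usr/bin/env bash
# Hypothetical review message: a summary line, a blank line, then a
# result line that starts with "-" (job name and URL are made up).
MESSAGE="Tests succeeded.

- example-ci-job http://logs.example.org/42/ : SUCCESS"

# Unquoted: the shell word-splits on whitespace (newlines included) and
# echo joins the words with single spaces, so the message collapses.
unquoted="$(echo $MESSAGE)"

# Quoted: the variable is passed as a single argument, newlines intact.
quoted="$(echo "$MESSAGE")"

# Count the lines in each result.
printf '%s\n' "$unquoted" | wc -l
printf '%s\n' "$quoted" | wc -l
```

The unquoted form ends up as a single line, while the quoted form keeps all three, which is why the result line only starts with `-` in the quoted case.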
kevinbenton | lyxus: i really have to go now, way past my bed time :-) | 11:45 |
ianw | sdague : passing http://logs.openstack.org/26/118226/1/check/gate-devstack-bashate/0324511/console.html -> run-bashate.sh ... ./run_tests.sh | 11:45 |
*** dims_ has joined #openstack-infra | 11:45 | |
*** MaxV has quit IRC | 11:45 | |
lyxus | kevinbenton, me too :( 4.45am here :( thanks ! | 11:45 |
ianw | sdague: failing http://logs.openstack.org/09/118609/3/check/gate-devstack-bashate/dd30a4d/console.html -> using tox | 11:45 |
*** rkukura_ has joined #openstack-infra | 11:46 | |
*** gokrokve has joined #openstack-infra | 11:46 | |
*** MaxV has joined #openstack-infra | 11:47 | |
*** dims has quit IRC | 11:47 | |
ttx | SergeyLukjanov: did you do anything special ? | 11:47 |
ianw | sdague : https://git.openstack.org/cgit/openstack-infra/config/commit/?id=8ff2dbe1c353b1b79331e69a35a27103e3ef847a | 11:48 |
ttx | SergeyLukjanov: If yes, it could be useful to pass the baton to fungi when he is around | 11:48 |
SergeyLukjanov | ttx, nope, I've just tried to remove some nodes manually | 11:48 |
ttx | SergeyLukjanov: I fear that the problem persists and the amount of stuck-in-deleted will slowly build up again | 11:49 |
SergeyLukjanov | ttx, yeah, I hope it'll work till the end of week at least | 11:49 |
*** loki184 has quit IRC | 11:50 | |
*** mrda1 has joined #openstack-infra | 11:50 | |
*** _d34dh0r53_ has joined #openstack-infra | 11:50 | |
*** YorikSar_ has joined #openstack-infra | 11:50 | |
*** jroll|dupe has joined #openstack-infra | 11:50 | |
*** dtroyer_zz has joined #openstack-infra | 11:50 | |
*** pabelanger_ has joined #openstack-infra | 11:50 | |
*** alaski_ has joined #openstack-infra | 11:50 | |
*** rkukura has quit IRC | 11:50 | |
*** YorikSar has quit IRC | 11:50 | |
*** greghaynes has quit IRC | 11:50 | |
*** jroll has quit IRC | 11:50 | |
*** d34dh0r53 has quit IRC | 11:51 | |
*** mrda has quit IRC | 11:51 | |
*** alaski has quit IRC | 11:51 | |
*** dtroyer has quit IRC | 11:51 | |
*** dtantsur has quit IRC | 11:51 | |
*** jamielennox has quit IRC | 11:51 | |
*** juice has quit IRC | 11:51 | |
*** pabelanger has quit IRC | 11:51 | |
*** dtantsur has joined #openstack-infra | 11:51 | |
*** greghayn1 has joined #openstack-infra | 11:51 | |
*** dtantsur has quit IRC | 11:51 | |
*** dtantsur has joined #openstack-infra | 11:51 | |
*** jroll|dupe is now known as jroll | 11:51 | |
*** jamielennox has joined #openstack-infra | 11:51 | |
*** MaxV has quit IRC | 11:51 | |
*** juice has joined #openstack-infra | 11:51 | |
*** rkukura_ is now known as rkukura | 11:51 | |
*** gokrokve has quit IRC | 11:51 | |
*** MaxV has joined #openstack-infra | 11:51 | |
*** jamespage_ has quit IRC | 11:53 | |
ZZelle | SergeyLukjanov, hi | 11:54 |
*** yamamoto has joined #openstack-infra | 11:54 | |
*** YorikSar_ is now known as YorikSar | 11:54 | |
openstackgerrit | Davanum Srinivas (dims) proposed a change to openstack-infra/config: Add docs jobs for some oslo projects https://review.openstack.org/118940 | 11:55 |
*** acruz has quit IRC | 11:57 | |
*** e0ne has joined #openstack-infra | 11:57 | |
ZZelle | SergeyLukjanov, if you have some time, could you have a look at the git-review reviews? In particular, https://review.openstack.org/109851 and https://review.openstack.org/114038 propose interesting features | 11:59 |
*** yamamoto has quit IRC | 11:59 | |
*** mwagner_lap has joined #openstack-infra | 12:01 | |
*** e0ne has quit IRC | 12:02 | |
*** dims_ has quit IRC | 12:05 | |
*** dims has joined #openstack-infra | 12:06 | |
*** acruz has joined #openstack-infra | 12:08 | |
*** mbacchi has joined #openstack-infra | 12:08 | |
*** e0ne has joined #openstack-infra | 12:08 | |
*** isviridov is now known as isviridov_away | 12:08 | |
*** bookwar has quit IRC | 12:09 | |
*** mfink has quit IRC | 12:09 | |
*** bookwar has joined #openstack-infra | 12:10 | |
*** baoli has quit IRC | 12:11 | |
*** flaper87 is now known as echo | 12:11 | |
*** echo is now known as flaper87 | 12:11 | |
*** baoli has joined #openstack-infra | 12:12 | |
*** acruz has quit IRC | 12:13 | |
*** monester has quit IRC | 12:13 | |
*** arxcruz has joined #openstack-infra | 12:13 | |
*** baoli has quit IRC | 12:19 | |
*** mjturek has quit IRC | 12:20 | |
*** mpaolino has joined #openstack-infra | 12:20 | |
*** mmaglana has joined #openstack-infra | 12:21 | |
*** yamamoto has joined #openstack-infra | 12:22 | |
*** mmaglana has quit IRC | 12:26 | |
*** aysyd has joined #openstack-infra | 12:27 | |
*** weshay has joined #openstack-infra | 12:27 | |
*** adalbas has joined #openstack-infra | 12:28 | |
*** bo_sh has left #openstack-infra | 12:32 | |
*** Adri2000_ is now known as Adri2000 | 12:33 | |
openstackgerrit | Cedric Brandily proposed a change to openstack-infra/git-review: Align git-review and python -m git_review.cmd behaviors https://review.openstack.org/119050 | 12:36 |
*** dizquierdo has quit IRC | 12:37 | |
*** mfink has joined #openstack-infra | 12:38 | |
*** kgiusti has joined #openstack-infra | 12:38 | |
*** mfink_ has joined #openstack-infra | 12:38 | |
*** dosaboy_ is now known as dosaboy | 12:38 | |
*** doug-fish has joined #openstack-infra | 12:39 | |
*** flaper87 is now known as flaper87|afk | 12:40 | |
*** bookwar_ has joined #openstack-infra | 12:40 | |
fungi | ttx: SergeyLukjanov: what's going on? | 12:41 |
ttx | fungi: if you look at the nodepool graph, there was growing number of nodes in delete state on rax | 12:41 |
ttx | fungi: Sergey manually deleted them | 12:41 |
fungi | looks like we had some accumulation of nodes in a delete state while i slept, yeah | 12:41 |
ttx | that test node attrition resulted in a huge pileup in check pipe | 12:42 |
fungi | i'll look and see if we're getting any api errors from there | 12:42 |
*** bswartz has quit IRC | 12:42 | |
ttx | in turn resulting in recheck delays when targeted changes got reverified | 12:42 |
*** flaper87|afk is now known as flaper87 | 12:42 | |
ttx | I fear the root cause is not fixed and therefore the manual cleanup might be necessary again | 12:43 |
*** mfink has quit IRC | 12:43 | |
*** luqas has joined #openstack-infra | 12:44 | |
*** TravT has quit IRC | 12:46 | |
SergeyLukjanov | ZZelle, added to my review queue | 12:46 |
SergeyLukjanov | fungi, morning | 12:46 |
*** gokrokve has joined #openstack-infra | 12:46 | |
SergeyLukjanov | ttx, as ttx said I fear that we'll face the same issue again | 12:46 |
SergeyLukjanov | fungi, ^^ | 12:46 |
SergeyLukjanov | fungi, there were about two hundred nodes on rax DCs in delete state for 5-10 hours | 12:47 |
*** baoli has joined #openstack-infra | 12:47 | |
*** flaviof_zzz is now known as flaviof | 12:47 | |
fungi | yeah, i'm looking to see whether any obvious cause was being logged while it was all gummed up like that | 12:47 |
*** dprince has joined #openstack-infra | 12:48 | |
fungi | we were getting quite a few http 500 responses from hpcloud, but that's no help | 12:50 |
openstackgerrit | Dan Prince proposed a change to openstack-infra/config: Import os-net-config project under TripleO https://review.openstack.org/112331 | 12:51 |
*** gokrokve has quit IRC | 12:51 | |
fungi | oh... it's been trying to create devstack-f20-virt-preview nodes in rax regions and unable to run prepare_node_devstack_virt_preview.sh successfully. i wonder if there's some blocking going on which is interfering with the periodic deletion | 12:52 |
*** dprince has quit IRC | 12:53 | |
fungi | it would almost have to be a problem with the periodic deleter since the pattern on the graph indicates a slow leak rather than the rapid trail-off we'd see if we weren't deleting any | 12:53 |
ZZelle | SergeyLukjanov, thanks | 12:53 |
*** lcheng_ has joined #openstack-infra | 12:54 | |
*** lcheng_ has quit IRC | 12:54 | |
*** dprince has joined #openstack-infra | 12:54 | |
*** baohua has joined #openstack-infra | 12:54 | |
*** pblaho has quit IRC | 12:55 | |
fungi | i'm manually trying one of those image updates now to see if i can tell why it's failing to complete | 12:55 |
*** miqui has joined #openstack-infra | 12:56 | |
*** baohua has quit IRC | 12:57 | |
*** yaguang has joined #openstack-infra | 12:57 | |
*** yamahata has joined #openstack-infra | 12:58 | |
*** dkranz has quit IRC | 12:58 | |
openstackgerrit | A change was merged to openstack-infra/devstack-gate: Add oslo.log to devstack-vm-gate-wrap.sh https://review.openstack.org/116995 | 12:58 |
*** baohua has joined #openstack-infra | 12:59 | |
*** pblaho has joined #openstack-infra | 12:59 | |
*** tsg has joined #openstack-infra | 12:59 | |
openstackgerrit | Dmitry Teselkin proposed a change to openstack-infra/config: Pin pupppetdb-terminus package https://review.openstack.org/118660 | 13:00 |
*** yaguang has quit IRC | 13:01 | |
*** radez_g0n3 is now known as radez | 13:02 | |
*** yaguang has joined #openstack-infra | 13:02 | |
*** mjturek has joined #openstack-infra | 13:02 | |
openstackgerrit | Davanum Srinivas (dims) proposed a change to openstack-infra/config: Add docs jobs for some oslo projects https://review.openstack.org/118940 | 13:06 |
*** mjturek has quit IRC | 13:07 | |
*** marun_ has quit IRC | 13:07 | |
*** mjturek has joined #openstack-infra | 13:07 | |
*** jcoufal has quit IRC | 13:08 | |
*** tsg has quit IRC | 13:10 | |
*** ppai has quit IRC | 13:10 | |
*** eglynn-officeafk is now known as eglynn-office | 13:11 | |
*** jcoufal has joined #openstack-infra | 13:11 | |
*** k4n0 has quit IRC | 13:12 | |
*** baoli has quit IRC | 13:13 | |
*** dkranz has joined #openstack-infra | 13:14 | |
fungi | INFO nodepool.image.build.rax-iad.devstack-f20-virt-preview: /opt/nodepool-scripts/prepare_devstack_virt_preview.sh: line 25: ./prepare_devstack.sh: No such file or directory | 13:15 |
fungi | bummer | 13:15 |
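The "No such file or directory" error above is what a `./script.sh` call produces when the caller's working directory is not the directory that holds the script. A minimal sketch of that failure mode and one common way around it follows; the file names are hypothetical, and this is not necessarily how the nodepool script itself was corrected.

```shell
#!/usr/bin/env bash
# Build a throwaway directory with a helper and two callers (all hypothetical).
demo_dir="$(mktemp -d)"

cat > "$demo_dir/helper.sh" <<'EOF'
echo "helper ran"
EOF

# Fragile caller: "./helper.sh" is resolved against the current working directory.
cat > "$demo_dir/fragile.sh" <<'EOF'
sh ./helper.sh
EOF

# Robust caller: resolve the helper against the directory holding this script.
cat > "$demo_dir/robust.sh" <<'EOF'
script_dir="$(cd "$(dirname "$0")" && pwd)"
sh "$script_dir/helper.sh"
EOF

# Run both callers from an unrelated working directory.
fragile_out="$(cd / && sh "$demo_dir/fragile.sh" 2>&1)" || true
robust_out="$(cd / && sh "$demo_dir/robust.sh")"

echo "fragile: $fragile_out"
echo "robust:  $robust_out"

rm -rf "$demo_dir"
```

The fragile caller only works when invoked from inside its own directory; the robust one works from anywhere, at the cost of one extra line.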
*** mestery has quit IRC | 13:17 | |
*** bswartz has joined #openstack-infra | 13:18 | |
*** mriedem has joined #openstack-infra | 13:18 | |
*** yaguang has quit IRC | 13:19 | |
*** dizquierdo has joined #openstack-infra | 13:19 | |
*** yaguang has joined #openstack-infra | 13:19 | |
annegentle | thanks ttx for the help with the tc update blog post! | 13:19 |
*** baoli has joined #openstack-infra | 13:19 | |
openstackgerrit | A change was merged to openstack-infra/zuul: Clean up bad layout files in zuul tests https://review.openstack.org/116118 | 13:19 |
*** isviridov_away is now known as isviridov | 13:20 | |
ttx | annegentle: np, publish when you feel like it ! (today ideally) | 13:20 |
*** mestery has joined #openstack-infra | 13:21 | |
*** mmaglana has joined #openstack-infra | 13:21 | |
*** mestery has quit IRC | 13:21 | |
*** mestery has joined #openstack-infra | 13:22 | |
*** gargola has joined #openstack-infra | 13:22 | |
*** vhoward has joined #openstack-infra | 13:23 | |
openstackgerrit | Jeremy Stanley proposed a change to openstack-infra/config: Correct path in prepare_devstack_virt_preview.sh https://review.openstack.org/119059 | 13:23 |
fungi | ianw: ^ | 13:23 |
fungi | as my grandpa always said, "one bug leads to another" | 13:24 |
annegentle | ttx: yeah definitely want it to go today | 13:24 |
fungi | i'll nab a thread dump from nodepool now and see if i can tell whether the deleter is blocking on the image builder | 13:25 |
*** marun_ has joined #openstack-infra | 13:25 | |
*** mmaglana has quit IRC | 13:25 | |
*** julim has joined #openstack-infra | 13:27 | |
*** pcrews has joined #openstack-infra | 13:27 | |
*** baoli has quit IRC | 13:28 | |
*** dkranz has quit IRC | 13:28 | |
*** baoli has joined #openstack-infra | 13:28 | |
*** dkranz has joined #openstack-infra | 13:29 | |
*** dizquierdo has quit IRC | 13:32 | |
*** skolekonov has quit IRC | 13:34 | |
*** dkliban_afk is now known as dkliban | 13:35 | |
*** bknudson has joined #openstack-infra | 13:35 | |
*** amotoki_ has joined #openstack-infra | 13:35 | |
*** dkranz has quit IRC | 13:36 | |
*** zhiwei has quit IRC | 13:39 | |
mriedem | looks like the py33 jobs have some issues: https://bugs.launchpad.net/openstack-ci/+bug/1365512 | 13:39 |
uvirtbot | Launchpad bug 1365512 in oslotest "python33 jobs failing with "No distributions matching the version for oslotest>=1.1.0.0a1"" [Undecided,New] | 13:39 |
*** yaguang has quit IRC | 13:40 | |
*** rpodolyaka1 is now known as rpodolyaka_pto | 13:42 | |
*** craigbr has joined #openstack-infra | 13:42 | |
*** yaguang has joined #openstack-infra | 13:44 | |
*** tsg has joined #openstack-infra | 13:46 | |
*** gokrokve has joined #openstack-infra | 13:46 | |
*** homeless has joined #openstack-infra | 13:48 | |
*** alexpilotti has joined #openstack-infra | 13:48 | |
*** nosnos_ has quit IRC | 13:50 | |
*** gokrokve has quit IRC | 13:51 | |
*** nosnos has joined #openstack-infra | 13:51 | |
openstackgerrit | Matt Riedemann proposed a change to openstack-infra/elastic-recheck: Add query for oslotest/infra py33 bug 1365512 https://review.openstack.org/119063 | 13:51 |
uvirtbot | Launchpad bug 1365512 in oslotest "python33 jobs failing with "No distributions matching the version for oslotest>=1.1.0.0a1"" [Undecided,New] https://launchpad.net/bugs/1365512 | 13:51 |
*** portante has quit IRC | 13:52 | |
*** r-daneel has joined #openstack-infra | 13:53 | |
*** unicell has quit IRC | 13:53 | |
*** weshay has quit IRC | 13:54 | |
*** sballe has joined #openstack-infra | 13:55 | |
*** nosnos has quit IRC | 13:55 | |
*** cnesa10 has quit IRC | 13:55 | |
*** yaguang has quit IRC | 13:56 | |
*** yaguang has joined #openstack-infra | 13:56 | |
*** paulrad has joined #openstack-infra | 13:58 | |
openstackgerrit | Dmitry Teselkin proposed a change to openstack-infra/config: Dependencies to install python-cinderclient https://review.openstack.org/119066 | 13:59 |
fungi | mriedem: yeah, this is known. apparently oslotest does not build universal wheels because it has some py3k-specific test requirement differences (specifically using mox3 vs mox on different interpreter versions), and since we exclusively publish wheels for prereleases we only get the platform-specific wheel for the interpreter on which it was built | 13:59 |
*** dtantsur is now known as dtantsur|brb | 14:00 | |
fungi | mriedem: consensus seems to be for us to find ways to make libraries work in python 2.x and 3.x without using split requirements lists | 14:00 |
*** radez is now known as radez_g0n3 | 14:00 | |
*** dougwig_ is now known as dougwig | 14:00 | |
fungi | mriedem: in the oslotest case, there is a proposed change to just use mox3 all the time and get rid of the py3k-specific requirements list | 14:00 |
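The wheel distinction described above shows up in the file name itself: wheel names follow `{name}-{version}[-{build}]-{python tag}-{abi tag}-{platform tag}.whl`, and a universal wheel carries a python tag covering both major versions (typically `py2.py3`). A rough sketch for checking that, using made-up package file names rather than the actual oslotest artifacts:

```shell
#!/usr/bin/env bash
# Print the python tag of a wheel file name.
# The last three dash-separated fields are python tag, abi tag, platform tag.
wheel_python_tag() {
    local base="${1%.whl}"
    local no_platform="${base%-*}"    # drop the platform tag
    local no_abi="${no_platform%-*}"  # drop the abi tag
    echo "${no_abi##*-}"              # what remains last is the python tag
}

# Succeed only if the python tag names both Python 2 and Python 3.
is_universal_wheel() {
    case "$(wheel_python_tag "$1")" in
        *py2*py3*|*py3*py2*) return 0 ;;
        *) return 1 ;;
    esac
}

# Hypothetical file names, for illustration only.
for wheel in example_pkg-1.0a1-py2-none-any.whl example_pkg-1.0a1-py2.py3-none-any.whl; do
    if is_universal_wheel "$wheel"; then
        echo "$wheel: universal"
    else
        echo "$wheel: interpreter-specific ($(wheel_python_tag "$wheel"))"
    fi
done
```

A wheel tagged only `py2` is invisible to a Python 3 installer, which matches the "No distributions matching the version" symptom when no sdist is published alongside it.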
sdague | this is also fixable if we stop pre-releasing | 14:00 |
fungi | release early and often | 14:01 |
sdague | I feel that the pre-release hack has only caused grief | 14:01 |
mriedem | alpha packages have caused me some pain | 14:01 |
mriedem | but i'm only one person | 14:01 |
mriedem | +sdague apparently :) | 14:01 |
*** yamahata has quit IRC | 14:02 | |
*** yamahata has joined #openstack-infra | 14:02 | |
*** unicell has joined #openstack-infra | 14:03 | |
fungi | the current prerelease mechanisms spun mostly out of oslo wanting to find a way to sync releases with the integrated server release yet still be able to have unofficial versions which could be used in development branches of servers while preparing for the next integrated release | 14:03 |
*** nelsnelson has joined #openstack-infra | 14:04 | |
fungi | i too think it's worth revisiting whether this truly is a necessary constraint, or just something we thought would help but now have evidence to the contrary | 14:04 |
*** rkukura_ has joined #openstack-infra | 14:05 | |
fungi | it was also a plan which evolved at a time before we were running integration tests with current git checkouts, and were still stuck on consuming the release artifacts of our own libraries | 14:06 |
dims | boatload of reviews fungi :) | 14:06 |
fungi | dims: meh, need to make sure i can vote in *all* the elections, right? | 14:06 |
*** yamamoto_ has joined #openstack-infra | 14:07 | |
dims | lol | 14:07 |
*** [HeOS] has joined #openstack-infra | 14:07 | |
*** dane_leblanc has joined #openstack-infra | 14:08 | |
fungi | also, i can nominate myself for ptl of all projects which don't have anyone running for election, as punishment to projects who are disinterested in governance ;) | 14:08 |
fungi | (having me as a ptl should be sufficient threat for every project to cough up at least one ptl candidate) | 14:08 |
*** annegent_ has joined #openstack-infra | 14:09 | |
*** markmcclain has joined #openstack-infra | 14:10 | |
mtreinish | fungi: heh, that was jogo's plan during the last ptl election | 14:11 |
fungi | gotta carry the torch | 14:11 |
*** _nadya_ has joined #openstack-infra | 14:12 | |
jogo | fungi: I was going to do that again this cycle, if there is such a project we should run against each other :) | 14:13 |
jogo | make it fun | 14:13 |
*** wenlock has joined #openstack-infra | 14:13 | |
openstackgerrit | Darragh Bailey proposed a change to openstack-infra/jenkins-job-builder: Move ordereddict to requirements https://review.openstack.org/119071 | 14:13 |
*** yaguang has quit IRC | 14:13 | |
*** pkoniszewski has quit IRC | 14:14 | |
*** cherriges has joined #openstack-infra | 14:14 | |
*** AaronGreen has joined #openstack-infra | 14:14 | |
*** mrodden_ has joined #openstack-infra | 14:14 | |
*** MaxV_ has joined #openstack-infra | 14:14 | |
*** mfer has joined #openstack-infra | 14:15 | |
fungi | jogo: http://i.imgur.com/zzbuCSC.jpg | 14:16 |
*** yamahata has quit IRC | 14:17 | |
*** pblaho has quit IRC | 14:17 | |
*** yamamoto has quit IRC | 14:17 | |
*** rkukura has quit IRC | 14:17 | |
*** Guest42286 has quit IRC | 14:17 | |
*** cipcosma has quit IRC | 14:17 | |
*** HeOS has quit IRC | 14:17 | |
*** ekarlso- has quit IRC | 14:17 | |
*** AaronGr has quit IRC | 14:17 | |
*** jpeeler has quit IRC | 14:17 | |
*** mrodden has quit IRC | 14:17 | |
*** odyi has quit IRC | 14:17 | |
*** harlowja_away has quit IRC | 14:17 | |
*** annegentle has quit IRC | 14:17 | |
*** bcrochet has quit IRC | 14:17 | |
*** grantbow has quit IRC | 14:17 | |
*** aviau has quit IRC | 14:17 | |
*** antonym has quit IRC | 14:17 | |
*** MaxV has quit IRC | 14:17 | |
*** rkukura_ is now known as rkukura | 14:17 | |
*** mrodden_ is now known as mrodden | 14:17 | |
*** cherriges is now known as odyi | 14:17 | |
*** odyi has quit IRC | 14:17 | |
*** odyi has joined #openstack-infra | 14:17 | |
fungi | jogo: that *would* be punishment | 14:17 |
*** cipcosma has joined #openstack-infra | 14:17 | |
jogo | fungi: hehe | 14:18 |
mtreinish | fungi: are you Jack Johnson or John Jackson? | 14:18 |
fungi | mtreinish: how can you tell? captions? | 14:18 |
mtreinish | heh, yeah they were captions I think | 14:18 |
mtreinish | google found me: http://futurama.wikia.com/wiki/Jack_Johnson and http://futurama.wikia.com/wiki/John_Jackson | 14:19 |
fungi | i mean, they were clones after all, so basically if you switched the captions then it wouldn't matter | 14:19 |
*** yamahata has joined #openstack-infra | 14:20 | |
fungi | (which was more or less what i meant by "how can you tell?") | 14:20 |
*** otherwiseguy has joined #openstack-infra | 14:21 | |
*** mmaglana has joined #openstack-infra | 14:21 | |
*** antonym has joined #openstack-infra | 14:22 | |
*** atiwari has joined #openstack-infra | 14:22 | |
mtreinish | heh, fair enough. I was just giving you a hard time :) | 14:22 |
*** jpeeler has joined #openstack-infra | 14:22 | |
*** ekarlso- has joined #openstack-infra | 14:22 | |
*** lttrl has joined #openstack-infra | 14:23 | |
*** portante has joined #openstack-infra | 14:23 | |
fungi | but basically, yes, we could be a lax project's john jackson and jack johnson, painfully encouraging them to create their own robo-nixon | 14:24 |
mtreinish | I think it's more amusing that someone took the time to make separate pages for both characters | 14:24 |
*** Guest42286 has joined #openstack-infra | 14:24 | |
*** zz_jgrimm is now known as jgrimm | 14:25 | |
*** mmaglana has quit IRC | 14:25 | |
mtreinish | fungi: ooh, robo-nixon for ptl. I think I might have found a theme for my nomination email... | 14:27 |
openstackgerrit | Aidan McGinley proposed a change to openstack-infra/jenkins-job-builder: Adds support for the Config File Provider Plugin https://review.openstack.org/119021 | 14:27 |
*** jistr has quit IRC | 14:27 | |
*** andreaf has quit IRC | 14:28 | |
*** ajo is now known as ajo|call | 14:28 | |
*** andreaf has joined #openstack-infra | 14:28 | |
*** andreaf is now known as andreaf_ | 14:28 | |
*** dangers_away is now known as dangers | 14:29 | |
*** jistr has joined #openstack-infra | 14:29 | |
fungi | mtreinish: certainly beats out my "who run barter town" campaign theme | 14:29 |
sdake | morning folks | 14:30 |
fungi | oh joy... something is tanking horizon unit tests in the gate now | 14:30 |
fungi | morning to you sdake | 14:30 |
openstackgerrit | Davanum Srinivas (dims) proposed a change to openstack-infra/config: Add docs jobs for some oslo projects https://review.openstack.org/118940 | 14:30 |
*** eharney has joined #openstack-infra | 14:31 | |
*** david-lyle has joined #openstack-infra | 14:31 | |
fungi | looks like horizon's test_launch_stack_with_hidden_parameters is throwing AssertionError: Couldn't find '<input class=" form-control" id="id___param_public_string" name="__param_public_string" type="text" />' in response | 14:32 |
*** gokrokve has joined #openstack-infra | 14:32 | |
fungi | david-lyle: jpich: ^ known issue? | 14:32 |
openstackgerrit | A change was merged to openstack-infra/config: Move os-{apply,collect}-config to python3-jobs https://review.openstack.org/118073 | 14:33 |
*** hdd has joined #openstack-infra | 14:33 | |
*** rkukura is now known as 6A4AAKDJG | 14:33 | |
*** wenlock has quit IRC | 14:33 | |
*** 17SAA4MFR has joined #openstack-infra | 14:33 | |
*** pblaho has joined #openstack-infra | 14:33 | |
*** yamamoto has joined #openstack-infra | 14:33 | |
*** rkukura has joined #openstack-infra | 14:33 | |
*** 17SAA4JSK has joined #openstack-infra | 14:33 | |
*** 17SAA4HWG has joined #openstack-infra | 14:33 | |
*** 17SAA4BEV has joined #openstack-infra | 14:33 | |
*** AaronGr has joined #openstack-infra | 14:33 | |
*** 17SAA3Z2C has joined #openstack-infra | 14:33 | |
*** annegentle has joined #openstack-infra | 14:33 | |
*** aviau has joined #openstack-infra | 14:33 | |
*** grantbow has joined #openstack-infra | 14:33 | |
*** 17SAA4BEV has quit IRC | 14:33 | |
*** bcrochet has joined #openstack-infra | 14:33 | |
*** grantbow has quit IRC | 14:33 | |
*** grantbow has joined #openstack-infra | 14:33 | |
*** annegentle_ has joined #openstack-infra | 14:33 | |
*** 17SAA4MFR has quit IRC | 14:33 | |
*** bcrochet has quit IRC | 14:33 | |
*** bcrochet has joined #openstack-infra | 14:33 | |
*** pblaho has quit IRC | 14:33 | |
*** yamamoto has quit IRC | 14:33 | |
*** rkukura has quit IRC | 14:33 | |
*** 17SAA4JSK has quit IRC | 14:33 | |
*** 17SAA4HWG has quit IRC | 14:33 | |
*** AaronGr has quit IRC | 14:34 | |
*** 17SAA3Z2C has quit IRC | 14:34 | |
*** annegentle has quit IRC | 14:34 | |
*** aviau has quit IRC | 14:34 | |
openstackgerrit | A change was merged to openstack-infra/config: Enable oslo.db testing on python 3.3 https://review.openstack.org/112006 | 14:34 |
*** annegentle_ is now known as Guest18621 | 14:34 | |
*** jaypipes has joined #openstack-infra | 14:34 | |
fungi | huh, maybe it's just https://review.openstack.org/67140 which is affected | 14:34 |
*** aviau has joined #openstack-infra | 14:35 | |
*** stevemar has joined #openstack-infra | 14:35 | |
*** Svedrin has quit IRC | 14:35 | |
*** signed8bit has joined #openstack-infra | 14:36 | |
*** koolhead17 has joined #openstack-infra | 14:36 | |
fungi | jogo: is that large-ops or partial-ncpu which climbed up to 75% fails? http://jogo.github.io/gate/ | 14:36 |
*** Svedrin has joined #openstack-infra | 14:37 | |
david-lyle | fungi: it's new to me | 14:37 |
*** annegent_ has quit IRC | 14:37 | |
david-lyle | I'll take a look at master locally | 14:38 |
jogo | fungi: AFAIK grenade partial-ncpu | 14:38 |
jogo | large-ops doesn't do very much and is this pretty stable | 14:38 |
fungi | david-lyle: it may just be that one change at the head of the gate, since i see unit tests passing for horizon in the check pipeline recently | 14:38 |
jogo | thus* | 14:38 |
*** annegentle has joined #openstack-infra | 14:38 | |
*** jistr has quit IRC | 14:39 | |
openstackgerrit | A change was merged to openstack-infra/reviewday: Change Marconi to Zaqar https://review.openstack.org/116938 | 14:39 |
jogo | fungi: bug 1270710 just spiked again | 14:39 |
uvirtbot | Launchpad bug 1270710 in openstack-ci "sporadic pip timeouts during download" [Medium,Incomplete] https://launchpad.net/bugs/1270710 | 14:39 |
*** jistr has joined #openstack-infra | 14:39 | |
*** jistr has quit IRC | 14:39 | |
*** jistr has joined #openstack-infra | 14:40 | |
*** radez_g0n3 is now known as radez | 14:40 | |
bswartz | do you guys have a process for killing check jobs that go out to lunch and never come back? | 14:40 |
fungi | bswartz: you can upload a new patchset or change the commit message slightly in gerrit (which does the same thing), but to which change are you seeing this happen? i'd like to see if it's a bug somewhere | 14:42 |
*** bhuvan has joined #openstack-infra | 14:43 | |
bswartz | fungi: done | 14:43 |
fungi | bswartz: i'm curious to know why check jobs "went to lunch and never came back" on a change, so i can hopefully try to figure out if it's something we can solve | 14:45 |
*** vhoward has left #openstack-infra | 14:45 | |
fungi | bswartz: to which change are you seeing this happen? | 14:45 |
bswartz | fungi: https://review.openstack.org/#/c/114737/ | 14:46 |
fungi | bswartz: why do you say the check jobs went to lunch on that change? | 14:47 |
*** koolhead17 has quit IRC | 14:47 | |
fungi | a patchset was uploaded only a couple hours ago, and we're presently running a many-hours backlog for check pipeline workers because of the j-3 milestone gate stampede | 14:47 |
bswartz | it was taking forever to run | 14:48 |
bswartz | other jobs that started after it finished before it | 14:48 |
fungi | bswartz: what were you seeing happen for it on http://status.openstack.org/zuul/ ? | 14:48 |
bswartz | it was in the check pipeline for 2.5 hours | 14:48 |
fungi | unfortunately now that you've pushed a new patchset i can't see if there was anything out of the ordinary happening to it | 14:48 |
fungi | bswartz: yep, we have changes which have been waiting in the check pipeline for some types of workers for over 19 hours | 14:49 |
bswartz | okay so you believe 2.5 hours is maybe reasonable -- it was going to eventually finish? | 14:49 |
fungi | bswartz: basically you just reset the clock back another 2.5 hours on getting results posted to that change | 14:49 |
openstackgerrit | A change was merged to openstack-infra/config: Add new stackforge project, python-rackclient https://review.openstack.org/118070 | 14:49 |
* bswartz foreheaddesk | 14:49 | |
*** datsun180b has joined #openstack-infra | 14:50 | |
*** koolhead17 has joined #openstack-infra | 14:50 | |
fungi | which is why i asked you to let me know which change it was before uploading a new patchset | 14:50 |
*** koolhead17 has quit IRC | 14:50 | |
bswartz | okay my bad | 14:50 |
bswartz | I'll try to be more patient | 14:51 |
*** ildikov has quit IRC | 14:51 | |
fungi | bswartz: if you're new to the project, be aware that test results get very, very backed up just before the feature freeze milestone during each development cycle | 14:52 |
*** tonytan4ever has joined #openstack-infra | 14:52 | |
*** shayneburgess has joined #openstack-infra | 14:52 | |
bswartz | I was observing other jobs going faster so I assumed that job had something wrong with it | 14:52 |
markvan_ | fungi: hello, I could use some help again, my patch https://review.openstack.org/#/c/117274/ seems to have completely broken our chef gates. two issues with it. the "sudo apt-get install" did not work, and it seems that ruby version is 1.8 instead of 1.9. | 14:53 |
bswartz | It sounds like you're saying that's normal so I'll just stop worrying | 14:53 |
fungi | bswartz: mostly due to available compute resources--we're currently consuming almost 900 8gb-ram virtual machines nonstop and most of them are focused on testing changes which are already in the gate pipeline | 14:53 |
*** Sincler has joined #openstack-infra | 14:53 | |
*** shayneburgess has quit IRC | 14:54 | |
markvan_ | fungi: log is here: http://logs.openstack.org/49/118749/2/check/gate-cookbook-openstack-common-chef-lint/eaeef4f/console.html | 14:54 |
*** dustins has joined #openstack-infra | 14:55 | |
fungi | markvan_: most jobs revoke sudo access for themselves before running their test payload, to ensure that normal developers don't get sudo accidentally run on them on their workstations when attempting to run the same tests. it's configurable though | 14:56 |
openstackgerrit | Andreas Jaeger proposed a change to openstack-infra/config: Templatize puppet projects in layout.yaml https://review.openstack.org/114001 | 14:56 |
mordred | fungi: so, you're saying we're busy? maybe I should go back on vacation? | 14:56 |
*** tsg has quit IRC | 14:56 | |
*** baohua has quit IRC | 14:56 | |
fungi | mordred: busy is relative. i'm no more busy than usual, but the machines are working their chips off | 14:57 |
*** sandywalsh has quit IRC | 14:57 | |
*** sandywalsh has joined #openstack-infra | 14:57 | |
*** shayneburgess has joined #openstack-infra | 14:57 | |
*** andreykurilin_ has joined #openstack-infra | 14:58 | |
markvan_ | fungi: so being new to that part of it, should that be obvious in the job log? I see a " sudo: unable to resolve host bare-precise-hpcloud-b3-1924966" near the top. something not configured right for that? | 14:58 |
*** shayneburgess has quit IRC | 14:59 | |
mordred | fungi: oh, well, I'm fine with machines performing work | 14:59 |
fungi | markvan_: no, that's just a benign warning sudo throws when it has trouble mapping the system's interface addresses to something in /etc/hosts, but it works fine regardless | 15:00 |
*** mfer has quit IRC | 15:00 | |
*** mfer has joined #openstack-infra | 15:00 | |
*** skolekonov has joined #openstack-infra | 15:01 | |
markvan_ | so my log shows: http://paste.openstack.org/show/105843/ | 15:01 |
*** jcoufal has quit IRC | 15:01 | |
markvan_ | does that mean it's revoked? | 15:01 |
*** baoli has quit IRC | 15:02 | |
fungi | markvan_: see http://git.openstack.org/cgit/openstack-infra/config/tree/modules/openstack_project/files/jenkins_job_builder/config/chef-jobs.yaml#n30 | 15:02 |
*** dane_leblanc has quit IRC | 15:02 | |
fungi | markvan_: note that the list of builders in that job-template definition (which are run in order) starts with "revoke-sudo" | 15:02 |
*** zz_dimtruck is now known as dimtruck | 15:02 | |
*** baoli has joined #openstack-infra | 15:03 | |
jogo | fungi: I think one of the reasons grenade is less stable is it does more pip installs | 15:03 |
*** shayneburgess has joined #openstack-infra | 15:03 | |
fungi | markvan_: if you leave that out or move it after another builder which needs sudo permission then it should be able to sudo passwordlessly | 15:03 |
jogo | http://logs.openstack.org/60/112660/15/gate/gate-grenade-dsvm-partial-ncpu/cac23c4/logs/grenade.sh.txt.gz#_2014-09-04_05_23_51_488 | 15:03 |
fungi | jogo: quite probable | 15:03 |
jogo | fungi: could we add something to the .pip.conf to add a local cache? | 15:03 |
jogo | or is that already there | 15:03 |
fungi | mordred: ^? | 15:04 |
jogo | hmm http://logs.openstack.org/60/112660/15/gate/gate-grenade-dsvm-partial-ncpu/cac23c4/logs/devstack-gate-setup-host.txt.gz#_2014-09-04_05_06_38_654 | 15:04 |
fungi | it's not already there, i think the suggestion would be to preinstall a local pip cache on the nodepool images | 15:04 |
markvan_ | fungi: yup, see that, thx. Next question is about the ruby version, where can I better understand how to get the default to 1.9. would "sudo update-alternatives --set ruby /usr/bin/ruby1.9.1" work? | 15:04 |
jogo | fungi: modules/openstack_project/files/pip.conf | 15:05 |
jogo | for starters we can add a local cache dir in that | 15:05 |
fungi | markvan_: i think in other jobs running on ubuntu 12.04 we just invoked /usr/bin/ruby1.9.1 explicitly where it was needed rather than merely running "ruby" (or rather than letting the shebang line pick one for you) | 15:05 |
markvan_ | fungi: ok, I'll look into that as I believe we had it that way a while ago, but ran into another issue so it was changed. thx for your help. | 15:06 |
fungi | jogo: i think the challenge is that we need a lot of packages to cover most of the popular jobs, and are already constrained on image size (we get somewhere around 20gb of disk for each root snapshot) | 15:06 |
jogo | fungi: well if we just add the download_cache option | 15:06 |
jogo | that should help the grenade scenario at least | 15:06 |
jogo | and won't change the image size | 15:07 |
jogo | it may not help much | 15:07 |
fungi | jogo: oh, to keep it from redownloading the same files over and over in one run of a job? | 15:07 |
jogo | but AFAIK it cannot hurt? | 15:07 |
jogo | yup | 15:07 |
jeblair | fungi, jogo: we stopped using the local pip cache because the openstack pypi mirror was so reliable. | 15:07 |
*** dane_leblanc has joined #openstack-infra | 15:07 | |
jeblair | not saying that's the current situation :) | 15:07 |
jeblair | but i'm pretty sure we stopped using it even in the devstack case | 15:08 |
jogo | jeblair: ahh, well bug 1270710 is the top at the moment | 15:08 |
*** bdpayne has joined #openstack-infra | 15:08 | |
uvirtbot | Launchpad bug 1270710 in openstack-ci "sporadic pip timeouts during download" [Medium,Incomplete] https://launchpad.net/bugs/1270710 | 15:08 |
fungi | also, there are some pip/easy_install caching mechanisms which would take the locally-cached file in preference over downloading one even if the locally-cached version was outside the version spec being requested | 15:08 |
fungi | which was an obvious recipe for broken | 15:08 |
jogo | fungi: is that still an issue? | 15:08 |
jeblair | fungi, jogo: have we analysis on whether that's more frequent hp-rax than rax-rax? | 15:09 |
* jogo checks logstash.o.o | 15:09 | |
fungi | jogo: i don't remember which it was... may have been for packages which were install_requires serviced by setuptools/easy_install but not for packages installed explicitly from requirements via pip | 15:09 |
fungi | jeblair: looking now | 15:09 |
jogo | jeblair: in 24 hours, 249 hits for hpcloud | 15:11 |
jogo | and 6 for NOT hpcloud | 15:11 |
jeblair | separately, seeing how often it happens for nodes in rax-dfw (where the mirror is) is interesting too | 15:11 |
jogo | aka rax | 15:11 |
jeblair | jogo: can you expand to 48 hours? | 15:11 |
jogo | jeblair: that is interesting | 15:11 |
jogo | jeblair: sure I'll do all 10 days | 15:12 |
fungi | yeah, that means it went over "teh internets" | 15:12 |
jeblair | rax is actually at a disadvantage because we had a bunch of nodes stuck in delete overnight | 15:12 |
jeblair | (or at an advantage) | 15:12 |
*** pballand has joined #openstack-infra | 15:12 | |
jogo | jeblair: so the don't get stuck in delete issue isn't fixed? | 15:12 |
jeblair | jogo: problem was our side | 15:12 |
jogo | all time rax: 8 | 15:13 |
jogo | query: tags:"console" AND message:("download.py\", line 495" "download.py\", line 433" "download.py\", line 237") AND NOT build_node:*hpcloud* | 15:13 |
jogo | jeblair: ahh | 15:13 |
*** amotoki has quit IRC | 15:14 | |
jogo | all time hpcloud: 326 | 15:14 |
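The provider split jogo reports can be reproduced with a small tally. The following is a minimal Python sketch, not the logstash tooling itself: it assumes the provider can be read from the build node name the same way the query's `build_node:*hpcloud*` filter does, and the second node name in the sample is hypothetical.

```python
from collections import Counter


def provider_of(build_node):
    """Mirror the query's build_node:*hpcloud* filter with a substring test."""
    return "hpcloud" if "hpcloud" in build_node else "not-hpcloud"


def tally(build_nodes):
    """Count failure hits per provider bucket."""
    return Counter(provider_of(node) for node in build_nodes)


def share(counts, bucket):
    """Fraction of all hits that fall in one bucket."""
    return counts[bucket] / sum(counts.values())


# The first name appears earlier in this log; the second is a hypothetical example.
sample = tally(["bare-precise-hpcloud-b3-1924966", "devstack-precise-rax-dfw-0000001"])

# All-time counts quoted in the channel.
all_time = Counter({"hpcloud": 326, "not-hpcloud": 8})
hpcloud_share = share(all_time, "hpcloud")
```

With the quoted all-time counts, roughly 98% of the hits come from hpcloud nodes.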
*** amotoki_ is now known as amotoki | 15:14 | |
jeblair | okay, so the internet between rax and hpcloud is broken | 15:15 |
clarkb | jeblair fungi nodes not deleting was a problem in nodepool? sergeylukjanov indicated manual deletions timed out or did I read that wrong? | 15:15 |
jroll | sdague: have a moment to take a look at https://review.openstack.org/#/c/118507/ please? | 15:15 |
*** MaxV_ has quit IRC | 15:15 | |
fungi | clarkb: yeah, i'm thinking we've got some sort of blocking interaction in the provider manager for the rax regions where the image updater is looping and not letting the periodic deleter run | 15:16 |
SergeyLukjanov | clarkb, there was timeout on manual delete | 15:16 |
fungi | clarkb: still speculation until i can find a smoking gun in the thread dump, but i need to ignore irc for a bit so i can focus on it and people keep needing help with other things | 15:16 |
*** Longgeek has joined #openstack-infra | 15:17 | |
clarkb | fungi doesnt ^ indicate the periodic deleter would fail if it ran? | 15:17 |
clarkb | fungi ok go focus :) | 15:17 |
jogo | jeblair: so thoughts on adding download_cache to pip.conf to reduce chances of duplicate network downloads? hopefully that would help the hpcloud nodes a little bit and no downside (assuming the setuptools/easy_install install outdated version thing won't be triggered). | 15:17 |
Alex_Gaynor | jogo: it seems like this would potentially help a ton on the requirements jobs? (pip 1.6 is going to have download_cache on by default either way) | 15:18 |
jogo | hmm we may already cache things | 15:19 |
jeblair | jogo: i'm not up on the outdated version thing; i think we should find someone who groks that to tell us if it's going to cause problems | 15:19 |
jogo | var/cache/pip | 15:19 |
jogo | http://logs.openstack.org/60/112660/15/gate/gate-grenade-dsvm-partial-ncpu/cac23c4/logs/grenade.sh.txt.gz#_2014-09-04_05_14_53_990 | 15:19 |
Alex_Gaynor | PIP_DOWNLOAD_CACHE=/var/cache/pip | 15:19 |
jeblair | jogo: yeah, i thought that was empty though | 15:19 |
fungi | clarkb: https://review.openstack.org/119059 (once it merges) should solve the problem which is causing the updater to loop continuously in those regions at least | 15:19 |
*** Longgeek_ has quit IRC | 15:20 | |
fungi | (unless it's masking some other broken further down the line) | 15:20 |
* jeblair checks | 15:20 | |
jogo | jeblair: I am not talking about pre-seeding the cache. | 15:20 |
jogo | not sure where /var/cache/pip is set or if it's used | 15:20 |
jogo | Alex_Gaynor: ^ | 15:20 |
Alex_Gaynor | Looking at just that log, it looks like it's set before every invocation of pip | 15:21 |
jeblair | jogo: oh, you think the 1st grenade run is populating it for potential use by the 2nd? | 15:21 |
openstackgerrit | A change was merged to stackforge/gertty: Move initial focus on change screen https://review.openstack.org/118355 | 15:21 |
jeblair | confirmed: /var/cache/pip starts empty | 15:21 |
openstackgerrit | A change was merged to stackforge/gertty: Add project and owner columns to change list https://review.openstack.org/118356 | 15:21 |
openstackgerrit | A change was merged to stackforge/gertty: Reduce impact of check revisions task https://review.openstack.org/118357 | 15:21 |
openstackgerrit | A change was merged to stackforge/gertty: Add user-agent and version https://review.openstack.org/118358 | 15:21 |
openstackgerrit | A change was merged to stackforge/gertty: Save draft cover messages https://review.openstack.org/118359 | 15:21 |
openstackgerrit | A change was merged to stackforge/gertty: Support paging in queries https://review.openstack.org/118360 | 15:21 |
openstackgerrit | A change was merged to stackforge/gertty: Add database pre-reqs for change actions https://review.openstack.org/118361 | 15:21 |
openstackgerrit | A change was merged to stackforge/gertty: Add support for editing topic https://review.openstack.org/118362 | 15:22 |
openstackgerrit | A change was merged to stackforge/gertty: Add support for rebasing a change https://review.openstack.org/118363 | 15:22 |
openstackgerrit | A change was merged to stackforge/gertty: Add support for abandon/restore https://review.openstack.org/118364 | 15:22 |
openstackgerrit | A change was merged to stackforge/gertty: Add support for cherry-picking to a branch https://review.openstack.org/118365 | 15:22 |
openstackgerrit | A change was merged to stackforge/gertty: Remove a stray debug line https://review.openstack.org/118366 | 15:22 |
*** [HeOS] has quit IRC | 15:22 | |
openstackgerrit | A change was merged to stackforge/gertty: Add support for editing commit message https://review.openstack.org/118367 | 15:22 |
openstackgerrit | Joakim Löfgren proposed a change to openstack-infra/jenkins-job-builder: Add PMD publisher https://review.openstack.org/118312 | 15:22 |
openstackgerrit | A change was merged to stackforge/gertty: Fix immediate change sync on search https://review.openstack.org/118368 | 15:22 |
jogo | jeblair: devstack sests PIP_DOWNLOAD_CACHE | 15:22 |
openstackgerrit | A change was merged to stackforge/gertty: Clarify keymap entries for local git operations https://review.openstack.org/118369 | 15:22 |
jogo | sets* | 15:22 |
openstackgerrit | A change was merged to stackforge/gertty: Add command line options to print palette and keymap https://review.openstack.org/118370 | 15:22 |
openstackgerrit | A change was merged to stackforge/gertty: Don't modify status widgets outside of main thread https://review.openstack.org/118498 | 15:22 |
openstackgerrit | A change was merged to stackforge/gertty: Fix crash on dependency update https://review.openstack.org/118516 | 15:23 |
openstackgerrit | A change was merged to stackforge/gertty: Query projects in batches https://review.openstack.org/118519 | 15:23 |
openstackgerrit | A change was merged to stackforge/gertty: Change _ to - in config YAML https://review.openstack.org/118954 | 15:23 |
openstackgerrit | A change was merged to stackforge/gertty: Change help key https://review.openstack.org/118955 | 15:23 |
openstackgerrit | A change was merged to stackforge/gertty: Clear error flag when changing screen https://review.openstack.org/118956 | 15:23 |
openstackgerrit | A change was merged to stackforge/gertty: Update README and install sample configs https://review.openstack.org/118957 | 15:23 |
*** skolekonov has quit IRC | 15:24 | |
*** dizquierdo has joined #openstack-infra | 15:25 | |
jeblair | jogo: /var/cache/pip starts empty, but is populated by devstack as it installs | 15:26 |
jeblair | jogo: so theoretically, the 2nd run of grenade should benefit from it | 15:26 |
ashaeron | How do you "publish" draft comments on a previous patch set in gerrit? :/ | 15:27 |
*** shayneburgess has quit IRC | 15:27 | |
jogo | jeblair: except, http://logs.openstack.org/60/112660/15/gate/gate-grenade-dsvm-partial-ncpu/cac23c4/logs/grenade.sh.txt.gz#_2014-09-04_05_23_25_024 isn't using it | 15:27 |
jeblair | ashaeron: expand the patch set box and click the 'review button' for that patch set | 15:27 |
jogo | sdague: ^ /var/cache/pip devstack and tempest | 15:28 |
jogo | mtreinish: ^ | 15:28 |
ashaeron | jeblair: thanks | 15:28 |
jeblair | jogo: maybe because the set of packages in the cache is also the set of packages that are installed | 15:28 |
ttx | fungi: number of "running" nodes being limited again -- did you get to the bottom of the "not deleting" issue? | 15:29 |
jeblair | jogo: so anything it would need to install would not be in the cache | 15:29 |
*** bhuvan_ has joined #openstack-infra | 15:29 | |
jeblair | ttx: he's still looking into it, however, the current cause is not because of nodes stuck in delete | 15:29 |
fungi | ttx: i'm looking through a thread dump of nodepool now to see if i can tell | 15:29 |
jeblair | ttx: there was just a gate reset at the top | 15:29 |
jogo | jeblair: devstack doesn't actually export PIP_DOWNLOAD_CACHE | 15:29 |
jogo | jeblair: it only uses it when devstack does a pip command | 15:30 |
ttx | jeblair: ack | 15:30 |
jogo | jeblair: so the tempest tox doesn't use it | 15:30 |
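jogo's point is the usual difference between setting a variable for a single command and exporting it to every child process. A minimal Python sketch of that difference follows; it assumes nothing about devstack's internals beyond what is said in the channel, and the helper name `child_sees` is hypothetical.

```python
import os
import subprocess
import sys


def child_sees(env):
    """Run a child Python process with `env` and report its PIP_DOWNLOAD_CACHE."""
    result = subprocess.run(
        [sys.executable, "-c",
         "import os; print(os.environ.get('PIP_DOWNLOAD_CACHE', ''))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


# Environment of a process that never exported the variable (the tox case).
not_exported = {k: v for k, v in os.environ.items() if k != "PIP_DOWNLOAD_CACHE"}

# Environment used when the variable is supplied for one specific pip command.
per_command = dict(not_exported, PIP_DOWNLOAD_CACHE="/var/cache/pip")

seen_by_tox = child_sees(not_exported)
seen_by_pip = child_sees(per_command)
```

Only the child that was explicitly handed the variable sees the cache path, which is why a tox run started outside devstack's pip wrapper would not use the cache.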
ttx | we seem to be past the bump now | 15:31 |
jogo | jeblair: but cffi may not be used elsewhere | 15:31 |
openstackgerrit | A change was merged to openstack-infra/config: Correct path in prepare_devstack_virt_preview.sh https://review.openstack.org/119059 | 15:31 |
*** shayneburgess has joined #openstack-infra | 15:31 | |
*** bhuvan has quit IRC | 15:31 | |
*** Guest42286 has quit IRC | 15:32 | |
*** MaxV has joined #openstack-infra | 15:32 | |
jogo | jeblair: I don't see anything else installing it | 15:33 |
jeblair | well, at base we've got two options: prepopulate the cache, or set up per-region mirrors | 15:33 |
openstackgerrit | A change was merged to openstack-infra/config: Add a missing 's' to third-party-requests https://review.openstack.org/118893 | 15:34 |
jogo | jeblair: yeah either one of those sounds like the right approach | 15:34 |
*** isaacb has joined #openstack-infra | 15:35 | |
*** woodm1979 has joined #openstack-infra | 15:36 | |
woodm1979 | Hello openstack infra folks! | 15:36 |
woodm1979 | I'm primarily on the Horizon project, and I have a question regarding our minimum pip level. | 15:36 |
*** MaxV has quit IRC | 15:37 | |
*** esker has joined #openstack-infra | 15:37 | |
jogo | jeblair: simply pre-populating /var/cache/pip isn't enough as only devstack jobs use the cache right now | 15:38 |
woodm1979 | Right now, according to https://github.com/openstack/requirements/blob/master/global-requirements.txt#L78 , the minimum pip level needed for anything openstack is 1.4. However, the oslosphinx package (at least in horizon) needs >= 2.2.0.0a2. That install fails. | 15:39 |
woodm1979 | I'm not sure if it's because of the "a2" on the end, or what, but pip < 1.5 fails. | 15:40 |
woodm1979 | Do any of you have any guidance? | 15:40 |
openstackgerrit | Mark Vanderwiel proposed a change to openstack-infra/config: Fix chef gates, allow sudo for gem install and sue ruby 1.9.1 https://review.openstack.org/119094 | 15:40 |
fungi | woodm1979: you need to pass a special option in pip 1.4.x to convince it to accept prerelease versions of packages, unlike 1.5 and later which does so by default | 15:40 |
fungi | woodm1979: see the pip manpage for details | 15:41 |
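The option fungi mentions is presumably `--pre`, the `pip install` flag that permits pre-release and development versions; that identification is an assumption, since the channel does not name the flag. A minimal Python sketch of deciding when to pass it, using a deliberately simplified pre-release check rather than a full PEP 440 parser (both helper names are hypothetical):

```python
import re

# Simplified check for alpha/beta/rc/dev suffixes; not a full PEP 440 parser.
_PRERELEASE = re.compile(r"\d\.?(?:a|b|c|rc|dev)\d*$")


def is_prerelease(version):
    """Return True when the version string ends in a pre-release marker."""
    return _PRERELEASE.search(version) is not None


def install_command(requirement, allow_prereleases):
    """Build a pip command line, adding --pre only when it is needed."""
    command = ["pip", "install"]
    if allow_prereleases:
        command.append("--pre")
    command.append(requirement)
    return command


command = install_command("oslosphinx>=2.2.0.0a2", is_prerelease("2.2.0.0a2"))
```

Here the `a2` suffix marks the minimum version as an alpha, so the sketch adds `--pre` to the command.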
*** yjiang5 has joined #openstack-infra | 15:41 | |
*** jistr has quit IRC | 15:41 | |
*** unicell has quit IRC | 15:41 | |
openstackgerrit | Mark Vanderwiel proposed a change to openstack-infra/config: Fix chef gates, allow sudo for gem install and use ruby 1.9.1 https://review.openstack.org/119094 | 15:41 |
*** mmaglana has joined #openstack-infra | 15:42 | |
markvan_ | fungi: I think this should do the trick for us....https://review.openstack.org/#/c/119094/ | 15:43 |
openstackgerrit | A change was merged to openstack-infra/elastic-recheck: Add query for oslotest/infra py33 bug 1365512 https://review.openstack.org/119063 | 15:43 |
uvirtbot | Launchpad bug 1365512 in oslotest "python33 jobs failing with "No distributions matching the version for oslotest>=1.1.0.0a1"" [Undecided,New] https://launchpad.net/bugs/1365512 | 15:43 |
*** tsg has joined #openstack-infra | 15:44 | |
woodm1979 | Thanks fungi I'll check it out. | 15:45 |
*** MaxV has joined #openstack-infra | 15:46 | |
*** paulrad has quit IRC | 15:47 | |
openstackgerrit | Andreas Jaeger proposed a change to openstack-infra/config: Fix doc8 issues https://review.openstack.org/117342 | 15:48 |
*** paulrad has joined #openstack-infra | 15:48 | |
*** MaxV_ has joined #openstack-infra | 15:48 | |
fungi | i see a thread in the dump sitting in periodicCleanup -> cleanupOneImage -> deleteImage -> submitTask -> wait -> waiter.acquire() | 15:49 |
*** shayneburgess has quit IRC | 15:49 | |
fungi | not sure if that's actually stuck or just a snapshot in time | 15:49 |
*** dkranz has joined #openstack-infra | 15:49 | |
openstackgerrit | Joe Gordon proposed a change to openstack-infra/config: Set download_cache in pip.conf https://review.openstack.org/119095 | 15:50 |
*** woodm1979 has left #openstack-infra | 15:50 | |
openstackgerrit | James E. Blair proposed a change to openstack-infra/config: Add gertty release jobs https://review.openstack.org/119096 | 15:50 |
openstackgerrit | afazekas proposed a change to openstack-infra/nodepool: Add fixed ip to the /etc/nodepool https://review.openstack.org/114840 | 15:51 |
*** doude has quit IRC | 15:51 | |
*** MaxV has quit IRC | 15:51 | |
jeblair | jogo: so let's find someone who actually understands the implications of that to weigh in | 15:51 |
*** koolhead17 has joined #openstack-infra | 15:51 | |
*** dtantsur|brb is now known as dtantsur | 15:51 | |
fungi | hrm, yeah took another thread dump 13 minutes later and that same thread is still sitting at waiter.acquire() | 15:51 |
openstackgerrit | Mark Vanderwiel proposed a change to openstack-infra/config: Fix chef gates, allow sudo for gem install and use ruby 1.9.1 https://review.openstack.org/119094 | 15:52 |
clarkb | jeblair: jogo: I don't think that patch will change it for tempest/grenade either because they run as a different user | 15:52 |
jeblair | fungi: to be fair, they spend most of their time waiting | 15:52 |
clarkb | jeblair: jogo: we need to update d-g iirc where it sets up the pip.conf | 15:52 |
*** packet has joined #openstack-infra | 15:53 | |
*** MaxV_ has quit IRC | 15:53 | |
fungi | jeblair: i expect so... wasn't sure if that was typical behavior for a thread in the pool to continue as the cleanup thread | 15:53 |
*** paulrad has quit IRC | 15:53 | |
*** markmcclain has quit IRC | 15:53 | |
jogo | jeblair: I agree, I put a note in the commit message saying there is an outstanding question | 15:53 |
openstackgerrit | A change was merged to openstack-infra/config: Cleanup tempest-logs.html https://review.openstack.org/117373 | 15:53 |
fungi | rather than being recycled/respawned for each cleanup pulse | 15:53 |
jeblair | jogo: okay, you go hunt that person down :) but i don't actually think this is a solution to the problem unless we also pre-populate it | 15:54 |
clarkb | jeblair: agreed, cloud specific mirrors should be much more reliable | 15:55 |
jogo | jeblair: I'll hunt someone down. I agree it isn't a real solution on its own | 15:55 |
jeblair | clarkb: are we waiting on any pre-reqs for that? | 15:55 |
*** markvan_ has quit IRC | 15:56 | |
clarkb | jeblair: nothing beyond setting up our non jenkins2 account in hpcloud for use. iirc we haven't done anything to it since the switch to 1.1 | 15:56 |
clarkb | jeblair: that means, create network, create router, hook them together | 15:56 |
clarkb | we may need a quota bump too but not for a single server to start | 15:57 |
jeblair | 2014-09-04 15:55:36,989 DEBUG nodepool.NodePool: Finished periodic cleanup | 15:57 |
jeblair | 2014-09-04 15:56:00,001 DEBUG nodepool.NodePool: Starting periodic cleanup | 15:57 |
fungi | yeah, i'm not really sure whatever leak we saw overnight is continuing | 15:57 |
clarkb | then we need to have the jobs select the right mirror (geo dns, /etc/hosts, something) | 15:57 |
jeblair | fungi: ^ | 15:57 |
clarkb | jeblair: I can start with setting up the account today if that is a thing we want to do | 15:58 |
fungi | jeblair: right, and no stacktrace for it | 15:58 |
jeblair | clarkb: switch in devstack-gate on /etc/nodepool file | 15:58 |
jogo | jeblair: if we prepopulate we still need to answer the question fungi raised anyway | 15:58 |
jeblair | jogo: agreed | 15:58 |
fungi | jeblair: i also don't really see any long-delete nodes accumulating | 15:58 |
openstackgerrit | A change was merged to openstack-infra/config: Add list of types of logs to tempest-logs page https://review.openstack.org/117390 | 15:58 |
fungi | jeblair: oho! it waaaaaas stuck... | 16:00 |
jeblair | fungi, mordred: thoughts on per-provider mirrors? | 16:00 |
fungi | 2014-09-03 22:14:00,002 DEBUG nodepool.NodePool: Starting periodic cleanup | 16:00 |
fungi | 2014-09-04 10:23:41,184 DEBUG nodepool.NodePool: Finished periodic cleanup | 16:00 |
*** MaxV has joined #openstack-infra | 16:00 | |
jeblair | fungi: ah yeah, that's a long one | 16:00 |
fungi | a little over 12 hours gap from start to finish | 16:00 |
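The gap between the two nodepool timestamps pasted above can be checked directly. This is a minimal sketch; the helper name `cleanup_gap` is illustrative and not part of nodepool, and the format string assumes the default Python logging timestamp layout seen in the pasted lines.

```python
from datetime import datetime

# Timestamp layout used in the pasted nodepool debug log lines
# (date, time, then comma-separated milliseconds).
FMT = "%Y-%m-%d %H:%M:%S,%f"


def cleanup_gap(start, finish):
    """Return the elapsed time between two log timestamps as a timedelta."""
    return datetime.strptime(finish, FMT) - datetime.strptime(start, FMT)


gap = cleanup_gap("2014-09-03 22:14:00,002", "2014-09-04 10:23:41,184")
print(gap)  # 12:09:41.182000
```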
fungi | spans the log rotation | 16:00 |
*** jlibosva has quit IRC | 16:01 | |
fungi | i'm betting 10:23 utc roughly corresponds with when SergeyLukjanov did manual cleanup | 16:01 |
jeblair | fungi: look at 2014-09-04 10:44:07,768 ERROR nodepool.NodePool: Exception cleaning up image id 216564: | 16:02 |
*** koolhead17 has quit IRC | 16:02 | |
*** weshay has joined #openstack-infra | 16:02 | |
*** jlibosva has joined #openstack-infra | 16:03 | |
jeblair | however, it does look like it was able to delete the server used for that image | 16:03 |
*** dkranz has quit IRC | 16:03 | |
fungi | jeblair: yep, there were about half a dozen of those i found in the debug log, didn't seem related though | 16:03 |
jeblair | well, they were the failed image builds | 16:04 |
fungi | right, for the template nodes | 16:04 |
*** yamahata has quit IRC | 16:06 | |
*** isaacb has quit IRC | 16:06 | |
*** arxcruz has quit IRC | 16:06 | |
jeblair | fungi: i'm not sure there's a connection | 16:06 |
*** marun_ is now known as marun_afk | 16:06 | |
fungi | jeblair: nor i | 16:07 |
*** marcoemorais has joined #openstack-infra | 16:07 | |
mordred | jeblair: I believe we should have them | 16:07 |
jeblair | fungi: hypothesis: rax timed out deleting a bunch of servers last night. it made the periodic cleanup thread slow, and caused many of our normal deletions to fail. SergeyLukjanov saw the same behavior when he manually deleted them | 16:07 |
mordred | jeblair: should I escalate working on that? | 16:07 |
fungi | jeblair: oh, you mean not sure if there's a connection between SergeyLukjanov's manual node deletions and the periodic cleanup completing? | 16:07 |
fungi | ahh, right | 16:07 |
fungi | seems plausible | 16:08 |
jeblair | fungi: no connection between the image looping and deleting problem | 16:08 |
jeblair | mordred: are/were you working on that? | 16:08 |
mordred | jeblair: it was on my list but I had not actually started work on it yet | 16:08 |
fungi | yeah, it was a hunch which did not pan out. was the only thing i could find which seemed to be consistently happening to all rax regions which would have possibly started around the same time as a slow leak on periodic cleanup | 16:08 |
jeblair | mordred: what was your general plan? | 16:08 |
mordred | jeblair: I was kinda hoping that we'd magically have AFS before hand ... but without that ... | 16:09 |
fungi | jeblair: jogo: the "installing the wrong versions from cache" thing came up when we were reworking pypi-mirror for performance, but now i can't recall the details (it's now seeming like it was maybe related to reusing the build cache instead). maybe we should just try and see if it causes any unexpected issues... the result should be apparent in pip freeze output in our logs if we're using different | 16:09 |
mordred | jeblair: we can just run additional copies of the mirror stuff on additional hosts independently, so there might could be mild skew - but it's the shortest path to them existing | 16:09 |
fungi | versions than we requested | 16:09 |
*** rushiagr is now known as rushiagr_away | 16:09 | |
mordred | jeblair: then we just need to tell a node which mirror to use ... which I think can be done at node create/launch time pretty easily | 16:09 |
fungi | anyway, since it appears we're out of immediate danger, i'm going to break for a long lunch to check out an art show up the street | 16:10 |
fungi | back in a while | 16:10 |
jeblair | fungi: have fun, and thanks | 16:10 |
mordred | as in, I think we can make a system that isn't _perfect_ but that is understandable and workable without too much effort, and we can come back and do a more perfect answer a little later | 16:10 |
*** andreykurilin_ has quit IRC | 16:10 | |
mordred | jeblair: did we get anywhere with dib-nodepool while I was out? | 16:10 |
jeblair | mordred: sounds good; clarkb was talking about setting up the non-jenkins2 account in hpcloud | 16:10 |
jeblair | mordred: we're running nodepool with the code but not exercising it as the in-config-tree part hasn't quite gotten back in shape | 16:11 |
mordred | jeblair: kk | 16:11 |
mordred | jeblair: I'd actually like for it to be per-region mirrors | 16:11 |
mordred | jeblair: so that we have a DFW, ORD and IAD mirror in rax | 16:12 |
*** sballe has quit IRC | 16:12 | |
*** wenlock has joined #openstack-infra | 16:12 | |
jeblair | mordred: what about hp? we're only using one region there (but 3 azs, but they all share networking) | 16:12 |
*** pabelanger_ is now known as pabelanger | 16:13 | |
*** pabelanger has quit IRC | 16:13 | |
*** pabelanger has joined #openstack-infra | 16:13 | |
jeblair | (except that we have 5 of our own networks there, each spanning all 3 azs) | 16:13 |
*** afazekas has quit IRC | 16:13 | |
clarkb | I think hp 1.1 east needs a single mirror. If we spin up west (we should follow up on that because we should do that at some point yes?) we can put one there | 16:13 |
jeblair | mordred: i vote one hp mirror for starters, then see where it breaks :) | 16:13 |
clarkb | jeblair: ++ | 16:13 |
mordred | jeblair: I think just one ... but I think it would be like pypi.region-b.geo-1.openstack.org | 16:13 |
jeblair | clarkb, mordred: ++ | 16:13 |
mordred | jeblair: and the others would be pypi.DFW.openstack.org etc. | 16:13 |
clarkb | mordred: iirc we were asked not to use west yet. Do you happen to know if/when we may be able to split our hpcloud load and add that region? | 16:14 |
mordred | that way, we can have a puppet thing that does pypi-url=http://pypi.<%= OS_REGION %>.openstack.org (hand waves) | 16:14 |
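The naming scheme mordred describes (one hostname per region, filled in from the region name) can be sketched as below. This is only an illustration of the pattern from the chat; the function name `mirror_url` and the `domain` parameter are hypothetical, and the real change would do the substitution in a puppet template.

```python
def mirror_url(region, domain="openstack.org"):
    """Build a per-region PyPI mirror URL of the form pypi.<region>.<domain>."""
    if not region:
        raise ValueError("region must be a non-empty string")
    return "http://pypi.{}.{}".format(region, domain)


# Region names taken from the discussion above.
print(mirror_url("DFW"))
print(mirror_url("region-b.geo-1"))
```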
mordred | clarkb: I do not | 16:14 |
*** ashaeron has quit IRC | 16:15 | |
openstackgerrit | A change was merged to openstack-infra/config: Add pypi publish jobs to subunit2sql https://review.openstack.org/118396 | 16:15 |
*** _d34dh0r53_ is now known as d34dh0r53 | 16:16 | |
jeblair | clarkb, mordred: want to split this up? clarkb set up the accounts, mordred set up the hosts in puppet, i'll work on the nodepool/mirror switching bits? | 16:16 |
clarkb | sounds good | 16:17 |
*** elixor has joined #openstack-infra | 16:17 | |
*** hogepodge has joined #openstack-infra | 16:17 | |
clarkb | we should only need to set up hpcloud right? the rax non jenkins account is good to go in iad and ord? | 16:17 |
jeblair | clarkb: i think? it's possible it has very limited quota in those regions | 16:18 |
jeblair | however, if that lags, it's probably okay (those regions aren't as critical to the current problem) | 16:18 |
clarkb | ++ | 16:19 |
*** bdpayne has quit IRC | 16:20 | |
*** MaxV has quit IRC | 16:20 | |
mordred | jeblair: so, wrt mirror switching ... I think all of the info we need is in the nova metadata | 16:20 |
clarkb | we shouldn't count on meta data | 16:21 |
*** adalbas has quit IRC | 16:21 | |
mordred | jeblair: although I'm not sure we're using that from facter at the moment (it's very easy to get to from the ansible side) | 16:21 |
clarkb | it isn't reliable and our failure rate may even go up as part of that switch | 16:21 |
mordred | clarkb: sure we should, but we're talking about different things | 16:21 |
clarkb | mordred: our slaves don't rely on it today in hpcloud, only the image builds do | 16:22 |
mordred | clarkb: I'm not necessarily talking about the ec2 metadata service | 16:22 |
*** _nadya_ has quit IRC | 16:22 | |
mordred | but let me step back from that for a sec ... once we're on dib, then we're not running puppet on the host in the region itself, so we lose the natural insight of where the node lives | 16:23 |
*** kmartin has joined #openstack-infra | 16:23 | |
mordred | _currently_ there is a very easy way to deal with this, but that way is planned to go away | 16:23 |
*** praneshp has joined #openstack-infra | 16:23 | |
*** dane_leblanc has quit IRC | 16:23 | |
jeblair | i was going to put it in a file in /etc/nodepool that we plop down when we spin up the instance | 16:24 |
mordred | yah | 16:24 |
jeblair | that work? | 16:24 |
mordred | but then we need a select mirror script again | 16:24 |
jeblair | mordred: ah, you wanted to bake it into the per-provider image, which we're getting rid of | 16:24 |
jeblair | i admit, that's elegant, but yeah, it makes dib harder | 16:25 |
mordred | well, kinda. _currently_ that would work - but what I really want is for us to do what you said wrt to /etc/nodepool - just not putting it in /etc/nodepool and instead putting the mirror config in place | 16:25 |
*** gyee has joined #openstack-infra | 16:25 | |
jeblair | mordred: oh, have a ready script that does that? | 16:25 |
mordred | yeah | 16:25 |
mordred | because we know at that time what the region is | 16:25 |
*** cnesa has joined #openstack-infra | 16:26 | |
mordred | and it should work in both approaches | 16:26 |
clarkb | that sounds like a good approach | 16:26 |
jeblair | mordred: it's basically select-mirror either way, it's just whether we do it in nodepool or in all our jobs | 16:27 |
mordred | yah. | 16:27 |
jeblair | nodepool means one place, all our jobs means faster to fix problems :) | 16:27 |
jeblair | but i bet we can swing it in nodepool | 16:27 |
jeblair | i think i will actually still implement it as writing the info to /etc/nodepool, and then have the ready script read that | 16:28 |
mordred | ++ | 16:28 |
jeblair | nodepool provider name, openstack region, openstack az | 16:29 |
jeblair | any other info we want to dump in there? | 16:29 |
mordred | I think that's plenty | 16:29 |
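The plan agreed above (nodepool writes provider, region and az into /etc/nodepool, and a ready script reads them back to configure the mirror) could look roughly like this. It is a sketch under assumptions: the one-file-per-value layout, the function names, the example provider/az values and the `/simple` URL path are all hypothetical, and a temporary directory stands in for /etc/nodepool so the example is safe to run.

```python
import os
import tempfile

FIELDS = ("provider", "region", "az")


def write_node_info(directory, provider, region, az):
    """Launcher side: record where the node lives, one small file per field."""
    for name, value in zip(FIELDS, (provider, region, az)):
        with open(os.path.join(directory, name), "w") as f:
            f.write(value + "\n")


def read_node_info(directory):
    """Ready-script side: read the recorded fields back into a dict."""
    info = {}
    for name in FIELDS:
        with open(os.path.join(directory, name)) as f:
            info[name] = f.read().strip()
    return info


def render_pip_conf(info):
    """Render a pip.conf pointing at the mirror for the node's region."""
    return (
        "[global]\n"
        "index-url = http://pypi.{region}.openstack.org/simple\n"
    ).format(**info)


# Temporary directory used in place of /etc/nodepool for the demonstration.
with tempfile.TemporaryDirectory() as d:
    write_node_info(d, "rax-dfw", "DFW", "nova")
    info = read_node_info(d)
    pip_conf = render_pip_conf(info)

print(pip_conf)
```

Keeping the raw facts in files and deriving the mirror from them in the ready script matches jeblair's preference for a single place (nodepool) to hold the selection logic.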
clarkb | I am activating east 1.1 in the openstackci2 account | 16:29 |
clarkb | and hopefully won't get charged for this at the end of the month | 16:29 |
mordred | jeblair: although I do have python code that knows how to get all of the information nova knows about a node and put it into json format ... | 16:29 |
openstackgerrit | Matt Riedemann proposed a change to openstack-infra/elastic-recheck: Update query for pip timeout bug 1270710 https://review.openstack.org/119121 | 16:30 |
uvirtbot | Launchpad bug 1270710 in openstack-ci "sporadic pip timeouts during download" [Medium,Incomplete] https://launchpad.net/bugs/1270710 | 16:30 |
clarkb | mordred: do we need cinder volumes for these nodes as well? | 16:30 |
*** cnesa has quit IRC | 16:30 | |
mordred | clarkb: looking | 16:30 |
mordred | clarkb: /dev/mapper/main-mirror 148G 106G 43G 72% /srv/static/mirror | 16:31 |
mordred | clarkb: we're using 106G right now | 16:31 |
clarkb | oh wow activating services gets you a /24 network and a router now | 16:31 |
clarkb | mordred: ok I will activate cinder too | 16:31 |
*** unicell has joined #openstack-infra | 16:31 | |
clarkb | mordred: since I think 1.1 is ephemeral disk for that size of volume or use cinder | 16:31 |
*** jlibosva has quit IRC | 16:31 | |
mordred | clarkb: actually, we may not need that much | 16:32 |
*** jergerber has joined #openstack-infra | 16:32 | |
mordred | I'm checking how much our old mirror takes | 16:32 |
*** greghayn1 is now known as greghaynes | 16:33 | |
*** jpich has quit IRC | 16:33 | |
clarkb | oh volume comes as part of compute | 16:33 |
clarkb | that is nice. This seems to be a lot simpler than it was in the past | 16:33 |
*** radez is now known as radez_g0n3 | 16:34 | |
zaro | morning | 16:34 |
*** amcrn has joined #openstack-infra | 16:34 | |
clarkb | mordred: I am creating the ci launch file for that account and region now, but once that is done you should be able to start building a node and attaching volumes to it and stuff | 16:34 |
*** adalbas has joined #openstack-infra | 16:35 | |
*** unicell has quit IRC | 16:35 | |
clarkb | mordred: and that's done. I think we should be ready to add nodes (assuming puppet is ready, we may need to update puppet first /me looks) | 16:37 |
*** annegent_ has joined #openstack-infra | 16:37 | |
*** dane_leblanc has joined #openstack-infra | 16:37 | |
mordred | clarkb: awesome | 16:37 |
*** radez_g0n3 is now known as radez | 16:38 | |
*** unicell has joined #openstack-infra | 16:38 | |
*** annegentle has quit IRC | 16:38 | |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Split pypi_mirror into its own class https://review.openstack.org/119124 | 16:38 |
clarkb | mordred: oh good you are already on the puppet | 16:38 |
mordred | clarkb: yeah, I'm working the puppet | 16:39 |
clarkb | mordred: also as part of this move we should consider moving the dfw mirror off of static | 16:39 |
*** annegent_ has quit IRC | 16:39 | |
clarkb | but that can happen when we have >1 of these | 16:40 |
mordred | clarkb: actually, I think we want to create a new dfw mirror called pypi.DFW.openstack.org | 16:40 |
clarkb | ++ | 16:40 |
mordred | and then go back and make pypi.openstack.org a load balancer across the per-region mirrors | 16:40 |
*** pradk has joined #openstack-infra | 16:40 | |
*** bhuvan_ has quit IRC | 16:40 | |
clarkb | mordred: we have 15GB of RAM quota in 1.1 currently. is that big enough to start? | 16:42 |
mordred | absolutely | 16:42 |
mordred | we probably only need 4 | 16:42 |
clarkb | I figured, but double checking | 16:43 |
*** yfried_ has joined #openstack-infra | 16:43 | |
clarkb | and we have 3TB of volume quota which should be plenty | 16:43 |
openstackgerrit | Matt Riedemann proposed a change to openstack-infra/elastic-recheck: Add query for gate-devstack-bashate xen functions fails https://review.openstack.org/119126 | 16:44 |
*** homeless has quit IRC | 16:44 | |
*** homeless has joined #openstack-infra | 16:45 | |
*** sballe has joined #openstack-infra | 16:45 | |
clarkb | mordred: do you want to put a site.pp entry in that change or do a follow up change for pypi\..*\.openstack.org ? | 16:46 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Split pypi_mirror into its own class https://review.openstack.org/119124 | 16:47 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Add entry for new PyPI mirrors https://review.openstack.org/119129 | 16:47 |
mordred | clarkb: you mean like that ^^ ? | 16:47 |
*** esker has quit IRC | 16:48 | |
clarkb | mordred: yup, though our other regexes use ^ and $ anchors | 16:48 |
clarkb | maybe we should add those for consistency across the site.pp | 16:48 |
*** bdpayne has joined #openstack-infra | 16:49 | |
mordred | clarkb: on it | 16:49 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Add entry for new PyPI mirrors https://review.openstack.org/119129 | 16:49 |
clarkb | and you saw my include apache comment (or noticed that independently) | 16:50 |
mordred | I did notice it and fixed in the updated patch above | 16:50 |
clarkb | one more thing on 119129 | 16:51 |
*** derekh has quit IRC | 16:51 | |
*** unicell has quit IRC | 16:52 | |
mordred | *headdesk* | 16:53 |
*** tonytan4ever has quit IRC | 16:53 | |
*** unicell has joined #openstack-infra | 16:54 | |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Add entry for new PyPI mirrors https://review.openstack.org/119129 | 16:54 |
clarkb | mordred: do you want to try spinning up a node in hpcloud using that proposed change before we merge it? | 16:55 |
clarkb | mordred: since merging may be somewhat slow today. the launch node script should have a puppet environment flag that we can use to point at a dev env for puppet when spinning up the node | 16:56 |
mordred | clarkb: I do, and I shall | 16:56 |
clarkb | mordred: oh wait | 16:56 |
clarkb | mordred: we need the node and volume to exist before puppet runs? | 16:56 |
mordred | we do? | 16:56 |
mordred | oh. volume. | 16:56 |
mordred | how about this | 16:56 |
mordred | how about I put in mounting the volume into the puppet | 16:56 |
mordred | since it's going to go on /srv | 16:57 |
mordred | oh - but launch_node doesn't know how to do that does it? | 16:57 |
clarkb | mordred: and just let that bit fail until the volume is present? | 16:57 |
clarkb | mordred: no, this is one of those use cases where openstack breaks ;) | 16:58 |
mordred | well, no. that's not really true | 16:58 |
mordred | ansible can do this _Fine_ | 16:58 |
clarkb | mordred: but only because ansible does openstacks job for it (the spin up and attachment of the volume) | 16:58 |
mordred | but my ansible patches for this aren't quite ready yet | 16:58 |
clarkb | mordred: openstack exists to do this for me | 16:58 |
clarkb | ansible shouldn't have to do it or puppet | 16:58 |
mordred | clarkb: I have come to believe that openstack disagrees with you | 16:58 |
mordred | and that our attempts to make openstack agree with us are futile | 16:59 |
clarkb | its similar to the floating ip deletion issue. openstack should manage the state for the user | 16:59 |
clarkb | mordred: jgriffith wrote a blueprint to do this for us iirc | 16:59 |
mordred | right. but it doesn't and never will | 16:59 |
mordred | shrug | 16:59 |
clarkb | mordred: not that that means it will get implemented | 16:59 |
clarkb | but cinder seems to think this is something that it can do for the user | 16:59 |
mordred | I will assume that openstack does not do this until $long_future | 16:59 |
mordred | which means I need to do it | 16:59 |
mordred | which means I'm going to work on making sure my tools know how to do it | 16:59 |
mordred | at which point I will cease caring if openstack fixes it | 16:59 |
clarkb | right I only bring it up as a thing that we should probably try to communicate back to the projects as we find them | 17:00 |
clarkb | I feel like our use cases aren't crazy | 17:00 |
mordred | yah. totally | 17:00 |
mordred | although first I'd like to communicate that the API version parameter to glanceclient is bonghits | 17:00 |
mordred | especially since you cannot count on getting an API version value back from the keystone catalog | 17:00 |
mordred | and if you do, it's not a value that you can pass to the first param of glanceclient.Client | 17:00 |
*** bhuvan has joined #openstack-infra | 17:01 | |
clarkb | :( | 17:01 |
mordred | so, before I get people to fix hard things | 17:01 |
*** isviridov is now known as isviridov_away | 17:01 | |
mordred | I want to make the very very very basic things work without feeling like I'm stabbing myself in the eye | 17:01 |
mordred | getting openstack to mount a volume at the right time? totally given up on that idea | 17:01 |
clarkb | so to handle this with puppet as is, the simple thing may be to spin up the node against the default node in site.pp. Add volume, mount it etc, then repuppet against correct thing | 17:01 |
*** markvan has joined #openstack-infra | 17:01 | |
mordred | clarkb: well, it's actually possible to associate a nova instance with a cinder volume at boot time | 17:02 |
clarkb | or make puppet/bandersnatch fail until /srv has a volume under it | 17:02 |
clarkb | mordred: but not format it | 17:02 |
mordred | so it's possible as currently stands to do this in one stab | 17:02 |
mordred | clarkb: I'm pretty sure puppet can format it | 17:02 |
mordred | anyway - I need to jump on a call, I will propose more patches in 1 hour | 17:02 |
jeblair | mordred: i'm in a similar situation | 17:03 |
mordred | jeblair: phone call? or chicken/egg cinder? | 17:03 |
jeblair | mordred: phone call | 17:03 |
mordred | jeblair: gross | 17:04 |
*** mjturek has left #openstack-infra | 17:04 | |
*** mrmartin has quit IRC | 17:04 | |
*** chuck_ has joined #openstack-infra | 17:05 | |
*** mrmartin has joined #openstack-infra | 17:06 | |
*** mrmartin has quit IRC | 17:07 | |
*** tsg has quit IRC | 17:07 | |
*** jlibosva has joined #openstack-infra | 17:08 | |
*** tsg has joined #openstack-infra | 17:09 | |
*** elixor has quit IRC | 17:09 | |
mordred | clarkb: btw - I _have_ all of the code to do all of these things in a sane way in other places - but there's no way that I can get that bundled up and submitted sanely in time to do this | 17:09 |
*** johnthetubaguy is now known as zz_johnthetubagu | 17:09 | |
clarkb | mordred: I am reading docs, it looks like nova claims to be able to format things | 17:11 |
clarkb | mordred: I have no idea if our clouds support that or anything | 17:11 |
mordred | clarkb: link? | 17:11 |
*** pelix has quit IRC | 17:12 | |
mordred | clarkb: I was honestly just assuming an exec { mkfs stanza with an 'onlyif => not mounted' | 17:12 |
clarkb | mordred: I did `nova help boot` and the output under --block-device has a format= option | 17:12 |
mordred | great. I'll poke at that | 17:12 |
*** melwitt has joined #openstack-infra | 17:12 | |
clarkb | I am going to write a quick patch to launch node that will at least attach a preexisting volume | 17:12 |
*** elixor has joined #openstack-infra | 17:12 | |
mordred | clarkb: cool | 17:13 |
*** Longgeek_ has joined #openstack-infra | 17:16 | |
*** e0ne has quit IRC | 17:17 | |
*** jlibosva has quit IRC | 17:18 | |
*** luqas has quit IRC | 17:18 | |
*** Longgeek has quit IRC | 17:19 | |
*** chuck_ has quit IRC | 17:19 | |
*** hdd has quit IRC | 17:19 | |
*** harlowja has joined #openstack-infra | 17:20 | |
anteaya | my heart still isn't with me yet and I just went for a walk to see if I could reset it, walked past the continually backing up equipment two doors down to watch them drive over the rocks they had just placed pushing them out of alignment | 17:20 |
*** luqas has joined #openstack-infra | 17:20 | |
*** luqas has quit IRC | 17:21 | |
*** SumitNaiksatam has quit IRC | 17:22 | |
*** yjiang5 has quit IRC | 17:23 | |
*** annegentle has joined #openstack-infra | 17:23 | |
clarkb | mordred: wow this is fun. The cli takes a simple string to figure out how to do the volume attachment. It parses that and creates a dict that is passed to the actual boot call. Tempted to call the parse method directly and give it a string to parse but it is a private method ... | 17:25 |
clarkb | so I get to figure out what all the options do :) | 17:26 |
*** gokrokve has quit IRC | 17:26 | |
*** pcm_ has quit IRC | 17:27 | |
clarkb | I think this falls under your glance client complaint. it is hard to do things | 17:27 |
*** pballand has quit IRC | 17:28 | |
*** amotoki has quit IRC | 17:29 | |
*** pballand has joined #openstack-infra | 17:29 | |
*** reed has joined #openstack-infra | 17:30 | |
*** pballand has quit IRC | 17:30 | |
*** mpaolino has quit IRC | 17:30 | |
*** MaxV has joined #openstack-infra | 17:31 | |
*** pcm_ has joined #openstack-infra | 17:31 | |
*** yjiang5 has joined #openstack-infra | 17:31 | |
sdague | yeh.... glance.... | 17:32 |
clarkb | sdague: jogo: do you know if I have to provide a boot_index value? | 17:33 |
clarkb | it doesn't look like the client provides one in all cases | 17:33 |
sdague | clarkb: honestly, I don't | 17:33 |
clarkb | ok it doesn't look like I need to | 17:35 |
*** MaxV has quit IRC | 17:36 | |
*** marun_afk has quit IRC | 17:36 | |
*** SumitNaiksatam has joined #openstack-infra | 17:36 | |
*** AaronGreen has left #openstack-infra | 17:37 | |
*** gokrokve has joined #openstack-infra | 17:38 | |
*** dtantsur is now known as dtantsur|afk | 17:39 | |
*** doude has joined #openstack-infra | 17:39 | |
*** rushiagr_away is now known as rushiagr | 17:40 | |
clarkb | I am also looking at guest_format and that seems to only be used for ephemeral and swap | 17:40 |
clarkb | so it may be a noop operation on our persistent volumes | 17:40 |
*** Ryan_Lane has joined #openstack-infra | 17:40 | |
*** dizquierdo has quit IRC | 17:42 | |
*** annegentle has quit IRC | 17:42 | |
*** amuller has quit IRC | 17:42 | |
*** mpaolino has joined #openstack-infra | 17:42 | |
clarkb | it's weird because this seems to be unsanitized user data that ends up in the nova db but I don't really grok nova internals so am probably wrong | 17:42 |
*** annegentle has joined #openstack-infra | 17:43 | |
*** melwitt has quit IRC | 17:43 | |
*** melwitt has joined #openstack-infra | 17:43 | |
*** packet has quit IRC | 17:45 | |
*** lcheng_ has joined #openstack-infra | 17:46 | |
*** packet has joined #openstack-infra | 17:46 | |
dstufft | clarkb: hey, is there any way to get a VM like what the tests run in spun up for manual inspection? I'm working on a CR thing and it's failing tests, but I can't repro it :| | 17:47 |
sdague | clarkb: there is quite a bit of unsanitized data pass through | 17:47 |
openstackgerrit | A change was merged to openstack/requirements: Update netaddr to 0.7.12 version for IPv6 https://review.openstack.org/118224 | 17:47 |
clarkb | sdague: uhm ok | 17:47 |
*** annegentle has quit IRC | 17:47 | |
sdague | clarkb: we're trying to fix that :) | 17:47 |
clarkb | dstufft: yup | 17:47 |
clarkb | dstufft: give me a minute to find links | 17:48 |
dstufft | clarkb: cool, thanks sir! | 17:48 |
openstackgerrit | James E. Blair proposed a change to openstack-infra/nodepool: Record provider/region/az in /etc/nodepool https://review.openstack.org/119138 | 17:48 |
clarkb | sdague: ok good :) it's just odd that we would have a parameter that seems to only exist for nova's internal benefit then completely expose it to client apis | 17:48 |
*** markmcclain has joined #openstack-infra | 17:48 | |
clarkb | dstufft: which test is failing? is it an integration test or unittest or? | 17:48 |
clarkb | dstufft: we have 2 varieties of slaves and slightly different ways to build each one | 17:49 |
dstufft | clarkb: gate-barbican-py27 and gate-barbican-py26, same tests are failing in the same way on both of those, so only one is enough | 17:49 |
*** e0ne has joined #openstack-infra | 17:50 | |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/config: Add support to launch node for attaching volumes https://review.openstack.org/119143 | 17:51 |
jeblair | clarkb, mordred: use topic 'mirror' ? | 17:51 |
clarkb | mordred: ^ completely untested | 17:51 |
clarkb | jeblair: sure | 17:51 |
clarkb | dstufft: and I take it tox -repy27 was the thing that did not reproduce | 17:52 |
*** yjiang5 has quit IRC | 17:52 | |
dstufft | clarkb: yea | 17:52 |
*** tonytan4ever has joined #openstack-infra | 17:53 | |
clarkb | dstufft: https://git.openstack.org/cgit/openstack-infra/config/tree/modules/openstack_project/files/nodepool/scripts/prepare_node_bare.sh run that script out of /etc/nodepool/scripts with the other contents of that dir in git checked out to that path. Do this on a trusty base image | 17:55 |
clarkb | dstufft: note that is quite destructive so don't do it on a node you care about | 17:55 |
dstufft | clarkb: on any trusty install? like just one I spin up using the default rackspace images? or do I need to spin up an openstack specific server somehow | 17:56 |
clarkb | dstufft: default rackspace trusty PVHVM is what we use | 17:56 |
clarkb | dstufft: so that should work fine | 17:56 |
*** mrmartin has joined #openstack-infra | 17:56 | |
dstufft | ok cool | 17:56 |
dstufft | thanks a lot! | 17:56 |
*** annegentle has joined #openstack-infra | 17:56 | |
clarkb | dstufft: we run this prepare node scriptage on that base image then take a snapshot which makes our special purpose built images | 17:56 |
dstufft | clarkb: ok, makes sense | 17:57 |
openstackgerrit | Dan Prince proposed a change to openstack-infra/config: Update RH1 to use net-label instead of net-id https://review.openstack.org/119154 | 17:57 |
clarkb | dstufft: we should have disk image builder working in the near future as well. which will make this much easier | 17:57 |
dstufft | clarkb: btw, pinged christian about getting my PR into bandersnatch, I want to get that done and released before pip 1.6, so pip 1.6 can do the normalization stuff | 17:57 |
clarkb | cool | 17:57 |
*** emagana has joined #openstack-infra | 17:58 | |
*** e0ne has quit IRC | 17:58 | |
*** eharney has quit IRC | 17:58 | |
*** mpaolino has quit IRC | 17:59 | |
clarkb | ok I am going to switch back to logstash worker stuff | 18:00 |
*** 6A4AAKDJG has quit IRC | 18:02 | |
*** rkukura has joined #openstack-infra | 18:02 | |
*** _nadya_ has joined #openstack-infra | 18:03 | |
mordred | ok. off the phone | 18:04 |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/config: Add support to launch node for attaching volumes https://review.openstack.org/119143 | 18:04 |
clarkb | mordred: ^ now with a bugfix, but still untested | 18:04 |
clarkb | mordred: and best I could test the format option is only meaningful if attaching an ephemeral device or swap | 18:05 |
clarkb | otherwise nova appears to just put your data in the db and move along | 18:05 |
mordred | clarkb: that's spectacular | 18:05 |
openstackgerrit | Adam Gandelman proposed a change to openstack-infra/devstack-gate: Set required devstack variables for Ironic+grenade https://review.openstack.org/116761 | 18:06 |
*** e0ne has joined #openstack-infra | 18:06 | |
clarkb | mordred: for a future improvement we can have boot support >1 volume but I need to figure out how attach order works and all that | 18:06 |
clarkb | since order matters if the config mgmt side is going to format and mount | 18:06 |
*** annegentle has quit IRC | 18:06 | |
openstackgerrit | Doug Hellmann proposed a change to openstack-infra/config: Move notifications for pycadf from oslo to keystone https://review.openstack.org/119157 | 18:06 |
*** koolhead17 has joined #openstack-infra | 18:07 | |
*** Ajaeger1 has joined #openstack-infra | 18:07 | |
*** andreykurilin_ has joined #openstack-infra | 18:08 | |
mordred | clarkb: well, you can specify device name | 18:09 |
mordred | clarkb: so you can say, I believe "id=blah,device=vdc" | 18:09 |
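The "simple string parsed into a dict" behaviour clarkb describes can be illustrated with a small standalone parser. This is not novaclient's private parse method, just a sketch of the same idea; the function name `parse_block_device` is hypothetical and the example string is the one mordred gives above.

```python
def parse_block_device(spec):
    """Turn a 'key=value,key=value' string into a dict of options."""
    options = {}
    for part in spec.split(","):
        key, sep, value = part.partition("=")
        key = key.strip()
        if not sep or not key:
            raise ValueError("expected key=value, got %r" % part)
        options[key] = value.strip()
    return options


print(parse_block_device("id=blah,device=vdc"))
```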
openstackgerrit | James E. Blair proposed a change to openstack-infra/config: Configure pip with a per-region mirror https://review.openstack.org/119158 | 18:10 |
clarkb | mordred: you can, but I wasn't sure how that affected type and bus if at all | 18:10 |
*** dkranz has joined #openstack-infra | 18:10 | |
mordred | nod | 18:10 |
mordred | clarkb: well, I actually want to replace launch_node anyway - but not today | 18:10 |
clarkb | ya if ansible can do this stuff in a coordinated manner I would be on board with replacement :) | 18:11 |
clarkb | but this seems relatively low impact and solves the immediate issue? | 18:11 |
*** che-arne has quit IRC | 18:13 | |
*** koolhead_ has joined #openstack-infra | 18:13 | |
*** eharney has joined #openstack-infra | 18:13 | |
*** ociuhandu has quit IRC | 18:13 | |
sdague | clarkb: can you file a bug about what you expect should happen here vs. what's happening? | 18:14 |
clarkb | sdague: there is one already in cinder, but its a blueprint I think | 18:14 |
sdague | clarkb: ok | 18:14 |
morganfainberg | dhellmann, ah thanks! | 18:14 |
clarkb | sdague: long story short when I nova boot a node with a volume I shouldn't have to have an intermediate step to format that filesystem | 18:14 |
sdague | ah, right | 18:14 |
sdague | yes, agreed | 18:14 |
dhellmann | morganfainberg: our channel is noisy enough these days without you guys making it worse! ;-) | 18:14 |
clarkb | sdague: and that appears to be the case for ephemeral storage but not persistent storage so it is half solved | 18:15 |
morganfainberg | dhellmann, hehe | 18:15 |
clarkb | sdague: the biggest reason this is important is that for something like /var/log you will have a really hard time doing that intermediate step cleanly | 18:15 |
*** otherwiseguy has quit IRC | 18:15 | |
*** koolhead17 has quit IRC | 18:15 | |
*** Ajaeger1 has quit IRC | 18:15 | |
sdague | clarkb: yep, agree | 18:15 |
sdague | I wonder if it should actually be in the nova side instead of the cinder side | 18:16 |
*** emagana has quit IRC | 18:16 | |
clarkb | sdague: I think nova needs to pass guest_format to cinder, cinder then formats and does attachment and nova will need to edit /etc/fstab | 18:16 |
clarkb | or the hypervisor or something. so its a bit of both | 18:16 |
sdague | so nova actually does the attach | 18:17 |
*** emagana has joined #openstack-infra | 18:17 | |
sdague | cinder just produces the target | 18:17 |
clarkb | in that case I think part of producing target is a format step, though I suppose nova could do it too if it knows where the target is | 18:17 |
*** paulrad has joined #openstack-infra | 18:17 | |
sdague | yeh | 18:17 |
*** pcm_ has quit IRC | 18:17 | |
*** baoli has quit IRC | 18:18 | |
*** pcm_ has joined #openstack-infra | 18:18 | |
*** annegentle has joined #openstack-infra | 18:18 | |
*** otherwiseguy has joined #openstack-infra | 18:18 | |
clarkb | jeblair: ok I am further convinced that popen + crm will do the correct thing when stdin is closed | 18:18 |
clarkb | I figure if I can convince myself on two separate days it must be reasonable :P | 18:19 |
clarkb | reviewing the nodepool change now | 18:19 |
clarkb | mordred: can you set your topic to mirror on your changes? | 18:19 |
*** emagana has quit IRC | 18:21 | |
clarkb | jeblair: for 119158 should we set a secondary index at pypi.openstack.org to ease the transition? | 18:22 |
clarkb | actually pip may not do it gracefully if the first one completely fails (as it would in the case where one region didn't have a local mirror) | 18:22 |
clarkb | dstufft: ^ | 18:22 |
clarkb | jogo: partial ncpu seems to be failing a bit more than usual? am I imagining that? | 18:23 |
*** jpeeler has quit IRC | 18:25 | |
*** ZZelle_ has joined #openstack-infra | 18:25 | |
*** otherwiseguy has quit IRC | 18:25 | |
*** amcrn has quit IRC | 18:25 | |
*** amuller has joined #openstack-infra | 18:26 | |
*** e0ne has quit IRC | 18:29 | |
*** doude has quit IRC | 18:29 | |
*** emagana has joined #openstack-infra | 18:29 | |
*** Sincler has quit IRC | 18:30 | |
*** amuller has quit IRC | 18:32 | |
*** e0ne has joined #openstack-infra | 18:32 | |
*** e0ne has quit IRC | 18:35 | |
openstackgerrit | Matt Riedemann proposed a change to openstack-infra/elastic-recheck: Add query for glanceclient/requests bug 1364893 https://review.openstack.org/119164 | 18:35 |
uvirtbot | Launchpad bug 1364893 in python-glanceclient "New version of requests library breaks unit tests" [Undecided,In progress] https://launchpad.net/bugs/1364893 | 18:35 |
*** tomoe_ has quit IRC | 18:36 | |
clarkb | hrm we seem to be leaking nodes again /me looks at nodepool | 18:36 |
openstackgerrit | Adam Gandelman proposed a change to openstack-infra/config: Set virt driver for sideways ironic job https://review.openstack.org/119166 | 18:37 |
adam_g | clarkb, ^ quick fix for the sideways job | 18:37 |
*** nikhil__1 is now known as nikhil | 18:38 | |
clarkb | significant numbers of rax nodes in delete state for >1 hour | 18:38 |
*** nikhil is now known as nikhil_k | 18:38 | |
clarkb | I do not see any building images so I don't think building images is at fault | 18:38 |
*** e0ne has joined #openstack-infra | 18:39 | |
clarkb | adam_g: oh cool we can ditch libvirt for that job? | 18:39 |
*** nikhil_k is now known as nikhil__ | 18:40 | |
adam_g | clarkb, sort of. we still end up using libvirt indirectly, but nova uses the ironic driver | 18:40 |
clarkb | manual deletion seemed to work | 18:41 |
*** markmcclain has quit IRC | 18:41 | |
*** yjiang5 has joined #openstack-infra | 18:41 | |
*** markmcclain has joined #openstack-infra | 18:41 | |
*** nikhil__ is now known as nikhilkomawar | 18:42 | |
*** ianw has quit IRC | 18:42 | |
openstackgerrit | Matt Riedemann proposed a change to openstack-infra/elastic-recheck: Add query for test_volume_boot_pattern TypeError https://review.openstack.org/119168 | 18:42 |
clarkb | nova show on a node says vm_state is building and task_state is deleting | 18:42 |
clarkb | jogo: sdague ^ does that mean nova is trying to build the node, hit an error, and is now deleting the node? | 18:43 |
dprince | clarkb: I would expect it to be in ERROR state if that occurred | 18:43 |
clarkb | dprince: ok, status is BUILD not ERROR | 18:43 |
dprince | clarkb: did it timeout in BUILD, and then was deleted? | 18:44 |
*** paulrad has quit IRC | 18:44 | |
dprince | clarkb: perhaps got stuck building or something? | 18:44 |
*** jp_at_hp has quit IRC | 18:44 | |
fungi | jeblair: if you're talking about generally useful nodepool metadata to stick on each node, then i'd also add the snapshot image name/uuid and nova instance uuid. both useful to regurgitate in jobs for later debugging | 18:44 |
*** paulrad has joined #openstack-infra | 18:44 | |
*** jp_at_hp has joined #openstack-infra | 18:44 | |
fungi | oh, json blob of nova info solves that. i should read *all* scrollback before responding to parts of it | 18:44 |
clarkb | dprince: hrm that may be a red herring. nova show on a bunch of other nodes comes back with no node | 18:45 |
anteaya | fungi: then you would never respond | 18:45 |
*** jp_at_hp has quit IRC | 18:45 | |
jeblair | clarkb, fungi, mordred: https://review.openstack.org/#/c/119096/ | 18:45 |
*** yolanda has quit IRC | 18:45 | |
clarkb | jeblair: fungi: so is the symptom that nodepool tries to delete the node, it doesn't delete immediately, but the cron never runs to pick up on the eventual deletion? | 18:45 |
dprince | clarkb: cool, yeah that sounds like an odd set of states | 18:45 |
*** dane_leblanc has quit IRC | 18:47 | |
*** nikhilkomawar is now known as nikhil_k | 18:48 | |
*** paulrad has quit IRC | 18:48 | |
ttx | fungi: hi! Looks like we are piling up "deleting"s again? | 18:49 |
* ttx didn't read scrollback | 18:49 | |
*** nikhil_k is now known as nikhilk | 18:49 | |
*** nikhilk is now known as nikhilk_ | 18:50 | |
*** nikhilk_ is now known as nikhil__ | 18:50 | |
*** nikhil__ is now known as nikhil___ | 18:50 | |
openstackgerrit | Dmitry Teselkin proposed a change to openstack-infra/config: Dependencies to install python-cinderclient https://review.openstack.org/119066 | 18:50 |
fungi | ttx: yeah, i was at lunch and clarkb started looking at it | 18:51 |
*** sdake has quit IRC | 18:51 | |
*** pcrews has quit IRC | 18:51 | |
*** nikhil___ is now known as nikhil_k | 18:52 | |
ttx | fungi: ok | 18:52 |
clarkb | the last starting periodic cleanup was at 1556 | 18:52 |
clarkb | so ~3 hours ago | 18:53 |
clarkb | fungi: mostly trying to help but probably rediscovering all the stuff you learned :) | 18:53 |
*** [HeOS] has joined #openstack-infra | 18:54 | |
ttx | I know SergeyLukjanov did some manual cleanup that helped mitigate the symptoms and give the queue some air to breathe | 18:54 |
ttx | last time it occurred | 18:54 |
clarkb | ya I think if the node doesn't actually exist we can remove it from the db | 18:55 |
ttx | he also noted that all stuck nodes were actually rax | 18:55 |
ttx | that's about the extent of my knowledge of the issue, fungi is probably lightyears ahead in his analysis | 18:56 |
*** dkranz has quit IRC | 18:56 | |
*** jpeeler has joined #openstack-infra | 18:57 | |
fungi | where lightyears is defined as "i ate lunch and tried not to think about this stuff for a bit" ;) | 18:57 |
*** penguinRaider has quit IRC | 18:57 | |
*** hdd has joined #openstack-infra | 18:57 | |
fungi | but yeah, sounds like something is causing the periodic cleanup to hang or at least take an unusually long time to complete one cycle... taking another thread dump now to see what it's doing | 18:57 |
*** dane_leblanc has joined #openstack-infra | 18:58 | |
clarkb | http://paste.openstack.org/show/105976/ I see that in the logs | 19:00 |
fungi | latest thread dump is "threaddump4.log" in my homedir on nodepool.o.o | 19:00 |
openstackgerrit | A change was merged to openstack-infra/elastic-recheck: Add query for gate-devstack-bashate xen functions fails https://review.openstack.org/119126 | 19:01 |
clarkb | but that is for image cleanup | 19:01 |
fungi | clarkb: yeah, we were seeing those before in the logs too, corresponding to deleting an image template instance | 19:01 |
fungi | clarkb: but it does come at the end of the last periodic cleanup before the one which started and hasn't ceased | 19:02 |
fungi | 2014-09-04 15:55:36,989 DEBUG nodepool.NodePool: Finished periodic cleanup | 19:02 |
fungi | 2014-09-04 15:56:00,001 DEBUG nodepool.NodePool: Starting periodic cleanup | 19:02 |
fungi | crickets | 19:02 |
*** r1chardj0n3s is now known as r1chardj0n3s_afk | 19:02 | |
clarkb | yup that one happened right before the finish log there | 19:02 |
anteaya | well at least my dentist called to check on me, which was nice of him | 19:03 |
clarkb | but it is in the periodic cleanup | 19:03 |
clarkb | anteaya: it might be a trap, they want you to visit again | 19:03 |
fungi | nah, that's why they plant the remote mind control devices in your fillings | 19:03 |
fungi | phone calls and appointment cards were too unreliable | 19:04 |
*** markmcclain has quit IRC | 19:04 | |
anteaya | clarkb: no, my heart isn't doing what it should be, I called to let them know and he called back to check on me | 19:05 |
clarkb | anteaya: oh no :( | 19:05 |
anteaya | clarkb: he is a good dentist | 19:05 |
anteaya | clarkb: yeah, not happy on so many levels, trying to focus on reviews | 19:05 |
*** lcheng_ has quit IRC | 19:06 | |
*** r1chardj0n3s_afk is now known as r1chardj0n3s | 19:06 | |
*** otherwiseguy has joined #openstack-infra | 19:06 | |
clarkb | fungi: Thread: Thread-60572 (140496667784960) is the periodic cleanup thread best I can tell | 19:08 |
clarkb | I am going to do another dump and see if that thread has moved | 19:08 |
fungi | clarkb: yeah, and it's in a wait call, but not sure if that's unusual | 19:08 |
clarkb | fungi: and specifically stuck on a wait to delete an image | 19:09 |
clarkb | not a regular node | 19:09 |
*** terryw has joined #openstack-infra | 19:09 | |
clarkb | so the stuff above may be related? | 19:09 |
fungi | possibly | 19:09 |
clarkb | is it sigusr1 for thread dump or 2? | 19:09 |
fungi | 2 | 19:09 |
fungi | 108 NodeLauncher threads, 88 NodeDeleter threads, 12 NodeCompleteThread threads, 7 NodeUpdateListener threads, 2 Gearman threads | 19:10 |
fungi | threads for each provider, each target, and one each of APScheduler, DiskImageBuilder, MainThread, NodePool, plus a couple of generic Thread-NNNN paramiko threads, one of which is our cleanup thread | 19:11 |
*** aysyd has quit IRC | 19:11 | |
clarkb | ya it looks like it is still stuck on that getServer call to cleanup an image | 19:11 |
*** otherwiseguy has quit IRC | 19:11 | |
*** MaxV has joined #openstack-infra | 19:11 | |
clarkb | so the getServer task isn't coming back | 19:12 |
*** lttrl has quit IRC | 19:12 | |
fungi | is the number in parens next to each thread its start time in epoch subseconds (to some precision?) if so the cleanup thread works out to being started on july 10th this year which doesn't sound right at all given it's been restarted plenty | 19:13 |
*** _nadya_ has quit IRC | 19:14 | |
*** emagana has quit IRC | 19:16 | |
clarkb | fungi: it is thread id | 19:16 |
clarkb | for thread_id, stack_frame in sys._current_frames().items(): | 19:16 |
*** emagana has joined #openstack-infra | 19:16 | |
clarkb | the first bit is the human readable thread name | 19:16 |
fungi | ahh, yep, just found the dumper routine | 19:17 |
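For reference, a minimal sketch of a dumper along the lines quoted above. It is built around the same `sys._current_frames()` loop; the function names and the choice of stderr as the output are assumptions for illustration, not nodepool's actual routine.

```python
import signal
import sys
import threading
import traceback


def format_thread_dump():
    """Return the name, id, and current stack of every live thread."""
    # Map Python thread ids back to the human readable thread names.
    names = {t.ident: t.name for t in threading.enumerate()}
    lines = []
    for thread_id, stack_frame in sys._current_frames().items():
        name = names.get(thread_id, "unknown")
        lines.append("Thread: %s (%d)" % (name, thread_id))
        for entry in traceback.format_stack(stack_frame):
            lines.append(entry.rstrip("\n"))
    return "\n".join(lines)


def install_dump_handler(signum=signal.SIGUSR2):
    # Assumption: SIGUSR2 triggers the dump, as answered in the channel.
    def handler(sig, frame):
        sys.stderr.write(format_thread_dump() + "\n")

    signal.signal(signum, handler)
```

The number in parentheses is the Python thread identifier, which is why it does not decode to a sensible start time.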
*** devoid has joined #openstack-infra | 19:17 | |
openstackgerrit | Dolph Mathews proposed a change to openstack-infra/elastic-recheck: add query for mysql server has gone away https://review.openstack.org/119177 | 19:17 |
clarkb | fungi: it is putting the task on a queue then waiting for the task to complete | 19:18 |
clarkb | so either we are never executing the task off of the queue, it is failing and throwing in a way that doesn't signal the wait, or we are waiting on a response? or? | 19:19 |
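The "put a task on a queue, then wait for it" pattern being discussed can be sketched roughly as follows. This is an illustration under stated assumptions, not nodepool's task manager: the `Task` class, `worker` function, and timeout behaviour are hypothetical. It shows how signalling completion in a `finally` block keeps a failing task from leaving the caller waiting.

```python
import queue
import threading


class Task:
    """A unit of work that a caller can block on until a worker runs it."""

    def __init__(self, func):
        self._func = func
        self._done = threading.Event()
        self._result = None
        self._error = None

    def run(self):
        try:
            self._result = self._func()
        except Exception as e:
            self._error = e
        finally:
            # Always signal, even on failure, so wait() cannot hang on an error.
            self._done.set()

    def wait(self, timeout=None):
        if not self._done.wait(timeout):
            raise TimeoutError("task did not complete in time")
        if self._error is not None:
            raise self._error
        return self._result


def worker(tasks):
    # Run queued tasks one at a time; None is the shutdown marker.
    while True:
        task = tasks.get()
        if task is None:
            break
        task.run()
```

With this shape, a caller that waits forever implies the task was never taken off the queue or the call inside it has not returned.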
*** mrmartin has quit IRC | 19:19 | |
fungi | right, just trying to figure out how to tell when comparing one thread dump to the next whether that thread is still stuck on the same instance of the same call, or merely performing another identical action (perhaps one which it spends most of its time doing over and over) | 19:19 |
clarkb | oh good question, maybe strace? | 19:20 |
clarkb | though python threads probably don't present themselves nicely to strace do they? | 19:21 |
fungi | i suspect they're all lumped in the same process | 19:21 |
fungi | looking | 19:21 |
*** emagana has quit IRC | 19:21 | |
fungi | nope, i just see a pause call on the process which never returns | 19:22 |
*** Sukhdev has joined #openstack-infra | 19:22 | |
fungi | aha! | 19:23 |
fungi | pthreads can have their id substituted as you would a process id | 19:23 |
jeblair | fungi, clarkb: the iad, ord, and dfw managers are all running tasks and have 0 queue | 19:24 |
*** packet has quit IRC | 19:24 | |
jeblair | fungi, clarkb: the hpcloud managers have queues | 19:25 |
fungi | jeblair: those are the threads like Thread: rax-iad (140494351386368) presumably? | 19:25 |
fungi | aha, you can tell it by self.not_empty.wait() -> waiter.acquire() | 19:25 |
fungi | my eyes had totally glazed over there | 19:26 |
clarkb | jeblair: so the queue was emptied but the condition was not set | 19:26 |
clarkb | ? | 19:26 |
clarkb | which seems odd as the task manager and task framework catch all the things | 19:26 |
jeblair | clarkb: i'm not drawing any conclusions, i'm just supplying info :) | 19:26 |
*** aysyd has joined #openstack-infra | 19:27 | |
*** Longgeek_ has quit IRC | 19:29 | |
jeblair | fungi: are you able to strace the individual thread? | 19:30 |
fungi | jeblair: i believe so, but the kernel's thread id doesn't seem to match up to anything in the thread dump so i'm having trouble mapping to the correct value | 19:31 |
mtreinish | anteaya: do you know what's going on with this brocade ci system: https://review.openstack.org/#/c/119060/ | 19:31 |
fungi | pthreads ids the kernel knows about for the nodepoold process are in /proc/5340/task/ | 19:31 |
*** markmcclain has joined #openstack-infra | 19:32 | |
*** mpaolino has joined #openstack-infra | 19:32 | |
*** r1chardj0n3s is now known as r1chardj0n3s_afk | 19:33 | |
openstackgerrit | James E. Blair proposed a change to openstack-infra/nodepool: Add debug messages to periodic cleanup https://review.openstack.org/119181 | 19:33 |
*** mwagner_lap has quit IRC | 19:33 | |
*** esker has joined #openstack-infra | 19:34 | |
*** mpaolino has quit IRC | 19:34 | |
fungi | looks like i need to compile a little c to map these up | 19:35 |
clarkb | fungi: oh no | 19:36 |
*** [HeOS] is now known as HeOS | 19:36 | |
*** esker has quit IRC | 19:36 | |
jeblair | fungi: oh, i thought it had to be done within the thread? | 19:36 |
clarkb | fungi: do you need the python-dbg interpreter? | 19:36 |
jeblair | fungi: can you do it from outside? | 19:36 |
fungi | jeblair: ahh, looks like probably no | 19:37 |
*** StevenK has quit IRC | 19:38 | |
*** rushiagr is now known as rushiagr_away | 19:39 | |
openstackgerrit | Matt Riedemann proposed a change to openstack-infra/elastic-recheck: Add some notes to the readme about queries that don't hit https://review.openstack.org/119184 | 19:39 |
fungi | oh, maybe... http://blog.devork.be/2010/09/finding-linux-thread-id-from-within.html | 19:39 |
anteaya | mtreinish: I do not, I am looking now | 19:39 |
fungi | i'm assuming the constant i'd pass in there is the python thread id? | 19:39 |
fungi | (the one from the thread dump) | 19:40 |
jeblair | fungi: no, it's the syscall value for "get the current thread id" | 19:40 |
fungi | oh, poo | 19:40 |
openstackgerrit | A change was merged to openstack-infra/elastic-recheck: Remove resolved fingerprints https://review.openstack.org/117916 | 19:40 |
jeblair | fungi: so that code is what we'd add to nodepool to have a thread know its own tid | 19:40 |
fungi | right, got it | 19:40 |
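A minimal sketch of a thread reporting its own kernel thread id, as described above. It assumes Python 3.8 or newer, where `threading.get_native_id()` exists; the linked post instead issues the gettid syscall through ctypes, which is what a Python 2 era daemon would need. The helper names are hypothetical.

```python
import threading


def record_native_id(results, label):
    # Must run inside the thread of interest: the value returned is the
    # kernel thread id, the same number listed under /proc/<pid>/task/ on Linux.
    results[label] = threading.get_native_id()


def native_ids_by_name(labels):
    """Start one thread per label and collect each thread's kernel id."""
    results = {}
    workers = [
        threading.Thread(target=record_native_id, args=(results, label), name=label)
        for label in labels
    ]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return results
```

Logging this id next to the thread name at thread start would give the mapping needed to point strace at a single thread.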
anteaya | mtreinish: is this the first you noticed them commenting on tempest? | 19:40 |
*** r1chardj0n3s_afk is now known as r1chardj0n3s | 19:41 | |
openstackgerrit | Matt Riedemann proposed a change to openstack-infra/elastic-recheck: Add some notes to the readme about queries that don't hit https://review.openstack.org/119184 | 19:41 |
mordred | jeblair: launch_node.py does not work on HP Cloud | 19:42 |
*** emagana has joined #openstack-infra | 19:42 | |
clarkb | mordred: oh? because of the network stuff? | 19:42 |
mtreinish | anteaya: I noticed it the other day I think, but that's the first time I noticed multiple comments on the same rev | 19:42 |
mordred | yup | 19:42 |
openstackgerrit | James E. Blair proposed a change to openstack-infra/nodepool: Add debug messages to periodic cleanup https://review.openstack.org/119181 | 19:42 |
anteaya | mtreinish: how much time are you willing to give me to find and talk to them before I disable their system? | 19:43 |
*** StevenK has joined #openstack-infra | 19:43 | |
jroll | sdague: have a moment to take a look at https://review.openstack.org/#/c/118507/ please? | 19:43 |
anteaya | so far I have been given a name in -neutron and no response, I haven't emailed to -announce yet | 19:43 |
openstackgerrit | Ken Giusti proposed a change to openstack-infra/config: Add a test to verify oslo.messaging's AMQP 1.0 messaging protocol support https://review.openstack.org/115752 | 19:43 |
mtreinish | anteaya: it doesn't really bug me, its just a few extra emails, I just thought I'd point it out | 19:43 |
* mordred is torn between fixing launch_node or finishing the replacement for it real quick | 19:43 | |
jeblair | fungi, clarkb: so taking what we know from the debug log... | 19:44 |
anteaya | mtreinish: kk, thanks I appreciate you letting me know, if it gets worse do tell me | 19:44 |
jeblair | fungi, clarkb: it looks like with the current backlog, the hpcloud providers take about 3 minutes to get through the queue to process a request | 19:44 |
jeblair | fungi, clarkb: and i think it takes 3 requests to delete a server | 19:44 |
jeblair | fungi, clarkb: so if the periodic thread is deleting hpcloud servers, it could take it 9 minutes to do each one | 19:45 |
sdague | jroll: looks fine to me | 19:45 |
jeblair | fungi, clarkb: oh, but actually node deletes are parallel | 19:45 |
jroll | sdague: thanks | 19:45 |
jogo | clarkb: I dug into that a little bit, and it looked like a few issues including failures on the old side | 19:45 |
jeblair | fungi, clarkb: it spawns new threads to actually do the deletes | 19:45 |
clarkb | jeblair: parallel across providers but each provider is serial right? | 19:46 |
jogo | clarkb: I think it fails a lot because there is a 2x chance of hitting an old bug in stable nova-compute | 19:46 |
jeblair | clarkb: yeah, but we're wondering why the periodic thread is waiting on something | 19:46 |
jeblair | clarkb: and i don't think it should be waiting on a normal server delete | 19:46 |
*** pcrews has joined #openstack-infra | 19:46 | |
clarkb | jeblair: oh I see, it is waiting on an image delete if that makes a difference | 19:46 |
jeblair | oh it's deleting an image | 19:46 |
fungi | i just compared stack traces from half an hour apart, and any node ids with deleter threads in the first dump no longer had deleter threads in the second dump, fwiw. so it's at least not hanging indefinitely on one particular node id | 19:46 |
*** MaxV has quit IRC | 19:47 | |
fungi | but i assume those are the original (not periodic retry) deletes | 19:47 |
*** Sincler has joined #openstack-infra | 19:47 | |
jeblair | fungi: they'd look the same | 19:47 |
mordred | clarkb, jeblair: I'm going to hack in an auto-floating-ip for now | 19:47 |
jeblair | fungi: the periodic thread just spawns new normal server delete threads | 19:48 |
fungi | oh, okay. so the periodic cleanup still spawns separate threads for each delete | 19:48 |
mordred | but I think this afternoon I'm going to do the other thing | 19:48 |
fungi | got it | 19:48 |
jeblair | fungi: image ids are already logged as it processes | 19:48 |
jeblair | grep "Deleting image" debug.log | 19:48 |
fungi | yep | 19:49 |
jroll | clarkb: when you have a free moment, could you look at https://review.openstack.org/#/c/118507/ ? :) | 19:49 |
jroll | (not a huge rush) | 19:49 |
fungi | just confirming that their individual deleter threads did not hang around, in that case | 19:49 |
*** weshay has quit IRC | 19:50 | |
*** paulrad has joined #openstack-infra | 19:50 | |
clarkb | it looks like 2-3 minutes is normal for image delete which may also explain how I ended up catching that in the thread dump | 19:51 |
jeblair | fungi, clarkb: so based on that grep, i think the periodic cleanup thread is working | 19:52 |
*** arnaud has joined #openstack-infra | 19:52 | |
fungi | clarkb: right, each of the three rax regions currently has several failed devstack-f20-virt-preview images undergoing deletion as they continually loop and fail to complete successfully | 19:52 |
clarkb | jeblair: it is working, but slowly? | 19:53 |
*** r1chardj0n3s is now known as r1chardj0n3s_afk | 19:53 | |
fungi | so chances of catching a deletion thread for one is pretty high | 19:53 |
clarkb | since we can see it doesn't start very frequently | 19:53 |
jeblair | clarkb: it started more frequently earlier | 19:53 |
jeblair | i'm just saying it's moving now | 19:53 |
*** terryw has quit IRC | 19:54 | |
fungi | right, it was starting once a minute and completing within 5-10 seconds, then at 15:56 it started a round which has not returned yet | 19:54 |
jeblair | https://etherpad.openstack.org/p/Chdns7OZek | 19:54 |
jeblair | it has chosen to delete many more images this time | 19:56 |
*** bdpayne has quit IRC | 19:56 | |
fungi | ohhhhh | 19:56 |
fungi | yeah i missed that it jumped from 15:36->15:55 for the previous pass too | 19:57 |
*** MaxV has joined #openstack-infra | 19:57 | |
*** mriedem has quit IRC | 19:57 | |
clarkb | jeblair: fungi: oh interesting | 19:57 |
fungi | grep -e 'Deleting image id:' -e 'periodic cleanup' /var/log/nodepool/debug.log | 19:57 |
*** bdpayne has joined #openstack-infra | 19:57 | |
fungi | basically what's there in the etherpad though | 19:57 |
*** baoli has joined #openstack-infra | 19:58 | |
fungi | so node deletes are spun off into parallel tasks by the periodic cleanup, but image deletes are serialized within the periodic cleanup instead? | 19:58 |
*** datsun180b has quit IRC | 19:58 | |
jeblair | fungi: yep | 19:58 |
*** bhuvan_ has joined #openstack-infra | 19:58 | |
*** bhuvan_ has quit IRC | 19:59 | |
*** bhuvan_ has joined #openstack-infra | 19:59 | |
fungi | that totally 'splains it | 19:59 |
jeblair | because they don't happen very often. and it's the periodic cleanup anyway, it's our last ditch. | 19:59 |
clarkb | would it be reasonable to use a separate cron for that? | 19:59 |
clarkb | or spin off threads? | 19:59 |
jeblair | clarkb: but why? isn't the real issue that the initial rax deletes didn't happen? | 19:59 |
clarkb | jeblair: or they may have failed | 20:00 |
jeblair | i mean, the periodic cleanup isn't going to have any more success until the underlying problem is solved. | 20:00 |
*** unicell has quit IRC | 20:00 | |
clarkb | jeblair: right, but the underlying problem may be out of our control and nodepool should be defensive | 20:00 |
jeblair | clarkb: it is :) | 20:00 |
*** baoli has quit IRC | 20:00 | |
jeblair | clarkb: i'm actually more interested in why it chose to delete more images this run | 20:00 |
*** craigbr has quit IRC | 20:01 | |
*** baoli has joined #openstack-infra | 20:01 | |
*** liusheng has quit IRC | 20:01 | |
*** bhuvan has quit IRC | 20:01 | |
fungi | worth noting though that 15:30-16:00 or thereabouts will roughly coincide with new image update completion, which expires the original images. but we directly call image delete when we rotate those out rather than waiting for the periodic cleanup right? | 20:01 |
clarkb | my initial guess is that we were able to rebuild new images for those regions this time around | 20:01 |
*** liusheng has joined #openstack-infra | 20:01 | |
*** mpaolino has joined #openstack-infra | 20:01 | |
jeblair | ah, that makes sense | 20:01 |
jeblair | fungi: image deletion only happens in the periodic thread | 20:02 |
*** liusheng has quit IRC | 20:02 | |
fungi | three image updates completed around 15:30, and then a bunch more happened just after that, but the first cleanup was busy deleting those few and didn't catch the rest until almost 1600 | 20:02 |
jeblair | so yeah, basically we just made a bunch of new images for the first time in 3 days, and are expiring the old ones. actually, probably the 2nd time in three days since we keep n-1. | 20:02 |
fungi | so goes my theory anyway | 20:02 |
anteaya | whoever gets to it first, please disable 10624pattabi-ayyasami-ciBrocade ADX cinerama pattabi.ayyasami@gmail.com | 20:02 |
*** pcm_ has quit IRC | 20:03 | |
fungi | anteaya: done | 20:03 |
jeblair | anteaya: done | 20:03 |
mtreinish | jogo: do you have a good way to measure commit proposal freq. for an individual project? | 20:03 |
anteaya | thank you | 20:03 |
fungi | double-tap to the head | 20:03 |
mtreinish | it looks like graphite shows just the total for all of gerrit | 20:03 |
*** aysyd has quit IRC | 20:03 | |
jeblair | clarkb: okay, so we could parallelize image deletes like we do node deletes | 20:04 |
*** otherwiseguy has joined #openstack-infra | 20:04 | |
jeblair | clarkb: there's a bunch of complicated machinery around that, but i suppose it's probably directly translatable | 20:04 |
jeblair | and it seems to be working well enough for node deletes | 20:04 |
clarkb | jeblair: or we could split the crons to keep the underlying machinery the same | 20:04 |
clarkb | jeblair: that is probably simpler but less "correct" | 20:04 |
jogo | mtreinish: not that I know of | 20:05 |
*** emagana has quit IRC | 20:05 | |
jeblair | clarkb: depends on whether parallel image deletes are a good thing on their own. | 20:05 |
*** hashar has joined #openstack-infra | 20:05 | |
*** emagana has joined #openstack-infra | 20:05 | |
*** emagana has quit IRC | 20:05 | |
*** emagana has joined #openstack-infra | 20:06 | |
*** marcoemorais has quit IRC | 20:06 | |
mtreinish | jogo: ok, I'll figure something out | 20:06 |
fungi | given that an image delete takes somewhere between 5-10 minutes at the moment, depending on provider, it seems sane to separate those tasks at any rate (whether separate serialized queue or parallelized individually) | 20:06 |
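One way the "parallelized individually" option could look, as a rough sketch only. The function name, the `delete_image` callable, and the error bookkeeping are hypothetical and do not reflect nodepool's existing deleter machinery. The idea is that the periodic cleanup starts one thread per image and moves on instead of waiting several minutes for each delete in turn.

```python
import threading


def delete_images_in_parallel(image_ids, delete_image):
    """Start one deleter thread per image; return the threads and an error map."""
    errors = {}

    def run(image_id):
        try:
            # delete_image is a caller-supplied callable (hypothetical here).
            delete_image(image_id)
        except Exception as e:
            # Record the failure so a later cleanup pass can retry this image.
            errors[image_id] = e

    threads = [
        threading.Thread(target=run, args=(image_id,), name="ImageDeleter-%s" % image_id)
        for image_id in image_ids
    ]
    for t in threads:
        t.start()
    # The caller decides whether to join; the periodic cleanup need not block.
    return threads, errors
```

A separate serialized queue for image deletes would be the simpler alternative mentioned in the discussion; this sketch only covers the per-image thread variant.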
*** marcoemorais has joined #openstack-infra | 20:06 | |
jeblair | clarkb, fungi: so the speculation is rax failed to delete and we're falling back on the periodic delete. that probably means that at least once the problem is corrected, manual node deletes should speed things up until the periodic cleanup finishes its current run | 20:07 |
*** marcoemorais has quit IRC | 20:07 | |
*** marcoemorais has joined #openstack-infra | 20:07 | |
clarkb | jeblair: yup, should I script up some manual node deletion? | 20:07 |
*** marcoemorais has quit IRC | 20:07 | |
jeblair | fungi: may already have that ready? | 20:07 |
fungi | i'll run it now | 20:08 |
clarkb | great | 20:08 |
clarkb | fungi: is that a script or just a one liner? | 20:08 |
*** marcoemorais has joined #openstack-infra | 20:08 | |
jeblair | i will get lunch, then resume mirror work and start on parallel image deletes | 20:08 |
clarkb | I think I end up running a list | grep delete | sort > file, clean it up because ancient nodes that time out then for loop over file content | 20:08 |
fungi | very, very ugly one-liner to list deleting state nodes older than a particular number of minutes and split them into 10 silos for separate delete loops | 20:08 |
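The selection and splitting steps described above can be sketched like this. The node record fields (`id`, `state`, `age_minutes`) are assumptions for illustration, not the real output format of the nodepool list command, and the actual one-liner is not shown in the log.

```python
def stale_deleting_nodes(nodes, min_age_minutes):
    """Return sorted ids of nodes in delete state for at least min_age_minutes."""
    return sorted(
        n["id"]
        for n in nodes
        if n["state"] == "delete" and n["age_minutes"] >= min_age_minutes
    )


def split_into_silos(items, count=10):
    """Deal items round-robin into count lists, one per delete loop."""
    return [items[i::count] for i in range(count)]
```

Each silo can then be handed to its own loop that issues the manual delete for every id it contains.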
*** otherwiseguy has quit IRC | 20:09 | |
clarkb | kk just checking your hack isn't better than mine :) | 20:09 |
clarkb | ok I am going to get food as well so I can continue to help with mirrors after | 20:09 |
fungi | should free up here in a moment | 20:09 |
clarkb | mordred: anything you need from us to make node launching work? | 20:10 |
mordred | clarkb: nope. working on it | 20:10 |
*** devoid has quit IRC | 20:10 | |
*** amuller has joined #openstack-infra | 20:11 | |
cinerama | anteaya: is it possible that your client is tab completing from tabs in the output you're copying and pasting? i keep getting highlights from you | 20:12 |
*** doug-fish has left #openstack-infra | 20:12 | |
anteaya | cinerama: it is entirely possible | 20:13 |
anteaya | I couldn't figure out why your nick keeps showing up in my pastes | 20:13 |
clarkb | heh ci<tab>? | 20:13 |
anteaya | that would be it, yes | 20:13 |
anteaya | suggestions on how I fix that? | 20:14 |
anteaya | it is only when I copy from the ssh output of the ci list that it happens | 20:14 |
anteaya | cinerama: or could you block pings from me? | 20:14 |
clarkb | anteaya: you should fix it on your side because it is updating your data | 20:15 |
fungi | anteaya: ssh output from the gerrit api? probably has embedded spaces. try |tr '\t' ' ' | 20:15 |
fungi | er, has embedded tabs | 20:16 |
*** aysyd has joined #openstack-infra | 20:16 | |
anteaya | fungi: it does yes | 20:16 |
*** kgiusti has left #openstack-infra | 20:16 | |
fungi | piping it through tr like that should convert them to spaces | 20:16 |
anteaya | 10624 pattabi-ayyasami-ci Brocade ADX CI pattabi.ayyasami@gmail.com | 20:17 |
anteaya | yay, that worked | 20:17 |
anteaya | thanks fungi | 20:17 |
anteaya | sorry cinerama | 20:17 |
cinerama | np, glad you worked it out | 20:18 |
*** dprince has quit IRC | 20:18 | |
*** ociuhandu has joined #openstack-infra | 20:18 | |
*** hashar has quit IRC | 20:18 | |
anteaya | cinerama: thanks for solving that mystery, I couldn't understand where cinerama was coming from | 20:18 |
openstackgerrit | A change was merged to openstack-infra/config: Add gertty release jobs https://review.openstack.org/119096 | 20:18 |
*** bhuvan_ has quit IRC | 20:19 | |
*** devoid has joined #openstack-infra | 20:19 | |
*** tsg has quit IRC | 20:22 | |
*** david-lyle is now known as david-lyle_afk | 20:22 | |
*** david-lyle_afk has quit IRC | 20:23 | |
*** bhuvan has joined #openstack-infra | 20:23 | |
*** devoid1 has joined #openstack-infra | 20:23 | |
*** flaper87 is now known as flaper87|afk | 20:24 | |
*** devoid has quit IRC | 20:24 | |
*** hdd has quit IRC | 20:25 | |
*** elixor has quit IRC | 20:26 | |
*** elixor has joined #openstack-infra | 20:27 | |
openstackgerrit | Jeremy Stanley proposed a change to openstack-infra/nodepool: Record provider/region/az in /etc/nodepool https://review.openstack.org/119138 | 20:27 |
*** david-lyle_afk has joined #openstack-infra | 20:28 | |
*** julim has quit IRC | 20:28 | |
*** yamamoto_ has quit IRC | 20:28 | |
*** devoid1 has quit IRC | 20:28 | |
*** paulrad has quit IRC | 20:29 | |
*** tsg has joined #openstack-infra | 20:29 | |
*** paulrad has joined #openstack-infra | 20:29 | |
*** david-lyle_afk has quit IRC | 20:29 | |
*** david-lyle_afk has joined #openstack-infra | 20:30 | |
*** david-lyle_afk has quit IRC | 20:30 | |
*** lcheng_ has joined #openstack-infra | 20:30 | |
*** david-lyle_afk has joined #openstack-infra | 20:30 | |
*** mpaolino has quit IRC | 20:32 | |
*** david-lyle has joined #openstack-infra | 20:32 | |
*** david-lyle_afk has quit IRC | 20:32 | |
*** devoid has joined #openstack-infra | 20:33 | |
clarkb | fungi: do you know why ^ removes the network info from the test fixture? | 20:34 |
clarkb | er thats just a fake.yaml? | 20:34 |
*** paulrad has quit IRC | 20:34 | |
*** pcrews has quit IRC | 20:36 | |
fungi | clarkb: yeah, looks like none of the tests were using the networks there | 20:36 |
*** doude has joined #openstack-infra | 20:37 | |
*** r1chardj0n3s_afk is now known as r1chardj0n3s | 20:38 | |
*** bhuvan has quit IRC | 20:38 | |
*** cipcosma has quit IRC | 20:40 | |
fungi | fatal: unable to access 'https://review.openstack.org/p/openstack-infra/jenkins-job-builder/': Empty reply from server | 20:40 |
fungi | ick | 20:40 |
fungi | seen in a gate-config-layout job | 20:40 |
fungi | git complaints about "Empty reply from server" are new to me | 20:40 |
clarkb | huh why is it talking to review.o.o? | 20:41 |
fungi | hrm, also... why does the layout job clone from gerrit? | 20:41 |
fungi | jinx | 20:41 |
fungi | looking into it now | 20:41 |
*** tgohad has joined #openstack-infra | 20:41 | |
fungi | btw that hit your 119143 volume attachment change, but it's also suffering from a legit pep8 issue | 20:42 |
fungi | (at least as legit as whitespace issues can be) | 20:42 |
*** doug-fish has joined #openstack-infra | 20:42 | |
mordred | volume attachment produces strange errors | 20:42 |
mordred | still working through it | 20:42 |
clarkb | mordred: do you want to update my change then? | 20:43 |
mordred | clarkb: no, not yet | 20:43 |
mordred | clarkb: but I will | 20:43 |
mordred | clarkb: if I ever figure out what the hell is up | 20:43 |
clarkb | ok, should I fix pep8 now? | 20:43 |
mordred | clarkb: sure. | 20:43 |
*** tsg has quit IRC | 20:43 | |
mordred | clarkb: why do we generate a new keypair every time? | 20:44 |
*** bookwar has quit IRC | 20:45 | |
clarkb | mordred: because we only need it that first time iirc and you need something for first login | 20:45 |
clarkb | mordred: after that it's useless. we could use a long lived one but then you have to manage state somewhere | 20:45 |
mordred | clarkb: no, I mean, why don't we just use the keypair that's associated with the operating account? | 20:46 |
mordred | clarkb: because right now, debugging failures is a bit of a pain | 20:46 |
*** yjiang5 is now known as yjiang5_away | 20:46 | |
clarkb | mordred: because then you have to set that up is the only reason I think | 20:46 |
clarkb | mordred: it's a state problem | 20:46 |
*** imcsk8_ has quit IRC | 20:46 | |
*** imcsk8 has joined #openstack-infra | 20:47 | |
mordred | clarkb: ok. so there is not a reason per-se that we want to avoid doing such a thing | 20:47 |
clarkb | mordred: I don't think so | 20:47 |
*** yjiang5_away has quit IRC | 20:47 | |
clarkb | fungi: oh because it runs that tools script | 20:48 |
*** koolhead_ has quit IRC | 20:48 | |
clarkb | fungi: we should update to use git.o.o | 20:48 |
fungi | clarkb: yeah, i'm working on a patch | 20:48 |
fungi | clarkb: we should actually update it to use the local clones and then update from git.o.o | 20:49 |
fungi | or use zuul cloner | 20:49 |
clarkb | fungi: maybe? I do like I can run it locally without any fuss | 20:50 |
openstackgerrit | Jeremy Stanley proposed a change to openstack-infra/config: Use git farm for layout job cloning https://review.openstack.org/119203 | 20:50 |
fungi | that ^ then | 20:50 |
clarkb | and since that [ -d checks we can have zuul cloner do it in our jobs but have local runs use git.o.o directly | 20:51 |
fungi | should also be self-testing | 20:51 |
clarkb | so I think I like that with updated job later | 20:51 |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/config: Add support to launch node for attaching volumes https://review.openstack.org/119143 | 20:52 |
fungi | yeah, zuul cloner would allow us to rip out a fair amount of this, but really we could just git grep the config repo for random uses of git clone and i'm betting most of them could be solved with git cloner consistently | 20:52 |
clarkb | mordred: fungi ^ now with pep8 compliant formatting | 20:52 |
fungi | er, with zuul cloner consistently | 20:52 |
*** mrda1 is now known as mrda | 20:53 | |
*** devoid has quit IRC | 20:53 | |
*** devoid has joined #openstack-infra | 20:53 | |
openstackgerrit | A change was merged to openstack-infra/elastic-recheck: Add query for glanceclient/requests bug 1364893 https://review.openstack.org/119164 | 20:53 |
uvirtbot | Launchpad bug 1364893 in python-glanceclient "New version of requests library breaks unit tests" [Undecided,In progress] https://launchpad.net/bugs/1364893 | 20:54 |
fungi | in fact, the layout job could probably be rewritten to use grenade-like logic, since it is effectively an upgrade integration job | 20:54 |
openstackgerrit | A change was merged to openstack-infra/elastic-recheck: Add query for test_volume_boot_pattern TypeError https://review.openstack.org/119168 | 20:54 |
*** bookwar has joined #openstack-infra | 20:54 | |
openstackgerrit | A change was merged to openstack-infra/elastic-recheck: Update query for pip timeout bug 1270710 https://review.openstack.org/119121 | 20:54 |
uvirtbot | Launchpad bug 1270710 in openstack-ci "sporadic pip timeouts during download" [Medium,Incomplete] https://launchpad.net/bugs/1270710 | 20:54 |
mordred | clarkb: ok. it just flat out does not attach the volume | 20:55 |
clarkb | mordred: huh | 20:56 |
clarkb | mordred: and it isn't that my code is bad? | 20:56 |
mordred | clarkb: unknown yet | 20:56 |
mordred | clarkb: I also don't know why make_swap.sh is failing | 20:56 |
clarkb | because the likelihood of my code being bad seems to be high | 20:56 |
mordred | clarkb: I believe what I'm going to do is do a volume-attach after the boot | 20:56 |
clarkb | mordred: you might try setting the boot_index number | 20:57 |
mordred | clarkb: because $derp | 20:57 |
clarkb | mordred: I thought it wasn't required but it may be. try -1 or something like 3 | 20:57 |
fungi | mordred: make_swap.sh may need additional logic to deal with device names in different providers/platforms | 20:57 |
mordred | clarkb: I think that I'm just going to email jogo and sdague and tell them that novaclient makes me die inside | 20:57 |
mordred | fungi: it has it | 20:57 |
fungi | k | 20:57 |
clarkb | mordred: that works too | 20:57 |
*** marcoemorais has quit IRC | 20:57 | |
clarkb | mordred: because I did die a little trying to sort out how to make this work | 20:57 |
jogo | mordred: I have a bug I just filed | 20:57 |
jogo | well two: | 20:58 |
jogo | bug 1365251 | 20:58 |
uvirtbot | Launchpad bug 1365251 in python-novaclient "TypeError: __init__() got an unexpected keyword argument 'retry_after'" [High,Confirmed] https://launchpad.net/bugs/1365251 | 20:58 |
*** emagana has quit IRC | 20:58 | |
*** david-lyle has quit IRC | 20:58 | |
jogo | bug 1202179 | 20:58 |
uvirtbot | Launchpad bug 1202179 in python-novaclient "findall in novaclient/base.py is inefficient" [Undecided,In progress] https://launchpad.net/bugs/1202179 | 20:58 |
* mordred removes jogo from his naughty list for now | 20:58 | |
*** david-lyle has joined #openstack-infra | 20:58 | |
sdague | mordred: you think it doesn't make me feel the same? :) | 20:58 |
*** bhuvan has joined #openstack-infra | 20:59 | |
mordred | sdague: why does it exist as a piece of code at all? its only purpose in life seems to be to make python developers sad | 20:59 |
jogo | mordred: yeah it's terrible | 20:59 |
jogo | mordred: that sounds about right | 20:59 |
mordred | ok. well, if it's trying to make python developers sad, it's awesome | 20:59 |
sdague | mordred: you did see that I consider the unified sdk/client to be one of the top 5 project priorities for next cycle, right? | 20:59 |
jogo | if you delete instances by name and not by UUID you may have hit the second bug | 20:59 |
*** emagana has joined #openstack-infra | 21:00 | |
clarkb | sdague: I saw that | 21:00 |
*** asselin has joined #openstack-infra | 21:00 | |
clarkb | sdague: also logging which makes me happy | 21:00 |
*** arnaud has quit IRC | 21:00 | |
dtroyer_zz | mordred, et al: https://github.com/stackforge/python-openstacksdk https://github.com/openstack/python-openstackclient/ | 21:00 |
*** gokrokve has quit IRC | 21:00 | |
dtroyer_zz | sdague: I saw that too (sdk/client). thanks | 21:01 |
*** gokrokve has joined #openstack-infra | 21:01 | |
*** marcoemorais has joined #openstack-infra | 21:01 | |
mordred | dtroyer_zz: the examples dir from openstacksdk makes my eyes bleed | 21:01 |
*** marcoemorais has quit IRC | 21:01 | |
*** marcoemorais has joined #openstack-infra | 21:02 | |
mordred | dtroyer_zz: are there examples that show me how to do things like, you know create servers with ip addresses? | 21:02 |
dtroyer_zz | mordred: I gave up on that and kept my own…need to get it publicly visible again | 21:02 |
clarkb | mordred: in particular the need to pass a REST api path to a create method arg | 21:02 |
*** marcoemorais has quit IRC | 21:02 | |
mordred | dtroyer_zz: because this: https://github.com/stackforge/python-openstacksdk/blob/master/examples/create.py is even worse than python-novaclient's interface | 21:02 |
*** marcoemorais has joined #openstack-infra | 21:02 | |
dtroyer_zz | the SDK is really still learning to crawl. | 21:03 |
* mordred cries inside | 21:03 | |
* fungi votes we strap that toddler to a rocket sled | 21:04 | |
*** gokrokve has quit IRC | 21:04 | |
dtroyer_zz | I've been focusing on client stuff so I'm working top-down | 21:04 |
*** gokrokve has joined #openstack-infra | 21:04 | |
clarkb | mordred: actually you should test if you can do the volume attach with nova cli | 21:04 |
*** gokrokve has quit IRC | 21:04 | |
clarkb | mordred: since that *should* work and I didn't write that code | 21:04 |
*** gokrokve has joined #openstack-infra | 21:05 | |
*** annegentle has quit IRC | 21:05 | |
*** asselin has quit IRC | 21:05 | |
dtroyer_zz | so I'm going to scrape this conversation for a priority list of things to look at in OSC and make sure they're sane…thanks for the input ;) | 21:05 |
*** mfainberg_phone has joined #openstack-infra | 21:06 | |
mordred | dtroyer_zz: os-sdk working would make me very happy ... at the moment, the amount of boilerplate I have to write to do simple things in python is kinda amazing | 21:06 |
jogo | dtroyer_zz: are you sleep talking? | 21:06 |
dtroyer_zz | jogo: yes…znc hasn't done that to me in a while, not sure why now | 21:06 |
openstackgerrit | James E. Blair proposed a change to openstack-infra/nodepool: Paralellize image deletes https://review.openstack.org/119208 | 21:07 |
jeblair | clarkb, fungi: ^ | 21:07 |
clarkb | mordred: `nova boot --block-device id=someuuid,source=volume,dest=volume,shutdown=preserve` | 21:07 |
clarkb | mordred: if that works then my code is bad and probably needs to set additional defaults that the private parsing of ^ that string sets | 21:07 |
kevinbenton | I just had two neutron patches that finally made it to close enough to the front of the gate to test and they failed with a sahara error: https://jenkins02.openstack.org/job/gate-tempest-dsvm-neutron-large-ops/34598/console | 21:08 |
kevinbenton | is this a known issue? | 21:08 |
*** pcrews has joined #openstack-infra | 21:09 | |
jeblair | kevinbenton: it wasn't a sahara error, that message is harmless (but annoying) | 21:10 |
jeblair | kevinbenton: http://logs.openstack.org/49/113749/8/gate/gate-tempest-dsvm-neutron-large-ops/fc0d900/logs/devstacklog.txt.gz | 21:10 |
kevinbenton | jeblair: oh okay | 21:10 |
kevinbenton | jeblair: so a prereq install issue? | 21:10 |
jeblair | kevinbenton: yeah, looks like the network issues we've been seeing between hpcloud and the pypi mirror in rax | 21:10 |
jeblair | kevinbenton: we're actually working on setting up per-region mirrors to alleviate that right now | 21:11 |
*** che-arne has joined #openstack-infra | 21:11 | |
kevinbenton | jeblair: ah. dang. i’ll miss FF now :-( | 21:11 |
mordred | jeblair, clarkb, fungi: do we use make_swap.sh on nodepool nodes? | 21:11 |
clarkb | mordred: no, I believe d-g handles that if swap is necessary? | 21:12 |
*** emagana has quit IRC | 21:12 | |
clarkb | though nodepool ready script may I guess | 21:12 |
mordred | ok. because the script is broken on hp and I was wondering how much we care | 21:12 |
*** emagana has joined #openstack-infra | 21:12 | |
jeblair | yeah, devstack-gate does it for test nodes, and launch-node is probably lagging that due to the no-hp-hosts thing :) | 21:13 |
*** aysyd has quit IRC | 21:13 | |
jeblair | mordred: if it needs fixing, copying from devstack-gate will probably work | 21:13 |
anteaya | kevinbenton: yes, you will but you are a good candidate for a FFE | 21:13 |
*** devoid has quit IRC | 21:14 | |
anteaya | kevinbenton: talk to mestery or whoever is curating the neutron FFE list and get yourself added | 21:14 |
anteaya | I was talking to ttx before he signed off for the night and he is well aware of the gate status | 21:14 |
*** marun_afk has joined #openstack-infra | 21:14 | |
kevinbenton | anteaya: ok | 21:14 |
sdague | do we know if zookeeper is actually getting installed on unit test nodes? Because the zk unit tests for nova are getting skipped. | 21:14 |
openstackgerrit | A change was merged to openstack-infra/config: Reduce min-ready https://review.openstack.org/118930 | 21:14 |
clarkb | sdague: I can check | 21:15 |
sdague | it might be that we need other libs | 21:15 |
fungi | kevinbenton: i think the general takeaway is that anything which was already approved will try to get squeezed through before j-3 (or if it isn't then it's already past being rejected due to the feature freeze regardless because it's been approved) | 21:15 |
*** cdent has quit IRC | 21:16 | |
mestery | kevinbenton: Talk to me offline, preferably in email | 21:16 |
kevinbenton | fungi: sounds good. i’ll chat with mestery about it | 21:16 |
mestery | kevinbenton: I think I'm heading out for a bit soon | 21:17 |
*** emagana has quit IRC | 21:17 | |
kevinbenton | mestery: just sent you a message | 21:17 |
mestery | kevinbenton: ack | 21:18 |
openstackgerrit | James E. Blair proposed a change to openstack-infra/nodepool: Paralellize image deletes https://review.openstack.org/119208 | 21:18 |
jeblair | clarkb, fungi: ^ now with fewer unrelated changes :) | 21:18 |
jogo | so https://review.openstack.org/#/c/119037/ | 21:18 |
jogo | it fixes a gate bug | 21:18 |
jogo | any way to get it prioritized? | 21:19 |
jogo | bug https://bugs.launchpad.net/devstack/+bug/1365590 | 21:19 |
uvirtbot | Launchpad bug 1365590 in devstack "gate-devstack-bashate failing on E020 in tools/xen/functions" [Critical,In progress] | 21:19 |
jogo | 37 hits in 24 hours | 21:19 |
jogo | fungi: ^ | 21:19 |
jeblair | it only affects devstack changes | 21:19 |
jeblair | (still might be a good idea, just pointing that out) | 21:20 |
fungi | i guess it would become a priority if it unblocks some other integration testing fixes which are affecting more projects | 21:20 |
clarkb | sdague: zookeeper is running on the node I spot checked | 21:20 |
jeblair | (because devstack changes ahead of it are going to cause resets) | 21:20 |
sdague | clarkb: can you check for 2 python modules? | 21:20 |
*** devoid has joined #openstack-infra | 21:21 | |
*** gokrokve has quit IRC | 21:21 | |
clarkb | sdague: sure | 21:21 |
clarkb | but any python modules should be installed as part of the tox run | 21:21 |
clarkb | not as part of the image | 21:21 |
sdague | evzookeeper, zookeeper | 21:22 |
jogo | here is a devstack patch in the gate https://review.openstack.org/117878 | 21:22 |
fungi | 118090,1 and 119037,3 are the only two devstack changes in the gate so far, but that's two resets we could avoid | 21:22 |
jogo | and a second https://review.openstack.org/118069 | 21:22 |
*** Sukhdev has quit IRC | 21:22 | |
jeblair | it also fails quickly | 21:22 |
clarkb | sdague: those are python modules? | 21:22 |
clarkb | sdague: they should be installed by tox | 21:22 |
sdague | yeh, so they aren't right now, I'm seeing why | 21:22 |
clarkb | neither is currently installed in the root of the image | 21:22 |
jogo | fungi: yup, well those will fail each time they are run which can be a lot | 21:22 |
anteaya | when someone feels like it, 10119 varmourci vArmour CI openstack-ci-test@varmour.com can be re-enabled | 21:22 |
fungi | oh, right, 117878,1 is within the window and has already caused damage | 21:23 |
jeblair | it takes < 10 seconds to fail | 21:23 |
jeblair | honestly, i think i'd just keep that one in the back pocket and promote it if we had something else to do too. | 21:24 |
*** elixor has quit IRC | 21:24 | |
jeblair | maybe we should snipe those devstack changes out? | 21:25 |
clarkb | I was going to suggest we could rebase those changes on the fix | 21:25 |
clarkb | which is heavy handed... | 21:25 |
clarkb | but an alternative | 21:25 |
*** e0ne has quit IRC | 21:25 | |
dtroyer_zz | jeblair, jogo: https://review.openstack.org/118090 was the only change I think needs to follow https://review.openstack.org/119037. the rest can wait afaik | 21:26 |
fungi | rebasing them on the fix would snipe them out effectively | 21:26 |
dtroyer_zz | and even there I'm not sure how bad the problem is that it fixes | 21:26 |
jeblair | https://jenkins04.openstack.org/job/gate-tempest-dsvm-neutron-full/1261/console | 21:26 |
jeblair | does that mean the current head is about to fail ^ ? | 21:27 |
*** mbacchi has quit IRC | 21:27 | |
*** mmedvede has quit IRC | 21:27 | |
clarkb | I think so | 21:27 |
*** mfainberg_phone has quit IRC | 21:27 | |
clarkb | those tests appear to have failed | 21:27 |
openstackgerrit | Gregory Haynes proposed a change to openstack-infra/config: Enable debug logging for tripleo ha job https://review.openstack.org/119218 | 21:28 |
clarkb | and tempest has gone sideways | 21:28 |
clarkb | ~half an hour since the first fail | 21:28 |
fungi | yep, could promote that devstack fix at the moment the top change reports and avoid losing too many additional nodes | 21:28 |
*** andreaf has joined #openstack-infra | 21:28 | |
*** devoid has quit IRC | 21:29 | |
jeblair | yeah, i think under the circumstances that would be okay | 21:29 |
jogo | dtroyer_zz: unrelated note http://logs.openstack.org/42/117942/2/gate/gate-tempest-dsvm-large-ops/e3b464a/console.html#_2014-09-03_10_55_26_215 | 21:29 |
jogo | /opt/stack/new/devstack/lib/sahara-dashboard: No such file or directory | 21:29 |
jogo | not causing a failure | 21:29 |
jogo | but odd | 21:29 |
*** yamamoto has joined #openstack-infra | 21:29 | |
clarkb | jogo: it has confused everyone | 21:29 |
clarkb | there is a fix iirc but since it's not a gate fixer it hasn't been promoted? | 21:30 |
dtroyer_zz | jogo: that's what https://review.openstack.org/118090 fixes that I mentioned a minute ago | 21:30 |
jeblair | jogo: https://review.openstack.org/#/c/118090/ | 21:30 |
dtroyer_zz | clarkb: yes | 21:30 |
fungi | i'll keep an eye on that failing job run and as soon as zuul reports on the corresponding change i'll go ahead and promote 119037,3 | 21:30 |
clarkb | and it hasn't caused any failures... | 21:30 |
jogo | dtroyer_zz: doh | 21:30 |
sdague | clarkb: so zookeeper is wrapping the clib | 21:30 |
jeblair | fungi: okay, thanks | 21:30 |
sdague | which is why it's not in test-requirements, otherwise everyone would have to install zookeeper dev library locally | 21:31 |
clarkb | sdague: that will do it | 21:31 |
clarkb | I think tooz is supposed to be native but that probably requires a port | 21:31 |
jogo | I take it you guys have seen "Bad md5 hash for package" | 21:31 |
clarkb | jogo: ya I think it is related to the network trouble between hpcloud and rax mirror | 21:32 |
clarkb | jogo: did that happen on an hpcloud node? | 21:32 |
jogo | http://logs.openstack.org/97/113197/11/gate/gate-nova-python26/2c864dd/console.html#_2014-09-03_23_28_19_854 | 21:32 |
fungi | jogo: yeah, in previous occurrences i've been unable to reproduce the claimed bad checksum from subsequent downloads, suggesting it's network or local storage related issues | 21:32 |
jogo | clarkb: its on rax | 21:32 |
clarkb | oh thats rax-iad | 21:32 |
jogo | message:"Bad md5 hash for package" | 21:32 |
*** devoid has joined #openstack-infra | 21:33 | |
fungi | clarkb: jogo: could that have been on the problem host in iad which got hunted down yesterday? | 21:33 |
clarkb | oh yup | 21:33 |
fungi | probably memory issues | 21:33 |
*** devoid has quit IRC | 21:33 | |
*** alkari has quit IRC | 21:33 | |
clarkb | jogo: we found a bad hypervisor in iad yesterday | 21:33 |
*** Sincler has quit IRC | 21:34 | |
*** david-lyle has quit IRC | 21:34 | |
jogo | clarkb fungi: wow | 21:34 |
jogo | unrelated question: your fix to not run log checker in grenade landed? | 21:34 |
jogo | clarkb: ^ | 21:34 |
*** lcheng_ has quit IRC | 21:34 | |
*** yamamoto has quit IRC | 21:34 | |
clarkb | I don't think so but let me check | 21:34 |
clarkb | https://review.openstack.org/#/c/118753/ | 21:35 |
clarkb | needs more review | 21:35 |
clarkb | sdague: ^ | 21:35 |
clarkb | jeblair: fungi: if we do a promotion promoting ^ would be good too if it gets approved | 21:35 |
fungi | we're about 2 minutes away from the next reset | 21:36 |
fungi | and i'm not a tempest core ;) | 21:36 |
jogo | mtreinish: want to +A ^ | 21:36 |
jogo | +W* | 21:36 |
sdague | clarkb: oh.... ffs | 21:37 |
sdague | fungi: +A | 21:37 |
sdague | promote at will | 21:37 |
fungi | okay, i'll add it to the pile | 21:37 |
mordred | vobj = client.volumes.create_server_volume( | 21:37 |
mordred | server.id, volume, None) | 21:37 |
mtreinish | jogo: sdague beat me to it | 21:37 |
mordred | just for the record | 21:37 |
*** dkehnx1 has joined #openstack-infra | 21:37 | |
openstackgerrit | A change was merged to openstack-infra/nodepool: Record provider/region/az in /etc/nodepool https://review.openstack.org/119138 | 21:37 |
mordred | apparently "auto" mount point is done by passing None as the 3rd arg - but the 3rd arg is required | 21:37 |
sdague | clarkb: seriously... owe you a beer for that find | 21:37 |
clarkb | mordred: client there is cinderclient? | 21:37 |
jogo | fungi clarkb: I saw message:"Bad md5 hash for package" on multiple rax nodes | 21:37 |
jeblair | clarkb: later, we should do something else with that. either devstack-gate should decide whether to run that, or that script should use some tempesty thing to make the decision | 21:37 |
mordred | clarkb: novaclient | 21:37 |
clarkb | sdague: well I did break it so we might be even :) | 21:38 |
jogo | and one hpcloud | 21:38 |
*** devoid has joined #openstack-infra | 21:38 | |
jeblair | clarkb: but we're crossing a line by referencing d-g vars in tempest | 21:38 |
jogo | or two hpcloud | 21:38 |
openstackgerrit | K Jonathan Harker proposed a change to openstack-infra/config: Begin cleaning up bashate failures https://review.openstack.org/118944 | 21:38 |
clarkb | jogo: hpcloud could be networking. were the rax nodes all iad? | 21:38 |
*** lcheng_ has joined #openstack-infra | 21:38 | |
jogo | clarkb: no there is an ord | 21:38 |
clarkb | jeblair: agreed | 21:38 |
sdague | jeblair: yeh, the clean log script should probably leave tempest | 21:38 |
*** annegentle has joined #openstack-infra | 21:38 | |
sdague | it was a convenient place at the time | 21:39 |
clarkb | personally I don't think the clean logs have helped much | 21:39 |
jogo | clarkb: http://tinyurl.com/mlhqju3 | 21:39 |
clarkb | maybe that will change in kilo if we focus on logging | 21:39 |
sdague | clarkb: well, honestly, there are only a couple of logs that are clean | 21:39 |
mordred | clarkb: do you know a way to test if a block device has an FS on it yet? | 21:39 |
*** penguinRaider has joined #openstack-infra | 21:39 | |
sdague | mordred: mount? | 21:39 |
clarkb | sdague: right and we don't seem to be cleaning any of the others | 21:39 |
sdague | clarkb: agreed | 21:39 |
mordred | sdague: just trying mounting it and see if it fails? | 21:39 |
*** marcoemorais has quit IRC | 21:39 | |
clarkb | mordred: ya there is a way with tools like parted iirc | 21:40 |
*** annegentle has quit IRC | 21:40 | |
sdague | mordred: yeh, that's what I'd do | 21:40 |
*** david-lyle has joined #openstack-infra | 21:40 | |
clarkb | sdague: well that only tells you if the fs is not what you wanted | 21:40 |
openstackgerrit | James E. Blair proposed a change to stackforge/gertty: Add tox.ini https://review.openstack.org/119220 | 21:40 |
clarkb | which may be good enough | 21:40 |
sdague | it will give you a bad superblock error if there isn't one | 21:40 |
fungi | okay, that trove change reported, gate reset, and i promoted 119037,3 118753,2 which should appear at the front as soon as the event queue empties | 21:40 |
mordred | sdague: good point | 21:40 |
sdague | or one you can't support | 21:40 |
clarkb | blkid | 21:41 |
clarkb | mordred: ^ | 21:41 |
jeblair | https://jenkins07.openstack.org/job/gate-tempest-dsvm-neutron-heat-slow/4130/console | 21:41 |
*** sdake has joined #openstack-infra | 21:41 | |
jeblair | that failed very quickly on presumably a network error | 21:41 |
*** TravT has joined #openstack-infra | 21:41 | |
jeblair | and not one that we're very likely to be able to work around | 21:41 |
clarkb | mordred: blkid -L somelabelhere | 21:41 |
mordred | clarkb: thanks | 21:41 |
jeblair | (at least, until we have per-region dns servers) | 21:42 |
sdague | clarkb: that's not showing fs on my distro | 21:42 |
*** lcheng_ has quit IRC | 21:42 | |
*** andreykurilin_ has quit IRC | 21:42 | |
fungi | clarkb: sdague: jogo: 119037,3 and 118753,2 are now up front | 21:42 |
clarkb | sdague: it does for me TYPE='ext4' | 21:43 |
*** andreykurilin_ has joined #openstack-infra | 21:43 | |
clarkb | hrm but using -L breaks it | 21:43 |
clarkb | without args it works | 21:43 |
*** marcoemorais has joined #openstack-infra | 21:43 | |
clarkb | fungi: thank you | 21:43 |
clarkb | jeblair: I wonder if unbound is not working on those nodes /me checks | 21:43 |
jeblair | clarkb: oh, that would do it too :( | 21:44 |
mordred | clarkb: doesn't work for me with either with blkid /dev/sda5 | 21:44 |
*** sdake has quit IRC | 21:44 | |
jogo | fungi: thanks | 21:44 |
*** marcoemorais has quit IRC | 21:44 | |
openstackgerrit | Joe Gordon proposed a change to openstack-infra/elastic-recheck: Add fingerprint for bug 1343313 https://review.openstack.org/119222 | 21:44 |
uvirtbot | Launchpad bug 1343313 in openstack-ci "Bad md5 hash for package" [Medium,Triaged] https://launchpad.net/bugs/1343313 | 21:44 |
clarkb | jeblair: seems to be working on a random hpcloud devstack-trusty node | 21:44 |
*** sdake has joined #openstack-infra | 21:44 | |
*** marcoemorais has joined #openstack-infra | 21:44 | |
clarkb | mordred: try it as `blkid` | 21:44 |
*** markmcclain has quit IRC | 21:45 | |
openstackgerrit | A change was merged to stackforge/gertty: Add tox.ini https://review.openstack.org/119220 | 21:45 |
fungi | huh... a puppet apply job just bailed out with: Puppet::Parser::AST::Resource failed with error ArgumentError: Could not find declared class rabbitmq at /etc/puppet/modules/storyboard/manifests/rabbit.pp:29 on node bare-precise-1409840044.template.openstack.org | 21:45 |
mordred | clarkb: gotcha. cool | 21:45 |
clarkb | fungi: it will do that if the install modules script ran on the base image before we added rabbitmq but thats super old iirc | 21:45 |
jeblair | i'm updating my mirror change to reflect the min-ready change that just merged | 21:46 |
jogo | clarkb: do you have a bug for the grenade thing? | 21:46 |
*** marcoemorais has quit IRC | 21:46 | |
jeblair | (merge-check caught it) | 21:46 |
jogo | as I am going to file an e-r patch for it to take those failures out of unclassified | 21:46 |
*** marcoemorais has joined #openstack-infra | 21:46 | |
clarkb | jogo: no, I didn't make one | 21:46 |
jogo | kk I will file one | 21:47 |
openstackgerrit | James E. Blair proposed a change to openstack-infra/config: Configure pip with a per-region mirror https://review.openstack.org/119158 | 21:47 |
*** alkari has joined #openstack-infra | 21:47 | |
clarkb | jogo: you have to be careful because it will be hard to classify that one | 21:47 |
*** devoid has quit IRC | 21:47 | |
clarkb | after that merges you will have successful jobs that match the same pattern | 21:47 |
jogo | clarkb: I have an idea | 21:48 |
jogo | but we will see | 21:48 |
jogo | clarkb: was thinking message:"Log: n-cond not allowed to have ERRORS or TRACES" AND build_name:*grenade* | 21:48 |
jogo | err message:"not allowed to have ERRORS or TRACES" AND build_name:*grenade* | 21:48 |
clarkb | that may work but only with the new log checker output | 21:49 |
clarkb | so you will get a subset | 21:49 |
jogo | but actually specify the grenade permutations | 21:49 |
*** craigbr has joined #openstack-infra | 21:49 | |
jogo | clarkb: what was the old log checker output? | 21:49 |
clarkb | it spammed all the log lines | 21:49 |
clarkb | and made people confused | 21:49 |
clarkb | this is how I found the thing in the first place | 21:49 |
*** dustins has quit IRC | 21:50 | |
*** dims_ has joined #openstack-infra | 21:51 | |
*** arnaud has joined #openstack-infra | 21:52 | |
*** dims has quit IRC | 21:54 | |
jogo | oh right | 21:55 |
*** andreykurilin_ has quit IRC | 21:56 | |
*** bswartz has quit IRC | 21:57 | |
*** devoid has joined #openstack-infra | 21:57 | |
*** ianw has joined #openstack-infra | 21:57 | |
*** tgohad has quit IRC | 21:58 | |
*** devoid has quit IRC | 21:59 | |
clarkb | ok I am +2 on the mirror topic (except for my change which mordred reports is unworking) | 21:59 |
*** dkranz has joined #openstack-infra | 21:59 | |
mordred | clarkb: more patches coming soon | 21:59 |
clarkb | mordred: I think you are applying from dev env or going to once launching works | 21:59 |
clarkb | so I won't approve anything | 21:59 |
clarkb | awesome | 21:59 |
mordred | don't - I'm not even there yet - I'm still battling volumes | 21:59 |
*** mmedvede has joined #openstack-infra | 21:59 | |
*** dims_ has quit IRC | 22:00 | |
*** eharney has quit IRC | 22:00 | |
*** dims has joined #openstack-infra | 22:01 | |
mattoliverau | Morning | 22:02 |
clarkb | mattoliverau: good morning | 22:02 |
*** Sincler has joined #openstack-infra | 22:02 | |
anteaya | morning mattoliverau | 22:02 |
*** MaxV has quit IRC | 22:02 | |
openstackgerrit | Joe Gordon proposed a change to openstack-infra/elastic-recheck: Add fingerprint for bug 1365738 https://review.openstack.org/119227 | 22:03 |
uvirtbot | Launchpad bug 1365738 in tempest "check logs incorrectly running on grenade jobs" [Undecided,New] https://launchpad.net/bugs/1365738 | 22:03 |
*** mfink_ has quit IRC | 22:03 | |
jogo | clarkb: ^ should at least get rid of some unclassified failures | 22:04 |
*** jgrimm is now known as zz_jgrimm | 22:04 | |
clarkb | jogo: oh neat does the () grouping work that way? | 22:04 |
*** gokrokve has joined #openstack-infra | 22:04 | |
*** devoid has joined #openstack-infra | 22:04 | |
jogo | clarkb: yeah right | 22:05 |
*** dims has quit IRC | 22:05 | |
*** devoid has left #openstack-infra | 22:05 | |
*** marcoemorais has quit IRC | 22:06 | |
*** marcoemorais has joined #openstack-infra | 22:06 | |
*** marcoemorais has quit IRC | 22:07 | |
*** marcoemorais has joined #openstack-infra | 22:08 | |
clarkb | mordred: anything I can do to help? I think I am going to try booting a node in hpcloud with volume attached via novaclient cli if that doesn't conflict with your stuff | 22:08 |
mordred | clarkb: it doesn't | 22:08 |
mordred | clarkb: I've got it working | 22:08 |
mordred | clarkb: I'm cleaning up now | 22:08 |
clarkb | mordred: you are doing separate call right? suppose that gets the job done | 22:09 |
mordred | yes | 22:09 |
clarkb | mordred: oh you know what | 22:09 |
clarkb | mordred: should launch node and not puppet format the thing | 22:09 |
clarkb | are you already doing it this way? | 22:09 |
jogo | mordred: how long does a nova list take for you guys? | 22:09 |
jogo | 'nova --debug list' | 22:09 |
jeblair | jogo: i'll get that for you | 22:09 |
clarkb | jogo: what does --debug do? | 22:09 |
clarkb | that the trace flag? | 22:09 |
jogo | oh wait | 22:09 |
jogo | not debug | 22:09 |
jogo | 'nova --timing list' | 22:09 |
jogo | debug gives you all the wire data | 22:10 |
*** erlon has quit IRC | 22:10 | |
jogo | timing makes a nice table at the bottom saying how long each call took | 22:10 |
jogo | thats what I want | 22:10 |
jogo | clarkb: like this http://paste.ubuntu.com/8228743/ | 22:10 |
jeblair | jogo: https://etherpad.openstack.org/p/2dfmHRWTlr | 22:10 |
jeblair | jogo: what's after the /v2/ in the servers list? | 22:11 |
jeblair | https://server/v2/LONGNUMBER/servers/detail | 22:11 |
jogo | tenant UUID | 22:11 |
jogo | you may want to block that out | 22:11 |
*** bhuvan has quit IRC | 22:12 | |
*** tsg has joined #openstack-infra | 22:12 | |
jogo | I am just looking for the seconds | 22:12 |
jogo | and roughly how many instances it is returning | 22:12 |
jogo | I am trying to get a sense of how slow a full list is for a power user | 22:12 |
*** emagana has joined #openstack-infra | 22:13 | |
jeblair | 1 down, 3 to go | 22:13 |
jogo | wow | 22:13 |
jogo | that is enough detail I think | 22:13 |
jogo | 10 seconds | 22:13 |
*** gondoi is now known as zz_gondoi | 22:13 | |
*** dims has joined #openstack-infra | 22:14 | |
jogo | so it's bad | 22:14 |
*** arborism has joined #openstack-infra | 22:14 | |
jogo | jeblair: if you do a 'nova delete server-name' today it does a full list | 22:14 |
jogo | jeblair: as part of https://bugs.launchpad.net/python-novaclient/+bug/1202179 I am trying to fix that | 22:14 |
uvirtbot | Launchpad bug 1202179 in python-novaclient "findall in novaclient/base.py is inefficient" [Undecided,In progress] | 22:15 |
mordred | clarkb: hang on - I've got like 5 patches coming | 22:15 |
jeblair | jogo: nice, thanks! :) we cache nova list in nodepool and only update it every 5 seconds. which is clearly not the right interval :) | 22:15 |
jogo | jeblair: haha 5 seconds huh | 22:16 |
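The mismatch jeblair points out (a 5 second refresh interval against a listing call that takes about 10 seconds) can be illustrated with a small time-based cache. This is only a sketch, not nodepool's actual code: `CachedListing`, `fetch`, and `ttl` are hypothetical names, and `fetch` stands in for whatever call produces the server list.

```python
import time


class CachedListing:
    """Cache the result of an expensive listing call for `ttl` seconds.

    Hypothetical helper for illustration; not taken from nodepool.
    """

    def __init__(self, fetch, ttl, clock=time.monotonic):
        self._fetch = fetch      # callable returning the fresh listing
        self._ttl = ttl          # seconds a result stays valid
        self._clock = clock      # injectable clock, handy for testing
        self._value = None
        self._expires = None     # None means nothing is cached yet

    def get(self):
        now = self._clock()
        if self._expires is None or now >= self._expires:
            self._value = self._fetch()
            # Start the TTL when the fetch *returns*. If it were measured
            # from the start of a 10 second call, a 5 second TTL would
            # already be expired by the time the result arrived.
            self._expires = self._clock() + self._ttl
        return self._value
```

Measuring the expiry from the end of the fetch is one possible design choice; another is simply to pick a TTL comfortably longer than the observed call time.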
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Add support to launch node for attaching volumes https://review.openstack.org/119143 | 22:17 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Remove references to old volume from fstab https://review.openstack.org/119230 | 22:17 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Add a floating IP if needed https://review.openstack.org/119231 | 22:17 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: If we have a local keypair, use it https://review.openstack.org/119232 | 22:17 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Add option to keep failed server https://review.openstack.org/119233 | 22:17 |
mordred | clarkb: ^^ there's my current stack | 22:17 |
mordred | clarkb: but I'm finalizing testing of it | 22:17 |
jeblair | jogo: okay, there's 3 samples for you in the etherpad. they are fairly consistent actually. | 22:17 |
clarkb | mordred: looks like nova cli just failed too | 22:17 |
clarkb | mordred: it booted the node but no attachment | 22:18 |
clarkb | mordred: so I think I will just table this and meh | 22:18 |
mordred | clarkb: ossum | 22:18 |
*** emagana has quit IRC | 22:18 | |
mordred | clarkb: well, check out my changes there and see what you think | 22:18 |
clarkb | yup reading now | 22:18 |
*** chuckC has quit IRC | 22:18 | |
*** bhuvan has joined #openstack-infra | 22:19 | |
*** bhuvan has joined #openstack-infra | 22:19 | |
*** gokrokve has quit IRC | 22:19 | |
*** gokrokve has joined #openstack-infra | 22:19 | |
mordred | clarkb: there is a bug in mount_volume.sh ... | 22:19 |
jogo | jeblair: yeah glad to at least see things are consistent | 22:20 |
TravT | clarkb: 118627,2 just failed, but literally 45 minutes ago it passed all tests before getting reset by anonymous that caused 105231 to fail. I've got multiple dependencies on the glance client waiting | 22:20 |
jogo | so that means my patch should hopefully help | 22:20 |
wenlock | hey, I'm testing install_puppet.sh on a vanilla node, and noticed that if lsb_release is not installed, install_puppet.sh fails the first time around. It seems to work the second time around. After digging around in the procedures, I noticed other platforms set up lsb_release but debian doesn't. Is there a reason for that? | 22:20 |
openstackgerrit | James E. Blair proposed a change to stackforge/gertty: Add a link to the examples URL in the README https://review.openstack.org/119235 | 22:20 |
jogo | jeblair: as for your 5 second 'nova list' poll | 22:20 |
*** signed8bit has quit IRC | 22:20 | |
jeblair | wenlock: probably just an omission since we don't have any current debian use; happy to have a patch to add it | 22:21 |
wenlock | jeblair, cool, will submit it | 22:21 |
TravT | clarkb: can you please not kick these back to the end of the queue? i've got a bunch of horizon patches that can't get FFE without the glance client. | 22:22 |
*** doude has quit IRC | 22:22 | |
clarkb | TravT: I don't do it, it depends on the state of the zuul queue | 22:22 |
openstackgerrit | James E. Blair proposed a change to stackforge/gertty: Add a link to the examples URL in the README https://review.openstack.org/119235 | 22:22 |
clarkb | mordred: is xvdf where rax starts? | 22:22 |
mordred | clarkb: the make_swap uses xvde | 22:22 |
*** pradk has quit IRC | 22:22 | |
mordred | clarkb: I have not yet tested on rax | 22:23 |
clarkb | mordred: kk | 22:23 |
mordred | clarkb: incoming - found the bug in mount_volume | 22:23 |
*** gokrokve has quit IRC | 22:23 | |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Add option to keep failed server https://review.openstack.org/119233 | 22:23 |
wenlock | jeblair, i also have this other bug, http://paste.forj.io/show/44/, no idea yet how to fix it... but have you seen that one? | 22:23 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: If we have a local keypair, use it https://review.openstack.org/119232 | 22:23 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Remove references to old volume from fstab https://review.openstack.org/119230 | 22:23 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Add a floating IP if needed https://review.openstack.org/119231 | 22:23 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Add support to launch node for attaching volumes https://review.openstack.org/119143 | 22:23 |
fungi | i think recent pvhvm instances in rax put / on xvda and then start cinder attaching at xvdb (used to use xvdb for ephemeral and put cinder on xvdc and later) | 22:24 |
wenlock | we've been trying to make install_puppet.sh current | 22:24 |
*** yamamoto has joined #openstack-infra | 22:24 | |
*** sweston has quit IRC | 22:25 | |
*** alexpilotti has quit IRC | 22:25 | |
mordred | fungi: awesome | 22:25 |
jeblair | wenlock: i think we use install_puppet.sh all the time actually. i believe it's used on nodepool nodes as well as our long running servers | 22:25 |
*** sweston has joined #openstack-infra | 22:25 | |
openstackgerrit | Jeremy Stanley proposed a change to openstack-infra/nodepool: Paralellize image deletes https://review.openstack.org/119208 | 22:25 |
*** rlandy has quit IRC | 22:25 | |
mordred | fungi: I'm so glad it's complex. of course, it's actually possible to get the info from nova ... I may do that next | 22:25 |
jeblair | wenlock: i don't recognize that error; perhaps mordred understands it? | 22:25 |
*** mfer has quit IRC | 22:25 | |
fungi | mordred: on rax it actually looks like we get an xvda and an xvde... the latter may be the injection disk | 22:26 |
wenlock | we've been working around it with a pip install puppet module | 22:26 |
mordred | wenlock: you're running that with python3 | 22:26 |
mordred | wenlock: python3 does not have urllib2 | 22:26 |
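For reference, the module mordred mentions was split up in Python 3: `urllib2.urlopen` lives at `urllib.request.urlopen` there. A script that has to run under either interpreter can guard the import as sketched below. This only illustrates the diagnosis given at this point in the log and is not taken from install_puppet.sh or get-pip.

```python
try:
    # Python 3: urllib2 was merged into the urllib package.
    from urllib.request import urlopen
except ImportError:
    # Python 2: the same function is provided by urllib2.
    from urllib2 import urlopen
```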
clarkb | mordred: help my hipster brain. what is that perl one liner doing? | 22:26 |
fungi | mordred: oh, actually looks like an ephemeral block device at xvde | 22:26 |
jeblair | mordred: should get-pip handle that? | 22:27 |
wenlock | jeblair on our side https://github.com/forj-oss/maestro/blob/master/puppet/modules/pip/manifests/bootstrap.pp | 22:27 |
jogo | jeblair: thanks again for the numbers | 22:27 |
mordred | clarkb: I'm just not hipster enough to grok sed -i | 22:27 |
mordred | clarkb: I'm deleting the lines from the file that have $DEV in them | 22:27 |
clarkb | mordred: fungi right we split ephemeral into swap and ephemeral iirc | 22:27 |
*** david-lyle has quit IRC | 22:27 | |
*** homeless has quit IRC | 22:27 | |
wenlock | mordred, hmmm, that's a good clue, will try it again to see how python is getting bootstrapped | 22:27 |
clarkb | mordred: thanks `sed -i -e '/regex/d' /path` iirc | 22:27 |
mordred | fungi: I'll write a patch to grab the actual device name from nova and pass it in | 22:27 |
mordred | clarkb: right. but how do I escape the /'s | 22:28 |
mordred | jeblair: possibly? | 22:28 |
clarkb | mordred: oh use -e '\#regex#d' or similar | 22:28 |
mordred | jeblair: but wenlock is going to have many other issues if /usr/bin/python is python3 | 22:28 |
clarkb | the //s are just convention | 22:28 |
mordred | clarkb: awesome | 22:28 |
*** homeless has joined #openstack-infra | 22:28 | |
mordred | so just like perl then | 22:28 |
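The operation being discussed (delete every fstab line that mentions a given device) can also be done without a regex at all, which sidesteps the question of escaping the slashes in the device path. A minimal Python sketch, assuming a plain substring match is acceptable; `drop_device_lines` is a hypothetical name and not part of mount_volume.sh:

```python
def drop_device_lines(fstab_text, device):
    """Return fstab_text without the lines that mention `device`.

    Uses a substring test, so "/dev/xvdb" also removes entries for
    partitions such as "/dev/xvdb1", as an unanchored regex would.
    """
    kept = [
        line
        for line in fstab_text.splitlines(keepends=True)
        if device not in line
    ]
    return "".join(kept)
```

To apply it to a real file one would read the file, pass its contents through the function, and write the result back, ideally to a temporary file that is then renamed into place.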
wenlock | mordred, yep, i need to prove my python is originating from install_puppet or not | 22:29 |
fungi | mordred: strange though... i have some 1gb pvhvm instances in rax-iad in my personal tenant, and those have no xvde | 22:29 |
jeblair | mordred: don't understand the keypair change | 22:29 |
fungi | mordred: cinder started attaching at xvdb for me | 22:29 |
wenlock | mordred : so far, my bootstrap has been: apt-get update && apt-get -y install curl wget && bash ./install_puppet.sh | 22:29 |
*** andreaf has quit IRC | 22:29 | |
*** yamamoto has quit IRC | 22:29 | |
fungi | mordred: and i have a 67mb xvdd with no recognizable partition table, so that's probably for injection | 22:30 |
*** craigbr has quit IRC | 22:30 | |
mordred | jeblair: the keypair change is to not make ephemeral throwaway keypairs on launch | 22:30 |
*** craigbr has joined #openstack-infra | 22:30 | |
mordred | jeblair: because if we do that, it makes it impossible to shell in and diagnose what went wrong if something breaks before puppet runs | 22:30 |
jeblair | mordred: yeah, i'm not sure i think that's a good idea; i'm not really keen on having the mirrors depend on that at least | 22:30 |
mordred | jeblair: they don't need it | 22:31 |
*** craigbr has quit IRC | 22:31 | |
ekarlso- | anyone here know the grounds of why testtools is better than nosetests ? | 22:31 |
wenlock | mordred my base image: /# python --version | 22:31 |
wenlock | Python 2.7.3 | 22:31 |
mordred | ekarlso-: they are not equiv | 22:31 |
ekarlso- | mordred: meaning ? | 22:31 |
*** lakshmiS has joined #openstack-infra | 22:31 | |
mordred | ekarlso-: nosetests is a test runner, testtools is a base test library | 22:31 |
ekarlso- | mordred: so what's the test runner in openstack's case ? | 22:32 |
fungi | ekarlso-: are you maybe thinking of testr? | 22:32 |
mordred | ekarlso-: we use testr and not nosetests because nosetests a) does not obey the python unittest protocol and b) injects its own code into the things it's testing | 22:32 |
clarkb | and c) blows up on python2.6 and returns success | 22:32 |
mordred | ekarlso-: we also use testr because it's based on subunit which is a protocol that we can operate on programmatically | 22:32 |
fungi | clarkb: blowing up on python 2.6 _is_ success | 22:32 |
mordred | fungi: :) | 22:32 |
*** emagana has joined #openstack-infra | 22:33 | |
ekarlso- | so what's the runner in openstack's case ? | 22:33 |
ekarlso- | ah, nvm | 22:33 |
ekarlso- | thnx fungi | 22:33 |
ekarlso- | doesn't testtools have a runner too though ? | 22:33 |
clarkb | ekarlso-: it does but we don't consume it directly | 22:34 |
*** Sincler has quit IRC | 22:34 | |
clarkb | mordred: jeblair: for the keypair thing can we have it do that if a local key exists? | 22:34 |
mordred | clarkb: that's what it does | 22:34 |
*** emagana has quit IRC | 22:34 | |
clarkb | mordred: hrm you ripped out the logic for creating one by default? | 22:34 |
mordred | clarkb: I did not? | 22:34 |
*** emagana has joined #openstack-infra | 22:34 | |
clarkb | mordred: sorry the deletion logic | 22:34 |
mordred | clarkb: it's in the except clause | 22:34 |
mordred | yes. oh - yeah, I guess so - I can put that back in | 22:35 |
mordred | honestly, I want to spend 0 more time on that | 22:35 |
*** arborism is now known as amcrn | 22:35 | |
clarkb | there are key quotas so we can't just create infinitely | 22:35 |
mordred | clarkb: it should always create the same name | 22:35 |
mordred | clarkb: but this is not important | 22:35 |
clarkb | I guess I am not following how it works in the "default" case | 22:35 |
jeblair | mordred: commented | 22:36 |
jeblair | for when/if you want to pick it up again | 22:36 |
clarkb | if it tries to reuse the throw away key will it still have the private key? | 22:36 |
jeblair | mordred: everything else in that stack lgtm in principle | 22:36 |
mordred | clarkb: I am not continuing to work on this patch right now | 22:36 |
*** gokrokve has joined #openstack-infra | 22:37 | |
*** baoli has quit IRC | 22:38 | |
clarkb | mordred: ok, so for the stack above that, do you want to handle rax in the same series or should we continue as is for hpcloud | 22:38 |
*** gokrokve has quit IRC | 22:38 | |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Add option to keep failed server https://review.openstack.org/119233 | 22:39 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: If we have a local keypair, use it https://review.openstack.org/119232 | 22:39 |
mordred | jeblair: ^^ moved keypair to the end | 22:39 |
*** gokrokve has joined #openstack-infra | 22:39 | |
*** emagana has quit IRC | 22:39 | |
*** baoli has joined #openstack-infra | 22:39 | |
mordred | clarkb: let's just get hp sorted, I'm hacking on rax right now | 22:39 |
*** stonemessenger has joined #openstack-infra | 22:39 | |
clarkb | mordred: I am +2 on the mount volume change but do have a comment there | 22:39 |
clarkb | I don't think it matters though | 22:39 |
*** stonemessenger has quit IRC | 22:40 | |
clarkb | (also still not approving anything just in case you want to refine anything else) | 22:40 |
openstackgerrit | A change was merged to stackforge/gertty: Add a link to the examples URL in the README https://review.openstack.org/119235 | 22:40 |
*** flaviof has quit IRC | 22:42 | |
*** sarob has quit IRC | 22:42 | |
*** tsg has quit IRC | 22:42 | |
TravT | is this volume thing you're fixing the same thing I just hit on our glanceclient grenade failure? http://logs.openstack.org/27/118627/2/gate/gate-grenade-dsvm/b04d978/logs/grenade.sh.txt.gz | 22:42 |
*** emagana has joined #openstack-infra | 22:42 | |
ekarlso- | If you run tox without arguments, it first runs py2x and then py3x, but testr somehow can't handle it and something in .testrepository gets corrupted. < is that a common thing? | 22:43 |
clarkb | TravT: no, it is unrelated to the gate | 22:43 |
clarkb | ekarlso-: yes, testr uses a database file that is different across python versions... | 22:43 |
ekarlso- | clarkb: sounds sad :| | 22:44 |
clarkb | mordred: is the pypi mirror in region-b syncing now? should I create a dns record for it? | 22:45 |
clarkb | mordred: or will it be rebuilt and I should wait? | 22:46 |
*** emagana has quit IRC | 22:47 | |
*** ZZelle_ has quit IRC | 22:48 | |
clarkb | hrm the instance went away so I assume mordred is still iterating | 22:48 |
*** dims has quit IRC | 22:49 | |
*** stevemar has quit IRC | 22:50 | |
*** dims has joined #openstack-infra | 22:50 | |
*** marcusvrn has quit IRC | 22:51 | |
jroll | clarkb: when you have a free moment, could you look at https://review.openstack.org/#/c/118507/ ? :) | 22:51 |
clarkb | jroll: we continue to upload the tarball for backward compat and will stop doing that later or? | 22:53 |
jroll | clarkb: yep, that's in the commit message :) | 22:53 |
mordred | clarkb: I'm pulling the device name out of nova after attaching it just to be sure | 22:53 |
clarkb | mordred: perfect | 22:53 |
jhesketh | Morning | 22:54 |
clarkb | jhesketh: hi there | 22:54 |
*** dims has quit IRC | 22:54 | |
wenlock | mordred, ok your clue helped me.... my base image with python2.7 was too vanilla... I needed to install package python-all-dev before proceeding with install_puppet.sh | 22:55 |
*** mmedvede has quit IRC | 22:55 | |
*** emagana has joined #openstack-infra | 22:55 | |
mordred | wenlock: awesome! | 22:55 |
*** hogepodge has quit IRC | 22:55 | |
*** baoli has quit IRC | 22:55 | |
*** coalcurator has joined #openstack-infra | 22:56 | |
wenlock | mordred: would we bother patching install_puppet.sh for that, or does it just mean that's a requirement before running install_puppet.sh to do apt-get install curl wget python-all-dev ? | 22:56 |
*** mmedvede has joined #openstack-infra | 22:56 | |
*** dmsimard is now known as dmsimard_away | 22:56 | |
mordred | wenlock: not sure. clarkb ^^ ? | 22:57 |
mordred | jgriffith: around? | 22:57 |
clarkb | wenlock: mordred: we just need urllib2 right which is stdlib? | 22:58 |
openstackgerrit | Spencer Krum proposed a change to openstack-infra/config: Enable puppet3 master bootstrap https://review.openstack.org/117604 | 22:58 |
clarkb | wenlock: mordred: it seems like your python packages are broken | 22:58 |
clarkb | I don't really have a problem with installing python all dev because we want it anyways, I just think we may be masking a different problem | 22:59 |
*** coalcurator has quit IRC | 23:01 | |
wenlock | clarkb, i think this might be isolated to the image im using, and not an issue with the hpcloud 12.04 images.. im not getting the issue there. | 23:02 |
*** ebonywobble has joined #openstack-infra | 23:02 | |
clarkb | wenlock: it could be, but any install of `python` should include python stdlib | 23:03 |
mordred | STAB STAB STAB STAB STAB STAB STAB | 23:03 |
clarkb | wenlock: when you install python-all-dev you are probably sucking it down but it is very odd that you wouldn't already have stdlib if you have a python interpreter | 23:03 |
clarkb | mordred: anything I can do to help? | 23:03 |
mordred | clarkb: fix openstack? | 23:03 |
wenlock | mordred, fix docker image i thinks | 23:04 |
jroll | mordred: I like this idea | 23:04 |
mordred | wenlock: oh - docker | 23:04 |
wenlock | yep | 23:04 |
mordred | wenlock: docker images are super minimal | 23:04 |
clarkb | fungi: jroll btw those two gate fixer changes just merged | 23:04 |
openstackgerrit | A change was merged to openstack-infra/config: Publish individual files in IPA post job https://review.openstack.org/118507 | 23:04 |
wenlock | yeah... starting to find that out, grumble | 23:04 |
mordred | wenlock: I'd add something to the Dockerfile that you use to get your base image | 23:04 |
wenlock | mordred, bingo | 23:04 |
jroll | clarkb: "gate fixer"? | 23:04 |
clarkb | jroll: there was a change to tempest to stop failing grenade on dirty logs and a fix to devstack to make devstack changes pass bashate | 23:05 |
*** lakshmiS has quit IRC | 23:05 | |
clarkb | jroll: so minimal but the grenade one should help a fair bit | 23:05 |
jroll | clarkb: oh, neat, thanks | 23:05 |
clarkb | mordred: as far as fixing openstack goes I could stop working on this infra stuff | 23:06 |
*** lakshmiS has joined #openstack-infra | 23:07 | |
*** chuckC has joined #openstack-infra | 23:07 | |
*** r-daneel has quit IRC | 23:07 | |
openstackgerrit | Edward Raigosa (wenlock) proposed a change to openstack-infra/config: lsb_release is missing for debian setup https://review.openstack.org/119245 | 23:07 |
*** atiwari has quit IRC | 23:08 | |
*** bswartz has joined #openstack-infra | 23:08 | |
clarkb | nodepool still looks happy (it should be since those images were deleted) | 23:09 |
*** ebonywobble has quit IRC | 23:11 | |
*** alexandra_ has joined #openstack-infra | 23:11 | |
*** asettle has joined #openstack-infra | 23:11 | |
lakshmiS | mordred: We just got kicked out of gate for the patches (https://review.openstack.org/#/c/105231/) and (https://review.openstack.org/#/c/118627/2) due to what appears to be a cinder issue totally unrelated to glance-client. Is there any way we can avoid going back through core reviewer approvals and check, and get promoted directly back to gate? | 23:11 |
*** wayne__ has joined #openstack-infra | 23:11 | |
*** asettle has quit IRC | 23:12 | |
*** alexandra_ is now known as asettle | 23:12 | |
clarkb | lakshmiS: you shouldn't need any new approvals unless you have to push a new patchset | 23:12 |
lakshmiS | mordred: we just want to see if we can get promoted from check to gate for those patches | 23:12 |
clarkb | lakshmiS: you can reverify to send it back to the gate. If the changes fix gate bugs we can promote them | 23:12 |
clarkb | lakshmiS: if they don't fix gate bugs then they go back to the end of the line | 23:12 |
*** bhuvan has quit IRC | 23:13 | |
lakshmiS | clarkb: they have been verified so many times in check before and fail at gate randomly | 23:14 |
jeblair | clarkb, fungi, mordred: http://lists.openstack.org/pipermail/openstack-dev/2014-September/045013.html | 23:14 |
clarkb | jeblair: nice | 23:15 |
jeblair | dhellmann, lifeless: ^ | 23:15 |
mordred | clarkb: ok. I give up. openstack just doesn't work | 23:16 |
lifeless | mordred: slow learner? | 23:16 |
jeblair | and sorry about the '_' to '-' transition in gertty.yaml. thought that was best to get out of the way before 1.0 :) | 23:16 |
lifeless | jeblair: congrats | 23:16 |
jeblair | lifeless: thanks for your help :) | 23:17 |
clarkb | mordred: is the broken thing querying for the device name? | 23:17 |
clarkb | mordred: maybe we can live without that for now and hardcode in order to spin up mirrors? | 23:17 |
jroll | jeblair: \o/ nice | 23:17 |
clarkb | lakshmiS: I don't know what to tell you. as mordred says openstack just doesn't work... we are at the time in release where the focus is on cramming things in and stuff gets unstable | 23:18 |
lifeless | jeblair: you're welcome :) | 23:18 |
clarkb | lakshmiS: unfortunately that means that a lot of people are affected | 23:19 |
openstackgerrit | Brant Knudson proposed a change to openstack-infra/devstack-gate: Stash Apache httpd logs https://review.openstack.org/116404 | 23:19 |
*** tonytan4ever has quit IRC | 23:19 | |
fungi | jeblair: ++ for hating on browsermail ;) | 23:19 |
jeblair | fungi: i wanted to set the tone appropriately :) | 23:19 |
* clarkb fires up some 1.0 gertty | 23:20 | |
*** mmedvede has quit IRC | 23:20 | |
fungi | the entire e-mail is a great marketing piece. it speaks to my inner luddite | 23:24 |
*** yamamoto has joined #openstack-infra | 23:24 | |
jeblair | that's wonderful! | 23:25 |
pleia2 | haha, great email :) | 23:25 |
clarkb | version 2.0 will support vi keybindings right ;) | 23:25 |
jeblair | fungi: on https://review.openstack.org/#/c/118710/2 i'm getting the idea that the change only works if it violates pep8. that's the logical conclusion, right? | 23:25 |
fungi | hah | 23:27 |
jeblair | clarkb: you'll notice not only does it not support ":q" it also doesn't support "C-x C-c". both of those seemed hard; it'll need like a real command parser or something. | 23:27 |
fungi | jeblair: no, clearly logic dictates that python 2.6 is pep8-tolerant and 2.7 is not | 23:27 |
clarkb | jeblair: maybe it should be an emacs mode then we can evil it | 23:28 |
*** yamamoto has quit IRC | 23:29 | |
*** lcheng_ has joined #openstack-infra | 23:29 | |
clarkb | jogo: does postgres just take longer? | 23:29 |
fungi | i guess test_node_az has a bit of a race or something | 23:29 |
clarkb | looks like postgres is always the last job to finish | 23:29 |
mordred | clarkb: there is a reason nobody actually uses postgres for things | 23:29 |
clarkb | mordred: :) | 23:29 |
jeblair | fungi: oh did it fail that both times? | 23:29 |
jeblair | fungi: i was just setting up to try to repro locally | 23:29 |
fungi | jeblair: no, it passed that test the first round, then i changed a little whitespace to make pep8 stfu and *boom* | 23:30 |
jeblair | fungi: yeah, i rechecked it; there are 2 failed test runs. it looks like test_subnodes and test_node_az respectively | 23:31 |
fungi | jeblair: oh! looks like maybe it did hit it again when you rechecked | 23:31 |
fungi | but different test | 23:31 |
fungi | jut same job | 23:31 |
fungi | just | 23:32 |
clarkb | mordred: anything else I can help with before football happens? | 23:32 |
mordred | clarkb: seriously. I have now tried EVERY CONCEIVABLE COMBINATION of novaclient and cinderclient calls | 23:32 |
clarkb | mordred: wow | 23:32 |
mordred | none of them work | 23:32 |
mordred | some of them I just get things like this: | 23:32 |
mordred | cinderclient.exceptions.NotFound: The resource could not be found. (HTTP 404) (Request-ID: req-48fb9e20-c67c-47f6-abe9-ab08a48e2a5c) | 23:32 |
clarkb | mordred: and this is to get the device name right? | 23:32 |
mordred | yes | 23:32 |
mordred | I have absolutely no idea what is going wrong, and I've ceased caring | 23:32 |
mordred | clarkb: I will do this by hand | 23:33 |
mordred | which is what I should have done in the first place | 23:33 |
clarkb | mordred: well you can use your old script right? | 23:33 |
mordred | jogo ^^ or anyone else who happens to be lurking and watching my agony in trying to use openstack client libraries to do actual work | 23:33 |
clarkb | mordred: the first volume name is predictable | 23:33 |
fungi | mordred: having trouble getting the cinder client lib to give you the device name like the cli does? | 23:33 |
clarkb | it's just potentially not always predictable | 23:33 |
mordred | clarkb: I thought the name was not predictable on rax | 23:34 |
clarkb | mordred: I think fungi said it would be /dev/xvdc | 23:34 |
*** zhiwei has joined #openstack-infra | 23:34 | |
clarkb | fungi: did I misparse? | 23:34 |
mordred | oh, really? I may have misparsed | 23:34 |
clarkb | they put ephemeral on xvde | 23:34 |
mordred | oh. cute | 23:34 |
clarkb | I hope that the reason isn't ephemeral starts with e | 23:35 |
mordred | I think it is | 23:35 |
*** dangers is now known as dangers_away | 23:35 | |
fungi | that would be amazing | 23:35 |
clarkb | but your first volume attachment is on xvdc according to fungi's test | 23:35 |
clarkb | fungi: where did b go? | 23:35 |
clarkb | a is / right? e is ephemeral is b swap? | 23:35 |
kevinbenton | how do i take the subunit.txt.gz from openstack jenkins and get testr to parse it so i can get the order of the tests before a failure | 23:36 |
fungi | and yes, seems that on our 8gb pvhvm nodes we get an ephemeral on xvde, but my 1gb instances have no xvde, instead they have an injection disk at xvdd and my first cinder device attached at xvdb | 23:36 |
mordred | fungi: oh - so cinder goes to xvdb? | 23:36 |
fungi | i am not convinced they are consistent between flavors | 23:36 |
* fungi looks at the recent elasticsearch nodes for comparison | 23:37 | |
clarkb | fungi: oh :( | 23:37 |
clarkb | kevinbenton: one sec | 23:37 |
openstackgerrit | A change was merged to openstack-infra/config: Add a magnetodb job for cassandra integration tests https://review.openstack.org/112304 | 23:38 |
jeblair | fungi: i understand the race condition. it's actually a nodepool race, where it can end up creating more nodes than necessary. | 23:39 |
jeblair | fungi: i'm going to roll the dice again on that change while i work on a real solution | 23:39 |
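The log does not show the nodepool change itself, so the following is only a generic sketch of the kind of race jeblair describes: if "count the existing nodes" and "decide to create one" are separate steps, two workers can both see room for one more node and both create it. Doing the check and the increment under one lock closes that window. `NodeBudget`, `try_reserve`, and `release` are hypothetical names, not nodepool APIs.

```python
import threading


class NodeBudget:
    """Track how many nodes may exist, reserving slots atomically."""

    def __init__(self, max_nodes):
        self._lock = threading.Lock()
        self._max = max_nodes
        self._count = 0

    def try_reserve(self):
        """Claim one slot; return False if the budget is exhausted."""
        # The comparison and the increment happen under the same lock,
        # so no other thread can slip in between them.
        with self._lock:
            if self._count >= self._max:
                return False
            self._count += 1
            return True

    def release(self):
        """Give a slot back, e.g. after a node is deleted."""
        with self._lock:
            if self._count > 0:
                self._count -= 1
```

A caller would reserve a slot before launching a node and release it if the launch fails or once the node is deleted.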
jogo | mordred: having fun yet? | 23:40 |
mordred | jogo: stab stab stab | 23:40 |
jogo | our 'northbound' APIs are stabby | 23:41 |
mordred | s/northbound/ | 23:41 |
clarkb | kevinbenton: http://testrepository.readthedocs.org/en/latest/MANUAL.html#automated-test-isolation-bisection I think that is what you want after you testr load < file you downloaded | 23:41 |
mordred | s/northbound// | 23:41 |
*** oomichi has joined #openstack-infra | 23:41 | |
*** emagana has quit IRC | 23:42 | |
*** melwitt has quit IRC | 23:42 | |
jogo | mrodden: heh | 23:42 |
kevinbenton | clarkb: thanks | 23:42 |
*** emagana has joined #openstack-infra | 23:42 | |
fungi | crappity... i've smacked my internal dns... one sec | 23:42 |
fungi | mordred: according to pvs, the first cinder volume on elasticsearch07 is xvdb1 | 23:44 |
fungi | we've mounted the ephemeral disk second partition on /opt as xvde2 | 23:44 |
openstackgerrit | Joshua Harlow proposed a change to openstack-infra/config: Switch taskflow to use the unittests slave script https://review.openstack.org/116388 | 23:44 |
mordred | fungi: that's promising | 23:44 |
clarkb | I am going to duck out for a short while to prepare for the football game | 23:45 |
clarkb | back in a bit | 23:45 |
mordred | clarkb: ^^ | 23:45 |
mordred | gah | 23:45 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Add option to keep failed server https://review.openstack.org/119233 | 23:45 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: If we have a local keypair, use it https://review.openstack.org/119232 | 23:45 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Remove references to old volume from fstab https://review.openstack.org/119230 | 23:45 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Add a floating IP if needed https://review.openstack.org/119231 | 23:45 |
clarkb | oh I will review first | 23:45 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Add support to launch node for attaching volumes https://review.openstack.org/119143 | 23:45 |
mordred | clarkb: I just updated the script to do xvdb | 23:46 |
clarkb | mordred: wfm | 23:46 |
openstackgerrit | James E. Blair proposed a change to openstack-infra/nodepool: Be more atomic when counting nodes https://review.openstack.org/119256 | 23:46 |
jeblair | fungi: ^ | 23:46 |
*** emagana has quit IRC | 23:47 | |
clarkb | mordred: again not approving since you will probably exercise it as you spin up nodes? | 23:47 |
*** amuller has quit IRC | 23:47 | |
mordred | yes. I am exercising now | 23:47 |
mordred | I just pulled fresh from the tip of that | 23:47 |
fungi | mordred: though on static.o.o we have cinder volumes on b and d-p, and xvdc looks like a small (2gb) disk, not sure for what (metadata?) | 23:47 |
mordred | fungi: well, I don't think this is likely to be flexible enough to deal with static .. | 23:48 |
clarkb | ya first iteration is one volume only | 23:48 |
clarkb | as long as it starts on b we are ok | 23:48 |
openstackgerrit | Stefano Maffulli proposed a change to openstack-infra/jeepyb: Adding a suggestion to file a bug https://review.openstack.org/115759 | 23:48 |
fungi | mordred: sure, just pointing out that their choices for block device names seem to jump around a bit | 23:49 |
harlowja | btw, i think u guys will like https://review.openstack.org/#/c/118941/ ;) | 23:49 |
openstackgerrit | Edward Raigosa (wenlock) proposed a change to openstack-infra/config: give install_modules options for loading requirements externally https://review.openstack.org/117892 | 23:50 |
fungi | jeblair: neat, so actually a legit bug in nodepool, not just a flaky test | 23:50 |
openstackgerrit | Edward Raigosa (wenlock) proposed a change to openstack-infra/config: provide an option in plugin to pre-update a plugin with puppet https://review.openstack.org/104652 | 23:51 |
*** marcoemorais has quit IRC | 23:51 | |
*** marcoemorais has joined #openstack-infra | 23:51 | |
*** marcoemorais has quit IRC | 23:52 | |
*** marcoemorais has joined #openstack-infra | 23:52 | |
*** marcoemorais has quit IRC | 23:52 | |
*** marcoemorais has joined #openstack-infra | 23:53 | |
*** marcoemorais has quit IRC | 23:53 | |
*** marcoemorais has joined #openstack-infra | 23:53 | |
anteaya | marcoemorais: check your client | 23:54 |
*** yamamoto has joined #openstack-infra | 23:54 | |
*** tomoe_ has joined #openstack-infra | 23:54 | |
*** sdake has quit IRC | 23:57 | |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Add option to keep failed server https://review.openstack.org/119233 | 23:57 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: If we have a local keypair, use it https://review.openstack.org/119232 | 23:57 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Remove references to old volume from fstab https://review.openstack.org/119230 | 23:57 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Add a floating IP if needed https://review.openstack.org/119231 | 23:57 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Add support to launch node for attaching volumes https://review.openstack.org/119143 | 23:57 |
mordred | fungi, jeblair: okie. one more fix found in this last iteration | 23:57 |
*** unicell has joined #openstack-infra | 23:57 | |
mordred | ((hint, it's in mount_volume.sh - the protection against not formatting an already formatted volume was too strong)) | 23:58 |
*** dimtruck is now known as zz_dimtruck | 23:58 | |
jeblair | mordred: can that ever be too strong? | 23:58 |
mordred | jeblair: well, it not only didn't format, it also didn't mount | 23:58 |
*** yamamoto has quit IRC | 23:59 | |
jeblair | at least it fails safe | 23:59 |
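The bug mordred describes (a guard that skipped formatting an already formatted volume but also skipped mounting it) comes down to treating two independent decisions as one. A sketch of the intended logic, with the decisions separated; mount_volume.sh itself is not shown in the log, so `plan_volume_actions` and its parameters are hypothetical:

```python
def plan_volume_actions(has_filesystem, is_mounted):
    """Return the ordered list of steps needed to bring a volume into use.

    Formatting is skipped when a filesystem already exists (the safe
    default), but that must not prevent the mount step from running.
    """
    actions = []
    if not has_filesystem:
        actions.append("format")
    if not is_mounted:
        actions.append("mount")
    return actions
```

For an already formatted but unmounted volume this yields only the mount step, which is the case the overly strong guard missed.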
*** wenlock has quit IRC | 23:59 |