*** Ryan_Lane has quit IRC | 00:00 | |
*** wenlock has quit IRC | 00:03 | |
*** portante|afk is now known as portante | 00:09 | |
*** sarob has quit IRC | 00:09 | |
*** ryanpetrello has joined #openstack-infra | 00:13 | |
anteaya | alexpilotti: you are fifth! | 00:13 |
*** dina_belova has joined #openstack-infra | 00:14 | |
alexpilotti | anteaya: wow, the pole position is getting closer! | 00:14 |
anteaya | *crosses fingers* | 00:15 |
*** gyee has quit IRC | 00:16 | |
*** dina_belova has quit IRC | 00:19 | |
*** sgviking has joined #openstack-infra | 00:26 | |
*** vipul is now known as vipul-away | 00:29 | |
*** nosnos has joined #openstack-infra | 00:36 | |
*** morganfainberg is now known as mdrnstm | 00:37 | |
*** mdrnstm is now known as needscoffee | 00:37 | |
*** needscoffee is now known as morganfainberg | 00:38 | |
anteaya | alexpilotti: look for results | 00:39 |
*** vipul-away is now known as vipul | 00:40 | |
anteaya | alexpilotti: hopefully it is going to be merged | 00:40 |
anteaya | looks to me like it passed the gate | 00:40 |
* alexpilotti looks and crosses various fingers :-) | 00:41 | |
*** rfolco has quit IRC | 00:41 | |
anteaya | I see someone left a review comment about 20 minutes ago | 00:42 |
anteaya | hopefully that doesn't influence the jenkins report/merge | 00:42 |
anteaya | oh there you are | 00:43 |
anteaya | your patch is the head of the queue this time | 00:44 |
anteaya | I don't understand why since it appeared to me that both patches ahead of it had passed tests | 00:44 |
*** reed has quit IRC | 00:47 | |
jeblair | anteaya: remember the changes? | 00:48 |
alexpilotti | yeah, somebody re-added a +2 although it was already +2a | 00:48 |
anteaya | alexpilotti: but you are still in line | 00:49 |
anteaya | jeblair: yes, but I was expecting that if the two patches ahead of his patch passed and his patch passed, they would all be merged together | 00:49 |
anteaya | the patch immediately ahead of his passed and was waiting for the head to finish | 00:50 |
anteaya | if the head failed alexpilotti's patch would be second in line, it is the head | 00:50 |
anteaya | so both patches must have passed | 00:50 |
anteaya | if so, why didn't alexpilotti's patch also merge | 00:50 |
anteaya | since if his was the failing patch it wouldn't still be head | 00:50 |
anteaya | or have I missed something? | 00:50 |
jeblair | anteaya: i don't know, but if you remember which patches were ahead of his, i can look at it | 00:52 |
jeblair | i can also try to dig that out of the logs, but it's, er, tricky | 00:52 |
anteaya | I wasn't tracking the patch numbers :( | 00:52 |
anteaya | no worries, it was just unexpected | 00:52 |
anteaya | I should have kept the patch numbers | 00:53 |
anteaya | I think they were both heat patches | 00:53 |
anteaya | these top merged heat patches are the right time frame: https://review.openstack.org/#/q/heat,n,z | 00:54 |
anteaya | and the fact that the merged comment is logged first followed by the verified comment on this patch is odd: https://review.openstack.org/#/c/44338/ | 00:56 |
*** kiall has quit IRC | 00:56 | |
openstackgerrit | Joe Gordon proposed a change to openstack-dev/hacking: Redirect stderr for git rev-parse https://review.openstack.org/45150 | 00:56 |
anteaya | and this heat patch is in the post queue: https://review.openstack.org/gitweb?p=openstack/heat.git;a=commitdiff;h=42aaccf6ab1f14e23ab0d2ad32f670811781e7f6 | 00:56 |
anteaya | so it looks like they all passed and merged | 00:57 |
anteaya | I am at a loss as to why alexpilotti's patch didn't merge at the same time then | 00:57 |
alexpilotti | my patch likes some suspense I guess | 00:58 |
salv-orlando | hi, sorry to add more meat on the table. Is it just a temporary thing, or is the check pipeline launching python27 jobs only? | 01:03 |
anteaya | salv-orlando: what patch number are you watching | 01:04 |
salv-orlando | all of them on status.openstack.org/zuul | 01:05 |
anteaya | i see success in python26 jobs in patch #2 and #3 in the check queue right now | 01:05 |
anteaya | 40298 and 45142 | 01:05 |
salv-orlando | correct, but currently all the other patches are queued and no jobs are starting | 01:05 |
*** kiall has joined #openstack-infra | 01:05 | |
anteaya | those are just at the top of my screen right now | 01:05 |
salv-orlando | no worries, I guess it's just a matter of time before queued jobs start | 01:06 |
anteaya | post queue has a higher priority than check | 01:06 |
anteaya | yes | 01:06 |
salv-orlando | ok, cool | 01:06 |
anteaya | resources go to post first | 01:06 |
anteaya | then check | 01:06 |
anteaya | and there is a smaller pool of python26 nodes than other nodes | 01:06 |
Alex_Gaynor | Wowa. I step away for a few hours, and 61 items in the gate. Is it slow, or are people going crazy for the freeze? | 01:08 |
anteaya | bit of both | 01:08 |
anteaya | for some reason I am not seeing patches move out of the gate in the large groupings that I saw earlier in the week and I don't know why | 01:09 |
Alex_Gaynor | interesting | 01:09 |
anteaya | here are the patch numbers for the first 36 patches in the gate right now | 01:09 |
anteaya | I would be grateful for a second set of eyes as I follow what happens after the next reset | 01:10 |
anteaya | who merges and which patch is the new head | 01:10 |
*** ArxCruz has joined #openstack-infra | 01:10 | |
anteaya | http://paste.openstack.org/show/45786/ | 01:10 |
anteaya | sorry there they are | 01:10 |
anteaya | okay alexpilotti what happened? | 01:11 |
*** pcrews has quit IRC | 01:11 | |
alexpilotti | anteaya: so far so good :-) | 01:11 |
anteaya | great | 01:11 |
anteaya | let's hope it goes to post | 01:11 |
alexpilotti | the first 2 merged | 01:12 |
Alex_Gaynor | fingers crossed for no more resets :) | 01:12 |
anteaya | yeah | 01:12 |
alexpilotti | the last one is waiting for the usual endless progress devstack | 01:12 |
alexpilotti | *postgress | 01:12 |
anteaya | okay so for your 3 patches you are babysitting you only have one to go? | 01:12 |
alexpilotti | actually no, standard devstack, gone as well anyway | 01:13 |
*** nati_ueno has joined #openstack-infra | 01:13 | |
alexpilotti | the 3rd one merged as well, so beside a last one that requires some late reviews I'm done for this round | 01:13 |
anteaya | alexpilotti: sorry I have lost track | 01:13 |
anteaya | oh in that case, yay! | 01:14 |
alexpilotti | anteaya: thanks again for your help! :-) | 01:14 |
anteaya | np | 01:14 |
anteaya | congratulations on merged patches | 01:14 |
anteaya | happy sleep | 01:14 |
Alex_Gaynor | nooooo | 01:14 |
Alex_Gaynor | gate reset time | 01:14 |
anteaya | yeah | 01:15 |
*** dina_belova has joined #openstack-infra | 01:15 | |
anteaya | let's find out which patch caused it and what happened | 01:15 |
anteaya | the 7th one on my list is the new head | 01:15 |
Alex_Gaynor | https://review.openstack.org/#/c/45074/ | 01:15 |
Alex_Gaynor | is the cause | 01:16 |
Alex_Gaynor | I don't recognize that failure from the rechecks page | 01:16 |
anteaya | okay I want to look at all the patches prior to that one | 01:16 |
*** changbl has joined #openstack-infra | 01:17 | |
anteaya | this one was just before the failure and it merged: https://review.openstack.org/#/c/42534 | 01:17 |
anteaya | yup all the rest ahead of 45074 merged | 01:18 |
*** dina_belova has quit IRC | 01:19 | |
*** pcrews has joined #openstack-infra | 01:20 | |
yjiang5 | 01:22 | |
*** ewindisch- has quit IRC | 01:22 | |
anteaya | okay next list: http://paste.openstack.org/show/45787/ | 01:24 |
portante | one of mine is #2 in that list | 01:25 |
portante | let me know if I can help | 01:25 |
anteaya | portante: don't touch anything | 01:27 |
anteaya | swing a rubber chicken if you have one | 01:27 |
*** dolphm has quit IRC | 01:27 | |
anteaya | sdague if you are around, we have a tempest setupclass timeout test failure: http://logs.openstack.org/74/45074/1/gate/gate-tempest-devstack-vm-postgres-full/1c7d428/console.html | 01:27 |
anteaya | portante: :D | 01:28 |
lifeless | mordred: what are the rules for things being added to the jobs section of zuul layout.yaml ? | 01:28 |
lifeless | http://ci.openstack.org/stackforge.html covers projects/ not jobs/ | 01:28 |
*** alexpilotti has quit IRC | 01:29 | |
* portante swings | 01:29 | |
anteaya | :D | 01:29 |
*** kiall has quit IRC | 01:30 | |
anteaya | funny the test jobs on the patch 4th in line in the gate queue are queued | 01:32 |
anteaya | the first 3 devstack vm tests: functional, full and neutron | 01:32 |
anteaya | while the next 18 patches after that patch, 44625,3, are all running | 01:33 |
anteaya | how odd | 01:33 |
lifeless | different pools | 01:34 |
lifeless | I think | 01:34 |
anteaya | I meant to say all test jobs are running on the next 18 patches | 01:34 |
anteaya | really? | 01:34 |
anteaya | okay I look forward to being educated on why that is | 01:34 |
*** salv-orlando has left #openstack-infra | 01:36 | |
yjiang5 | anteaya: strange, all patches before/after 44625 are running, while 44625 is still queued | 01:36 |
anteaya | yes | 01:37 |
anteaya | I see the same | 01:37 |
*** cyeoh_ is now known as cyeoh | 01:37 | |
anteaya | funny since 44625 must merge or fail before any behind it can do anything | 01:37 |
*** sgviking has quit IRC | 01:38 | |
anteaya | there is at least one test job running on all 55 patches in the gate queue | 01:38 |
anteaya | yet those 3 test jobs on 44625 are still queued | 01:38 |
*** yaguang has joined #openstack-infra | 01:39 | |
yjiang5 | anteaya: I'm really not so lucky :-( | 01:39 |
anteaya | every patch in the gate queue is in the same pipeline right now, there is no pipeline change | 01:40 |
anteaya | yjiang5: I'm rooting for you | 01:40 |
anteaya | yjiang5: which patch are you cheering for? | 01:40 |
yjiang5 | anteaya: so still lucky :) the 44625 is my patch | 01:41 |
anteaya | ah okay | 01:41 |
portante | I think 44625 is my patch | 01:41 |
anteaya | well let's cheer for those test jobs starting on your patch | 01:41 |
anteaya | ha ha ha | 01:41 |
yjiang5 | portante: oops, let me check my number | 01:41 |
anteaya | https://review.openstack.org/#/c/44625/ is portante's patch | 01:42 |
anteaya | but you can cheer for it too, yjiang5 | 01:42 |
yjiang5 | portante: anteaya, I monitor a wrong patch :$ | 01:42 |
anteaya | yjiang5: that's okay | 01:42 |
anteaya | is yours in the queue yjiang5? | 01:43 |
anteaya | show me the url for the patch | 01:43 |
Alex_Gaynor | pretty sure something is broken with 44625, somehow | 01:43 |
anteaya | why aren't those test jobs starting? | 01:43 |
yjiang5 | anteaya: my patch is far behind him. 44645 | 01:43 |
anteaya | ah okay | 01:43 |
*** svarnau_ has quit IRC | 01:43 | |
yjiang5 | anteaya: and 39891, I can go for dinner first | 01:44 |
anteaya | yjiang5: this is you? https://review.openstack.org/#/c/44645/ | 01:44 |
anteaya | yeah dinner is a good idea | 01:44 |
anteaya | Alex_Gaynor: are you anywhere near the aws hackathon? | 01:44 |
anteaya | can someone get a text to jeblair, I think zuul needs his touch | 01:44 |
Alex_Gaynor | anteaya: unless it is *shockingly* near my apartment, nope :) | 01:44 |
anteaya | okay | 01:44 |
yjiang5 | anteaya: yes, that's mine. | 01:45 |
*** yjiang5 is now known as yjiang5_dinner | 01:45 | |
anteaya | okay, well I will cheer for that one too | 01:45 |
yjiang5_dinner | anteaya: thanks. | 01:45 |
*** ryanpetrello has quit IRC | 01:45 | |
anteaya | yeah, those jobs still haven't started on 44625 | 01:45 |
anteaya | I don't know what zuul will do with that patch | 01:46 |
anteaya | maybe it will kick it out if it can't be merged | 01:46 |
Alex_Gaynor | I imagine it'll hang for forever? | 01:46 |
anteaya | oh I hope not | 01:46 |
anteaya | clarkb fungi mordred jeblair we need a core please | 01:47 |
fungi | 'sup? | 01:47 |
anteaya | yay | 01:47 |
anteaya | look at the gate queue | 01:47 |
*** ryanpetrello has joined #openstack-infra | 01:47 | |
anteaya | see the second patch, the swift one | 01:48 |
anteaya | 44625 | 01:48 |
anteaya | 3 jobs never started | 01:48 |
anteaya | we don't want it to clog the queue | 01:48 |
fungi | i'll need to go get on a computer... just a sec | 01:48 |
anteaya | and we also need the jobs to run and the patch to merge | 01:48 |
anteaya | thanks | 01:48 |
anteaya | portante: we may need to kick your patch | 01:49 |
anteaya | sorry about that | 01:49 |
*** sgviking has joined #openstack-infra | 01:49 | |
anteaya | it will requeue and hopefully jobs will start on it next time | 01:49 |
anteaya | yeah right now it is holding up the 3 patches behind it | 01:49 |
Alex_Gaynor | well, right now it's holding up the entire gate :) | 01:50 |
anteaya | yes | 01:50 |
anteaya | tests are still running but nothing can proceed | 01:50 |
fungi | er, having trouble getting to status.o.o for some reason | 01:52 |
anteaya | fungi: hopefully you can remove 44625 from the queue | 01:52 |
anteaya | hmmmm | 01:52 |
anteaya | there are 7 patches behind it ready to merge | 01:52 |
Alex_Gaynor | well, when we remove 44625 that'll trigger a reset | 01:52 |
anteaya | rats | 01:53 |
Alex_Gaynor | (I assume) | 01:53 |
anteaya | I hope fungi has a magic trick up his sleeve | 01:53 |
anteaya | if not, better now than in 2 hours | 01:53 |
portante | anteaya: so do I just do a "reverify no bug" to get it back in the queue? | 01:53 |
Alex_Gaynor | portante: once it's actually removed, yes | 01:53 |
Alex_Gaynor | (assuming removal is the solution) | 01:54 |
anteaya | portante: before you do that, let's have a quick look | 01:54 |
portante | okay, just give the word | 01:54 |
portante | I don't see anything in zuul anymore | 01:54 |
anteaya | just to see if there is anything that might have caused the jobs not to start | 01:54 |
portante | ah, zuul is back | 01:54 |
anteaya | but that doesn't make sense, the jobs starting or not wouldn't have anything to do with your patch | 01:54 |
fungi | gah, this machine had an old hosts entry for status.o.o from a previous maintenance. my bad | 01:55 |
portante | remember what sherlock holmes says ... | 01:55 |
anteaya | let's hear from fungi after we are back up, but yes probably a reverify no bug | 01:55 |
Alex_Gaynor | portante: "elementary my dear watson"? | 01:55 |
*** UtahDave has joined #openstack-infra | 01:55 | |
portante | it looks like they are running? | 01:55 |
portante | 44625 | 01:55 |
anteaya | tests are running | 01:56 |
anteaya | yay | 01:56 |
Alex_Gaynor | yay! | 01:56 |
anteaya | better we wait than a gate reset | 01:56 |
anteaya | yay fungi | 01:56 |
Alex_Gaynor | and without a gate reset | 01:56 |
*** kiall has joined #openstack-infra | 01:56 | |
anteaya | yay | 01:56 |
lifeless | fungi: you might know, when does a jobs/ entry get added to zuul/layout.yaml ? | 01:56 |
portante | oy, the flood gates are going to open when this is done! | 01:56 |
anteaya | 11 patches after yours portante ready to go in | 01:56 |
anteaya | yes | 01:57 |
anteaya | let's hope your tests pass | 01:57 |
* portante grabs some fowl nearby | 01:57 | |
Alex_Gaynor | oh man, if portante's tests fail... I'll be so sad | 01:57 |
anteaya | fungi: did you do anything to get those test jobs running on 44625? | 01:57 |
anteaya | portante: swing that chicken | 01:57 |
fungi | anteaya: yeah, it looks like it didn't have available devstack slaves to devote to those jobs until just a moment ago, which is odd considering there were others running further down the queue | 01:57 |
anteaya | or did we just have to wait for some nodes to be freed up | 01:58 |
anteaya | yes | 01:58 |
anteaya | that was what we were seeing | 01:58 |
fungi | but we are definitely getting starved for devstack slaves | 01:58 |
*** nati_uen_ has quit IRC | 01:58 | |
anteaya | okay | 01:58 |
fungi | if you look at the graph | 01:58 |
anteaya | jeblair said something about hard limits on the number of nodes we can create at any given time | 01:58 |
anteaya | I don't know what the numbers are | 01:58 |
anteaya | maybe next ff we won't have a hackathon on the same night :/ | 01:59 |
portante | :) | 01:59 |
anteaya | I don't see a failure anywhere in the gate queue right now | 02:00 |
* anteaya grabs her rubber chicken | 02:00 | |
*** rcleere has joined #openstack-infra | 02:01 | |
openstackgerrit | Jeremy Stanley proposed a change to openstack-infra/config: Preserve change creation time on project renames https://review.openstack.org/45155 | 02:01 |
anteaya | okay if your patch passes gate tests portante, 24 patches behind it go too | 02:02 |
anteaya | 43904,9 is failing | 02:02 |
anteaya | so 42296,5 should be the head of the next queue after the gate reset | 02:03 |
fungi | lifeless: the top-level jobs list is for overriding conditions under which a job runs. we use it to restrict jobs to certain branches of projects or to make jobs non-voting mainly | 02:03 |
anteaya | that is my expectation if 44625 passes | 02:03 |
fungi | also to set some parameter functions | 02:03 |
fungi | and occasionally override result messages | 02:03 |
* anteaya grabs a rubber turkey | 02:03 | |
anteaya | oh 39891,40 and 43569,6 are still running tests | 02:05 |
lifeless | fungi: so in a derived infrastructure it can have just ^.*$ with set_log_url ? | 02:05 |
anteaya | and 44625,3, all the rest of the tests have passed through to the failing patch | 02:05 |
fungi | lifeless: you may even be able to omit it entirely, but i'd need to go digging in source/docs to confirm whether zuul does something sane with that parameter if so | 02:06 |
*** nati_ueno has quit IRC | 02:08 | |
fungi | anteaya: but yeah, i think what we saw was an instance of clarkb's earlier observation that jobs are not always started sequentially following a gate reset (and it becomes a lot more obvious when there aren't enough slaves to go around) | 02:09 |
*** nati_ueno has joined #openstack-infra | 02:09 | |
anteaya | fungi: yes | 02:09 |
anteaya | so I wasn't seeing this earlier in the week or last week | 02:09 |
anteaya | mind you, I wasn't watching closely yesterday | 02:10 |
anteaya | and Monday was rather quiet | 02:10 |
anteaya | so how to gather information to understand why this is? | 02:10 |
anteaya | am I the only one scrolling up and down the gate queue yelling encouragement to zuul? | 02:11 |
anteaya | 43569,6 has finished and passed all tests | 02:12 |
portante | 44625: tempest.scenario.test_volume_snapshot_pattern.TestVolumeSnapshotPattern.test_volume_snapshot_pattern ... FAIL | 02:12 |
anteaya | nooooooooooo | 02:13 |
anteaya | nooooo | 02:13 |
clarkb | fungi they are started sequentially | 02:13 |
*** nati_ueno has quit IRC | 02:13 | |
yjiang5_dinner | anteaya: 39891,40 has been running a long time, strange. | 02:13 |
clarkb | fungi but zuul handles a null job result from jenkins by requeuing jobs | 02:14 |
portante | 44625: setUpClass (tempest.api.compute.images.test_list_image_filters.ListImageFiltersTestXML) ... FAIL | 02:14 |
portante | doomed | 02:14 |
anteaya | yjiang5_dinner: yes it is waiting for 44625 to finish, which is going to fail | 02:14 |
clarkb | null job results happen when a slave's ssh connection is dead for example | 02:14 |
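A minimal sketch of the requeue behaviour clarkb describes, not Zuul's actual code. The names `drain`, `run_job` and `max_attempts` are hypothetical, and the retry cap is an assumption added only so the loop always ends. A job that comes back with no result at all is treated as neither a pass nor a failure and goes to the back of the queue, which is one way jobs can end up running out of order.

```python
from collections import deque


def drain(jobs, run_job, max_attempts=3):
    """Run each job; requeue it when the worker returns no result at all.

    run_job(job, attempt) is a hypothetical callable that returns a result
    string, or None when the worker vanished (e.g. a dead ssh connection).
    """
    pending = deque((job, 1) for job in jobs)
    results = {}
    while pending:
        job, attempt = pending.popleft()
        result = run_job(job, attempt)
        if result is None and attempt < max_attempts:
            # No verdict: not a pass, not a failure. Put it at the back.
            pending.append((job, attempt + 1))
            continue
        # Either a real verdict, or we gave up after max_attempts tries.
        results[job] = result if result is not None else "LOST"
    return results
```

Because the requeued job rejoins at the back, jobs queued after it can finish first even though they were submitted later.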
* anteaya wipes tears away | 02:14 | |
yjiang5_dinner | anteaya: so everyone has to come again? ....... | 02:14 |
anteaya | yjiang5_dinner: stand by | 02:14 |
anteaya | we are about to find out | 02:15 |
clarkb | we all need a healthy dose of patience and less flaky tests :) | 02:15 |
anteaya | yeah | 02:15 |
yjiang5_dinner | clarkb: agree. | 02:15 |
*** yjiang5_dinner is now known as yjiang5 | 02:15 | |
anteaya | let's see what 44777,1 does | 02:15 |
*** dina_belova has joined #openstack-infra | 02:15 | |
anteaya | it is the patch after 44625 | 02:15 |
portante | but even though 44625 failed, won't the 24 passes behind it still get submitted? | 02:16 |
fungi | portante: no, because that failing change might have also been doing something to make the others behind it work | 02:16 |
fungi | so they need to be retested to be sure | 02:16 |
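A rough simulation of the gate reset fungi explains here, under simplifying assumptions: whether a change passes is fixed in advance and does not depend on what is ahead of it, and the change names are made up. `run_gate` is a hypothetical helper, not part of Zuul. Every pending change is tested on top of the changes ahead of it, so when one fails, the results for everything behind it are discarded and those changes are tested again.

```python
def run_gate(queue, failing):
    """Simulate a dependent (gate) queue.

    queue:   change names in gate order, head first.
    failing: set of change names whose jobs fail (an assumption of this
             sketch; real results can depend on the changes ahead).
    Returns (merged, rejected, job_runs), where job_runs counts how many
    times a change had to be tested in total.
    """
    merged, rejected, job_runs = [], [], 0
    pending = list(queue)
    while pending:
        # Each pending change is tested assuming all changes ahead merge.
        job_runs += len(pending)
        first_fail = next(
            (i for i, change in enumerate(pending) if change in failing), None
        )
        if first_fail is None:
            merged.extend(pending)
            break
        # Everything ahead of the failure was tested on a valid tree.
        merged.extend(pending[:first_fail])
        rejected.append(pending[first_fail])
        # Gate reset: results behind the failure included it, so retest.
        pending = pending[first_fail + 1:]
    return merged, rejected, job_runs
```

With four hypothetical changes `A, B, C, D` where only `B` fails, `C` and `D` are each tested twice: once behind `B`, and once more after the reset.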
lifeless | fungi: if pushing to github fails; where does it get logged? | 02:16 |
portante | that is a swift change though | 02:17 |
fungi | also, one of those 24 (39891) has a tempest job on its way to failure | 02:17 |
anteaya | current gate patch lineup: http://paste.openstack.org/show/45789/ | 02:17 |
lifeless | anteaya: probably... | 02:17 |
clarkb | lifeless review_site/logs/error_log | 02:17 |
lifeless | anteaya: (only one yelling encouragement) | 02:17 |
fungi | portante: zuul doesn't know that the tempest jobs don't really (yet) do anything with swift | 02:17 |
lifeless | clarkb: on the gerrit machine ? | 02:17 |
clarkb | lifeless yes | 02:17 |
yjiang5 | fungi: 39891 also has failure? ........:'( | 02:17 |
anteaya | lifeless: I'm still yelling, grieving now | 02:17 |
portante | oy | 02:18 |
fungi | lifeless: i think gerrit's error log may contain indication of mirroring failures | 02:18 |
lifeless | git@github.com:testing-cabal/testtools.git: reject HostKey: github.com | 02:18 |
clarkb | I thought glance uses swift in tempest or something? | 02:18 |
clarkb | I have seen tempest catch actual integration fails before in swift | 02:19 |
fungi | oh, perhaps. in which case there's the possibility that one swift change was enabling some feature used by subsequent glance changes | 02:19 |
koolhead17 | anteaya did you recently blog about devstack? | 02:20 |
anteaya | i did | 02:20 |
lifeless | fungi: / clarkb: ^ any thoughts | 02:20 |
anteaya | how are you koolhead17? | 02:20 |
koolhead17 | anteaya: am awesome. Does that blog also mention restarting the services? | 02:20 |
*** dina_belova has quit IRC | 02:20 | |
clarkb | lifeless: add github's host keys to your known_hosts? | 02:20 |
anteaya | koolhead17: no, I was going to cover that in post 2 | 02:21 |
fungi | lifeless: might be that the github host key needs to be (manually since we might not have puppeted it) cached in ~gerrit2/.ssh/known_hosts | 02:21 |
lifeless | clarkb: just regular ~/.ssh/known_hosts? or /etc/ or ? | 02:21 |
anteaya | it will probably be a series of 3, but the timing is loose | 02:21 |
koolhead17 | cool. remind me once it's up | 02:21 |
koolhead17 | i will forward to a lot of people | 02:21 |
anteaya | koolhead17: can do and thanks for asking | 02:21 |
anteaya | ha ha ha | 02:21 |
clarkb | lifeless I think known hosts for the user running gerrit | 02:21 |
koolhead17 | a lot of people asked me for it | 02:21 |
anteaya | I welcome your feedback too, koolhead17 | 02:21 |
anteaya | ah okay | 02:21 |
clarkb | fungi ^ do you remember from the git.o.o stuff? | 02:21 |
koolhead17 | i will try it out all myself as well | 02:22 |
koolhead17 | :D | 02:22 |
anteaya | koolhead17: I was going to go with: go into screen, Ctrl-C to stop, then up arrow to get the command back, and then enter | 02:22 |
anteaya | did you want something more/different than that? | 02:22 |
fungi | lifeless: clarkb: yes i believe it came up and we manually added it to that file with the assertion that pleia2 was going to submit a change to start managing the list (but first we need to decide which of the various entries in it are no longer needed) | 02:22 |
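A small sketch for checking whether a `known_hosts` file already carries a plain-text entry for a host such as `github.com`, which is the condition clarkb and fungi point to. `has_host_key` is a hypothetical helper, and it is deliberately limited: hashed entries (those starting with `|1|`), marker lines (`@cert-authority`, `@revoked`), wildcard patterns and `[host]:port` forms are not handled.

```python
def has_host_key(known_hosts_text, host):
    """Return True if a plain-text known_hosts entry names `host`.

    Each entry is expected as: hostnames keytype base64-key [comment],
    where hostnames is a comma-separated list.
    """
    for line in known_hosts_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank line or comment
        fields = line.split()
        if fields[0].startswith("@") or len(fields) < 3:
            continue  # marker line or malformed entry: skipped in this sketch
        hosts_field = fields[0]
        if hosts_field.startswith("|1|"):
            continue  # hashed hostname, cannot be compared as plain text
        if host in hosts_field.split(","):
            return True
    return False
```

It would be run against the contents of the file for the user that runs gerrit (the chat mentions `~gerrit2/.ssh/known_hosts`); a `False` result would be consistent with the `reject HostKey` message lifeless pasted, though it does not prove that is the cause.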
koolhead17 | anteaya: how about making a nice screencast | 02:23 |
koolhead17 | :D | 02:23 |
anteaya | koolhead17: interesting you should ask about that | 02:23 |
anteaya | I have been working on something to facilitate that | 02:23 |
koolhead17 | anteaya: you should blog and then attach a screencast as well | 02:23 |
Alex_Gaynor | nooooooooo | 02:23 |
Alex_Gaynor | gate reset | 02:23 |
portante | yes | 02:23 |
koolhead17 | recordmydesktop is a cool tool | 02:23 |
anteaya | I have to let mordred get through his email pile before I bring that up again | 02:23 |
portante | we have all already gone through our mourning ... | 02:24 |
anteaya | yeah, been mourning for a bit now | 02:24 |
anteaya | koolhead17: it isn't so much the tools with screencasts as the editing | 02:24 |
anteaya | editing takes for ever and it makes or breaks a good screencast (or film or tv production) | 02:25 |
koolhead17 | not sure of editing tools | 02:25 |
anteaya | tools are part of it and time the other | 02:25 |
anteaya | I have edited sound and it takes a long time | 02:25 |
anteaya | and I have worked with video editors | 02:25 |
lifeless | fungi: I've added it to my copious doc patches. | 02:25 |
anteaya | so yes, all things I have thought about, but I can't be rushed into saying yes right now | 02:26 |
anteaya | though I appreciate the support and encouragement, koolhead17 | 02:26 |
anteaya | I'm also not saying no, I'm saying not right now | 02:26 |
*** pcrews has quit IRC | 02:26 | |
koolhead17 | anteaya: lets get all the blogs ASAO | 02:26 |
koolhead17 | P | 02:26 |
koolhead17 | :) | 02:26 |
anteaya | ASAO: Association for Social Anthropology in Oceania | 02:27 |
anteaya | AsAo: Ascending Aorta | 02:27 |
anteaya | ASAO: Advanced Space Analysis Office | 02:27 |
*** ArxCruz has quit IRC | 02:27 | |
anteaya | ? | 02:27 |
anteaya | am I close yet? | 02:27 |
fungi | all singing about openstack! | 02:28 |
anteaya | ha ha ha | 02:28 |
* fungi likes to keep it positive | 02:28 | |
anteaya | 44777,1 is what I am singing right now | 02:28 |
anteaya | :D | 02:28 |
clarkb | perhaps it is just the way my brain works but screencasts of a terminal aren't very useful | 02:29 |
anteaya | portante: did you want any more pressure on you and your patch tonight? | 02:29 |
anteaya | or have you had your fill? | 02:29 |
anteaya | clarkb: yeah, I find the same | 02:29 |
anteaya | I tweet with a guy that does animation | 02:30 |
anteaya | I have wondered what it would be like to work with him | 02:30 |
portante | well, the one that follows that, 44626, had to be rebased anyways, since it would not merge | 02:30 |
anteaya | not animation, illustration | 02:30 |
anteaya | different skills, sorry my bad | 02:30 |
anteaya | portante: okay good | 02:30 |
anteaya | glad it didn't take you down | 02:30 |
anteaya | rebase away and then come back into the queue | 02:30 |
anteaya | so was this a flakey test failure? | 02:31 |
anteaya | or did it catch a problem in the code? | 02:31 |
portante | so I just rebased them both and tomorrow they'll get reviewed, hopefully, and go through | 02:31 |
anteaya | ah, well | 02:31 |
anteaya | feature freeze is whenever ttx wakes up (at least that is what I am going by) | 02:32 |
anteaya | but it is your call portante | 02:32 |
*** dims has quit IRC | 02:33 | |
*** ericw has joined #openstack-infra | 02:34 | |
portante | anteaya: thx | 02:35 |
anteaya | :D | 02:35 |
anteaya | here is the list of patches in our current queue, I did all of them this time: http://paste.openstack.org/show/45790/ | 02:38 |
*** adalbas has quit IRC | 02:40 | |
*** jhesketh__ has quit IRC | 02:50 | |
*** jhesketh has joined #openstack-infra | 02:50 | |
*** jhesketh_ has joined #openstack-infra | 02:52 | |
*** Ryan_Lane has joined #openstack-infra | 02:52 | |
*** xchu has joined #openstack-infra | 02:55 | |
Alex_Gaynor | another failure :( these are killing us | 02:55 |
*** kiall has quit IRC | 02:55 | |
Alex_Gaynor | Am I crazy to think all other development on neutron should stop until these are resolved :( | 02:55 |
portante | Alex_Gaynor: about how to handle a hackathon ... ;) | 02:57 |
*** sarob has joined #openstack-infra | 02:57 | |
anteaya | FAIL: tempest.api.compute.volumes.test_volumes_get.VolumesGetTestXML.test_volume_create_get_delete[gate,smoke] on 44777 :( | 02:59 |
*** UtahDave has quit IRC | 03:00 | |
anteaya | is this failure neutron related Alex_Gaynor? | 03:01 |
Alex_Gaynor | anteaya: it's my understanding that most of the flaky bits are neutron, that's second-hand though | 03:01 |
anteaya | sorry if that is a silly question, I don't quite grok all of neutron | 03:01 |
anteaya | there has been some neutron flakiness that is true | 03:01 |
Alex_Gaynor | I grok ~0% of neutron :) | 03:02 |
anteaya | markmcclain was on hand today to try to address some of that | 03:02 |
anteaya | ah okay | 03:02 |
*** reed has joined #openstack-infra | 03:02 | |
anteaya | a failing test mentioning volumes has me leaning toward cinder | 03:03 |
*** kiall has joined #openstack-infra | 03:03 | |
anteaya | but that is my first response | 03:03 |
anteaya | and yes | 03:04 |
anteaya | having patches move through the gate is not helping out any | 03:04 |
anteaya | move through the gate one at a time, is what i meant to type | 03:04 |
markmcclain | anteaya: still around | 03:04 |
anteaya | I thought it but my hands did something else | 03:04 |
anteaya | yeah so this failure on 44777: http://logs.openstack.org/77/44777/1/gate/gate-tempest-devstack-vm-postgres-full/a79cb93/console.html | 03:05 |
anteaya | I don't think that is neutron, is it? | 03:05 |
*** melwitt has quit IRC | 03:05 | |
*** ryanpetrello has quit IRC | 03:05 | |
markmcclain | it's not neutron because it's volume code | 03:07 |
jgriffith | anteaya: you're correct, that does not appear to be cinder related | 03:07 |
jgriffith | markmcclain: :) | 03:07 |
jgriffith | you beat me to it | 03:07 |
anteaya | right that is what I thought | 03:07 |
anteaya | jgriffith: it isn't cinder related? | 03:07 |
jgriffith | anteaya: markmcclain I've been seeing that particular issue jumping around various projects/components all week | 03:07 |
anteaya | volumes are swift then? | 03:08 |
anteaya | cinder is block, I'm so tired, sorry | 03:08 |
jgriffith | anteaya: no, no.. it is Cinder | 03:08 |
anteaya | it is | 03:08 |
*** ryanpetrello has joined #openstack-infra | 03:08 | |
anteaya | hmmm, do you have a bug filed for it anywhere yet, jgriffith? | 03:08 |
jgriffith | well... I haven't figured out what the root is yet though | 03:08 |
anteaya | let's put up a bug report | 03:08 |
jgriffith | anteaya: I believe there's one logged, lemme check | 03:08 |
anteaya | k | 03:08 |
anteaya | let's add this then | 03:09 |
jgriffith | anteaya: I was asking folks about this the other night | 03:09 |
anteaya | and if I see it again I know where to put it | 03:09 |
anteaya | ah okay | 03:09 |
jgriffith | I've seen this pop up for various tests ranging from instance deletion to volumes etc | 03:09 |
jgriffith | the "empty attachments" exception | 03:09 |
jgriffith | anyway... lemme look, just a sec | 03:09 |
anteaya | 9.5 hours of logs got lost in the outage last night | 03:09 |
*** pcrews has joined #openstack-infra | 03:09 | |
anteaya | so I must have missed you mentioning it | 03:09 |
anteaya | yeah, okay because I have seen it at least once today | 03:10 |
anteaya | but didn't know what to do next | 03:10 |
*** ericw has quit IRC | 03:12 | |
jgriffith | anteaya: there are quite a few bugs logged on status.openstack/rechecks that seem to have this signature | 03:13 |
jgriffith | anteaya: sadly they're for all sorts of different tests, some with failures ahead of it some not | 03:13 |
anteaya | jgriffith: are they all one bug and need to be consolidated? | 03:14 |
anteaya | eww | 03:14 |
anteaya | so a race condition of some sort? | 03:14 |
jgriffith | anteaya: well until we root cause it I don't think consolidating is going to work | 03:14 |
anteaya | okay | 03:14 |
anteaya | let's do this | 03:14 |
jgriffith | anteaya: it seems like something falls apart in the test itself, but not sure | 03:14 |
anteaya | what should I do for any failing test that mentions volumes and empty attachments? | 03:15 |
jgriffith | anteaya: for example: http://logs.openstack.org/51/43651/2/gate/gate-tempest-devstack-vm-postgres-full/e5043e2/console.html | 03:15 |
jgriffith | anteaya: which is part of https://bugs.launchpad.net/cinder/+bug/1220436 | 03:15 |
uvirtbot | Launchpad bug 1220436 in cinder "test_cinder_quota_class_show failes during swift gate jobs" [Undecided,New] | 03:15 |
anteaya | is this the consistent part? StringException: Empty attachments: | 03:15 |
*** dina_belova has joined #openstack-infra | 03:16 | |
jgriffith | anteaya: yes! | 03:16 |
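Since the `StringException: Empty attachments:` line is the part jgriffith confirms as consistent, a console log can be checked for it mechanically. This is only a sketch: `find_signature` is a hypothetical helper, and how the console text is obtained (for example by saving one of the `console.html` pages linked above) is left to the reader.

```python
SIGNATURE = "StringException: Empty attachments:"


def find_signature(console_text, signature=SIGNATURE):
    """Return the 1-based line numbers of console lines containing signature."""
    return [
        number
        for number, line in enumerate(console_text.splitlines(), start=1)
        if signature in line
    ]
```

An empty list means the signature is absent, so the failure is probably a different one and should not be logged against the same bug.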
openstackgerrit | lifeless proposed a change to openstack-infra/config: Document push key acceptance. https://review.openstack.org/45162 | 03:16 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Improve Zuul docs. https://review.openstack.org/45163 | 03:16 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Document spinning up a derived zuul. https://review.openstack.org/45164 | 03:16 |
lifeless | booyah | 03:16 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Gerrit docs improvements - user and groups. https://review.openstack.org/45001 | 03:16 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Phase 3 infra bootstrap docs: gerrit. https://review.openstack.org/44970 | 03:16 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Document review.pp parameters a bit. https://review.openstack.org/44969 | 03:16 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Make gerrit DB setup match actual practice. https://review.openstack.org/44993 | 03:16 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Document basic admin hints for jeepyb. https://review.openstack.org/45043 | 03:16 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Non-openstack-ci support for launch/dns.py. https://review.openstack.org/44980 | 03:16 |
jgriffith | anteaya: I'm seeing that all over | 03:16 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Document bootstrapping of Gerrit ACLs. https://review.openstack.org/45011 | 03:16 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Explain API projects a little. https://review.openstack.org/45111 | 03:16 |
jgriffith | haha! lifeless finished his patches | 03:16 |
anteaya | ha ha ha | 03:17 |
anteaya | thanks lifeless, I always liked the rain | 03:17 |
anteaya | jgriffith: hrm | 03:17 |
jgriffith | anteaya: ok, interesting | 03:18 |
jgriffith | anteaya: log it against that one | 03:18 |
jgriffith | so there's a traceback in the volume-logs | 03:18 |
*** ericw has joined #openstack-infra | 03:18 | |
jgriffith | I'll get a fix up here shortly | 03:18 |
anteaya | against this bug? https://bugs.launchpad.net/cinder/+bug/1220436 | 03:18 |
uvirtbot | Launchpad bug 1220436 in cinder "test_cinder_quota_class_show failes during swift gate jobs" [Undecided,New] | 03:18 |
anteaya | okay | 03:18 |
*** xchu has quit IRC | 03:19 | |
*** dina_belova has quit IRC | 03:20 | |
*** nati_ueno has joined #openstack-infra | 03:20 | |
lifeless | nuts, broken patch at the top | 03:21 |
lifeless | sorry | 03:21 |
jgriffith | anteaya: so there's a race, where the scheduler is calling update_stats before the lvm driver has initialized | 03:21 |
anteaya | ah okay | 03:21 |
anteaya | how do we ensure that lvm driver is up before update_stats is called by the scheduler? | 03:22 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Gerrit docs improvements - user and groups. https://review.openstack.org/45001 | 03:22 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Document spinning up a derived zuul. https://review.openstack.org/45164 | 03:22 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Phase 3 infra bootstrap docs: gerrit. https://review.openstack.org/44970 | 03:22 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Improve Zuul docs. https://review.openstack.org/45163 | 03:22 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Document review.pp parameters a bit. https://review.openstack.org/44969 | 03:22 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Document push key acceptance. https://review.openstack.org/45162 | 03:22 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Make gerrit DB setup match actual practice. https://review.openstack.org/44993 | 03:22 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Document basic admin hints for jeepyb. https://review.openstack.org/45043 | 03:22 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Non-openstack-ci support for launch/dns.py. https://review.openstack.org/44980 | 03:22 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Document bootstrapping of Gerrit ACLs. https://review.openstack.org/45011 | 03:22 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Explain API projects a little. https://review.openstack.org/45111 | 03:22 |
*** dkliban has joined #openstack-infra | 03:22 | |
jgriffith | anteaya: I think the better way to do it is going to be to check service status on the update call and act accordingly | 03:22 |
jgriffith | anteaya: working on something now | 03:22 |
anteaya | okay, good plan | 03:22 |
anteaya | go jgriffith | 03:22 |
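The race described above (the scheduler asking for stats before the LVM driver has finished initializing) and the proposed remedy (check status on the update call and act accordingly) can be sketched roughly as follows. This is a minimal illustration only: `VolumeDriver`, `do_setup`, `get_stats`, and `report_capabilities` are hypothetical names chosen for the sketch, not Cinder's actual classes or the fix that was eventually written.

```python
class VolumeDriver:
    """Hypothetical stand-in for a storage driver; not Cinder's real class."""

    def __init__(self):
        self.initialized = False
        self._stats = None

    def do_setup(self):
        # Stands in for the slow start-up step (for example, probing a volume group).
        self._stats = {"total_capacity_gb": 100, "free_capacity_gb": 80}
        self.initialized = True

    def get_stats(self):
        return dict(self._stats)


def report_capabilities(driver):
    """Return stats for the scheduler, or None if the driver is not ready yet."""
    if not driver.initialized:
        # Skip this reporting round instead of touching half-built state.
        return None
    return driver.get_stats()


driver = VolumeDriver()
early = report_capabilities(driver)   # called before setup: guarded, no exception
driver.do_setup()
late = report_capabilities(driver)    # called after setup: real stats
```

The design choice in the sketch is to make the periodic call tolerate an unready driver rather than trying to order the two start-up paths, which matches the "check status and act accordingly" idea but is otherwise an assumption.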
anteaya | lifeless: y'know it is moments like this when markmc's idea of a patch that merges other patches looks appealing to me | 03:23 |
anteaya | *she said parting the curtains of the lifeless patch onslaught* | 03:24 |
markmcclain | looks like we need an infra core to work their magic again | 03:24 |
lifeless | anteaya: that wouldn't help here | 03:25 |
anteaya | well what we did last time was wait for a node to free up to run the test job | 03:25 |
lifeless | anteaya: the reason this is a high stack is sheer logistics putting it together; it's a lot easier to manage. | 03:25 |
anteaya | fungi, it turns out, didn't really do anything last time | 03:25 |
lifeless | anteaya: there are many tangled threads that can in principle be done separately, with no need for an octopus at the top | 03:26 |
anteaya | though why the patch at the head has a queued test job is beyond me | 03:26 |
anteaya | if there is a good explanation I haven't gotten it through my thick head yet | 03:26 |
* fungi rarely actually does anything anyway ;) | 03:26 | |
anteaya | showing up counts | 03:26 |
anteaya | yay fungi | 03:26 |
anteaya | lifeless: yeah, i know it, and it does make sense | 03:27 |
anteaya | it is my way of acknowledging the lifeless wave | 03:27 |
anteaya | some people go to sports events to witness the wave | 03:27 |
anteaya | I just hang out in here | 03:28 |
anteaya | :D | 03:28 |
*** pcrews has quit IRC | 03:28 | |
anteaya | the next failure in the queue is the cinder failure j griffiths is working on, I have attached the console log to the bug | 03:31 |
*** UtahDave has joined #openstack-infra | 03:31 | |
*** xchu has joined #openstack-infra | 03:31 | |
*** pcrews has joined #openstack-infra | 03:32 | |
lifeless | \o/ http://192.237.210.61/ | 03:35 |
*** dklyle has joined #openstack-infra | 03:35 | |
lifeless | fungi: mordred: clarkb: how does one check zuul is 'done' setup wise ? | 03:35 |
fungi | clarkb: so you're saying what we saw with jobs for later changes in a dependent queue starting before jobs for parent changes ahead of them which need the same node types is not what you were seeing earlier? | 03:36 |
anteaya | yay lifeless | 03:36 |
fungi | lifeless: do something which should cause it to start a job? | 03:37 |
lifeless | ah dns has updated | 03:38 |
lifeless | fungi: so I guess thats what I'm asking | 03:38 |
fungi | lifeless: depends on what else you have which needs exercising really | 03:38 |
clarkb | fungi it is what we saw earlier | 03:39 |
clarkb | fungi but it has a good reason for the behavior | 03:39 |
lifeless | fungi: I have https://review.testing-cabal.org/#/c/1/ | 03:39 |
lifeless | fungi: and I have a zuul. I think. | 03:39 |
lifeless | fungi: but I don't see zuul noticing it. | 03:39 |
lifeless | fungi: my config tree is https://github.com/testing-cabal/ci-config | 03:40 |
lifeless | [I'll self-host once the thing is working] | 03:40 |
fungi | lifeless: ahh, so zuul's log should mention when it sees you add a gerrit comment | 03:40 |
fungi | might only be at debug level though | 03:41 |
*** dklyle has quit IRC | 03:42 | |
*** vipul has quit IRC | 03:42 | |
*** dklyle has joined #openstack-infra | 03:42 | |
fungi | that will confirm the gerrit ssh api connection for the stream watcher is working | 03:42 |
*** vipul has joined #openstack-infra | 03:43 | |
*** dklyle has quit IRC | 03:43 | |
lifeless | GitCommandError: 'git clone -v ssh://zuul@review.testing-cabal.org:29418/testing-cabal/testtools /var/lib/zuul/git/testing-cabal/testtools' returned exit status 128: Host key verification failed. | 03:43 |
lifeless | could be related | 03:43 |
*** dklyle has joined #openstack-infra | 03:43 | |
*** dklyle has joined #openstack-infra | 03:43 | |
*** dklyle has quit IRC | 03:44 | |
*** dklyle has joined #openstack-infra | 03:44 | |
lifeless | fungi: 2013-09-05 03:33:11,534 DEBUG zuul.Scheduler: Reconfiguration complete | 03:44 |
lifeless | 2013-09-05 03:33:11,534 DEBUG zuul.Scheduler: No queue file found | 03:44 |
lifeless | is No queue file found important? | 03:44 |
*** xchu has quit IRC | 03:45 | |
fungi | lifeless: mmm, yeah the ssh error is similar to your manage-projects issue. proof we fail at puppeting our ssh known_hosts files | 03:46 |
lifeless | it would be nice if the zuul status page surfaced such issues | 03:46 |
*** vogxn has joined #openstack-infra | 03:49 | |
fungi | lifeless: the "no queue file" error is benign | 03:49 |
lifeless | fungi: ok, I can't see anything wrong now, but it's still not trying to do anything | 03:49 |
lifeless | fungi: I'm sure it's cluebatness but any pointers would be deeply appreciated. | 03:49 |
lifeless | zuul has cloned my project locally | 03:50 |
lifeless | I don't have gear or jenkins yet | 03:50 |
lifeless | I had assumed that this would lead to a stalled queue rather than an empty queue | 03:50 |
fungi | the queue file is an ephemeral pickle file it creates to persist the queue status across graceful restarts | 03:51 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Document spinning up a derived zuul. https://review.openstack.org/45164 | 03:51 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Phase 3 infra bootstrap docs: gerrit. https://review.openstack.org/44970 | 03:51 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Improve Zuul docs. https://review.openstack.org/45163 | 03:51 |
fungi | lifeless: did you upload or recheck a change after fixing the ssh connection? | 03:52 |
lifeless | fungi: trying that now. | 03:52 |
lifeless | fungi: I had assumed it would scan for anything already pending | 03:53 |
lifeless | ahha | 03:53 |
lifeless | https://review.testing-cabal.org/#/c/1/ | 03:53 |
lifeless | 'LOST' | 03:53 |
lifeless | guess that means I need a jenkins next | 03:53 |
fungi | the only gerrit state zuul knows is events it receives | 03:53 |
fungi | it doesn't go hunting | 03:53 |
lifeless | no good will? | 03:54 |
fungi | bad will hunting | 03:54 |
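The behaviour fungi describes (zuul only knows about Gerrit changes through the events it receives, and never scans for pending work) can be illustrated with a toy event-driven scheduler. This is a sketch under stated assumptions: `EventDrivenScheduler` and its methods are hypothetical and are not Zuul's implementation; the event type strings mirror names used by Gerrit's event stream but are used here purely for illustration.

```python
from collections import deque


class EventDrivenScheduler:
    """Toy model: work is enqueued only in response to events seen while connected."""

    TRIGGERS = ("patchset-created", "comment-added")

    def __init__(self):
        self.connected = False
        self.queue = deque()

    def connect(self):
        # Connecting does NOT look for changes that already exist upstream.
        self.connected = True

    def on_event(self, event):
        if not self.connected:
            return  # the event is simply never seen
        if event["type"] in self.TRIGGERS:
            self.queue.append(event["change"])


scheduler = EventDrivenScheduler()
# Change 1 is uploaded while the event stream is broken (e.g. an ssh failure).
scheduler.on_event({"type": "patchset-created", "change": 1})
scheduler.connect()
queue_after_connect = list(scheduler.queue)   # still empty: nothing was "hunted" for
# A recheck comment arrives after the connection is fixed.
scheduler.on_event({"type": "comment-added", "change": 1})
queue_after_recheck = list(scheduler.queue)
```

In this model the queue stays empty after the connection is repaired and only fills once a fresh event (an upload or a recheck comment) arrives, which is why re-triggering the change was needed.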
* fungi is going to try to afk for sleep | 03:56 | |
openstackgerrit | Darragh Bailey proposed a change to openstack-infra/jenkins-job-builder: Add repo scm https://review.openstack.org/45165 | 03:57 |
anteaya | okay | 03:58 |
anteaya | is there another core around I can ping if need be? | 03:58 |
anteaya | fungi ^ | 03:58 |
anteaya | I guess I will just have to take my chances | 03:58 |
*** sdake_ has joined #openstack-infra | 04:00 | |
*** sdake_ has joined #openstack-infra | 04:00 | |
*** zhiyan has joined #openstack-infra | 04:01 | |
*** zhiyan has left #openstack-infra | 04:02 | |
*** xchu has joined #openstack-infra | 04:02 | |
*** rcleere has quit IRC | 04:11 | |
*** nati_ueno has quit IRC | 04:13 | |
*** nati_ueno has joined #openstack-infra | 04:13 | |
*** nati_ueno has quit IRC | 04:14 | |
*** nati_ueno has joined #openstack-infra | 04:16 | |
*** dina_belova has joined #openstack-infra | 04:16 | |
*** gordc has joined #openstack-infra | 04:20 | |
clarkb | anteaya is something broken? | 04:20 |
anteaya | no | 04:20 |
anteaya | sorry just wondered if anyone else was up | 04:20 |
lifeless | is there any need for people to visit jenkins.openstack.org these days? | 04:20 |
anteaya | my apologies | 04:21 |
*** dina_belova has quit IRC | 04:21 | |
clarkb | lifeless: occasionally to hunt down test logs for tests that break badly enough to prevent log copying | 04:21 |
anteaya | j griffith is working on a cinder patch that hopefully will fix random test failures in the gate | 04:21 |
clarkb | lifeless very infrequent | 04:21 |
fungi | lifeless: for in progress console logs too | 04:21 |
anteaya | like this one | 04:22 |
anteaya | https://jenkins01.openstack.org/job/gate-tempest-devstack-vm-postgres-full/8488/console | 04:22 |
anteaya | StringException: Empty attachments: | 04:22 |
anteaya | this is the third one in the last hour or so | 04:22 |
*** ericw has quit IRC | 04:23 | |
*** nati_ueno has quit IRC | 04:23 | |
anteaya | the common denominator is the StringException: Empty attachments: | 04:23 |
lifeless | clarkb: / fungi: is jenkins all puppetised? Or is there manual fixup ? | 04:24 |
anteaya | so because of this the gate progress is slow | 04:24 |
lifeless | doc/source/jenkins.rst is ambiguous | 04:24 |
*** nati_ueno has joined #openstack-infra | 04:24 | |
fungi | lifeless: manual global config | 04:24 |
clarkb | lifeless: the global config for jenkins is manual. everything else is puppetized | 04:25 |
*** sarob has quit IRC | 04:25 | |
lifeless | clarkb: the doc is vague about what that entails. | 04:25 |
lifeless | fungi: ^ | 04:25 |
*** sarob has joined #openstack-infra | 04:25 | |
*** vogxn has quit IRC | 04:25 | |
lifeless | any chance I can get a copy of the config xml (sanitised of secrets, of course) ? | 04:26 |
anteaya | and the head of the gate queue just failed on the same bug: http://logs.openstack.org/45/44645/9/gate/gate-tempest-devstack-vm-postgres-full/bd06a66/console.html | 04:26 |
clarkb | lifeless it means you need to config jenkins by hand after puppet installs it | 04:26 |
anteaya | so the gate is one at a timing right now | 04:26 |
lifeless | clarkb: I get that, but jenkins is about 4 billion options and the docs are 10K foot view docs. | 04:26 |
anteaya | clarkb: once j griffith has a patch ready, and it passes check, is there any way to put it at the front of the gate queue? | 04:27 |
lifeless | clarkb: I've setup plenty of jenkins (and written code for the core) - this isn't unfamiliarity but rather sadly too much familiarity | 04:27 |
*** yongli_away is now known as yongli | 04:27 | |
clarkb | anteaya: not easily | 04:27 |
anteaya | okay | 04:27 |
anteaya | so one patch at a time is failing on this bug | 04:27 |
lifeless | clarkb: specifically I need to know: what plugins to install, what global options need changing/initial setup. | 04:27 |
clarkb | lifeless: I'm not sure how to work around that | 04:27 |
anteaya | so if we wait for it go through the queue, we could be wasting a lot of time | 04:28 |
yjiang5 | anteaya: gate is now one at a timing, then back to the end of the queue to wait again. | 04:28 |
clarkb | lifeless: plugin installs should be automated | 04:28 |
anteaya | once the patch is ready that is | 04:28 |
anteaya | yjiang5: yeah, I saw that | 04:28 |
anteaya | sigh | 04:28 |
lifeless | clarkb: ok, so I just need a copy of the global config to crib off | 04:28 |
yjiang5 | anteaya: do you know what's the pass ratio today? | 04:29 |
*** vogxn has joined #openstack-infra | 04:29 | |
lifeless | clarkb: perhaps we're talking at cross purposes? | 04:29 |
anteaya | yjiang5: no, I do not | 04:29 |
anteaya | let me see if I can get graphite.openstack.org to do my bidding - probably not though, but I'll try | 04:29 |
*** sarob has quit IRC | 04:29 | |
clarkb | lifeless: I am on my phone right now, hard to get on jenkins | 04:30 |
lifeless | maybe mordred is on better connection? | 04:30 |
anteaya | here is total changes in the gate: http://graphite.openstack.org/graphlot/?width=586&height=308&_salt=1378355430.505&target=stats_counts.zuul.pipeline.gate.total_changes | 04:31 |
anteaya | for the last 24 hours | 04:31 |
fungi | anteaya: in the past we've killed zuul after exporting the pipelines, approved the fix and then reverified the old changes, but only in dire circumstances | 04:31 |
anteaya | well my hand is up for dire | 04:32 |
anteaya | the gate is working on one patch at a time | 04:32 |
yongli | Alex_Gaynor: thanks reverify our patch | 04:32 |
anteaya | and 3 of the last 5 patches have failed on the same bug | 04:32 |
Alex_Gaynor | Unless I've missed something, restarting won't fix anything, the tests are fundamentally unstable right now. We need to fix the root cause. | 04:32 |
clarkb | lifeless: this is a major problem with jenkins. but the global config shouldn't really need anything specific in it. | 04:32 |
*** ryanpetrello has quit IRC | 04:32 | |
Alex_Gaynor | Do we have a patch for that issue? I didn't see one. | 04:32 |
*** ryanpetrello has joined #openstack-infra | 04:32 | |
lifeless | clarkb: what about the LP sso configuration, for instance? | 04:32 |
anteaya | Alex_Gaynor: yes, we are talking about how to get the patch once it is ready to the front of the gate queue | 04:33 |
lifeless | clarkb: number of executors on jenkins itself | 04:33 |
anteaya | j griffith is working on the patch | 04:33 |
anteaya | it isn't ready yet | 04:33 |
Alex_Gaynor | anteaya: what if we all went to bed and see if it's better in the morning :/ | 04:33 |
clarkb | lifeless: none of those values are required to be set | 04:33 |
anteaya | I just need to make sure a core is around once it is ready | 04:33 |
clarkb | lifeless: you can use whatever auth you want with however many master executors | 04:33 |
clarkb | we use 1 | 04:33 |
anteaya | Alex_Gaynor: well we could try, but I know I wouldn't sleep | 04:33 |
lifeless | so, again, my goal is to describe how to setup the same basic infra | 04:34 |
clarkb | or should now that we have nodepool, but those are implementation specifics and depend on what tests you are running | 04:34 |
lifeless | not how to setup something that happens to use the same components | 04:34 |
lifeless | cloning what openstack has, and documenting it better, are explicit goals. | 04:34 |
lifeless | clarkb: what else do you set when you setup e.g. jenkins02 ? | 04:35 |
clarkb | lifeless: in the global config? | 04:35 |
lifeless | by hand anywhere on the machine | 04:35 |
anteaya | yjiang5: here are merges in the last 24 hours: http://graphite.openstack.org/graphlot/?width=586&height=308&_salt=1378355496.062&target=stats_counts.gerrit.event.change-merged | 04:36 |
anteaya | we have only merged 2 patches in the last hour and a half | 04:36 |
clarkb | lifeless: for the global config: we configure a Maven environment for maven jobs, enable the zmq plugin globally, enable the gearman plugin, configure scp and ftp credentials | 04:36 |
yjiang5 | anteaya: sigh | 04:36 |
anteaya | yeah | 04:36 |
clarkb | lifeless: everything else should be puppetized, plugin installs, jenkins install, etc is all done with puppet | 04:37 |
anteaya | we were doing well a few hours ago | 04:37 |
fungi | lifeless: yeah, we carted the global config.xml over when building the new ones too | 04:37 |
lifeless | fungi: hah! I knew it. | 04:37 |
lifeless | fungi: so can I get a copy of that? | 04:37 |
lifeless | fungi: - passwords | 04:37 |
lifeless | clarkb: scp and ftp creds for what ? | 04:37 |
anteaya | 12 hours ago we were doing really well | 04:38 |
clarkb | lifeless: for the scp and ftp plugins so that they can copy files to various hosts | 04:38 |
lifeless | clarkb: so LP SSO setup is puppettised ? | 04:38 |
clarkb | lifeless: ftp is used to publish docs to docs.openstack.org and scp is used for copying logs and other build artifacts | 04:38 |
clarkb | lifeless: no that is part of the global config, but you covered that | 04:38 |
clarkb | (I thought it was already on the list) | 04:38 |
fungi | lifeless: i'm not in a place to be able to scour the config for sensitive data and redact it right this moment | 04:39 |
fungi | also, i clearly must have been lying about going to sleep | 04:39 |
lifeless | clarkb: http://paste.ubuntu.com/6065153/ | 04:39 |
clarkb | lifeless: I think that covers the big items | 04:40 |
lifeless | clarkb: what does 'configure a maven environment' mean, really ? | 04:40 |
clarkb | lifeless: Jenkins allows you to give names to particular maven build environments, eg Maven3 -> Maven 3.X.Y. And will install that particular env on the slaves and use it to run the tests that select Maven3 | 04:41 |
lifeless | clarkb: so thats for some java things, e.g. the jenkins plugins we have? | 04:42 |
clarkb | lifeless: right | 04:42 |
fungi | lifeless: also tons of manuals | 04:42 |
lifeless | fungi: ? | 04:42 |
fungi | the docs peeps seem to like maven | 04:43 |
clarkb | lifeless: looks like we also configure the timestamper plugin globally (to set the timestamp format) | 04:43 |
clarkb | the docs are docbook built with maven | 04:43 |
lifeless | oh god. | 04:43 |
lifeless | Now I want to stab things. | 04:43 |
* fungi seconds the worry there, but it's what they know i guess | 04:44 | |
fungi | seems to work okay for them | 04:45 |
lifeless | it's just so tedious to write by hand | 04:46 |
* lifeless was writing docbook xml manuals in 2000 | 04:46 | |
clarkb | some of the docs are now markdown converted to docbook | 04:46 |
clarkb | it isn't the way I would do it | 04:46 |
lifeless | huh, jenkins01's cert is self-signed | 04:47 |
lifeless | is that oversight ? | 04:47 |
lifeless | or intent? | 04:47 |
clarkb | intent | 04:47 |
fungi | on purpose | 04:47 |
*** portante is now known as portante|afk | 04:47 | |
lifeless | should jenkins be the same eventually? | 04:47 |
fungi | yeah | 04:47 |
fungi | less used as a user ui now, nor does zuul use its https api any longer | 04:48 |
fungi | heh, i said "user ui" | 04:49 |
fungi | must be getting late | 04:49 |
clarkb | it feels late for me | 04:50 |
lifeless | admins are users too :P | 04:50 |
anteaya | I'll ping once the patch is ready | 04:51 |
anteaya | as long as one of you is available to do the gate reset | 04:51 |
fungi | lifeless: admins aren't real people | 04:51 |
clarkb | anteaya: I don't expect that any of us will be around to do a reset | 04:52 |
clarkb | anteaya: at this point I think we let the gate thrash itself into submission | 04:52 |
anteaya | really? | 04:52 |
lifeless | diong admong at midnight - bad idea. | 04:52 |
anteaya | 2 patches merged in the last hour and half? | 04:52 |
clarkb | anteaya: it requires a decent amount of baby sitting, pulling out the other machine, and potentially staying up should something go sideways | 04:53 |
lifeless | now if only there was an asiapac infra person | 04:53 |
anteaya | clarkb: okay | 04:53 |
lifeless | *doing admin*, I mean. | 04:53 |
anteaya | yeah | 04:53 |
anteaya | got one in your pocket? | 04:53 |
Alex_Gaynor | Why does it look like 26 jobs aren't being started? | 04:54 |
* fungi should relocate to hawaii | 04:54 | |
anteaya | they run after the gate queue | 04:55 |
clarkb | fungi: I could live with my grandparents for a month around feature freezes | 04:55 |
anteaya | the gate queue has priority | 04:55 |
anteaya | and there are limited numbers of the python26 nodes | 04:55 |
fungi | clarkb: good idea | 04:55 |
anteaya | so yes, sometimes it looks like the 26 jobs aren't being started, but they will once the nodes become available | 04:55 |
anteaya | clarkb: that could help | 04:56 |
lifeless | fungi: clarkb: train up someone in asiapac | 04:56 |
anteaya | you could tell us about their great fruit trees | 04:56 |
clarkb | lifeless: we need a person | 04:57 |
* fungi grows one from a seed | 04:57 | |
lifeless | clarkb: what % of time would they need to be on infra for you to consider them a person. | 04:57 |
lifeless | clarkb: e.g. 20%? | 04:58 |
lifeless | clarkb: 10%? 30%? | 04:58 |
clarkb | lifeless: I think mordred and jeblair want >50% | 04:58 |
anteaya | ohhh like invasion of the body snatchers | 04:59 |
anteaya | I couldn't sleep around plants for years after watching that | 04:59 |
anteaya | they would need at least that to learn the systems | 04:59 |
lifeless | clarkb: interesting | 04:59 |
anteaya | and stay current with changes | 04:59 |
lifeless | anteaya: funny, I seem to be doing that :) | 05:01 |
lifeless | anteaya: but I doubt I could do 50% | 05:01 |
anteaya | you do | 05:01 |
anteaya | you would be great | 05:01 |
anteaya | well can you handle fixing the gate and stuff when we need you? | 05:02 |
lifeless | not yet | 05:02 |
lifeless | though I suspect at this point I have more clue than the average bear. | 05:02 |
anteaya | you do | 05:03 |
anteaya | I would trust you to learn the dance | 05:03 |
*** nati_ueno has quit IRC | 05:03 | |
anteaya | if you are willing to be pinged when stuff goes down | 05:03 |
*** nati_ueno has joined #openstack-infra | 05:04 | |
anteaya | another volume failure but not a string exception, this was a teardownclass timeout: FAIL: tearDownClass (tempest.api.volume.test_volumes_list.VolumeListTestXML) | 05:05 |
anteaya | http://logs.openstack.org/53/41453/4/gate/gate-tempest-devstack-vm-postgres-full/8bedd87/console.html | 05:05 |
lifeless | anteaya: the issue is it requires privileged access | 05:05 |
lifeless | anteaya: that should be handed out to a minimum of people | 05:05 |
anteaya | I wonder if it is related to the race condition j griffith is working to fix | 05:05 |
anteaya | lifeless: yes | 05:05 |
lifeless | anteaya: I'd need to earn that and stay current enough to not make things worse... | 05:05 |
anteaya | je blair must allow you access | 05:06 |
anteaya | exactly | 05:06 |
anteaya | root access is not given lightly | 05:06 |
lifeless | clarkb: the jenkins_ssh_key thing is the public key for jenkins master to login on slaves, right ? | 05:07 |
*** kiall has quit IRC | 05:07 | |
clarkb | lifeless: yes | 05:08 |
anteaya | 3 patches merged in the last round | 05:08 |
anteaya | so if we can do 3-5 patches in an hour, with 46 in the queue we are looking at 9 hours, minimum to clear what is in the gate right now | 05:12 |
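The estimate above is simple division of queue length by merge rate. A small sketch makes both ends of the range explicit; it assumes the merge rate stays constant and that no new changes join the queue, neither of which is guaranteed for a live gate.

```python
def hours_to_drain(queue_length, merges_per_hour):
    """Hours needed to merge everything currently queued at a steady rate."""
    return queue_length / merges_per_hour


QUEUE_LENGTH = 46                              # changes in the gate, from the log
best_case = hours_to_drain(QUEUE_LENGTH, 5)    # 5 merges per hour
worst_case = hours_to_drain(QUEUE_LENGTH, 3)   # 3 merges per hour

print(f"best case: {best_case:.1f} h, worst case: {worst_case:.1f} h")
# prints "best case: 9.2 h, worst case: 15.3 h"
```

The 9 hour figure quoted in the channel is the optimistic end of the range; at 3 merges per hour the same queue would take a little over 15 hours.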
*** ryanpetrello has quit IRC | 05:15 | |
*** ryanpetrello has joined #openstack-infra | 05:15 | |
*** dina_belova has joined #openstack-infra | 05:17 | |
*** kiall has joined #openstack-infra | 05:19 | |
*** sarob has joined #openstack-infra | 05:21 | |
*** dina_belova has quit IRC | 05:21 | |
*** sarob has quit IRC | 05:25 | |
anteaya | setupclass timeout failure on volumes: http://logs.openstack.org/55/43155/3/gate/gate-tempest-devstack-vm-full/8674b65/console.html | 05:26 |
anteaya | and one patch got merged | 05:27 |
*** nicedice has quit IRC | 05:28 | |
*** xchu has quit IRC | 05:28 | |
jgriffith | anteaya: that last one is not related | 05:31 |
anteaya | okay | 05:31 |
jgriffith | anteaya: however, I believe it's related to a recent commit | 05:31 |
anteaya | how are you doing? | 05:31 |
anteaya | okay | 05:31 |
anteaya | well let's focus on one at a time | 05:32 |
anteaya | any luck on the string exception error? | 05:32 |
jgriffith | anteaya: well I know where/what, but I don't know why | 05:32 |
anteaya | okay | 05:32 |
anteaya | do you have any faith in a patch? | 05:33 |
anteaya | or not yet? | 05:33 |
jgriffith | anteaya: I know a hack to make it not stack-trace, but I'd like to understand the problem | 05:33 |
jgriffith | anteaya: it seems that something is not getting cleaned up properly | 05:33 |
anteaya | well not stack-tracing doesn't really get us anywhere, does it? | 05:33 |
anteaya | okay | 05:33 |
jgriffith | anteaya: yes, actually | 05:34 |
anteaya | does it? | 05:34 |
jgriffith | anteaya: that's why the empty string is returned | 05:34 |
anteaya | the stack-trace? | 05:34 |
*** dkliban has quit IRC | 05:34 | |
jgriffith | anteaya: http://logs.openstack.org/42/44542/1/gate/gate-tempest-devstack-vm-postgres-full/2d73c40/logs/screen-c-api.txt.gz | 05:35 |
* anteaya looks | 05:35 | |
jgriffith | anteaya: ^^ 2013-09-03 19:23:47.646 20293 | 05:35 |
anteaya | thanks | 05:36 |
jgriffith | non-existent key error | 05:36 |
*** sarob has joined #openstack-infra | 05:36 | |
*** vogxn has quit IRC | 05:36 | |
*** pblaho has joined #openstack-infra | 05:36 | |
anteaya | KeyError: u'gigabytes_Volume-type-650760248' | 05:37 |
anteaya | that is a non-existent key error? | 05:37 |
jgriffith | anteaya: yes | 05:37 |
anteaya | okay | 05:37 |
anteaya | when did this start showing up? | 05:38 |
anteaya | do you recall? | 05:38 |
jgriffith | anteaya: don't know, jeblair or clarkb usually do the hard work of tracking down those sorts of things for me :) | 05:38 |
jgriffith | anteaya: but this is a bad time for those sorts of things | 05:39 |
jgriffith | anteaya: I'm working on it | 05:39 |
anteaya | yeah | 05:39 |
anteaya | I understand and thank you | 05:39 |
*** gordc has quit IRC | 05:39 | |
anteaya | okay at this point even if you write a patch that will work it will be quite a while before it gets merged | 05:40 |
*** SergeyLukjanov has joined #openstack-infra | 05:40 | |
anteaya | we explored drastic measures but if we invoke them it requires much babysitting of the infra systems afterward | 05:40 |
Alex_Gaynor | I'm drifting off to sleep here, but should someone mail -dev? | 05:40 |
*** xchu has joined #openstack-infra | 05:40 | |
anteaya | so best done when alert | 05:40 |
anteaya | yeah, you are correct | 05:41 |
anteaya | let me compose something to let people know of the situation | 05:41 |
jgriffith | better to recheck against the bug :) | 05:41 |
anteaya | yes I should include that, using this bug: https://bugs.launchpad.net/cinder/+bug/1220436 | 05:41 |
uvirtbot | Launchpad bug 1220436 in cinder "test_cinder_quota_class_show failes during gate jobs" [Critical,Triaged] | 05:41 |
*** sarob has quit IRC | 05:42 | |
anteaya | I can't speak for ttx but I will leave the door open for him to post to the thread so folks know what he is going to do | 05:42 |
anteaya | but right now we are at about 2 patches an hour and gate queue of 45 patches | 05:43 |
jgriffith | anteaya: that's actually pretty good for 3'rd milestone :) | 05:43 |
anteaya | so I will let them know to just keep reverifying with the bug number | 05:43 |
anteaya | and that you are working on a fix | 05:44 |
anteaya | and hopefully progress will be made once the sun is in the sky for those who can track it down | 05:44 |
anteaya | inviting eyes, of course | 05:44 |
anteaya | jgriffith: oh good | 05:44 |
anteaya | jgriffith: this is a race condition, yes? | 05:47 |
jgriffith | anteaya: I'm honestly not sure, but given how intermittent it is I would say yes | 05:48 |
anteaya | I'll say so in the email | 05:49 |
anteaya | If I turn out wrong, please step in and correct me. | 05:49 |
*** zhiyan has joined #openstack-infra | 05:54 | |
zhiyan | fungi: ping | 05:55 |
*** dina_belova has joined #openstack-infra | 05:57 | |
anteaya | he is asleep | 05:59 |
anteaya | zhiyan: can I offer any assistance | 05:59 |
anteaya | if only to listen | 05:59 |
*** reed has quit IRC | 06:01 | |
*** vogxn has joined #openstack-infra | 06:01 | |
anteaya | Alex_Gaynor: posted to ml: http://lists.openstack.org/pipermail/openstack-dev/2013-September/014644.html | 06:01 |
anteaya | I made a spelling mistake, sigh, what else is new | 06:01 |
zhiyan | anteaya: thanks. actually I have a quick question here: I added a new feature to nova, but it needs to call cinderclient with a new interface, so I changed cinderclient too. How do I get both landed, since if the cinderclient change isn't in place the nova tempest test in the gate will raise an exception? | 06:01 |
anteaya | jgriffith: you around still? | 06:02 |
anteaya | zhiyan: has a cinderclient question | 06:02 |
anteaya | or are you asking about a dependency? | 06:03 |
anteaya | you create a dependency between the two patches | 06:03 |
anteaya | so the cinderclient change needs to land first | 06:03 |
zhiyan | can i make it in different repo? | 06:04 |
anteaya | so your nova change depends on the cinderclient change | 06:04 |
anteaya | that is an interesting question | 06:04 |
anteaya | let me think for a moment | 06:04 |
openstackgerrit | Peter Liljenberg proposed a change to openstack-infra/jenkins-job-builder: Added support for JaCoCo plugin Publisher https://review.openstack.org/44705 | 06:04 |
anteaya | zhiyan: that is a good point, most dependencies I am familiar with are within the same repo | 06:05 |
anteaya | I am uncertain how to create a dependency relationship between repos | 06:05 |
yjiang5 | anteaya: possibly if you extend "slowing down" as 2 patch per hour, more person will jump into it when they wake up :) | 06:05 |
anteaya | oh I do hope so | 06:05 |
anteaya | more help is great | 06:06 |
anteaya | zhiyan: do you have patch numbers? | 06:06 |
anteaya | have you submitted either patch to gerrit yet? | 06:07 |
anteaya | we have 9 in post!! | 06:08 |
anteaya | yay 9 patches got in | 06:08 |
anteaya | it must have been the email | 06:08 |
anteaya | yay! | 06:08 |
*** dkliban has joined #openstack-infra | 06:09 | |
yjiang5 | anteaya: I pinged one of the cinder guys, not sure if he has any idea on it. | 06:11 |
*** dkranz has quit IRC | 06:11 | |
anteaya | yjiang5: great thank you | 06:11 |
zhiyan | anteaya: yes | 06:12 |
zhiyan | anteaya: 1 sec | 06:12 |
anteaya | k | 06:12 |
openstackgerrit | Peter Liljenberg proposed a change to openstack-infra/jenkins-job-builder: Added SBT builder support https://review.openstack.org/44685 | 06:13 |
zhiyan | anteaya: nova side: https://review.openstack.org/#/c/44817/ | 06:13 |
zhiyan | cinder side: https://review.openstack.org/#/c/44672/ | 06:13 |
* anteaya looks | 06:13 | |
zhiyan | anteaya: this is the nova side tempest log: http://logs.openstack.org/17/44817/1/check/gate-tempest-devstack-vm-full/b1f841c/logs/screen-n-cpu.txt.gz | 06:13 |
*** dkliban has quit IRC | 06:15 | |
zhiyan | anteaya: at 2013-09-03 09:24:27.621 you can see nova-compute throw an exception, since the cinderclient change is not in place there | 06:15 |
*** dkehn_ has joined #openstack-infra | 06:15 | |
anteaya | zhiyan: okay here is my first problem, the link to the blueprint doesn't have a blueprint at the other end: https://blueprints.launchpad.net/openstack/?searchtext=read-only-volumes | 06:15 |
anteaya | now yesterday was feature freeze, we are just working on getting the work merged | 06:16 |
anteaya | so it is possible there was a blueprint before but it has moved or been changed | 06:16 |
anteaya | due to feature freeze | 06:16 |
*** dkehn has quit IRC | 06:16 | |
anteaya | is that possible? | 06:16 |
zhiyan | anteaya: seems since https://blueprints.launchpad.net/cinder/+spec/read-only-volumes bp been marked 'Implemented' by john | 06:17 |
* anteaya thinks, but if the blueprint was for icehouse, it would be marked as such | 06:17 | |
zhiyan | anteaya: i don't think so sorry | 06:17 |
zhiyan | no, i believe john mark it Implemented just since cinder service side finished it...but not nova and *client | 06:18 |
zhiyan | anteaya: and i don't think this situation is a problem to my issue TBH.. | 06:18 |
anteaya | okay so that is the blueprint for cinder | 06:18 |
anteaya | is there a blueprint for nova for read only volumes? | 06:18 |
anteaya | I guess what I am saying is that if you are just introducing new work now, I don't think it will be accepted | 06:19 |
zhiyan | no, from that bp's whiteboard, you can see we put the all review item there | 06:19 |
*** winston-d has joined #openstack-infra | 06:19 | |
zhiyan | but not create any duplicated bp for other proj.. | 06:19 |
zhiyan | anteaya: thanks for you notice, i really like to have a try :) | 06:20 |
anteaya | okay the cinder blueprint mentions this nova patch: https://review.openstack.org/#/c/34722/ which is abandoned | 06:20 |
*** dkranz has joined #openstack-infra | 06:20 | |
zhiyan | anteaya: yes it is. just pls ignore it | 06:21 |
anteaya | zhiyan: I'm not trying to discourage you, I am glad you want to offer some work | 06:21 |
anteaya | I just don't see how this work can get it? | 06:21 |
anteaya | do you have a feature freeze exception from russelb? | 06:21 |
anteaya | s/can get it?/can get in | 06:22 |
zhiyan | anteaya: i will try. but currently i meet the gate testing issue. | 06:22 |
*** dkliban has joined #openstack-infra | 06:23 | |
anteaya | okay so you wrote this patch for read-only-volumes for cinder and it merged Aug 29: https://review.openstack.org/#/c/38322/ | 06:24 |
zhiyan | anteaya: i'd like resolve it firstly, and at same time to request FFE. But if it get -2 by FF is ok to me also, it will not affect me pass tempest... | 06:24 |
zhiyan | anteaya: yes | 06:24 |
anteaya | I'm getting a picture of the status | 06:25 |
zhiyan | anteaya: cool! | 06:25 |
zhiyan | anteaya: #38322 for cinder service side. | 06:25 |
anteaya | zhiyan: can you give me a time stamp for where I should look for the error in this log please: http://logs.openstack.org/17/44817/1/check/gate-tempest-devstack-vm-full/b1f841c/logs/screen-n-cpu.txt.gz | 06:26 |
zhiyan | anteaya: for cinderclient: https://review.openstack.org/#/c/44672/ https://review.openstack.org/#/c/45171/ | 06:26 |
zhiyan | anteaya: for novaclient: https://review.openstack.org/#/c/44674/ | 06:26 |
zhiyan | anteaya: for nova server side: https://review.openstack.org/#/c/44817/ | 06:26 |
zhiyan | anteaya: that's all for me currently. | 06:27 |
zhiyan | anteaya: it's 2013-09-03 09:24:27.621 | 06:27 |
winston-d | are you guys talking about the bug slowing down the gate? | 06:27 |
zhiyan | winston-d: no | 06:27 |
zhiyan | anteaya: actually you can find all the points by searching for the 'Attaching volume' keywords. | 06:28 |
anteaya | winston-d: hi, yes did you have a question | 06:28 |
anteaya | zhiyan: okay so you have a patch merged for cinder, you need a patch merged for nova, one for nova client and 2 for cinder client right? | 06:29 |
zhiyan | yesyes | 06:29 |
winston-d | i'm confused, yes, or no? | 06:29 |
zhiyan | winston-d: no | 06:30 |
anteaya | winston-d: yes | 06:30 |
zhiyan | winston-d: but i think anteaya like you ask question here :) | 06:30 |
anteaya | I am interested in hearing your thoughts on the bug | 06:30 |
anteaya | sorry zhiyan but that is my higher priority | 06:31 |
zhiyan | anteaya: sure thing, pls | 06:31 |
anteaya | winston-d: did you have thoughts on the bug? | 06:31 |
anteaya | or does it affect you? | 06:31 |
anteaya | zhiyan: okay so I found the error, what order does the gate want your patches? | 06:31 |
yjiang5 | anteaya: winston-d is cinder core, so possibly he has some idea also. | 06:32 |
anteaya | great | 06:32 |
anteaya | thank yjiang5 | 06:32 |
anteaya | yes I am keen to talk to winston-d | 06:32 |
zhiyan | anteaya: cinderclient first (#44672) then to nova (#44817) | 06:33 |
anteaya | okay | 06:33 |
winston-d | anteaya: ok. so i looked at the comment you added to https://bugs.launchpad.net/cinder/+bug/1220436. i failed to see the relationship between last two comments from you and cinder | 06:34 |
uvirtbot | Launchpad bug 1220436 in cinder "test_cinder_quota_class_show failes during gate jobs" [Critical,Triaged] | 06:34 |
anteaya | well since i don't know how to create a dependency between repositories, I suggest you get cinderclient through | 06:34 |
anteaya | and make a comment on both patches that they are dependent but you can't link them yet | 06:34 |
winston-d | anteaya: https://jenkins01.openstack.org/job/gate-tempest-devstack-vm-postgres-full/8488/console this failed at tempest.api.compute.images.test_images_oneserver.ImagesOneServerTestJSON.test_delete_image_that_is_not_yet_active[gate,negative] | 06:35 |
anteaya | and then when fungi is up ask him and see what he says | 06:35 |
anteaya | winston-d: StringException: Empty attachments: | 06:35 |
anteaya | that is the similarity | 06:35 |
winston-d | anteaya: and this https://jenkins01.openstack.org/job/gate-tempest-devstack-vm-postgres-full/8462/console failed at tempest.api.compute.images.test_images_oneserver.ImagesOneServerTestXML.test_delete_image_that_is_not_yet_active[gate,negative] | 06:35 |
*** dina_belova has quit IRC | 06:36 | |
anteaya | the understanding of the bug has changed since the bug was filed | 06:36 |
anteaya | the error StringException: Empty attachments: is showing up in many tests across many projects | 06:36 |
zhiyan | anteaya: okey, thx. | 06:36 |
anteaya | but it is cinder related | 06:36 |
winston-d | anteaya: well but those aren't test cases against Cinder, why do you think it has any relationship with cinder? | 06:36 |
*** alexpilotti has joined #openstack-infra | 06:36 | |
anteaya | j griffith says that StringException: Empty attachments: is the common element | 06:36 |
anteaya | winston-d: did you read the log attached to the email? | 06:37 |
anteaya | at the time stamp I suggested? | 06:37 |
winston-d | anteaya: IRC log? | 06:37 |
anteaya | it includes j griffith's explanation | 06:37 |
anteaya | http://lists.openstack.org/pipermail/openstack-dev/2013-September/014644.html | 06:38 |
anteaya | he says it better than I | 06:38 |
anteaya | I think it is related to cinder because j griffith told me it is related to cinder | 06:38 |
anteaya | that is the best I have | 06:38 |
*** UtahDave has quit IRC | 06:39 | |
*** yjiang5 has left #openstack-infra | 06:41 | |
winston-d | anteaya: i read it again just now. still don't understand how Cinder contributed to a failed Nova test case | 06:41 |
anteaya | okay | 06:41 |
anteaya | I'm sorry I have no better explanation | 06:41 |
winston-d | anteaya: anyway, i'll try to catch up with jgriffith | 06:41 |
anteaya | okay thanks | 06:41 |
winston-d | anteaya: thank you | 06:42 |
anteaya | I'm sure he will appreciate the extra eyes on this | 06:42 |
anteaya | np | 06:42 |
*** Ryan_Lane has quit IRC | 06:42 | |
*** sgviking has left #openstack-infra | 06:43 | |
*** vogxn has quit IRC | 06:43 | |
anteaya | yay 5 in post and the gate is at 27 | 06:44 |
*** SergeyLukjanov has quit IRC | 06:45 | |
*** dkehn has joined #openstack-infra | 06:47 | |
*** dkehn_ has quit IRC | 06:48 | |
*** dina_belova has joined #openstack-infra | 06:49 | |
*** Ryan_Lane has joined #openstack-infra | 06:49 | |
* ttx yawns | 06:49 | |
anteaya | morning ttx | 06:50 |
anteaya | I hope you slept well | 06:50 |
*** alexpilotti has quit IRC | 06:51 | |
ttx | I see the gate queue is not fully absorbed yet | 06:51 |
anteaya | you see correctly | 06:51 |
*** alexpilotti has joined #openstack-infra | 06:52 | |
anteaya | we had a bit of a slow down | 06:52 |
ttx | is the slowdown fixed now ? | 06:52 |
anteaya | http://lists.openstack.org/pipermail/openstack-dev/2013-September/014644.html | 06:52 |
anteaya | no | 06:52 |
ekarlso- | hey #infra! | 06:52 |
anteaya | I'll let you read the mailing list post and am here to answer questions | 06:52 |
anteaya | hey ekarlso- | 06:52 |
*** alexpilotti has quit IRC | 06:53 | |
*** dina_belova has quit IRC | 06:53 | |
*** jpich has joined #openstack-infra | 06:54 | |
ttx | anteaya: ok | 06:55 |
anteaya | okay great | 06:55 |
ttx | it seems to go slightly faster than 2/h lately ? | 06:55 |
*** harlowja_at_home has joined #openstack-infra | 06:55 | |
ttx | (judging by the trends graph) | 06:55 |
anteaya | we had 9 patches go in in a bunch | 06:55 |
anteaya | and then 5 | 06:55 |
anteaya | I celebrated | 06:56 |
anteaya | so yeah, since I sent the email we got lucky twice | 06:56 |
anteaya | let's hope that continues | 06:56 |
ttx | I might slack for a few more hours on some projects to allow catch-up | 06:57 |
harlowja_at_home | qq, that the infra guys might know about | 06:57 |
harlowja_at_home | http://logs.openstack.org/05/45105/1/gate/gate-tempest-devstack-vm-neutron/306dee5/logs/screen-key.txt.gz shows that keystone port already being in use, is this a known bug? | 06:57 |
anteaya | ttx thanks | 06:57 |
anteaya | yeah your patch was the first I have seen this myself | 06:59 |
anteaya | I was just looking at that actually, it just failed within the last hour | 06:59 |
ttx | anteaya: is there some action I should follow to spot false negatives and reverify them ? | 07:00 |
ttx | or you're on it ? | 07:00 |
anteaya | you are welcome to play along if you want | 07:00 |
anteaya | I can cover for a bit more but I am pretty tired | 07:00 |
anteaya | basically if you get a string exception: StringException: Empty attachments: | 07:01 |
ttx | Am happy to help... is there a convenient way to spot them, rather than watch the queue all the time ? | 07:01 |
*** harlowja_at_home has quit IRC | 07:01 | |
anteaya | that is 'reverify bug 1220436' | 07:01 |
uvirtbot | Launchpad bug 1220436 in cinder "test_cinder_quota_class_show failes during gate jobs" [Critical,Triaged] https://launchpad.net/bugs/1220436 | 07:01 |
anteaya | I am just watching the queue | 07:01 |
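The triage rule described above (a console log containing `StringException: Empty attachments:` gets a `reverify bug 1220436` comment) could be sketched roughly as follows. This is only an illustration of the manual process discussed in the channel: the helper name `reverify_comment` and the `KNOWN_FAILURES` table are hypothetical and not part of any OpenStack tool.

```python
# Hypothetical helper; the signature string and bug number are the ones
# quoted in the channel, everything else is an assumption for illustration.
KNOWN_FAILURES = {
    "StringException: Empty attachments:": 1220436,
}


def reverify_comment(console_text):
    """Return the Gerrit comment for a known false negative, or None."""
    for signature, bug in KNOWN_FAILURES.items():
        if signature in console_text:
            return "reverify bug %d" % bug
    # Unknown failure: needs a human to look at it before retrying.
    return None
```

A failure that does not match a known signature returns `None`, which mirrors the advice to investigate rather than blindly retry.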
*** odyssey4me has joined #openstack-infra | 07:02 | |
anteaya | ttx this is my current list of gate queue patches: http://paste.openstack.org/show/45793/ | 07:02 |
*** Ryan_Lane has quit IRC | 07:03 | |
anteaya | I check every 20 minutes or so and if the head patch has changed, follow up with the prior patches to see if they merged or failed | 07:03 |
ttx | anteaya: ok | 07:04 |
ttx | anteaya: go to bed | 07:04 |
anteaya | I am also seeing some setupclass and teardownclass timeout errors, I haven't filed them and I have no info on them, but they sound like flaky tests to me | 07:04 |
anteaya | thanks ttx | 07:04 |
anteaya | you're a gem | 07:04 |
* anteaya sends hugs | 07:04 | |
anteaya | good night | 07:04 |
ttx | gdnite | 07:04 |
*** anteaya has quit IRC | 07:04 | |
*** shardy_afk is now known as shardy | 07:05 | |
*** vogxn has joined #openstack-infra | 07:05 | |
*** xchu has quit IRC | 07:08 | |
*** Bada has joined #openstack-infra | 07:16 | |
*** xchu has joined #openstack-infra | 07:23 | |
*** Bada has quit IRC | 07:28 | |
yongli | hello, do we have enough time to merge all gating patches before H3 cut? | 07:34 |
ttx | yongli: yes, i'm trying to get the queue purged before I cut | 07:37 |
*** vogxn has quit IRC | 07:37 | |
yongli | ttx: i feel better you said that.. | 07:37 |
ttx | so probably in a few more hours for the projects which have patches in the queue | 07:38 |
*** hashar has joined #openstack-infra | 07:47 | |
*** boris-42 has quit IRC | 07:51 | |
*** hashar has quit IRC | 07:52 | |
*** mrmartin has joined #openstack-infra | 07:54 | |
*** vogxn has joined #openstack-infra | 07:55 | |
*** hashar_ has joined #openstack-infra | 07:57 | |
*** mkerrin has quit IRC | 07:59 | |
*** dizquierdo has joined #openstack-infra | 08:06 | |
*** alexpilotti has joined #openstack-infra | 08:07 | |
*** jhesketh_ has quit IRC | 08:11 | |
*** alexpilotti has quit IRC | 08:14 | |
*** dklyle has quit IRC | 08:17 | |
shardy | Anyone know what's up with zuul jobs which have tasks in CANCELED state? | 08:25 |
shardy | do they need kicking with reverify? | 08:25 |
*** vogxn has quit IRC | 08:25 | |
*** niska has quit IRC | 08:26 | |
lifeless | shardy: I know, they do not. | 08:28 |
shardy | lifeless: k, thanks | 08:28 |
lifeless | shardy: it's the state that speculative jobs go into when a precondition fails. | 08:28 |
lifeless | shardy: swift is having trouble and throwing out lots of patches in the gate; all the dependent jobs then get cancelled and requeued | 08:28 |
lifeless | at least, AIUI. | 08:29 |
shardy | lifeless: aha, OK, so they'll get rescheduled when the dependent patches get merged, gotcha | 08:29 |
lifeless | when gate resources free up actually | 08:29 |
lifeless | so the dependencies are temporal, not code wise | 08:30 |
*** niska has joined #openstack-infra | 08:30 | |
hashar_ | shardy: I think the jobs are CANCELED when a dependent change is failing. | 08:30 |
lifeless | e.g. A, B, C are all set to land, A B C all start up tests in parallel. | 08:30 |
hashar_ | shardy: zuul would then cancel the tests for all child changes since they will most probably fail as well. | 08:30 |
lifeless | A fails, B and C get set to CANCELLED | 08:30 |
lifeless | then it starts over with just B and C | 08:30 |
hashar_ | a change gets removed (and its builds canceled) if it can no longer merge on the branch | 08:31 |
shardy | hashar_, lifeless: OK, thanks for the info, AFAICS the changes depended on are now merged, so I guess I just make some more coffee and wait :) | 08:32 |
lifeless | shardy: like I said, it's not git dependencies | 08:35 |
lifeless | shardy: its temporal - the time position in the queue. | 08:35 |
lifeless | shardy: that is also constrained by git dependencies | 08:36 |
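The A, B, C example above can be modelled with a toy simulation. This is a minimal sketch under a strong simplifying assumption: each change passes or fails on its own, regardless of what is ahead of it in the queue. The function `run_gate` is hypothetical and is not Zuul's actual implementation.

```python
def run_gate(queue, passes):
    """Simulate a dependent gate queue (toy model, not Zuul itself).

    queue  -- change names in queue order, head first
    passes -- dict mapping change name to whether its own tests pass
    Returns (merged, rejected, log).
    """
    merged, rejected, log = [], [], []
    pending = list(queue)
    while pending:
        # Every pending change is tested in parallel on top of those ahead.
        failed_at = next(
            (i for i, change in enumerate(pending) if not passes[change]), None
        )
        if failed_at is None:
            merged.extend(pending)
            log.append(("merged", list(pending)))
            break
        ahead = pending[:failed_at]
        bad = pending[failed_at]
        behind = pending[failed_at + 1:]
        merged.extend(ahead)
        rejected.append(bad)
        if ahead:
            log.append(("merged", ahead))
        log.append(("failed", [bad]))
        if behind:
            # Jobs behind the failure were built on a now-invalid state:
            # they are cancelled and started over without the failed change.
            log.append(("cancelled and restarted", behind))
        pending = behind
    return merged, rejected, log
```

With `run_gate(["A", "B", "C"], {"A": False, "B": True, "C": True})`, A is rejected, B and C are cancelled and restarted, and then both merge, which matches the sequence described in the channel.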
shardy | lifeless: are there any docs which describe how this all works? | 08:37 |
* shardy finds the zuul manual | 08:38 | |
ttx | shadower: yeah they are cancelled because one job above them failed | 08:39 |
ttx | arrrh | 08:39 |
ttx | shardy: ^ | 08:39 |
hashar_ | shardy: http://ci.openstack.org/zuul/gating.html | 08:39 |
lifeless | shardy: there are docs, and zuul is fairly new so this should be well documented | 08:39 |
hashar_ | shardy: though that gating doc might need to be improved with more examples :] | 08:39 |
*** paul-- has quit IRC | 08:40 | |
ttx | looks like the gate is borked right now. | 08:40 |
hashar_ | the source doc is in Zuul code ssh://review.openstack.org:29418/openstack-infra/zuul.git file ./doc/source/gating.rst | 08:40 |
ttx | tests succeed but jobs fail | 08:40 |
shardy | ttx: I'm looking at https://review.openstack.org/#/c/44339/ | 08:41 |
shardy | which seems wedged in zuul despite the parent changes all getting merged | 08:41 |
ttx | Looks like it's the log uploading that fails | 08:41 |
ttx | shardy: looking | 08:42 |
shardy | ttx: it's the last patch we need before branching | 08:42 |
ttx | shardy: yeah that one was cancelled because a patch higher up in the queue failed to merge | 08:42 |
ttx | it will be restarted automatically | 08:43 |
ttx | problem is... there seem to be some condition right now that prevents any test from succeeding | 08:43 |
ttx | http://logs.openstack.org/69/44869/3/gate/gate-neutron-python26/8ddbd62/console.html | 08:43 |
lifeless | ttx: you're looking in jenkins ? | 08:43 |
shardy | ttx: ok, cool, thanks | 08:43 |
ttx | lifeless: the tests seem to be succeeding... but then the job fails after trying to upload logs | 08:43 |
ttx | shardy: looks like I'll have to delay FF to account for those last-minute gate fails | 08:44 |
lifeless | ttx: mmm, not sure | 08:44 |
lifeless | 2013-09-05 08:18:31.481 | + .tox/py26/bin/python /usr/local/jenkins/slave_scripts/subunit2html.py ./subunit_log.txt testr_results.html | 08:44 |
lifeless | 2013-09-05 08:31:57.501 | Build timed out (after 40 minutes). Marking the build as failed. | 08:44 |
lifeless | 2013-09-05 08:31:57.583 | /usr/local/jenkins/slave_scripts/run-tox.sh: line 65: 16162 Terminated .tox/$venv/bin/python /usr/local/jenkins/slave_scripts/subunit2html.py ./subunit_log.txt testr_results.html | 08:44 |
lifeless | 2013-09-05 08:31:57.583 | + gzip -9 ./subunit_log.txt | 08:44 |
lifeless | ttx: ^ thats the thing | 08:44 |
ttx | lifeless: oh. | 08:44 |
lifeless | ttx: the build is marked as failed, and log copying happens to let you diagnose it ;) | 08:44 |
ttx | so it's subunit2html.py which times out ? | 08:45 |
lifeless | looks like it. Which is super odd. | 08:45 |
lifeless | only a 4.4M subunit file | 08:46 |
ttx | lifeless: I got a couple of fails like this already | 08:46 |
*** paul-- has joined #openstack-infra | 08:46 | |
lifeless | so I really can't imagine that taking 12m to process | 08:46 |
ttx | so I assumed it was a permanent condition | 08:46 |
lifeless | but, Alex_Gaynor was saying that subunit2html was taking 10m | 08:46 |
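To put a number on how long the report step had been running, the two console timestamps pasted above (the `subunit2html.py` invocation and the build timeout) can be subtracted. This is a small sketch using only those two quoted timestamps; it gives a lower bound on the step's duration, since the process was killed rather than allowed to finish.

```python
from datetime import datetime

# Timestamp layout used by the console log lines quoted in the channel.
FMT = "%Y-%m-%d %H:%M:%S.%f"

started = datetime.strptime("2013-09-05 08:18:31.481", FMT)  # subunit2html.py starts
killed = datetime.strptime("2013-09-05 08:31:57.501", FMT)   # build timeout fires

elapsed = killed - started
minutes, seconds = divmod(elapsed.total_seconds(), 60)
print("subunit2html.py ran for at least %d min %.0f s" % (minutes, seconds))
```

That is a little over 13 minutes spent on report generation before the 40-minute build limit was hit.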
ttx | ok, will retry those jobs, it's sane to assume that those are false negatives | 08:47 |
* lifeless pulls it down to poke at | 08:47 | |
*** zhiyan has left #openstack-infra | 08:47 | |
ttx | lifeless: i'll file a bug | 08:47 |
lifeless | [ <=> ] 35,316,289 1.17MB/s <-- odd | 08:47 |
lifeless | downloading... http://logs.openstack.org/69/44869/3/gate/gate-neutron-python26/8ddbd62/subunit_log.txt.gz | 08:48 |
ttx | https://bugs.launchpad.net/openstack-ci/+bug/1221094 | 08:51 |
uvirtbot | Launchpad bug 1221094 in openstack-ci "Gate tests fail with subunit2html.py time out" [Undecided,New] | 08:51 |
*** vogxn has joined #openstack-infra | 08:51 | |
hashar_ | lifeless: your client might be uncompressing them on the fly ? The .gz is 4.4MB apparently | 08:53 |
lifeless | hashar_: I know, wget did that | 08:54 |
lifeless | hashar_: bad TE vs CE on the server config. | 08:54 |
hashar_ | curl does the same :( | 08:54 |
lifeless | ok, so I can reproduce this | 09:00 |
lifeless | processing a lot of unencapsulated text through the v2 parser is tickling the perf issue | 09:01 |
*** derekh has joined #openstack-infra | 09:02 | |
*** yongli is now known as yongli_going_hom | 09:07 | |
*** hashar_ is now known as hashar | 09:07 | |
*** kiall has quit IRC | 09:10 | |
*** xchu has quit IRC | 09:14 | |
ttx | shardy: will cut for Heat when 44339 merges | 09:16 |
*** kiall has joined #openstack-infra | 09:18 | |
*** jhesketh__ has joined #openstack-infra | 09:26 | |
*** mrmartin has quit IRC | 09:27 | |
*** hashar has quit IRC | 09:44 | |
*** hashar has joined #openstack-infra | 09:47 | |
shardy | ttx: Ok, sounds good, thanks | 09:52 |
*** ruhe has joined #openstack-infra | 09:53 | |
ttx | wow, now if that neutron python-26 job succeeds we have a nice queue of 10 patches all landing | 09:56 |
* ttx prays | 09:56 | |
ttx | come on, Murphy. Be good for once | 09:57 |
ttx | YAY | 09:58 |
*** dizquierdo has left #openstack-infra | 09:58 | |
ttx | STRIKE 11 | 09:58 |
ttx | shardy: got 44339 in. Will cut for you in a minute | 09:59 |
*** vogxn has quit IRC | 09:59 | |
ttx | I just need to finish cleaning up the blood from that chicken | 10:00 |
shardy | ttx: sounds good, and ??? | 10:01 |
ttx | I had to pray to some rather fringe gods to make that strike 11 happen | 10:01 |
shardy | haha ;) | 10:01 |
* ttx will be in New Orleans in a few weeks to perfect that technique | 10:02 | |
* shardy has been doing a patch-dance to invoke the mighty zuul ;) | 10:03 | |
ttx | shardy: can I move all your h3 bugs to rc1 ? | 10:03 |
shardy | ttx: yes, please do | 10:03 |
ttx | (i'll let you mark parallel-delete implemented) | 10:04 |
ttx | shardy: ^ | 10:04 |
shardy | ttx: thanks, was just about to do that ;) | 10:04 |
shardy | ttx: done | 10:05 |
ttx | shardy: you're all set | 10:12 |
ttx | branch cut, feature-frozen | 10:12 |
shardy | ttx: thanks! | 10:12 |
*** heyongli has joined #openstack-infra | 10:19 | |
*** heyongli is now known as heyongli-home | 10:19 | |
*** ruhe has quit IRC | 10:24 | |
*** mkerrin has joined #openstack-infra | 10:25 | |
heyongli-home | seems Zuul gating now works very well, :) | 10:26 |
*** dizquierdo has joined #openstack-infra | 10:28 | |
*** hashar has quit IRC | 10:30 | |
lifeless | clarkb: jeblair: I suspect that having the puppet dashboard be public access will be disclosing password tokens and so forth | 10:31 |
lifeless | separately, nuts - getting a 403 from jjb | 10:36 |
*** dims has joined #openstack-infra | 10:37 | |
lifeless | ah csrf token support missing in jjb | 10:39 |
*** tian has quit IRC | 10:42 | |
lifeless | tomorrow I will pick this up - I'm up to: 2013-09-05 06:00:00,003 ERROR zuul.IndependentPipelineManager: Unable to find change queue for project testing-cabal/testtools | 10:44 |
lifeless | on zuul. | 10:44 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Improve Jenkins documentation. https://review.openstack.org/45215 | 10:46 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Docs on bringing up Jenkins in new infrastructures. https://review.openstack.org/45216 | 10:46 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Document spinning up a derived zuul. https://review.openstack.org/45164 | 10:46 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Phase 3 infra bootstrap docs: gerrit. https://review.openstack.org/44970 | 10:46 |
*** ruhe has joined #openstack-infra | 10:52 | |
*** pcm_ has joined #openstack-infra | 11:05 | |
*** dprince has joined #openstack-infra | 11:06 | |
*** pcm_ has quit IRC | 11:07 | |
*** pcm_ has joined #openstack-infra | 11:07 | |
*** hashar has joined #openstack-infra | 11:22 | |
*** nosnos has quit IRC | 11:24 | |
*** dhellmann_ is now known as dhellmann | 11:33 | |
*** ruhe has quit IRC | 11:34 | |
*** pcm_ has quit IRC | 11:34 | |
*** pcrews has quit IRC | 11:37 | |
*** ruhe has joined #openstack-infra | 11:37 | |
*** dina_belova has joined #openstack-infra | 11:38 | |
*** rfolco has joined #openstack-infra | 11:40 | |
*** kiall has quit IRC | 11:43 | |
*** ArxCruz has joined #openstack-infra | 11:44 | |
*** pcrews has joined #openstack-infra | 11:47 | |
*** ruhe has quit IRC | 11:57 | |
*** yaguang has quit IRC | 12:01 | |
*** boris-42 has joined #openstack-infra | 12:04 | |
*** kiall has joined #openstack-infra | 12:07 | |
*** ruhe has joined #openstack-infra | 12:14 | |
*** dina_belova has quit IRC | 12:16 | |
*** derekh has quit IRC | 12:31 | |
*** zul has quit IRC | 12:32 | |
*** zul has joined #openstack-infra | 12:37 | |
*** dina_belova has joined #openstack-infra | 12:41 | |
*** pcm_ has joined #openstack-infra | 12:46 | |
*** derekh has joined #openstack-infra | 12:46 | |
*** hashar has quit IRC | 12:49 | |
*** lcestari has joined #openstack-infra | 12:50 | |
sdague | hey folks, some of the python 26 tests for neutron are failing on the subunit processing - http://logs.openstack.org/52/45152/1/gate/gate-neutron-python26/216e697/console.html | 12:51 |
sdague | any idea what the deal is? | 12:51 |
*** sandywalsh has joined #openstack-infra | 12:51 | |
ttx | sdague: filed a bug for it, lifeless started to look | 12:52 |
ttx | https://bugs.launchpad.net/openstack-ci/+bug/1221094 | 12:52 |
sdague | so the issue is that subunit2html processing on neutron takes 9 - 12 minutes | 12:52 |
uvirtbot | Launchpad bug 1221094 in openstack-ci "Gate tests fail with subunit2html.py time out" [Undecided,New] | 12:52 |
ttx | got a couple retries on that one | 12:52 |
*** hashar has joined #openstack-infra | 12:52 | |
sdague | yeh, don't retry, the short term fix is to increase job timeout | 12:52 |
*** weshay has joined #openstack-infra | 12:53 | |
ttx | <lifeless> ok, so I can reproduce this | 12:53 |
ttx | <lifeless> processing a lot of unencapsulated text through the v2 parser is tickling the perf issue | 12:53 |
sdague | ttx: ok, cool | 12:53 |
sdague | yeh, the neutron jobs have 15k unit tests | 12:53 |
sdague | but the reality is right now, 25% of the run length of the neutron unit tests jobs is the subunit report | 12:54 |
ttx | ha. ha. | 12:54 |
*** portante|afk is now known as portante | 12:54 | |
*** heyongli-home has quit IRC | 12:55 | |
*** vogxn has joined #openstack-infra | 12:58 | |
openstackgerrit | Sean Dague proposed a change to openstack-infra/config: up python 26 jobs to 60 minute time outs https://review.openstack.org/45228 | 12:59 |
sdague | jeblair, fungi, mordred, clarkb: that would actually help quite a bit on gate throughput | 13:00 |
ArxCruz | ALL: How can I connect zuul to jenkins in my own environment? I have both configured and with jenkins-job-builder installed, but I have no idea how to connect each other :/ | 13:01 |
*** ruhe has quit IRC | 13:02 | |
*** gordc has joined #openstack-infra | 13:04 | |
*** prad has joined #openstack-infra | 13:04 | |
hashar | ArxCruz: depends on your version of Zuul :-) | 13:05 |
ArxCruz | hashar: from git | 13:05 |
hashar | ArxCruz: 1.2 uses the Jenkins API to communicate, you need to configure a user in Jenkins (i.e. zuul-bot) and fill in the server, username and apikey in zuul.conf [jenkins] section. | 13:05 |
hashar | ArxCruz: the recent versions of Zuul do not use the Jenkins API. Instead they use gearman as middleware. | 13:06 |
hashar | ArxCruz: you will need the gearman Jenkins plugin | 13:06 |
ArxCruz | hashar: i have gearman jenkins plugin installed | 13:06 |
hashar | ArxCruz: http://ci.openstack.org/zuul/zuul.html#gearman | 13:06 |
*** afazekas has joined #openstack-infra | 13:06 | |
ArxCruz | so, if i understand right, isn't zuul who communicate with jenkins, it's the opposite right ? | 13:06 |
ArxCruz | so, I need to have gearman installed in my jenkins machine | 13:07 |
hashar | ArxCruz: so apparently under [gearman] section of zuul.conf, you need to specify server and port of your jenkins | 13:07 |
hashar | i think the plugin does provide a gearman server | 13:07 |
hashar | ArxCruz: http://amo-probos.org/post/15 shows the overall layout | 13:08 |
*** amotoki has joined #openstack-infra | 13:08 | |
hashar | ah maybe that is zuul shipping the gearman server and Jenkins connect to it | 13:08 |
*** gordc has left #openstack-infra | 13:08 | |
hashar | yeah Zuul has one [gearman_server] start=true | 13:09 |
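The layout hashar arrives at (Zuul starts its own gearman server, and the Jenkins gearman plugin connects to it) corresponds to two sections in `zuul.conf`. The sketch below parses a sample file with Python's standard `configparser` to show the shape. The section and option names are the ones mentioned in the discussion; the address and port values are placeholders (4730 is the conventional gearman port), and the helper `gearman_settings` is hypothetical.

```python
import configparser

# Placeholder values; only the section/option names come from the discussion.
SAMPLE_ZUUL_CONF = """
[gearman]
server = 127.0.0.1
port = 4730

[gearman_server]
start = true
"""


def gearman_settings(text):
    """Extract the gearman-related settings from zuul.conf-style text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return {
        "server": parser.get("gearman", "server"),
        "port": parser.getint("gearman", "port", fallback=4730),
        # True means Zuul launches its built-in gearman server.
        "start_builtin_server": parser.getboolean(
            "gearman_server", "start", fallback=False
        ),
    }


settings = gearman_settings(SAMPLE_ZUUL_CONF)
```

In this arrangement the Jenkins side would be pointed at the same host and port, so a firewall blocking that port between the two machines would produce exactly the kind of "cannot connect" symptom being debugged here.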
*** jpeeler has quit IRC | 13:12 | |
*** jpeeler has joined #openstack-infra | 13:12 | |
*** dims has quit IRC | 13:13 | |
*** dims has joined #openstack-infra | 13:15 | |
ArxCruz | hashar: cool, thanks, I will investigate, it seems some silly firewall rule, ipv6 is enabled and the /etc/init.d/iptables-persistent is failing | 13:16 |
*** adalbas has joined #openstack-infra | 13:17 | |
hashar | :( | 13:17 |
*** w__ is now known as olaph | 13:18 | |
*** anteaya has joined #openstack-infra | 13:25 | |
*** dizquierdo has left #openstack-infra | 13:25 | |
sdague | so is it just me, or has gerrit gotten a little slow? | 13:26 |
anteaya | is gerrit slow? | 13:26 |
anteaya | I just signed in a opened my page, what part of using gerrit is slow for you sdague? | 13:28 |
anteaya | s/a/and | 13:28 |
sdague | review push | 13:28 |
anteaya | ah okay | 13:28 |
anteaya | I have nothing to push so will take your word for it | 13:28 |
* anteaya reads backscroll | 13:29 | |
*** leifmadsen has quit IRC | 13:29 | |
ttx | anteaya: the cinder issue is mostly gone | 13:29 |
ArxCruz | hashar: it seems puppet doesn't install gearman in zuul, or zuul have an internal gearman ? | 13:29 |
anteaya | ArxCruz: zuul is the communicator | 13:29 |
ttx | anteaya: got a strike 11 at some point | 13:29 |
anteaya | ttx you are the man to make the call | 13:30 |
anteaya | ttx and thanks | 13:30 |
ArxCruz | anteaya: jenkins communicate with zuul or zuul communicates with jenkins ? | 13:30 |
*** clayb has joined #openstack-infra | 13:30 | |
ArxCruz | it seems it's jenkins -> zuul | 13:30 |
*** clayb has left #openstack-infra | 13:30 | |
anteaya | zuul communicates with jenkins | 13:30 |
anteaya | zuul communicates with gerrit | 13:31 |
anteaya | zuul is the layer that takes the information from gerrit and tells jenkins to run a job | 13:31 |
anteaya | then it takes the information from jenkins and puts it in gerrit | 13:31 |
anteaya | zuul was created because jenkins and gerrit don't talk to each other | 13:31 |
ArxCruz | anteaya: I understand, so, I've installed zuul using puppet and openstack puppet configuration | 13:32 |
ArxCruz | in zuul.conf.erb there's no [jenkins] entry | 13:32 |
sdague | anteaya: can you take a look at this - https://review.openstack.org/45228 | 13:32 |
* fungi checks the overnight damage | 13:33 | |
sdague | that would actually solve a lot of our recent resets | 13:33 |
sdague | fungi: you too :) | 13:33 |
sdague | well I found a new one (which was probably part of the issue yesterday) | 13:33 |
anteaya | ttx I recognize the patch at the head of the queue from last night, poor 44777,1 | 13:33 |
sdague | timing out on neutron 26 runs | 13:33 |
*** thomasm has joined #openstack-infra | 13:33 | |
fungi | holy scrollback, batman | 13:34 |
ttx | fungi: the best part is when I sacrificed a chicken | 13:34 |
openstackgerrit | Andreas Jaeger proposed a change to openstack-infra/config: Add new manual "Cloud Administrator Guide" https://review.openstack.org/45229 | 13:34 |
* ttx considers moving real logs (wooden ones) as a break | 13:34 | |
anteaya | thanks sdague, so should that tie off the setupclass and teardownclass timeouts? | 13:34 |
anteaya | ttx doing firewood is very relaxing | 13:35 |
anteaya | the axe and woodpile are never far away from me | 13:35 |
sdague | anteaya: I don't know, do you have a link to that issue | 13:35 |
ttx | the axe is never away from me | 13:35 |
anteaya | sdague: just the channel logs from last night | 13:35 |
anteaya | ttx ha ha ha | 13:35 |
anteaya | sdague: sorry, I wasn't thinking the clearest, I never filed a bug | 13:36 |
sdague | anteaya: no, this is a different issue | 13:36 |
anteaya | sdague: oh okay, well if you think this is the right approach, I am behind you | 13:36 |
anteaya | thanks for the small patch :D | 13:36 |
sdague | basically the neutron unit tests have grown large enough (15k), that they take a legitimate 32 minutes on their own | 13:36 |
*** sergmelikyan has joined #openstack-infra | 13:37 | |
sdague | and the report processing slowness issue is tripping them over the 40 min limit | 13:37 |
*** bashok_ has joined #openstack-infra | 13:37 | |
anteaya | ttx I recognize 35189 from last night as well | 13:37 |
anteaya | sdague: ah okay | 13:37 |
anteaya | but we are getting some patches that are in because you haven't cut off yet: https://review.openstack.org/#/c/40229/ came in an hour ago | 13:39 |
anteaya | the two patches that catch my eye in the queue are the first two, they are from last night/yesterday | 13:39 |
anteaya | after that your call, ttx, there is some new stuff in there today | 13:40 |
ttx | anteaya: I have a list with the remaining PTLs (nova/neutron) | 13:40 |
ttx | waiting for a few more and we are done | 13:41 |
anteaya | great | 13:41 |
anteaya | wonderful | 13:41 |
anteaya | limping to the finish line | 13:41 |
ttx | neutron ETA 6min in best case scenario | 13:41 |
ttx | nova ETA 23+min | 13:41 |
annegentle | hey infra | 13:42 |
sdague | ttx: so I doubt those neutron patches are going to land | 13:42 |
anteaya | ttx great thanks | 13:42 |
sdague | I think we're basically hitting that timeout nearly 100% of the time now | 13:42 |
annegentle | I wanted to give you all a heads up, we're doing some moving around in the openstack-manuals repo and it'll mean changes to where pom.xml files are stored | 13:42 |
*** burt has joined #openstack-infra | 13:42 | |
anteaya | hey annegentle | 13:42 |
anteaya | sdague: ah okay | 13:43 |
ttx | sdague: the subunit thing ? | 13:43 |
annegentle | We're going to do the work this weekend before the boot camp | 13:43 |
sdague | yep | 13:43 |
ttx | sdague: beh | 13:43 |
annegentle | will you have reviewers around who can help push stuff through to avoid publishing delay? | 13:43 |
anteaya | annegentle: do you need anything from an infra core or just in case we spot something unusual? | 13:43 |
sdague | fungi: any chance you want to rush through that change - https://review.openstack.org/#/c/45228/ ? | 13:43 |
ttx | sdague: we've got some neutron commits landing though lately | 13:43 |
sdague | ttx: this morning? | 13:43 |
annegentle | anteaya: I'll need cores to push through changes to build jobs based on pom.xml locations changing | 13:44 |
ttx | sdague: https://review.openstack.org/#/c/43558/ | 13:44 |
anteaya | ah okay, well so far myself and fungi are up and he is reading backscroll | 13:44 |
sdague | ttx: it's a race, maybe they'll get through, but we're on such a hairy edge of timing there, it's a coin flip | 13:44 |
ttx | sdague: https://review.openstack.org/#/c/35624/ etc | 13:44 |
annegentle | anteaya: ok cool. It's basically a "flattening" so we'll be eliminating "doc/src/docbkx" for the most part | 13:44 |
anteaya | so I'll let those reading get back to you when they can, I'll point them to your request | 13:44 |
annegentle | anteaya: thanks! | 13:44 |
anteaya | awesome for flattening | 13:44 |
anteaya | my pleasure | 13:45 |
ttx | sdague: I'll go sacrificing another chicken | 13:45 |
sdague | ttx: gate-neutron-python26 SUCCESS in 39m 48s | 13:45 |
sdague | so that one had 12s to spare :) | 13:45 |
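The numbers sdague quotes can be checked with a few lines of arithmetic. The figures (a 40 minute limit, roughly 32 minutes of real test time, a 39m 48s run) come from the log above; the function name is just for illustration.

```python
from datetime import timedelta

def headroom(duration, limit):
    """Time left before the job would have been killed by the timeout."""
    return limit - duration

limit = timedelta(minutes=40)            # job timeout quoted in the log
tests = timedelta(minutes=32)            # time the unit tests legitimately need
run = timedelta(minutes=39, seconds=48)  # the run that passed

# Budget left for everything that is not the tests themselves,
# such as the slow report processing step.
print(headroom(tests, limit).total_seconds() / 60)  # 8.0 minutes

# Margin by which the passing run beat the timeout.
print(headroom(run, limit).total_seconds())         # 12.0 seconds
```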
ttx | not sure frozen chickens will do though | 13:45 |
anteaya | ttx must have been a heavy day for chicken | 13:45 |
sdague | ttx: you need to upgrade to goats | 13:45 |
anteaya | give it a shot, you got this far | 13:45 |
ttx | anteaya: all my live ones are gone now | 13:45 |
anteaya | they didn't make it through the night | 13:46 |
*** tstevenson has joined #openstack-infra | 13:47 | |
sdague | ttx: yeh, nope, reset | 13:47 |
ttx | sdague: told ya. no blood, unhappy gods | 13:47 |
anteaya | :( | 13:48 |
sdague | yeh, we just need to push up the timeout | 13:48 |
sdague | otherwise every neutron change in the gate at this point is basically a timebomb | 13:48 |
sdague | through no fault of their own | 13:48 |
openstackgerrit | Andreas Jaeger proposed a change to openstack-infra/config: Add new manual "Cloud Administrator Guide" https://review.openstack.org/45229 | 13:49 |
anteaya | once fungi gets finished backscroll perhaps he can shoe horn it in | 13:50 |
zul | ttx: maybe you just need a bigger pool of blood | 13:50 |
anteaya | morning zul | 13:51 |
*** mriedem has joined #openstack-infra | 13:51 | |
zul | hey anteaya | 13:51 |
ttx | zul: is that a candidacy ? | 13:52 |
zul | ttx: not my blood of course but i can come up with some suggestions | 13:52 |
*** adalbas has quit IRC | 13:53 | |
* ttx takes a quick break | 13:53 | |
anteaya | here is another keystone did not start error: https://jenkins02.openstack.org/job/gate-tempest-devstack-vm-full/8109/console | 13:57 |
fungi | ttx: i saw the ritual--the testing gods must have been pleased indeed | 13:58 |
anteaya | harlowja: what came of your keystone error of about 7 hours ago? | 13:58 |
*** thedodd has joined #openstack-infra | 13:58 | |
*** pblaho has quit IRC | 13:58 | |
*** pblaho has joined #openstack-infra | 13:59 | |
*** adalbas has joined #openstack-infra | 13:59 | |
fungi | yeah, i think the scrollback is accumulating faster than i can read it | 14:00 |
sdague | fungi: can I get you to jump to the end about the neutron test timeouts? :) | 14:01 |
*** yaguang has joined #openstack-infra | 14:01 | |
anteaya | fungi yes | 14:01 |
*** pblaho has quit IRC | 14:03 | |
fungi | sdague: caught up... so by rush through you just mean approve without additional +2's. i suppose it's warranted... doing | 14:05 |
*** AJaeger has joined #openstack-infra | 14:06 | |
sergmelikyan | We have a problem with tags in one of the murano repositories. Previously stackforge/murano-common was hosted on GitHub and had version tags (0.2 and 0.2.1). After the move to stackforge these tags moved too, and now that we are releasing version 0.2 we cannot reassign the tags to the correct revision (they already exist and we do not have rights to force). Could someone remove these tags for us? | 14:06 |
anteaya | sdague: I found four setup/teardown class failures with bug reports: https://bugs.launchpad.net/tempest/+bug/1221237 https://bugs.launchpad.net/tempest/+bug/1218812 https://bugs.launchpad.net/tempest/+bug/1218279 https://bugs.launchpad.net/tempest/+bug/1217734 well those are all setup class bugs, I did see some teardown class timeout errors last night | 14:06 |
uvirtbot | Launchpad bug 1221237 in tempest "FAIL: setUpClass (tempest.api.volume.test_volumes_actions.VolumesActionsTest)" [Undecided,New] | 14:06 |
EmilienM | Hi, could someone review https://review.openstack.org/#/c/45229/ to have a new guide in the manuals :) | 14:06 |
*** krtaylor has joined #openstack-infra | 14:06 | |
sdague | fungi: yep | 14:07 |
sdague | basically until that goes in just about every neutron job is going to cause a reset in the gate | 14:07 |
sdague | the ones that were making it through previously were doing so with 12s to spare before reset | 14:07 |
sdague | but I haven't seen one make it in the last 90 minutes | 14:07 |
fungi | sergmelikyan: you will need to release 0.2.2 or something. removing/replacing tags doesn't work so well due to issues with them being cached on the test-running and releasing infrastructure | 14:07 |
*** yjiang5 has joined #openstack-infra | 14:08 | |
sdague | anteaya: thanks, I'll take a look | 14:08 |
openstackgerrit | A change was merged to openstack-infra/config: up python 26 jobs to 60 minute time outs https://review.openstack.org/45228 | 14:08 |
sdague | oh, it's that issue again | 14:08 |
anteaya | k | 14:08 |
fungi | sergmelikyan: we used to do it, and it makes a mess to clean up. also it's kind of rewriting history, which isn't great to downstream consumers of your code | 14:08 |
anteaya | that issue? | 14:08 |
sdague | this is the cinder scheduler race I think | 14:08 |
anteaya | is it the same issue? | 14:09 |
anteaya | dang | 14:09 |
*** kiall has quit IRC | 14:09 | |
*** mrodden has joined #openstack-infra | 14:09 | |
*** jhesketh__ has quit IRC | 14:10 | |
sdague | yeh, I'll look into a bit more for real in a minute | 14:11 |
sdague | fungi: so will the timeouts be effective immediately? or is there a restart of the jenkins that need to happen? | 14:11 |
fungi | sdague: this will get jjb'd into the jenkins definitions for those jobs by puppet, no restarts needed | 14:12 |
jeblair | anteaya, fungi, ttx, sdague: good morning; anything i need to know? scrollback is immense. | 14:12 |
sergmelikyan | fungi these tags migrated a few days ago, and the CI infrastructure never ran the release pipeline for them | 14:12 |
fungi | though already running jobs may not be affected, so it'll take the next gate reset to really go into effect | 14:12 |
anteaya | jeblair: morning | 14:12 |
fungi | sergmelikyan: i can probably manually trigger release jobs for them in a little while. point is the systems retrieve and cache tags, and don't automatically refresh them if their names are the same so changing the tags to point to different commits doesn't work out so well | 14:14 |
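A toy model of the caching behaviour fungi describes, assuming a cache keyed purely by tag name that never overwrites an entry it already holds. The class is hypothetical and does not represent git or the actual infrastructure code; the commit ids are placeholders.

```python
class TagCache:
    """Remembers the first commit seen for each tag name."""
    def __init__(self):
        self._tags = {}

    def fetch(self, remote_tags):
        # An existing name is never refreshed, even if the remote
        # now points that name at a different commit.
        for name, commit in remote_tags.items():
            self._tags.setdefault(name, commit)

    def resolve(self, name):
        return self._tags[name]


cache = TagCache()
cache.fetch({"0.2": "old-commit"})
# The tag is moved on the remote and a new tag name is also added.
cache.fetch({"0.2": "new-commit", "0.2.2": "new-commit"})

print(cache.resolve("0.2"))    # old-commit (the move is invisible)
print(cache.resolve("0.2.2"))  # new-commit (a new name is picked up)
```

Under that assumption a moved tag is never noticed, while a fresh tag name such as the 0.2.2 fungi suggested earlier is.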
anteaya | some slowdown in the gate due to a cinder scheduling bug: http://lists.openstack.org/pipermail/openstack-dev/2013-September/014644.html | 14:14 |
anteaya | ttx has got a plan for when he is going to cut off, he has already cut off heat | 14:14 |
ttx | anteaya: I have already cut everyone but neutron/nova | 14:15 |
sdague | fungi: ok, well at least only one more timebomb then | 14:15 |
jeblair | sergmelikyan: the best thing to do in this case is to make a new tag and ignore the old one | 14:15 |
anteaya | no fix yet for the cinder scheduling bug but jgriffith is aware and I am confident will check in when he is up and around | 14:15 |
anteaya | ttx okay | 14:15 |
fungi | jeblair: also sdague observed that neutron unit tests now take in excess of 40 minutes much of the time (partly due to a subunit processing performance issue lifeless will work on solving upstream), so temporarily increased py26 jobs timeout to an hour | 14:15 |
ttx | jeblair: nothing urgent. Just trying to babysit patches through various random fails | 14:15 |
jeblair | fungi: cool, i read the review and +2d it | 14:15 |
fungi | ahh, i see that now ;) | 14:16 |
jeblair | i think clarkb and lifeless were batting around ideas about the subunit thing | 14:16 |
ttx | that would be https://bugs.launchpad.net/openstack-ci/+bug/1221094 | 14:16 |
uvirtbot | Launchpad bug 1221094 in openstack-ci "Gate tests fail with subunit2html.py time out" [Undecided,New] | 14:16 |
*** pblaho has joined #openstack-infra | 14:17 | |
sergmelikyan | jeblair, fungi Thx | 14:17 |
*** bashok_ has quit IRC | 14:17 | |
sdague | jeblair: this job just failed with a keystone port conflict - http://logs.openstack.org/74/42474/13/gate/gate-tempest-devstack-vm-full/6640711/logs/ | 14:18 |
*** bashok_ has joined #openstack-infra | 14:18 | |
sdague | any chance that we burped and reused an unclean devstack node? | 14:18 |
*** kiall has joined #openstack-infra | 14:19 | |
anteaya | sdague: yes harlowja had one of those last night/7 hours ago | 14:19 |
*** rnirmal has joined #openstack-infra | 14:20 | |
anteaya | sdague: http://logs.openstack.org/05/45105/1/gate/gate-tempest-devstack-vm-neutron/306dee5/ | 14:21 |
jeblair | ttx: thanks i commented on 1221094 | 14:21 |
jeblair | sdague: looking | 14:21 |
sdague | russellb filed this bug on it https://bugs.launchpad.net/openstack-ci/+bug/1221247 for recheck, not sure if there are others we should be tracking | 14:22 |
uvirtbot | Launchpad bug 1221247 in openstack-ci "keystone didn't start due to address already in use" [Undecided,New] | 14:22 |
anteaya | sdague: is it the same error? it looked the same to me | 14:22 |
*** jhesketh__ has joined #openstack-infra | 14:22 | |
sdague | anteaya: yep, looks same to me | 14:23 |
sdague | anteaya: was there another bug for that | 14:23 |
anteaya | okay | 14:23 |
jeblair | sdague: nodepool only saw 1 job run on that node. i'll check jenkins logs | 14:23 |
anteaya | I didn't file one and I don't think he did either | 14:23 |
anteaya | so I'm going with no | 14:23 |
sdague | ok, no worries, just wanted to clean up dups if they were out there | 14:23 |
anteaya | absolutely | 14:23 |
Alex_Gaynor | Is there a description somewhere of what falls under feature freeze, specifically are the PyPy {tox, CI} changes now waiting until icehouse? | 14:24 |
*** yaguang has quit IRC | 14:24 | |
anteaya | morning Alex_Gaynor | 14:24 |
jeblair | sdague: jenkins also says it only ran one job on that node, and the timestamps match | 14:24 |
fungi | Alex_Gaynor: changes to stuff in the openstack-infra repos don't follow the coordinated release freezes | 14:24 |
sdague | Alex_Gaynor: tests and test fixes are typically not prevented by feature freeze | 14:24 |
Alex_Gaynor | anteaya: Morning! Seems we survived the night. | 14:25 |
Alex_Gaynor | sdague, fungi: Thanks | 14:25 |
*** bashok_ has quit IRC | 14:25 | |
anteaya | Alex_Gaynor: we did, thanks for all your support last night, it really helped me | 14:25 |
fungi | Alex_Gaynor: though we do sometimes have soft infra freezes around releases, milestones and other high-volume periods to reduce unnecessary churn and free us up to deal with spontaneous scaling/load problems | 14:26 |
Alex_Gaynor | fungi: does the volume of reviews get crazy towards the remaining milestones like it did before FF? | 14:27 |
fungi | Alex_Gaynor: less so after ff, from what i've seen | 14:27 |
fungi | but every cycle is a little different, so who knows this time | 14:28 |
* anteaya goes to feed cats | 14:28 | |
ttx | Alex_Gaynor: the next milestones are actually release candidates | 14:28 |
ttx | Alex_Gaynor: so they are published whenever the targeted bug list gets to 0 | 14:28 |
ttx | Alex_Gaynor: see for grizzly: http://fnords.wordpress.com/2013/04/05/grizzly-the-day-after/ | 14:29 |
fungi | if you don't see the fnords, they can't eat you | 14:30 |
ttx | so there is no deadline, as long as each project produces at least one RC you're good | 14:30 |
ttx | fungi: we share some culture I see | 14:30 |
*** dizquierdo has joined #openstack-infra | 14:30 | |
jeblair | i deleted/recreated a few slaves that were stuck in the scp step of an aborted job | 14:30 |
fungi | ttx: indeed. i am a member of the golden apple corps | 14:30 |
fungi | sdague: also i've confirmed that the job timeout bump is live on jenkins01 and jenkins02 now, so any py26 jobs (re)started in the past 10 minutes or later should have more leeway | 14:33 |
fungi | at least according to the last modified timestamp on their configs | 14:33 |
Alex_Gaynor | Are neutron jobs run in parallel? | 14:34 |
fungi | Alex_Gaynor: they currently use 'python setup.py testr' so i'm going to say "yes" | 14:35 |
Alex_Gaynor | btw, someone should propose a PyCon talk on how OpenStack does CI! | 14:35 |
*** bashok has joined #openstack-infra | 14:35 | |
fungi | my guess is someone already has (or are you chairing and happen to know there isn't one proposed yet?) | 14:36 |
anteaya | ummm, there was some difficulty with running neutron jobs parallel with testr | 14:36 |
Alex_Gaynor | I don't think I've seen one yet :] | 14:37 |
* fungi doesn't even know if the cfp has gone out yet, honestly. bad with calendars | 14:37 | |
jeblair | oh look the cfp is open :) | 14:37 |
Alex_Gaynor | Yup, 2 more weeks to submit talks! | 14:37 |
anteaya | so let's hear from dkanz before we move off the neutron/testr question | 14:37 |
Alex_Gaynor | http://us.pycon.org/2014/speaking/cfp/ | 14:37 |
*** yaguang has joined #openstack-infra | 14:37 | |
anteaya | or perhaps jeblair knows the answer, is neutron running parallel tests? | 14:38 |
jeblair | ttx: i assume we're not going to schedule the summit to conflict with pycon? | 14:38 |
fungi | gah, i can't go to pycon. kinda getting married that week | 14:38 |
ttx | jeblair: we'll certainly do out best | 14:39 |
ttx | our* | 14:39 |
ttx | jeblair: even if them moving into April is not really good news | 14:39 |
Alex_Gaynor | fungi: I guess that's an ok excuse | 14:39 |
anteaya | fungi: congratulations! | 14:39 |
jeblair | fungi: you could get married in montreal | 14:39 |
ttx | jeblair: check with Lauren for prospective next summit dates ? | 14:39 |
fungi | jeblair: i could if i didn't already have about $5k sunk into an event house | 14:40 |
*** ruhe has joined #openstack-infra | 14:41 | |
ttx | pile of 5 maybe going in in 2 min | 14:41 |
fungi | i thought at least the j summit was looking more like early/mid may? or has that changed now? | 14:41 |
ttx | fungi: that's what I have in my books | 14:41 |
fungi | but yeah, certainly possibility of pycon/openstack contention starting in 2015 | 14:41 |
anteaya | ttx any early ideas of where the next summit will be | 14:42 |
anteaya | like what continent? | 14:42 |
fungi | maybe we should start teaming up with them and using the same venues back-to-back ;) | 14:42 |
ttx | anteaya: US | 14:42 |
anteaya | ha ha ha | 14:42 |
anteaya | okay | 14:42 |
jeblair | oh, my lca talk was accepted! | 14:42 |
anteaya | congratulations | 14:42 |
fungi | jeblair: awesome! pleia2 said hers was too | 14:42 |
*** weshay has quit IRC | 14:43 | |
ttx | one of those days i'll go to LCA | 14:43 |
*** dina_belova has quit IRC | 14:43 | |
* ttx picked beer/FOSDEM this year again | 14:43 | |
*** weshay has joined #openstack-infra | 14:43 | |
fungi | beer is never a bad choice | 14:43 |
ttx | 0 min. Let's all pray | 14:44 |
* anteaya prays | 14:44 | |
jeblair | ttx: they're not very conflicting this year (almost 1 month apart) | 14:44 |
jeblair | s/this/next/ | 14:44 |
ttx | jeblair: yeah... but I figured I should go when it's a classic Australia or New Zealand | 14:45 |
Alex_Gaynor | ttx: Hmm, is there a reason swift doesn't get a candidate tarball? | 14:45 |
*** dina_belova has joined #openstack-infra | 14:45 | |
ttx | Alex_Gaynor: because they don't follow the release schedule | 14:45 |
Alex_Gaynor | Ah. | 14:46 |
ttx | Alex_Gaynor: they only coordinate the final release | 14:46 |
ttx | https://wiki.openstack.org/wiki/Havana_Release_Schedule | 14:46 |
jd__ | jeblair: can I have approved back on https://review.openstack.org/#/c/43851/ ? | 14:46 |
Alex_Gaynor | ttx: oh, thanks, somehow I'd never noticed the swift column | 14:46 |
ttx | anteaya: it's been 0 min for a bit too long I'm afraid | 14:46 |
anteaya | ttx have faith | 14:46 |
fungi | the test duration is a guess based on previous test runs, so won't be spot-on | 14:47 |
ttx | Alex_Gaynor: they arguably have a different need, being a lot more stable and all | 14:47 |
*** dina_bel_ has joined #openstack-infra | 14:47 | |
jeblair | jd__: done | 14:47 |
anteaya | ttx the patch at the head hasn't failed yet, that I can see | 14:47 |
*** dina_belova has quit IRC | 14:47 | |
jd__ | jeblair: thanks | 14:48 |
ttx | that's a 8-batch now | 14:48 |
* anteaya continues to pray | 14:48 | |
jd__ | jeblair: don't want to abuse, but while you're at it, if you have a minute https://review.openstack.org/#/c/44681/ :) | 14:48 |
anteaya | the 6th patch hasn't failed either | 14:49 |
jeblair | jd__: no problem, aprvd that one too (yay)! | 14:50 |
*** mriedem has quit IRC | 14:50 | |
* ttx stops looking | 14:50 | |
Alex_Gaynor | is there a fulltext search for reviews? | 14:50 |
fungi | that job at the head may not make it. started about 6 minutes before the timeout bump to 60 | 14:50 |
*** ericw has joined #openstack-infra | 14:50 | |
ttx | ARGH | 14:50 |
anteaya | nooooo | 14:50 |
fungi | yep, no good | 14:50 |
fungi | Build timed out (after 40 minutes). Marking the build as failed. | 14:50 |
ttx | the subunit thing again ? | 14:51 |
jeblair | yep | 14:51 |
ttx | didn't we just raise that timeout ? | 14:51 |
fungi | well, 12 minutes of it anyway | 14:51 |
openstackgerrit | A change was merged to openstack-infra/config: Activate devstack gate for Ceilometer https://review.openstack.org/44681 | 14:51 |
anteaya | ttx yes but that patch had started before the change went in | 14:51 |
fungi | ttx: we raised the timeout, but it didn't take effect until 6 minutes after that job started | 14:51 |
ttx | ha ha ha | 14:52 |
*** salv-orlando has joined #openstack-infra | 14:52 | |
fungi | based on the timestamps i have | 14:52 |
markmcclain | so I need to requeue those again? | 14:52 |
*** pblaho has quit IRC | 14:52 | |
* salv-orlando is googling for ways to commit suicide | 14:53 | |
fungi | markmcclain: yes, they should pass now that they're allowed to run longer | 14:53 |
salv-orlando | markmcclain: we need to take also #42806 off the queue | 14:53 |
fungi | salv-orlando: play "5 minutes to kill yourself" for a while. it's good practice | 14:53 |
markmcclain | salv-orlando: right | 14:54 |
anteaya | oh look it is a new moon in Virgo | 14:55 |
*** pentameter has joined #openstack-infra | 14:56 | |
*** ruhe has quit IRC | 14:58 | |
markmcclain | ok.. I've re-sequenced so they'll pass again | 15:01 |
anteaya | markmcclain: yay | 15:01 |
zaro | good morning | 15:01 |
anteaya | markmcclain: are neutron tests running parallel in the gate? | 15:02 |
anteaya | morning zaro | 15:02 |
anteaya | markmcclain: I know there were some issues with that earlier | 15:02 |
*** kiall has quit IRC | 15:04 | |
markmcclain | anteaya: there is one that is running both as a check and gate | 15:04 |
Alex_Gaynor | I wonder, is there anyway I could convince graphite to show me counts of the number of *-pypy jobs that have been run? | 15:05 |
anteaya | markmcclain: the devstack-vm-neutron one | 15:05 |
anteaya | okay thanks | 15:05 |
markmcclain | anteaya: ah.. no those jobs are not parallel yet | 15:06 |
anteaya | markmcclain: oh okay, which neutron job is parallel? | 15:06 |
markmcclain | we had problems yesterday with a unittest that was failing due to out of order execution by testr | 15:07 |
*** AJaeger has quit IRC | 15:07 | |
anteaya | ah okay | 15:07 |
markmcclain | we fixed that yesterday | 15:07 |
anteaya | yes I remember there were race conditions | 15:07 |
anteaya | well done, did the change merge? | 15:08 |
markmcclain | yes.. went in at about 0000 UTC | 15:08 |
anteaya | great | 15:09 |
anteaya | but there is still something standing in the way of neutron running parallel? | 15:09 |
* zaro finishes scrollback | 15:09 | |
zaro | congrats fungi. | 15:09 |
*** kiall has joined #openstack-infra | 15:10 | |
markmcclain | anteaya: yes, the neutron test is not the full suite yet | 15:10 |
markmcclain | there are folks working to finally close that gap | 15:10 |
*** pblaho has joined #openstack-infra | 15:10 | |
anteaya | markmcclain: that is great to hear | 15:10 |
anteaya | thank you | 15:10 |
*** dina_bel_ has quit IRC | 15:10 | |
anteaya | I look forward to hearing about their progress as it continues | 15:10 |
jeblair | Alex_Gaynor: http://graphite.openstack.org/compose/?_t=0.6905084141797536&from=-8weeks&bgcolor=ffffff&title=Gerrit%20Events%20%28per%20Day%29&width=586&height=308&fgcolor=000000&_salt=1378393809.273&target=stats_counts.zuul.pipeline.*.job.gate-*-pypy.* | 15:11 |
fungi | zaro: thanks--though we set the date over a year ago, so it's more just a looming deadline now ;) | 15:11 |
Alex_Gaynor | jeblair: nice, thanks! | 15:11 |
jeblair | Alex_Gaynor: that's a start; you can "sum()" that if that's what you meant | 15:11 |
jeblair | Alex_Gaynor: oh, sorry about the title, forgot to reset that | 15:12 |
Alex_Gaynor | jeblair: are the units per-hour right now? | 15:12 |
*** vogxn has quit IRC | 15:14 | |
ericw | jeblair: thanks again for helping out yesterday | 15:14 |
*** ruhe has joined #openstack-infra | 15:15 | |
ericw | jeblair: btw, the idea was raised that your "how to contribute" talk would make an awesome youtube video if you ever got around to it. | 15:15 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Add oslo.version https://review.openstack.org/40498 | 15:17 |
fungi | ericw: i think reed did at least some of that for http://youtu.be/mT2yC6ll5Qk (linked from https://wiki.openstack.org/wiki/How_To_Contribute#If_you.27re_a_developer currently) | 15:17 |
Alex_Gaynor | jeblair: cool, looks like http://graphite.openstack.org/graphlot/?from=-2weeks&bgcolor=ffffff&_t=0.6905084141797536&height=575&width=881&_salt=1378394250.803&fgcolor=000000&showTarget=sumSeries(stats_counts.zuul.pipeline.*.job.gate-*-pypy.*)&target=summarize(sumSeries(stats_counts.zuul.pipeline.*.job.gate-*-pypy.*)%2C%221d%22) is approximately what I'm | 15:18 |
Alex_Gaynor | looking for | 15:18 |
*** pblaho has quit IRC | 15:18 | |
ericw | fungi: cool. I figured there might be something already, but I couldn't find it. | 15:18 |
jeblair | Alex_Gaynor: i think the unaltered values are 'events per 10 second interval', so i usually use summarize to fix that up | 15:18 |
Alex_Gaynor | jeblair: yeah, sumSeries + summarize seems to work | 15:19 |
jeblair | Alex_Gaynor: yeah, that looks right to me | 15:19 |
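What the `sumSeries` plus `summarize` combination does can be approximated in plain Python. This is a simplified sketch of the idea only (pointwise sum across series, then re-bucketing 10 second counts into one day totals); it is not Graphite's implementation, and the sample data is invented.

```python
from collections import defaultdict

def sum_series(*series):
    """Add several {timestamp: count} series together point by point."""
    total = defaultdict(int)
    for s in series:
        for ts, value in s.items():
            total[ts] += value
    return dict(total)

def summarize(series, interval_seconds):
    """Sum the counts that fall into each interval-sized bucket."""
    buckets = defaultdict(int)
    for ts, value in series.items():
        buckets[ts - ts % interval_seconds] += value
    return dict(buckets)

DAY = 86400
# Two made-up job counters sampled every 10 seconds.
job_a = {0: 1, 10: 0, DAY: 2}
job_b = {0: 1, 20: 3}

print(summarize(sum_series(job_a, job_b), DAY))  # {0: 5, 86400: 2}
```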
ericw | fungi: that's pretty much what jeblair covered, although he got deeper into gerrit and zuul. | 15:19 |
fungi | ericw: yeah, reed's video is not as in-depth as the talk, but it scratches the itch | 15:19 |
jeblair | ericw: you're welcome! | 15:20 |
*** SergeyLu_ has joined #openstack-infra | 15:23 | |
ericw | grr… gerrit groups are my bane | 15:23 |
*** changbl has quit IRC | 15:23 | |
*** SergeyLu_ has quit IRC | 15:23 | |
ericw | sorry for flooding your email boxes everyone on jenkins-job-builder-core | 15:23 |
jeblair | heh | 15:24 |
*** mrodden has quit IRC | 15:24 | |
ericw | uh, is there a way to get jenkins to test draft changes anymore? | 15:24 |
*** SergeyLu_ has joined #openstack-infra | 15:24 | |
jeblair | ericw: nope; better bet is to publish it and mark it work in progress | 15:25 |
jeblair | ericw: (it never did test drafts; it doesn't see an event if you add it, and otherwise it doesn't have access) | 15:26 |
ericw | jeblair: well, once upon a time I would manually add Jenkins as a reviewer and MAKE it | 15:26 |
ericw | or so I thought, anyway | 15:27 |
*** SergeyLu_ has quit IRC | 15:29 | |
fungi | in an open development process, there's very little point to gerrit's drafts feature anyway | 15:30 |
*** SergeyLukjanov has joined #openstack-infra | 15:30 | |
fungi | at least with something like wip there to mark it as not ready for wider review | 15:31 |
anteaya | patch 41177 is failing on the cinder scheduling bug | 15:31 |
*** UtahDave has joined #openstack-infra | 15:31 | |
*** gyee has joined #openstack-infra | 15:33 | |
*** vogxn has joined #openstack-infra | 15:35 | |
*** mrodden has joined #openstack-infra | 15:36 | |
*** yjiang5 has left #openstack-infra | 15:41 | |
Alex_Gaynor | jeblair: I assume we upgraded our graphite after the RCE ~a week ago? | 15:42 |
*** pcm__ has joined #openstack-infra | 15:46 | |
*** pcm__ has quit IRC | 15:48 | |
fungi | jeblair: clarkb: mordred: manage-projects has been repeatedly failing trying to add pypi-mirror because it already exists as a project in gerrit. was it previously added and then (maybe only partially) removed? fatal: project "openstack-infra/pypi-mirror" exists | 15:49 |
*** pcm_ has quit IRC | 15:50 | |
mordred | fungi: we deleted it | 15:50 |
mordred | fungi: perhaps we did not delete it right | 15:50 |
fungi | it's looking like that might be the case | 15:50 |
mordred | great | 15:50 |
*** boris-42 has quit IRC | 15:51 | |
*** kiall has quit IRC | 15:52 | |
anteaya | ttx which 2 feature patches are you waiting on? | 15:52 |
*** danger_fo_away is now known as danger_fo | 15:56 | |
ttx | 42806 38230 35189 42474 | 15:56 |
ttx | 2 for nova 2 for neutron | 15:56 |
ttx | now they will be backported, i cut the branches | 15:57 |
dhellmann | it has been a while since I ran a fresh devstack, but it looks like now it's modifying the requirements files for all of my repositories -- is that how we fixed the issue of installing stuff that may have requirement conflicts? | 15:57 |
anteaya | ttx okay if I see anything I will ping | 15:57 |
ttx | anteaya: russellb and markmcclain are on them | 15:58 |
anteaya | ttx ah okay | 15:58 |
*** reed has joined #openstack-infra | 15:58 | |
fungi | mordred: removing the local copy of the pypi-mirror repo and rerunning manage-projects seems to have worked. not sure whether you did that previously or if this was just a fluke, but file for future reference | 16:01 |
*** kiall has joined #openstack-infra | 16:02 | |
anteaya | two down, two to go | 16:03 |
*** nati_ueno has quit IRC | 16:04 | |
dhellmann | mordred: ^^ | 16:05 |
mordred | dhellmann: yes | 16:06 |
dims | hi, is there a way to trigger Smokestack against this review? https://review.openstack.org/#/c/39929/ | 16:06 |
mordred | dhellmann: devstack now force-aligns all projects to the global requirements list | 16:06 |
fungi | dims: check with dprince | 16:06 |
mordred | fungi: thanks | 16:07 |
dims | fungi, thanks | 16:07 |
dhellmann | mordred: :-/ - no more "git pull" to update my dev box, I guess | 16:08 |
mordred | dhellmann: I think that there was discussion around a flag to disable | 16:08 |
dhellmann | mordred: I'm looking for where it happens so I can add one if it's not there | 16:08 |
dprince | dims: I can refire it sure. But the burden is on me to pull in the actual oslo.messaging RPM before it will pass. | 16:09 |
mordred | dhellmann: ah: # Don't update repo if local changes exist | 16:09 |
dprince | dims: don't hold it up for me. I normally try to be ahead of the game but just haven't got to this yet. | 16:10 |
mordred | dhellmann: so we won't bork you if you've been hacking in there already | 16:10 |
mordred | dhellmann: but I do not believe we accounted for you wanting to git pull :) | 16:10 |
dhellmann | mordred: I want to keep a clean master sandbox and say "update all my repos; rerun devstack" | 16:10 |
dhellmann | yeah | 16:10 |
mordred | dhellmann: hrm. what if - | 16:10 |
dhellmann | I need to understand the dev workflow other people use. Seems strange to create a whole new set of sandboxes every time you want to change tasks. | 16:10 |
mordred | dhellmann: if we run update.py, we do the setup.py develop and then do a git reset | 16:11 |
*** sergmelikyan has quit IRC | 16:11 | |
mordred | dhellmann: that way we install with the right versions, but leave you with a decent repo state | 16:11 |
dhellmann | will that preserve any changes I *have* made? | 16:11 |
dhellmann | mordred: fwiw, we also had someone submit a change to ceilometer that include the requirements updates devstack made | 16:11 |
mordred | we don't update your requirements if you have made changes | 16:11 |
mordred | good | 16:11 |
dhellmann | even if they are committed changes? | 16:11 |
mordred | dhellmann: yeah | 16:11 |
dhellmann | no, mixing requirements updates and code updates is not good | 16:12 |
mordred | dhellmann: oh - sorry, misunderstood | 16:12 |
dhellmann | yeah, thought you probably did :-) | 16:12 |
mordred | yeah - I think I have a patch idea - one sec | 16:12 |
*** SergeyLukjanov has quit IRC | 16:13 | |
dhellmann | I'm working on it, if you're busy | 16:13 |
mordred | dhellmann: no, I'm on it - sexy patch coming | 16:14 |
dhellmann | mordred: git diff returns 0 if there are changes from master that have been committed | 16:14 |
mordred | right. that's fine | 16:14 |
dhellmann | ah, I guess, if the definition of "you're working in here" is "you have uncommitted changes in here" | 16:14 |
mordred | right. because that's the one where you'll get screwed potentially | 16:15 |
mordred | dhellmann: something like this: http://paste.openstack.org/show/45818 | 16:15 |
dhellmann | how about just a flag that I can set to disable this behavior entirely on my dev system? | 16:15 |
dhellmann | it's fine for the gate, but this is really potentially disruptive if I'm modifying more than one project at a time | 16:16 |
mordred | dhellmann: even with the logic in the above patch? | 16:16 |
mordred | dhellmann: (I mean, a flag is fine, but it means that people can still get confused before they know about the flag) | 16:16 |
*** ruhe has quit IRC | 16:17 | |
* dhellmann thinks | 16:17 | |
jd__ | mordred: at which point do we make projects only depend on 'openstack-requirements' and distribute the latter as a Python package? | 16:17 |
jd__ | :) | 16:17 |
mordred | jd__: :) | 16:17 |
dhellmann | I guess if no changes are made if there are pending changes, and the reset is only done in the case when automatic changes are made, then this would be safe | 16:17 |
mordred | jd__: well, it's tricky, we still don't want to have all projects depend on all things in openstack-requirements | 16:17 |
dhellmann | jd__: I suggested that at pycon, but I forget why we said we couldn't do that | 16:17 |
mordred | jd__: for instance, python-swiftclient does not want to install eventlet | 16:18 |
dhellmann | that's not what we'd do, though | 16:18 |
jd__ | mordred: fair enough… :/ | 16:18 |
dhellmann | a pbr plugin would read the data file in the requirements set to get the version info to apply based on the unversioned names in the requirements list of the current project | 16:18 |
mordred | I've got a change coming that will auto-propose changes to the project (like translations updates) when requirements change | 16:18 |
*** Ryan_Lane has joined #openstack-infra | 16:18 | |
mordred | dhellmann: requires access to a thing you might not have though | 16:19 |
dhellmann | openstack-requirements would be a setup_requires dependency | 16:19 |
dhellmann | we'd have to start cutting releases of openstack-requirements | 16:19 |
mordred | hrm. | 16:19 |
mordred | that seems more complex | 16:19 |
dhellmann | yeah | 16:20 |
mordred | also, I've discovered | 16:20 |
mordred | that versioned things in setup_requires is epic fail | 16:20 |
mordred | because it won't update them | 16:20 |
mordred | and will block if it has an older version | 16:20 |
dhellmann | ah | 16:20 |
dhellmann | that's pretty bad | 16:20 |
mordred | yup | 16:20 |
mordred | I'm having to change how we depend on pbr because of it | 16:20 |
dhellmann | auto-proposing the changes to all projects is probably the best approach, then | 16:20 |
mordred | and also, pbr can now NEVER need a version bump at the setup.py level | 16:20 |
*** jpich has quit IRC | 16:21 | |
mordred | dhellmann: it's not my favorite plan, but seems to be the best thing we've got for now | 16:21 |
mordred | but please, by all means, keep pushing on this, because I'm sure there is a better answer somewhere | 16:21 |
*** yaguang has quit IRC | 16:21 | |
dhellmann | I'll let you know if I come up with something | 16:21 |
anteaya | ttx yay! | 16:21 |
dhellmann | mordred: so back to devstack not breaking my dev environment, do you want to submit a patch or should i? | 16:22 |
jd__ | mordred: did we envision something like letting projects handle their requirements, but having a resolver that indicates that some projects have dependencies that prevent them from being installed on the same env? | 16:22 |
mordred | dhellmann, Alex_Gaynor I'm also going to propose a pycon talk on pbr and how/why openstack does what it does around packaging | 16:22 |
*** zeus has joined #openstack-infra | 16:22 | |
mordred | dhellmann, Alex_Gaynor: not because I think everyone else should jump on board, but we have a set of input requirements that I think it's interesting to talk about in detail | 16:22 |
dhellmann | mordred: cool, that deadline's coming up soon | 16:22 |
mordred | dhellmann: yeah. just noticed that this morning | 16:23 |
mordred | jd__: no - more that we don't want projects to handle their own requirements | 16:23 |
jd__ | mordred: reason #1 being? | 16:23 |
mordred | jd__: at least for now :) | 16:23 |
mordred | jd__: oh, wait, I just processed your sentence differently | 16:24 |
mordred | jd__: yeah, the swift guys want that | 16:24 |
notmyname | what do I want? | 16:24 |
jd__ | I think I'd want that too | 16:24 |
mordred | basically, have openstack-requirements update.py when it's doing the sync determine if the listed requirement in the project is compatible with the global list | 16:24 |
jd__ | notmyname: handle your dependencies | 16:24 |
mordred | even if it's not a direct match | 16:24 |
ttx | anteaya: all set | 16:24 |
anteaya | yay | 16:24 |
mordred | it's a harder problem, because the parser/resolver for that gets very complex | 16:24 |
mordred | so I think I'd be open to it - but do not have an actual technical solution in mind for it yet | 16:25 |
ttx | anteaya: since those pesky merges had the surprising idea of merging just after I cut the branch, I used a trick and re-cut MP | 16:25 |
ttx | to avoid having to backport | 16:25 |
notmyname | jd__: mordred: meh. it's not the first thing I'd change if I had to pick ;-) | 16:25 |
anteaya | oh how lovely | 16:25 |
ttx | a bit unorthodox, but works | 16:25 |
*** reed has quit IRC | 16:25 | |
anteaya | if it works, that is greata | 16:25 |
anteaya | great too | 16:25 |
mordred | notmyname: I know - but every little thing I can do to make creiht less unhappy is a win in my book | 16:26 |
ttx | anteaya: generates an unhappy jenkins job but then refreshes the tarball alright | 16:26 |
anteaya | not having to backport is worth it | 16:26 |
mordred | ttx: what's the issue? | 16:26 |
mordred | ttx: is it a problem with my merge job thing? | 16:26 |
notmyname | mordred: I'm more concerned with swift's success than with mollifying any one person :-) | 16:26 |
ttx | mordred: no issue | 16:26 |
anteaya | jenkins had a good night, so it can tolerate a bit of unhappiness methinks | 16:26 |
jd__ | mordred: ok, good to hear, I could try to work on something that'd check for a list of requirements if there's a conflict | 16:26 |
mordred | notmyname: I know - and it's why I like you | 16:26 |
mordred | jd__: awesome. well, the script is update.py in openstack/requirements | 16:27 |
mordred | jd__: and I | 16:27 |
anteaya | mordred: we were just tying up a conversation that started about 9 hours ago | 16:27 |
mordred | jd__: and I _believe_ that there is code somewhere in pip or pkg_resources that can tell you if the current installed version matches a version spec | 16:27 |
dstufft | there is | 16:27 |
mordred | dstufft: jd__: so I'd imagine that perhaps it could take two version specs? | 16:27 |
ttx | mordred: I cut the MP branch for neutron and nova without waiting for the last bits to merge because I waited all day for that to happen. Then, of course, they all merge 20 min later. So rather than forcing PTLs to go through backports to get those to MP, I just deleted MP and cut it again | 16:27 |
mordred | dstufft: or is the "are two version specs compat" problem harder | 16:28 |
dstufft | welll | 16:28 |
mordred | ttx: AH. gotcha | 16:28 |
ttx | mordred: that generates a strange branch-tarball job with a SHA of 000000 but then the next one fixes it | 16:28 |
*** kiall has quit IRC | 16:28 | |
dstufft | you may end up with incompat versions that way | 16:28 |
dstufft | FWIW | 16:28 |
mordred | really? | 16:28 |
anteaya | the gate is empty | 16:28 |
anteaya | the gate is empty | 16:28 |
ttx | https://jenkins02.openstack.org/job/nova-branch-tarball/194/ | 16:28 |
anteaya | yay! | 16:28 |
ttx | mordred: ^ | 16:28 |
ttx | we made it ! | 16:29 |
mordred | oh. right | 16:29 |
* anteaya considers going for a nap | 16:29 | |
ttx | mordred: like I said, not an issue | 16:29 |
mordred | jd__: there is another problem to consider | 16:29 |
mordred | jd__: which is that even if the specified versions are compatible | 16:29 |
jeblair | ttx: that's for the branch delete | 16:29 |
mordred | jd__: that does not mean that the transitive dependencies of those version ranges are | 16:29 |
dstufft | Project A: depends on foo>=2.0,<4.0; Project B: depends on foo>=2.0,<3.0; Project C: depends on foo>=3.0 | 16:29 |
ttx | jeblair: yes, I figured | 16:29 |
mordred | jd__: ^^ what dstufft said | 16:29 |
dstufft | Project A and Project B, and Project A and Project C are compat, but B and C are not | 16:30 |
* anteaya decides to go for a walk instead | 16:30 | |
dstufft | so you need a matrix that checks every combination | 16:30 |
*** amotoki has quit IRC | 16:30 | |
* ttx closes http://status.openstack.org/zuul/ | 16:30 | |
ttx | been watching it for too long | 16:30 |
mordred | dstufft: that too. good point. and thanks, that's an excellent reason that we do some of our pain :) | 16:30 |
anteaya | ttx ha ha ha | 16:30 |
anteaya | yes | 16:30 |
clarkb | morning | 16:30 |
mordred | morning clarkb | 16:30 |
*** prad has quit IRC | 16:30 | |
anteaya | morning clarkb | 16:30 |
ttx | clarkb: perfect timing :) | 16:30 |
clarkb | uh oh /me reads scrollback | 16:31 |
*** ruhe has joined #openstack-infra | 16:31 | |
ttx | clarkb: we are all done, queue to 0 and all. | 16:32 |
dstufft | mordred: determining if two version specifiers overlap is probably pretty hard though unless you have candidate version numbers to test against | 16:32 |
jd__ | mordred: dstufft: indeed, but we could test that | 16:33 |
* ttx takes long break | 16:33 | |
jd__ | ttx: a week or so | 16:33 |
*** pcrews has quit IRC | 16:33 | |
dstufft | e.g. if you have a list of versions, you can filter those versions by spec1 and spec2 and if you have an empty list at the end they don't overlap (or as a false positive there's just nothing released that matches both) | 16:33 |
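The filtering approach dstufft describes, applied to his Project A/B/C example, can be sketched roughly as follows. This is only an illustration under stated assumptions: `parse_version`, `parse_spec`, `common_versions`, and `incompatible_pairs` are hypothetical helpers written for this note, not pip or pkg_resources APIs; versions are assumed to be plain dotted integers; and the candidate list is made up.

```python
import operator
from itertools import combinations

# Comparison operators supported by this sketch (assumption: no "~=" or wildcards).
OPS = {">=": operator.ge, "<=": operator.le, "==": operator.eq,
       "!=": operator.ne, ">": operator.gt, "<": operator.lt}


def parse_version(text):
    """Turn '2.6' into (2, 6). Assumes purely numeric dotted versions."""
    return tuple(int(part) for part in text.split("."))


def parse_spec(spec):
    """Turn '>=2.0,<4.0' into a list of (comparison function, version tuple)."""
    clauses = []
    for clause in spec.split(","):
        clause = clause.strip()
        # Try two-character operators first so ">=" is not read as ">".
        for symbol in (">=", "<=", "==", "!=", ">", "<"):
            if clause.startswith(symbol):
                bound = parse_version(clause[len(symbol):])
                clauses.append((OPS[symbol], bound))
                break
        else:
            raise ValueError("unsupported clause: %r" % clause)
    return clauses


def common_versions(candidates, *specs):
    """Return the candidate versions accepted by every spec."""
    parsed = [parse_spec(spec) for spec in specs]
    return [v for v in candidates
            if all(op(parse_version(v), bound)
                   for clauses in parsed for op, bound in clauses)]


def incompatible_pairs(candidates, specs_by_project):
    """Check every pair of projects; report pairs with no shared candidate."""
    return [(a, b)
            for a, b in combinations(sorted(specs_by_project), 2)
            if not common_versions(candidates,
                                   specs_by_project[a], specs_by_project[b])]
```

With hypothetical candidates `["2.0", "2.5", "3.0", "3.5", "4.0"]` and the three specs from the example, `incompatible_pairs` reports only `("B", "C")`. As noted in the conversation, an empty result can also simply mean that nothing released so far matches both specs.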
*** guitarzan has joined #openstack-infra | 16:33 | |
*** bashok_ has joined #openstack-infra | 16:33 | |
dstufft | you could probably do it, but you'd have to do some sort of reasoning that I don't know exists as of yet | 16:34 |
guitarzan | anteaya: ping | 16:34 |
*** bashok has quit IRC | 16:35 | |
guitarzan | I hope I just fixed the cinder gate bug in case you folks are interested :) | 16:35 |
jd__ | dstufft: what'd be wrong with combining a bunch of requirements, using the min of the < versions and the max of the > versions, and trying to install them all? | 16:35 |
*** kiall has joined #openstack-infra | 16:35 | |
dstufft | jd__: to determine if specifiers overlap? | 16:36 |
jd__ | dstufft: no, just to know if the global list of requirements of all openstack projects is installable | 16:36 |
dstufft | oh | 16:36 |
*** titanous has joined #openstack-infra | 16:36 | |
jd__ | which is what openstack-requirements is about | 16:36 |
dstufft | Does openstack use any entrypoints | 16:37 |
*** vogxn has quit IRC | 16:37 | |
dstufft | jd__: I'm thinking a sec | 16:37 |
titanous | is anyone responsible for http://graphite.openstack.org/ around? | 16:38 |
jd__ | dstufft: I'm ok with that ;) | 16:38 |
clarkb | titanous: yes there are several of us | 16:38 |
dstufft | jd__: so the problem is going to be that pip kinda sucks and it doesn't understand multiple requirements | 16:38 |
*** jaypipes has joined #openstack-infra | 16:38 | |
jd__ | dstufft: well I'll build only one list for it | 16:38 |
*** NobodyCam has joined #openstack-infra | 16:38 | |
dims | dprince, ah gotcha. just needed a sanity check that qpid stuff is still working as we don't have any other continuous runs where qpid is enabled | 16:39 |
dstufft | e.g. Project A says foo>1.0, and Project B says foo>=1.1 and you do ``pip install ProjectA ProjectB`` then pip will ignore the second version spec | 16:39 |
jd__ | dstufft: I parse all requirements.txt and I combine things like pymongo>=2.4,<3 and pymongo>=2.6 to pymongo>=2.6,<3 and then I try to install pymongo>=2.6,<3 and so on | 16:39 |
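The combining step jd__ describes (keep the highest lower bound and the lowest upper bound) might look something like the sketch below. It is deliberately narrow and rests on assumptions: `merge_specs` and `parse_version` are hypothetical names, only `>=` lower bounds and `<` upper bounds are handled, and versions are assumed to be plain dotted integers. It says nothing about transitive dependencies.

```python
def parse_version(text):
    """Turn '2.6' into (2, 6). Assumes purely numeric dotted versions."""
    return tuple(int(part) for part in text.split("."))


def merge_specs(name, specs):
    """Combine several specifier strings for one package into a single one.

    Returns a string such as 'pymongo>=2.6,<3', or None when the combined
    range is empty (the lower bound is not below the upper bound).
    """
    lower = None  # tightest ">=" bound seen so far, kept as text
    upper = None  # tightest "<" bound seen so far, kept as text
    for spec in specs:
        for clause in spec.split(","):
            clause = clause.strip()
            if clause.startswith(">="):
                text = clause[2:]
                if lower is None or parse_version(text) > parse_version(lower):
                    lower = text
            elif clause.startswith("<") and not clause.startswith("<="):
                text = clause[1:]
                if upper is None or parse_version(text) < parse_version(upper):
                    upper = text
            else:
                # Anything else ("==", "!=", "<=", ">") is outside this sketch.
                raise ValueError("unsupported clause: %r" % clause)
    if (lower is not None and upper is not None
            and parse_version(lower) >= parse_version(upper)):
        return None
    parts = []
    if lower is not None:
        parts.append(">=" + lower)
    if upper is not None:
        parts.append("<" + upper)
    return name + ",".join(parts)
```

For the example in the conversation, `merge_specs("pymongo", [">=2.4,<3", ">=2.6"])` gives `"pymongo>=2.6,<3"`, while two specs such as `">=2.0,<3.0"` and `">=3.0"` merge to `None`, which flags the conflict before any install is attempted.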
dstufft | jd__: ah | 16:40 |
dstufft | hrm | 16:40 |
jd__ | dstufft: if everything's installable, it's a win | 16:40 |
*** rcleere has joined #openstack-infra | 16:40 | |
*** hashar has left #openstack-infra | 16:41 | |
* jd__ observes dstufft looking for non working cases | 16:41 | |
mordred | transitive depends | 16:43 |
mordred | pymongo>=2.6,<3 doesn't tell you things about compatibility of the transitive dependency differences between >=2.4,<3 and >=2.6,<3 | 16:44 |
mordred | jd__: but also, just out of curiosity, which problem are you trying to solve? | 16:44 |
*** odyssey4me has quit IRC | 16:45 | |
dstufft | jd__: so the only problem I can think of (assuming you actually do compile the dependencies of the entire tree and not just the openstack dependencies) is that you may end up with broken installs because of the stupid pip behavior (that I'm trying to fix FWIW) | 16:45 |
dstufft | where it only pays attention to the first version spec | 16:45 |
jd__ | mordred: agreed, but running the actual install via pip will tell if there's conflicts | 16:45 |
jd__ | mordred: I'm just trying to get rid of openstack/requirements and the process as it is now :-) | 16:46 |
mordred | jd__: hahahahahahaahahahhaha | 16:46 |
*** enikanorov-w_ has quit IRC | 16:46 | |
mordred | jd__: it's taken us 1.5 years to get it in place | 16:46 |
mordred | and it has the benefit of being simple to understand, even if it's annoying | 16:47 |
mordred | so I can see why you would | 16:47 |
mordred | but you might go insane | 16:47 |
sdague | yeh, seriously, I spent 3 weeks on it recently :) | 16:47 |
sdague | it drives you a little bonkers | 16:47 |
mordred | pure insanity | 16:47 |
sdague | I still need to create the min version test | 16:47 |
sdague | which I think we could actually do now with requirements | 16:48 |
mordred | sdague: yeah. also, did you see I have a test-clients-against-old-servers thing pretty much ready? | 16:48 |
sdague | nope | 16:48 |
sdague | link? | 16:48 |
mordred | sdague: https://review.openstack.org/#/c/41931/ | 16:48 |
*** dizquierdo has left #openstack-infra | 16:48 | |
mordred | sdague: and https://review.openstack.org/#/c/41945/ | 16:48 |
sdague | I'm going to disappear for a bit for mid day bike ride + lunch, I'll have to look when I get back | 16:49 |
dstufft | mordred: jd__ I think if pip had a real dep solver then openstack/requirements might not be as important (except it makes it simple to place the blame when the set goes uninstallable, if your version spec doesn't fit it's your fault) | 16:50 |
mordred | dstufft: yeah. I agree | 16:51 |
*** Ryan_Lane has quit IRC | 16:51 | |
*** svarnau has joined #openstack-infra | 16:52 | |
*** prad has joined #openstack-infra | 16:52 | |
clarkb | jeblair: have you seen the bug from russellb about keystone not starting due to address in use? | 16:52 |
dstufft | If I understand jd__'s suggestion well enough, he's proposing to walk the entire dependency tree and concat all of the specifiers into a single specifier per dependency. That would also work, but at that point you're basically writing your own dep solver, and I think the problem is harder than it appears on the surface | 16:53 |
clarkb | jeblair: I expected nodepool and the single use gearman flag to make that a non issue. Is this still a known problem? | 16:53 |
*** titanous has left #openstack-infra | 16:54 | |
jeblair | clarkb: sdague pointed it out to me, afaict the node sdague cited was used exactly once | 16:55 |
openstackgerrit | Peter Liljenberg proposed a change to openstack-infra/jenkins-job-builder: Added SBT builder support https://review.openstack.org/44685 | 16:55 |
dstufft | jd__: mordred: for instance, if Project B has foo>=1.0, and Project C has foo<1.5, and foo has 1.0, 1.2, and 1.6 available, when you resolve the tree for Project B you'll get foo 1.6 (and its dependencies), then when you add in the tree for Project C you'll have to refine the tree, throw away the branch for foo 1.6 and restart it with foo 1.2 | 16:55 |
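The refinement in that example, where the first pick has to be thrown away once a second project's constraint arrives, can be illustrated with a very small sketch. This is not how pip or any real solver works; `newest_match` is a hypothetical helper, constraints are passed as ready-made (comparison, version) pairs instead of being parsed, and versions are assumed to be plain dotted integers.

```python
import operator


def parse_version(text):
    """Turn '1.6' into (1, 6). Assumes purely numeric dotted versions."""
    return tuple(int(part) for part in text.split("."))


def newest_match(available, constraints):
    """Return the newest available version meeting every constraint, or None.

    `constraints` is a list of (comparison function, version string) pairs.
    """
    matching = [v for v in available
                if all(op(parse_version(v), parse_version(bound))
                       for op, bound in constraints)]
    return max(matching, key=parse_version) if matching else None


available = ["1.0", "1.2", "1.6"]
project_b = [(operator.ge, "1.0")]   # foo>=1.0
project_c = [(operator.lt, "1.5")]   # foo<1.5

# Resolving project B alone picks the newest release.
first_choice = newest_match(available, project_b)
# Adding project C invalidates that pick, so the choice is redone from scratch.
refined_choice = newest_match(available, project_b + project_c)
```

Here `first_choice` is `"1.6"` and `refined_choice` is `"1.2"`. In a real tree each pick also drags in its own dependencies, which is what makes the restart expensive and is where a proper solver earns its keep.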
clarkb | for the subunit2html timeout problem it looks like we can convert to subunit v2 before passing it to subunit2html and that will speed us up | 16:55 |
clarkb | jeblair: interesting | 16:55 |
dstufft | Sat solvers make this nicer though :D | 16:56 |
jeblair | clarkb: is there a ci bug i need to respond to? | 16:56 |
* dstufft is still working on a SAT solver | 16:56 | |
clarkb | jeblair: yes 1221247 | 16:56 |
* mordred supports dstufft and his solvers | 16:56 | |
clarkb | should I be updating run-tox/run-unittest to do the subunit version conversion before converting to html? | 16:57 |
jeblair | clarkb: that's the node i examined | 16:57 |
*** enikanorov-w has joined #openstack-infra | 16:57 | |
jd__ | dstufft: agreed, but I would think that in our case of validation, resolving all projects at the same time just checking that "pip install foo>=1.0,<1.5" works is enough | 16:57 |
jd__ | dstufft: I imagine pip knows how to do that already? | 16:58 |
dstufft | jd__: how do you compress multiple copies of specifiers for foo | 16:58 |
dstufft | into one specifer | 16:58 |
openstackgerrit | Peter Liljenberg proposed a change to openstack-infra/jenkins-job-builder: Added support for JaCoCo plugin Publisher https://review.openstack.org/44705 | 16:58 |
jd__ | dstufft: reassure me, I imagine there's something in Python that knows how to parse version strings and compare them? | 16:58 |
dstufft | jd__: yes | 16:58 |
jd__ | dstufft: like dpkg --compare-versions :) | 16:58 |
*** boris-42 has joined #openstack-infra | 16:59 | |
jd__ | dstufft: so that'd be enough to compress, no? do I miss something? | 16:59 |
dstufft | jd__: But you have to locate all the specifiers for "foo" in the dependency tree, and then you have to conjoin them, and the available specifiers can change based on other specifiers | 17:00 |
dstufft | the locating part is the hard part | 17:00 |
jd__ | dstufft: in theory yes, but my thinking is to stop at the top of tree, pip install everything we know, and see if that works -- now I realize that since pbr runs pip install on its own on each setup.py, this might fail actually | 17:01 |
*** thedodd has quit IRC | 17:02 | |
dstufft | jd__: that doesn't tell you anything because pip is dumb and only pays attention to the first specifier for "foo" it finds | 17:02 |
dstufft | so you can have two packages with incompatible specifiers and it'll install fine because pip just picked the first one it saw | 17:02 |
Alex_Gaynor | Does trove-client run its own jenkins workers for some reason? | 17:03 |
jd__ | dstufft: except if we don't run pip | 17:03 |
jd__ | dstufft: can we run setup.py without calling pip at all? | 17:03 |
dstufft | running setup.py just calls setuptools, I don't know offhand but I suspect setuptools has similar behavior | 17:03 |
dstufft | since i'm not aware of any real dep solver code in setuptools | 17:04 |
dstufft | afaik none of the python installers have a real dependency solver | 17:04 |
dstufft | which afaik is part of what caused openstack/requirements to be created | 17:04 |
jd__ | dstufft: do you know where I can find code that parses version like foo>1.2,<=3 | 17:05 |
*** pcrews has joined #openstack-infra | 17:05 | |
jd__ | dstufft: it's not like it's a mess but… :) | 17:06 |
dstufft | uhh | 17:06 |
dstufft | I think that's | 17:06 |
dstufft | pkg_resources.Requirements.parse() | 17:06 |
dstufft | off the top of my head | 17:06 |
jd__ | thanks :) | 17:06 |
Alex_Gaynor | dstufft: distlib also has a version of that right? | 17:07 |
dstufft | yea | 17:07 |
dstufft | I don't know it offhand | 17:07 |
dstufft | and I hate distlib's api | 17:07 |
dstufft | I think that might be Requirement instead of Requirements | 17:07 |
*** _TheDodd_ has joined #openstack-infra | 17:08 | |
*** SergeyLukjanov has joined #openstack-infra | 17:10 | |
openstackgerrit | Pierre Rognant proposed a change to openstack-infra/jenkins-job-builder: Fix plot plugin support https://review.openstack.org/45280 | 17:10 |
*** moted has quit IRC | 17:11 | |
*** yjiang5_away is now known as yjiang5 | 17:11 | |
*** dhellmann is now known as dhellmann_ | 17:11 | |
*** moted has joined #openstack-infra | 17:11 | |
*** derekh has quit IRC | 17:11 | |
*** kiall has quit IRC | 17:14 | |
*** hashar_ has joined #openstack-infra | 17:15 | |
*** changbl has joined #openstack-infra | 17:16 | |
*** reed has joined #openstack-infra | 17:17 | |
*** ruhe has quit IRC | 17:17 | |
*** kiall has joined #openstack-infra | 17:20 | |
openstackgerrit | Khai Do proposed a change to openstack-infra/config: Fix missing requirements list https://review.openstack.org/45283 | 17:21 |
*** dina_belova has joined #openstack-infra | 17:21 | |
*** hashar has joined #openstack-infra | 17:22 | |
*** hashar has quit IRC | 17:22 | |
*** hashar_ has quit IRC | 17:23 | |
portante | are there other projects besides swift using nose for unit tests runs? | 17:25 |
clarkb | portante: horizon | 17:25 |
clarkb | maybe glance | 17:25 |
*** dina_belova has quit IRC | 17:25 | |
*** dina_belova has joined #openstack-infra | 17:26 | |
*** michchap has quit IRC | 17:26 | |
fungi | another hilarious race: https://jenkins01.openstack.org/job/gate-ceilometer-python26/494/console | 17:26 |
clarkb | portante: yup glance too | 17:26 |
fungi | or s/race/rounding error/ possibly | 17:27 |
*** afazekas has quit IRC | 17:27 | |
clarkb | fungi: that is awesome | 17:27 |
mordred | portante: we've only moved about half of them so far | 17:27 |
mordred | portante: we ran out of time this cycle | 17:28 |
mordred | portante: so I expect we'll pick it up again for icehouse | 17:28 |
*** wenlock has joined #openstack-infra | 17:29 | |
mordred | dhellmann_: did you want me to submit that change to devstack? | 17:29 |
portante | mordred: k, I notice that swift does not have a post commit coverage report, since it is missing a tox "cover" section | 17:29 |
portante | does anybody consume those coverage reports? | 17:30 |
mordred | portante: some people in some projects do | 17:30 |
portante | but there is no cross project use or roll up, right? | 17:30 |
*** ruhe has joined #openstack-infra | 17:33 | |
clarkb | portante: we do not combine coverage for shared code across projects. Is that what you are asking? | 17:33 |
portante | or even just roll up a report of the code coverage numbers of all projects in openstack | 17:34 |
portante | it appears POST coverage report step is really only useful per project, not consumed outside the project | 17:35 |
mordred | correct | 17:35 |
mordred | we've discussed some ways to make that data more readily consumable | 17:35 |
portante | k thx | 17:35 |
mordred | but none of them have actually surfaced into actions, because of other priorities :) | 17:36 |
*** nati_ueno has joined #openstack-infra | 17:36 | |
mordred | (such as injecting the raw data into graphite) | 17:36 |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/config: Convert subunit logs to subunit v2 before use. https://review.openstack.org/45285 | 17:36 |
clarkb | mordred: ^ that conflicts with the run-tox rename change, | 17:37 |
clarkb | mordred: I can rebase one on top of the other if there is some order we would like | 17:37 |
mordred | clarkb: I'm fine with that. it's a more important change | 17:37 |
clarkb | ryanpetrello: ^ | 17:37 |
mordred | the subunit conversion has real operational impact today | 17:37 |
mordred | I think we should rebase ryanpetrello's work on top of it | 17:37 |
*** gyee has quit IRC | 17:37 | |
clarkb | ok | 17:37 |
*** Ryan_Lane has joined #openstack-infra | 17:39 | |
anteaya | guitarzan: sorry I missed you, I just went for a long walk | 17:40 |
anteaya | have you a url for a patch, I am very interested | 17:40 |
anteaya | oh and thanks for your nick, love it | 17:40 |
* anteaya looks around for the oneeyedonehornedflyingpurplepeopleeater | 17:41 | |
*** ruhe has quit IRC | 17:43 | |
clarkb | mordred: the reason neutron subunit logs are so huge is the log capture | 17:43 |
mordred | clarkb: ah | 17:43 |
clarkb | mordred: I think every test is capturing log info (which is good, but explains the size) | 17:43 |
mordred | yes. yes indeed it does | 17:43 |
*** kiall has quit IRC | 17:44 | |
annegentle | hey team infra! Where does the code live that automatically marks Fix Released for bugs linked to a review patch? I've got 10 bugs in openstack-manuals that were properly linked and review patches showed up, but after merge Fix Released wasn't set. | 17:45 |
annegentle | I looked through old reviews to see where it's set but I can't find it. | 17:45 |
clarkb | annegentle: openstack-infra/jeepyb/jeepyb/cmd/somethingsomething | 17:46 |
clarkb | annegentle: we recently changed the syntax for bug <-> lp management | 17:46 |
annegentle | clarkb: ok diving in to see if I see anything | 17:46 |
annegentle | clarkb: and left docs in the dark? :) | 17:46 |
clarkb | annegentle: no it was announced | 17:47 |
annegentle | clarkb: ok then I was blind to it :) | 17:47 |
clarkb | annegentle: you guys listen to openstack-dev@lists.openstack.org right? | 17:47 |
*** nati_ueno has quit IRC | 17:47 | |
clarkb | annegentle: if you have an example change I can tell you if that is the problem | 17:47 |
annegentle | clarkb: yeah sure do, looking through archives now | 17:47 |
*** nati_ueno has joined #openstack-infra | 17:48 | |
anteaya | portante: cinder might still be using nose as well | 17:48 |
clarkb | annegentle: jeblair sent the announcement | 17:48 |
clarkb | anteaya: I think they switched \o/ | 17:48 |
clarkb | swift, horizon, glance, and keystone are the remaining projects | 17:48 |
annegentle | clarkb: here is one https://bugs.launchpad.net/openstack-manuals/+bug/1162118 | 17:48 |
uvirtbot | Launchpad bug 1162118 in openstack-manuals "Document image cache management" [Medium,Triaged] | 17:48 |
anteaya | clarkb: awesome | 17:48 |
clarkb | all of the clients, heat, ceilometer, nova, and neutron are testr'd | 17:48 |
guitarzan | anteaya: https://bugs.launchpad.net/cinder/+bug/1220436 | 17:48 |
uvirtbot | Launchpad bug 1220436 in cinder "test_cinder_quota_class_show failes during gate jobs" [Critical,In progress] | 17:48 |
anteaya | yay | 17:48 |
anteaya | hey guitarzan | 17:49 |
*** kiall has joined #openstack-infra | 17:49 | |
clarkb | anteaya: ya, that uses the old syntax, which we still support to a degree as you see | 17:49 |
clarkb | annegentle: ^ | 17:49 |
clarkb | annegentle: but to get the fix released status change you need to use the new Closes-Bug: header | 17:49 |
annegentle | clarkb: ok got it | 17:49 |
annegentle | clarkb: ohhh it's in the commit message | 17:50 |
annegentle | clarkb: lightbulb! | 17:50 |
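To make the `Closes-Bug:` header concrete, here is a rough sketch of picking bug numbers out of a commit message. It is not the jeepyb code, which is not shown in this log; the regular expression and the `bugs_closed_by` helper are assumptions made purely for illustration, and the sample commit message is invented around the bug number mentioned above.

```python
import re

# Assumed pattern: the header sits on its own line, followed by an optional
# "#" and the Launchpad bug number. The real hook's matching rules may differ.
CLOSES_BUG = re.compile(r"^Closes-Bug:[ \t]*#?(\d+)[ \t]*$",
                        re.IGNORECASE | re.MULTILINE)


def bugs_closed_by(commit_message):
    """Return the bug numbers named in Closes-Bug headers, in order."""
    return [int(number) for number in CLOSES_BUG.findall(commit_message)]


example_message = """Document image cache management

Explain how cached images are managed.

Closes-Bug: #1162118
"""
```

Calling `bugs_closed_by(example_message)` returns `[1162118]`, and a message with no such header returns an empty list.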
annegentle | ok also one other doc thing | 17:51 |
annegentle | I mentioned it earlier but wanted to be sure you know, we're doing a refactor to get rid of doc/src/docbkx directories in the openstack-manuals repo | 17:51 |
annegentle | this weekend, can we get help approving the build changes? | 17:51 |
clarkb | sure, I will probably be watching the first week of football :) and can review those changes then | 17:52 |
anteaya | so basically you are asking it to do less work to return the dict as a result of that method, guitarzan? | 17:52 |
*** ruhe has joined #openstack-infra | 17:52 | |
guitarzan | anteaya: yes | 17:52 |
anteaya | yay | 17:52 |
guitarzan | exactly | 17:52 |
anteaya | less work is good | 17:52 |
anteaya | +1 | 17:52 |
clarkb | annegentle: do you have a particular timeframe set yet? | 17:52 |
annegentle | clarkb: Diane Fleming and I are doing it Sat. before boot camp | 17:53 |
annegentle | clarkb: though I might start tomorrow | 17:53 |
anteaya | guitarzan: great, let's see it merge | 17:54 |
guitarzan | anteaya: I just got someone to +2 it, so here goes | 17:54 |
anteaya | yay | 17:54 |
* anteaya watches zuul status page again | 17:54 | |
guitarzan | anteaya: is there a way for me to see how often people were rechecking for that bug? | 17:56 |
clarkb | jeblair: looks like we already have bup running on wiki.o.o. How does one go about checking that the mysql dumps have ended up on the backup server in one piece? | 17:57 |
anteaya | guitarzan: http://status.openstack.org/rechecks/ | 17:57 |
anteaya | top of the page | 17:57 |
anteaya | so just 7 since we told people to use that bug number | 17:57 |
anteaya | plus the ones that I linked the logs in the bug report | 17:58 |
*** Ryan_Lane has quit IRC | 17:59 | |
*** danger_fo is now known as danger_fo_away | 17:59 | |
*** Ryan_Lane has joined #openstack-infra | 17:59 | |
*** Ryan_Lane has quit IRC | 17:59 | |
*** Ryan_Lane has joined #openstack-infra | 18:00 | |
*** Ryan_Lane has quit IRC | 18:00 | |
*** Ryan_Lane has joined #openstack-infra | 18:00 | |
jd__ | dstufft: does it sound complicated to teach pip that asking for pbr<1,>3 should not resolve to pip 1.4.1 but to nothing? :-( | 18:00 |
jeblair | clarkb: we should probably add a restore section to the docs | 18:00 |
clarkb | jeblair: :) I am reading the bup readme on github now. | 18:01 |
clarkb | jeblair: looks like we can use git to look at history and bup join to restore | 18:01 |
jeblair | clarkb: be root on wiki.o.o; be in a tmp dir (eg, definitely not /) | 18:01 |
jeblair | clarkb: and run "bup join -r bup-wiki@ci-backup-rs-ord.openstack.org: root | tar -xvf -" | 18:01 |
jeblair | clarkb: that's from .bash_history, so i think that should work | 18:01 |
jeblair | clarkb: also, you may want to change that tar command as appropriate | 18:01 |
dstufft | jd__: pip does not have a real dep solver, it's not a super easy thing to do | 18:02 |
dstufft | jd__: I'm working on one though using a SAT solver that'll actually handle these cases correctly | 18:02 |
*** MarkAtwood has joined #openstack-infra | 18:03 | |
guitarzan | anteaya: ahh, I see, thanks | 18:04 |
anteaya | great | 18:04 |
MarkAtwood | mordred, others: there is a practice openstack project to practice reviews and checkins and such, yes? | 18:04 |
clarkb | jeblair: does the ':' in -r argument potentially allow you to backup to different dirs on the same host? | 18:04 |
LinuxJedi | hey, can someone please make it so libra-milestone can tag in stackforge/python-libraclient? (or is there another way to do it?) | 18:05 |
clarkb | LinuxJedi: there is a magical way to do it | 18:06 |
LinuxJedi | awesome... | 18:06 |
clarkb | LinuxJedi: http://git.openstack.org/cgit/openstack-infra/config/tree/modules/openstack_project/files/gerrit/acls/stackforge/python-libraclient.config add the push tag permissions there | 18:06 |
fungi | MarkAtwood: we have some test projects on review-dev.openstack.org, but for the most part practice happens for new contributors by selecting easy-to-review changes and leaving a +1 or -1 as necessary with comments. also plenty of low-hanging-fruit bugs to get practice submitting simple changes. mistakes are human--we don't judge anyone | 18:07 |
clarkb | LinuxJedi: http://git.openstack.org/cgit/openstack-infra/config/tree/modules/openstack_project/files/gerrit/acls/stackforge/puppet-modules.config#n14 is an example | 18:07 |
LinuxJedi | cool, thanks | 18:07 |
jeblair | clarkb: i think so | 18:07 |
jeblair | MarkAtwood: yes there is | 18:07 |
mordred | fungi, MarkAtwood: there's also http://git.openstack.org/cgit/openstack-dev/sandbox | 18:07 |
fungi | oh, and openstack-dev/sandbox | 18:08 |
Mithrandir | fungi: heck, even some of us who've been submitting stuff for a while get things wrong a lot. :-P | 18:08 |
fungi | what mordred just said | 18:08 |
jeblair | fungi: yes that's what i was just looking up :) | 18:08 |
* mordred won | 18:08 | |
* clarkb hands mordred a prize | 18:08 | |
mordred | whee! | 18:08 |
* mordred got a prize! | 18:08 | |
fungi | Mithrandir: you just described me ;) | 18:08 |
MarkAtwood | thanks jeblair | 18:08 |
jeblair | i used it yesterday. it still works even. | 18:08 |
*** cody-somerville has quit IRC | 18:08 | |
MarkAtwood | thanks mordred | 18:08 |
senk | hooray, more remedial training for me! much thx to MarkAtwood :) | 18:09 |
jeblair | extra points to anyone who knows what to do with an exquisite corpse. | 18:09 |
fungi | listen to it? | 18:09 |
senk | hand it off to the next artist? | 18:09 |
MarkAtwood | weep over it byronically | 18:09 |
fungi | heh | 18:09 |
jeblair | all correct! :) | 18:09 |
openstackgerrit | Andrew Hutchings proposed a change to openstack-infra/config: Add tagging permissions to python-libraclient https://review.openstack.org/45294 | 18:09 |
LinuxJedi | clarkb: https://review.openstack.org/45294 <- like that? | 18:10 |
Alex_Gaynor | Does someone have the ability to kill https://jenkins01.openstack.org/job/gate-grenade-devstack-vm/7577/console it seems to have hung and is blocking the gate | 18:10 |
clarkb | LinuxJedi: I don't think you need create too | 18:10 |
clarkb | LinuxJedi: but I may be wrong, I end up reading docs whenever I touch gerrit acls | 18:10 |
clarkb | Alex_Gaynor: looking | 18:11 |
*** ruhe has quit IRC | 18:11 | |
openstackgerrit | Andrew Hutchings proposed a change to openstack-infra/config: Add tagging permissions to python-libraclient https://review.openstack.org/45294 | 18:11 |
clarkb | Alex_Gaynor: thats kind of cool | 18:11 |
clarkb | jeblair: fungi mordred do we want to debug the test Alex_Gaynor has pointed out before killing it? | 18:11 |
LinuxJedi | clarkb: I just copied from libra. I've modified | 18:11 |
jeblair | clarkb: i will look quickly | 18:11 |
Alex_Gaynor | clarkb: there's one at the head of the check queue in the same state (obviously less important as it doesn't block stuff) | 18:11 |
jeblair | ("nodepool hold" command doesn't exist yet :( ) | 18:12 |
anteaya | well it has been running for an hour and 34 minutes | 18:12 |
anteaya | nice job on the status.js elapsed time change, clarkb, this was the first job that ran long enough for me to see it | 18:13 |
*** nicedice_ has joined #openstack-infra | 18:13 | |
fungi | grenade jobs are allowed to run for up to 3 hours, so would take a while to get killed by the timeout | 18:13 |
jeblair | jenkins 1955 0.0 0.0 13956 1180 ? S 16:37 0:00 git remote prune origin | 18:14 |
anteaya | do grenade jobs need to still have that much time to run? | 18:14 |
jeblair | jenkins 1956 0.0 0.0 87120 6496 ? S 16:37 0:03 git-remote-https origin https://git.openstack.org/openstack/python-swiftclient | 18:14 |
jeblair | that's what it is running :( | 18:14 |
clarkb | anteaya: no we should probably look at reducing our timeouts | 18:14 |
clarkb | jeblair: :( | 18:15 |
jeblair | git-remot 1956 jenkins 4u IPv4 11217 0t0 TCP devstack-precise-hpcloud-az3-222572.novalocal:40485->git.openstack.org:https (ESTABLISHED) | 18:15 |
jeblair | no traffic | 18:15 |
clarkb | it is using https too. I figured git:// would give us more problems like this | 18:16 |
clarkb | jeblair: what is the IP address for that slave? | 18:16 |
jeblair | 15.185.252.64 | 18:16 |
jeblair | lsof -i -n|grep 15.185.252.64 | 18:16 |
*** ruhe has joined #openstack-infra | 18:16 | |
jeblair | is nil on git.o.o | 18:16 |
fungi | i don't see a socket for that at the other end, no | 18:17 |
fungi | maybe a stray tcp/rst back to git.o.o from some intermediary network device, leaving the slave end hung in an established state indefinitely | 18:17 |
*** rcleere has quit IRC | 18:18 | |
clarkb | Sep 5 16:39:36 localhost haproxy[1167]: 15.185.252.64:40485 [05/Sep/2013:16:37:52.523] balance_git_https balance_git_https/git01.openstack.org 0/20/104150 47741 cD 2/2/2/2/0 0/0 | 18:18 |
clarkb | haproxy seems to indicate the connection was closed on its end | 18:18 |
fungi | sounds consistent with that, then | 18:18 |
jeblair | or the fin packet was lost | 18:18 |
fungi | er, closed as in reset by peer or fin sent? | 18:19 |
fungi | usually complete termination doesn't register until a fin/ack is received | 18:19 |
clarkb | c = "the client-side timeout expired while waiting for the client to send or receive data. " | 18:19 |
fungi | aha | 18:19 |
clarkb | big C is client unexpectedly closed connection | 18:20 |
jeblair | i blame floating ips | 18:20 |
* fungi is tempted to say "it's the cloud" | 18:20 | |
jeblair | (because they use nat | 18:20 |
openstackgerrit | A change was merged to openstack-infra/gear: Update gear docs to include gearman server daemon https://review.openstack.org/43780 | 18:21 |
jeblair | and this is the sort of thing i expect to see with large-scale nat) | 18:21 |
fungi | but yeah, floating ips means nat which means state tracking upstream which means another place for some aggressive timeout to walk all over our connections | 18:21 |
fungi | zactly | 18:21 |
clarkb | http://code.google.com/p/haproxy-docs/wiki/SessionState | 18:21 |
fungi | great reference | 18:22 |
clarkb | for those interested in what the two character session states mean | 18:22 |
*** sdake has quit IRC | 18:22 | |
mordred | dstufft: what's pip 1.4 behavior with versions such as 0.1.6.post3 | 18:22 |
jeblair | fungi, clarkb: i think that concludes this debugging session? | 18:22 |
clarkb | (the "cD" towards the end of the log line) | 18:22 |
fungi | jeblair: yes, kill away | 18:22 |
clarkb | jeblair: I think so | 18:22 |
*** sdake has joined #openstack-infra | 18:22 | |
*** sdake has quit IRC | 18:22 | |
*** sdake has joined #openstack-infra | 18:22 | |
dstufft | mordred: define what you mean by behavior | 18:22 |
mordred | dstufft: does it install them by default with a >=0.1.6 ? | 18:22 |
jeblair | aborted | 18:22 |
dstufft | mordred: probably | 18:22 |
dstufft | I don't know for sure | 18:23 |
dstufft | but I'd assume so | 18:23 |
mordred | ok. wasn't sure if they'd get caught by the ignore-pre-release things | 18:23 |
*** cody-somerville has joined #openstack-infra | 18:24 | |
clarkb | 52 out of 228718 haproxy log entries show the cD session state | 18:25 |
clarkb | (for the current log file) | 18:25 |
fungi | i wonder how many of those are search engine crawlers | 18:26 |
harlowja | jeblair qq, i was looking at https://jenkins01.openstack.org/computer/centos6-9/ do u know which centos version that is? (latest?) | 18:26 |
fungi | what percentage of the 228718 i mean | 18:26 |
clarkb | harlowja: 6.4 | 18:26 |
harlowja | thx | 18:26 |
dstufft | mordred: oh, .post isn't a pre-release | 18:26 |
dstufft | or shouldn't be | 18:27 |
dstufft | it's a post release :D | 18:27 |
fungi | chances are those cD states are impacting bulk transfer requests more heavily than basic browsing | 18:27 |
mordred | dstufft: duh. /me *facepalms* | 18:27 |
jd__ | dstufft: ok thanks, good to know :) that looks like such a simple case I thought it worked already :( | 18:29 |
clarkb | jeblair: fungi: https://review.openstack.org/#/c/45285/ could use eyes. Though now that the feature freeze is behind us it may not be super urgent | 18:30 |
clarkb | ok back to backups | 18:30 |
fungi | clarkb: i think i agree with tstevenson on bug 1021697, but since it's your change he linked, you should probably weigh in | 18:32 |
uvirtbot | Launchpad bug 1021697 in openstack-ci "gerritbot should have logging" [Medium,Triaged] https://launchpad.net/bugs/1021697 | 18:32 |
clarkb | fungi: tstevenson is correct | 18:34 |
clarkb | fungi: the logs are in /var/log/gerritbot | 18:34 |
clarkb | on review.o.o | 18:34 |
clarkb | I will update the bug | 18:34 |
fungi | just making sure there weren't still other logs we wanted it to generate besides those. cool | 18:34 |
fungi | and yes, i have used those logs many times to try and diagnose netsplit-induced madness | 18:35 |
openstackgerrit | A change was merged to openstack-infra/config: Convert subunit logs to subunit v2 before use. https://review.openstack.org/45285 | 18:36 |
tstevenson | where does the code for uvirtbot live? | 18:37 |
clarkb | tstevenson: soren runs it, it isn't something we manage | 18:37 |
clarkb | I think it is available somewhere though | 18:37 |
*** ruhe has quit IRC | 18:38 | |
anteaya | the length of the title 'openstack/python-ceilometerclient' pushes the time out to the title bar for that patch in the css for status.o.o/zuul | 18:40 |
openstackgerrit | Monty Taylor proposed a change to openstack-dev/pbr: Rework run_shell_command https://review.openstack.org/42337 | 18:40 |
*** dina_belova has quit IRC | 18:40 | |
*** whoops has joined #openstack-infra | 18:40 | |
clarkb | is tar -P only dangerous on extraction? we archive with -P but that won't cause problems unless I extract with -P as well? | 18:41 |
clarkb | fungi: jeblair ^ | 18:41 |
fungi | clarkb: right, all the tar vulnerabilities are really on extract | 18:41 |
fungi | by default gnu tar has sanitized extraction for years | 18:41 |
* fungi goes hunting a good reference to cite | 18:42 | |
clarkb | fungi: I am mostly worried about accidentally overwriting things than being pwned (presumably the source tar is safe from bup) | 18:42 |
fungi | it was all the talk ~10 years ago | 18:42 |
tstevenson | clarkb: Thanks. Just wondering if uvirtbot wouldn't also be the appropriate bot to report on review items when they get mentioned. | 18:42 |
clarkb | http://www.gnu.org/software/tar/manual/html_node/absolute.html indicates `tar -cf - -C / /` might be a better archive command? | 18:43 |
fungi | tstevenson: uvirtbot would make sense to grow the ability to query gerrit, if its author is amenable | 18:43 |
fungi | clarkb: depends on whether you want to specify -C when extracting (i usually do, or set my cwd accordingly) | 18:44 |
clarkb | tar -X /etc/bup-excludes -cPf - / is the current command | 18:44 |
clarkb | would need the -X argument in the other command too | 18:44 |
fungi | right, i also generally don't use -P on create | 18:44 |
*** mrmartin has joined #openstack-infra | 18:45 | |
mordred | sdague: is there a specific reason why tempest does not list requirements.txt in its deps list in tox.ini? | 18:46 |
clarkb | mordred: yes | 18:46 |
mordred | awesome | 18:46 |
clarkb | mordred: the reason is postgres vs mysql vs other libs that devstack conditionally installs | 18:46 |
clarkb | mordred: so instead tempest relies on devstack and site packages to do the right thing | 18:46 |
mordred | ah | 18:47 |
fungi | clarkb: looking back over the docs, it's fine to just `tar -cf - /` unless you're trying to suppress the "removing leading /" stderr line | 18:49 |
fungi | and -X too yes | 18:50 |
clarkb | fungi: it runs in cron so I think that is part of the intention | 18:50 |
clarkb | to suppress the stderr output | 18:51 |
fungi | ahh, so yeah using -C will be a little more sensible i think (though maybe bup is trying to be portable to a variety of tar implementations) | 18:51 |
clarkb | I am working out the restore process on jenkins-dev | 18:52 |
fungi | scratch the portability hypothesis, i think -P is mostly only gnu tar anyway | 18:52 |
clarkb | ya I think -P makes other tars unhappy | 18:52 |
clarkb | should /usr/* be in the bup-excludes? | 18:54 |
mordred | clarkb: I think the idea was that things in /usr are things we can re-install easily from puppet | 18:54 |
clarkb | mordred: right, and it isn't in the list | 18:54 |
*** SergeyLukjanov has quit IRC | 18:54 | |
mordred | AH | 18:54 |
*** dkehn has quit IRC | 18:55 | |
mordred | then I'd personally vote yes - but it's possible jeblair has a differing thought? | 18:55 |
fungi | as long as we don't put hard-to-replace data in /usr/loca/something | 18:55 |
fungi | local | 18:55 |
jeblair | hrm | 18:55 |
jeblair | that reduces its efficacy as a forensic tool | 18:55 |
mordred | fungi: I would hope we would not do that | 18:55 |
fungi | that too. hard to see what was in /usr at a specific point in time | 18:55 |
mordred | but jeblair makes a good point | 18:55 |
jeblair | i lean toward leaving it in and counting on git to do the right thing to make it not take up much space | 18:55 |
clarkb | wfm | 18:56 |
clarkb | this jenkins-dev join may end up being a lot larger than I expected :) | 18:56 |
fungi | yeah, i think bup will filter that out well since contents of /usr change infrequently and in small ways | 18:56 |
*** Ryan_Lane has quit IRC | 18:56 | |
*** Ryan_Lane has joined #openstack-infra | 18:56 | |
clarkb | jeblair: does bup chunk files in such a way that I can join /foo/bar.txt for example? | 18:57 |
* clarkb reads more bup docs | 18:57 | |
fungi | all in all our biggest data consumers will be things with high rates of change (logs, databases, et cetera) | 18:57 |
*** bashok_ has quit IRC | 18:57 | |
*** sdake_ has quit IRC | 18:59 | |
clarkb | looks like `bup save` and `bup restore` correspond to individual files and `bup split` and `bup join` correspond to bulk backups | 19:00 |
*** sdake_ has joined #openstack-infra | 19:01 | |
*** sdake_ has joined #openstack-infra | 19:01 | |
*** reed has quit IRC | 19:01 | |
*** reed_ has joined #openstack-infra | 19:02 | |
clarkb | bup save and bup index are new and experimental and missing features... | 19:03 |
clarkb | I suppose we will continue with the old style backups | 19:03 |
anteaya | guitarzan: failed on the bug it is designed to fix | 19:07 |
anteaya | :( | 19:07 |
clarkb | http://cacti.openstack.org/cacti/graph.php?action=view&local_graph_id=328&rra_id=all only ~25mbps :( | 19:07 |
clarkb | this will teach me to not use cron | 19:08 |
clarkb | s/cron/screen/ how did I get that wrong | 19:09 |
guitarzan | anteaya: I thought it failed on bug 1218391 | 19:10 |
uvirtbot | Launchpad bug 1218391 in tempest "tempest.api.compute.images.test_images_oneserver.ImagesOneServerTestXML.test_delete_image_that_is_not_yet_active spurious failure" [Undecided,New] https://launchpad.net/bugs/1218391 | 19:10 |
*** sdake_ has quit IRC | 19:10 | |
anteaya | guitarzan: I was going by the error: StringException: Empty attachments: | 19:11 |
guitarzan | I don't think it's the same bug | 19:11 |
anteaya | the way I understood it any StringException: Empty attachments: error was due to a cinder scheduling race | 19:11 |
guitarzan | the stack trace is the same as https://bugs.launchpad.net/tempest/+bug/1218391 | 19:11 |
uvirtbot | Launchpad bug 1218391 in tempest "tempest.api.compute.images.test_images_oneserver.ImagesOneServerTestXML.test_delete_image_that_is_not_yet_active spurious failure" [Undecided,New] | 19:11 |
anteaya | oh okay, well in that case, I am wrong | 19:11 |
*** krtaylor has quit IRC | 19:12 | |
anteaya | in my email to the ml | 19:12 |
anteaya | I had been advising anyone seeing a StringException: Empty attachments: to use the reverify bug 1220436 statement | 19:12 |
uvirtbot | Launchpad bug 1220436 in cinder "test_cinder_quota_class_show failes during gate jobs" [Critical,In progress] https://launchpad.net/bugs/1220436 | 19:12 |
anteaya | so there may be patches attached to that bug, which by your definition belong to a different bug | 19:13 |
*** sdake_ has joined #openstack-infra | 19:13 | |
*** sdake_ has joined #openstack-infra | 19:13 | |
anteaya | I hadn't been seeing StringException: Empty attachments: before last night and then we were seeing them everywhere | 19:14 |
anteaya | so I was working with the understanding they were all related | 19:14 |
guitarzan | I haven't dug deep enough to know where that StringException thing originates | 19:14 |
anteaya | okay | 19:14 |
anteaya | I had also talked to jgriffith about my understanding and I thought he was in agreement | 19:15 |
anteaya | but I may have misunderstood what he was saying | 19:15 |
*** Ryan_Lane has quit IRC | 19:16 | |
*** krtaylor has joined #openstack-infra | 19:17 | |
guitarzan | anteaya: well, our bug did make something throw that exception, but it looks like others do too | 19:19 |
guitarzan | I'm sure cinder isn't the only project with race conditions like that | 19:19 |
jgriffith | guitarzan: you're correct on that | 19:20 |
anteaya | guitarzan: okay so StringException: Empty attachments: is not just from cinder? | 19:20 |
jgriffith | guitarzan: "cinder not the only proj" | 19:20 |
anteaya | hey jgriffith | 19:20 |
jgriffith | anteaya: hey | 19:20 |
anteaya | thanks for sending over someone with such a great nick | 19:20 |
anteaya | I needed that | 19:20 |
jgriffith | ha | 19:20 |
guitarzan | haha | 19:20 |
anteaya | we made it through the night! | 19:20 |
jgriffith | indeed | 19:21 |
anteaya | I have the song playing in my head every time I see your nick, guitarzan | 19:21 |
anteaya | it is like entrance music | 19:21 |
fungi | the zuul status page suggests everyone has gone to sleep or is out getting drunk | 19:21 |
anteaya | can't stop laughing | 19:21 |
anteaya | woohoo | 19:21 |
MarkAtwood | we need to talk high bandwidth and private about it soon | 19:21 |
guitarzan | anteaya: maybe jgriffith gets to be the monkey | 19:21 |
* guitarzan runs | 19:21 | |
openstackgerrit | A change was merged to openstack/requirements: Drop Cheetah global requirement https://review.openstack.org/40206 | 19:22 |
zul | yay mongodb percolating in the cloud-archive | 19:22 |
anteaya | ha ha ha | 19:22 |
anteaya | zul yay mongodb | 19:22 |
fungi | zul: new enough to make ceilo happy? | 19:22 |
zul | fungi: newer enough to get me drunk at ODS for free | 19:23 |
*** pabelanger has quit IRC | 19:23 | |
fungi | heh | 19:23 |
anteaya | jgriffith: so did I mis-understand from last night, so all StringException: Empty attachments: are not cinder related? | 19:24 |
jgriffith | anteaya: so the failure you pointed out was in the cinder test | 19:24 |
*** dprince has quit IRC | 19:24 | |
anteaya | okay | 19:24 |
anteaya | but there were other StringException: Empty attachments: failures | 19:25 |
jgriffith | anteaya: my point last night was I believe I've seen this in other tests/projects as well | 19:25 |
jgriffith | anteaya: correct | 19:25 |
anteaya | ah okay | 19:25 |
anteaya | my bad | 19:25 |
anteaya | I totally misunderstood you | 19:25 |
anteaya | so something is sprinkling this around projects | 19:25 |
anteaya | so this is a dependency change or something | 19:26 |
anteaya | and it is hitting everybody, occasionally | 19:26 |
jgriffith | anteaya: we'll need to catch it again somewhere else to make sure but yes I believe so | 19:26 |
anteaya | and I gave the wrong instructions | 19:26 |
anteaya | figured I would, but I had to give something | 19:26 |
anteaya | jgriffith: well it was on guitarzan's patch: http://logs.openstack.org/71/45271/2/gate/gate-tempest-devstack-vm-postgres-full/80530b8/ | 19:27 |
* anteaya is overusing guitarzan's nick | 19:27 | |
anteaya | it is too much fun | 19:27 |
* guitarzan enjoys seeing his irc client light up | 19:27 | |
anteaya | and I mis-attributed it to bug 1220436 | 19:27 |
anteaya | :D | 19:27 |
uvirtbot | Launchpad bug 1220436 in cinder "test_cinder_quota_class_show failes during gate jobs" [Critical,In progress] https://launchpad.net/bugs/1220436 | 19:27 |
guitarzan | I think it's actually related to whatever test runner we happen to be using | 19:28 |
guitarzan | but that's kind of a wild guess | 19:28 |
anteaya | bug guitarzan feels that failure is more closely related to https://bugs.launchpad.net/tempest/+bug/1218391 | 19:28 |
uvirtbot | Launchpad bug 1218391 in tempest "tempest.api.compute.images.test_images_oneserver.ImagesOneServerTestXML.test_delete_image_that_is_not_yet_active spurious failure" [Undecided,New] | 19:28 |
anteaya | s/bug guitarzan/but guitarzan | 19:29 |
jgriffith | anteaya: agreed | 19:29 |
anteaya | okay | 19:29 |
jgriffith | anteaya: guitarzan I still have hopes for the patch that's in the queue | 19:30 |
anteaya | so there are many different bugs that have StringException: Empty attachments: | 19:30 |
jgriffith | anteaya: guitarzan at least for the cinder case | 19:30 |
anteaya | go patch | 19:30 |
jgriffith | anteaya: yes, it's a translation failure, so it's a symptom | 19:30 |
anteaya | so would this be a dependency version upgrade or something | 19:31 |
anteaya | that is the source? | 19:31 |
guitarzan | something like that, yes | 19:31 |
guitarzan | because it seems to get thrown on all test failures | 19:32 |
anteaya | it was popping up a lot last night | 19:32 |
anteaya | way way too much | 19:32 |
*** senk has quit IRC | 19:32 | |
*** sdake_ has quit IRC | 19:33 | |
*** vipul is now known as vipul-away | 19:34 | |
*** vipul-away is now known as vipul | 19:34 | |
*** vipul is now known as vipul-away | 19:35 | |
*** MarkAtwood has quit IRC | 19:35 | |
*** vipul-away is now known as vipul | 19:35 | |
*** kiall has quit IRC | 19:36 | |
anteaya | jgriffith guitarzan rarrrr: http://logs.openstack.org/71/45271/2/gate/gate-tempest-devstack-vm-neutron/a9449ec/console.html | 19:36 |
anteaya | so you pick the bug this time | 19:37 |
*** dkehn has joined #openstack-infra | 19:37 | |
guitarzan | is there one for tempest.thirdparty.boto.test_ec2_instance_run.InstanceRunTest.test_run_stop_terminate_instance_with_tags ? | 19:37 |
harlowja | have u guys seen pbr having a multiprocessing error? | 19:38 |
anteaya | mordred: ^ | 19:38 |
anteaya | harlowja: I haven't | 19:38 |
harlowja | http://logs.openstack.org/39/45139/25/check/gate-taskflow-python27/3e1bdd6/console.html | 19:38 |
anteaya | I am stepping around pbr myself, trying to juggle the balls I have in the air | 19:38 |
harlowja | np | 19:38 |
anteaya | guitarzan: I haven't seen one | 19:38 |
harlowja | not sure if its d2to1 or pbr :-/ | 19:38 |
anteaya | harlowja: yeah, it's a coin toss | 19:39 |
lifeless | morning | 19:40 |
anteaya | guitarzan: at this point I am tempted to create a bug report with the StringException: Empty attachments: as the title | 19:40 |
guitarzan | anteaya: that's a red herring IMO | 19:40 |
guitarzan | it only happens after an actual test fails | 19:40 |
anteaya | and the individual tests where it is showing up in the comments | 19:41 |
*** dina_belova has joined #openstack-infra | 19:41 | |
anteaya | the StringException: Empty attachments: is a red herring? | 19:41 |
* anteaya puts her fishing net away | 19:41 | |
guitarzan | I think that's an artifact of whatever is spitting out the test failures | 19:41 |
anteaya | then I have been red herring'd all night | 19:41 |
guitarzan | and the real bugs are in the tests themselves | 19:41 |
lifeless | clarkb: so hi | 19:41 |
anteaya | okay | 19:41 |
guitarzan | a lot of those tests probably have race conditions like the one we had in cinder | 19:41 |
lifeless | clarkb: how does one debug 2013-09-05 06:00:00,003 ERROR zuul.IndependentPipelineManager: Unable to find change queue for project testing-cabal/testtools ? | 19:41 |
anteaya | let's go with your thoughts then guitarzan | 19:41 |
anteaya | guitarzan: all right then I guess it is a new bug and needs its own bug report | 19:42 |
*** Ryan_Lane has joined #openstack-infra | 19:43 | |
*** sarob has joined #openstack-infra | 19:44 | |
*** pcm_ has joined #openstack-infra | 19:44 | |
fungi | lifeless: in your layout.yaml projects list, does testing-cabal/testtools have any pipelines specified and if so, are any of those defined as independent pipelines in the main pipelines list? | 19:44 |
*** pcm_ has quit IRC | 19:44 | |
jeblair | lifeless: that is likely a bad log message describing something that is no longer an error. sorry. | 19:45 |
*** pcm_ has joined #openstack-infra | 19:45 | |
*** dina_belova has quit IRC | 19:45 | |
lifeless | jeblair: so I have a jenkins up | 19:45 |
lifeless | jeblair: and jobs defined | 19:45 |
lifeless | jeblair: but nothing tries to run AFAICT | 19:45 |
*** pcm_ has quit IRC | 19:45 | |
*** sarob has quit IRC | 19:45 | |
lifeless | http://zuul.testing-cabal.org/, https://jenkins01.testing-cabal.org/, https://review.testing-cabal.org/#/c/1/ | 19:45 |
*** sarob has joined #openstack-infra | 19:46 | |
openstackgerrit | Monty Taylor proposed a change to openstack/requirements: Remove version pins from setup_requires https://review.openstack.org/45311 | 19:46 |
jeblair | lifeless: paste layout.yaml and zuul debug logs from around the time you pushed that change? | 19:46 |
mordred | morning lifeless | 19:46 |
*** vipul is now known as vipul-away | 19:46 | |
*** vipul-away is now known as vipul | 19:46 | |
lifeless | fungi: jeblair: https://github.com/testing-cabal/ci-config/blob/testcabal/modules/testcabal_project/files/zuul/layout.yaml | 19:46 |
lifeless | is my layout.yaml | 19:47 |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/config: Add backup restore docs. https://review.openstack.org/45312 | 19:47 |
*** kiall has joined #openstack-infra | 19:47 | |
lifeless | hmm, let me check it again in case I cherrypicked the wrong thing to look at | 19:48 |
fungi | lifeless: if memory serves, it's spewing that "error" for each of the independent pipelines you've defined which aren't reflected in that particular project's set | 19:48 |
fungi | so post, pre-release, release, silent, experimental (probably not periodic since it lacks a gerrit trigger?) | 19:49 |
fungi | if so, we see those all the time in production i think, and they're benign | 19:49 |
lifeless | ahahahahaha don't do stuff when tired | 19:49 |
*** UtahDave has quit IRC | 19:49 | |
lifeless | I did my recheck by 'try 2' not by 'recheck no bug' | 19:50 |
jeblair | telepathy trigger is not implemented yet :( | 19:50 |
fungi | 'take 27. action!' | 19:50 |
jeblair | it's a commonly requested feature though | 19:50 |
anteaya | I wondered if you had changed the text | 19:50 |
lifeless | still | 19:50 |
*** sarob has quit IRC | 19:51 | |
*** dina_belova has joined #openstack-infra | 19:51 | |
lifeless | ok, so I got a 'LOST' again | 19:51 |
lifeless | does zuul look at the available executors ? | 19:52 |
lifeless | I haven't gotten a centos6 executor up yet... | 19:52 |
fungi | the zuul debug log, while quite verbose, should have a lot of detail around what job(s) it asked to have launched and their return status | 19:52 |
lifeless | http://paste.openstack.org/show/45833/ | 19:53 |
lifeless | anyhow, onto slaves | 19:54 |
lifeless | is nodepool required or optional ? | 19:54 |
lifeless | I mean, I want it, but I'm leaving here for the airport in 4 hours. | 19:54 |
jeblair | lifeless: that's the log info for a verify-1 event; likely the event from zuul itself | 19:54 |
jeblair | lifeless: zuul always logs a reason for declaring a build lost | 19:54 |
lifeless | jeblair: oh, found it. | 19:54 |
jeblair | lifeless: it does examine executors; it probably determined the function was not registered with gearman | 19:55 |
*** dina_belova has quit IRC | 19:55 | |
lifeless | 2013-09-05 19:49:48,870 DEBUG zuul.Gearman: Function build:gate-testtools-python26 is not registered | 19:55 |
fungi | bingo | 19:55 |
jeblair | lifeless: nodepool is optional | 19:55 |
anteaya | guitarzan: did you want to reverify that patch with a bug number and see if we can get it merged? | 19:56 |
lifeless | jeblair: how much of nodepool is puppetted? | 19:56 |
*** pabelanger has joined #openstack-infra | 19:57 | |
lifeless | jeblair: and probably more important, the setup-and-snapshot stuff - how much of that is likely to be omg openstack specific at the moment? Right now I'm concerned about turning this all into a presentation other folk can consume-and-collaborate on. | 19:57 |
lifeless | ok this - http://paste.openstack.org/show/45835/ is the full debug output associated with a 'recheck no bug' | 19:59 |
jeblair | lifeless: nodepool is completely puppeted; it should be non-openstack-specific, the setup scripts for the snapshots are in the openstack_project puppet module | 19:59 |
*** sandywalsh has quit IRC | 20:00 | |
jeblair | lifeless: it's basically not documented at all yet though. there be dragons. | 20:00 |
lifeless | jeblair: lets see how many I can scare up. | 20:01 |
lifeless | jeblair: is lack of executors a possible cause for that 'build is not registered' error from gear ? | 20:01 |
jeblair | lifeless: yes | 20:01 |
*** vipul is now known as vipul-away | 20:01 | |
jeblair | lifeless: (specifically, that there has never been an executor registered that can run that function since the gearman server started; once there is one, the function definition will persist and future jobs would be queued) | 20:02 |
jeblair | lifeless: it's basically typo-protection | 20:02 |
jeblair | (otherwise a typo in a job name for the gate queue could stop everything permanently) | 20:03 |
marun | to everybody and nobody: would it be possible to detect which lines of a given change are not exercised by the unit tests? | 20:03 |
lifeless | marun: coverage should tell you that | 20:04 |
*** mriedem has joined #openstack-infra | 20:04 | |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Docs on bringing up Jenkins in new infrastructures. https://review.openstack.org/45216 | 20:04 |
lifeless | marun: though coverage only tells you lines, not expressions | 20:04 |
marun | lifeless: I guess I haven't looked at a full coverage report, just summaries | 20:05 |
*** vipul-away is now known as vipul | 20:06 | |
lifeless | jeblair: do we run any non-elastic slaves? | 20:06 |
fungi | lifeless: plenty, plenty, plenty | 20:06 |
lifeless | ok, so there is value in documenting both nodepool and non-nodepool? | 20:06 |
*** jhesketh__ has quit IRC | 20:06 | |
*** gyee has joined #openstack-infra | 20:07 | |
fungi | lifeless: yeah. we even have a number of special-purpose slaves which do one or a very small subset of jobs | 20:07 |
lifeless | is non-nodepool also well puppeted, and for my test job - python26 - are they setup to connect to jenkins01 | 20:07 |
marun | lifeless: What do you think of rejecting changesets that decrease coverage? | 20:07 |
lifeless | marun: very ambivalent | 20:07 |
fungi | lifeless: pretty much all the slaves are completely puppeted because we manually stand them up in batches already and don't want to have to mess with them | 20:08 |
marun | lifeless: fair enough | 20:08 |
lifeless | marun: the problem is that I have little respect for mere coverage as a quality metric :) | 20:08 |
jeblair | marun: that's feedback that we've wanted to provide for quite some time | 20:08 |
marun | lifeless: I'm not suggesting it's a good quality metric. | 20:08 |
lifeless | marun: I think it's a useful signal for reviewers | 20:08 |
jeblair | marun: there is stalled work to run coverage as a check job | 20:08 |
marun | lifeless: yeah, that's my thought | 20:08 |
lifeless | marun: the problem is though, that most projects need something like 5000% coverage to actually get, well, coverage. | 20:08 |
lifeless | marun: [because of cyclomatic complexity] | 20:08 |
marun | lifeless: I think it would be useful for reviewers to know if a change that's introduced adds lines that aren't covered. | 20:09 |
jeblair | marun: are you interested in picking up that work? | 20:09 |
lifeless | marun: say yes! | 20:09 |
marun | ... | 20:09 |
marun | ...maybe? | 20:09 |
lifeless | fungi: jeblair: ok so if we have batches of slaves, what subset does nodepool do ? | 20:09 |
lifeless | (nodepool is 100% dynamic, right?) | 20:10 |
fungi | lifeless: right now just the devstack slaves which run tempest, grenade, et cetera | 20:10 |
marun | jeblair: If you could point me at the stalled work, I could at least take a look. | 20:10 |
lifeless | fungi: ok, interesting. Thats because reuse is so hard there, I presume ? | 20:10 |
lifeless | so ok, I'll bring up my centos6 slave by hand. | 20:11 |
jeblair | marun: zuul itself has a coverage job that runs on check | 20:11 |
jeblair | marun: http://logs.openstack.org/45/42645/11/check/dev-zuul-coverage/134109f/cover/ | 20:11 |
jeblair | marun: http://logs.openstack.org/45/42645/11/check/dev-zuul-coverage/134109f/ | 20:11 |
anteaya | I'm going to sign off for the day and get an early sleep, see y'all tomorrow | 20:11 |
*** anteaya has quit IRC | 20:11 | |
fungi | lifeless: our long-lived unit test slaves (preciseXX.slave.o.o, centos6-XX.slave.o.o, et cetera and also special-purpose slaves like tx, pypi, mirror slaves and so on) are not nodepool-managed | 20:11 |
fungi | lifeless: yes, nodepool now does what the pool management jobs for devstack-gate previously did, because ick don't run anything on a slave after tests had root access to it | 20:12 |
marun | jeblair: do you know if coverage can report on which lines are not covered? | 20:12 |
fungi | lifeless: so we need a way to turn those over quickly after they get used | 20:12 |
fungi | lifeless: whereas the other slaves don't allow privileged access to the test jobs | 20:12 |
marun | jeblair: I think it would be best to avoid having to track coverage over time. Easier to flag lines which are not covered that were introduced/changed by the current changeset. | 20:12 |
jeblair | marun: it's worth thinking about whether that kind of job should return success or failure depending on some criteria (like % increased/decreased, or whether some % of changed lines are covered) | 20:13 |
jeblair | marun: i agree, that sounds like a really good fit for a check job too | 20:13 |
jeblair | marun: i don't know the answer to your question; so i think that's where the additional work lies | 20:13 |
marun | jeblair: Got it, I'll investigate. | 20:13 |
jeblair | marun: basically; the structure for how to run a job is there, if you can make that do something more useful, then it's probably about ready to go | 20:13 |
*** sandywalsh has joined #openstack-infra | 20:14 | |
marun | jeblair: ok, cool. | 20:14 |
jeblair | marun: we can then template it and apply it everywhere | 20:14 |
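The check marun and jeblair describe here, flagging only the uncovered lines that the current changeset introduces rather than tracking coverage over time, could be sketched roughly as below. This is a minimal sketch, not the stalled job itself: the function names are hypothetical, and it assumes the uncovered line numbers per file have already been extracted from a coverage report by some other step.

```python
import re

# Matches a unified-diff hunk header such as "@@ -10,3 +12,4 @@".
HUNK_RE = re.compile(r"^@@ -\d+(?:,\d+)? \+(\d+)(?:,\d+)? @@")


def added_lines(diff_text):
    """Map each file in a unified diff to the set of line numbers it adds."""
    added = {}
    current = None
    new_lineno = 0
    in_hunk = False
    for line in diff_text.splitlines():
        if line.startswith("diff "):
            in_hunk = False  # a new file section begins
            continue
        if not in_hunk and line.startswith("+++ "):
            path = line[4:].strip()
            if path.startswith("b/"):
                path = path[2:]
            current = path
            added.setdefault(current, set())
            continue
        match = HUNK_RE.match(line)
        if match:
            new_lineno = int(match.group(1))
            in_hunk = True
            continue
        if not in_hunk or current is None:
            continue
        if line.startswith("+"):
            added[current].add(new_lineno)
            new_lineno += 1
        elif line.startswith("-") or line.startswith("\\"):
            continue  # removed lines and "\ No newline" markers are not in the new file
        else:
            new_lineno += 1  # context line


    return added


def uncovered_new_lines(diff_text, missing_by_file):
    """Return {path: sorted uncovered line numbers that this diff introduced}."""
    result = {}
    for path, lines in added_lines(diff_text).items():
        hits = lines & set(missing_by_file.get(path, ()))
        if hits:
            result[path] = sorted(hits)
    return result
```

Whether a non-empty result should fail the job or only be reported is the open policy question jeblair raises above; the sketch leaves that decision to the caller.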
fungi | lifeless: it's been debated that we should eventually use nodepool to manage longer-running slaves too (whether it's just start running all tests on bare devstack slaves and throw them away after each job or do something a little more intelligent) | 20:14 |
jeblair | marun: http://git.openstack.org/cgit/openstack-infra/config/tree/modules/jenkins/files/slave_scripts/run-cover.sh | 20:15 |
*** sarob has joined #openstack-infra | 20:15 | |
fungi | lifeless: including ideas like kexec boot a fresh slave to a particular archetype on demand for a given test | 20:15 |
lifeless | whats the 'certname' setting in a slave definition for/do ? | 20:15 |
fungi | lifeless: identification certificate for that slave in jenkins | 20:16 |
lifeless | they all get the same one? | 20:16 |
jeblair | fungi: (puppet) | 20:16 |
fungi | er, oh in site.pp | 20:16 |
fungi | yes, the name of the shared cert for puppet | 20:17 |
fungi | for older slaves we shared one puppet cert across all of a particular type | 20:17 |
fungi | more recently we've started giving long-lived slaves of the same type distinct certs per server instead, like normal puppet machines | 20:17 |
fungi | which is the default behavior if you don't specify the certname | 20:18 |
fungi | i initially thought you were asking about the slave credentials in jenkins | 20:19 |
*** jhesketh__ has joined #openstack-infra | 20:19 | |
*** sandywalsh has quit IRC | 20:20 | |
lifeless | so I presume you need to override --cert on launch-node.py | 20:21 |
lifeless | do you supply the full path to it, or just the centos06.slave.openstack.org.pem ? | 20:21 |
lifeless | fungi: ^ | 20:21 |
fungi | lifeless: yeah, --cert if you want to use a common shared cert rather than one named for the machine. and just the name of the file, no leading path | 20:23 |
*** dhellmann_ is now known as dhellmann | 20:23 | |
fungi | lifeless: see launch/README (it explains the situation) | 20:24 |
*** sdake_ has joined #openstack-infra | 20:25 | |
openstackgerrit | Ryan Petrello proposed a change to openstack-infra/config: Add pep8 checks for wsme. https://review.openstack.org/45322 | 20:26 |
openstackgerrit | Ryan Petrello proposed a change to openstack-infra/config: Add py26 and py33 tests and PyPi uploads. https://review.openstack.org/45323 | 20:28 |
mordred | uhm | 20:28 |
mordred | fungi, clarkb: http://logs.openstack.org/11/45311/1/check/gate-requirements-python27/be67b96/console.html | 20:28 |
mordred | it's a python27 unittest job that's not properly setting the mirror | 20:28 |
mordred | fungi, clarkb: ignore me | 20:29 |
fungi | mordred: i thought gate-requirements-pythonXX were exempted from the mirror because they needed to try installing things from beyond | 20:29 |
mordred | yes | 20:29 |
mordred | that's right | 20:30 |
fungi | no worries | 20:30 |
*** sarob_ has joined #openstack-infra | 20:30 | |
lifeless | is the slave subdomain implicitly setup by the dns scripts ? | 20:30 |
lifeless | or does that need a manual step? | 20:30 |
fungi | mordred: i think the expectation was that would get run less frequently based on changed file matches and so was less of a risk for random failures due to pypi reachability | 20:30 |
fungi | lifeless: we don't soa a separate subdomain for slave.o.o | 20:31 |
fungi | lifeless: those are all just served directly out of the openstack.org soa | 20:31 |
lifeless | kk | 20:31 |
fungi | so precise42.slave address record in the openstack.org domain, effectively | 20:32 |
*** sarob_ has quit IRC | 20:32 | |
lifeless | we use centos 6.4 right ? | 20:33 |
mordred | yup | 20:33 |
fungi | but that's really just a nameserver implementation detail since we're not delegating subdomains anywhere | 20:33 |
*** sarob_ has joined #openstack-infra | 20:33 | |
*** sandywalsh has joined #openstack-infra | 20:33 | |
*** sarob has quit IRC | 20:33 | |
fungi | so no separate ns records for slave.o.o or anything like that | 20:34 |
fungi | but even if we did, i don't think there's anything in our infrastructure which would care either way (aside from how we'd have to compose the rackdns command line) | 20:35 |
clarkb | back from lunch | 20:38 |
mordred | fungi: you know - hp runs moniker - we could move our dns entries to hp cloud and start using python-monikerclient | 20:38 |
mordred | it may not be openstack openstack yet, but at least it's open source | 20:38 |
clarkb | jeblair: https://review.openstack.org/#/c/45312/ did you see that? | 20:38 |
clarkb | jeblair: backup restore docs | 20:38 |
NobodyCam | mordred: can you point me to the correct repo(s) in -infra for the patch(s) requested by your review | 20:38 |
mordred | NobodyCam: yes! it's openstack-infra/config | 20:38 |
fungi | mordred: yeah, that's why i was asking the other day if the current monikerclient would work with hpcloud's implementation | 20:38 |
NobodyCam | :) | 20:39 |
mordred | NobodyCam: you want to look at the layout.yaml file | 20:39 |
mordred | fungi: I missed that - and yes it will | 20:39 |
NobodyCam | awesome :) TY | 20:39 |
mordred | fungi: if it doesn't, we can yell at kiall or CaptTofu | 20:39 |
fungi | mordred: after all, rackspace's dns service isn't openstack either | 20:39 |
mordred | yup. but moniker is at least trying to be openstack | 20:39 |
fungi | so moniker has my vote, of the two | 20:39 |
mordred | yah | 20:39 |
lifeless | nuts | 20:40 |
lifeless | err: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class testcabal_project::puppet_cron for centos6.slave.testing-cabal.org at /opt/config/production/manifests/site.pp:5 on node centos6.s | 20:40 |
mordred | I was thinking we might want to get an openstackci account on hpcloud for 'important' things in case there are ever any resources there we'd like to use for things other than devstack pool | 20:40 |
*** rfolco has quit IRC | 20:40 | |
clarkb | fungi: mordred: should be easy to switch if we want to, just create all the records then update SOA | 20:40 |
mordred | clarkb: nod | 20:40 |
clarkb | mordred: we have that account | 20:40 |
mordred | clarkb: we do? I thought we only had the openstackjenkins* accounts | 20:41 |
fungi | clarkb: yep. better if rackspace will let us query all the records via api, which i think they will | 20:41 |
clarkb | mordred: I added a second proper account at jeblair's request for things like backups | 20:41 |
mordred | clarkb: oh neat | 20:41 |
mordred | good | 20:41 |
clarkb | mordred: you comped the account supposedly :) | 20:41 |
mordred | clarkb: I'm sure I did | 20:41 |
clarkb | mordred: I think we may also be using a node or two there to test asterisk tomorrow | 20:42 |
fungi | clarkb: mordred: i think one of the sticking points is that there are other people who aren't infra core who are currently also able to change records in openstack.org, and there would need to be some discussion about what becomes of that | 20:42 |
mordred | fungi: yes. I would like to remove that ability myself | 20:42 |
clarkb | it has caused problems in the past | 20:42 |
clarkb | and is a bug imo | 20:42 |
mordred | fungi: and I'd _REALLY_ love to have the dns mapping be in puppet and go via code review if I had my way | 20:42 |
lifeless | even for slaves ? | 20:43 |
lifeless | that seems like super high friction for something that should be entirely automated | 20:43 |
fungi | it has caused problems in the past, but also those are people doing work. maybe those people should be dragged kicking and screaming into infra ;) | 20:43 |
mordred | lifeless: for things that are long-lived | 20:43 |
clarkb | fungi: I really don't like tar -P | 20:43 |
lifeless | mordred: oh; would I be trolling if I questioned having those things at all? :) | 20:43 |
mordred | lifeless: for things that are short-lived/automated, I'd think we could do sub-domain delegation to the automated-slaves account | 20:43 |
clarkb | fungi: it is a bit scary seeing all of those rooted paths scroll by | 20:44 |
mordred | lifeless: yes | 20:44 |
mordred | lifeless: :) | 20:44 |
*** whoops has quit IRC | 20:44 | |
lifeless | clarkb: btw did you see my note about dashboard and oath tokens etc? | 20:44 |
clarkb | lifeless: I did, haven't had a chance to think about it yet | 20:44 |
lifeless | clarkb: the diffs that puppet shows for changed files include things like the jjb ini file that has credentials in it. | 20:44 |
clarkb | ugh | 20:44 |
fungi | clarkb: which is why i don't normally back up with -C or -P and just live with the line on stderr (filtering it with grep -v if needed) | 20:44 |
* clarkb turns off apache on the dashboard | 20:45 | |
mordred | lifeless: not because it's not a theoretical good idea - but rather because the amount of work to get there is currently unachievable - but we've talked about that already ;) | 20:45 |
lifeless | clarkb: assuming that the dashboard shows you what 'puppet agent --test' does, more or less | 20:45 |
*** enikanorov-w has quit IRC | 20:45 | |
lifeless | clarkb: ^ | 20:45 |
clarkb | lifeless: oh, let me check | 20:45 |
lifeless | clarkb: mine isn't working yet for display | 20:45 |
clarkb | lifeless: it has all of the data in the report but what is accessible over http may be less | 20:45 |
mordred | however, speaking of that ... | 20:45 |
lifeless | clarkb: see under 'bug filed' | 20:45 |
mordred | fungi, clarkb: perhaps we should attach a cinder volume to ci-puppetmaster and put things we'd be sad if we lost on it | 20:46 |
mordred | like the hiera file | 20:46 |
mordred | just to be safe | 20:46 |
clarkb | mordred: this might be a bad opinion, but I think volumes are less reliable than server images | 20:46 |
clarkb | mordred: the number of times I have had to fsck filesystems on static.o.o is too high | 20:46 |
fungi | maybe we should make regular snapshots of it instead if we care | 20:46 |
mordred | clarkb: that's a good point | 20:46 |
mordred | fungi: ++ | 20:47 |
clarkb | I am going to stop apache on puppetdashboard right now to be double sure | 20:47 |
*** ArxCruz has quit IRC | 20:47 | |
fungi | or just decide how best to safely back up that machine with the same tools we back up everything else (we can encrypt stuff we're particularly worried about leaking) | 20:47 |
clarkb | pleia2: ^ do you know if lifeless' concern is a problem? | 20:47 |
mordred | lifeless: I _believe_ we've checked before | 20:47 |
mordred | and that the passwords that come from hiera are not printed | 20:47 |
mordred | but I'm not 100% sure - I just remember having that concern a year ago | 20:48 |
lifeless | mordred: does it show diffs to config files ? | 20:48 |
mordred | and we decided that we did not need to password protect | 20:48 |
clarkb | apache is stopped | 20:48 |
mordred | lifeless: I do not believe it does | 20:48 |
clarkb | of course watch puppet start it again, I am going to stop puppet too | 20:48 |
*** gyee has quit IRC | 20:48 | |
clarkb | all done | 20:49 |
fungi | the logs displayed by the dashboard don't include diff output, pretty sure | 20:49 |
* mordred supports doublechecking that | 20:49 | |
lifeless | argh, stab stab stab | 20:49 |
lifeless | how does one debug 'could not retrieve catalog' errors ? | 20:49 |
fungi | instead i've seen them merely mention the old and new checksum of a changed file | 20:49 |
clarkb | lifeless: apache logs | 20:49 |
lifeless | err: Could not retrieve catalog from remote server: Error 400 on SERVER: Could not find class testcabal_project::puppet_cron for centos6.slave.testing-cabal.org at /opt/config/production/manifests/site.pp:516 on node centos6.slave.testing-cabal.org | 20:49 |
clarkb | oh in that case you need to figure out why the puppet_cron class on line 516 couldn't be included | 20:50 |
lifeless | puppet command line puppet agent --environment production --server ci-puppetmaster.testing-cabal.org --no-daemonize --verbose --onetime --pluginsync true --certname centos6.slave.testing-cabal.org | 20:50 |
lifeless | looks correct | 20:50 |
clarkb | lifeless: it is a problem with the manifest itself | 20:50 |
clarkb | compile time error | 20:50 |
fungi | interpreting puppet's error messages is often a lot like reading tea leaves | 20:51 |
*** dina_belova has joined #openstack-infra | 20:51 | |
mordred | lifeless: are your puppet module paths set properly? | 20:52 |
mordred | lifeless: have you run puppet on your puppet master to get the puppet.conf set up properly? | 20:52 |
lifeless | is the 'node' match in site.pp based on the cert, or hostname ? | 20:52 |
clarkb | lifeless: certname | 20:53 |
fungi | clarkb: on your backup and restore instructions, will tmpreaper pose a problem if you restore into /tmp and bup preserves the original (possibly very old) file timestamps? | 20:53 |
clarkb | fungi: possibly | 20:53 |
lifeless | clarkb: so the certname being passed in to launch won't match the regex | 20:53 |
*** gyee has joined #openstack-infra | 20:53 | |
lifeless | clarkb: node /^centos6-?\d+\.slave\.testing-cabal\.org$/ { | 20:53 |
lifeless | clarkb: won't match centos6.slave.testing-cabal.org as the cert. | 20:53 |
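The mismatch lifeless points out can be checked directly against the pattern quoted above. Puppet evaluates node regexes with Ruby, so the Python below is only an illustrative sketch; it assumes the pattern behaves the same for these simple inputs, and the helper name `matches_node` is hypothetical.

```python
import re

# The node pattern from site.pp as quoted in the log.
NODE_PATTERN = re.compile(r"^centos6-?\d+\.slave\.testing-cabal\.org$")


def matches_node(certname):
    """Return True if the given certname would select this node block."""
    return NODE_PATTERN.match(certname) is not None


# The shared cert name has no digits after "centos6", so \d+ cannot match.
shared = matches_node("centos6.slave.testing-cabal.org")
# A per-server name with a numeric suffix does match.
numbered = matches_node("centos6-1.slave.testing-cabal.org")
```

Here `shared` is False and `numbered` is True, which is consistent with the catalog failing when the agent identifies itself with the shared cert name.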
clarkb | fungi: I actually did these things by hand in /root/clarkb-something and realized that that dir would get backed up... | 20:53 |
lifeless | I've spotted one oddity: I had the 'certname =>' line in that node definition slightly typoed; fixing and retrying | 20:55 |
clarkb | fungi: would it be better to have directions that use /root/somedir with a note to remove that dir when done? | 20:55 |
fungi | lifeless: so, the "node" parameter will get matched against the node name, which will normally identify itself using the cert specified in the certname parameter (or with a cert of the same name as the node if not specified) | 20:55 |
fungi | from what i've seen | 20:55 |
fungi | s/parameter/pattern/ | 20:55 |
lifeless | I think there must be something else to it otherwise the slave instructions can't possibly work | 20:55 |
lifeless | because the single shared cert's name doesn't match the pattern | 20:56 |
*** dina_belova has quit IRC | 20:56 | |
clarkb | jeblair: thoughts on where the temporary backup working dir should go? | 20:56 |
soren | tstevenson: It's running some ancient version of ubottu. | 20:56 |
soren | tstevenson: I can find the exact version if you like. | 20:56 |
*** _TheDodd_ has quit IRC | 20:56 | |
fungi | clarkb: yeah, i think somewhere which won't magically get thrown away by tmpreaper while you're using it. bonus points if it's also somewhere we're excluding from backups | 20:56 |
openstackgerrit | Anne Gentle proposed a change to openstack-infra/jeepyb: Adds <service>-api to map to openstack-api-site in launchpad. https://review.openstack.org/45324 | 20:57 |
lifeless | mordred: btw I have had a paper accepted @ LCA | 20:57 |
mordred | lifeless: woot | 20:57 |
* mordred never got around to submitting one | 20:57 | |
clarkb | fungi: can you drop a note on the review so that jeblair and mordred can see it with context? | 20:57 |
lifeless | mordred: I know :) | 20:57 |
fungi | yup | 20:57 |
clarkb | mordred: fail | 20:58 |
clarkb | mordred: mine got accepted too | 20:58 |
mordred | I'll just go drink with people | 20:58 |
mordred | clarkb: awesome! | 20:58 |
*** dkehn has quit IRC | 20:59 | |
clarkb | fungi: jeblair: I am going to hold off on pulling a restored backup from wiki until tomorrow, I noticed the bup cron runs before the logrotate cron so I won't see the rotated backups until then | 20:59 |
*** tstevenson has quit IRC | 21:00 | |
fungi | makes sense | 21:00 |
*** dkehn has joined #openstack-infra | 21:00 | |
lifeless | fungi: so when you launch slaves | 21:00 |
lifeless | fungi: do you pass --cert? | 21:00 |
lifeless | I am thoroughly confused. | 21:00 |
fungi | lifeless: if you want those slaves to share a common cert, then pass --cert to the launch command | 21:01 |
fungi | lifeless: if you want them to get distinct certs (which is what we're moving to these days) then omit --cert | 21:01 |
lifeless | fungi: so site.pp is setup for distinct certs? | 21:01 |
lifeless | fungi: it has a 'certname =>' option in the node definition. | 21:01 |
fungi | in the latter case, the certname will be the same as the nodename, and thus no need to specify a certname parameter in site.pp either | 21:01 |
lifeless | ok | 21:02 |
*** _TheDodd_ has joined #openstack-infra | 21:02 | |
*** sarob_ has quit IRC | 21:02 | |
fungi | where we're using certname in site.pp it's where we want the cert to be named something other than the node (generally just where we have multiple nodes sharing one cert, which we're getting away from) | 21:03 |
*** sarob has joined #openstack-infra | 21:03 | |
*** dolphm has joined #openstack-infra | 21:03 | |
*** _TheDodd_ has quit IRC | 21:03 | |
*** pcm_ has joined #openstack-infra | 21:03 | |
dolphm | does this group on launchpad still impact anything? https://launchpad.net/~keystone-core | 21:03 |
dolphm | the description is "The team of devs who have review approval for Keystone" but that appears to be managed in gerrit now | 21:04 |
fungi | lifeless: i tried to capture the situation for the --cert parameter in launch/README | 21:04 |
jeblair | clarkb: re LCA, awesome! there will be quite a few of us there | 21:04 |
lifeless | fungi: yes, which has me tied up in knots, I think. | 21:04 |
clarkb | jeblair: sounds like it. I am excited | 21:04 |
lifeless | mordred: you can do miniconf stuff | 21:04 |
fungi | dolphm: bad description. we now only really use the core groups on lp as security points of contact for private bugs | 21:04 |
lifeless | mordred: stewart's CICD miniconf was accepted IIRC | 21:04 |
mordred | lifeless: yeah. I might do that | 21:04 |
mordred | lifeless: oh good. stewart submitted that one | 21:05 |
dolphm | fungi: as distinguished from https://launchpad.net/~keystone-drivers/ ? | 21:05 |
fungi | dolphm: there's a pending bug suggesting renaming the -core teams on lp to -security or something | 21:05 |
fungi | dolphm: and yes, -drivers is mostly just for people controlling the bueprint targeting i believe | 21:05 |
fungi | er, blueprint | 21:06 |
fungi | dolphm: tried to capture the distinction at https://wiki.openstack.org/wiki/Project_Group_Management | 21:06 |
dolphm | fungi: awesome! let me read | 21:07 |
*** sarob has quit IRC | 21:07 | |
fungi | dolphm: though it doesn't specifically mention the -core teams in lp since their future is still somewhat in limbo | 21:08 |
*** sarob has joined #openstack-infra | 21:08 | |
dolphm | fungi: i'll remove the existing description from keystone-core, at least | 21:08 |
fungi | dolphm: thanks! | 21:08 |
jeblair | clarkb, fungi, mordred: food for thought: what we actually want with dns is more like a BIND file in the config repo, which is not a usage pattern facilitated by cloud dns | 21:09 |
*** pentameter has quit IRC | 21:09 | |
fungi | jeblair: point. maybe we stand up our own authoritative nameservers. they're remarkably low-maintenance anyway | 21:10 |
jeblair | clarkb, fungi, mordred: (especially when you consider that we want to stop manually launching slaves, that leaves pretty much only things that we want to go through code review) | 21:10 |
openstackgerrit | Chris Krelle proposed a change to openstack-infra/config: Add python 3.3 and pypy checks to ironicclient https://review.openstack.org/45327 | 21:10 |
clarkb | jeblair: I want to say moniker may support that? but I may be remembering really poorly | 21:10 |
openstackgerrit | Steven Dake proposed a change to openstack-infra/config: Add heat-templates to tarballs.openstack.org https://review.openstack.org/45328 | 21:11 |
fungi | if moniker can safely merge a supplied bind zonefile, then maybe that's a good compromise | 21:11 |
mordred | I'm not saying that using moniker would be more efficient than running our own bind files in puppet | 21:11 |
mordred | but it is more towards dogfooding the things we're involved with building | 21:11 |
jeblair | fungi: or sign up with a service that lets us xfer things (if we think things like geographic diverse multicast are important) | 21:11 |
jeblair | if moniker can do that, cool; i was basing my assumption on rax. | 21:12 |
jeblair | mordred: running our own bind != having bind files in puppet | 21:12 |
jeblair | just to clarify | 21:12 |
mordred | indeed | 21:12 |
fungi | jeblair: silent master with slaves as the service provider would be a great model. then we could even have more than one provider | 21:12 |
jeblair | fungi: that's my favorite model | 21:12 |
*** tstevenson has joined #openstack-infra | 21:13 | |
fungi | it's what i did the last place i had to manage a very large distributed dns infrastructure for a service provider | 21:13 |
fungi | worked out great | 21:13 |
mordred | yup. we have used that before too | 21:13 |
mordred | I'm betting that none of the cloud dns models support that though | 21:13 |
fungi | well, my old employer would configure to slave zones from a customer's master server upon request | 21:14 |
clarkb | hpcloud gives you geographically diverse dns servers iirc | 21:14 |
fungi | but yeah it might be uncommon with a larger provider | 21:14 |
clarkb | but not across multiple providers | 21:14 |
fungi | though now that i think about it, even back in the days of cheap online-only dns hosting providers like granitedns, that was a common feature | 21:15 |
fungi | i think even dyndns, who i currently host my personal domains through, may offer to let me have them slave zones from a master of my own choosing | 21:16 |
fungi | so it might not be that uncommon of a feature | 21:16 |
lifeless | ohhhhh damn. I found a genuine error in my default definition from the top of the stack; I suspect i didn't need to fork params.pp and puppet_cron.pp | 21:16 |
lifeless | nuts, anyhow I think this will fix it. | 21:17 |
mordred | I think the point of DNSaaS is to serve people who do not want to or know how to manage their own dns zone files or delegations or other things | 21:17 |
mordred | we may be a terrible set of test people, since we do know how to do those things | 21:17 |
*** tstevenson has quit IRC | 21:17 | |
lifeless | do we want to know how to do them ? | 21:17 |
fungi | it dawns on me though, with the idea of managing zonefiles via code review, getting serial numbers to be increasing-only will get fun ;) | 21:18 |
lifeless | Perhaps we should run our own moniker instance to manage the master ? | 21:18 |
mordred | or just use the moniker that's already running, make a heat template to control the resources themselves and store the heat template in puppet | 21:18 |
jeblair | heat template would probably be okay; the point is code review | 21:19 |
mordred | jeblair: yes. totally. code review == essential | 21:19 |
jeblair | (creating our own yaml file that gets translated into bind, when bind is already quite easy and we all know it, is not sane; if heat has already done that, then we are not insane) | 21:19 |
clarkb | we could probably write a very simple twisted names script that reads bind zone file, queries designate, updates as necessary | 21:19 |
clarkb | not that I want to write such a thing, but it shouldn't be terrible | 21:20 |
fungi | i think we would need a gating job, specifically, to ensure that newserial is strictly greater than oldserial | 21:20 |
lifeless | what sort of bugs will code review on dns catch ? | 21:20 |
lifeless | I mean other than bind config definition errors | 21:20 |
jeblair | fungi: yeah. that would be fine. and of course if anyone uses a real editor (:P) to edit the bind files it will be automatically updated anyway | 21:20 |
clarkb | lifeless: incorrect A record data | 21:20 |
clarkb | lifeless: it has been known to happen | 21:20 |
jeblair | lifeless: we're using code review here to also mean "able to be contributed to" | 21:21 |
clarkb | jeblair: you mean ^A right? | 21:21 |
fungi | jeblair: right, i'm more concerned with when there are several proposed dns changes and we have to be careful about merge order | 21:21 |
lifeless | jeblair: ah, acls ? | 21:21 |
jeblair | lifeless: by having something in a repo, it means someone can submit a proposal | 21:21 |
fungi | jeblair: also, merge conflicts on the serial will be very, very common i guess | 21:21 |
lifeless | jeblair: oh flip side - opening it out? | 21:21 |
jeblair | lifeless: the current state is that only people who know people who have secret passwords can make changes | 21:21 |
clarkb | fungi: jeblair: what if we ignore the serial and use zone files as an intermediate format | 21:21 |
jeblair | lifeless: exactly, as in the way that anyone can change the jenkins config because it's in a public code repo | 21:21 |
fungi | clarkb: post-processing script which updates the serial on load? that could work | 21:22 |
mordred | yes to both things jeblair said | 21:22 |
fungi | then we don't keep the serial itself in git, just a placeholder | 21:22 |
clarkb | fungi: right | 21:22 |
clarkb | reviewing the serial isn't actually interesting | 21:22 |
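The approach fungi and clarkb converge on, keeping only a placeholder in git and filling in the serial when the zone is loaded, might look something like the sketch below. Everything specific here is an assumption for illustration: the `@SERIAL@` token, the function names, and the date-based YYYYMMDDNN serial convention do not come from the discussion, and the comparison ignores serial-number wraparound.

```python
import datetime

PLACEHOLDER = "@SERIAL@"  # hypothetical token stored in git instead of a real serial


def next_serial(previous, today):
    """Return a YYYYMMDDNN-style serial strictly greater than the previous one."""
    base = int(today.strftime("%Y%m%d")) * 100
    return max(base, previous + 1)


def render_zone(template, previous, today=None):
    """Fill the placeholder in a zone template; return (zone_text, new_serial)."""
    if PLACEHOLDER not in template:
        raise ValueError("zone template has no serial placeholder")
    today = today or datetime.date.today()
    serial = next_serial(previous, today)
    return template.replace(PLACEHOLDER, str(serial)), serial
```

Because the serial is derived from the previously published value at load time, two proposed changes never conflict on that line in review, and the merge order stops mattering for the serial.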
mordred | fungi, clarkb: that sounds suspiciously like duplicating a lot of the work that moniker is already doing | 21:22 |
jeblair | zuul nnfi passes tests! | 21:22 |
lifeless | do we use nodelabeller or some other means to get slaves labelled ? | 21:22 |
clarkb | jeblair: hurray | 21:23 |
mordred | lifeless: no. we don't really use node labels in complex ways | 21:23 |
mordred | lifeless: also, that would be more java code running in jenkins | 21:23 |
lifeless | mordred: 'some other means' then ? | 21:23 |
mordred | we try to keep that to an absolute minimum | 21:23 |
*** tstevenson has joined #openstack-infra | 21:23 | |
fungi | clarkb: more to the point, reviewing the serial number is only interesting if you need to be sure people don't screw it up. taking that possibility away makes it uninteresting data to review | 21:23 |
clarkb | lifeless: nodepool sets it to some configured value, and for non nodepool slaves we add them by hand and type in a label | 21:23 |
mordred | lifeless: when they're added, they get a single static label then they are added | 21:23 |
mordred | lifeless: what clarkb said | 21:24 |
mordred | lifeless: basically, jenkins can't handle running lots of code, so we try to keep the tasks it needs to do in java to an absolute minimum | 21:24 |
mordred | jeblair: w00t! | 21:24 |
mordred | jeblair: I support our new nnfi overlords! | 21:24 |
lifeless | mordred: sure, I don't care what the answer is, just that it's documented ;) | 21:25 |
mordred | lifeless: yup. was just giving philosophical background for context | 21:25 |
lifeless | how does the slave get connected to the master ? | 21:25 |
mordred | lifeless: two ways: | 21:25 |
lifeless | non-nodepool | 21:25 |
clarkb | fungi: good point | 21:25 |
mordred | lifeless: then one way - we add it by hand | 21:25 |
lifeless | whats the nodepool answer ? | 21:26 |
clarkb | somehow the list of changes up for review has grown gigantic again | 21:26 |
clarkb | going to try and do code review this afternoon | 21:26 |
openstackgerrit | Khai Do proposed a change to openstack-infra/config: Diff between installed packages and requirements https://review.openstack.org/45329 | 21:26 |
mordred | lifeless: nodepool uses jenkins api to add it | 21:26 |
clarkb | jog0: I am rerunning the experimental job for https://review.openstack.org/#/c/43779/ will approve if that comes back clean | 21:28 |
jog0 | clarkb: awesome | 21:28 |
*** alexpilotti has joined #openstack-infra | 21:28 | |
lifeless | my jenkins01 has no credentials available in the dropdown on https://jenkins01.testing-cabal.org/computer/createItem | 21:30 |
lifeless | should that be blank and it will Just Work ? | 21:30 |
clarkb | lifeless: credentials are new | 21:30 |
clarkb | lifeless: I think they have their own global management windows. You can click the advanced button and give it a key instead | 21:30 |
lifeless | we puppet it to a file on the master right ? | 21:31 |
clarkb | lifeless: I don't think so. We didn't use credentials until nodepool | 21:32 |
clarkb | lifeless: jeblair will have a better idea of how they are handled. | 21:32 |
clarkb | lifeless: our non nodepool servers just set an ssh key | 21:32 |
lifeless | oh, i see - I was confused by the hiera entry | 21:32 |
*** Ryan_Lane has quit IRC | 21:32 | |
*** cp16net has joined #openstack-infra | 21:33 | |
*** fifieldt has joined #openstack-infra | 21:33 | |
lifeless | clarkb: what home dir do you give the slaves? | 21:35 |
openstackgerrit | Anne Gentle proposed a change to openstack-infra/config: Removes separate Object Storage Admin manual as the content is now in other books. https://review.openstack.org/45333 | 21:35 |
*** rnirmal has quit IRC | 21:35 | |
mordred | dhellmann: ping | 21:36 |
lifeless | ok, oath. | 21:36 |
jeblair | lifeless: process is make a credential in jenkins using the gui, look at the xml file on disk and get the UUID of the credential, put it in hiera | 21:36 |
jeblair | lifeless: (bootstrapping jenkins is terrible) | 21:37 |
lifeless | jeblair: huh, I just copied in the ssh master key made earlier | 21:37 |
lifeless | jeblair: and gave it a name to remember later | 21:37 |
jeblair | lifeless: the credential param for nodepool is the jenkins credential uuid | 21:38 |
lifeless | jeblair: I'm probably misunderstanding something; can correct me in code review | 21:38 |
lifeless | jeblair: I haven't gotten to nodepool yet | 21:38 |
clarkb | lifeless: /home/jenkins is the homedir | 21:38 |
lifeless | so, I've got a slave up | 21:38 |
mordred | woot | 21:38 |
jeblair | lifeless: ok, well you have made a credential at this point, i believe, for use by your manually configured slaves | 21:38 |
clarkb | mordred: reviewed https://review.openstack.org/#/c/42677/ there are a few things that need changing but I am excited to get that in | 21:38 |
jeblair | lifeless: so, er, you can use its uuid later with nodepool | 21:38 |
lifeless | zuul still whinged LOST | 21:39 |
openstackgerrit | James E. Blair proposed a change to openstack-infra/zuul: Use NNFI scheduler algorithm https://review.openstack.org/45334 | 21:39 |
clarkb | jog0: http://logs.openstack.org/79/43779/19/experimental/gate-tempest-devstack-vm-large-ops/c69c4ea/ failed. This is ok, this is why we put it in the experimental queue. Should I go ahead and approve that change or do you want to look at the failure? | 21:40 |
mordred | clarkb: "Let's trigger the job manually for now, so we don't get a backlog in the silent queue if it doesn't work." | 21:40 |
lifeless | jeblair: I'm still getting 2013-09-05 21:39:15,582 DEBUG zuul.Gearman: Function build:gate-testtools-python26 is not registered | 21:41 |
mordred | clarkb: perhaps I should add it to the experimental queue | 21:41 |
jog0 | clarkb: looking at it | 21:41 |
clarkb | mordred: yes experimental :) this is what the pipeline exists for | 21:41 |
clarkb | lifeless: you may need to kick the gearman plugin to make it reregister functions | 21:41 |
lifeless | jeblair: you can see on https://jenkins01.testing-cabal.org/ that I have a slave and the job was able to be executed. | 21:41 |
clarkb | zaro: ^ do you know? | 21:41 |
jog0 | http://logs.openstack.org/79/43779/19/experimental/gate-tempest-devstack-vm-large-ops/c69c4ea/logs/screen-n-sch.txt.gz#_2013-09-05_21_36_40_049 | 21:42 |
jog0 | ?? | 21:43 |
jog0 | I have no idea what that is | 21:43 |
* fungi is afk for a bit. going out to eat. bbl | 21:43 | |
jeblair | clarkb: i promise i'll click the button needed to see if it works | 21:43 |
*** mrodden has quit IRC | 21:43 | |
jog0 | clarkb: I say merge it anyway since it is experimental | 21:43 |
jeblair | clarkb: so we don't have to make mordred redo the change again | 21:43 |
jog0 | actually let me make one change first | 21:43 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Try additional Rackspace region https://review.openstack.org/42677 | 21:43 |
jog0 | to make debugging a little easier for now | 21:43 |
jeblair | sigh | 21:44 |
lifeless | clarkb: so I clicked on 'test connection' in jenkins and now zuul got further | 21:44 |
clarkb | jeblair: :/ | 21:44 |
clarkb | jog0: ok I will hold off | 21:44 |
lifeless | http://paste.openstack.org/show/45841/ | 21:44 |
lifeless | ^should I worry about that ? | 21:44 |
uvirtbot | lifeless: Error: "should" is not a valid command. | 21:44 |
openstackgerrit | Joe Gordon proposed a change to openstack-infra/devstack-gate: Add support for large_ops tempest test https://review.openstack.org/43779 | 21:45 |
clarkb | lifeless: we update the build descriptions with info about changes and patchsets, I think that is what failed. Not a major concern unless it prevents something else from working | 21:45 |
*** mgagne has left #openstack-infra | 21:45 | |
jeblair | clarkb, mordred: i have explained why i still think ps5 is the right approach in my -1 for ps6 on 42677 | 21:45 |
lifeless | clarkb: so zuul didn't merge the patch itself | 21:46 |
lifeless | clarkb: I had to click 'review and submit' or something like that to make it happen | 21:46 |
clarkb | lifeless: but that info can be nice to have when you are debugging stuff through the jenkins UI | 21:46 |
jog0 | clarkb: there, disabled multiple n-cpu which *may* cause things to fail so have to rerun the test a few times | 21:46 |
clarkb | jeblair: looking | 21:46 |
lifeless | clarkb: separately, the job definition I have seems to be a no-op, I presume I didn't copy enough jjb stuff over | 21:46 |
clarkb | lifeless: you need everything in openstack_project/files/jenkins_job_builder/config for complete JJB configs | 21:47 |
*** thomasm has quit IRC | 21:47 | |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Try additional Rackspace region https://review.openstack.org/42677 | 21:47 |
openstackgerrit | A change was merged to openstack-infra/config: Use gerrit for the remote update in post jobs https://review.openstack.org/44988 | 21:48 |
zaro | lifeless: i could not load jenkins at https://jenkins01.testing-cabal.org/ | 21:48 |
openstackgerrit | James E. Blair proposed a change to openstack-infra/zuul: Use NNFI scheduler algorithm https://review.openstack.org/45334 | 21:49 |
clarkb | jeblair: I think the worry with the silent queue is valid because it is basically the check queue, I was less worried about experimental, but happy to trigger manually for now | 21:49 |
lifeless | zaro: be sure to login | 21:49 |
lifeless | zaro: or did it not let you get that far? | 21:49 |
zaro | lifeless: not even presented with login. | 21:49 |
zaro | lifeless: seems like it just keeps trying to load. | 21:49 |
lifeless | zaro: try now | 21:50 |
*** Ryan_Lane has joined #openstack-infra | 21:50 | |
zaro | lifeless: better now | 21:50 |
lifeless | clarkb: everything no matter what you're actually using? | 21:50 |
openstackgerrit | Alex Gaynor proposed a change to openstack-infra/config: Run the ceilometerclient tests under PyPy https://review.openstack.org/45336 | 21:50 |
lifeless | clarkb: see my review docs on jenkins to see what I was up to last night, which included jjb | 21:50 |
zaro | lifeless: should i be able to login? it says Error 'Failed to login:null' | 21:51 |
clarkb | lifeless: looking (you don't actually need everything, but you will if using arbitrary jobs) | 21:51 |
*** changbl has quit IRC | 21:51 | |
*** dina_belova has joined #openstack-infra | 21:52 | |
*** gordc has joined #openstack-infra | 21:53 | |
openstackgerrit | A change was merged to openstack-infra/config: Try additional Rackspace region https://review.openstack.org/42677 | 21:53 |
zaro | lifeless: i can see why you got the exception finding build number 2. i don't see any jenkins projects? | 21:54 |
*** dina_belova has quit IRC | 21:55 | |
jeblair | mordred: https://review.openstack.org/#/c/43145/ | 21:56 |
*** mrmartin has quit IRC | 21:57 | |
*** ericw has quit IRC | 21:57 | |
lifeless | zaro: I have lots - they are set for logged in viewing only | 21:57 |
mordred | jeblair: I'm fine with it | 21:58 |
openstackgerrit | Khai Do proposed a change to openstack-infra/config: Diff between installed packages and requirements https://review.openstack.org/45329 | 21:58 |
jeblair | mordred: your thoughts on clarkb's point about the setup.py hack removal? | 21:58 |
mordred | jeblair: I don't _think_ we need it any more | 21:58 |
lifeless | zaro: I just logged a fresh browser session in | 21:59 |
zaro | lifeless: that error cached on my chrome browser. tried to login with firefox, got further but reached error on launchpad login. | 22:02 |
openstackgerrit | James E. Blair proposed a change to openstack-infra/zuul: Use NNFI scheduler algorithm https://review.openstack.org/45334 | 22:02 |
*** tstevenson has quit IRC | 22:02 | |
zaro | lifeless: got oops id if that helps, OOPS ID: OOPS-8cf9ec9140bf40b6b7ecf191376ae159 | 22:02 |
*** ericw has joined #openstack-infra | 22:02 | |
lifeless | zaro: thats an LP thing | 22:03 |
lifeless | zaro: I can't poke at that these days, sorry. #launchpad should be able to help. | 22:03 |
*** zeus has quit IRC | 22:04 | |
*** prad has quit IRC | 22:05 | |
*** fungi has quit IRC | 22:08 | |
zaro | lifeless: i'm able to login to launchpad.net, i only get the error when i attempt to login redirected from your jenkins server. | 22:09 |
zaro | lifeless: so probably launchpad.net is ok. | 22:09 |
lifeless | zaro: it's not. | 22:09 |
lifeless | zaro: that OOPS-ID indicates an assertion on the LP side. | 22:10 |
*** atiwari has joined #openstack-infra | 22:11 | |
*** fungi has joined #openstack-infra | 22:11 | |
*** dolphm has quit IRC | 22:15 | |
*** burt has quit IRC | 22:15 | |
openstackgerrit | A change was merged to openstack-infra/config: Run the heatclient tests under PyPy https://review.openstack.org/44996 | 22:20 |
*** weshay has quit IRC | 22:21 | |
*** gordc has left #openstack-infra | 22:22 | |
openstackgerrit | Boris Pavlovic proposed a change to openstack-infra/config: Add new project Rally https://review.openstack.org/44952 | 22:24 |
*** jhesketh__ has quit IRC | 22:25 | |
lifeless | the presentations at openstack-infra/publications | 22:25 |
lifeless | what presentation toolchain is needed to work with them? | 22:26 |
lifeless | I see a large index.html; surely that isn't the preferred form of modification | 22:26 |
lifeless | ? | 22:26 |
jeblair | lifeless: which one? | 22:26 |
lifeless | overview seems to be the only one up there so far | 22:27 |
jeblair | i thought another had been added, maybe it's not merged yet | 22:27 |
jeblair | lifeless: anyway, yes, that's the preferred form of modification. | 22:27 |
jeblair | lifeless: skip past the boilerplate at the top; the actual slides are minimal html | 22:27 |
mordred | yup | 22:27 |
boris-42 | jeblair hi | 22:28 |
boris-42 | jeblair I rebase my patch set https://review.openstack.org/#/c/44952/ | 22:28 |
jeblair | as in, minimal, semantic, html | 22:28 |
mordred | it makes it exceptionally easy to edit and display in a browser at conferences easily | 22:28 |
boris-42 | jeblair and thanks for review | 22:29 |
jeblair | boris-42: you're welcome | 22:30 |
openstackgerrit | Chris Krelle proposed a change to openstack-infra/config: Add python 3.3 and pypy checks to ironicclient https://review.openstack.org/45327 | 22:31 |
mordred | jeez. what is it with all of the various things wanting submissions like, 6 months or further out? | 22:34 |
mordred | how the heck am I supposed to have an idea of what's going to be interesting 6 months from how | 22:35 |
mordred | no3w | 22:35 |
mordred | now | 22:35 |
mordred | gah | 22:35 |
mordred | RANT RANT RANT | 22:35 |
openstackgerrit | Chris Krelle proposed a change to openstack-infra/config: Add python 3.3 and pypy checks to ironicclient https://review.openstack.org/45327 | 22:37 |
openstackgerrit | A change was merged to openstack-infra/config: Add projects of Fuel family to Stackforge https://review.openstack.org/45044 | 22:40 |
jeblair | mordred: like what? | 22:42 |
clarkb | mordred: submissions for conferences? | 22:43 |
mordred | yeah | 22:43 |
mordred | a thing just came across asking me to vote for a submission for sxsw - and I didn't even realize that submissions had been opened/closed for that yet (I had been thinking about submitting this year) | 22:43 |
mordred | oh well | 22:43 |
lifeless | mordred: because if it's only interesting for a total of 6 months, it's not interesting enough to present. | 22:43 |
mordred | lifeless: well, I was mostly kidding about that part | 22:43 |
jog0 | clarkb: https://review.openstack.org/#/c/43779/ is ready | 22:44 |
clarkb | jog0: thanks | 22:44 |
jog0 | although maybe its worth just switching over to silent right away? | 22:44 |
jog0 | clarkb: as we have used the experimental command a bunch already | 22:44 |
clarkb | jog0: that would be in a change to openstack-infra/config so the d-g change is fine as is and ready for merging | 22:45 |
jog0 | clarkb: oh right | 22:45 |
*** ryanpetrello has quit IRC | 22:45 | |
clarkb | I have +2'd it, hopefully another core will drive by and give it approval | 22:45 |
jog0 | clarkb: thanks | 22:45 |
*** dims has quit IRC | 22:46 | |
openstackgerrit | A change was merged to openstack-infra/config: Preserve change creation time on project renames https://review.openstack.org/45155 | 22:46 |
jeblair | the cells change passed? | 22:46 |
lifeless | yay. | 22:46 |
jeblair | s/change/job/ | 22:46 |
lifeless | 2013-09-05 22:46:18 ERROR: toxini file 'tox.ini' not found | 22:46 |
lifeless | I have test infrastructure. Ish. | 22:46 |
lifeless | now to write up something useful for people :) | 22:47 |
mordred | lifeless: wot! | 22:47 |
lifeless | or punt and just give the overview talk :) | 22:47 |
mordred | woot! | 22:47 |
lifeless | I don't have 'logs' or 'static' or zmq or a bunch of other services | 22:47 |
lifeless | It would be really nice to run most of this in a lxc cloud spread overlay style over multiple providers | 22:48 |
lifeless | just from a 'I don't have a huge budget' perspective | 22:48 |
mordred | jeblair, clarkb: https://review.openstack.org/#/c/44993/4 | 22:48 |
wenlock | lifeless: thats awesome, where will your docs show up? | 22:48 |
mordred | the root user is not passwordless on machines managed by puppet mysql | 22:49 |
mordred | the module makes a password and puts it in /root/.my.cnf | 22:49 |
jog0 | mordred: thanks | 22:49 |
jog0 | btw yesterday jeblair and i figured out that every release costs about 100k in cloud servers | 22:50 |
jog0 | for infra | 22:50 |
mordred | jog0: nice | 22:50 |
jeblair | jog0: for devstack-gate | 22:50 |
jeblair | there's still more if you count the rest of infra | 22:50 |
jog0 | with an average of 60 or 90 8GB VMs running at any given time (at ~$200 a month) | 22:51 |
*** thomasm has joined #openstack-infra | 22:51 | |
clarkb | is that all? | 22:51 |
*** ericw has quit IRC | 22:51 | |
jeblair | clarkb: we need to use moar | 22:51 |
*** dina_belova has joined #openstack-infra | 22:52 | |
jog0 | ++ | 22:53 |
*** dkliban has quit IRC | 22:53 | |
jeblair | jog0: https://review.openstack.org/#/c/45334/ will use moar | 22:53 |
clarkb | I guess we are not counting, jenkins, gerrit, logstash, elasticsearch, and so on | 22:53 |
*** ericw has joined #openstack-infra | 22:53 | |
jog0 | jeblair: woot | 22:54 |
jog0 | jeblair: and the change you mentioned about the nodepool to queue thing | 22:54 |
*** dina_belova has quit IRC | 22:57 | |
jeblair | jog0: yes | 22:57 |
jeblair | clarkb, mordred: yesterday i had the idea that nodepool should be a gearman client that inspects the zuul gearman queue and determines load that way, so that if there are 50 jobs in the queue, it can immediately launch 50 jobs (beyond the min-ready it would otherwise launch) | 22:58 |
jeblair | s/launch 50 jobs/launch 50 nodes/ | 22:58 |
jog0 | jeblair: I didn't realize you were making that up as you went ;) | 22:58 |
*** pcrews has quit IRC | 22:59 | |
jog0 | so I forgot who asked the other day, but the large ops test works with neutron too | 22:59 |
lifeless | wenlock: this mornings docs are not yet up | 23:00 |
lifeless | wenlock: and I have a bunch of errata that isn't yet documented | 23:01 |
lifeless | wenlock: but - https://review.openstack.org/#/dashboard/4190 | 23:01 |
lifeless | wenlock: (and also 20 or so recent commits to infra/config) | 23:01 |
*** dims has joined #openstack-infra | 23:01 | |
*** dims has quit IRC | 23:03 | |
*** sarob_ has joined #openstack-infra | 23:08 | |
fungi | lifeless: are you prepping this preso for linuxconf/cloudopen in nola then? | 23:08 |
clarkb | jog0: I was curious but someone else asked | 23:09 |
lifeless | incoming | 23:10 |
*** wenlock has quit IRC | 23:10 | |
lifeless | fungi: kiwipycon, tomorrow. | 23:10 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Document Jenkin slave management. https://review.openstack.org/45345 | 23:10 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Document running custom slaves in ones own infra. https://review.openstack.org/45346 | 23:10 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Docs on bringing up Jenkins in new infrastructures. https://review.openstack.org/45216 | 23:10 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Gerrit docs improvements - user and groups. https://review.openstack.org/45001 | 23:10 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Document spinning up a derived zuul. https://review.openstack.org/45164 | 23:10 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Phase 3 infra bootstrap docs: gerrit. https://review.openstack.org/44970 | 23:10 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Document review.pp parameters a bit. https://review.openstack.org/44969 | 23:10 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Improve Zuul docs. https://review.openstack.org/45163 | 23:10 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Document push key acceptance. https://review.openstack.org/45162 | 23:10 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Make gerrit DB setup match actual practice. https://review.openstack.org/44993 | 23:10 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Document basic admin hints for jeepyb. https://review.openstack.org/45043 | 23:10 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Non-openstack-ci support for launch/dns.py. https://review.openstack.org/44980 | 23:10 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Document bootstrapping of Gerrit ACLs. https://review.openstack.org/45011 | 23:10 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Improve Jenkins documentation. https://review.openstack.org/45215 | 23:10 |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Explain API projects a little. https://review.openstack.org/45111 | 23:10 |
*** sarob has quit IRC | 23:11 | |
lifeless | and I must apologise, I haven't even looked for code review feedback yet, been too flat out on this arc. | 23:11 |
jog0 | clarkb: it means we can run large_ops even with neutron | 23:11 |
*** UtahDave has joined #openstack-infra | 23:11 | |
mordred | wow. that's so many patches | 23:12 |
jeblair | lifeless: ok, i won't re-review them then. :) | 23:12 |
lifeless | mordred: about as many landed last week | 23:12 |
*** pcm_ has quit IRC | 23:12 | |
fungi | lifeless: you have to fly to kiwipycon? i suppose i never quite grokked the sheer size of the island | 23:12 |
lifeless | fungi: 2000km or so north to south | 23:12 |
*** nijaba has quit IRC | 23:12 | |
lifeless | fungi: also 2.5 main islands :) | 23:13 |
davidlenwell | driving from north to south in nz is fun tho | 23:13 |
*** nijaba has joined #openstack-infra | 23:13 | |
*** sarob_ has quit IRC | 23:13 | |
lifeless | fungi: so it's totally road-trippable, but you can bet I would not have this all done if we had been driving up | 23:13 |
davidlenwell | once you get used to being on the wrong side of the road | 23:13 |
lifeless | davidlenwell: 'wrong' | 23:14 |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/config: Remove ::1 mysql root user. https://review.openstack.org/45347 | 23:14 |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/config: Upgrade puppetlabs-mysql to 0.6.1. https://review.openstack.org/45348 | 23:14 |
davidlenwell | I knew you would catch that | 23:14 |
lifeless | :) | 23:14 |
fungi | yikes. i assumed that was just mercator projection aberration or something | 23:14 |
clarkb | jeblair: mordred ^ | 23:14 |
clarkb | fungi: ^ | 23:14 |
lifeless | fungi: it's basically a single long fault line | 23:14 |
clarkb | note 0.6.1 has not been tested at all, and should possibly be WIPed | 23:14 |
lifeless | fungi: subduction zones 3 eva. | 23:14 |
lifeless | s/3/4/ | 23:15 |
mordred | clarkb: instead of removing the user entry - perhaps just adding $mysql_root_password to it? | 23:15 |
clarkb | mordred: I like removing it because it is consistent with the account_security manifest | 23:15 |
lifeless | fungi: a major reason for NZ existing is that the pacific plate is digging under the australian plate as they move by. | 23:15 |
mordred | clarkb: can you explain that further? | 23:15 |
fungi | clarkb: yeah we had a similar change proposed a few months ago to upgrade the module, and abandoned it because nobody had time to confirm it wouldn't break things | 23:15 |
fungi | lifeless: sounds like a good reason to exist | 23:15 |
*** boris-42 has quit IRC | 23:16 | |
lifeless | fungi: from memory mercator only distorts horizontally, no ? | 23:16 |
clarkb | mordred: https://github.com/puppetlabs/puppetlabs-mysql/blob/0.5.0/manifests/server/account_security.pp we use that today, it is just missing root@::1 | 23:16 |
fungi | lifeless: yeah, so it was obviously a bad assumption on my part | 23:16 |
mordred | clarkb: ok | 23:16 |
clarkb | sdague: are you still around? | 23:17 |
mordred | clarkb: I'm still actually not thrilled with that design | 23:18 |
mordred | clarkb: but I'm willing to just give up on it for now | 23:18 |
clarkb | mordred: can you explain why that account exists at all? | 23:18 |
clarkb | if you can't edit /etc/hosts you probably don't belong on the server as root | 23:19 |
mordred | clarkb: the unique key for a user connecting to mysql is user@host | 23:19 |
clarkb | that is the only use case I can figure out | 23:19 |
openstackgerrit | A change was merged to openstack-infra/config: Add pep8 checks for wsme. https://review.openstack.org/45322 | 23:19 |
mordred | btw: | 23:19 |
mordred | mordred@review:~$ mysql -uroot -h::1 | 23:19 |
mordred | ERROR 2003 (HY000): Can't connect to MySQL server on '::1' (111) | 23:19 |
clarkb | oh cool | 23:19 |
mordred | so, I'm pretty sure it's kind of meaningless | 23:19 |
mordred | so I'm now more ok with you removing it | 23:19 |
clarkb | sdague: wondering if you have a good etherpad-lite version that I should be using | 23:19 |
mordred | my concern is that there is a lookup/fallback sequence | 23:20 |
sdague | clarkb: oh, yeh, hold on I'll give you the git hash I'm running | 23:20 |
mordred | and I honestly don't know where ::1 falls in relation to 127.0.0.1 and localhost | 23:20 |
mordred | BUT - it seems that root@localhost which has a password is taking precedence over root@::1 | 23:20 |
clarkb | sdague: are you using a hash newer than the last tagged version? | 23:20 |
mordred | which is good | 23:21 |
sdague | eplite_version => "cd277e5810a0ed2f2dac204a340e588fc329669b" | 23:21 |
sdague | clarkb: yes | 23:21 |
clarkb | thanks | 23:21 |
sdague | well, it was at the time | 23:21 |
clarkb | sdague: current version is 1.2.91 | 23:21 |
sdague | yeh, I think I'm newer than that | 23:21 |
sdague | basically, their releases don't mean anything | 23:21 |
sdague | and I was told if I found a bug, git pull, before asking about it | 23:22 |
sdague | so I did, it went away, and such it is | 23:22 |
sdague | I'm also on node 0.10.1 | 23:22 |
sdague | fwiw | 23:22 |
clarkb | thanks | 23:22 |
clarkb | all very useful | 23:22 |
mordred | sdague: anything of note that's better in your eplite over ours? | 23:23 |
sdague | I can actually port my puppet stuff back up tomorrow if you like, my policy is a moderate fork of the config repo | 23:23 |
sdague | mordred: just that ep_heading module, which I tried to add to dev, and it exploded | 23:23 |
sdague | so maybe the new hash fixes it | 23:24 |
mordred | sdague: lifeless has just been taking a bunch of notes/patches in an attempt to make tracking it less forky | 23:24 |
sdague | cool | 23:24 |
jeblair | mordred: i don't think lifeless has been doing anything with etherpad | 23:24 |
mordred | clarkb: what's ep_headings do? | 23:24 |
clarkb | sdague: I am going to take a chainsaw to the manifests | 23:24 |
mordred | jeblair: not at all | 23:24 |
clarkb | mordred: gives you headings in the text | 23:24 |
sdague | mordred: it gives you H* headers (html style) with a style bar | 23:24 |
clarkb | sdague: but I think the end product should be more friendly, will wait for your input though | 23:24 |
mordred | jeblair: just letting sdague know that lifeless has been poking at reusability since sdague runs a mild fork | 23:25 |
mordred | clarkb: neat | 23:25 |
clarkb | sdague: really need newer version up before summit so that we can test it a bit | 23:25 |
clarkb | sdague: do you still have the issue with pages that go over a certain size? | 23:25 |
clarkb | sdague: and if so any idea if redis fixes that? | 23:25 |
sdague | clarkb: yes | 23:25 |
sdague | no, it's a client side problem | 23:25 |
clarkb | oh good, now I don't have to worry about redis | 23:25 |
sdague | it might be fixed in latest pulls, I haven't updated recently | 23:26 |
sdague | yeh, I'm using mysql | 23:26 |
clarkb | sdague: maybe I will start with the tip of master and fall back on your hash if I have trouble | 23:26 |
sdague | I think other notable things I changed was to use the mysql puppet module to set up the db, which I think was different than upstream | 23:26 |
clarkb | sdague: it is different, switching to that module is on the list of things to do | 23:27 |
sdague | and I created a monit rule to watch etherpad, so it would restart it when it crashed | 23:27 |
mordred | yup. so if you've got that patch, we'd love to see it | 23:27 |
clarkb | sdague: also support for arbitrary mysql dbs so that we can use mysql as a service | 23:27 |
sdague | yeh, I can set aside an hour tomorrow to try to pull this back in as a patch to upstream | 23:27 |
sdague | or I can do a merge later if you want to hack and slash now clarkb | 23:28 |
*** sarob has joined #openstack-infra | 23:28 | |
sdague | or just throw my stuff up randomly somewhere | 23:28 |
mordred | yeah. I want to move the mysql setup info from ::etherpad to openstack_project::etherpad and pass mysql connection info to ::etherpad (which is a pattern I'd like to do everywhere) | 23:28 |
sdague | if you want to copy / paste things you like | 23:28 |
clarkb | sdague: I feel like a little hack and slash will be a good thing, if nothing else it will refamiliarize myself with the module | 23:28 |
lifeless | jeblair: I haven't touched the etherpad recipe properly, no. | 23:28 |
sdague | yep, no worries | 23:28 |
mordred | as a step towards composability and possibly dbaas | 23:28 |
sdague | ok, got to deal with dinner things | 23:28 |
lifeless | jeblair: indeed I haven't /really/ done anything for reusability. | 23:29 |
sdague | talk to you tomorrow, and throw me on any reviews for this, I can throw in $0.02 on stuff I found useful | 23:29 |
lifeless | jeblair: though that is the goal, mainly I've just been capturing 'a' recipe to fork with lots of instructions. | 23:29 |
lifeless | jeblair: I think the next stage in this arc would be to treat all the places I say 'copy' or 'migrate' as bugs, either in how I approached it, or in the reusability-of-the-modules | 23:29 |
lifeless | and then start burning them down | 23:29 |
lifeless | if sdague is running a clone, and mordred is/has someone running a clone | 23:30 |
lifeless | there should be lots of folk interested in doing that work | 23:30 |
lifeless | jeblair: on reviews - I am now in consolidate-and-present mode; after that - e.g. Monday - I will go through and address code review comments on these changes | 23:31 |
jeblair | lifeless: np; i think you have a week before auto-abandon | 23:32 |
clarkb | sdague: will do | 23:32 |
lifeless | jeblair: a week of no changes in fact :) | 23:32 |
jeblair | mordred, clarkb: https://jenkins02.openstack.org/view/All/job/gate-tempest-devstack-vm-iad-trial/1/console | 23:32 |
lifeless | jeblair: my main point is that if you want to code review now, I won't be pushing up new patchsets without addressing code review comments *first* | 23:32 |
lifeless | jeblair: since the manic rush to get to the point of being able to present *something* is over | 23:33 |
clarkb | lifeless: I think you can probably make them independent changes | 23:34 |
clarkb | jeblair: cool | 23:34 |
lifeless | clarkb: I could, but effort. | 23:35 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/config: Move MySQL creation out of gerrit class https://review.openstack.org/45351 | 23:37 |
mordred | clarkb: ^^ | 23:37 |
mordred | clarkb: straw-man | 23:37 |
mordred | clarkb: I'm going to WIP it - but something like that is a step I was thinking | 23:37 |
lifeless | mordred: so things like that will cause derived infra to have to copy the change - thats the sort of stuff I really want to see stop being copy-paste | 23:37 |
mordred | lifeless: agree | 23:38 |
*** senk has joined #openstack-infra | 23:40 | |
*** jhesketh_ has joined #openstack-infra | 23:41 | |
*** jhesketh has quit IRC | 23:42 | |
*** dims has joined #openstack-infra | 23:42 | |
*** pcrews has joined #openstack-infra | 23:49 | |
*** ericw has quit IRC | 23:49 | |
*** atiwari has quit IRC | 23:50 | |
*** dina_belova has joined #openstack-infra | 23:53 | |
*** pcrews has quit IRC | 23:55 | |
openstackgerrit | Russell Bryant proposed a change to openstack-infra/reviewstats: Update nova-core. https://review.openstack.org/45356 | 23:56 |
openstackgerrit | A change was merged to openstack-infra/reviewstats: Update nova-core. https://review.openstack.org/45356 | 23:56 |
*** dina_belova has quit IRC | 23:58 |
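
jeblair's idea at 22:58 (nodepool watching the zuul gearman queue and launching one node per queued job, on top of the min-ready it would launch anyway) comes down to simple arithmetic. The sketch below only illustrates that arithmetic and is not nodepool code: the function name `nodes_to_launch`, every parameter name, and the `max_nodes` provider cap are assumptions invented for the example, and it never talks to gearman.

```python
def nodes_to_launch(queued_jobs, min_ready, ready_nodes, building_nodes,
                    in_use_nodes, max_nodes):
    """Return how many new nodes to request right now.

    All names are hypothetical. The demand model follows the log:
    one node per queued job, plus the usual min-ready buffer.
    """
    # Nodes wanted: one per waiting job, plus the idle buffer.
    demand = queued_jobs + min_ready
    # Nodes already idle or on their way up count against demand.
    supply = ready_nodes + building_nodes
    shortfall = max(0, demand - supply)
    # Assumed provider quota: never exceed max_nodes in total.
    headroom = max(0, max_nodes - (in_use_nodes + supply))
    return min(shortfall, headroom)


if __name__ == "__main__":
    # 50 jobs queued, min-ready buffer of 10 already satisfied.
    print(nodes_to_launch(queued_jobs=50, min_ready=10, ready_nodes=10,
                          building_nodes=0, in_use_nodes=0, max_nodes=100))
```

With 50 jobs queued and the min-ready buffer already filled, this asks for 50 nodes, matching the example in the log. The quota cap is an addition of this sketch, included because a real provider would presumably refuse unbounded launches.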