jog0 | lifeless: and based on http://logstash.openstack.org/#eyJzZWFyY2giOiJtZXNzYWdlOlwiSW1wb3J0RXJyb3I6IE5vIG1vZHVsZSBuYW1lZCBwYXNzbGliLmhhc2hcIiIsImZpZWxkcyI6W10sIm9mZnNldCI6MCwidGltZWZyYW1lIjoiMTcyODAwIiwiZ3JhcGhtb2RlIjoiY291bnQiLCJ0aW1lIjp7InVzZXJfaW50ZXJ2YWwiOjB9LCJzdGFtcCI6MTM5MDI2MjQwNjMzOH0= | 00:00 |
jog0 | I ignored it | 00:00 |
jog0 | being I am not supposed to be working | 00:00 |
lifeless | jog0: ack | 00:02 |
jog0 | lifeless: I think your effort to make delete faster will probably give us the most bang for our buck right now | 00:03 |
*** dcramer_ has quit IRC | 00:03 | |
*** david-lyle_ has quit IRC | 00:05 | |
jog0 | and the gate just reset again | 00:05 |
lifeless | jog0: So it's done AFAICT, just needs infra folk to | 00:06 |
bknudson | jog0: it's the "Error: iSCSI device not found at /dev/disk/by-path/ip-127.0.0.1:3260-iscsi-" problem. | 00:06 |
bknudson | in gate-grenade this time. | 00:07 |
*** rakhmerov has quit IRC | 00:08 | |
*** rakhmerov1 has joined #openstack-infra | 00:08 | |
*** oubiwann_ has quit IRC | 00:08 | |
lifeless | clarkb: fungi: mordred: ^ So, any of you around to apply this nodepool stack? | 00:11 |
lifeless | https://review.openstack.org/#/c/67985/ + https://review.openstack.org/#/c/67986/ | 00:12 |
lifeless | clarkb: fungi: mordred: ^ | 00:12 |
*** ryanpetrello has quit IRC | 00:13 | |
*** rakhmerov1 has quit IRC | 00:14 | |
openstackgerrit | Sean Dague proposed a change to openstack-infra/elastic-recheck: narrow to just pci explosion https://review.openstack.org/67989 | 00:16 |
*** markwash has quit IRC | 00:16 | |
*** markwash has joined #openstack-infra | 00:18 | |
*** moted has quit IRC | 00:19 | |
*** moted has joined #openstack-infra | 00:19 | |
fungi | lifeless: i can look in a moment. trying to grab a quick bite and get evening chores out of the way so i can pack bags--not a lot of time to kill and clean up after nodepool so we can restart it. the external delete loop i've got going the past couple hours seems to be helping some | 00:21 |
lifeless | fungi: what cleanup do you need to do after you kill it ? | 00:22 |
fungi | lifeless: at least previously, it had a tendency to leave in-progress image builds, node builds and so on in disarray | 00:24 |
lifeless | fungi: I've fixed at least one such bug in passing :) | 00:24 |
fungi | so mostly circling back around to clear those out. sounds like maybe it will be less effort with that | 00:25 |
lifeless | fungi: note that there is an explicit 15m delay before it cleans all those out | 00:25 |
fungi | also, i do want to make sure that if i put any changes in place, i'll also be around to roll them back if we end up breaking it badly for some unforeseen reason | 00:26 |
fungi | hopefully in about 30 minutes i can free up to give it a go | 00:26 |
fungi | lifeless: how much additional log noise do you expect 67924 to add? | 00:27 |
fungi | just curious if we're going to be blowing up the logs with that | 00:28 |
lifeless | hopefully none | 00:28 |
lifeless | would you like me to make it debug ? | 00:28 |
lifeless | but for instance, if the rackspace api takes 100ms to call | 00:28 |
lifeless | we'd log that message 1/second when we have lots of servers to delete | 00:28 |
fungi | no, i think warning is fine if this is a condition we should warn the administrator about | 00:28 |
lifeless | mmm | 00:29 |
lifeless | perhaps info | 00:29 |
lifeless | fungi: it's easy to change if needed | 00:30 |
lifeless | fungi: the problem is we just don't know if it's happening or not at the moment | 00:30 |
fungi | right, i'm fine with it. should be back in a bit to go over the rest | 00:31 |
openstackgerrit | A change was merged to openstack-infra/elastic-recheck: narrow to just pci explosion https://review.openstack.org/67989 | 00:32 |
lifeless | oh, possibly - lol I suspect we're throttling all our api calls (e.g. floating ip deallocation) to 1/second | 00:32 |
lifeless | we probably *do* want to make more calls than that | 00:32 |
lifeless | since deleting a server is 1 list, 1 list floating ips, 1 remove ip, 1 delete ip, 1 list keypair, 1 delete keypair, 1 delete server, 1 list server | 00:33 |
lifeless | or 8 seconds per server at the moment | 00:33 |
*** xchu_ has joined #openstack-infra | 00:39 | |
*** sandywalsh has quit IRC | 00:42 | |
sdague | fungi: when you get back, I'm going to propose ninja merging a test skip | 00:43 |
*** rnirmal has quit IRC | 00:45 | |
lifeless | fungi: found another one | 00:46 |
openstackgerrit | lifeless proposed a change to openstack-infra/nodepool: Default to a ratelimit of 10/second for API calls https://review.openstack.org/67993 | 00:46 |
lifeless | deleting 300 servers without this patch == 8 seconds per, or 2400 seconds. | 00:47 |
sdague | lifeless: nice | 00:47 |
jeblair | lifeless: there is one manager per provider, we have 6 providers | 00:47 |
jeblair | lifeless: divide all your numbers by 6 | 00:48 |
lifeless | jeblair: ack - still | 00:48 |
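The arithmetic from the exchange above, collected in one place. This is only an illustrative sketch, not nodepool code; the figures are the ones quoted in the log (eight API calls per delete, roughly 300 servers, six provider managers), and the function and its name are made up for this example.

```python
# Illustrative arithmetic only -- not nodepool source. Figures come from the
# log above: deleting one server costs about eight API calls, each provider
# has its own task manager, and calls were being serialized at ~1 per second.

CALLS_PER_DELETE = 8  # list servers, list floating ips, remove ip, delete ip,
                      # list keypair, delete keypair, delete server, list server


def delete_time_seconds(servers, rate_per_second=1.0, providers=1):
    """Rough time to delete `servers` nodes when each provider's manager
    issues API calls at `rate_per_second` and work is spread evenly."""
    return servers * CALLS_PER_DELETE / rate_per_second / providers


if __name__ == "__main__":
    print(delete_time_seconds(1))                          # ~8 s per server
    print(delete_time_seconds(300))                        # ~2400 s via one manager
    print(delete_time_seconds(300, providers=6))           # ~400 s across six providers
    print(delete_time_seconds(300, rate_per_second=2,      # ~200 s at the proposed
                              providers=6))                #   2 calls/second default
```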
jeblair | lifeless: please base the rate limits on actual rates from our providers | 00:48 |
*** markmcclain has joined #openstack-infra | 00:48 | |
jeblair | lifeless: istr that 1/sec is an actual rate from at least one of our providers | 00:48 |
lifeless | jeblair: per provider actual figures should go in the config no? We haven't set that at all | 00:48 |
lifeless | jeblair: I will have a look and see what I can find | 00:49 |
jeblair | lifeless: hitting rate limit timeouts is not going to make it faster | 00:49 |
lifeless | jeblair: I know | 00:49 |
lifeless | jeblair: next step along that path is to issue fewer API calls, e.g. request deletion of several resources at once | 00:49 |
jeblair | lifeless: also, the 900 second delay in the node delete method is to avoid the periodic cleanup thread stepping on nodes that are currently being handled by a parallel delete thread | 00:50 |
lifeless | for HP cloud, you have to query them - I can only find the limits for *me*, not for infra | 00:50 |
jeblair | lifeless: please don't remove that protection -- having multiple threads working on the same node is bad news for sqlalchemy | 00:50 |
lifeless | http://docs.hpcloud.com/api/compute/#listLimits-jumplink-span | 00:50 |
lifeless | jeblair: the next patch avoids that problem | 00:50 |
lifeless | jeblair: I can fold the patches together if you want, though they are conceptually separate; reordering doesn't make much sense either | 00:51 |
lifeless | jeblair: btw the 900 second thing hasn't been working ever | 00:51 |
*** dcramer_ has joined #openstack-infra | 00:51 | |
lifeless | jeblair: https://review.openstack.org/#/c/67980/ | 00:51 |
jeblair | lifeless: i understand, but removing it without putting some kind of protection in isn't going to make it better | 00:52 |
lifeless | jeblair: so while it might be bad news, nodepool thus far has been operating *without it* | 00:52 |
lifeless | jeblair: I fix it | 00:52 |
jeblair | lifeless: no, it's just been erroring out in a different place | 00:52 |
lifeless | jeblair: that's too cryptic for me, I don't understand how time.time() < 900 could ever have been true | 00:53 |
jeblair | lifeless: i'm happy you are fixing it; i'm just asking that you please not design a system where two threads are fighting over the same node -- we already know it doesn't work | 00:53 |
jeblair | lifeless: i'm not saying it worked | 00:53 |
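For readers following along: the complaint about `time.time() < 900` is that `time.time()` returns seconds since the epoch (a ten-digit number in 2014), so a comparison of that shape can never express "this node changed state less than 15 minutes ago". The sketch below is a hypothetical reconstruction of that kind of bug and its fix, not the actual nodepool source; the `Node` class and `state_time` attribute are stand-ins.

```python
import time

DELAY = 900  # the intended 15-minute grace period before cleanup


class Node:
    """Stand-in for a nodedb node; state_time is an epoch timestamp."""
    def __init__(self, state_time):
        self.state_time = state_time


def too_young_broken(node):
    # Broken shape: time.time() is seconds since the epoch (~1.39e9 in
    # January 2014), so this is never True and the guard never fires.
    return time.time() < DELAY


def too_young(node):
    # Intended shape: compare the elapsed time since the node entered its
    # current state against the delay.
    return time.time() - node.state_time < DELAY


if __name__ == "__main__":
    fresh = Node(state_time=time.time() - 60)    # changed state a minute ago
    stale = Node(state_time=time.time() - 3600)  # changed state an hour ago
    print(too_young_broken(fresh), too_young_broken(stale))  # False False
    print(too_young(fresh), too_young(stale))                # True False
```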
lifeless | ok, 10/s is faster than rax's default limits, though we can get them changed | 00:56 |
lifeless | http://docs.rackspace.com/loadbalancers/api/v1.0/clb-devguide/content/Determining_Limits_Programmatically-d1e1039.html | 00:56 |
lifeless | needs to be followed to find out if they have been changed for us already | 00:56 |
lifeless | I'm happy to translate the answers for both HP cloud and rax into the nodepool yaml file; and I'll change the patch to be more conservative (2/s) by default | 00:56 |
bknudson | do reviews like https://review.openstack.org/#/c/65161/ slow down the gate much? | 00:57 |
bknudson | it's approved but fails to merge now | 00:57 |
bknudson | due to some other conflicting change merged already | 00:57 |
openstackgerrit | lifeless proposed a change to openstack-infra/nodepool: Default to a ratelimit of 2/second for API calls https://review.openstack.org/67993 | 00:58 |
jeblair | lifeless: i'm sorry i'm not well enough to stick around and work through this with you; i need to go and rest. i'm just trying to provide info to help you know that there is at least some real-world experience in the choices in nodepool. | 00:58 |
lifeless | jeblair: I know it's there; I'm trying to preserve that | 00:59 |
*** yongli has quit IRC | 00:59 | |
lifeless | jeblair: I don't see any new races possible in my code; I ack that there is an existing one, but will this make it worse? | 00:59 |
lifeless | jeblair: dropping out of delete early should in fact make it better than it was, AFAICT | 01:00 |
jeblair | lifeless: "This may cause multiple conccurent deletion attempts with the completion handler threads, but that should be harmless (no worse than someone running nodepool node-delete from the command line)." | 01:00 |
jeblair | lifeless: that should really be a red flag; deleting things from the command line causes stuff to fail all the time | 01:00 |
lifeless | jeblair: yes, which fungi is triggering right now | 01:00 |
jeblair | we don't want the program to do that on purpose | 01:00 |
lifeless | jeblair: ah - new data! | 01:00 |
mikal | ewindisch: http://zuul.rcbops.com/workload/ is a graph of what turbo hipster is doing, to give you an idea of workload | 01:01 |
jeblair | lifeless: that 900 second delay was intended to avoid that (i grant you it failed, but it was added because we saw _real_ failures) | 01:01 |
lifeless | jeblair: ok, I will make sure the threads don't stomp on each other | 01:01 |
lifeless | using a similar mechanism | 01:01 |
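One generic way to get the property being asked for here (two threads never working on the same node) is to have each thread claim a node before touching it and skip nodes that are already claimed. The following is a minimal in-process sketch of that idea only; it is not what nodepool actually does, and the function names are invented for illustration.

```python
import threading

_claimed = set()            # node ids currently being worked on
_claimed_lock = threading.Lock()


def try_claim(node_id):
    """Atomically claim a node; return False if another thread (the periodic
    cleanup or a parallel delete thread) already owns it."""
    with _claimed_lock:
        if node_id in _claimed:
            return False
        _claimed.add(node_id)
        return True


def release(node_id):
    with _claimed_lock:
        _claimed.discard(node_id)


def delete_node(node_id, do_delete):
    """Run do_delete(node_id) only if this thread wins the claim."""
    if not try_claim(node_id):
        return False  # someone else is already handling this node; skip it
    try:
        do_delete(node_id)
        return True
    finally:
        release(node_id)
```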
*** yongli has joined #openstack-infra | 01:01 | |
jeblair | lifeless: another change went into the periodic cleanup at the same time which caused us to not notice/care as much | 01:01 |
ewindisch | mikal: nice! | 01:01 |
mikal | :) | 01:01 |
mikal | The big dip is because I can't have nice things | 01:02 |
jeblair | lifeless: (the skip on failure and proceed with a new transaction change) | 01:02 |
jeblair | lifeless: okay, i'm really going to go now; i think i've conveyed the info i needed to in order to help; thanks for pitching in. | 01:03 |
mordred | mikal: that's an interesting graph - do we have that in the infra system? | 01:04 |
mikal | mordred: no, because you're lame | 01:04 |
mikal | :P | 01:04 |
*** UtahDave has quit IRC | 01:04 | |
mordred | mikal: really? I don't remember seeing the patch to put it in ... | 01:04 |
mikal | mordred: that's just a quick and dirty graph of the output of the mysql reporter, which hasn't landed in zuul and is unlikely to be used by infra | 01:04 |
mikal | mordred: but which rocks my world for adhoc reporting | 01:05 |
mordred | ah - so you aren't reporting stats to a statsd that could be used by graphite and thus reused upstream? I'm sad that you're making something for ad-hoc reporting that the project can't use | 01:05 |
mikal | Please hold, cuddling daughter | 01:07 |
mikal | Ok, so | 01:08 |
mikal | It's more complicated than that, of course | 01:08 |
mikal | And has been discussed with James / in a gerrit code review | 01:08 |
mikal | I wrote a mysql reporter because I needed a way of rapidly finding TH misvotes so I could recheck stuff while bedding down TH | 01:09 |
mikal | It's perfect for that | 01:09 |
mikal | Infra says they're not interested because they can use logstash / elastic search to do that sort of thing | 01:09 |
mikal | Whereas I'm not excited by setting up logstash at this point in my life | 01:09 |
sdague | so given that we're not seeming to make progress on the various volume issues - I'm going to propose we turn off the test that seems to be tickling them most often - https://review.openstack.org/#/c/67991/ | 01:09 |
mikal | The graphing is a 50 line python script which just dumps a summary in the right format for the graphing library | 01:10 |
mikal | And infra is welcome to it if they want it | 01:10 |
sdague | mikal: you should be excited to setup logstash, it's kind of awesome :) | 01:10 |
mikal | mordred: ^-- | 01:10 |
mikal | sdague: sure, but I'm focussed on TH reliability at the moment. Anything not on the critical path for that isn't going to happen. | 01:10 |
mikal | Well, isn't going to happen in the next few weeks. | 01:10 |
sdague | mikal: fair | 01:10 |
*** ivar-lazzaro_ has joined #openstack-infra | 01:12 | |
sdague | http://logs.openstack.org/71/67971/2/check/check-tempest-dsvm-full/a71a086/console.html#_2014-01-21_00_16_14_358 - looks like some network hiccups with the mirror | 01:16 |
*** krotscheck has quit IRC | 01:17 | |
*** zhiwei has joined #openstack-infra | 01:18 | |
*** dcramer_ has quit IRC | 01:19 | |
mordred | mikal: fair enough. still, I'm sad whenever you make tools that I can't use. you know, because of bunnies and unicorns | 01:21 |
lifeless | jeblair: on the off chance you aren't actually gone, I think the right thing to do is to move all the deletes to periodic, avoid the races entirely. I'm going to make my patch do that and we can discuss in detail when you're feeling better. | 01:21 |
sdague | mordred: don't forget ponies | 01:22 |
fungi | okay, back now | 01:23 |
mordred | sdague: I'm sure there are - I REALLY need to make a mirror-per-cloud-region | 01:23 |
mordred | OR - what would be great ... | 01:24 |
sdague | get on that slacker.... ;) | 01:24 |
mordred | dstufft: you know, npm has this feature "--offline", which will satisfy requests purely from local cache if a local cache exists and will not hit internet indexes ... | 01:25 |
mordred | dstufft: it would help _soooo_ many things if pip had one of those | 01:25 |
sdague | yeh, agreed | 01:25 |
*** markmcclain has quit IRC | 01:25 | |
sdague | fungi: ok so next time we get a gate reset - https://review.openstack.org/#/c/67991/ promote that | 01:26 |
sdague | or ninja merge | 01:26 |
fungi | sdague: yep, saw your ping on -qa | 01:26 |
fungi | looking it over real fast like | 01:26 |
sdague | ok, great so you have context | 01:26 |
sdague | basically, turns off a test | 01:26 |
* fungi nods | 01:26 | |
sdague | that is failing enough that we can go figure it out in a side channel | 01:27 |
sdague | because figuring it out in the current gate state isn't working | 01:27 |
mordred | sdague: ++ | 01:28 |
sdague | and with that, I'm done for the day. As I've been plugging for about 14hrs at this point. I'm also not going to be very responsive until afternoon my time tomorrow, as I need to do a couple other things in the morning. | 01:29 |
dstufft | mordred: file a ticket please | 01:29 |
openstackgerrit | Monty Taylor proposed a change to openstack-infra/storyboard-webclient: Storyboard API Interface and basic project management https://review.openstack.org/67582 | 01:29 |
mordred | dstufft: damn. that's a sensible response. I just wanted to complain :) | 01:29 |
*** david-lyle_ has joined #openstack-infra | 01:34 | |
openstackgerrit | Ken'ichi Ohmichi proposed a change to openstack-infra/devstack-gate: Copy libvirt log file after tempest run https://review.openstack.org/61892 | 01:35 |
mikal | mordred: oh, you _can_ run it if you want. It's just that no one wanted to. | 01:35 |
mikal | mordred: it was asserted it was possible to trivially count number of failed test runs in a day with elastic search, at which point I kind of lost interest in the sales pitch | 01:36 |
*** jaypipes has quit IRC | 01:37 | |
*** greghaynes has quit IRC | 01:40 | |
*** weshay has joined #openstack-infra | 01:43 | |
openstackgerrit | Eli Klein proposed a change to openstack-infra/jenkins-job-builder: Added rbenv-env wrapper https://review.openstack.org/65352 | 01:47 |
openstackgerrit | Davanum Srinivas (dims) proposed a change to openstack-infra/devstack-gate: Temporary HACK : Enable UCA https://review.openstack.org/67564 | 01:49 |
*** krotscheck has joined #openstack-infra | 01:50 | |
*** ryanpetrello has joined #openstack-infra | 01:50 | |
*** coolsvap has quit IRC | 01:51 | |
*** ok_delta has joined #openstack-infra | 01:53 | |
*** krotscheck has quit IRC | 01:58 | |
lifeless | incoming | 01:59 |
openstackgerrit | lifeless proposed a change to openstack-infra/nodepool: Avoid redundant updates of node.state=DELETE. https://review.openstack.org/68001 | 01:59 |
lifeless | fungi: ^ here tis | 01:59 |
openstackgerrit | lifeless proposed a change to openstack-infra/nodepool: Log the time a node has been in state DELETE. https://review.openstack.org/68002 | 01:59 |
openstackgerrit | lifeless proposed a change to openstack-infra/nodepool: Split out the logic for deleting a nodedb node. https://review.openstack.org/68003 | 01:59 |
openstackgerrit | lifeless proposed a change to openstack-infra/nodepool: Use the nonblocking cleanupServer. https://review.openstack.org/68004 | 01:59 |
openstackgerrit | lifeless proposed a change to openstack-infra/nodepool: Cleanup nodes in state DELETE immediately. https://review.openstack.org/67979 | 01:59 |
openstackgerrit | lifeless proposed a change to openstack-infra/nodepool: Default to a ratelimit of 2/second for API calls https://review.openstack.org/67993 | 01:59 |
openstackgerrit | lifeless proposed a change to openstack-infra/nodepool: Provide diagnostics when task rate limiting. https://review.openstack.org/67924 | 01:59 |
openstackgerrit | lifeless proposed a change to openstack-infra/nodepool: Log how long nodes have been in DELETE state. https://review.openstack.org/67982 | 01:59 |
openstackgerrit | lifeless proposed a change to openstack-infra/nodepool: Consolidate duplicate logging messages. https://review.openstack.org/67983 | 01:59 |
openstackgerrit | lifeless proposed a change to openstack-infra/nodepool: Fix early-exit in cleanupOneNode https://review.openstack.org/67980 | 01:59 |
openstackgerrit | lifeless proposed a change to openstack-infra/nodepool: Make cleanupServer optionally nonblocking. https://review.openstack.org/67985 | 01:59 |
fungi | lifeless: thanks, having a look | 01:59 |
*** nosnos has joined #openstack-infra | 01:59 | |
lifeless | fungi: https://review.openstack.org/#/c/68004/ is the top | 01:59 |
lifeless | fungi: now I'm going to go work on tripleo-gate again | 02:00 |
lifeless | fungi: ping me if you need anything | 02:00 |
fungi | lifeless: will do--thanks again | 02:01 |
openstackgerrit | Eli Klein proposed a change to openstack-infra/jenkins-job-builder: Add local-branch option https://review.openstack.org/65369 | 02:01 |
*** yaguang has joined #openstack-infra | 02:01 | |
*** ryanpetrello has quit IRC | 02:04 | |
*** ivar-lazzaro_ has quit IRC | 02:09 | |
*** oubiwann_ has joined #openstack-infra | 02:09 | |
lifeless | hmm | 02:10 |
* lifeless resists the temptation to poke at it more | 02:11 | |
openstackgerrit | A change was merged to openstack-infra/storyboard-webclient: Add tox.ini file to run things via tox https://review.openstack.org/67721 | 02:11 |
*** dcramer_ has joined #openstack-infra | 02:12 | |
*** jhesketh__ has quit IRC | 02:13 | |
fungi | there's that gate reset we were waiting for. promoting 67991,1 now | 02:13 |
*** morganfainberg|z is now known as morganfainberg | 02:13 | |
openstackgerrit | Evgeny Fadeev proposed a change to openstack-infra/askbot-theme: modified the launchpad answers importer script https://review.openstack.org/68008 | 02:20 |
*** yamahata has quit IRC | 02:29 | |
*** vkozhukalov has joined #openstack-infra | 02:34 | |
*** jhesketh__ has joined #openstack-infra | 02:37 | |
*** jasondotstar has joined #openstack-infra | 02:40 | |
*** jerryz has quit IRC | 02:40 | |
openstackgerrit | A change was merged to openstack-infra/nodepool: Provide diagnostics when task rate limiting. https://review.openstack.org/67924 | 02:48 |
openstackgerrit | A change was merged to openstack-infra/nodepool: Default to a ratelimit of 2/second for API calls https://review.openstack.org/67993 | 02:48 |
*** dkranz has joined #openstack-infra | 02:48 | |
*** mrda_ is now known as mrda_away | 02:49 | |
*** mrda_away is now known as mrda_ | 02:49 | |
*** sdake has quit IRC | 02:49 | |
openstackgerrit | lifeless proposed a change to openstack-infra/nodepool: Use the nonblocking cleanupServer. https://review.openstack.org/68004 | 02:55 |
*** markmcclain has joined #openstack-infra | 02:57 | |
*** markmcclain1 has joined #openstack-infra | 02:59 | |
*** markmcclain has quit IRC | 03:02 | |
*** markmcclain1 has quit IRC | 03:03 | |
*** dkranz has quit IRC | 03:05 | |
*** weshay has quit IRC | 03:05 | |
lifeless | I need a pointer, where does the pip cache etc on jenkins nodes come from ? | 03:06 |
lifeless | I'm trying to reproduce running tripleo-gate without jenkins; but with a successfully built nodepool image | 03:07 |
fungi | i believe the nodepool prep scripts build those | 03:12 |
* fungi looks | 03:12 | |
openstackgerrit | A change was merged to openstack-infra/askbot-theme: made launchpad importer read and write data separately https://review.openstack.org/67567 | 03:13 |
openstackgerrit | A change was merged to openstack-infra/askbot-theme: modified the launchpad answers importer script https://review.openstack.org/68008 | 03:13 |
fungi | well, the deb cache is built in http://git.openstack.org/cgit/openstack-infra/config/tree/modules/openstack_project/files/nodepool/scripts/cache_devstack.py | 03:14 |
fungi | empty pip cache directory is added in http://git.openstack.org/cgit/openstack-infra/config/tree/modules/openstack_project/files/nodepool/scripts/prepare_devstack.sh | 03:15 |
*** dcramer_ has quit IRC | 03:16 | |
fungi | i think we don't prepopulate the pip cache because pip will use what versions it finds there even if newer ones are on the remote mirror | 03:17 |
fungi | but i could be mistaken | 03:17 |
fungi | that might be the egg build cache i'm thinking of | 03:18 |
openstackgerrit | A change was merged to openstack-infra/nodepool: Fix early-exit in cleanupOneNode https://review.openstack.org/67980 | 03:21 |
openstackgerrit | A change was merged to openstack-infra/nodepool: Avoid redundant updates of node.state=DELETE. https://review.openstack.org/68001 | 03:21 |
openstackgerrit | A change was merged to openstack-infra/nodepool: Log the time a node has been in state DELETE. https://review.openstack.org/68002 | 03:21 |
*** emagana has quit IRC | 03:21 | |
*** dkliban has quit IRC | 03:27 | |
*** dcramer_ has joined #openstack-infra | 03:28 | |
*** reed has quit IRC | 03:30 | |
*** yamahata has joined #openstack-infra | 03:32 | |
*** talluri has joined #openstack-infra | 03:33 | |
*** carl_baldwin has joined #openstack-infra | 03:37 | |
*** carl_baldwin has quit IRC | 03:40 | |
*** greghaynes has joined #openstack-infra | 03:40 | |
*** talluri has quit IRC | 03:41 | |
*** salv-orlando has quit IRC | 03:42 | |
*** salv-orlando has joined #openstack-infra | 03:42 | |
*** ok_delta has quit IRC | 03:43 | |
*** talluri has joined #openstack-infra | 03:43 | |
*** gsamfira has joined #openstack-infra | 03:46 | |
*** carl_baldwin has joined #openstack-infra | 03:47 | |
*** pcrews has joined #openstack-infra | 03:47 | |
*** xchu_ has quit IRC | 03:50 | |
*** dkranz has joined #openstack-infra | 03:52 | |
*** mriedem has quit IRC | 03:56 | |
*** ryanpetrello has joined #openstack-infra | 03:57 | |
*** marun has joined #openstack-infra | 04:00 | |
*** mayu has joined #openstack-infra | 04:05 | |
mayu | git puppet error | 04:05 |
mayu | git clone https://git.openstack.org/openstack-infra/config /opt/config/production | 04:05 |
mayu | error: Failed to connect to 2001:4800:7813:516:3bc3:d7f6:ff04:aacb: Network is unreachable while accessing https://git.openstack.org/openstack-infra/config/info/refs | 04:06 |
mayu | I opened that link in my browser; there is nothing | 04:07 |
mayu | I follow the guide http://ci.openstack.org/puppet.html#id2 | 04:07 |
mayu | is the guide wrong? | 04:08 |
*** reed has joined #openstack-infra | 04:08 | |
dstufft | mordred: clarkb fungi https://twitter.com/dstufft/status/425480075816239104 | 04:09 |
*** marun has quit IRC | 04:12 | |
*** marun has joined #openstack-infra | 04:13 | |
*** marun has quit IRC | 04:14 | |
*** marun has joined #openstack-infra | 04:16 | |
*** gema has quit IRC | 04:17 | |
*** krotscheck has joined #openstack-infra | 04:19 | |
*** ArxCruz has quit IRC | 04:20 | |
*** krotscheck has quit IRC | 04:24 | |
mordred | dstufft: woot | 04:25 |
*** krotscheck has joined #openstack-infra | 04:26 | |
*** mayu has quit IRC | 04:30 | |
*** pcrews has quit IRC | 04:31 | |
*** emagana has joined #openstack-infra | 04:32 | |
*** emagana has quit IRC | 04:37 | |
*** coolsvap has joined #openstack-infra | 04:38 | |
*** Ryan_Lane has joined #openstack-infra | 04:39 | |
*** ryanpetrello has quit IRC | 04:40 | |
*** talluri has quit IRC | 04:42 | |
*** dstanek has quit IRC | 04:53 | |
*** rcleere has joined #openstack-infra | 04:57 | |
*** carl_baldwin has quit IRC | 04:59 | |
*** carl_baldwin has joined #openstack-infra | 05:01 | |
*** SergeyLukjanov_ is now known as SergeyLukjanov | 05:06 | |
*** marun has quit IRC | 05:08 | |
*** krotscheck has quit IRC | 05:12 | |
*** chandankumar_ has joined #openstack-infra | 05:14 | |
*** oubiwann_ has quit IRC | 05:20 | |
*** dstanek has joined #openstack-infra | 05:20 | |
*** carl_baldwin has quit IRC | 05:25 | |
*** nicedice has quit IRC | 05:28 | |
*** crank has quit IRC | 05:28 | |
*** crank has joined #openstack-infra | 05:29 | |
*** dpyzhov has joined #openstack-infra | 05:31 | |
lifeless | fungi: thanks | 05:36 |
lifeless | fungi: I ask because devstack-gate throws errors all over the place which are ignored (no set -eu in the scripts) from jenkins :) | 05:36 |
StevenK | Hm | 05:38 |
StevenK | Telling devstack to use postgres ends up with keystone token-get failing horrible | 05:39 |
StevenK | *horribly | 05:39 |
lifeless | lol, don't do that :) | 05:40 |
StevenK | lifeless: Trying to address mikal's whinge about how people whinge about mysql usage, but then don't front up and help about postgres or other engine | 05:41 |
lifeless | StevenK: you will probably regret it | 05:41 |
lifeless | StevenK: personally, I wouldn't even touch anything other than *mysql* until we've got full HA no-downtime on mysql | 05:42 |
lifeless | StevenK: as it will be a massive time sink | 05:42 |
lifeless | because you'll be playing catchup with 900 developers | 05:42 |
StevenK | lifeless: Why not until we have no-downtime HA? | 05:44 |
lifeless | StevenK: because when we have that we're at feasible production deployment stage | 05:44 |
lifeless | StevenK: and if you disappear off into a pit of postgresql for 6 months, we'll still have at least shipped something :) | 05:45 |
lifeless | StevenK: or if I were to do it, same thing :) | 05:45 |
*** jasondotstar has quit IRC | 05:57 | |
bradm | StevenK: I hear sqlite is webscale, maybe you should port it to that.. ;) | 05:58 |
*** emagana has joined #openstack-infra | 05:59 | |
StevenK | bradm: Hahaha | 06:00 |
StevenK | bradm: Poor paste.u.c | 06:00 |
*** reed has quit IRC | 06:01 | |
bradm | StevenK: yet it still runs, I'm surprised how well | 06:01 |
*** rakhmerov has joined #openstack-infra | 06:03 | |
*** nati_ueno has quit IRC | 06:06 | |
* mordred avoids trolling StevenK about postgres | 06:08 | |
StevenK | mordred: Aw | 06:08 |
mordred | StevenK: the thing lifeless didn't mention is that I'll also troll you mercilessly if you decide to go jump into the bottomless pit of postgres | 06:08 |
mordred | not, mind you, because I don't like postgres | 06:09 |
mordred | but because for openstack it quite simply does not matter - postgres is better than mysql by having a bunch of features mysql doesn't have - openstack doesn't use any of them - hence, postgres is senseless overhead | 06:10 |
* mordred isn't opinionated though - certainly didn't used to work for MySQL | 06:10 | |
mordred | StevenK: so ignore me as appropriate | 06:10 |
lifeless | mordred: all true, including transactions | 06:10 |
* mordred stabs lifeless in the face, neck and hands | 06:11 | |
StevenK | Haha | 06:11 |
*** dpyzhov_ has joined #openstack-infra | 06:11 | |
mordred | StevenK: now - if you want to do a db related boondoggle that could be useful - get some fixtures written to control drizzle so that we can just use that in all of the project's unittests | 06:12 |
*** SergeyLukjanov is now known as SergeyLukjanov_a | 06:12 | |
*** dpyzhov has quit IRC | 06:12 | |
*** dpyzhov_ is now known as dpyzhov | 06:12 | |
*** SergeyLukjanov_a is now known as SergeyLukjanov_ | 06:13 | |
lifeless | fo shizzle | 06:14 |
StevenK | We have a postgres fixture. *cough* | 06:14 |
*** nati_ueno has joined #openstack-infra | 06:22 | |
*** nati_ueno has quit IRC | 06:23 | |
*** nati_ueno has joined #openstack-infra | 06:24 | |
* mordred refrains from pointing out that postgres didn't quite make it to usable in the high-volume large-scale install space before people stopped caring about RDBMS's | 06:30 | |
* mordred neglects to mention that the chance of facebook or twitter or google migrating to postgres instead of just migrating to mongodb (from MySQL, of course) is probably pretty low | 06:31 | |
* mordred declines to rise to the troll-bait - preferring the high road | 06:32 | |
mikal | Actually, it really bothers me that all the postgres advocates I meet hate mysql because of problems in myisam that became irrelevant nearly a decade ago | 06:32 |
mikal | It's like they set their opinion on something, and then won't listen to facts... | 06:32 |
mordred | mikal: yeah. that actually does bother me - especially when I go out of my way when trolling them to point out the areas in which postgres is actually pretty decent | 06:33 |
mikal | Heh | 06:33 |
mordred | mikal: but, you know, sort of like being a white male, considering MySQL's multiple-orders-of-magnitude level of dominance, I suppose expecting quid pro quo and the avoidance of outdated cheapshots is unreasonable | 06:34 |
mikal | I need that on a tshirt | 06:34 |
mikal | I also feel that if the strongest argument you can make in favour of your product is to make cheap shots at the other guys, you have a problem | 06:35 |
mikal | I mean that in general | 06:35 |
mikal | I would be very upset if we had a "why eucalyptus sucks" wiki page or something | 06:35 |
mordred | I do too - I try my best to remember/think about that | 06:35 |
mordred | yup | 06:35 |
mikal | I am glad we finally agree on something | 06:35 |
mikal | I consider this the peak of my day | 06:35 |
mordred | I occasionally feel unhappy when my response to cloudstack is "fuck those guys" | 06:35 |
mikal | But it's been a pretty bad day | 06:36 |
mikal | I'm sure cloudstack has a perfectly valid use case. I just don't know what it is. And don't care. | 06:36 |
mordred | I mean, I don't dwell on the unhappiness and usually just have a beer - but I totally agree with you - I'd rather focus on my product and not knock other people's | 06:36 |
mordred | mikal: I hear it has a gui installer | 06:37 |
mordred | mikal: also, it might make sushi | 06:37 |
mordred | because, you know, gui installers and sushi are both super important features for long-term cloud operation | 06:37 |
mikal | Heh | 06:37 |
mikal | Surely some of the distros have gui installers though? | 06:37 |
mikal | So that's not that unique a point | 06:37 |
mikal | I wonder if they have stupid single threaded python problems like us? | 06:38 |
mikal | I think that's probably our biggest flaw | 06:38 |
mordred | mikal: name the last time OpenStack had a problem due to Python's GIL | 06:38 |
mordred | other than people complaining about it while rubbing their neckbeards | 06:38 |
mikal | Oh, I more mean greenthreads | 06:38 |
mikal | I see problems with that a fair bit | 06:38 |
mordred | yeah. that goes in the hipster category | 06:39 |
mikal | We scale with multiple processes | 06:39 |
bradm | mikal: just rewrite everything in go, that'll fix it | 06:39 |
mordred | of "we have to solve the GIL before it'sa problem!" | 06:39 |
*** crank has quit IRC | 06:39 | |
mordred | mikal: you know zuul uses threads, right? seems to scale pretty well up into the realms of crazypants | 06:39 |
mordred | I mean, we just had to move it onto a tmpfs because it was cloning so many damned git repos that the fs couldn't keep up | 06:40 |
mordred | I do not believe we've EVER had a problem with the GIL | 06:40 |
clarkb | mordred: turns out git is slow >_> | 06:40 |
mordred | clarkb: nicely done | 06:40 |
mikal | mordred: sure, but I didn't mean the GIL. | 06:41 |
mordred | mikal: I know - but I got into ranty mode | 06:41 |
mikal | Heh | 06:41 |
mikal | Be nice, I have a headache | 06:41 |
* mordred jumps up and down on mikal | 06:41 | |
*** SergeyLukjanov_ is now known as SergeyLukjanov | 06:42 | |
* mordred hands mikal a cookie | 06:43 | |
mikal | Heh | 06:43 |
mikal | I think salt might help | 06:43 |
* mordred wants a cookie now | 06:43 | |
* mikal wanders off to eat chips | 06:43 | |
*** jamielennox is now known as jamielennox|away | 06:43 | |
*** vkozhukalov has quit IRC | 06:47 | |
*** rcleere has quit IRC | 06:50 | |
*** crank has joined #openstack-infra | 06:51 | |
morganfainberg | clarkb your name and mordred's are the same color in my IRC client... when you were talking about git... i initially thought mordred was talking to himself. | 06:52 |
* morganfainberg wanders off after supplying some random to the channel | 06:52 | |
lifeless | mordred: so neckbeards aside, GIL problems can be fairly subtle, I wouldn't want to rule them out in openstack at this point :) | 06:53 |
*** dpyzhov has quit IRC | 06:56 | |
*** crank has quit IRC | 07:02 | |
*** crank has joined #openstack-infra | 07:02 | |
*** nati_uen_ has joined #openstack-infra | 07:12 | |
*** nati_uen_ has quit IRC | 07:13 | |
*** nati_uen_ has joined #openstack-infra | 07:14 | |
*** nati_ueno has quit IRC | 07:16 | |
*** jufeng has joined #openstack-infra | 07:16 | |
*** yolanda has joined #openstack-infra | 07:18 | |
*** mrda has quit IRC | 07:27 | |
mordred | morganfainberg: hahahahahaha. nice | 07:28 |
*** SergeyLukjanov is now known as SergeyLukjanov_ | 07:29 | |
mordred | lifeless: indeed, I wouldn't either. however, I do think it's possible that we over-optimized over-early based on neckbeard stroking and not, you know, reasons | 07:30 |
mordred | BUT | 07:30 |
mordred | that's the boat we're in | 07:30 |
mordred | so we'll just work with it | 07:30 |
mordred | I also may be wrong | 07:30 |
*** amotoki has joined #openstack-infra | 07:38 | |
*** NikitaKonovalov_ is now known as NikitaKonovalov | 07:45 | |
*** sdake has joined #openstack-infra | 07:58 | |
*** nati_ueno has joined #openstack-infra | 08:02 | |
*** nati_ueno has quit IRC | 08:02 | |
*** nati_ueno has joined #openstack-infra | 08:03 | |
*** pblaho has joined #openstack-infra | 08:03 | |
*** nati_uen_ has quit IRC | 08:05 | |
*** moted has quit IRC | 08:06 | |
*** _ruhe is now known as ruhe | 08:11 | |
*** jcoufal has joined #openstack-infra | 08:13 | |
*** flaper87|afk is now known as flaper87 | 08:14 | |
*** mrmartin has joined #openstack-infra | 08:15 | |
*** dizquierdo has joined #openstack-infra | 08:18 | |
*** vkozhukalov has joined #openstack-infra | 08:29 | |
*** SergeyLukjanov_ is now known as SergeyLukjanov | 08:37 | |
*** fbo_away is now known as fbo | 08:41 | |
*** vogxn has joined #openstack-infra | 08:42 | |
*** dpyzhov has joined #openstack-infra | 08:42 | |
*** DinaBelova_ is now known as DinaBelova | 08:43 | |
*** ilyashakhat has quit IRC | 08:43 | |
*** ilyashakhat has joined #openstack-infra | 08:43 | |
*** yamahata has quit IRC | 08:45 | |
*** yassine has joined #openstack-infra | 08:47 | |
*** nati_uen_ has joined #openstack-infra | 08:51 | |
*** mancdaz_away is now known as mancdaz | 08:52 | |
*** nati_ueno has quit IRC | 08:55 | |
*** praneshp has quit IRC | 08:55 | |
*** nati_uen_ has quit IRC | 08:56 | |
*** markwash has quit IRC | 08:56 | |
*** nati_ueno has joined #openstack-infra | 08:57 | |
*** dpyzhov has quit IRC | 08:57 | |
*** dcramer_ has quit IRC | 08:59 | |
*** boris-42 has quit IRC | 09:00 | |
*** derekh has joined #openstack-infra | 09:00 | |
*** boris-42 has joined #openstack-infra | 09:00 | |
*** afazekas has joined #openstack-infra | 09:03 | |
*** jhesketh__ has quit IRC | 09:04 | |
*** jpich has joined #openstack-infra | 09:07 | |
openstackgerrit | Nikita Konovalov proposed a change to openstack-infra/storyboard: Introducing basic REST API https://review.openstack.org/63118 | 09:07 |
openstackgerrit | Nikita Konovalov proposed a change to openstack-infra/storyboard: API Testset draft https://review.openstack.org/67447 | 09:08 |
openstackgerrit | Mark McLoughlin proposed a change to openstack/requirements: Allow use of oslo.messaging 1.3.0a3 from pypi https://review.openstack.org/68040 | 09:08 |
openstackgerrit | daisy-ycguo proposed a change to openstack-infra/config: Job to push Horizon translation to Transifex https://review.openstack.org/68042 | 09:10 |
*** jhesketh_ has quit IRC | 09:10 | |
*** dcramer_ has joined #openstack-infra | 09:11 | |
*** markmc has joined #openstack-infra | 09:20 | |
*** jufeng has quit IRC | 09:21 | |
markmc | ttx, morning | 09:21 |
markmc | ttx, do you have perms to add me to https://pypi.python.org/pypi?%3Aaction=pkg_edit&name=oslo.messaging ? | 09:21 |
markmc | ttx, or rather https://pypi.python.org/pypi?:action=role_form&package_name=oslo.messaging | 09:21 |
markmc | "Package Index Owner: openstackci " | 09:22 |
*** dpyzhov has joined #openstack-infra | 09:25 | |
matel | Hi, I would like to add a package (XenAPI) to the openstack infra pip repo, how should I do that? | 09:32 |
AJaeger | markmc, what do you need to change? This should be all autogenerated once you upload a package... | 09:35 |
AJaeger | markmc, I never edited the page - thanks to python magic, it looks nice: https://pypi.python.org/pypi/openstack-doc-tools/0.3 | 09:36 |
markmc | AJaeger, was just going to upload manually while waiting for https://review.openstack.org/68040 to be merged | 09:38 |
AJaeger | markmc, so, uploading via normal tagging does not work for you? | 09:39 |
markmc | AJaeger, not until https://review.openstack.org/68040 is merged | 09:39 |
AJaeger | markmc, I'm not an expert here - but what you point at is a request in the requirements repo. | 09:40 |
markmc | AJaeger, hah | 09:40 |
AJaeger | And we do have jobs that upload new tarballs of python packages to pypi once you tag them. | 09:41 |
markmc | AJaeger, https://review.openstack.org/67131 sorry | 09:41 |
AJaeger | markmc, yeah, that's a patch you need ;) | 09:41 |
AJaeger | So, you have two options: Ask to get 67131 approved or manual upload - correct? | 09:42 |
markmc | yes | 09:44 |
markmc | and the people who can do either are likely asleep, unless ttx can do the latter | 09:45 |
*** zhiwei has quit IRC | 09:50 | |
*** jhesketh has joined #openstack-infra | 09:50 | |
*** nati_uen_ has joined #openstack-infra | 09:50 | |
AJaeger | markmc, understood - sorry, I can't help here either ;( | 09:51 |
markmc | AJaeger, np, thanks | 09:51 |
*** jhesketh__ has joined #openstack-infra | 09:52 | |
*** nati_ueno has quit IRC | 09:54 | |
*** salv-orlando has quit IRC | 09:55 | |
*** salv-orlando has joined #openstack-infra | 09:56 | |
*** nati_uen_ has quit IRC | 09:57 | |
*** nati_ueno has joined #openstack-infra | 09:58 | |
*** jooools has joined #openstack-infra | 10:00 | |
*** jasondotstar has joined #openstack-infra | 10:00 | |
*** johnthetubaguy has joined #openstack-infra | 10:05 | |
*** mancdaz is now known as mancdaz_away | 10:06 | |
*** coolsvap has quit IRC | 10:09 | |
*** sileht has quit IRC | 10:11 | |
*** coolsvap has joined #openstack-infra | 10:12 | |
*** jp_at_hp has joined #openstack-infra | 10:12 | |
*** rakhmerov has quit IRC | 10:12 | |
*** mancdaz_away is now known as mancdaz | 10:14 | |
*** bauzas has joined #openstack-infra | 10:17 | |
*** mrmartin has quit IRC | 10:17 | |
bauzas | folks, does anyone know if the Zuul pipeline is broken? | 10:18 |
bauzas | https://review.openstack.org/#/c/52296/ is asking for a recheck, but can't see it in the check queue | 10:18 |
bauzas | http://status.openstack.org/zuul/ | 10:18 |
AJaeger | bauzas, see the line at the top of the status page: "Queue lengths: 419 events, 4 results. " | 10:19 |
bauzas | I guess there are capacity issues with icehouse-2, but I would at least expect Zuul to put the review in its pipeline | 10:19 |
AJaeger | those 419+4 elements are not shown... | 10:19 |
AJaeger | The gates are really busy | 10:19 |
bauzas | AJaeger: oh, totally missed it | 10:19 |
bauzas | AJaeger: thanks | 10:20 |
AJaeger | bauzas, waiting times of two hours or more until a job starts seem to be normal, so drink a coffee and enjoy | 10:21 |
bauzas | AJaeger: well, I usually see the review appearing in the check queue, but defined as queued | 10:21 |
*** sileht has joined #openstack-infra | 10:21 | |
bauzas | AJaeger: that's the first time I even don't see it in the status page | 10:21 |
AJaeger | it's busy these days ;) | 10:23 |
*** coolsvap has quit IRC | 10:23 | |
matel | Anyone knows how to add a new package to be served by http://pypi.openstack.org/openstack ? | 10:25 |
*** dpyzhov has quit IRC | 10:26 | |
*** coolsvap has joined #openstack-infra | 10:27 | |
AJaeger | matel, perhaps getting it in the global requirements.txt file? | 10:30 |
matel | AJaeger: Thanks, is it the global-requirements.txt @ https://github.com/openstack/requirements ? | 10:31 |
AJaeger | matel, yeah. | 10:32 |
matel | AJaeger: Thanks, I'm pushing a patch! | 10:32 |
AJaeger | matel, I'm not 100 per cent sure that having it in there will have it show up at pypi.o.o - just an educated guess | 10:32 |
openstackgerrit | Sergey Lukjanov proposed a change to openstack-infra/config: Add rtfd trigger jobs for climate project https://review.openstack.org/68062 | 10:33 |
matel | AJaeger: Who would be the best person to ask? | 10:34 |
AJaeger | matel, the infra team once they are awake ;) | 10:34 |
AJaeger | matel, if your project uses a requirements.txt file, it's good to add all entries to the global file ;) | 10:36 |
*** boris-42 has quit IRC | 10:38 | |
*** yassine has quit IRC | 10:38 | |
*** ruhe is now known as _ruhe | 10:39 | |
matel | AJaeger: I am working on getting xenapi tested within the gate, and to get it up and running, I need the XenAPI package: https://github.com/matelakat/xenapi-os-testing/blob/start-devstack/launch-node.sh#L51 - and I don't want to depend on the official pypi | 10:40 |
AJaeger | matel, better wait for the experts in that case - AFAIU pypi.o.o is just a mirror, so you do something special here | 10:41 |
*** yassine has joined #openstack-infra | 10:42 | |
*** rakhmerov has joined #openstack-infra | 10:42 | |
matel | AJaeger: I think pypi.o.o is a mirror - but only some packages are mirrored. Imagine, a full pypi mirror is a huge thing. | 10:44 |
AJaeger | yep | 10:44 |
AJaeger | Sorry, can't help further. | 10:44 |
matel | AJaeger: but let's see what they say. Approximately what time do the guys join? | 10:44 |
AJaeger | in four or five hours you should catch some unless they travel (and some might travel today) | 10:45 |
matel | AJaeger: Thanks for your help! | 10:45 |
AJaeger | matel, you're welcome | 10:45 |
*** nati_uen_ has joined #openstack-infra | 10:46 | |
*** rakhmerov has quit IRC | 10:47 | |
*** ArxCruz has joined #openstack-infra | 10:48 | |
openstackgerrit | Sergey Lukjanov proposed a change to openstack-infra/config: Enable tempest/savanna gate tests https://review.openstack.org/68066 | 10:48 |
*** nati_ueno has quit IRC | 10:49 | |
*** yassine has quit IRC | 10:50 | |
*** yassine has joined #openstack-infra | 10:50 | |
*** _ruhe is now known as ruhe | 10:54 | |
*** yassine has quit IRC | 10:57 | |
*** jasondotstar has quit IRC | 11:01 | |
*** max_lobur_afk is now known as max_lobur | 11:05 | |
*** jhesketh__ has quit IRC | 11:07 | |
*** jasondotstar has joined #openstack-infra | 11:07 | |
*** emagana has quit IRC | 11:07 | |
*** yassine has joined #openstack-infra | 11:10 | |
*** dpyzhov has joined #openstack-infra | 11:14 | |
*** SergeyLukjanov is now known as SergeyLukjanov_ | 11:15 | |
*** lcestari has joined #openstack-infra | 11:17 | |
*** jcoufal has quit IRC | 11:17 | |
*** SergeyLukjanov_ is now known as SergeyLukjanov | 11:19 | |
*** yassine has quit IRC | 11:21 | |
*** yassine has joined #openstack-infra | 11:23 | |
*** rfolco has joined #openstack-infra | 11:32 | |
*** emagana has joined #openstack-infra | 11:38 | |
*** talluri has joined #openstack-infra | 11:39 | |
openstackgerrit | Sergey Lukjanov proposed a change to openstack/requirements: Bump paramiko version to 1.9.0 https://review.openstack.org/68088 | 11:40 |
openstackgerrit | Sergey Lukjanov proposed a change to openstack/requirements: Bump paramiko version to 1.9.0 https://review.openstack.org/68088 | 11:40 |
*** ociuhandu has joined #openstack-infra | 11:41 | |
*** jcoufal has joined #openstack-infra | 11:42 | |
*** rakhmerov has joined #openstack-infra | 11:43 | |
*** emagana has quit IRC | 11:46 | |
*** rakhmerov has quit IRC | 11:48 | |
flaper87 | fungi: around? I really need to figure out what's going on here: | 11:49 |
flaper87 | fungi: here https://review.openstack.org/#/c/65499/ | 11:50 |
flaper87 | fungi: I set up an ubuntu saucy box but I couldn't replicate the issue :/ | 11:50 |
flaper87 | fungi: if there's a chance I can get access to one of those boxes, that'd be cool. You said that py26 is almost impossible, perhaps py27 ? | 11:51 |
*** yaguang has quit IRC | 11:55 | |
*** ociuhandu has quit IRC | 11:55 | |
openstackgerrit | Victor Sergeyev proposed a change to openstack-infra/config: Enable ironicclient py33 tests voting https://review.openstack.org/68092 | 11:58 |
*** ociuhandu has joined #openstack-infra | 12:03 | |
*** ruhe is now known as _ruhe | 12:05 | |
*** boris-42 has joined #openstack-infra | 12:05 | |
*** _ruhe is now known as ruhe | 12:11 | |
*** gsamfira has quit IRC | 12:11 | |
*** coolsvap has quit IRC | 12:22 | |
*** b3nt_pin has joined #openstack-infra | 12:23 | |
*** b3nt_pin is now known as beagles | 12:23 | |
sdague | flaper87: the test nodes are precise, I'd start with that to try replication | 12:24 |
*** andreaf has quit IRC | 12:25 | |
*** madmike has quit IRC | 12:27 | |
*** mancdaz is now known as mancdaz_away | 12:31 | |
*** mancdaz_away is now known as mancdaz | 12:33 | |
*** jasondotstar has quit IRC | 12:36 | |
*** DinaBelova is now known as DinaBelova_ | 12:37 | |
*** talluri has quit IRC | 12:37 | |
*** jhesketh has quit IRC | 12:38 | |
*** talluri has joined #openstack-infra | 12:38 | |
*** talluri has quit IRC | 12:42 | |
*** rakhmerov has joined #openstack-infra | 12:44 | |
*** rakhmerov has quit IRC | 12:49 | |
*** smarcet has joined #openstack-infra | 12:49 | |
flaper87 | sdague: oh well, that's a good point. I should've known that | 12:53 |
flaper87 | sdague: danke | 12:53 |
*** heyongli has joined #openstack-infra | 12:57 | |
*** mrmartin has joined #openstack-infra | 12:57 | |
*** markmcclain has joined #openstack-infra | 12:58 | |
*** amotoki has quit IRC | 12:58 | |
*** david-lyle_ has quit IRC | 12:59 | |
*** yaguang has joined #openstack-infra | 12:59 | |
*** gsamfira has joined #openstack-infra | 13:02 | |
*** SergeyLukjanov is now known as SergeyLukjanov_a | 13:02 | |
*** SergeyLukjanov_a is now known as SergeyLukjanov_ | 13:03 | |
*** afazekas has quit IRC | 13:04 | |
*** pblaho has quit IRC | 13:07 | |
*** DinaBelova_ is now known as DinaBelova | 13:07 | |
*** SergeyLukjanov_ is now known as SergeyLukjanov | 13:09 | |
*** Shrews_ has joined #openstack-infra | 13:10 | |
*** Shrews has quit IRC | 13:12 | |
*** Shrews_ is now known as Shrews | 13:12 | |
*** Shrews_ has joined #openstack-infra | 13:13 | |
*** amotoki has joined #openstack-infra | 13:14 | |
*** Shrews_ has quit IRC | 13:14 | |
*** Shrews_ has joined #openstack-infra | 13:15 | |
*** Shrews has quit IRC | 13:16 | |
*** Shrews_ has quit IRC | 13:16 | |
*** Shrews has joined #openstack-infra | 13:19 | |
*** afazekas has joined #openstack-infra | 13:19 | |
*** markmcclain has quit IRC | 13:22 | |
*** emagana has joined #openstack-infra | 13:26 | |
*** nosnos has quit IRC | 13:30 | |
*** nosnos has joined #openstack-infra | 13:31 | |
*** emagana has quit IRC | 13:31 | |
openstackgerrit | Kei YAMAZAKI proposed a change to openstack-infra/jenkins-job-builder: Fix multibyte character problem https://review.openstack.org/64610 | 13:35 |
*** mfink has joined #openstack-infra | 13:36 | |
*** nosnos has quit IRC | 13:38 | |
*** miqui_ has joined #openstack-infra | 13:38 | |
*** miqui_ has quit IRC | 13:39 | |
ttx | markmc: I can't | 13:39 |
*** miqui has joined #openstack-infra | 13:39 | |
*** gema has joined #openstack-infra | 13:42 | |
*** ruhe is now known as _ruhe | 13:42 | |
*** rpodolyaka has joined #openstack-infra | 13:44 | |
ttx | Wow 123 | 13:44 |
*** _ruhe is now known as ruhe | 13:45 | |
*** rakhmerov has joined #openstack-infra | 13:45 | |
*** gema has left #openstack-infra | 13:46 | |
*** dpyzhov has quit IRC | 13:46 | |
*** jasondotstar has joined #openstack-infra | 13:47 | |
fungi | ttx: wow high or wow lower than you expected? note that after the two major gate-loosening patches yesterday we went from averaging a commit every 6 hours to every hour overnight. still terribad tho | 13:47 |
ttx | high :) | 13:48 |
openstackgerrit | Nikita Konovalov proposed a change to openstack-infra/storyboard: API tests for rest https://review.openstack.org/67447 | 13:48 |
ttx | one commit per hour ? | 13:49 |
fungi | the time at the head of the gate was as high as 60 hours, now it's at least back down to 50 | 13:49 |
*** julim has joined #openstack-infra | 13:49 | |
fungi | yeah | 13:49 |
fungi | looking at the commit log for openstack/openstack: http://git.openstack.org/cgit/openstack/openstack/log/ | 13:49 |
*** rakhmerov has quit IRC | 13:50 | |
fungi | just a reminder to all, i'm basically afk or semi-away most of today for travel | 13:50 |
ttx | fungi: so it's not just load. We also merge less | 13:51 |
*** dpyzhov has joined #openstack-infra | 13:51 | |
*** ryanpetrello has joined #openstack-infra | 13:51 | |
bknudson | would it be better to merge multiple commits into one big one? | 13:51 |
*** dstanek has quit IRC | 13:51 | |
*** mriedem has joined #openstack-infra | 13:51 | |
fungi | ttx: yeah, the reset rate is still rather high | 13:51 |
*** thomasem has joined #openstack-infra | 13:52 | |
fungi | and we're still strained for available test node quota, causing zuul to swing from long periods trying to satisfy the check backlog, to long periods trying to service the gate, then back again | 13:53 |
ttx | fungi: we'll consider pushing back the milestone one week | 13:54 |
fungi | and as developer activity increases during the day, the check pipeline buildup/runthrough gets bigger and bigger, making the swing period longer and longer | 13:54 |
ttx | fungi: but that's a bit useless if we don't think we can fix it to churn at least 50 patches a day | 13:54 |
bknudson | do you want us not to approve anything? | 13:55 |
fungi | i'm not sure not approving changes will necessarily help, since once we catch up we'll just be facing a huge dump of new approvals | 13:56 |
fungi | at pea yesterday core reviewers were (re)approving roughly a dozen changes an hour | 13:56 |
fungi | peak | 13:56 |
mriedem | have we seen things get better after sdague skipped test_volume_boot_pattern to get past 1720608? | 13:56 |
fungi | mriedem: yes | 13:56 |
mriedem | ok | 13:56 |
fungi | we made some headway since then | 13:57 |
*** mancdaz is now known as mancdaz_away | 13:57 | |
*** thuc has joined #openstack-infra | 13:57 | |
sdague | mriedem: there are still other volumes tests failing, at a lower fail rate | 13:57 |
sdague | so I think the revert is still a good idea | 13:58 |
*** yaguang has quit IRC | 13:58 | |
*** dcramer_ has quit IRC | 13:58 | |
*** rcleere has joined #openstack-infra | 13:59 | |
sdague | ttx: I had a -dev email with some updates this morning, including merge count for last 12 hrs | 13:59 |
ttx | sdague: ok, I'm way behind | 14:00 |
*** dprince has joined #openstack-infra | 14:00 | |
ttx | catching up | 14:00 |
*** vogxn has quit IRC | 14:00 | |
ttx | 7am here | 14:00 |
sdague | pshaw, I'm always up at 7am ;) | 14:00 |
*** coolsvap has joined #openstack-infra | 14:00 | |
*** sgordon_ has joined #openstack-infra | 14:01 | |
*** heyongli has quit IRC | 14:02 | |
*** viktors has joined #openstack-infra | 14:02 | |
bknudson | it looks like whatever's scheduling all the gate jobs is preferring the "easy" tests through all the commits... | 14:03 |
*** max_lobur is now known as max_lobur_afk | 14:04 | |
bknudson | so rather than running gate-tempest-dsvm-full for the first commit, it's running gate-cinder-pep8 for all the commits in the queue | 14:04 |
bknudson | wouldn't it be better to have all the jobs in the first commit running? | 14:05 |
*** mancdaz_away is now known as mancdaz | 14:05 | |
fungi | bknudson: that's because of node availability. it has plenty of long-running nodes for those jobs, but not enough single-use nodes for the others | 14:05 |
*** dims has quit IRC | 14:06 | |
fungi | if all jobs used the exact same pool of systems, then yes what you describe is how it would work | 14:06 |
bknudson | ok, maybe because of the delete time that was discussed earlier. | 14:06 |
ttx | sdague: If we agree that the current situation is exceptional, I think deferring one week makes sense | 14:06 |
ttx | sdague: might screw your gateblocking bugday a bit but I think we are past that | 14:06 |
anteaya | fungi I can answer questions today, I'm trying to catch up on the situation, is there a current standing suggestion for +A's? | 14:06 |
fungi | bknudson: and, also, the build time. creating new nodes from snapshots and registering them as jenkins slaves is time-consuming activity as well | 14:07 |
sdague | ttx: honestly, I think that if i2 pushes back, we run the bugday on monday regardless | 14:07 |
*** dkliban has joined #openstack-infra | 14:07 | |
ttx | yep | 14:07 |
fungi | it seems like we've been running a gate bugday for the past two weeks already | 14:07 |
sdague | well, a few of us have been | 14:08 |
sdague | which is why I'm trying to call out specifically who's helping in the emails... maybe encourage others to help | 14:08 |
fungi | anteaya: business as usual as far as i know. also clarkb should be around in a few hours | 14:08 |
anteaya | very good | 14:08 |
anteaya | will do my best to be helpful | 14:08 |
anteaya | fungi: travel safely | 14:08 |
fungi | sdague: i'm taking your word for it--still been too busy to read e-mail | 14:09 |
*** ryanpetrello has quit IRC | 14:09 | |
fungi | thanks anteaya | 14:09 |
anteaya | :D | 14:09 |
sdague | fungi: I know I keep bugging folks on this one - https://review.openstack.org/#/c/67591/ but it will really help let more people find the missing bugs | 14:09 |
sdague | because right now it's jog0 and I until we get that hit list out there publicly | 14:09 |
mriedem | sdague: the cinder revert has passed jenkins twice already https://review.openstack.org/#/c/67973/ | 14:12 |
mriedem | any more rechecks after that won't run test_volume_boot_pattern now that it's skipped, | 14:12 |
mriedem | but should be good to recheck still if there are more flaky volume fails | 14:13 |
sdague | mriedem: they are at a lower fail rate, so at this point if you suspect it's the culprit, I'd just get it approved, promote, and see how the gate reacts. | 14:13 |
mriedem | it's got one +2 | 14:14 |
mriedem | jgriffith was still looking into it yesterday last i heard, he wasn't sure how the test ever worked... | 14:15 |
mriedem | i'll keep following it today | 14:15 |
matel | Anyone knows how to add a new package to be served by http://pypi.openstack.org/openstack ? I would like to add the XenAPI package. | 14:15 |
anteaya | hi matel, fungi is in transit today so you might have to wait about 3 more hours for clarkb to show up | 14:16 |
matel | anteaya: Thanks for the info, I will wait. | 14:17 |
anteaya | unless someone else with that knowledge is around, unfortunately I am not such a person | 14:17 |
anteaya | matel: thanks for your patience | 14:17 |
*** changbl has quit IRC | 14:17 | |
openstackgerrit | Trevor McKay proposed a change to openstack/requirements: Update python-savannaclient version https://review.openstack.org/68122 | 14:17 |
*** dims has joined #openstack-infra | 14:19 | |
*** yassine has quit IRC | 14:21 | |
*** yassine has joined #openstack-infra | 14:22 | |
*** mgagne1 has joined #openstack-infra | 14:22 | |
*** yaguang has joined #openstack-infra | 14:22 | |
*** mgagne has quit IRC | 14:23 | |
*** mgagne has joined #openstack-infra | 14:24 | |
*** CaptTofu has joined #openstack-infra | 14:25 | |
openstackgerrit | Trevor McKay proposed a change to openstack/requirements: Update python-savannaclient version https://review.openstack.org/68122 | 14:26 |
openstackgerrit | Ruslan Kamaldinov proposed a change to openstack-infra/storyboard: Add tests for Alembic migrations https://review.openstack.org/66414 | 14:26 |
*** max_lobur_afk is now known as max_lobur | 14:27 | |
*** mgagne1 has quit IRC | 14:27 | |
*** odyssey4me has joined #openstack-infra | 14:28 | |
*** eharney has joined #openstack-infra | 14:29 | |
*** thuc has quit IRC | 14:30 | |
*** mancdaz is now known as mancdaz_away | 14:31 | |
*** thuc has joined #openstack-infra | 14:31 | |
*** mancdaz_away is now known as mancdaz | 14:32 | |
*** dizquierdo has quit IRC | 14:33 | |
*** mfer has joined #openstack-infra | 14:33 | |
*** BobBall is now known as BobBallAWay | 14:34 | |
*** thuc has quit IRC | 14:36 | |
*** SergeyLukjanov is now known as SergeyLukjanov_a | 14:42 | |
*** SergeyLukjanov_a is now known as SergeyLukjanov_ | 14:43 | |
*** prad has joined #openstack-infra | 14:44 | |
*** coolsvap_away has joined #openstack-infra | 14:45 | |
*** prad has quit IRC | 14:46 | |
*** rakhmerov has joined #openstack-infra | 14:46 | |
*** gokrokve has joined #openstack-infra | 14:46 | |
*** coolsvap has quit IRC | 14:47 | |
*** coolsvap_away is now known as coolsvap | 14:48 | |
*** mriedem has quit IRC | 14:49 | |
*** dstanek has joined #openstack-infra | 14:49 | |
*** dcramer_ has joined #openstack-infra | 14:50 | |
*** changbl has joined #openstack-infra | 14:50 | |
*** SergeyLukjanov_ is now known as SergeyLukjanov | 14:50 | |
*** rakhmerov has quit IRC | 14:50 | |
*** bauzas has quit IRC | 14:51 | |
*** vogxn has joined #openstack-infra | 14:51 | |
*** vogxn has quit IRC | 14:52 | |
*** mgagne has quit IRC | 14:53 | |
*** bauzas has joined #openstack-infra | 14:53 | |
*** boris-42 has quit IRC | 14:54 | |
*** amotoki has quit IRC | 14:55 | |
*** oubiwann_ has joined #openstack-infra | 15:02 | |
*** jcoufal has quit IRC | 15:02 | |
*** jcoufal_ has joined #openstack-infra | 15:02 | |
*** vkozhukalov has quit IRC | 15:04 | |
*** chmouel has quit IRC | 15:05 | |
*** chmouel_ has joined #openstack-infra | 15:05 | |
*** chmouel_ is now known as chmouel | 15:06 | |
*** prad has joined #openstack-infra | 15:07 | |
*** IvanBerezovskiy has left #openstack-infra | 15:08 | |
*** yassine has quit IRC | 15:08 | |
*** emagana has joined #openstack-infra | 15:08 | |
*** yassine has joined #openstack-infra | 15:09 | |
*** vogxn has joined #openstack-infra | 15:10 | |
*** ryanpetrello has joined #openstack-infra | 15:10 | |
*** emagana has quit IRC | 15:13 | |
gsamfira | Hey guys. What plugin should I use for testr to generate HTML results. Like these for example: http://logs.openstack.org/71/68071/1/check/gate-ceilometer-python27/b68e4db/ | 15:15 |
*** mgagne has joined #openstack-infra | 15:16 | |
anteaya | hi gsamfira, I don't know if that is a plugin | 15:17 |
anteaya | sdague: do you happen to know? | 15:17 |
*** mgagne1 has joined #openstack-infra | 15:17 | |
anteaya | gsamfira: I suspect it has less to do with testr and more to do with how we have set up logging and compression with our system | 15:19 |
*** vogxn has quit IRC | 15:19 | |
*** rcleere has quit IRC | 15:20 | |
*** mgagne has quit IRC | 15:21 | |
gsamfira | thank you anteaya | 15:22 |
anteaya | np, sorry I don't have more details for you | 15:22 |
gsamfira | that's ok. I'll keep digging | 15:23 |
*** marun has joined #openstack-infra | 15:25 | |
gsamfira | for posterity, it appears to be a helper script: http://goo.gl/JqCE4s | 15:25 |
anteaya | gsamfira: I think this might be helpful: http://git.openstack.org/cgit/openstack-infra/os-loganalyze/tree/os_loganalyze/cmd/htmlify_log.py | 15:25 |
gsamfira | thanks. I'll have a look at that as well | 15:26 |
anteaya | yes that is probably part of it as well | 15:26 |
anteaya | sure, glad you were able to find something useful | 15:26 |
*** mgagne1 has quit IRC | 15:30 | |
*** rnirmal has joined #openstack-infra | 15:30 | |
openstackgerrit | A change was merged to openstack/requirements: Allow for sqla 0.8... finally https://review.openstack.org/64831 | 15:31 |
*** mriedem has joined #openstack-infra | 15:31 | |
anteaya | dolphm: I'm seeing a string of 5 keystone failures at the head of the gate, are you watching these patches? | 15:33 |
*** yassine has quit IRC | 15:35 | |
*** yassine has joined #openstack-infra | 15:35 | |
*** fifieldt has joined #openstack-infra | 15:35 | |
anteaya | 67204, 64587, 64749, 64589, 64758, 64759 | 15:35 |
anteaya | so 6 now | 15:35 |
dolphm | anteaya: yes | 15:36 |
dolphm | anteaya: they're all dependent on each other | 15:36 |
*** jchiles has joined #openstack-infra | 15:37 | |
*** UtahDave has joined #openstack-infra | 15:37 | |
anteaya | ah | 15:37 |
dolphm | anteaya: if they get bounced i'm squashing them into one change lol | 15:38 |
anteaya | so the first one is failing | 15:38 |
anteaya | k | 15:38 |
dolphm | anteaya: the last two (still showing green) are separate | 15:38 |
anteaya | k | 15:38 |
anteaya | thanks for being on top of it | 15:38 |
anteaya | I'm not a fan of the squash, but if the net result is < 200 lines of code, so be it | 15:39 |
dolphm | anteaya: agree | 15:39 |
anteaya | is that error a heat error? | 15:40 |
*** nati_uen_ has quit IRC | 15:40 | |
anteaya | I haven't seen it before, but I haven't been looking at random test logs lately | 15:40 |
*** rcleere has joined #openstack-infra | 15:41 | |
*** emagana has joined #openstack-infra | 15:42 | |
sdague | ok, new sin of the day: people running reverify on merge conflict -2s | 15:42 |
sdague | https://review.openstack.org/#/c/55449/ | 15:42 |
anteaya | dolphm: so can you snipe out those patches then so they don't get in the gate reset, please? | 15:43 |
*** basic` has quit IRC | 15:43 | |
*** mrmartin has quit IRC | 15:43 | |
sdague | anteaya: the stuff at the top is disconnected | 15:43 |
sdague | it won't reset anything any more | 15:43 |
anteaya | so do the dependent keystone patches need to be sniped or no? | 15:44 |
sdague | no | 15:44 |
anteaya | okay great | 15:44 |
anteaya | dolphm: ignore me | 15:44 |
anteaya | good thing reverify is retired | 15:44 |
sdague | though I just looked at the inprogress job on 64575,15 | 15:45 |
sdague | and it's failing right now | 15:45 |
sdague | so about to do another reset | 15:45 |
russellb | sdague: i have another patch for the PCI extension bug, will have it up in a minute after tests finish locally | 15:45 |
russellb | sdague: patch yesterday didn't get it all | 15:45 |
anteaya | so should dolphm snipe or not snipe? | 15:45 |
sdague | russellb: great | 15:45 |
dolphm | anteaya: sdague: let me know what you need from me | 15:45 |
sdague | anteaya: not bother | 15:46 |
*** rakhmerov has joined #openstack-infra | 15:46 | |
anteaya | dolphm: so 67204,1 has another swing at the gate it appears | 15:46 |
sdague | lets get the full test results on the fail so we can figure out what's wrong | 15:46 |
* anteaya digs out her rubber chicken | 15:46 | |
*** emagana has quit IRC | 15:47 | |
*** sandywalsh has joined #openstack-infra | 15:50 | |
*** NikitaKonovalov is now known as NikitaKonovalov_ | 15:51 | |
*** markwash has joined #openstack-infra | 15:52 | |
russellb | sdague: https://review.openstack.org/68147 | 15:52 |
russellb | need that one reviewed and promoted i think | 15:53 |
bknudson | setUpClass (tempest.api.compute.v3.servers.test_create_server.ServersV3TestJSON) ... FAIL -- so the gate's going to reset on that one. | 15:53 |
sdague | russellb: +2 from me. And I agree a promote is in order on that | 15:54 |
russellb | sdague: thanks | 15:54 |
anteaya | so right now fungi and clarkb and jeblair (when his is not ill) can promote, yes? | 15:55 |
anteaya | mordred do you have access to that magic button? | 15:56 |
mordred | anteaya: access, yes. knowledge, no. | 15:56 |
openstackgerrit | Pavel Sedlák proposed a change to openstack-infra/jenkins-job-builder: Add support for Test Stability with Junit https://review.openstack.org/68152 | 15:57 |
*** jcoufal_ has quit IRC | 15:57 | |
*** SergeyLukjanov is now known as SergeyLukjanov_ | 15:57 | |
anteaya | mordred: dangerous combination, we will wait for clarkb | 15:58 |
*** mancdaz is now known as mancdaz_away | 15:58 | |
anteaya | he tells me promotions are expensive, and I don't know what the trade off is for the use | 15:58 |
*** mancdaz_away is now known as mancdaz | 15:58 | |
mordred | they cause a gate reset | 15:59 |
mordred | usually we watch for a reset and then do the promotion real quick | 15:59 |
anteaya | cause? or float to the top? | 15:59 |
mordred | cause | 15:59 |
anteaya | ah so timing is important | 15:59 |
mordred | one of these days, I think we should add a feature "promote-on-next-reset" | 15:59 |
fungi | i can try to promote 68147 from the airport in a but | 15:59 |
anteaya | yeah, thou shalt not cause a gate reset | 15:59 |
anteaya | fungi: thanks, as you see fit | 16:00 |
fungi | er in a bit | 16:00 |
fungi | but not if it's not urgent | 16:00 |
* fungi will bbiaw | 16:00 | |
anteaya | fungi: I think we have a reset coming up | 16:00 |
mordred | fungi: your flight got uncancelled? | 16:00 |
sdague | sqla 0.8 in global requirements just merged | 16:00 |
mordred | woot | 16:00 |
mordred | (we didn't decide to just go to 0.9?) | 16:00 |
sdague | doesn't work | 16:00 |
*** senk has joined #openstack-infra | 16:01 | |
sdague | https://review.openstack.org/#/c/66156/ | 16:01 |
mordred | k. excellent | 16:01 |
sdague | that's a partial fix, but more is needed | 16:01 |
sdague | I think it's probably only migrate that's a problem | 16:01 |
sdague | stepping away for a few | 16:02 |
*** DinaBelova is now known as DinaBelova_ | 16:02 | |
*** dmsimard has joined #openstack-infra | 16:02 | |
*** boris-42 has joined #openstack-infra | 16:02 | |
dmsimard | Hi guys, what's the magic word to reschedule a merge of a patchset ? It failed because of yesterday's github issues. | 16:02 |
*** alexpilotti has joined #openstack-infra | 16:02 | |
anteaya | dmsimard: can you post the url of the patchset please? | 16:03 |
dmsimard | Nevermind, I opened my eyes :) | 16:03 |
dmsimard | anteaya: https://review.openstack.org/#/c/67571/ | 16:03 |
* russellb is unaware of what "yesterday's github issues" are ... | 16:03 | |
*** carl_baldwin has joined #openstack-infra | 16:03 | |
russellb | ah | 16:04 |
dmsimard | russellb: https://twitter.com/githubstatus | 16:04 |
*** burt1 has joined #openstack-infra | 16:04 | |
*** dpyzhov has quit IRC | 16:06 | |
anteaya | dmsimard: okay you need a reapproval for this to head to the gate | 16:06 |
anteaya | dmsimard: track down one of the cores for this repo and ask them to +A again, that will trigger it | 16:06 |
anteaya | we have removed "reverify" since it was getting abused | 16:07 |
*** dpyzhov has joined #openstack-infra | 16:07 | |
dmsimard | anteaya: Understandable, and it was surely worse considering there have been a lot of problems lately | 16:07 |
*** nati_ueno has joined #openstack-infra | 16:08 | |
anteaya | dmsimard: thanks for understanding | 16:08 |
anteaya | yes, some folks pay attention to the bigger picture and some folks not so much | 16:08 |
*** dpyzhov has quit IRC | 16:09 | |
*** chandankumar_ has quit IRC | 16:10 | |
*** luqas has joined #openstack-infra | 16:11 | |
*** mrodden has joined #openstack-infra | 16:11 | |
*** prad has quit IRC | 16:11 | |
*** prad has joined #openstack-infra | 16:12 | |
*** dmsimard1 has joined #openstack-infra | 16:12 | |
*** dmsimard has quit IRC | 16:13 | |
*** mgagne has joined #openstack-infra | 16:14 | |
*** kruskakli has joined #openstack-infra | 16:15 | |
*** markwash has quit IRC | 16:15 | |
*** eharney has quit IRC | 16:15 | |
*** mgagne1 has joined #openstack-infra | 16:16 | |
*** sgordon_ has quit IRC | 16:18 | |
*** dmsimard1 is now known as dmsimard | 16:18 | |
*** mgagne has quit IRC | 16:19 | |
*** gyee has joined #openstack-infra | 16:19 | |
dmsimard | anteaya: Got a core to re-approve, let's wait and see | 16:25 |
mriedem | sdague: fungi: we should get this to the top of the queue: https://review.openstack.org/#/c/68147/ | 16:26 |
anteaya | dmsimard: may the force be with you | 16:26 |
russellb | i think fungi may be on a plane today | 16:27 |
russellb | so maybe clarkb when he's around? | 16:27 |
*** mgagne1 has quit IRC | 16:28 | |
*** mgagne has joined #openstack-infra | 16:28 | |
*** LinuxJedi is now known as LinuxJedi__ | 16:29 | |
*** mancdaz is now known as mancdaz_away | 16:29 | |
*** mrodden has quit IRC | 16:30 | |
*** mgagne1 has joined #openstack-infra | 16:30 | |
*** mancdaz_away is now known as mancdaz | 16:31 | |
locke105 | mriedem: #devstackgateheroes | 16:32 |
mriedem | locke105: you must be at home then | 16:33 |
*** mgagne has quit IRC | 16:33 | |
*** DinaBelova_ is now known as DinaBelova | 16:33 | |
*** eharney has joined #openstack-infra | 16:33 | |
russellb | locke105: there's nobody in there :( | 16:34 |
locke105 | lol, was more supposed to be a twitter hashtag | 16:35 |
dmsimard | lol. | 16:35 |
locke105 | pretty sure infra is the place where devstackgateheroes hang out | 16:36 |
anteaya | would be a great channel for the gate bugday | 16:37 |
*** pcrews has joined #openstack-infra | 16:37 | |
*** kgriffs has joined #openstack-infra | 16:38 | |
russellb | locke105: :-p | 16:38 |
russellb | hashtags have ruined us all | 16:39 |
russellb | not long ago i noticed my orange, yes a fruit, had a hashtag sticker on it | 16:39 |
russellb | didn't even make any sense | 16:39 |
Mithrandir | was it #orange? | 16:39 |
russellb | https://twitter.com/russellbryant/status/413729136990236672 | 16:39 |
russellb | #spin | 16:39 |
russellb | wat | 16:39 |
mriedem | fungi: clarkb: can this be promoted? https://review.openstack.org/#/c/68147/ | 16:39 |
mriedem | my wife watches the bachelor/ette, they think it's cool to put tweets on the screen during the show. the show is bad enough on its own, but that takes it to a new level. | 16:41 |
mriedem | #omfgimsojealousinlove | 16:41 |
*** markwash has joined #openstack-infra | 16:41 | |
mgagne1 | can we get someone to review this one? https://review.openstack.org/#/c/65406/ it's about removing a check/gate job for puppet projects | 16:41 |
*** mgagne1 is now known as mgagne | 16:42 | |
anteaya | mgagne: today we have mordred and clarkb | 16:42 |
anteaya | and clarkb has not arrived yet | 16:42 |
mgagne | anteaya: and mordred is in a plane? =) | 16:43 |
anteaya | surprising not | 16:43 |
anteaya | I don't think | 16:43 |
anteaya | fungi is on a plane | 16:43 |
mgagne | anteaya: alright, thanks =) | 16:43 |
anteaya | and jeblair is sick and soon to be on a plane, I expect | 16:43 |
anteaya | np | 16:43 |
*** LinuxJedi__ has quit IRC | 16:44 | |
*** thuc has joined #openstack-infra | 16:45 | |
*** bauzas has quit IRC | 16:45 | |
*** praneshp has joined #openstack-infra | 16:45 | |
*** odyssey4me has quit IRC | 16:45 | |
*** Aarongr_afk is now known as AaronGr | 16:46 | |
*** markmc has quit IRC | 16:46 | |
*** emagana has joined #openstack-infra | 16:47 | |
*** kraman has joined #openstack-infra | 16:47 | |
*** kraman has left #openstack-infra | 16:48 | |
*** gokrokve has quit IRC | 16:49 | |
*** gokrokve has joined #openstack-infra | 16:49 | |
*** mrodden has joined #openstack-infra | 16:50 | |
*** gokrokve_ has joined #openstack-infra | 16:51 | |
*** FallenPegasus has joined #openstack-infra | 16:51 | |
sdague | dmsimard: why does github outage affect gate? | 16:51 |
sdague | we should be fully isolated | 16:51 |
dmsimard | sdague: In the context of puppet openstack, jenkins clones dependencies from github | 16:52 |
dmsimard | sdague: To run tests | 16:52 |
sdague | dmsimard: ah, gotcha | 16:52 |
sdague | so it's a stackforge thing, that's fine | 16:53 |
dmsimard | sdague: yup. | 16:53 |
matel | Anyone know how to add a new package to be served by http://pypi.openstack.org/openstack ? I would like to add the XenAPI package. | 16:54 |
*** gokrokve has quit IRC | 16:54 | |
*** yaguang has quit IRC | 16:55 | |
*** fbo is now known as fbo_away | 16:55 | |
*** talluri has joined #openstack-infra | 16:55 | |
*** kraman has joined #openstack-infra | 16:56 | |
*** fbo_away is now known as fbo | 16:57 | |
*** DinaBelova is now known as DinaBelova_ | 16:58 | |
max_lobur | Hi Everyone! | 16:58 |
max_lobur | I'm from Ironic project | 16:58 |
max_lobur | Somebody from requirements core group, could you please review/approve the patches for us: | 16:58 |
max_lobur | https://review.openstack.org/#/c/66349/3 this one already has a +1 from a core reviewer | 16:58 |
max_lobur | https://review.openstack.org/#/c/66077/ | 16:58 |
max_lobur | I would greatly appreciate it | 16:59 |
anteaya | hi max_lobur here are your requirements core people: https://review.openstack.org/#/admin/groups/131,members | 17:00 |
fungi | mriedem: russellb: does 68147 still warrant moving to the front? | 17:00 |
max_lobur | anteaya, thanks! | 17:00 |
anteaya | max_lobur: I note none of them are infra-core | 17:00 |
anteaya | max_lobur: np, happy hunting | 17:00 |
russellb | fungi: yes | 17:01 |
jeblair | mordred: ping | 17:01 |
russellb | it's still one of the top issues | 17:01 |
mordred | jeblair: otp | 17:01 |
mriedem | fungi: yeah, it's for 1720680 | 17:01 |
max_lobur | anteaya, do you think it's ok if I contact some of them by email for this? | 17:01 |
jeblair | mordred: i wondered if you wanted to promote 68147 | 17:01 |
*** talluri has quit IRC | 17:01 | |
fungi | mriedem: russellb: is it the cause of that keystone failure at the front? | 17:01 |
anteaya | max_lobur: I personally have nothing against that and don't think any of the folks on the list will object | 17:01 |
*** talluri has joined #openstack-infra | 17:02 | |
mriedem | fungi: doesn't sound like it, no | 17:02 |
anteaya | note that joe heck hasn't been doing much openstack lately | 17:02 |
*** med_ has quit IRC | 17:02 | |
*** nati_uen_ has joined #openstack-infra | 17:02 | |
anteaya | I'm not sure who dave walker is | 17:02 |
*** nati_uen_ has quit IRC | 17:02 | |
fungi | okay, i'll wait a few minutes for the front of the gate to clear | 17:02 |
anteaya | the rest are active in openstack | 17:02 |
max_lobur | anteaya, thanks a lot! | 17:02 |
anteaya | max_lobur: np | 17:03 |
*** nati_uen_ has joined #openstack-infra | 17:03 | |
jeblair | mordred: istr you saying you would have time to help out more this week; and i wasn't sure if you had promoted a change yet. but i guess if you're busy, nevermind. | 17:03 |
*** nati_ueno has quit IRC | 17:03 | |
*** markwash has quit IRC | 17:03 | |
*** hashar has joined #openstack-infra | 17:04 | |
fungi | mordred: for reference, i'm planning to run 'sudo zuul promote --pipeline gate --changes 68147,3' | 17:04 |
russellb | fungi: sounds good, thank you! | 17:04 |
anteaya | jeblair: he has said he can promote but doesn't have the promotion knowledge to feel confident doing it | 17:04 |
*** medberry has joined #openstack-infra | 17:04 | |
jeblair | anteaya: that's what i was trying to fix. | 17:04 |
anteaya | and the rest of us are honouring his good judgement | 17:04 |
anteaya | ah | 17:04 |
anteaya | he was around a minute ago | 17:04 |
anteaya | I realize that doesn't help much | 17:04 |
anteaya | jeblair: feeling any better today? | 17:05 |
*** talluri has quit IRC | 17:06 | |
*** dstanek_afk has joined #openstack-infra | 17:06 | |
*** johnthetubaguy1 has joined #openstack-infra | 17:06 | |
*** vkozhukalov has joined #openstack-infra | 17:06 | |
*** ociuhandu_ has joined #openstack-infra | 17:07 | |
*** CaptTofu_ has joined #openstack-infra | 17:07 | |
*** gokrokve_ has quit IRC | 17:08 | |
*** saschpe has joined #openstack-infra | 17:08 | |
*** FallenPegasus has quit IRC | 17:08 | |
*** julim has quit IRC | 17:09 | |
*** kgriffs has quit IRC | 17:09 | |
*** johnthetubaguy has quit IRC | 17:09 | |
*** niska has quit IRC | 17:09 | |
*** saschpe_ has quit IRC | 17:09 | |
*** CaptTofu has quit IRC | 17:09 | |
*** kruskakli has quit IRC | 17:09 | |
*** dcramer_ has quit IRC | 17:09 | |
*** lifeless has quit IRC | 17:09 | |
*** mrda_ has quit IRC | 17:09 | |
*** wayneeseguin has quit IRC | 17:09 | |
*** mrda has joined #openstack-infra | 17:09 | |
*** lifeless1 has joined #openstack-infra | 17:09 | |
*** dstanek has quit IRC | 17:09 | |
*** afazekas has quit IRC | 17:09 | |
*** ociuhandu has quit IRC | 17:09 | |
*** skraynev has quit IRC | 17:09 | |
*** jeblair has quit IRC | 17:09 | |
*** ociuhandu_ is now known as ociuhandu | 17:09 | |
*** wayneseguin has joined #openstack-infra | 17:09 | |
*** skraynev has joined #openstack-infra | 17:09 | |
*** niska has joined #openstack-infra | 17:09 | |
*** wayneseguin is now known as wayneeseguin | 17:09 | |
*** afazekas has joined #openstack-infra | 17:09 | |
*** kgriffs has joined #openstack-infra | 17:10 | |
*** julim has joined #openstack-infra | 17:10 | |
*** dcramer_ has joined #openstack-infra | 17:10 | |
*** jeblair has joined #openstack-infra | 17:10 | |
*** morganfainberg has quit IRC | 17:11 | |
*** morganfainberg has joined #openstack-infra | 17:11 | |
dims | jog0, sdague - is there an etherpad for the gate issues? | 17:11 |
matel | fungi: do you know how to add a new package to be served by http://pypi.openstack.org/openstack ? I would like to add the XenAPI package. | 17:11 |
fungi | matel: it would need to be added to the openstack/requirements global-requirements.txt file | 17:13 |
matel | fungi: thanks. | 17:13 |
matel | fungi: How frequently does it get refreshed? | 17:13 |
*** jasondotstar has quit IRC | 17:13 | |
fungi | mmm, that keystone change at the very front is going to fail too, according to the jenkins log for its last running job | 17:13 |
fungi | i'll go ahead and promote that change now to give them another chance i guess | 17:14 |
fungi | since when that one goes, it's a full gate reset regardless | 17:14 |
bknudson | do we want to let the job finish so we can get some logs on the failure? | 17:15 |
fungi | matel: refreshed in what way? | 17:15 |
fungi | bknudson: not really. it looks like a failure pattern we're already tracking anyway | 17:15 |
anteaya | this gate reset has been going on for some time | 17:15 |
matel | So, if I add a new entry, and it gets merged, when can I download the package from pypi.openstack.org? | 17:15 |
bknudson | ok, if it's a known failure. | 17:15 |
anteaya | to my eyes, it feels like this gate reset has been taking over an hour | 17:16 |
fungi | matel: usually around an hour after the change to openstack/requirements merges | 17:17 |
*** thuc has quit IRC | 17:17 | |
matel | fungi: thanks | 17:17 |
fungi | merge events for that repository trigger mirror refreshes | 17:18 |
dolphm | anteaya: about 1.5 hours now | 17:18 |
jeblair | anteaya: what 'gate reset?' | 17:18 |
*** thuc has joined #openstack-infra | 17:18 | |
*** mgagne has quit IRC | 17:18 | |
anteaya | 64575 at the top of the gate has been there for over an hour | 17:18 |
*** MarkAtwood has joined #openstack-infra | 17:18 | |
jeblair | anteaya: that doesn't mean it has taken zuul an hour to reset the changes after it | 17:19 |
anteaya | oh I see this job is still running: https://jenkins03.openstack.org/job/gate-tempest-dsvm-postgres-full/2605/ | 17:19 |
russellb | anteaya: tempest runs take a bit over an hour now | 17:19 |
jeblair | anteaya: the postgres job has taken 3 hours to run | 17:19 |
fungi | that change has been there because there's a job on it which ran for almost 3 hours | 17:19 |
russellb | 3 hours, eep | 17:19 |
russellb | what the deuce | 17:19 |
anteaya | my mistake, yeah this job - which is going to fail (correct dolphm?) is still running | 17:19 |
anteaya | then the gate reset | 17:19 |
*** CaptTofu_ has quit IRC | 17:19 | |
openstackgerrit | Mate Lakat proposed a change to openstack/requirements: Add XenAPI to OpenStack dependencies https://review.openstack.org/68181 | 17:20 |
*** fifieldt has quit IRC | 17:21 | |
matel | fungi: Thanks for the info, patch uploaded. | 17:22 |
jeblair | fungi: you might have to wait until it actually shows up in the pipeline since there's an event queue backlog right now | 17:22 |
*** SergeyLukjanov_ is now known as SergeyLukjanov | 17:22 | |
*** thuc has quit IRC | 17:22 | |
anteaya | sdague | though I just looked at the inprogress job on 64575,15 | 17:23 |
anteaya | 15:45:16 sdague | and it's failing right now | 17:23 |
*** smurugesan has joined #openstack-infra | 17:23 | |
fungi | jeblair: yeah, it didn't take yet | 17:23 |
*** praneshp has quit IRC | 17:23 | |
fungi | so chances are all those keystone changes will be ejected before i can get the new change promoted anyway | 17:23 |
jeblair | fungi: are there any operational issues you would like me to address? | 17:24 |
*** pcrews has quit IRC | 17:24 | |
fungi | jeblair: lifeless1 has an updated patch series up for nodepool which should help the deleted node handling. i approved some of the initial ones in the series because they looked minimal and safe, but the others may merit more eyes to confirm they don't take the design in unintended directions | 17:25 |
*** jasondotstar has joined #openstack-infra | 17:25 | |
*** thuc has joined #openstack-infra | 17:26 | |
*** thuc has quit IRC | 17:27 | |
jeblair | fungi: did you restart nodepool with any of those changes? | 17:27 |
*** thuc has joined #openstack-infra | 17:27 | |
fungi | jeblair: no, not yet. puppet's disabled on nodepool at the moment still because of the failure we hit early last week, so i could temporarily hand-edit out the configuration for the tripleo poc provider which went offline and tanked nodepool | 17:28 |
*** rakhmerov has quit IRC | 17:28 | |
openstackgerrit | Evgeny Fadeev proposed a change to openstack-infra/askbot-theme: removed a broken line from the script https://review.openstack.org/68183 | 17:28 |
fungi | i have a patch proposed to address where i saw the exceptions in the log which seemed to be preventing it from adding any new nodes | 17:28 |
jeblair | fungi: ah. | 17:29 |
fungi | (any new nodes on any provider) | 17:29 |
russellb | fungi: looks like a good time to promote | 17:29 |
fungi | russellb: zuul hasn't spotted the approval event for that patch yet, so it'll be a few more minutes | 17:29 |
russellb | ah ok | 17:29 |
jeblair | i don't understand some of lifeless1's changes that were merged | 17:30 |
SergeyLukjanov | it looks like the propose req updates job is failing now ;( http://logs.openstack.org/a9/a94a666767516699d7ee689661f2a157bc73671e/post/propose-requirements-updates/ | 17:30 |
SergeyLukjanov | morning guys | 17:30 |
* fungi looks back at them to refresh memory | 17:30 |
jeblair | SergeyLukjanov: good morning | 17:30 |
jeblair | https://review.openstack.org/#/c/67924/ | 17:30 |
jeblair | fungi: what will that tell us? | 17:30 |
*** mancdaz is now known as mancdaz_away | 17:30 | |
*** viktors has left #openstack-infra | 17:30 | |
mfer | fungi howdy | 17:30 |
jeblair | fungi: other than that the log system is working, by emitting 10 lines/second? | 17:30 |
kmartin | any chance a core could look at https://review.openstack.org/#/c/65179/ ? it has four +1's and the gate now appears to be working as sdague just had a patch land. This requirements change is holding up a new cinder driver that is supposed to land in Icehouse-2. | 17:30 |
anteaya | kmartin: might want to check with jgriffith on that | 17:31 |
*** senk has quit IRC | 17:31 | |
anteaya | ttx is advising PTLs to continue with what ever is currently in the gate and mark anything not in the gate for i3 | 17:32 |
kmartin | anteaya: jgriffith already gave it a +1, he's not able to +2 as far as I know | 17:32 |
*** fifieldt has joined #openstack-infra | 17:33 | |
fungi | jeblair: oh, it looked like that would only be hit if we entered that early. now that i look at it with fresh eyes, i agree lifeless1's suggestion that it would only be emitted at most once a second and only when under heavy load may have been mistaken | 17:33 |
jeblair | fungi: i left a comment on it | 17:33 |
fungi | jeblair: it was also suggested as a short-term debugging test to find out how often we were hitting that delay and being forced to wait | 17:33 |
anteaya | kmartin: you are looking for requirements cores: https://review.openstack.org/#/admin/groups/131,members | 17:33 |
jeblair | fungi: i expect the answer is 'constantly' | 17:34 |
anteaya | kmartin: none of which are infra cores | 17:34 |
jeblair | by design | 17:34 |
kmartin | anteaya: got it...thanks :) | 17:34 |
anteaya | kmartin: np | 17:34 |
fungi | jeblair: yeah, now i think so too. i should not try to do code review when it's that late at night | 17:35 |
jeblair | fungi: left a comment on https://review.openstack.org/#/c/67993/ as well | 17:35 |
*** afazekas has quit IRC | 17:35 | |
jgriffith | anteaya: kmartin is correct, I have no +2 powers here :) | 17:36 |
fungi | jeblair: yep, i totally missed that we had a knob for that | 17:36 |
anteaya | jgriffith: very good, just trying to spread the message ttx is advocating: what is in the gate is good to stay, and if not, retarget for i3, which is the PTL's decision of course | 17:38 |
*** marun has quit IRC | 17:38 | |
jgriffith | anteaya: agreed | 17:38 |
jeblair | fungi: the rest of the merged ones in that series lgtm | 17:38 |
jgriffith | anteaya: that one meets that criteria | 17:38 |
anteaya | jgriffith: okay, thank you | 17:38 |
jgriffith | anteaya: you're welcome, and thanks to you! | 17:39 |
anteaya | :D | 17:39 |
dmsimard | anteaya: Still haven't gotten jenkins to reverify and merge my patch set, I tried looking at jenkins01/02 in search of issues but I don't exactly know where to look. Any ideas or is it a matter of patience ? | 17:39 |
fungi | jeblair: thanks for checking back over those | 17:39 |
zaro | morning | 17:39 |
* fungi is about to disappear again. flight is boarding shortly and i think this 5-hour leg may have no wifi :( | 17:40 | |
anteaya | dmsimard: go here: http://status.openstack.org/zuul/ | 17:40 |
zaro | fungi: did new scp plugin get installed? | 17:40 |
anteaya | dmsimard: see this? Queue lengths: 1056 events, 630 results. | 17:40 |
fungi | zaro: on all the jenkins masters running newer jenkins, yes | 17:40 |
fungi | zaro: we still need to upgrade jenkins and plugin on jenkins.o.o and 01 | 17:40 |
anteaya | there are over 1000 events that zuul hasn't processed yet, your reapproval is in them | 17:41 |
zaro | fungi: cool, thx. enjoy the flight. | 17:41 |
anteaya | after your patch hits the gate queue it is still a 50 hour wait | 17:41 |
dmsimard | anteaya: Okay, I thought zuul was only for primary openstack projects, not for stackforge - makes sense. | 17:41 |
*** esker has joined #openstack-infra | 17:41 | |
anteaya | zaro morning | 17:42 |
jeblair | fungi, lifeless: why do you think making all deletes serialized will help? | 17:42 |
mfer | fungi did you ever hear back on the openstack sdk naming convention stuff? | 17:42 |
*** gokrokve has joined #openstack-infra | 17:43 | |
*** gokrokve has quit IRC | 17:43 | |
*** gokrokve has joined #openstack-infra | 17:43 | |
fungi | jeblair: i gather the suggestion there is to avoid having the event-driven deletes conflict with the queued deletes, and then run through the queue more aggressively instead | 17:43 |
*** fifieldt has quit IRC | 17:44 | |
*** marun has joined #openstack-infra | 17:44 | |
jeblair | lifeless1: and why would reordering the delete and check calls help? | 17:44 |
*** wenlock has joined #openstack-infra | 17:44 | |
fungi | mfer: i have e-mailed mark collier again asking for an update. i will also see him in person tomorrow so i'll be sure to find out the status then if nothing else (but i won't be around irc much for the next few days either) | 17:44 |
jeblair | fungi: i don't get it. right now we can parallelize work across all providers, this serializes it. | 17:44 |
*** DinaBelova_ is now known as DinaBelova | 17:45 | |
fungi | jeblair: i'm still unconvinced as well. maybe we would do better to have queue per provider and iterate through those with one or more parallel tasks each? | 17:46 |
jeblair | fungi: this means that if we're waiting our turn to send an api call to rackspace (because we're rate limiting), we won't be using the same time to send an api call to hpcloud (because it's serialized behind the request to rackspace) | 17:46 |
dstufft | sdague: some day pip will figure it out :[ | 17:46 |
jeblair | fungi: yeah, that's what we have now. | 17:46 |
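A minimal sketch of the "queue per provider with parallel workers" shape being discussed here, assuming a hypothetical provider object with a delete_server() call; this is illustrative only, not nodepool's actual code:

    import queue
    import threading
    import time

    class ProviderDeleter(threading.Thread):
        """Drain one provider's delete queue with a small worker pool, so a
        slow or rate-limited provider never blocks deletes to the others."""

        def __init__(self, provider, workers=2):
            super().__init__(daemon=True)
            self.provider = provider          # hypothetical provider wrapper
            self.deletes = queue.Queue()
            self.workers = workers

        def request_delete(self, server_id):
            self.deletes.put(server_id)

        def run(self):
            pool = [threading.Thread(target=self._worker, daemon=True)
                    for _ in range(self.workers)]
            for t in pool:
                t.start()
            for t in pool:
                t.join()

        def _worker(self):
            while True:
                server_id = self.deletes.get()
                try:
                    self.provider.delete_server(server_id)   # hypothetical API
                except Exception:
                    time.sleep(10)               # back off, then retry later
                    self.deletes.put(server_id)
                finally:
                    self.deletes.task_done()

    # One deleter per provider keeps rackspace and hpcloud deletes running in
    # parallel instead of serializing one behind the other:
    # deleters = {p.name: ProviderDeleter(p) for p in providers}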
mfer | fungi ok. if mordred knows the direction and i can twist his arm into doing it manually i'll be happy enough | 17:46 |
*** fifieldt has joined #openstack-infra | 17:46 | |
zaro | is there openstack-infra meeting today? | 17:46 |
kraman | jeblair: ping. can I get some of your time to bounce some ideas off you re solum project's use of zuul | 17:47 |
sdague | dstufft: yes, which is fine | 17:47 |
sdague | I'm just trying to be realistic on that thread | 17:47 |
sdague | because I feel like there is a POV coming from "ah it's already a solved problem" by a set of people that have never looked into the details | 17:47 |
*** sarob has joined #openstack-infra | 17:48 | |
dstufft | sdague: yea I commented already saying that relying on 22+ things is probably going to be painful for end users | 17:48 |
anteaya | zaro if so looks like it will be you me and clarkb and whoever else shows up | 17:49 |
anteaya | pleia2 will probably be there too, double teaming with the tripleo meeting | 17:50 |
fungi | zaro: anteaya: i'll join from the flight if there's wifi, otherwise just assume most of the action items i assigned myself last week are still unaddressed | 17:50 |
*** gothicmindfood has joined #openstack-infra | 17:50 | |
clarkb | hello | 17:50 |
*** nati_ueno has joined #openstack-infra | 17:51 | |
jeblair | kraman: time is in short supply right now. can it wait until later in the week? | 17:51 |
*** nati_ueno has quit IRC | 17:51 | |
kraman | jeblair: sure | 17:51 |
*** harlowja_away is now known as harlowja | 17:51 | |
anteaya | fungi: very good | 17:51 |
*** nati_ueno has joined #openstack-infra | 17:51 | |
anteaya | clarkb: hello | 17:51 |
kraman | jeblair: when should i ping you back? would sometime thursday be better? | 17:52 |
anteaya | jeblair: do you think you will have wifi during meeting time? | 17:52 |
jeblair | anteaya: i'm at home, sick, remember? | 17:52 |
anteaya | jeblair: sorry I found it hard to tell | 17:52 |
anteaya | :D | 17:52 |
anteaya | so no Utah for you? | 17:52 |
*** rakhmerov has joined #openstack-infra | 17:53 | |
*** kgriffs is now known as kgriffs_afk | 17:53 | |
*** CaptTofu has joined #openstack-infra | 17:53 | |
*** coolsvap is now known as coolsvap_away | 17:54 | |
*** mestery has quit IRC | 17:55 | |
*** nati_uen_ has quit IRC | 17:55 | |
pleia2 | anteaya: yep, I'm around | 17:55 |
anteaya | jeblair: so you will be here to chair? | 17:55 |
*** sarob_ has joined #openstack-infra | 17:56 | |
* anteaya is trying to figure out if she should be warming up the bot commands in case she has to chair | 17:56 | |
anteaya | pleia2: awesome | 17:56 |
*** sarob__ has joined #openstack-infra | 17:57 | |
*** ruhe is now known as _ruhe | 17:57 | |
pleia2 | (and I can chair if jeblair is sick or plane-ing) | 17:57 |
anteaya | grand | 17:57 |
*** sarob has quit IRC | 17:58 | |
*** sarob___ has joined #openstack-infra | 17:58 | |
*** derekh has quit IRC | 17:58 | |
gsamfira | hello again :). We are getting ready to bring the Hyper-V CI online, and we are looking into monitoring solutions for the various services that are running on the nodes. What are you guys using? Nagios, Zabbix? | 17:58 |
*** sarob has joined #openstack-infra | 17:59 | |
anteaya | gsamfira: do us all a favour and hold off until at least next week if what you are discussing is a 3rd party testing system | 18:00 |
clarkb | gsamfira: we have a simple cacti server http://cacti.openstack.org/cacti/graph_view.php?action=tree&tree_id=1&leaf_id=7&select_first=true | 18:00 |
clarkb | anteaya: ? | 18:00 |
*** sarob has quit IRC | 18:00 | |
clarkb | also do changes still need promotion? | 18:00 |
anteaya | clarkb: should it not? | 18:00 |
fungi | jeblair: clarkb: mordred: if one of you can 'sudo zuul promote --pipeline gate --changes 68147,3' on zuul.o.o once 68147,3 shows up in the gate (its approval is still i the event queue), it would be appreciated. i have to drop offline now, possibly for many hours | 18:00 |
*** sarob_ has quit IRC | 18:00 | |
clarkb | fungi: will do thanks | 18:00 |
anteaya | can we tolerate the possible spam of a new 3rd party testing system right now? | 18:01 |
* fungi departs | 18:01 | |
clarkb | anteaya: should it not what? | 18:01 |
clarkb | anteaya: it won't affect anything on our end | 18:01 |
anteaya | okay | 18:01 |
clarkb | anteaya: it may affect code review | 18:01 |
*** jamielennox|away is now known as jamielennox | 18:01 | |
*** Alex_Gaynor has quit IRC | 18:02 | |
*** sarob__ has quit IRC | 18:02 | |
gsamfira | we will not be voting right away, until we make sure that it works as expected. Right now we are finishing the last bits. If you think we should hold off until next week, we will. But we need to get this online by 31/01/2014 (at the latest) | 18:03 |
*** sarob___ has quit IRC | 18:03 | |
*** Alex_Gaynor has joined #openstack-infra | 18:03 | |
clarkb | gsamfira: I don't think you need to wait as long as you hold off on voting until you are confident in the system | 18:04 |
gsamfira | awesome | 18:04 |
*** gothicmindfood has quit IRC | 18:05 | |
*** hogepodge has joined #openstack-infra | 18:06 | |
*** fifieldt has quit IRC | 18:07 | |
*** elasticio has joined #openstack-infra | 18:07 | |
*** max_lobur is now known as max_lobur_afk | 18:07 | |
*** nicedice has joined #openstack-infra | 18:09 | |
*** yamahata has joined #openstack-infra | 18:13 | |
*** sarob has joined #openstack-infra | 18:13 | |
*** praneshp has joined #openstack-infra | 18:14 | |
*** markwash has joined #openstack-infra | 18:15 | |
clarkb | zaro and I found a small bug in the scp plugin change that went in. Noticed that there are a bunch of tests on jenkins masters that have been going for hours and hours. Looks like they ran into network trouble which killed the file upload thread before it could notify the main job thread. I am manually killing those jobs and we will work on a fix | 18:16 |
jeblair | clarkb: ack thx | 18:17 |
*** dims has quit IRC | 18:17 | |
*** fifieldt has joined #openstack-infra | 18:20 | |
*** jerryz has joined #openstack-infra | 18:21 | |
*** smarcet has left #openstack-infra | 18:24 | |
*** smarcet has joined #openstack-infra | 18:24 | |
*** dims has joined #openstack-infra | 18:24 | |
*** jp_at_hp has quit IRC | 18:27 | |
*** dripton is now known as dripton_shovelin | 18:30 | |
anteaya | pleia2: you are chairing today's meeting | 18:30 |
anteaya | let me know if there is anything I can do to help | 18:31 |
*** kgriffs_afk is now known as kgriffs | 18:31 | |
*** pcrews has joined #openstack-infra | 18:31 | |
*** jooools has quit IRC | 18:32 | |
clarkb | jeblair: are you hacking on zuul or trying to recover from illness? | 18:32 |
clarkb | (or both? :) ) | 18:32 |
jeblair | clarkb: mostly trying to catch up and ascertain the current situation. | 18:33 |
jeblair | clarkb: and recover. | 18:33 |
jeblair | clarkb: fungi suggested that improving nodepool delete performance was the most operationally critical thing i should look at. | 18:33 |
pleia2 | anteaya: thanks | 18:33 |
clarkb | jeblair: I would mostly agree with that. I also think that having a rate limit of some sort in zuul would help tremendously in moving the queue because we can stop wasting resources on the 50th change in the gate | 18:34 |
jeblair | clarkb: do you want to write that? | 18:34 |
clarkb | jeblair: I can try :) | 18:35 |
clarkb | thinking about it, I actually think it won't be too hard as we can just preslice the queues that zuul operates over based on $thing | 18:35 |
*** markwash has quit IRC | 18:38 | |
mordred | clarkb: having it be an adaptive rate limit would be interesting ... perhaps set it based on the past X time's success/failure ratio? | 18:39 |
*** nati_uen_ has joined #openstack-infra | 18:40 | |
fungi | there is wireless on this flight. looks like zuul still hasn't spotted 68147,3 though | 18:40 |
mordred | but - a static configured one is probably a great step towards that | 18:40 |
mordred | fungi: yay inflight wifi | 18:40 |
clarkb | mordred: yup, probably going to start simple and do something similar to tcp | 18:40 |
mordred | I love it that "something similar to tcp" was your follow up to "start simple" | 18:41 |
*** markwash has joined #openstack-infra | 18:41 | |
clarkb | mordred: well the window sizing in tcp is simple | 18:41 |
clarkb | increment increment increment, trouble reduce by half, increment increment increment | 18:42 |
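A minimal sketch of the TCP-style window sizing clarkb describes (additive increase, multiplicative decrease); purely illustrative, not the actual change that became 68219:

    class DependentQueueWindow:
        """AIMD throttle for a dependent (gate) queue: grow the number of
        head-of-queue changes that get test resources on success, halve it
        on a failure."""

        def __init__(self, start=10, floor=3):
            self.window = start   # how many changes at the head get jobs
            self.floor = floor    # never throttle below this many

        def on_merge(self):
            # things are healthy: test one more change speculatively
            self.window += 1

        def on_failure(self):
            # a reset happened: halve the window so we stop burning nodes
            # on deep speculative states that will just be thrown away
            self.window = max(self.floor, self.window // 2)

        def active_items(self, items):
            # only the first `window` queue items are given node resources;
            # the rest simply wait their turn
            return items[:self.window]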
clarkb | ttx: I can't context switch enough right now, but I really don't like zuul handling the assumption that tests will fail | 18:43 |
*** nati_ueno has quit IRC | 18:43 | |
clarkb | assuming tests will fail implies to me that we should just turn the gate off and have tea | 18:43 |
jeblair | tea sounds lovely right now | 18:43 |
ttx | clarkb: well, not really. | 18:43 |
fungi | yay taxiing back to the gate for a warning light in the cockpit... grr | 18:44 |
ttx | clarkb: we are only ignoring them as far as the top changes in the gate go | 18:44 |
jeblair | fungi: that's what you get for using wifi | 18:44 |
fungi | indeed | 18:44 |
mordred | ttx, clarkb: context? | 18:44 |
clarkb | mordred: suggestions for improving the gate thread | 18:44 |
ttx | clarkb: I agree it's an aggressive change, but I still prefer that to turning gate off | 18:44 |
mordred | gotcha. looking | 18:44 |
clarkb | but I can't email right now because ETOOMUCHGOINGON | 18:44 |
ttx | if it boils down to those two options | 18:45 |
clarkb | ttx: its an aggressive change that assumes nothing will mege | 18:45 |
clarkb | so why do it | 18:45 |
ttx | clarkb: it just automates what we are doing anyway | 18:45 |
mordred | ttx: what's the thread subject? | 18:45 |
ttx | could be temporary too | 18:45 |
mordred | ttx: I'd like to read the suggestion | 18:45 |
jeblair | [ 80: Sean Dague ] [OpenStack-Infra] suggestions for gate optimizations | 18:45 |
ttx | mordred: on openstack-infra | 18:45 |
mordred | ahhh | 18:45 |
openstackgerrit | SlickNik proposed a change to openstack-infra/devstack-gate: Add Trove testing support https://review.openstack.org/65040 | 18:46 |
jeblair | Message-ID: <52DEAF38.6070409@openstack.org> | 18:46 |
mordred | gotit | 18:46 |
clarkb | jeblair: right now I am thinking only dependent pipelines need rate limiting? because independent will get their jobs run once and done. Does that seem correct to you or should we limit the entire space to give everything a more level playing field? | 18:46 |
ttx | fwiw I'm not saying we should do that. Just want to put it on the table, just above the "turn gate off" nuclear option. | 18:46 |
mordred | yeah - we've talked about more parallel testing of optional combinations - but I think we should implement throttling first | 18:47 |
jeblair | clarkb: i think dependent only | 18:47 |
clarkb | jeblair: ok | 18:47 |
*** dstanek_afk is now known as dstanek | 18:47 | |
mordred | yes | 18:47 |
*** obondarev_ has joined #openstack-infra | 18:47 | |
*** luqas has quit IRC | 18:48 | |
jeblair | mordred: by throttling do you mean what clarkb is working on? | 18:48 |
*** gothicmindfood has joined #openstack-infra | 18:48 | |
*** mfer has quit IRC | 18:48 | |
mordred | jeblair: yes | 18:48 |
*** mfer has joined #openstack-infra | 18:48 | |
*** marun has quit IRC | 18:48 | |
jeblair | mordred: i would not view that as addressing the fundamental problem | 18:49 |
mordred | jeblair: because I think that before we could even think about running more speculative branches of things, we'd have to have a mechanism to control resorces consumption. | 18:49 |
*** marun has joined #openstack-infra | 18:49 | |
jeblair | mordred: oh i see the connection. ack. | 18:49 |
mordred | I don't think either addresses the fundamental problem - I just think that jumping to speculative first-fail branches right now is untenable | 18:49 |
HenryG | How can I fetch the latest patchset of a review without knowing how many patchsets it has? | 18:51 |
*** fbo is now known as fbo_away | 18:52 | |
fungi | HenryG: with git-review? it should pick the latest patchset bu default if you don't specify one | 18:52 |
HenryG | fungi: actually I was hoping to just download the diff | 18:53 |
jeblair | fungi: do you have an idea of what the actual time to delete a rax server is? | 18:53 |
sdague | clarkb: if you are putting in rate limiting, if you can do it in such a way that the value could be updated without a zuul restart | 18:53 |
sdague | we could do it pseudo static | 18:53 |
clarkb | sdague: ya | 18:53 |
sdague | I think right now a value of 10 would be appropriate | 18:54 |
*** dripton_shovelin is now known as dripton | 18:54 | |
clarkb | sdague: possibly as a zuul rpc command to start with | 18:54 |
*** _david_ has joined #openstack-infra | 18:54 | |
*** marun has quit IRC | 18:54 | |
fungi | jeblair: in current nodepool, or just in general? | 18:54 |
sdague | clarkb: sure, or a config that it rereads | 18:54 |
sdague | so it would be persistent across reboots | 18:54 |
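One possible shape for the "config that it rereads" idea: re-read a small tunable when the daemon receives SIGHUP, so the value survives restarts and can be changed live. The file path and key here are made up; this is a sketch, not zuul's actual mechanism:

    import signal

    WINDOW_SIZE = 10   # default if the file is absent or unreadable

    def reload_window_size(signum=None, frame=None):
        global WINDOW_SIZE
        try:
            with open('/etc/zuul/window') as f:   # hypothetical path
                WINDOW_SIZE = int(f.read().strip())
        except (IOError, ValueError):
            pass   # keep the previous value on a bad or missing file

    # re-read on SIGHUP so operators can retune without a restart
    signal.signal(signal.SIGHUP, reload_window_size)
    reload_window_size()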
jeblair | fungi: in general? if 10 mins isn't long enough, how long is enough? | 18:54 |
fungi | as in, how long is nodepool taking to clear a rax vm, or how long does it take after i nova delete before i see it no longer in nova list? | 18:55 |
sdague | ok... really going to get lunch | 18:55 |
jeblair | fungi: i think the first one | 18:55 |
fungi | jeblair: i've seen rax, particularly ord, vms hanging around in nodepool list for up to 3 hours, but usually they're gone within 0.5 to 1 hour | 18:56 |
*** krotscheck has joined #openstack-infra | 18:56 | |
jeblair | fungi: you have something running right now that deletes things side-channel? | 18:56 |
*** kgriffs is now known as kgriffs_afk | 18:57 | |
jeblair | fungi: i see a lot of "NotFound: The resource could not be found. (HTTP 404)" errors in the logs; could it be because of that? | 18:57 |
*** jcoufal-mobile has joined #openstack-infra | 18:57 | |
jeblair | fungi: (as in, is it possible your side-channel deleting is getting the jump on the NodeCompleteThread delete?) | 18:57 |
fungi | jeblair: right now, entirely probable | 18:58 |
fungi | i can stop and let it run its course | 18:58 |
jeblair | fungi: don't worry, if you think it's helping overall it's finj | 18:58 |
jeblair | fine | 18:58 |
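The 404s above are consistent with two delete paths racing; a small sketch of treating "already gone" as success, assuming a novaclient-style client where delete raises NotFound (names are illustrative, not nodepool's code):

    from novaclient import exceptions

    def delete_server_idempotent(client, server_id):
        """Delete a server, treating 'it no longer exists' as success so a
        side-channel cleanup beating us to it is not logged as an error."""
        try:
            client.servers.delete(server_id)
        except exceptions.NotFound:
            # another delete path (or a previous attempt) already removed it
            pass
        return True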
*** jchiles has quit IRC | 18:58 | |
russellb | what's the page to the monitoring dashboard for infra hosts ... so i can see things like load avg on zuul | 18:58 |
russellb | or cpu consumption or whatever | 18:58 |
*** _david_ has quit IRC | 18:58 | |
jeblair | russellb: http://cacti.openstack.org/cacti/graph_view.php?action=tree&tree_id=1&leaf_id=23 | 18:58 |
fungi | it is. without doing it, the width of the "used" band in the graph dwindles to about 20% of the aggregate quota at most | 18:58 |
russellb | jeblair: thanks! | 18:58 |
*** lifeless1 is now known as lifeless | 18:59 | |
lifeless | jeblair: will talk after meetings :/ | 18:59 |
*** thuc has quit IRC | 18:59 | |
*** thuc has joined #openstack-infra | 19:00 | |
pleia2 | meeting time (once they finish up) | 19:00 |
mriedem | well the cinder revert patch for 1270608 passed jenkins for a 3rd time https://review.openstack.org/#/c/67973/ | 19:00 |
*** thuc has quit IRC | 19:00 | |
russellb | jeblair: load not crazy high it seems ... just wondering why it hasn't picked up a change from a while ago (68147,3), which fixes one of the top gate bugs | 19:00 |
mriedem | although the last one didn't have test_volume_boot_pattern | 19:00 |
*** thuc has joined #openstack-infra | 19:01 | |
*** jerryz has quit IRC | 19:02 | |
*** otherwiseguy has joined #openstack-infra | 19:03 | |
jeblair | russellb: zuul is not extremely efficient. the suggestion i sent to the list about splitting out the merger component and horizontally scaling it would help. also, making it so that it doesn't have to process the full queue on every result event would help. | 19:03 |
russellb | jeblair: cool, i'm hoping i can study up on the zuul code here soon ... i'd like to help | 19:03 |
jeblair | russellb: that last one sounds trivial, but the last time i looked it seemed moderately complex. | 19:03 |
*** _david_ has joined #openstack-infra | 19:04 | |
russellb | there was one suggestion i saw on the infra list that i don't think got a response but sounded interesting ... which was to run more than one zuul, like run one for just check, and the rest in another | 19:05 |
jeblair | russellb: (basically, the "result queue" should just be a boolean flag: ">=1 new result received; queue processing needed") | 19:05 |
russellb | not sure if that's even possible though | 19:05 |
jeblair | russellb: that would be more complex than actually improving zuul, and degrade our experience at the same time. | 19:05 |
russellb | k :) | 19:05 |
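A tiny sketch of jeblair's "boolean flag" point above: coalesce result events into a single "processing needed" flag instead of reprocessing the whole queue once per result. Illustrative only, not zuul's actual scheduler:

    import threading

    results_pending = threading.Event()
    results = []

    def on_build_result(result):
        results.append(result)     # keep the result itself
        results_pending.set()      # >=1 new result received; processing needed

    def scheduler_loop(process_queues):
        # process_queues() makes one pass over the pipelines, consuming
        # however many results arrived since the last pass
        while True:
            results_pending.wait()
            results_pending.clear()
            process_queues()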
*** johnthetubaguy1 has quit IRC | 19:06 | |
*** gema has joined #openstack-infra | 19:07 | |
*** ok_delta has joined #openstack-infra | 19:08 | |
*** ok_delta__ has joined #openstack-infra | 19:08 | |
*** dstufft has quit IRC | 19:10 | |
*** dstufft has joined #openstack-infra | 19:10 | |
*** praneshp has quit IRC | 19:10 | |
ArxCruz | jeblair: hey, happy new year :) backing to business now, whenever you have chance, can you take a look in my patch https://review.openstack.org/#/c/62739/ | 19:12 |
ArxCruz | :) | 19:12 |
*** dkliban is now known as dkliban_afk | 19:12 | |
ArxCruz | anteaya: you too :) | 19:12 |
*** markmcclain has joined #openstack-infra | 19:12 | |
anteaya | hey ArxCruz | 19:12 |
ArxCruz | =D | 19:12 |
anteaya | nice to see you back | 19:12 |
ArxCruz | thanks | 19:15 |
*** melwitt has joined #openstack-infra | 19:15 | |
ArxCruz | we've very busy preparing everything to start report results | 19:15 |
ArxCruz | and now it's almost done | 19:15 |
anteaya | cool | 19:16 |
*** praneshp has joined #openstack-infra | 19:16 | |
*** fifieldt has quit IRC | 19:18 | |
*** krtaylor has quit IRC | 19:22 | |
*** aburaschi has joined #openstack-infra | 19:23 | |
*** sarob has quit IRC | 19:26 | |
*** Ajaeger1 has joined #openstack-infra | 19:27 | |
*** vkozhukalov has quit IRC | 19:27 | |
*** nati_ueno has joined #openstack-infra | 19:27 | |
*** gokrokve has quit IRC | 19:28 | |
aburaschi | Hi guys, newbie question: I've noticed something changed in the way tests are listed in tempest. Newlines and formatting in general is no longer working as before. Is this an intended behavior? Is there a different way to run tempest other than run_tempest.sh or run_tests.sh? | 19:30 |
*** mriedem has quit IRC | 19:30 | |
*** pblaho has joined #openstack-infra | 19:31 | |
*** nati_uen_ has quit IRC | 19:31 | |
anteaya | aburaschi: tox | 19:31 |
anteaya | http://git.openstack.org/cgit/openstack/tempest/tree/README.rst | 19:32 |
*** markmcclain has quit IRC | 19:32 | |
fungi | aburaschi: also, questions about tempest development are probably better handled in #openstack-qa | 19:32 |
anteaya | well these instructions use testr | 19:32 |
anteaya | and what fungi said | 19:32 |
*** yamahata has quit IRC | 19:33 | |
openstackgerrit | James E. Blair proposed a change to openstack-infra/nodepool: Revert "Default to a ratelimit of 2/second for API calls" https://review.openstack.org/68213 | 19:33 |
openstackgerrit | James E. Blair proposed a change to openstack-infra/nodepool: Revert "Provide diagnostics when task rate limiting." https://review.openstack.org/68214 | 19:33 |
openstackgerrit | James E. Blair proposed a change to openstack-infra/nodepool: Try longer to delete servers in the NodeCompleteThread https://review.openstack.org/68215 | 19:33 |
*** yamahata has joined #openstack-infra | 19:33 | |
aburaschi | anteaya: hey! hi again :) I'm having a hard time running tox. It fails in the oslo.config package. In the meantime, I was trying with run_* and testr. | 19:33 |
emagana | Hi Guys! Anyone from Infra who can help with a service account for third party testing? | 19:34 |
anteaya | aburaschi: see if this helps: http://git.openstack.org/cgit/openstack/oslo-incubator/tree/TESTING.rst | 19:34 |
anteaya | emagana: best if you email the infra email list | 19:34 |
aburaschi | anteaya: Ok! I'll review the directions you mention and come back if still no luck. | 19:34 |
anteaya | stand by for the address | 19:34 |
emagana | anteya: Yes, I did already! | 19:34 |
aburaschi | anteaya: Thanks again :) | 19:34 |
emagana | anteaya: My email awaits moderator approval | 19:35 |
anteaya | emagana: are you subscribed to the list yet? | 19:35 |
emagana | anteaya: No, I am not! | 19:36 |
anteaya | we are in a meeting, after the meeting I will ask mordred or jeblair or pleia2 to approve your email | 19:36 |
anteaya | emagana: I recommend subscribing | 19:36 |
emagana | anteaya: I will subscribe but the email will still be on hold, right? | 19:36 |
fungi | emagana: anteaya: pleia2 usually goes through and approves moderated messages periodically | 19:36 |
anteaya | emagana: for this one, yes | 19:37 |
pleia2 | I let it through | 19:37 |
emagana | pleia2: Thanks!!! | 19:37 |
pleia2 | sure thing | 19:37 |
sdague | hmmpph, zuul's about 4hrs behind processing events now it seems (based on spot check of something in the review queue) | 19:37 |
anteaya | sdague: is that an improvement or slower than it has been? | 19:37 |
*** fifieldt has joined #openstack-infra | 19:38 | |
emagana | anteaya: I guess I just need to wait for the account creation.. | 19:38 |
fungi | though be forewarned, i have a pretty big backlog of third-party testing account add/change requests since i've been busy with other things. i'll take a look at them sometime in the next few days if nobody else beats me to it | 19:38 |
anteaya | emagana: you might get an email asking for additional details, I don't know I haven't read it yet, if so respond promptly please | 19:38 |
emagana | anteaya: I will for sure! | 19:39 |
sdague | anteaya: this is new | 19:39 |
emagana | anteaya: Thanks a lot! | 19:39 |
anteaya | and yes, as fungi says, please be patient | 19:39 |
*** apevec has joined #openstack-infra | 19:39 | |
*** apevec has joined #openstack-infra | 19:39 | |
anteaya | sdague: hmmmmph | 19:39 |
sdague | new devstack patch posted at 10:30am EST - got to check queue 30 minutes ago | 19:39 |
emagana | fungi: Thanks as well | 19:39 |
sdague | so slightly less than 4 hrs | 19:39 |
apevec | sdague, was the approve +1 maybe removed? I had a few stable-maint try to approve this leftover release-bump: https://review.openstack.org/62127 | 19:40 |
apevec | and all they could do was +2 review | 19:40 |
apevec | but not approve | 19:40 |
sdague | apevec: yes, it was removed, because stable/havana can't pass | 19:41 |
sdague | and stable maint folks were apparently ignoring the emails we sent about that | 19:41 |
apevec | sdague, ok, but this is grizzly | 19:41 |
sdague | yeh, that couldn't pass either for a while | 19:41 |
apevec | sdague, big hammer always works :) | 19:41 |
*** mestery has joined #openstack-infra | 19:41 | |
sdague | not sure if we are passing there or not | 19:41 |
apevec | sdague, but grizzly should be good now, no? | 19:41 |
sdague | apevec: probably, you have current test results? | 19:42 |
apevec | lemme try recheck on that one | 19:42 |
sdague | apevec: so please never +A something like that -https://review.openstack.org/62127 | 19:43 |
*** senk1 has joined #openstack-infra | 19:43 | |
sdague | the last valid tests are from Dec 15th | 19:43 |
sdague | *so* much could have changed since then | 19:43 |
apevec | s/could// | 19:43 |
jeblair | http://logs.openstack.org/periodic-stable/periodic-tempest-dsvm-full-grizzly/3fc6377/console.html | 19:43 |
sdague | right :) | 19:43 |
jeblair | sdague: apevec: ^ | 19:43 |
jeblair | (from 2 days ago) | 19:44 |
sdague | jeblair: yeh, I think we got a fix in after that, it was definitely broken through the weekend | 19:44 |
jeblair | oh | 19:44 |
jeblair | http://logs.openstack.org/periodic-stable/periodic-tempest-dsvm-full-grizzly/31473f3/console.html | 19:44 |
apevec | so no joy on grizzly yet | 19:44 |
jeblair | sdague: yeah, that's the most recent and is a success | 19:45 |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/zuul: Add rate limiting to dependent pipeline queues https://review.openstack.org/68219 | 19:45 |
jeblair | apevec: ^ | 19:45 |
clarkb | jeblair: sdague ^ super simple not configurable, but wanted eyes on the general mechanism before I get into it too deep | 19:45 |
*** yolanda has quit IRC | 19:45 | |
sdague | apevec: right now, at current gate queue length, if you approve a change today, it will probably merge friday | 19:45 |
apevec | sdague, ugh, and we planned stable/havana freeze next week :( | 19:46 |
fungi | sdague: that's optimistic of you! | 19:46 |
sdague | fungi: well we were merging > 1 patch / hr overnight | 19:46 |
fungi | yep | 19:46 |
portante | sdague: if the change does not need to be reverified, right? | 19:46 |
sdague | portante: correct | 19:47 |
sdague | though grizzly has like no tests on it | 19:47 |
portante | fair enough | 19:47 |
*** sarob has joined #openstack-infra | 19:47 | |
sdague | the fact that the entire volumes infrastructure was broken in our setup for grizzly until my fix, and tempest only failed 5 tests, gives you an indication of how much more lightly it was tested | 19:48 |
sdague | sorry, not my fix | 19:48 |
sdague | a fix I got bumped up | 19:48 |
*** mriedem has joined #openstack-infra | 19:49 | |
jgriffith | sdague: which fix are you referring to BTW? | 19:51 |
jgriffith | sdague: I realize infra, but still curious | 19:51 |
sdague | the pip -e install one | 19:51 |
sdague | it was a devstack fix | 19:51 |
jgriffith | sdague: ahh :) | 19:51 |
jgriffith | sdague: yes | 19:51 |
*** marun has joined #openstack-infra | 19:51 | |
sdague | https://review.openstack.org/#/c/67425/ | 19:52 |
jgriffith | so it wasn't the "no-wheel" | 19:52 |
sdague | no, turns out it was an older issue | 19:53 |
sdague | honestly, I have no idea how it was working before | 19:53 |
*** thuc has quit IRC | 19:53 | |
*** thuc has joined #openstack-infra | 19:53 | |
sdague | apevec: this is needed for stable/havana to work - https://review.openstack.org/#/c/67739/ | 19:53 |
*** thuc has quit IRC | 19:54 | |
sdague | it's in the queue, but given the queue length, I didn't figure it needed promoting | 19:54 |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/config: Pack zuul git refs daily. https://review.openstack.org/68222 | 19:54 |
*** thuc has joined #openstack-infra | 19:54 | |
*** ArxCruz has quit IRC | 19:55 | |
*** kgriffs_afk is now known as kgriffs | 19:57 | |
*** GheRivero has quit IRC | 19:57 | |
*** GheRivero has joined #openstack-infra | 19:58 | |
*** dkliban_afk is now known as dkliban | 19:58 | |
*** gokrokve has joined #openstack-infra | 19:58 | |
sdague | clarkb: why did you add change queue to the queue item in - https://review.openstack.org/68219 ? | 19:59 |
apevec | sdague, can the queue be force-flushed to let this through? | 19:59 |
anteaya | thanks for doing a great job chairing, pleia2 | 19:59 |
clarkb | sdague: because there was no other way I could find to get at the queue a queue item belonged to | 19:59 |
anteaya | you ended before I could squeeze that into the logs | 19:59 |
anteaya | :D | 19:59 |
clarkb | sdague: all of the result processing is on queue items, so from there I need to be able to tell the queue to change the throttle | 19:59 |
sdague | clarkb: ok, I see now | 19:59 |
*** sarob has quit IRC | 20:00 | |
pleia2 | anteaya: hah, thanks :) | 20:00 |
*** gokrokve_ has joined #openstack-infra | 20:00 | |
fungi | clarkb: i suppose we could manually run the repack and see if times improve, but i'm not sure how long it's likely to take to finish under current load | 20:00 |
*** sarob has joined #openstack-infra | 20:00 | |
*** markmc has joined #openstack-infra | 20:01 | |
sdague | apevec: right now, promoting anything other than a suspected fix for a gate reset causing bug needs really strong justification | 20:01 |
fungi | i guess with it on tmpfs though, shouldn't be too impactful | 20:01 |
*** pblaho has quit IRC | 20:02 | |
fungi | load average is half the core count on zuul | 20:03 |
apevec | sdague, I mean 67739 which is a supposed fix | 20:03 |
*** gokrokve has quit IRC | 20:03 | |
sdague | right, but we blocked stable, so stable's not reseting the gate now | 20:04 |
*** rakhmerov has quit IRC | 20:04 | |
*** sarob has quit IRC | 20:05 | |
jeblair | clarkb: you have comments on 68219; in general looks excellent | 20:05 |
apevec | sdague, I can't see from zuul status page, are jobs for 67739 running or is it waiting in the queue? | 20:05 |
*** gokrokve_ has quit IRC | 20:05 | |
clarkb | jeblair: thanks | 20:06 |
jeblair | is terry wilson in irc? that looks like a good comment too. | 20:06 |
*** mrodden1 has joined #openstack-infra | 20:07 | |
*** mrodden has quit IRC | 20:07 | |
clarkb | I don't know who terry wilson is and yes good comment | 20:07 |
jeblair | clarkb: i left one more | 20:07 |
apevec | otherwiseguy is terry wilson | 20:07 |
*** DinaBelova is now known as DinaBelova_ | 20:08 | |
*** bauzas has joined #openstack-infra | 20:08 | |
otherwiseguy | jeblair: hi! | 20:09 |
jeblair | otherwiseguy: nice to meet you! thanks for the good comment on clarkb's zuul change | 20:09 |
fungi | he made more than one good comment on it, in fact | 20:10 |
otherwiseguy | jeblair: nice to meet you as well. hopefully I'll be around a little more in the near future and be able to contribute a bit. | 20:10 |
fungi | though i guess one was a comment on a comment. in general i agree with most of the tuning suggestions there | 20:11 |
fungi | s/most/all/ | 20:11 |
* otherwiseguy is new so might not always know what he's talking about | 20:12 | |
otherwiseguy | ;) | 20:12 |
*** markwash has quit IRC | 20:12 | |
fungi | i think most of those numbers should grow knobs in the config though | 20:12 |
clarkb | fungi: yup definitely plan to make this configable | 20:12 |
clarkb | wanted to get mechanics of it in front of people asap though | 20:13 |
fungi | yeah, it makes sense to me, from a core design perspective | 20:13 |
fungi | increment in the good times, halve in the bad, start with a reasonably large number we're unlikely to exceed except when under heavy load | 20:14 |
otherwiseguy | The exponential backoff combined with creeping up by 1 means that the actionable_size will probably mostly be half of what we are capable of, though. | 20:15 |
otherwiseguy | We'll creep up, then bam! | 20:15 |
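(The windowing behaviour being discussed here is essentially additive increase, multiplicative decrease. A minimal sketch of that mechanism follows; class and method names and the defaults are illustrative assumptions, not the actual code from change 68219.)

```python
# Sketch of the window logic under discussion: grow by one on each successful
# merge, halve on a failure, never drop below a floor.
class ChangeQueueWindow(object):
    def __init__(self, floor=3, start=20):
        self.floor = floor    # smallest number of changes we will still test
        self.window = start   # current "actionable" depth of the queue

    def record_merge(self):
        # Good times: creep the window up by one.
        self.window += 1

    def record_failure(self):
        # Bad times: halve the window, but respect the floor.
        self.window = max(self.floor, self.window // 2)


# Example: starting at 20, one failure drops the window to 10, and it then
# takes ten successful merges to climb back to 20 -- hence otherwiseguy's
# point that the window tends to sit around half of what the gate can handle.
w = ChangeQueueWindow()
w.record_failure()
assert w.window == 10
```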
*** krtaylor has joined #openstack-infra | 20:16 | |
*** prad_ has joined #openstack-infra | 20:16 | |
clarkb | otherwiseguy: yup | 20:16 |
*** nati_uen_ has joined #openstack-infra | 20:16 | |
jeblair | otherwiseguy: basically i was thinking that having a min between 1 and 4 wouldn't waste too many resources, and could still get some of the benefit of the parallel jobs | 20:16 |
*** mgagne has joined #openstack-infra | 20:16 | |
*** elasticio has quit IRC | 20:17 | |
*** prad has quit IRC | 20:17 | |
*** prad_ is now known as prad | 20:17 | |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/zuul: Add rate limiting to dependent pipeline queues https://review.openstack.org/68219 | 20:17 |
clarkb | that addresses the comments. I am going to grab lunch then work on making it configurable | 20:17 |
clarkb | also tests | 20:18 |
*** jcoufal-mobile has quit IRC | 20:18 | |
*** mriedem has quit IRC | 20:18 | |
clarkb | oh lol my #slice and dice here comment should be deleted :) | 20:18 |
harlowja | if u guys get a sec, https://review.openstack.org/#/c/65135/ please :) | 20:18 |
sdague | clarkb: yeh, honestly floor of 3 is probably safe | 20:18 |
clarkb | gah I missed logging too ... anyways after lunch | 20:18 |
clarkb | sdague: ok | 20:18 |
*** nati_ueno has quit IRC | 20:19 | |
harlowja | clarkb for rate-limiting, if u want i made this a while ago, http://paste.openstack.org/show/61647/ | 20:21 |
harlowja | might be useful in your stuff | 20:21 |
fungi | clarkb: i'm trying to think through what happens when you have (head) a, b, c, d, e in the gate, a is passing, b is failing, c depends on b and is skipped, d and e are passing (nnfi basing d on a)... with actionable at, say 3 how does that scenario play out? | 20:22 |
fungi | do we not test d and e until a,b,c flush through? | 20:22 |
jeblair | fungi: i believe that's correct; c counts as one of the actionables | 20:22 |
fungi | in that case, i wonder whether long dependent series might have some unforeseen impact when actionable gets low | 20:24 |
fungi | i guess not until one of them fails | 20:24 |
*** NikitaKonovalov_ is now known as NikitaKonovalov | 20:24 | |
jeblair | fungi: yeah, i think we could end up actually testing 1 change if b and c are dependents and a is failing. | 20:25 |
jeblair | fungi: which would be a bit sad; zuul would be running jobs for 0 potentially mergable changes. | 20:26 |
sdague | jeblair: oh, because we don't distinguish runnable | 20:26 |
sdague | hmmm | 20:27 |
jeblair | we could say that's acceptable for a first cut of this algorithm, or we could have 'getActionableItems' be a bit smarter and try to reach deeper and always find x runnable items | 20:27 |
sdague | so why don't we just fix the value at 10, with an rpc way to update | 20:27 |
fungi | agreed. this is great as a starting point | 20:27 |
sdague | and not try to be clever and adaptive | 20:27 |
sdague | because it's going to cause possibly other behavior we don't understand | 20:28 |
sdague | and right now, is not a great time to add uncertainty | 20:28 |
fungi | sdague: a dependent patch series of 10 would still have the same effect in that case | 20:28 |
sdague | fungi: sure | 20:28 |
jeblair | i think if we set the min to a high enough value, we can go with the clever and adaptive bit | 20:28 |
sdague | but getting those all approved at once is rare | 20:28 |
jeblair | so clever and adaptive with min=10 perhaps? | 20:28 |
sdague | jeblair: sure | 20:28 |
*** fifieldt has quit IRC | 20:30 | |
otherwiseguy | just out of curiosity, how many has it been getting up to right now? | 20:30 |
sdague | merging, or running? | 20:30 |
jeblair | otherwiseguy: recently we've merged batches of 3 changes together | 20:31 |
jeblair | otherwiseguy: in the best of times we've seen 20 | 20:31 |
*** gyee has quit IRC | 20:32 | |
*** nati_uen_ has quit IRC | 20:33 | |
otherwiseguy | sdague: running. i.e. if setting the min actionable_items to 10, how does that compare to what we've been hitting. | 20:33 |
sdague | otherwiseguy: we've been getting 40 deep | 20:33 |
sdague | which is the problem | 20:33 |
sdague | then we have to throw all that away | 20:33 |
fungi | basically about enough jobs to use most of our 450 nodepool node capacity | 20:33 |
fungi | which varies a bit depending on which changes are in the front of the pipeline and how long jobs run until one completes with a failing result | 20:34 |
otherwiseguy | Yeah, overcommit ratios only work when at least some resources are occasionally idle. :) | 20:35 |
*** mriedem has joined #openstack-infra | 20:35 | |
russellb | still don't see the nova gate bug fix in the queue so it can be promoted :-/ | 20:36 |
* russellb offers zuul a cookie | 20:36 | |
zaro | clarkb: i didn't need the generic exception. the new scp plugin is on review-dev and I have pushed code to the github pull request. | 20:37 |
*** mrmartin has joined #openstack-infra | 20:39 | |
*** sarob has joined #openstack-infra | 20:39 | |
sdague | russellb: yeh, we're event backlogged now | 20:39 |
* russellb nods | 20:39 | |
sdague | notice the event queue length > 1000 | 20:39 |
russellb | oh snap | 20:39 |
sdague | this was like it was on wed | 20:39 |
russellb | i wasn't looking at that part of status | 20:39 |
sdague | so it's like 3.5hrs for an event to get processed | 20:40 |
russellb | can we ninja merge stuff past all this? that nova bug has 340 hits in 12 hours | 20:41 |
sdague | russellb: yeh, it passed check earlier? | 20:41 |
sdague | i'm cool with a ninja merge if we have one set of good test results | 20:42 |
russellb | no ... hasn't made it to check yet, either | 20:42 |
*** pcrews has quit IRC | 20:42 | |
*** apevec has quit IRC | 20:43 | |
*** gokrokve has joined #openstack-infra | 20:43 | |
NobodyCam | -infra just a passing "Great Job guys!" | 20:43 |
fungi | NobodyCam: thanks! | 20:43 |
NobodyCam | lol think I posted that in the t meeting :-p by mistake | 20:44 |
russellb | NobodyCam: +1 :) | 20:44 |
anteaya | NobodyCam: guys in the gender inclusive sense, of course? | 20:44 |
NobodyCam | :-p | 20:44 |
NobodyCam | ofc | 20:44 |
anteaya | :D | 20:44 |
*** marun has quit IRC | 20:46 | |
*** gokrokve has quit IRC | 20:47 | |
*** mrda_ has joined #openstack-infra | 20:49 | |
*** fifieldt has joined #openstack-infra | 20:50 | |
*** burt1 has quit IRC | 20:52 | |
portante | funny, if NobodyCam said, "Great Job girls!", girls in the gender inclusive sense, of course, would it come across in the same way? | 20:54 |
*** markmcclain has joined #openstack-infra | 20:54 | |
NobodyCam | portante: :-p | 20:55 |
*** kgriffs has left #openstack-infra | 20:56 | |
*** jasondotstar has quit IRC | 20:56 | |
portante | NobodyCam: no offense intended, just curious about how folks use language | 20:57 |
NobodyCam | none taken... | 20:57 |
NobodyCam | :) | 20:57 |
*** senk1 has quit IRC | 20:57 | |
* fungi would take no offense | 20:58 | |
fungi | then again, my long hair gets me called "ma'am" by waiters and store clerks all the time | 20:58 |
anteaya | I try to go for other terms such as: folks, group, people | 20:59 |
*** obondarev_ has quit IRC | 20:59 | |
fungi | "friends!" | 20:59 |
anteaya | that works | 20:59 |
NobodyCam | +1 for friends | 20:59 |
portante | stackers | 21:00 |
NobodyCam | lol | 21:00 |
anteaya | actually there was an informal poll about the use of the term guys | 21:00 |
anteaya | females read the term as excluding them | 21:00 |
*** emagana has quit IRC | 21:00 | |
* NobodyCam comment was NOT meant to exclude anyone! | 21:00 |
*** emagana has joined #openstack-infra | 21:00 | |
*** mrodden has joined #openstack-infra | 21:01 | |
jog0 | if I remember correctly long queue lengths mean zuul is slow to process gerrit events | 21:01 |
*** mrodden1 has quit IRC | 21:01 | |
* NobodyCam rewords to: -infra just a passing "Great Job everyone!" | 21:02 | |
openstackgerrit | lifeless proposed a change to openstack-infra/config: Move tuskar-ui to horizon program. https://review.openstack.org/68264 | 21:02 |
*** madmike has joined #openstack-infra | 21:02 | |
sdague | jog0: you are correct | 21:02 |
*** kraman is now known as kraman_lunch | 21:02 | |
anteaya | NobodyCam: :D | 21:02 |
NobodyCam | :) | 21:02 |
*** kgriffs has joined #openstack-infra | 21:02 | |
kgriffs | guys, we really need this patch in for i-2 but zuul hasn't picked it up for hours now - https://review.openstack.org/#/c/68161/ | 21:03 |
jog0 | looks like at least we are landing code in o/o today | 21:03 |
jog0 | https://github.com/openstack/openstack/graphs/commit-activity sums it up well | 21:03 |
kgriffs | anything I can do, or is zuul just backed up? | 21:03 |
lifeless | jeblair: ok, so nodepool | 21:03 |
*** gokrokve has joined #openstack-infra | 21:03 | |
*** dangers_away is now known as dangers | 21:04 | |
jog0 | sdague: did something change late last night because nodepool graph looks less jagged | 21:04 |
jog0 | much less jagged | 21:04 |
fungi | kgriffs: too many devs (and too many bugs). i think ttx said something about probably postponing i-2 | 21:04 |
kgriffs | oh | 21:04 |
kgriffs | gtk! | 21:04 |
kgriffs | I'll hop over to #openstack-meeting and see | 21:05 |
*** mfink has quit IRC | 21:05 | |
*** prad has quit IRC | 21:05 | |
fungi | jog0: i'm constantly running a loop looking for nodepool nodes in a delete state and executing a nodepool delete from the cli as a follow-on to catch deletes which the providers are ignoring | 21:05 |
*** zul has quit IRC | 21:06 | |
fungi | (with some rate limiting so i don't spam it too badly) | 21:06 |
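(A rough sketch of the kind of external cleanup loop described above: re-issue deletes for nodes stuck in a "delete" state via the nodepool CLI, with a crude rate limit. The column positions in `nodepool list` output are assumptions for illustration; the loop fungi was actually running may look quite different.)

```python
# Periodically scan for nodes stuck in the "delete" state and re-run
# `nodepool delete` on them, sleeping between calls as a simple rate limit.
import subprocess
import time

STATE_COL = 10   # column index of the node state in `nodepool list` (assumed)
ID_COL = 1       # column index of the node id (assumed)


def stuck_delete_ids():
    out = subprocess.check_output(['nodepool', 'list']).decode('utf-8')
    ids = []
    for line in out.splitlines():
        cols = [c.strip() for c in line.split('|')]
        if len(cols) > STATE_COL and cols[STATE_COL] == 'delete':
            ids.append(cols[ID_COL])
    return ids


if __name__ == '__main__':
    while True:
        for node_id in stuck_delete_ids():
            subprocess.call(['nodepool', 'delete', node_id])
            time.sleep(5)      # don't spam the provider APIs
        time.sleep(300)        # pause before the next scan for stuck deletes
```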
jog0 | btw is the console.html missing in logstash bug fixed? | 21:06 |
*** markwash has joined #openstack-infra | 21:07 | |
fungi | jog0: should be as of the past 24 hours, yes | 21:07 |
jog0 | fungi: ahh that may be it, neat | 21:07 |
*** oubiwann_ has quit IRC | 21:07 | |
jog0 | fungi: thanks, getting good data in es is really helpful, thanks | 21:07 |
fungi | not really neat. ugly, hackish and something we want to solve in nodepool source instead. but for now it's seeking out a little extra capacity | 21:08 |
fungi | (the delete loop) | 21:08 |
jog0 | the neat part is that the graph is less jagged | 21:09 |
jog0 | and that we *think* we know why | 21:09 |
*** julim has quit IRC | 21:09 | |
mikal | Doh | 21:10 |
*** mrmartin has quit IRC | 21:10 | |
mikal | IRC fail | 21:10 |
mikal | Is there a guide to running tempest somewhere? I can't see one on the wiki and am ashamed to admit I've never run it in person. | 21:10 |
mordred | clarkb, fungi: can one of you msg the tenant id of the nodepool account at rackspace to pvo ? | 21:10 |
*** markwash_ has joined #openstack-infra | 21:10 | |
mtreinish | mikal: I don't think so. There is: http://docs.openstack.org/developer/tempest/ | 21:10 |
fungi | mordred: sure. doing that now | 21:10 |
mtreinish | but that probably doesn't have enough detail | 21:10 |
russellb | mikal: it's pretty easy with devstack | 21:11 |
mikal | russellb: oh, its just a script in devstack? | 21:11 |
mikal | Cool | 21:11 |
mtreinish | mikal: I've been meaning to write some real guides for people who are hand configuring things. (not devstack) | 21:11 |
*** pballand has joined #openstack-infra | 21:11 | |
mikal | I'm interested in seeing how bad tempest is for libvirt+lxc | 21:11 |
*** markwash has quit IRC | 21:12 | |
*** markwash_ is now known as markwash | 21:12 | |
jog0 | https://jenkins04.openstack.org/job/gate-nova-python27/1215/console HUH | 21:12 |
jog0 | ] Resource temporarily unavailable | 21:12 |
jog0 | seen in gate | 21:12 |
russellb | mikal: i think devstack sets up tempest for you ... check the devstack readme | 21:12 |
mtreinish | russellb: yeah it does: http://git.openstack.org/cgit/openstack-dev/devstack/tree/lib/tempest | 21:13 |
jog0 | fungi: ^ | 21:13 |
*** oubiwann_ has joined #openstack-infra | 21:14 | |
fungi | jog0: on stdout's fd | 21:15 |
fungi | weird | 21:15 |
fungi | jog0: oh, that's stdout of a child process | 21:16 |
*** mrda is now known as mrda__ | 21:16 | |
fungi | so the child presumably died in flames? | 21:16 |
*** MarkAtwood has quit IRC | 21:16 | |
*** mrda_ is now known as mrda | 21:16 | |
jog0 | logstash message:"os.read(self.stdout.fileno(), 1024)" | 21:17 |
*** pcrews has joined #openstack-infra | 21:17 | |
dims | jog0, just 14 hits in last 7 days, i see a bunch on 16th | 21:18 |
jog0 | dims: yeah strange | 21:18 |
jog0 | its only in gate-nova-python27 | 21:18 |
fungi | so subprocess tried to run get_schema_cmd and the child's stdout was inaccessible (either not started yet, terminated, disassociated) | 21:18 |
*** madmike has quit IRC | 21:19 | |
fungi | i don't think subprocess can normally return control if the descriptors haven't been attached yet (though i could be wrong), so it probably either disassociated after starting or, more likely, died | 21:19 |
*** prad has joined #openstack-infra | 21:20 | |
*** oubiwann_ has quit IRC | 21:20 | |
sdague | clarkb: so are we backed up on events because of the git gc issue? | 21:21 |
*** aburaschi1 has joined #openstack-infra | 21:21 | |
sdague | because zuul was totally on top of events until today | 21:21 |
sdague | even with more things in the queue | 21:21 |
jog0 | fungi: thanks filing a nova bug on this | 21:22 |
*** aburaschi has quit IRC | 21:22 | |
*** nati_ueno has joined #openstack-infra | 21:23 | |
*** dprince has quit IRC | 21:23 | |
*** markmcclain has quit IRC | 21:23 | |
*** aburaschi1 has left #openstack-infra | 21:23 | |
*** oubiwann_ has joined #openstack-infra | 21:24 | |
*** markmcclain has joined #openstack-infra | 21:24 | |
*** hashar has quit IRC | 21:25 | |
*** beagles has quit IRC | 21:25 | |
*** b3nt_pin has joined #openstack-infra | 21:25 | |
*** b3nt_pin is now known as beagles | 21:25 | |
*** hashar has joined #openstack-infra | 21:27 | |
openstackgerrit | Davanum Srinivas (dims) proposed a change to openstack-infra/devstack-gate: Temporary HACK : Enable UCA https://review.openstack.org/67564 | 21:28 |
*** jhesketh has joined #openstack-infra | 21:29 | |
*** markmc has quit IRC | 21:30 | |
jeblair | sdague: a full queue gate reset currently takes 13 minutes; it likely would have taken 4 minutes 4 days ago | 21:31 |
sdague | jeblair: ok | 21:31 |
jeblair | sdague: that could certainly be a big factor, if not the main cause of the queue backlog | 21:31 |
sdague | so why the event backup | 21:31 |
lifeless | jeblair: hi | 21:31 |
jeblair | lifeless: hi | 21:31 |
sdague | we are literally taking hours to queue something in check now | 21:32 |
sdague | not run anything on it | 21:32 |
sdague | but just to add it to the check queue | 21:32 |
lifeless | jeblair: I tried to arrange my patchset in order of most-likely-that-jeblair-will-approve :) | 21:32 |
*** hashar is now known as hasharMeeting | 21:32 | |
*** NikitaKonovalov is now known as NikitaKonovalov_ | 21:32 | |
jeblair | sdague: i think clarkb's limiter, along with horizontally scaling mergers will substantially help. | 21:33 |
openstackgerrit | Joe Gordon proposed a change to openstack-infra/elastic-recheck: Add fingerprint for bug 1271331 https://review.openstack.org/68270 | 21:33 |
sdague | jeblair: cool | 21:33 |
jeblair | sdague: i also believe the optimization i mentioned to russellb (about treating results as a flag, not a queue) would help too, but not quite as substantially | 21:33 |
russellb | jeblair: yeah i started looking at doing that and then meetings started | 21:34 |
openstackgerrit | Matt Ray proposed a change to openstack-infra/config: Chef style testing enablement and minor speed cleanup starting with checks https://review.openstack.org/67964 | 21:34 |
*** melwitt has quit IRC | 21:34 | |
*** melwitt has joined #openstack-infra | 21:34 | |
*** melwitt has quit IRC | 21:35 | |
*** melwitt has joined #openstack-infra | 21:35 | |
*** melwitt has quit IRC | 21:35 | |
clarkb | ok back | 21:36 |
*** jhesketh has quit IRC | 21:37 | |
clarkb | jog0: console.html bug is mostly fixed, we need to put a slightly newer version of the plugin on the masters that fixes a bug we added | 21:37 |
*** gyee has joined #openstack-infra | 21:37 | |
*** melwitt1 has joined #openstack-infra | 21:38 | |
jeblair | mordred: if we add more quota, we need to add more jenkins masters (1 master / 100 nodes of quota) | 21:38 |
*** jhesketh has joined #openstack-infra | 21:38 | |
zaro | mgagne: were you looking for me? | 21:38 |
jhesketh | Howdy | 21:38 |
mgagne | zaro: no more =) | 21:38 |
*** jhesketh_ has joined #openstack-infra | 21:39 | |
jog0 | clarkb: awesome sauce | 21:40 |
*** jhesketh_ has quit IRC | 21:40 | |
*** CaptTofu has quit IRC | 21:41 | |
*** beagles has quit IRC | 21:41 | |
*** dkliban has quit IRC | 21:41 | |
*** jhesketh_ has joined #openstack-infra | 21:43 | |
openstackgerrit | A change was merged to openstack-infra/config: Upload pre-release oslo.messaging tags to pypi https://review.openstack.org/67131 | 21:43 |
*** b3nt_pin has joined #openstack-infra | 21:43 | |
*** b3nt_pin is now known as beagles | 21:43 | |
pballand | mordred: were you able to take a look at the congress build scripts? | 21:44 |
*** dstufft_ has joined #openstack-infra | 21:45 | |
*** Ajaeger1 has quit IRC | 21:46 | |
*** fifieldt has quit IRC | 21:47 | |
openstackgerrit | Khai Do proposed a change to openstack-infra/config: point zuul-dev to review-dev https://review.openstack.org/68271 | 21:47 |
*** dstufft has quit IRC | 21:47 | |
*** dstufft_ is now known as dstufft | 21:48 | |
*** beagles has quit IRC | 21:48 | |
*** reed has joined #openstack-infra | 21:52 | |
*** dmsimard has quit IRC | 21:52 | |
*** jasondotstar has joined #openstack-infra | 21:52 | |
*** sarob_ has joined #openstack-infra | 21:53 | |
*** sarob_ has quit IRC | 21:54 | |
*** sarob_ has joined #openstack-infra | 21:55 | |
*** sarob has quit IRC | 21:56 | |
*** jcoufal has joined #openstack-infra | 21:56 | |
*** sarob_ has quit IRC | 21:57 | |
sdague | 7 straight patch merges about to happen | 21:57 |
sdague | sorry 6 | 21:57 |
*** rcleere has quit IRC | 21:58 | |
*** fifieldt has joined #openstack-infra | 21:58 | |
*** senk has joined #openstack-infra | 21:58 | |
*** jerryz has joined #openstack-infra | 21:59 | |
*** rcleere has joined #openstack-infra | 21:59 | |
jerryz | fungi: ping | 21:59 |
fungi | jerryz: what's up? (i'm on a plane right now, so my ping round-trip time isn't so great) | 22:00 |
*** rcleere has quit IRC | 22:00 | |
*** sarob has joined #openstack-infra | 22:00 | |
*** sarob has quit IRC | 22:00 | |
*** smarcet has left #openstack-infra | 22:00 | |
*** sarob has joined #openstack-infra | 22:01 | |
jerryz | fungi: i want to revert a change, but after the revert is created, zuul didn't verify or submit | 22:01 |
*** hasharMeeting is now known as hashar | 22:02 | |
*** sarob_ has joined #openstack-infra | 22:02 | |
*** jooools has joined #openstack-infra | 22:02 | |
*** sarob_ has quit IRC | 22:02 | |
*** miqui has quit IRC | 22:04 | |
*** miqui has joined #openstack-infra | 22:04 | |
*** miqui has quit IRC | 22:04 | |
jog0 | clarkb sdague: https://review.openstack.org/#/c/67485/ Don't run non-voting gate-grenade-dsvm-neutron | 22:04 |
fungi | jerryz: it's in the gerrit event fifo zuul maintains, and will emerge eventually. at the moment we're seeing zuul take several hours to acknowledge a patchset upload or approval/comment due to voume | 22:05 |
jog0 | that should marginally help with resources and help with our classification rate | 22:05 |
*** CaptTofu has joined #openstack-infra | 22:05 | |
fungi | er, due to volume | 22:05 |
jerryz | fungi: i didn't see the change in queue | 22:06 |
fungi | jerryz: the event queue is nearly 1500. that's the list of new events zuul hasn't looked at yet | 22:06 |
*** sarob has quit IRC | 22:06 | |
jog0 | sdague clarkb: if we map logs/new in grenade to logs then they would get dumped into logstash right? | 22:07 |
fungi | jerryz: once the corresponding event is processed, the change will appear in the appropriate pipeline(s) on zuul's status page | 22:07 |
jog0 | any downside with that approach? | 22:08 |
jerryz | fungi: ok. just didn't expect to take that long | 22:08 |
jerryz | fungi: it was created 11:30pst in the morning | 22:08 |
fungi | jerryz: yes, we're experiencing unprecedented test volume levels today | 22:08 |
*** sarob has joined #openstack-infra | 22:09 | |
jerryz | jog0: i just used that workaround a month ago. old stack logs are not indexed though. | 22:10 |
*** dizquierdo has joined #openstack-infra | 22:10 | |
*** nati_ueno has quit IRC | 22:10 | |
*** mfer has quit IRC | 22:10 | |
jerryz | fungi: thank you. just confirming with you that a revert is no different from a new patchset creation. | 22:11 |
*** nati_ueno has joined #openstack-infra | 22:11 | |
fungi | jerryz: correct. from zuul's (and gerrit's, and git's) perspective it's just another commit | 22:12 |
jog0 | jerryz: right, this is forward facing | 22:12 |
clarkb | jog0: a better way to do it would be to implement recursive log searching to some sane depth in the logstash gearman client/workers | 22:12 |
*** esker has quit IRC | 22:12 | |
jog0 | jerryz: did you just have a symbolic link or something | 22:12 |
clarkb | jog0: but that may be a lot more work | 22:13 |
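(One way the "recursive log searching to some sane depth" idea could look inside a log-indexing worker; the function and parameter names here are hypothetical, not taken from the actual logstash gearman client.)

```python
# Walk an upload directory looking for known log filenames, but stop
# descending past a fixed depth, so layouts like grenade's logs/new and
# logs/old get picked up without indexing arbitrarily deep trees.
import os


def find_logs(base_dir, names=('console.html', 'screen-n-api.txt.gz'),
              max_depth=2):
    found = []
    base_depth = base_dir.rstrip(os.sep).count(os.sep)
    for root, dirs, files in os.walk(base_dir):
        depth = root.rstrip(os.sep).count(os.sep) - base_depth
        if depth >= max_depth:
            dirs[:] = []          # prune: don't descend any further
        for name in files:
            if name in names:
                found.append(os.path.join(root, name))
    return found
```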
jerryz | jog0: i added the copy in scp plugin. but in my env, the jobs are not generated by jjb | 22:13 |
*** harlowja is now known as harlowja_away | 22:14 | |
jog0 | clarkb: yeah, we are seeing grenade have a decent number of failures (since we run tempest) | 22:14 |
jog0 | and without the logs we can't classify em | 22:14 |
jog0 | clarkb: what about setting source-file names to: logs/(new)+/screen-... | 22:15 |
jog0 | can the names have regex | 22:15 |
mgagne | zaro: https://review.openstack.org/#/c/64610/3 | 22:16 |
lifeless | jeblair: do you run nodepool with log set to debug ? | 22:16 |
mgagne | zaro: can we approve? | 22:16 |
fungi | clarkb: jeblair: is there any interest in seeing whether a manual run of the repack routine from cron gets git operation times on zuul back down to what we saw last week? i/o impact should be minimal since that's all on tmpfs, and zuul's only occupying about half its available vcpus so i would expect performance impact while it's running to be fairly minimal | 22:17 |
clarkb | fungi: good idea | 22:17 |
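(For context, the repack being discussed amounts to running git's own housekeeping over zuul's cached repositories. A hedged sketch follows; the repository root path and the exact git invocation (repack vs. pack-refs) are assumptions for illustration, not the actual cron job.)

```python
# Repack every checked-out repo under the merger's git root so ref and
# object lookups stay fast under heavy load.
import os
import subprocess

GIT_ROOT = '/var/lib/zuul/git'   # assumed location of zuul's merger repos

for root, dirs, _files in os.walk(GIT_ROOT):
    if '.git' in dirs:
        subprocess.check_call(['git', 'repack', '-a', '-d'], cwd=root)
        dirs[:] = []   # repos don't nest; no need to descend further
```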
zaro | mgagne: jenkins-job-builder-compare-xml test failed on that one. not sure why. | 22:17 |
mgagne | utf-8 | 22:17 |
fungi | lifeless: we use a logging config which spits debug level to a separate file from info | 22:17 |
clarkb | lifeless: we have nodepool log debug and greater to a debug.log and not debug to other logs | 22:17 |
clarkb | I may have gone a bit overboard on the next zuul rate limit patchset >_> once I have it pep8 clean will push it up | 22:18 |
mgagne | zaro: so this might retrigger a full update of all jenkins jobs on openstack-infra | 22:19 |
*** vipul is now known as vipul-away | 22:19 | |
lifeless | fungi: oh, I didn't realise that there was a production debug log | 22:20 |
lifeless | uhm | 22:20 |
lifeless | so I'd like to gather that data | 22:20 |
lifeless | could you perhaps ship me a few hours of debug log ? | 22:20 |
openstackgerrit | Joe Gordon proposed a change to openstack-infra/elastic-recheck: Sort uncategorized fails by time https://review.openstack.org/67761 | 22:20 |
zaro | mgagne: what does that mean? | 22:21 |
mgagne | zaro: hopefully, nothing =) | 22:21 |
fungi | lifeless: you mean after we restart with that patch in place? | 22:21 |
lifeless | fungi: answered https://review.openstack.org/#/c/67979/ for you | 22:22 |
fungi | i've hesitated to restart nodepoold yet since i don't want to set us back even further resource-wise | 22:22 |
*** jamielennox is now known as jamielennox|away | 22:22 | |
lifeless | fungi: oh, nvm then | 22:22 |
lifeless | but I'd really like those stats somehow | 22:22 |
lifeless | fungi: how do we know that it logs too much data if we haven't run it ? | 22:23 |
lifeless | jeblair: ^ | 22:23 |
*** burt1 has joined #openstack-infra | 22:23 | |
*** oubiwann_ has quit IRC | 22:23 | |
zaro | mgagne: got the +2 from me. | 22:23 |
*** yassine has quit IRC | 22:23 | |
fungi | lifeless: i think the assertion was that under volume like we have now, every worker (all 10 of them) would be logging that every second | 22:23 |
*** nati_ueno has quit IRC | 22:24 | |
dims | hmm, updated a review about 45 mins ago and don't see it yet on zuul/ page. another symptom of existing issues? new one? | 22:24 |
*** dmsimard has joined #openstack-infra | 22:25 | |
*** vipul-away is now known as vipul | 22:25 | |
fungi | dims: it's related to the current activity volume | 22:25 |
dims | thx, just making sure | 22:25 |
lifeless | fungi: would statsd be ok then ? | 22:25 |
*** vipul is now known as vipul-away | 22:25 | |
lifeless | fungi: OTOH if everyone is convinced we're hitting rate limits, perhaps the focus should be on getting below the limit | 22:25 |
*** Hefeweizen has quit IRC | 22:26 | |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/zuul: Add rate limiting to dependent pipeline queues https://review.openstack.org/68219 | 22:26 |
dmsimard | dims: the queue is quite lengthy right now, have commits that were done 3 hours ago that have not yet been checked | 22:27 |
mgagne | zaro, fungi, clarkb, jeblair: the approval of this change will retrigger an update of all jenkins jobs through the API: https://review.openstack.org/#/c/64610/ (due to XML change) Is there any reason to refrain from approving it? | 22:27 |
*** prad has quit IRC | 22:27 | |
clarkb | jeblair: otherwiseguy sdague latest patchset makes it configurable via the layout.yaml | 22:27 |
clarkb | harlowja_away: thanks, but we are ratelimiting not over time but based on success and failure rates | 22:28 |
*** prad has joined #openstack-infra | 22:28 | |
fungi | lifeless: good question. maybe statsd (though that generates network traffic on every hit instead) or some sort of internal counter we could periodically report on? | 22:31 |
*** thomasem has quit IRC | 22:31 | |
jog0 | do we have a fingerprint for git.openstack.org[0: 2001:4800:7813:516:3bc3:d7f6:ff04:aacb]: errno=Network is unreachable | 22:31 |
*** dims has quit IRC | 22:31 | |
fungi | jog0: not sure. we also have some additional diagnostics for that one, but it'll be in the gate for a couple days still | 22:33 |
*** jamielennox|away is now known as jamielennox | 22:33 | |
jog0 | fungi: is message:"fatal: unable to connect to git.openstack.org:" AND filename:"console.html" too generic? | 22:33 |
*** praneshp has quit IRC | 22:33 | |
openstackgerrit | Gregory Haynes proposed a change to openstack-infra/gitdm: Add Gregory Haynes to HP https://review.openstack.org/68277 | 22:33 |
fungi | jog0: you might match on the ipv6 address (it should remain constant unless we need to move the haproxy frontend to a new server) | 22:34 |
jog0 | fungi: that works for me, thanks | 22:34 |
fungi | jog0: and errno=Network is unreachable | 22:34 |
*** harlowja_away is now known as harlowja | 22:35 | |
fungi | basically this is hpcloud vms thinking they have ipv6 connectivity even though they don't, so would be good to capture that fairly explicitly | 22:35 |
harlowja | clarkb kk, np, thought it might be useful anyway :) | 22:35 |
openstackgerrit | lifeless proposed a change to openstack-infra/nodepool: Use the nonblocking cleanupServer. https://review.openstack.org/68004 | 22:35 |
jog0 | git.openstack.org: Temporary failure in name resolution | 22:35 |
jog0 | wee | 22:35 |
*** dmsimard has quit IRC | 22:36 | |
jog0 | two hits for message:"git.openstack.org[0: 2001:4800:7813:516:3bc3:d7f6:ff04:aacb]: errno=Network is unreachable" AND filename:"console.html" | 22:36 |
harlowja | clarkb although success/failure rate might just be something u can use instead, replace time dimension with success/failure dimension with that and there u go | 22:36 |
*** jooools has quit IRC | 22:36 | |
jog0 | 20 for message:"git.openstack.org: Temporary failure in name resolution" AND filename:"console.html" | 22:36 |
* jog0 files two bugs | 22:36 | |
fungi | jog0: that leading 0: might be throwing it, depending on which local address it tries to source from | 22:38 |
jog0 | https://bugs.launchpad.net/openstack-ci/+bug/1270382 | 22:39 |
*** kraman_lunch is now known as kraman | 22:39 | |
fungi | jog0: also, we've seen it manifest in the setup logs, not just in the console. i expect a lot more hits there | 22:40 |
openstackgerrit | Joe Gordon proposed a change to openstack-infra/elastic-recheck: Add fingerprint for bug 1270382 https://review.openstack.org/68280 | 22:41 |
*** yassine has joined #openstack-infra | 22:41 | |
jog0 | are setup logs in elasticsearch | 22:42 |
*** praneshp has joined #openstack-infra | 22:42 | |
fungi | no idea | 22:42 |
jog0 | message:"2001:4800:7813:516:3bc3:d7f6:ff04:aacb]: errno=Network is unreachable" has only two hits still | 22:42 |
*** krtaylor has quit IRC | 22:42 | |
jog0 | with only two hits is it worth filing a CI bug? | 22:43 |
lifeless | jeblair: I'd like to go through in more detail the approach my patch queue shows, when would be a good time for you? | 22:43 |
lifeless | jog0: will it always be that ip address? | 22:44 |
jog0 | thats the git.o.o ipv6 addr | 22:44 |
*** sarob has quit IRC | 22:45 | |
*** dcramer_ has quit IRC | 22:45 | |
*** sarob has joined #openstack-infra | 22:45 | |
*** dims has joined #openstack-infra | 22:46 | |
*** dkliban has joined #openstack-infra | 22:46 | |
jog0 | sdague: here is another fun failure message:"Unable to lock the administration directory (/var/lib/dpkg/), is another process using it" | 22:48 |
fungi | jog0: i think there already is a bug filed | 22:48 |
jog0 | fungi: last hit was 2014-01-21T06:38:36.000 | 22:48 |
jog0 | oh filed | 22:48 |
jog0 | what project? | 22:48 |
fungi | openstack-ci i thought | 22:49 |
fungi | maybe i imagined it | 22:49 |
fungi | we've not seen it often, which is why we didn't promote the diag patch for that | 22:49 |
jog0 | https://bugs.launchpad.net/openstack-ci/+bug/1097592 | 22:49 |
jog0 | message:"Unable to lock the administration directory (/var/lib/dpkg/), is another process using it" AND filename:"console.html" | 22:50 |
*** yamahata has quit IRC | 22:50 | |
*** SergeyLukjanov is now known as SergeyLukjanov_a | 22:50 | |
*** yamahata has joined #openstack-infra | 22:50 | |
*** SergeyLukjanov_a is now known as SergeyLukjanov_ | 22:51 | |
fungi | that happens when two things try to install a deb at the same time | 22:52 |
openstackgerrit | Joe Gordon proposed a change to openstack-infra/elastic-recheck: Add fingerprint for bug 1097592 https://review.openstack.org/68282 | 22:52 |
fungi | could indicate a hung apt-get install run | 22:52 |
ewindisch | sdague / jeblair - to continue yesterday's chat about the docker gate... ;-) | 22:55 |
*** praneshp has quit IRC | 22:55 | |
ewindisch | sdague / jeblair - we're looking at running external infrastructure since running in openstack's infra seems contentious. The question is - if we give you keys to the castle, would you want them? | 22:56 |
*** prad has quit IRC | 22:56 | |
*** gothicmindfood has quit IRC | 22:57 | |
*** hashar has quit IRC | 22:57 | |
jog0 | ewindisch: infra is already constrained on human resources, I don't know if infra can handle any more load without more people | 22:57 |
*** sarob has quit IRC | 22:58 | |
*** praneshp has joined #openstack-infra | 22:58 | |
ewindisch | jog0: and the idea wouldn't be that we'd hand it off, we'd continue to support it ourselves, but infra would have access as well. | 22:59 |
jog0 | ewindisch: what would the benefit of giving infra access be? | 22:59 |
ewindisch | but I agree it could be distracting, which is one reason I ask | 22:59 |
ewindisch | jog0: maybe someone that cares and does have time WOULD want to have access? | 22:59 |
*** luis_ has quit IRC | 23:00 | |
sdague | ewindisch: it was only contentious in timeline, like I said lets do this at juno summit | 23:00 |
sdague | and plan for it during that cycle | 23:00 |
*** _david_ has quit IRC | 23:00 | |
*** jpich has quit IRC | 23:00 | |
jog0 | sdague: so gate queue delay dropped under 2 full days, woot! | 23:00 |
sdague | yay! | 23:00 |
*** hashar has joined #openstack-infra | 23:01 | |
sdague | no it didn't | 23:01 |
sdague | 44hrs at top of queue | 23:01 |
ewindisch | sdague: right - so we acknowledge that running it externally is far less contentious for Icehouse and we're looking to do that | 23:01 |
jog0 | 48hrs in two days right? | 23:01 |
sdague | oh, to under 2 days | 23:01 |
sdague | I thought we lost 2 days of lag | 23:01 |
jog0 | sdague: baby steps | 23:01 |
sdague | :) | 23:01 |
*** sarob has joined #openstack-infra | 23:03 | |
fungi | 68147,3 is finally in the gate. i'll try to catch the next reset to promote it | 23:03 |
*** jaypipes has joined #openstack-infra | 23:03 | |
jog0 | sdague: you have a minute to talk gate failure classification rate? | 23:03 |
*** rnirmal has quit IRC | 23:04 | |
ewindisch | jog0 / sdague: anyway, if we provide servers and run them ourselves, do you explicitly not want access to them? We're not asking you to manage or run them, but offering you access should you desire it. | 23:04 |
jeblair | lifeless: having the cleanup thread do it fails to maximise parallelism | 23:04 |
ewindisch | at least until Juno and we can discuss the future at the summit | 23:04 |
*** CaptTofu has quit IRC | 23:04 | |
sdague | ewindisch: ask jeblair on that one | 23:04 |
*** ryanpetrello has quit IRC | 23:04 | |
*** jcoufal has quit IRC | 23:04 | |
ewindisch | sdague: yeah, I cc'ed his nick earlier, but I guess he isn't on IRC right now | 23:05 |
ewindisch | thanks | 23:05 |
*** dizquierdo has quit IRC | 23:05 | |
sdague | ewindisch: there is just a ton going on right now with the current gate | 23:05 |
jeblair | ewindisch: remind me why running it in openstack-infra is contentious? | 23:05 |
ewindisch | jeblair: sdague doesn't want us to ;-) | 23:05 |
ewindisch | er, in Icehouse | 23:05 |
sdague | jeblair: I provided push back that it wasn't fair to come to infra at i2 and try to get that in by icehouse | 23:05 |
ewindisch | sdague can elaborate, but basically the project is constrained on human and hardware resources | 23:06 |
sdague | you are welcome to overrule :) | 23:06 |
jeblair | ewindisch: what are the technical requirements? | 23:06 |
russellb | could run in existing dsvm nodes, but requires a 3rd party apt repo for docker right now | 23:06 |
*** HenryG has quit IRC | 23:07 | |
jeblair | russellb: is the 3rd party repo bit likely to change in the future? | 23:07 |
russellb | basically need a devstack-gate job running a subset of tempest tests against the docker driver | 23:07 |
russellb | jeblair: i have no clue | 23:07 |
*** sarob has quit IRC | 23:07 | |
*** harlowja has quit IRC | 23:07 | |
jeblair | (and is the 3rd party repo ubuntu cloud archive by any chance?) | 23:07 |
russellb | fungi: cool ... it doesn't have check results yet, but it's probably worth the risk ... isn't going to make things much worse anyway | 23:07 |
*** gothicmindfood has joined #openstack-infra | 23:07 | |
russellb | jeblair: it is not | 23:07 |
ewindisch | jeblair: we're trying to run devstack-gate with full tempest, triggered on all patchsets. Tests can run inside VMs which don't necessarily need to be respawned for subsequent tests (i.e. jenkins slave could run on a VM) | 23:08 |
* russellb should let ewindisch speak | 23:08 | |
*** hashar has quit IRC | 23:08 | |
russellb | ewindisch: what's the repo you need though, a docker managed one? | 23:08 |
ewindisch | jeblair / russellb : we have precise packages in a docker-managed repo. Tahr has a package upstream, but that doesn't help us... If necessary, I can see about pushing a backport from Tahr into precise or cloud-archive | 23:09 |
*** luisg has joined #openstack-infra | 23:10 | |
ewindisch | jeblair / russellb: another option is to run docker on the slave and we use our own image pre-loaded with Docker and having all the network resources pre-installed or cached -- I suspect this might be more contentious? Still, this is what we would do for our own externally-managed gate, if we ran it. | 23:11 |
russellb | ewindisch: i kinda feel like at this point timing wise, you should shoot for running it yourself short term, and aim to get into infra under less time pressure | 23:12 |
russellb | that lets you control your own destiny a lot more with respect to the nova deadline | 23:12 |
jog0 | russellb: ++ | 23:12 |
jeblair | russellb, ewindisch: ++ | 23:12 |
russellb | and also doesn't put extra pressure on -infra, when they're pretty slammed right now with some critical work | 23:12 |
ewindisch | russellb: which is what I've been doing. Today's question was if openstack-infra would want keys to our castle, no strings attached | 23:12 |
russellb | OK, awesome | 23:12 |
portante | clarkb: the check job of 67920 seems to be hung in the swift functional tests, which does not usually happen, all the other jobs have finished | 23:13 |
ewindisch | basically, we'd own it and run it, but we're offering keys to the castle | 23:13 |
russellb | ewindisch: keys to the castle isn't too hard, there are public ssh keys in the infra config repo | 23:13 |
jeblair | ewindisch: if this is runnable in openstack-infra and russellb+nova wants it tested, let's talk about getting it in there in the long run, but i don't think we have time now to do it justice | 23:13 |
jeblair | ewindisch: i don't think we would want/need/use keys | 23:13 |
russellb | jeblair: yeah, that sounds good to me | 23:13 |
ewindisch | jeblair: I think we all agree to get it upstream in the long term | 23:13 |
ewindisch | jeblair: alright then. | 23:14 |
jeblair | ewindisch: cool; so no keys, and we'll talk when we're less busy. thanks for asking/checking in. :) | 23:14 |
*** burt1 has quit IRC | 23:14 | |
ewindisch | jeblair: thanks | 23:14 |
russellb | huzzah, sounds like a good path forward | 23:14 |
russellb | thanks ewindisch | 23:14 |
sdague | agreed | 23:14 |
david-lyle | ok | 23:15 |
david-lyle | oops | 23:15 |
*** dstanek has quit IRC | 23:15 | |
jeblair | lifeless: where was i? right now the complete threads funnel their actions through >=6 provider managers in parallel | 23:15 |
clarkb | portante: https://jenkins03.openstack.org/job/check-swift-dsvm-functional/286/console the test does appear to have just stopped | 23:15 |
jeblair | lifeless: having the cleanup thread means that there's only one thread driving those managers, so you're essentially serializing all of the delete operations | 23:15 |
clarkb | tests++ | 23:15 |
jeblair | lifeless: perhaps if you had a cleanup thread per provider that would alleviate that problem | 23:16 |
portante | is there a way for me to look at the syslog for that system? | 23:16 |
*** dstanek has joined #openstack-infra | 23:17 | |
clarkb | portante: yes, I will hold the node and you can poke at it | 23:17 |
jeblair | clarkb: that's a lot of knobs. you sure you want all of them? | 23:18 |
clarkb | jeblair: no I am not sure I want all of them | 23:19 |
clarkb | jeblair: I definitely want floor and the starting level, the linear vs exponential and factor stuff I am not sold on | 23:19 |
clarkb | jeblair: I have a test locally that isn't working because there is at least one bug in that patchset | 23:19 |
*** HenryG has joined #openstack-infra | 23:19 | |
clarkb | portante: have an rsa public key I can put on that host? | 23:20 |
portante | sec | 23:20 |
jeblair | clarkb: drat. i did not see the bug. :) | 23:21 |
*** jorisroovers has joined #openstack-infra | 23:21 | |
clarkb | jeblair: there are more, each time I fix a thing the test points out more :) | 23:21 |
jeblair | clarkb: anyway, your call on the knobs. seems excessive to me but i don't object. i think this is the highest priority thing for zuul, so when it's ready we'll put it into prod immediately. | 23:22 |
clarkb | ok | 23:22 |
jorisroovers | fungi: ping | 23:22 |
jeblair | clarkb: (i definitely agree that floor and level are good; it's the others i'm also not so sure about) | 23:22 |
fungi | clarkb: the repack finished a minute ago | 23:23 |
fungi | jorisroovers: what's up? | 23:23 |
jorisroovers | fungi, I've got an issue with git when trying to git review a patch that was rebased on a different patch | 23:23 |
jorisroovers | fungi, "Errors running git rebase -i remotes/gerrit/master" | 23:23 |
jorisroovers | not sure what is going on there. I had to checkout my patchset on a different machine, made a few edits, did a commit, no issues | 23:24 |
jorisroovers | but then when doing git review, I ran into this issue | 23:24 |
fungi | jorisroovers: that usually means git-review thinks it needs a rebase but is running into merge conflicts trying to perform one for you. have output you can put on paste.openstack.org and give me a link? | 23:24 |
jorisroovers | fungi, yeah, that is what I figure as well, but not seeing the issue | 23:25 |
jorisroovers | not sure whether output will be very helpful, but pasting anyway, 1 sec | 23:25 |
jorisroovers | fungi, http://paste.openstack.org/show/61653/ | 23:26 |
jorisroovers | fungi, this is the relevant patch set: https://review.openstack.org/#/c/65515/ | 23:26 |
* fungi waits for the horrible in-flight wireless to let him through the web proxy | 23:26 |
jorisroovers | fungi :-) | 23:27 |
*** kgriffs is now known as kgriffs_afk | 23:27 | |
*** krtaylor has joined #openstack-infra | 23:27 | |
ttx | jeblair: sorry we'll not be seeing you here | 23:28 |
jorisroovers | so there are 2 patches, 1 that creates a new directory, and a second that adds a python file. I rebased the second patch on the first as the new file needs to go in the new directory | 23:28 |
fungi | jorisroovers: so you're trying to have 66854 rebased on the tip of tempest master and 61653 on 66854 | 23:28 |
fungi | correct? | 23:29 |
openstackgerrit | Sean Dague proposed a change to openstack-infra/config: add in elastic-recheck-unclassified report https://review.openstack.org/67591 | 23:29 |
* jorisroovers is figuring the patchset numbers | 23:29 | |
*** oubiwann_ has joined #openstack-infra | 23:30 | |
jorisroovers | fungi, think you mixed up some numbers | 23:30 |
*** miqui has joined #openstack-infra | 23:30 | |
devananda | one ironic patch landed today! \o/ | 23:30 |
jorisroovers | trying to rebase 65515 on 66854 | 23:30 |
fungi | jorisroovers: the error message mentions the commit sha of 66854, which currently (in gerrit) has 61653 depending on it | 23:31 |
devananda | (not sarcastic, i didn't expect that and am happy) | 23:31 |
anteaya | devananda: :D | 23:31 |
fungi | devananda: don't spend it all in one place | 23:31 |
openstackgerrit | Clark Boylan proposed a change to openstack-infra/zuul: Add rate limiting to dependent pipeline queues https://review.openstack.org/68219 | 23:31 |
devananda | fungi: lol | 23:31 |
fungi | wow, lag getting worse here | 23:32 |
jorisroovers | fungi: so 61653 is giving me a complete different patchset | 23:32 |
jeblair | ttx: me too. i not only will miss seeing you all, but also possibly my only chance to ski this season as there is no snow in CA. | 23:32 |
jorisroovers | fungi, I don't know about that one | 23:32 |
clarkb | jeblair: ^ that change fixes a bunch of bugs, makes voluptuous happy and adds a simple test. I am hoping to remember how the test stuff works so that I can check the window value after each change reports | 23:32 |
fungi | jorisroovers: it looks like the problem is 66854 conflicts with the tip of master and needs a rebase | 23:32 |
ttx | jeblair: :( | 23:32 |
jorisroovers | fungi: 61653 is the paste number :p | 23:33 |
jorisroovers | ah, I need to rebase 66854 on latest master? | 23:33 |
jorisroovers | that would make sense | 23:33 |
jeblair | clarkb: cool, lemme know if you have specific questions | 23:34 |
fungi | jorisroovers: whups... s/61653/65515/ | 23:34 |
fungi | sorry | 23:34 |
jorisroovers | fungi, no worries. | 23:34 |
fungi | jorisroovers: yes | 23:34 |
jorisroovers | fungi, ok let me try that | 23:34 |
*** eharney has quit IRC | 23:34 | |
*** slong_ has joined #openstack-infra | 23:34 | |
*** slong has quit IRC | 23:35 | |
*** thuc has quit IRC | 23:35 | |
*** thuc has joined #openstack-infra | 23:36 | |
fungi | wow, neutron changes are still gate wrecking balls. when i see one failing, they inevitably have more than one failed voting job | 23:37 |
anteaya | fungi: which one? | 23:38 |
russellb | "gate wrecking balls" | 23:38 |
russellb | wow | 23:38 |
fungi | 4 days since the last successful merge into neutron master | 23:38 |
fungi | anteaya: the current failure near the top of the gate | 23:39 |
anteaya | yeah we are having some problems with isolated testing | 23:39 |
anteaya | I see https://review.openstack.org/#/c/65245/ failing | 23:39 |
anteaya | what is it taking down with it? | 23:39 |
fungi | anteaya: a few hundred virtual machines | 23:39 |
jorisroovers | fungi, so, I believe the issue is that I moved a file to a new directory and that file got updated in master before my patch got merged | 23:39 |
fungi | jorisroovers: sounds likely | 23:40 |
jorisroovers | fungi, how do I now move that latest version of the file | 23:40 |
jorisroovers | or better, update the version that I moved to the latest version in master | 23:41 |
anteaya | fungi I feel like I am the target of your frustration | 23:41 |
anteaya | which I understand | 23:41 |
jeblair | fungi, clarkb: when we want to restart zuul to pick up clarkb's patch, i think we could pause all the jenkins masters and let the event queue catch up; then when it's 0, we can save it, restart zuul, restore the queue and unpause the masters | 23:41 |
fungi | anteaya: not at all! | 23:41 |
anteaya | but what I lack is context to go back and take action | 23:41 |
*** markmcclain has quit IRC | 23:41 | |
jeblair | mordred: ^ | 23:41 |
* fungi is in no way frustrated, merely mentioning that pretty much every neutron change in the gate right now is causing resets all the way up as it travels | 23:41 | |
jeblair | mordred, clarkb, fungi: (iow, that is an outline of a possible generalized way to allow zuul's queues to catch up in situations like this) | 23:42 |
fungi | jeblair: that sounds reasonable | 23:42 |
anteaya | fungi: if that is the case, do you want them all sniped out? | 23:42 |
fungi | anteaya: i have no particular wants in this situation, but others might | 23:43 |
anteaya | if the decision is to snipe I will snipe | 23:43 |
anteaya | I have worked hard as have others to remove the scapegoat energy from neutron | 23:43 |
fungi | jeblair: i likely won't be online for the zuul restart (i expect it to happen about when i'm sprinting between gates in vegas) but sounds good | 23:43 |
anteaya | we seemed to have been making progress and I don't want to undo all our hard work today | 23:43 |
* anteaya is prepared to snipe | 23:43 | |
fungi | anteaya: most (all?) of the neutron changes from the sprint etherpad weren't passing check jobs, but the tempest changes all went in so maybe rechecking some now would get fresh results on their efficacy | 23:44 |
fungi | anteaya: unless the devs are still in a pow-wow over how to improve them first | 23:45 |
anteaya | would that change the rate of failure on the neutron jobs in the gate? | 23:45 |
anteaya | mostly they keep asking me what they can do to fix the gate situation and not make it worse | 23:46 |
anteaya | I have most of them in a holding pattern | 23:46 |
anteaya | I was suggesting to them to not submit anything new to gerrit that wasn't absolutely required | 23:47 |
anteaya | I can recheck the sprint etherpad neutron changes but I would prefer to wait until i2 is cut tomorrow | 23:47 |
anteaya | load on zuul and all | 23:47 |
sdague | anteaya: I think the neutron changes need to be prioritized based on what will make the isolated job pass first | 23:47 |
*** blamar has quit IRC | 23:47 | |
anteaya | I would like to focus on the gate tonight | 23:47 |
sdague | because its failure rate is super high right now | 23:48 |
russellb | fungi: looks like no dsvm jobs running at the head of the gate, may be a good time to promote? | 23:48 |
anteaya | if it is failing, and I am paraphrasing salv-orlando here because the isolated job is not in the voting jobs, it is just in the experimental jobs | 23:49 |
fungi | russellb: yep, promotingitnow | 23:49 |
fungi | wow, serious spacebar fail | 23:49 |
clarkb | anteaya: isolated is voting | 23:49 |
* russellb crosses fingers he doesn't break the world | 23:49 | |
anteaya | so to ensure new patches don't break the isolated job, the experimental jobs need to pass first | 23:49 |
*** praneshp has quit IRC | 23:49 | |
clarkb | gate-tempest-dsvm-neutron-isolated and gate-tempest-dsvm-neutron-isolated-pg | 23:50 |
anteaya | then I am confused | 23:50 |
clarkb | er pg-isolated | 23:50 |
*** thuc has quit IRC | 23:50 | |
*** thuc has joined #openstack-infra | 23:50 | |
fungi | anteaya: isolated is voting *on neutron changes* so any neutron changes approved into the gate are basically destined to fail | 23:51 |
anteaya | so I should snipe them | 23:51 |
anteaya | since I don't know what patches are designed to fix isolated jobs | 23:51 |
fungi | or we should set those jobs non-voting check-only, if that's the decision | 23:51 |
anteaya | that is what I am hearing | 23:51 |
anteaya | but then that code gets merged and then we have to go back and fix that in order to make those jobs voting again | 23:52 |
anteaya | we haven't merged anything in 4 days | 23:53 |
anteaya | not merging anything today is not a hardship | 23:53 |
sdague | anteaya: parallel is not voting | 23:53 |
sdague | but serial isolated is | 23:53 |
fungi | another alternative is to revoke approval on neutron like was done for stable branches, but that's a very heavy hammer and i'm not a fan of it | 23:54 |
*** thuc has quit IRC | 23:55 | |
anteaya | I see two neutron patches in the gate | 23:55 |
anteaya | I will snipe them and shoulder the neutron fallout | 23:55 |
jorisroovers | fungi, still strugling here, starting over... | 23:55 |
fungi | yeah, i think those were probably approved before word got around | 23:55 |
*** alexpilotti has quit IRC | 23:56 | |
anteaya | https://review.openstack.org/#/c/65245/ and https://review.openstack.org/#/c/53609/ | 23:57 |
fungi | jorisroovers: sorry, i don't know off the top of my head how to work around git's file move detection not recognizing it during a rebase/conflict resolution. i started to search the internet for recommendations, but it's not easy from an airplane | 23:57 |
anteaya | I'd rather take the wrath of neutron than try to undo the wrath of everybody else directed at neutron | 23:57 |
fungi | jorisroovers: maybe someone else here knows (or can find you) the answer if you're coming up short | 23:57 |
jorisroovers | fungi, no worries | 23:57 |
jorisroovers | I tried checking out the relevant files from master and moving them again, but that didn't really work | 23:58 |
fungi | jorisroovers: i would probably resort to cherry-picking the changes, maybe with a manual move of the file in between | 23:58 |
fungi | i'm sure there's a way to do it in an interactive rebase short of manually diffing the two files during the conflict resolution, i just don't know what the best alternative is | 23:59 |
*** praneshp has joined #openstack-infra | 23:59 |
Generated by irclog2html.py 2.14.0 by Marius Gedminas - find it at mg.pov.lt!